Abstract: To address the problems that the hierarchical Softmax algorithm cannot be trained incrementally and trains inefficiently on massive data during word vector training, a dynamic hierarchical Softmax algorithm was proposed. As data samples were loaded incrementally, an adaptive Huffman coding tree was constructed dynamically by a node adjustment and replacement method. To avoid oscillatory decline of the loss function caused by small sample sizes, the first-order and second-order moment estimates of the gradient were used to dynamically adjust the parameter update direction and the learning rate. The gradient descent algorithm reduced the range of weight variation and the convergence error of the training network, thereby improving the efficiency of word vector training on massive data. The Wikipedia Chinese corpus was used to evaluate training efficiency and quality. The experimental results show that the dynamic hierarchical Softmax algorithm can significantly improve training efficiency while ensuring the quality of the trained word vectors. When the incremental samples range from 10 kB to 1 MB, the training speed is increased by about 30 times, which effectively shortens the training period.
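As a clarifying sketch, the moment-based parameter update described above can be written in the standard Adam-style form, where $g_t$ denotes the gradient, $\beta_1$ and $\beta_2$ the decay rates, $\alpha$ the learning rate, and $\epsilon$ a small constant; these symbols are illustrative assumptions rather than notation taken from the paper:

\[
m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^{2},
\]
\[
\hat{m}_t = \frac{m_t}{1-\beta_1^{\,t}}, \qquad
\hat{v}_t = \frac{v_t}{1-\beta_2^{\,t}}, \qquad
\theta_t = \theta_{t-1} - \alpha\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}.
\]

Under this formulation, the bias-corrected first moment $\hat{m}_t$ sets the update direction and the second moment $\hat{v}_t$ scales the effective step size, which is how the abstract's adaptive adjustment of update direction and learning rate on small incremental samples would be realized.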