Abstract: To address the problems that the hierarchical Softmax algorithm cannot be trained incrementally and trains inefficiently on massive data during word vector training, a dynamic hierarchical Softmax algorithm was proposed. As data samples were loaded incrementally, an adaptive Huffman coding tree was constructed dynamically by a node adjustment and replacement method. To avoid oscillatory decline of the loss function caused by small sample sizes, the first-order and second-order moment estimates of the gradient were used to dynamically adjust the parameter update direction and the learning rate. The gradient descent algorithm reduced the range of weight variation and the convergence error of the training network, thereby improving the efficiency of word vector training on massive data. The Wikipedia Chinese corpus was used to evaluate training efficiency and quality. The experimental results show that the dynamic hierarchical Softmax algorithm can significantly improve training efficiency while ensuring the quality of the trained word vectors. When the incremental samples range from 10 kB to 1 MB, the training speed is increased by about 30 times, which effectively shortens the training period.
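As a clarifying sketch, the moment-based parameter update described above can be written in the standard Adam-style form, where $g_t$ denotes the gradient, $\beta_1$ and $\beta_2$ the decay rates, $\alpha$ the learning rate, and $\epsilon$ a small constant; these symbols are illustrative assumptions rather than notation taken from the paper:

\[
m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \qquad
v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^{2},
\]
\[
\hat{m}_t = \frac{m_t}{1-\beta_1^{\,t}}, \qquad
\hat{v}_t = \frac{v_t}{1-\beta_2^{\,t}}, \qquad
\theta_t = \theta_{t-1} - \alpha\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}.
\]

Under this formulation, the bias-corrected first moment $\hat{m}_t$ sets the update direction and the second moment $\hat{v}_t$ scales the effective step size, which is how the abstract's adaptive adjustment of update direction and learning rate on small incremental samples would be realized.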