Named entity recognition with LSTM based on hierarchical residual connections
Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract: To address the insufficient ability of feature vectors extracted by existing LSTMs to express short-term information features in named entity recognition tasks, an LSTM network based on hierarchical residual connections was proposed. The nonlinear fitting capability for short-term information features was improved by adding residual blocks to deepen the stacked LSTM network. The activation function was selected dynamically through a global information encoding, which reduced the number of parameters while enhancing the computational power of the network. Model fitting capability was further improved by using an attention mechanism to dynamically adjust the number of residual-connected layers applied to the input. The residual network and the Dynamic ReLU activation function were presented, the overall framework of LSTM named entity recognition based on hierarchical residual connections was established, and the residual connection module, the Dynamic ReLU module, and the attention mechanism module were defined. The proposed method was compared with the related algorithms FLAT and Lattice LSTM in experiments on the Weibo and Resume datasets. The results show that the LSTM based on hierarchical residual connections achieves the best result on the Weibo corpus and is second only to FLAT on the Resume corpus, with F1 scores of 0.7001 and 0.9586, respectively.
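As a rough illustration of the two ideas combined in the abstract (a residual connection around a recurrent transform, and a Dynamic ReLU whose piecewise-linear coefficients are predicted from a global encoding of the input), the following sketch shows one simplified block. This is not the paper's implementation: the shapes, the coefficient head `W_theta`, the `tanh` squashing, and the identity stand-in for the LSTM transform are all assumptions made for illustration.

```python
import numpy as np

def dynamic_relu(x, W_theta, k=2):
    """Simplified Dynamic ReLU: y = max_j(a_j * x + b_j), where the k
    slope/intercept pairs per channel are predicted from a global
    (sequence-averaged) encoding of the input, as in Chen et al. (2020).
    x: (seq_len, channels); W_theta: (channels, 2*k*channels) -- assumed shapes.
    """
    c = x.shape[1]
    context = x.mean(axis=0)                 # global encoding over the sequence
    theta = np.tanh(context @ W_theta)       # squashed coefficient predictions
    a = 1.0 + theta[: k * c].reshape(k, c)   # slopes, centered around identity
    b = theta[k * c:].reshape(k, c) * 0.5    # intercepts
    # elementwise max over the k candidate linear functions
    return np.max(a[None] * x[:, None, :] + b[None], axis=1)

def residual_block(x, recurrent_fn, W_theta):
    """One hierarchical residual block: y = x + DyReLU(f(x)),
    where f would be an LSTM layer in the actual model."""
    return x + dynamic_relu(recurrent_fn(x), W_theta)

# Toy usage with an identity stand-in for the LSTM transform.
x = np.ones((3, 4))                          # (seq_len=3, channels=4)
W = np.zeros((4, 2 * 2 * 4))                 # zero head: activation reduces to identity
out = residual_block(x, lambda z: z, W)
```

With `W_theta` at zero, every slope is 1 and every intercept is 0, so the activation passes its input through and the block returns `x + x`; training would move the coefficients away from this identity initialization.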
[1] 郑余祥,左祥麟,左万利,等. 基于时间关系的BiLSTM+GCN因果关系抽取[J]. 吉林大学学报(理学版), 2021, 59(3): 643-648.
ZHENG Y X, ZUO X L, ZUO W L, et al. BiLSTM+GCN causality extraction based on time relationship[J]. Journal of Jilin University (Science Edition), 2021, 59(3): 643-648. (in Chinese)
[2] 杨红梅,李琳,杨日东,等. 基于双向LSTM神经网络电子病历命名实体的识别模型[J]. 中国组织工程研究, 2018, 22(20): 3237-3242.
YANG H M, LI L, YANG R D, et al. Named entity recognition based on bidirectional long short-term memory combined with case report form[J]. Chinese Journal of Tissue Engineering Research, 2018, 22(20): 3237-3242. (in Chinese)
[3] 孙俊,才华,朱新丽,等. 基于双重注意力机制的深度人脸表示算法[J]. 吉林大学学报(理学版), 2021, 59(4): 883-890.
SUN J, CAI H, ZHU X L, et al. Deep face representation algorithm based on dual attention mechanism[J]. Journal of Jilin University (Science Edition), 2021, 59(4): 883-890. (in Chinese)
[4] LAMPLE G, BALLESTEROS M, SUBRAMANIAN S, et al. Neural architectures for named entity recognition[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. [S.l.]: ACL, 2016: 260-270.
[5] MA X Z, HOVY E. End-to-end sequence labeling via bidirectional LSTM-CNNs-CRF[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2016: 1064-1074.
[6] ZHANG Y, YANG J. Chinese NER using lattice LSTM[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2018: 1554-1564.
[7] 冯建周,宋沙沙,王元卓,等. 基于改进注意力机制的实体关系抽取方法[J]. 电子学报, 2019, 47(8): 1692-1700.
FENG J Z, SONG S S, WANG Y Z, et al. Entity relation extraction based on improved attention mechanism[J]. Acta Electronica Sinica, 2019, 47(8): 1692-1700. (in Chinese)
[8] SHEN Y K, TAN S, SORDONI A, et al. Ordered neurons: integrating tree structures into recurrent neural networks[C]//Proceedings of the 7th International Conference on Learning Representations. [S.l.]: ICLR, 2019.
[9] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE Computer Society, 2016: 770-778.
[10] CHEN Y P, DAI X Y, LIU M C, et al. Dynamic ReLU[C]//Proceedings of the 16th European Conference on Computer Vision. [S.l.]: Springer, 2020: 351-367.
[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 2017 Conference on Advances in Neural Information Processing Systems. [S.l.]: Neural Information Processing Systems Foundation, 2017: 5999-6009.
[12] 王莉莉,王宏渊,白玛曲珍,等. 基于BiLSTM_CRF模型的藏文分词方法[J]. 重庆邮电大学学报(自然科学版), 2020, 32(4): 648-654.
WANG L L, WANG H Y, BAIMA Q Z, et al. Tibetan word segmentation method based on BiLSTM_CRF model[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, 32(4): 648-654. (in Chinese)
[13] 鄂海红,张文静,肖思琪,等. 深度学习实体关系抽取研究综述[J]. 软件学报, 2019, 30(6): 1793-1818.
E H H, ZHANG W J, XIAO S Q, et al. Survey of entity relationship extraction based on deep learning[J]. Journal of Software, 2019, 30(6): 1793-1818. (in Chinese)
[14] LAFFERTY J, MCCALLUM A, PEREIRA F. Conditional random fields: probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the 18th International Conference on Machine Learning. [S.l.]: Morgan Kaufmann, 2001: 282-289.
[15] 马玉环,张瑞军,武晨,等. 深度残差网络和LSTM结合的图像序列表情识别[J]. 重庆邮电大学学报(自然科学版), 2020, 32(5): 874-883.
MA Y H, ZHANG R J, WU C, et al. Expression recognition of image sequence based on deep residual network and LSTM[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, 32(5): 874-883. (in Chinese)
[16] COLLOBERT R, WESTON J, BOTTOU L, et al. Natural language processing (almost) from scratch[J]. Journal of Machine Learning Research, 2011, 12: 2493-2537.
[17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE Computer Society, 2018: 7132-7141.