Abstract: To address the limited ability of feature vectors extracted by existing LSTM networks to express short-term information in named entity recognition tasks, an LSTM network based on hierarchical residual connections was proposed. The nonlinear fitting capability for short-term information features was improved by adding residual blocks to deepen the stacked LSTM network. The activation function was selected dynamically through a global information encoding, reducing the number of parameters while enhancing the computational power of the network. The fitting capability of the model was further improved by using an attention mechanism to dynamically adjust, for each input, the number of residual connection layers. The residual network and the Dynamic ReLU activation function were presented, the overall framework of LSTM named entity recognition based on hierarchical residual connections was established, and the residual connection module, the Dynamic ReLU module, and the attention mechanism module were defined. The proposed method was compared with the related algorithms FLAT and Lattice LSTM in experiments on the Weibo and Resume datasets. The results show that the LSTM based on hierarchical residual connections achieves the best performance on the Weibo corpus and is second only to FLAT on the Resume corpus, with F1 scores of 0.7001 and 0.9586, respectively.
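To make the three modules named above concrete, the following is a minimal PyTorch sketch of how they could fit together: residual blocks stacked on LSTM layers, a Dynamic ReLU whose piecewise-linear coefficients are produced from a global encoding of the sequence, and attention over the per-layer outputs so that the effective residual depth is weighted per input. All class names, hyper-parameters, and the exact wiring (e.g. `DynamicReLU`, `k=2`, mean pooling as the global encoding) are illustrative assumptions, not the paper's released implementation.

```python
# A minimal sketch of the described architecture; names and wiring are
# assumptions for illustration, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicReLU(nn.Module):
    """Simplified DY-ReLU: a global encoding of the sequence selects the
    slopes/intercepts of a piecewise-linear activation (cf. Chen et al., 2020)."""
    def __init__(self, dim, k=2, reduction=4):
        super().__init__()
        self.k = k
        self.hyper = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(),
            nn.Linear(dim // reduction, 2 * k),
        )
        # base coefficients: an identity branch plus a zero branch
        self.register_buffer("init_a", torch.tensor([1.0, 0.0]))
        self.register_buffer("init_b", torch.tensor([0.0, 0.0]))

    def forward(self, x):                       # x: (batch, seq, dim)
        ctx = x.mean(dim=1)                     # global information encoding
        theta = torch.tanh(self.hyper(ctx))     # (batch, 2k) residual coefficients
        a = self.init_a + 0.5 * theta[:, : self.k]   # slopes per branch
        b = self.init_b + 0.5 * theta[:, self.k :]   # intercepts per branch
        # broadcast over sequence/feature dims and take the max branch
        y = x.unsqueeze(-1) * a[:, None, None, :] + b[:, None, None, :]
        return y.max(dim=-1).values


class ResidualLSTMBlock(nn.Module):
    """One stacked layer: LSTM -> Dynamic ReLU, with an identity shortcut."""
    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.act = DynamicReLU(dim)

    def forward(self, x):
        h, _ = self.lstm(x)
        return x + self.act(h)                  # residual connection


class HierarchicalResidualLSTM(nn.Module):
    """Stacks residual LSTM blocks and attends over the per-layer outputs,
    so the effective residual depth is adjusted for each input."""
    def __init__(self, dim, num_layers=3):
        super().__init__()
        self.blocks = nn.ModuleList(ResidualLSTMBlock(dim) for _ in range(num_layers))
        self.layer_score = nn.Linear(dim, 1)

    def forward(self, x):                       # x: (batch, seq, dim)
        outs, h = [], x
        for block in self.blocks:
            h = block(h)
            outs.append(h)
        stack = torch.stack(outs, dim=2)        # (batch, seq, layers, dim)
        attn = F.softmax(self.layer_score(stack), dim=2)
        return (attn * stack).sum(dim=2)        # attention-weighted mix of depths


if __name__ == "__main__":
    model = HierarchicalResidualLSTM(dim=128)
    x = torch.randn(4, 20, 128)                 # (batch, seq_len, embedding_dim)
    print(model(x).shape)                       # torch.Size([4, 20, 128])
```

In a full tagger, this encoder would sit between the character/word embedding layer and a CRF decoding layer, but those surrounding components are omitted here.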
References:
[1] ZHENG Y X, ZUO X L, ZUO W L, et al. BiLSTM+GCN causality extraction based on time relationship[J]. Journal of Jilin University (Science Edition), 2021, 59(3): 643-648. (in Chinese)
[2] YANG H M, LI L, YANG R D, et al. Named entity recognition based on bidirectional long short-term memory combined with case report form[J]. Chinese Journal of Tissue Engineering Research, 2018, 22(20): 3237-3242. (in Chinese)
[3] SUN J, CAI H, ZHU X L, et al. Deep face representation algorithm based on dual attention mechanism[J]. Journal of Jilin University (Science Edition), 2021, 59(4): 883-890. (in Chinese)
[4] LAMPLE G, BALLESTEROS M, SUBRAMANIAN S, et al. Neural architectures for named entity recognition[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. [S.l.]: ACL, 2016: 260-270.
[5] MA X Z, HOVY E. End-to-end sequence labeling via bidirectional LSTM-CNNs-CRF[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2016: 1064-1074.
[6] ZHANG Y, YANG J. Chinese NER using lattice LSTM[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2018: 1554-1564.
[7] FENG J Z, SONG S S, WANG Y Z, et al. Entity relation extraction based on improved attention mechanism[J]. Acta Electronica Sinica, 2019, 47(8): 1692-1700. (in Chinese)
[8] SHEN Y K, TAN S, SORDONI A, et al. Ordered neurons: integrating tree structures into recurrent neural networks[C]//Proceedings of the 7th International Conference on Learning Representations. [S.l.]: ICLR, 2019.
[9] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the 29th IEEE Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE Computer Society, 2016: 770-778.
[10] CHEN Y P, DAI X Y, LIU M C, et al. Dynamic ReLU[C]//Proceedings of the 16th European Conference on Computer Vision. [S.l.]: Springer Science and Business Media Deutschland GmbH, 2020: 351-367.
[11] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Proceedings of the 2017 Conference on Advances in Neural Information Processing Systems. [S.l.]: Neural Information Processing Systems Foundation, 2017: 5999-6009.
[12] WANG L L, WANG H Y, BAIMA Q Z, et al. Tibetan word segmentation method based on BiLSTM_CRF model[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, 32(4): 648-654. (in Chinese)
[13] E H H, ZHANG W J, XIAO S Q, et al. Survey of entity relationship extraction based on deep learning[J]. Journal of Software, 2019, 30(6): 1793-1818. (in Chinese)
[14] LAFFERTY J, MCCALLUM A, PEREIRA F. Conditional random fields: probabilistic models for segmenting and labeling sequence data[C]//Proceedings of the 18th International Conference on Machine Learning. [S.l.]: Morgan Kaufmann, 2001: 282-289.
[15] MA Y H, ZHANG R J, WU C, et al. Expression recognition of image sequence based on deep residual network and LSTM[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2020, 32(5): 874-883. (in Chinese)
[16] COLLOBERT R, WESTON J, BOTTOU L, et al. Natural language processing (almost) from scratch[J]. Journal of Machine Learning Research, 2011, 12: 2493-2537.
[17] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE Computer Society, 2018: 7132-7141.