Chinese named entity recognition based on multi-head attention character-word integration
WANG Jin, WANG Mengqi, ZHANG Xinyue, SUN Kaiwei, PIAO Changhao
Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract Existing Chinese named entity recognition (NER) methods based on character-word integration suffer from redundant-word interference, complex model architectures, and difficulty in combining with other sequence models. To address these problems, a novel Chinese NER algorithm based on multi-head attention was proposed. The attention mechanism was used to efficiently fuse word boundary information, and fusing the B, I and E word sets at different positions reduced the interference of redundant words. A multi-head attention character-word joint model was established, consisting of a character-word integration module, a multi-head attention module and a fusion module. Compared with existing Chinese NER schemes, the proposed algorithm avoids the design of complex sequence models and is therefore easy to combine with existing character-based Chinese NER models. Recall, precision and F1 score were used as evaluation metrics, and ablation experiments verified the contribution of each part of the model. The results show that the proposed algorithm increases the F1 score by 0.28 and 0.69 on the MSRA and Weibo data sets, respectively, and improves precision by 0.07 on the Resume data set.
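The character-word fusion described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: it assumes each character embedding acts as the query in multi-head scaled dot-product attention over the embeddings of the words whose B/I/E position matches that character, and it omits the learned projection matrices a full implementation would include. The function name and the residual fusion step are hypothetical.

```python
import numpy as np

def multi_head_char_word_fusion(char_vec, word_vecs, num_heads=4):
    """Fuse a character representation with its matched-word embeddings
    via multi-head scaled dot-product attention (illustrative sketch).

    char_vec:  (d,)   character embedding, used as the query
    word_vecs: (n, d) embeddings of the words whose B/I/E position
               matches this character, used as keys and values
    """
    d = char_vec.shape[0]
    assert d % num_heads == 0, "embedding size must divide evenly into heads"
    dk = d // num_heads
    q = char_vec.reshape(num_heads, dk)        # one query slice per head
    kv = word_vecs.reshape(-1, num_heads, dk)  # (n, heads, dk)
    out = np.empty_like(q)
    for h in range(num_heads):
        # scaled dot-product scores of the character against each word
        scores = kv[:, h, :] @ q[h] / np.sqrt(dk)      # (n,)
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                       # softmax over words
        out[h] = weights @ kv[:, h, :]                 # weighted value sum
    attended = out.reshape(d)   # concatenate the heads
    return char_vec + attended  # residual fusion with the character
```

With a single matched word the softmax weight is 1, so the fused vector reduces to the character embedding plus that word's embedding, which makes the residual-fusion behaviour easy to check.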
Received: 17 December 2021