Multi-label text classification based on graph embedding and region attention
1. Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. School of Electronic Science and Engineering, Nanjing University, Nanjing, Jiangsu 210023, China
Abstract: Traditional multi-label text classification methods tend to ignore the correlations among labels and between labels and texts, and they predict low-frequency labels poorly. To address these problems, a graph embedding technique and a region attention mechanism were used to mine the correlations between labels. An encoder-graph embedding and region attention-decoder model was proposed for multi-label text classification. BiLSTM was used as the encoder, and the label embedding matrix was generated by the graph embedding technique. Token-level and region-level information were combined by the region attention mechanism so that different parts of the text were considered when generating each label, which could potentially extract the association between text and label. A recurrent neural network (RNN) and a multilayer perceptron (MLP) were used as decoders and combined with the stochastic gradient method to improve multi-label classification. The experiments were carried out on the AAPD and RCV1-V2 datasets, and the relevant parameters were set according to the characteristics of each dataset. Micro-F1 and Hamming Loss were used as evaluation metrics to compare the proposed method with classical methods such as LP and CNN. The results show that the proposed method can predict low-frequency labels from higher-frequency ones, and that it achieves higher micro-F1 and lower Hamming Loss than the classical methods.
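For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how micro-F1 and Hamming Loss are commonly computed for multi-label predictions. It is not the authors' evaluation code: the function names `hamming_loss` and `micro_f1`, the 0/1 list-of-lists label format, and the toy data are assumptions made for illustration only.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one row per sample, one 0/1 entry per label


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of (sample, label) slots where prediction and truth differ."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """F1 computed from true/false positives pooled over all labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (hypothetical): 3 documents, 4 candidate labels each.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [1, 1, 0, 1]]

print(f"Hamming Loss: {hamming_loss(y_true, y_pred):.4f}")
print(f"micro-F1:     {micro_f1(y_true, y_pred):.4f}")
```

Because micro-F1 pools counts across all labels, frequent labels dominate the score, whereas Hamming Loss weights every label slot equally; lower Hamming Loss and higher micro-F1 both indicate better predictions.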
王进, 徐巍, 丁一, 孙开伟, 王利蕾. 基于图嵌入和区域注意力的多标签文本分类[J]. 江苏大学学报(自然科学版), 2022, 43(3): 310-318.
WANG Jin, XU Wei, DING Yi, SUN Kaiwei, WANG Lilei. Multi-label text classification based on graph embedding and region attention[J]. Journal of Jiangsu University (Natural Science Edition), 2022, 43(3): 310-318.