Abstract: Multi-label learning with label-specific features (LIFT) does not take label correlations into account in either its clustering stage or its classification stage. To address this, an algorithm for multi-label learning with label-specific features via label correlations (LFLC) is proposed. The label space is appended to the feature space before clustering, and a clustering ensemble that accounts for label correlations is used to construct label-specific features for each label, on which the classification model is built. A correlation matrix is used to build an undirected complete graph, from which correlations among label sets are mined, and a tree ensemble expresses the strong correlations among labels under multiple different structures. In the experiments, 10 data sets covering different domains are used, with Hamming Loss, Ranking Loss, One-error, Coverage, Average Precision and macro-AUC as evaluation metrics, and parameter sensitivity analysis and statistical hypothesis tests are carried out. The results show that LFLC, which combines the clustering ensemble with strong label correlations, generally performs better than the other multi-label algorithms compared.
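As an illustration of the graph step described in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the correlation measure or how individual trees are extracted and combined, so the sketch assumes cosine similarity between label columns and takes a single maximum spanning tree of the complete label graph (Prim's algorithm) as one plausible strongly correlated structure. The function names `cosine_correlation` and `maximum_spanning_tree` and the toy label matrix `Y` are hypothetical.

```python
import math

def cosine_correlation(Y):
    """Pairwise cosine similarity between the label columns of a 0/1 matrix.

    Assumption: cosine similarity stands in for the paper's (unspecified)
    correlation measure.
    """
    q = len(Y[0])
    cols = [[row[j] for row in Y] for j in range(q)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in cols]
    C = [[0.0] * q for _ in range(q)]
    for a in range(q):
        for b in range(q):
            if norms[a] > 0 and norms[b] > 0:
                dot = sum(x * y for x, y in zip(cols[a], cols[b]))
                C[a][b] = dot / (norms[a] * norms[b])
    return C

def maximum_spanning_tree(C, root=0):
    """Prim's algorithm on the undirected complete graph weighted by C.

    Returns a list of (parent, child, weight) edges. Ties are broken by
    scanning nodes in increasing index order.
    """
    q = len(C)
    in_tree = {root}
    edges = []
    while len(in_tree) < q:
        best = None
        for a in sorted(in_tree):
            for b in range(q):
                if b in in_tree:
                    continue
                if best is None or C[a][b] > best[2]:
                    best = (a, b, C[a][b])
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Toy label matrix (hypothetical): 5 instances, 4 labels.
Y = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]

C = cosine_correlation(Y)
tree = maximum_spanning_tree(C)
for parent, child, weight in tree:
    print(f"label {parent} -- label {child}: {weight:.3f}")
```

In this toy example, labels 0 and 1 always co-occur, so the edge between them carries the largest weight and is selected first. A tree ensemble in the sense of the abstract would require several such structures; how they are generated and combined is left open here.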