Journal of Shanghai University (Natural Science Edition) ›› 2022, Vol. 28 ›› Issue (2): 281-290. doi: 10.12066/j.issn.1007-2861.2243

• Research Articles •

Multi-label label-specific feature selection based on graph Laplacian

WU Zhejun, HUANG Rui

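  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
  • Received: 2020-03-07 Publication date: 2022-04-30 Release date: 2020-09-01
  • Corresponding author: HUANG Rui E-mail: huangr@shu.edu.cn
  • About the author: HUANG Rui (1976-), female, associate professor, Ph.D.; research interests: pattern recognition and intelligent information processing. E-mail: huangr@shu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61671283)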

Multi-label label-specific feature selection based on graph Laplacian

WU Zhejun, HUANG Rui

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China
  • Received: 2020-03-07 Online: 2022-04-30 Published: 2020-09-01
  • Contact: HUANG Rui E-mail: huangr@shu.edu.cn

Abstract:

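Multi-label feature selection can effectively remove redundant features and improve classification accuracy, and is an effective way to address the "curse of dimensionality". However, existing multi-label feature selection algorithms select the same features for all labels, ignoring the intrinsic relation between labels and features. In fact, each label has features that reflect attributes specific to that label, namely label-specific features. A multi-label label-specific feature selection algorithm based on graph Laplacian (LSGL) is proposed. For each class label, a low-dimensional embedding of the data is obtained via Laplacian eigenmaps; a projection matrix from the data space to the embedding space is then obtained through sparse regularization; the label-specific features of each label are determined by analyzing the matrix coefficients; and finally the label-specific features are used for classification. Results of multi-label feature selection and classification experiments on 5 public multi-label datasets demonstrate the effectiveness of the proposed algorithm.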

Key words: multi-label learning, feature selection, label-specific feature, graph Laplacian

Abstract:

Multi-label feature selection, which can effectively remove redundant features and improve classification performance, is an effective way to address the "curse of dimensionality". However, existing multi-label feature selection methods select the same features for all labels, ignoring the intrinsic relation between labels and features. In fact, each label has label-specific features that reflect the attributes particular to that label. This study proposes a feature selection method called multi-label label-specific feature selection based on graph Laplacian (LSGL). For each class label, LSGL first obtains a low-dimensional embedding of the instances based on Laplacian eigenmaps. Next, it obtains, through sparse regularization, a projection matrix that maps samples from the data space to the manifold embedding space. It then determines the label-specific features of the corresponding class label by analyzing the coefficients of this matrix. Finally, the label-specific features are used for classification. Experimental results for multi-label feature selection and classification on five public multi-label datasets demonstrate the effectiveness of the proposed algorithm.

Key words: multi-label learning, feature selection, label-specific feature, graph Laplacian
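
The pipeline outlined in the abstract (per-label embedding, sparse projection, coefficient analysis) might be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the abstract does not say how the graph is built or which sparse regularizer is used, so the kNN graph with label-dependent edge weights, the l2,1-norm penalty, the iteratively reweighted solver, the row-norm ranking, and all function and parameter names below are assumptions.

```python
import numpy as np


def knn_label_graph(X, y, k=5, cross_weight=0.1):
    """Symmetric kNN affinity matrix (assumed construction): edges between
    instances that disagree on the label get the smaller cross_weight."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                     # exclude self-neighbours
    nbrs = np.argsort(dist, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.where(y[rows] == y[cols], 1.0, cross_weight)
    return np.maximum(A, A.T)                          # symmetrise


def laplacian_eigenmap(A, dim=2):
    """Embedding from the smallest eigenvectors of the normalised Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)                    # eigenvalues in ascending order
    # Drop the first eigenvector (constant when the graph is connected).
    return d_inv_sqrt[:, None] * vecs[:, 1:dim + 1]


def sparse_projection(X, E, alpha=1.0, n_iter=30, eps=1e-8):
    """Minimise ||XW - E||_F^2 + alpha * ||W||_{2,1} by iteratively
    reweighted least squares (assumed regulariser and solver)."""
    G = np.eye(X.shape[1])
    XtX, XtE = X.T @ X, X.T @ E
    for _ in range(n_iter):
        W = np.linalg.solve(XtX + alpha * G, XtE)
        row_norms = np.sqrt(np.sum(W ** 2, axis=1))
        G = np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
    return W


def label_specific_features(X, Y_labels, n_select=3):
    """For each label column, return the indices of the n_select features
    whose rows of the projection matrix have the largest L2 norm."""
    Xc = X - X.mean(axis=0)
    selected = []
    for j in range(Y_labels.shape[1]):
        A = knn_label_graph(Xc, Y_labels[:, j])
        E = laplacian_eigenmap(A)
        E = (E - E.mean(axis=0)) / E.std(axis=0)       # rescale so alpha is comparable
        W = sparse_projection(Xc, E)
        scores = np.linalg.norm(W, axis=1)
        selected.append(np.argsort(scores)[::-1][:n_select])
    return selected


# Toy data (synthetic, for illustration only): two labels, eight features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
Y_labels = np.column_stack([X[:, 0] > 0, X[:, 3] > 0]).astype(int)
features_per_label = label_specific_features(X, Y_labels, n_select=3)
```

Each entry of `features_per_label` is the feature subset chosen for one label, so different labels may end up with different features; a separate classifier per label could then be trained on its own subset.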

CLC Number: