Journal of Shanghai University (Natural Science Edition) ›› 2022, Vol. 28 ›› Issue (3): 463-475. doi: 10.12066/j.issn.1007-2861.2375

• Machine Learning •

Feature selection based on reinforcement learning and its application in material informatics

ZHANG Peng1, ZHANG Rui1,2,3

    1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
    2. Center of Materials Informatics and Data Science, Materials Genome Institute, Shanghai University, Shanghai 200444, China
    3. Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China
  • Received: 2020-03-20 Online: 2022-06-30 Published: 2022-05-27
  • Contact: ZHANG Rui, E-mail: ruizhang@shu.edu.cn

Abstract:

Owing to the rapid development of big data, artificial intelligence, and high-performance computing, data-driven materials research and development has intensified. When material data are mined and used for machine learning, the feature set must be preprocessed to remove redundant and irrelevant features, which both avoids model overfitting and improves model interpretability. Herein, a feature selection method based on reinforcement learning, termed FSRL, is proposed. By abstracting the encapsulated (wrapper-style) feature selection method as an interaction between the machine learning model and the environment, the features that yield the maximum reward are selected and then incorporated into the feature subset. In addition, we propose a feature construction method based on symbolic transformation that generates new high-order features to improve the prediction accuracy of the model. We then apply these methods to a classification task on amorphous alloy materials and a regression task on aluminum matrix composite materials. Experiments show that the proposed method not only successfully achieves feature transformation within FSRL, but also affords a 2.8% prediction improvement in the classification task and a 22.9% prediction improvement in the regression task.
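
The two ideas in the abstract, reward-driven wrapper selection and symbolic feature construction, can be illustrated with a minimal Python sketch. This is not the authors' FSRL implementation. It makes several assumptions for illustration only: the learned reinforcement learning policy is replaced by a greedy one-step rule that always adds the feature with the largest reward, the reward is taken to be the gain in leave-one-out accuracy of a 1-nearest-neighbour classifier, and the only symbolic transformation is the pairwise product. The function names (loo_accuracy, select_by_reward, add_pairwise_products) and the toy data are hypothetical.

```python
from itertools import combinations


def loo_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the feature columns in `cols` (stand-in model)."""
    if not cols:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue
            d = sum((xi[c] - xj[c]) ** 2 for c in cols)
            if d < best_d:
                best_j, best_d = j, d
        hits += int(y[best_j] == y[i])
    return hits / len(X)


def select_by_reward(X, y):
    """Greedy wrapper selection: at each step, add the feature whose
    inclusion gives the largest reward (gain in model score).
    Assumption: a greedy rule stands in for a learned RL policy."""
    selected, score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scores = {f: loo_accuracy(X, y, selected + [f]) for f in remaining}
        best = max(scores, key=scores.get)
        if scores[best] - score <= 0:  # no positive reward left: stop
            break
        selected.append(best)
        remaining.remove(best)
        score = scores[best]
    return selected, score


def add_pairwise_products(X):
    """Symbolic feature construction (illustrative): append the product
    of every pair of original features as a new second-order feature."""
    pairs = list(combinations(range(len(X[0])), 2))
    return [row + [row[a] * row[b] for a, b in pairs] for row in X]


# Toy data (hypothetical): the class depends on the sign of x0 * x1.
X = [[1, 1], [2, 2], [-1, -1], [-2, -2], [1, -1], [2, -2], [-1, 1], [-2, 2]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

raw_selected, raw_score = select_by_reward(X, y)
X_aug = add_pairwise_products(X)
selected, score = select_by_reward(X_aug, y)
print(raw_selected, raw_score)
print(selected, score)
```

On this toy data neither raw feature earns a positive reward on its own, so the greedy rule selects nothing, whereas the constructed product feature (index 2) is selected with a leave-one-out accuracy of 1.0. A policy learned through repeated interaction, as the abstract describes, would not be limited to such one-step decisions.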

Key words: feature selection, reinforcement learning, feature construction method

CLC Number: