Journal of Shanghai University (Natural Science Edition) ›› 2020, Vol. 26 ›› Issue (5): 824-833. doi: 10.12066/j.issn.1007-2861.2089

• Research Article •

Two-dimensional material band gap prediction based on machine learning method

YOU Yang (游洋)1,2, DU Wan (杜婉)3, LI Weiju (李惟驹)1, CHEN Jingzhe (陈竞哲)1,2

  1. College of Sciences, Shanghai University, Shanghai 200444, China
  2. International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai 200444, China
  3. Materials Genome Institute, Shanghai University, Shanghai 200444, China
  • Received: 2018-09-12 Online: 2020-10-30 Published: 2020-11-06
  • Contact: CHEN Jingzhe E-mail: jingzhe@shu.edu.cn
  • Funding: Young Scientists Fund of the National Natural Science Foundation of China (11404206)

Abstract:

Density functional theory (DFT) is combined with machine learning (ML) to study the band gaps of two-dimensional metal compounds, yielding a band gap prediction method that is cheaper and more effective than traditional theoretical calculation. Taking band gaps calculated with the generalized gradient approximation in the Perdew-Burke-Ernzerhof form (GGA-PBE) and with G0W0 as reference, a data set of two-dimensional materials with the chemical formula MX2 is investigated. Band gap prediction models are built with machine learning methods including the least absolute shrinkage and selection operator (LASSO), support vector regression (SVR), and gradient boosting regression (GBR). Tests show that, for most of the two-dimensional materials, the SVR model with a linear kernel and the LASSO model perform relatively well, with a mean absolute error (MAE) of 0.34 eV on the training set and 0.5 eV on the test set. This indicates that the feature set adopted for two-dimensional material band gaps is reasonably complete and sound, and that the models offer a useful reference for the preliminary prediction of band gaps of new materials.

Key words: two-dimensional materials, first principles, machine learning, band gap
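
To make the LASSO and MAE terminology in the abstract concrete, the following is a minimal sketch of LASSO regression fitted by coordinate descent, with the mean absolute error reported on a training and a test split. It is not the authors' model or data: the three features, the synthetic "band gap" values, the penalty `alpha`, and all function names are assumptions chosen only for illustration, and the third feature is deliberately irrelevant so that the L1 penalty has something to shrink.

```python
import random


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values inside [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def predict(X, w, b):
    """Linear prediction b + X.w for every row of X."""
    return [b + sum(wk * xk for wk, xk in zip(w, row)) for row in X]


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_lasso(X, y, alpha=0.05, n_iter=200):
    """Minimise (1/2n)*sum((y - Xw - b)^2) + alpha*sum(|w|) by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for row, target in zip(X, y):
                # Residual with feature j's own contribution left out.
                partial = target - b - sum(w[k] * row[k] for k in range(p) if k != j)
                rho += row[j] * partial
                z += row[j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n) if z > 0 else 0.0
        # The intercept is not penalised: set it to the mean of y - X.w.
        b = sum(t - pr for t, pr in zip(y, predict(X, w, 0.0))) / n
    return w, b


rng = random.Random(0)  # fixed seed so the synthetic data are reproducible


def make_samples(n):
    """Synthetic data (assumption): gap in eV depends on features 0 and 1 only."""
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        gap = 3.0 + 1.5 * row[0] - 0.8 * row[1] + rng.gauss(0.0, 0.1)
        X.append(row)
        y.append(gap)
    return X, y


X_train, y_train = make_samples(60)
X_test, y_test = make_samples(20)

w, b = fit_lasso(X_train, y_train)
train_mae = mean_absolute_error(y_train, predict(X_train, w, b))
test_mae = mean_absolute_error(y_test, predict(X_test, w, b))

print("weights:", [round(v, 3) for v in w], "intercept:", round(b, 3))
print("train MAE (eV):", round(train_mae, 3), "test MAE (eV):", round(test_mae, 3))
```

The soft-thresholding step is what lets LASSO act as a feature selector: a coefficient whose correlation with the residual stays below `alpha` is set exactly to zero. Comparing the training and test MAE, as the abstract does, gives a rough indication of how well the fitted model generalises.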

CLC number: