Data Collection, Database and Data Processing

Constructing a material-domain knowledge graph based on natural language processing

1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
2. Center of Materials Informatics and Data Science, Materials Genome Institute, Shanghai University, Shanghai 200444, China
3. Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China

Received date: 2022-03-28

Published online: 2022-05-27

Abstract

Combining material-domain knowledge with machine learning methods is an urgent problem in materials intelligence. As an efficient way to organize knowledge, knowledge graphs (KGs) can represent, organize, and reason over material-domain knowledge, and thereby improve the intelligence level of machine-learning algorithms for materials. In this paper, we study natural language processing (NLP)-based knowledge-acquisition methods for materials and propose two of them: a joint extraction method for material entities and relationships based on a bidirectional gated recurrent unit, graph neural network, and conditional random field (Bi-GRU-GNN-CRF) model, and a material-processing knowledge-extraction method based on an improved TextRank algorithm. Using these methods, we acquire material-domain knowledge such as material entities, relationships, and technological processes from patents, papers, and other types of text. The experimental results show that the proposed knowledge-acquisition methods achieve good accuracy and recall and effectively improve the knowledge coverage of material KGs. The knowledge coverage of the material KGs constructed with the proposed methods reaches 80%, which provides more comprehensive knowledge support for materials research and development. We also construct domain KGs for special non-modulated steel, an aluminum matrix composite material, and a thermal-barrier ceramic-coating material, and the results further verify the potential of material KGs in materials research and development.
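
For readers unfamiliar with TextRank, the following is a minimal sketch of the baseline algorithm only, not the improved variant proposed in the paper, whose details the abstract does not give. It ranks sentences by running a weighted PageRank iteration over a sentence-similarity graph. The function names, the tokenizer, the damping factor of 0.85, and the example sentences are all illustrative assumptions.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalized by log sentence lengths (baseline TextRank measure).

    Sentences with fewer than two distinct tokens are treated as unconnected,
    which also avoids a zero or undefined denominator.
    """
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank(sentences, damping=0.85, iterations=50):
    """Return one score per sentence; a higher score means more central."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Symmetric weight matrix with no self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbor j passes on its score in proportion to edge weight.
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


# Hypothetical processing text: two related process steps and one off-topic sentence.
sentences = [
    "The steel billet is heated to 1200 C before forging.",
    "After forging, the steel billet is cooled in air.",
    "The weather was pleasant.",
]
scores = textrank(sentences)
ranked = sorted(zip(scores, sentences), reverse=True)
```

In this toy input the two process-related sentences share most of their vocabulary, so they reinforce each other and outrank the unrelated sentence. A process-extraction method would presumably add domain-specific weighting on top of this baseline, but that is an assumption rather than something stated in the abstract.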

Cite this article

WEI Xiao, WANG Xiaoxin, CHEN Yongqi, ZHANG Huiran. Constructing a material-domain knowledge graph based on natural language processing[J]. Journal of Shanghai University, 2022, 28(3): 386-398. DOI: 10.12066/j.issn.1007-2861.2380
