
Table of Contents

    30 June 2022, Volume 28 Issue 3
    Preface
    Materials informatics—data-driven materials research and development
    ZHANG Tongyi
    2022, 28(3):  357-360.  doi:10.12066/j.issn.1007-2861.2370
    Data Collection, Database and Data Processing
    High-precision data acquisition method based on Jaya optimization and calibration
    ZHANG Hesheng, JIAO Peng, HU Qirui, CAI Jiangqian, HU Shunbo, CAO He, OUYANG Qiubao
    2022, 28(3):  361-371.  doi:10.12066/j.issn.1007-2861.2372

    Materials genome engineering (MGE) integrates high-throughput experiments, high-throughput computations, databases, and artificial intelligence to accelerate the development of advanced materials. However, a reliable and effective method for acquiring data from experimental equipment has yet to be established in MGE. Because the calibration data of high-precision data acquisition systems are not synchronized in time, this study uses a linear model for the data processing parameters and takes the value displayed by the device as the true value when constructing the objective function for optimizing those parameters. The Jaya optimization algorithm is used to search for the optimal processing parameters. Taking the acquisition of equipment temperature as an example, a high-precision data acquisition system is constructed and verified experimentally. The experimental results show that, with the optimized model parameters, the average error of data acquisition is only 0.13 $^\circ$C, and the maximum accuracy is 99.89%. Compared with the non-optimized model parameters, the average error is reduced by 63.20%, which significantly improves the data acquisition accuracy.
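
    The idea of tuning a linear calibration model with Jaya can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the search range of the parameters, the population size, and the mean-squared-error objective are all assumptions made for the example. Each candidate (k, b) is moved toward the current best candidate and away from the current worst one, and a move is kept only if it lowers the error against the displayed values.

```python
import random


def jaya_fit_linear(raw, shown, pop_size=20, iters=300, seed=0):
    """Fit shown ~ k * raw + b with the Jaya update rule (hypothetical setup).

    Returns the best (k, b) found and the best cost recorded per iteration.
    """
    rng = random.Random(seed)

    def cost(params):
        # Mean squared error between the model output and the displayed value.
        k, b = params
        return sum((k * x + b - y) ** 2 for x, y in zip(raw, shown)) / len(raw)

    # Assumed search range for both parameters: [-5, 5].
    pop = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(pop_size)]
    history = []
    for _ in range(iters):
        costs = [cost(p) for p in pop]
        history.append(min(costs))
        best = pop[costs.index(min(costs))]
        worst = pop[costs.index(max(costs))]
        new_pop = []
        for p, c in zip(pop, costs):
            # Jaya move: toward the best solution, away from the worst one.
            cand = [
                v + rng.random() * (bv - abs(v)) - rng.random() * (wv - abs(v))
                for v, bv, wv in zip(p, best, worst)
            ]
            # Greedy acceptance: keep the candidate only if it is better.
            new_pop.append(cand if cost(cand) < c else p)
        pop = new_pop
    history.append(min(cost(p) for p in pop))
    return min(pop, key=cost), history
```

    Because acceptance is greedy, the best cost in the population never increases from one iteration to the next. Jaya has no algorithm-specific tuning parameters beyond the population size and iteration count, which is one reason it suits small calibration problems.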

    Material data named entity recognition based on matching contextual lexical words and graph convolution
    CHEN Qian, WU Xing
    2022, 28(3):  372-385.  doi:10.12066/j.issn.1007-2861.2377

    Literature pertaining to materials contains abundant information, and mining it using machine learning and natural language processing is currently being investigated extensively. Named entity recognition (NER) is the first step in mining and extracting information from such data so that the data can be used efficiently. Because vector representations cannot resolve words with multiple meanings, and models often extract contextual features while disregarding global features, a named entity recognition method based on matching contextual lexical words and graph convolution is proposed herein. First, the contextual dynamic features of the text are obtained using XLNet; second, the contextual and global features are obtained using a long short-term memory network and a graph convolutional network (GCN) combined with contextual lexical words of the text, respectively. Finally, a sequence of labels is output via a conditional random field. The model is validated using two different datasets. Experimental results on the material data show that the precision, recall, and F1 score are 90.05%, 88.67%, and 89.36%, respectively, indicating that the method effectively improves named entity recognition accuracy.

    Constructing a material-domain knowledge graph based on natural language processing
    WEI Xiao, WANG Xiaoxin, CHEN Yongqi, ZHANG Huiran
    2022, 28(3):  386-398.  doi:10.12066/j.issn.1007-2861.2380

    Determining how to combine material-domain knowledge with machine learning methods is an urgent problem in materials intelligence. As an efficient knowledge-organization method, knowledge graphs (KGs) can effectively represent, organize, and reason over material-domain knowledge so as to improve the intelligence level of machine-learning algorithms for materials. In this paper, we study natural language processing (NLP)-based knowledge-acquisition methods for materials and propose a joint extraction method for material entities and relationships based on a bidirectional-gated recurrent unit-graph neural network-conditional random field (Bi-GRU-GNN-CRF) and a material-processing knowledge-extraction method based on an improved TextRank algorithm. Using the proposed knowledge-acquisition method, we acquire material-domain knowledge such as material entities, relationships, and technological processes from patents, papers, and other types of texts. The experimental results show that the proposed knowledge-acquisition method has good accuracy and recall and can effectively improve the knowledge coverage of the material KGs. The knowledge coverage of the material KGs constructed using the proposed method reaches 80%, which provides more comprehensive knowledge support for materials research and development. We also construct domain KGs for a special non-modulated steel, an aluminum matrix composite material, and a thermal-barrier ceramic-coating material, and the results further verify the potential of using material KGs in materials research and development.

    Database for materials genome engineering
    YUE Xichao, FENG Yan, LIU Jian, YU Yeyong, XI Kangjie, QIAN Quan
    2022, 28(3):  399-412.  doi:10.12066/j.issn.1007-2861.2388

    Materials data are multi-source, heterogeneous, and high-dimensional. Acquiring diverse and complex materials data and establishing a dedicated database for materials genome engineering (MGE) are the foundation for realizing data-driven design of new materials. Herein, the materials genome database platform is introduced in terms of its system architecture, implementation, and deployment on a supercomputer. It is based on several core technologies, such as normalized representation of materials data, machine-learning modeling and cross-domain model deployment, machine learning under data privacy protection, and the transformation of a materials database into a knowledge base using a knowledge graph. Finally, taking an anti-perovskite negative expansion material as an example, the entire application process of the MGE database platform, from data curation to machine learning modeling, inverse design, and final experimental validation, is discussed comprehensively.

    Blockchain based data copyright protection and combinatorial auction
    XU Yuqin, QIAN Quan
    2022, 28(3):  413-426.  doi:10.12066/j.issn.1007-2861.2376

    A data copyright protection and combinatorial auction system was designed and implemented based on blockchain, digital watermarking, and a sealed-bid combinatorial auction. First, a smart contract was applied to store the copyright data and their transaction records on a blockchain, a decentralized technique that can achieve consensus among nodes that do not trust one another. Consequently, the system could be rendered more trustworthy. Moreover, a digital watermark, a unique code hidden in the data, was used to prove copyright ownership. The results showed that the watermark module used in the system barely altered the original data and was efficient when embedding and extracting watermarks. Finally, a combinatorial auction algorithm was designed to select an optimal bid combination automatically to implement the value exchange of data copyright.
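
    Selecting an optimal bid combination can be illustrated with a brute-force winner-determination sketch. The abstract does not describe the authors' algorithm, so the exhaustive search below is only a stand-in, and the bid format (bidder, set of data items, price) is an assumption. It returns the set of bids with the highest total price in which no data item is sold twice.

```python
from itertools import combinations


def best_bid_combination(bids):
    """Exhaustively pick non-overlapping bids with the highest total price.

    bids: list of (bidder, set_of_items, price) tuples (hypothetical format).
    Exponential in the number of bids, so suitable only for small examples.
    """
    best_value, best_combo = 0.0, ()
    for r in range(1, len(bids) + 1):
        for combo in combinations(bids, r):
            items = [item for _, wanted, _ in combo for item in wanted]
            if len(items) != len(set(items)):
                continue  # two bids claim the same item, so skip this combo
            value = sum(price for _, _, price in combo)
            if value > best_value:
                best_value, best_combo = value, combo
    return best_value, best_combo
```

    For example, with bids on `{"d1", "d2"}` for 10, `{"d1"}` for 6, `{"d2"}` for 5, and `{"d3"}` for 2, the two single-item bids plus the `d3` bid together beat the bundle bid. A practical system would replace the exhaustive loop with an integer-programming or heuristic solver.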

    Kalman filter based method for processing small noisy sample data
    LIU Fen, FAN Hongqiang, LÜ Tao, LI Qian, QIAN Quan
    2022, 28(3):  427-439.  doi:10.12066/j.issn.1007-2861.2379

    A method for processing small, noisy sample data based on the Kalman filter and the extended Kalman filter was proposed. The core idea was to establish a system model using physical models or empirical formulas, use the system model to predict the model data, and finally use the observation data to correct the model data, thereby smoothing data noise. Experimental results showed that when using the autoregressive integrated moving average (ARIMA) model and the random forest (RF) model to predict the corrosion weight gain of weathering steel BC500, the coefficient of determination $R^{2}$ increased by an average of 6.4% after Kalman filter denoising and by an average of 4.9% after extended Kalman filter denoising. These results verified the effectiveness of the proposed methods.
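
    The predict-then-correct cycle described above can be sketched with a scalar Kalman filter. This is a minimal illustration under stated assumptions, not the system model used in the paper: the state is assumed to follow a random walk with a constant drift, and the names `drift`, `q` (process noise variance), and `r` (observation noise variance) are hypothetical.

```python
def kalman_smooth(observations, drift=0.0, q=1e-3, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter with an assumed model x_t = x_{t-1} + drift."""
    x, p = x0, p0
    filtered = []
    for z in observations:
        # Predict: advance the state with the system model.
        x = x + drift
        p = p + q
        # Update: correct the prediction with the observation z.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        filtered.append(x)
    return filtered
```

    A larger `r` relative to `q` makes the filter trust the model more and smooth more aggressively; a smaller `r` makes it follow the observations. For a nonlinear system model, the extended Kalman filter replaces the fixed transition with its local linearization.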

    Ensemble learning of polypropylene-composite aging data
    WU Xing, GAO Jin, DING Peng
    2022, 28(3):  440-450.  doi:10.12066/j.issn.1007-2861.2382

    Aging experiments conducted on polypropylene composites have long durations, and only a limited number of samples can be collected in a single experiment. As a result, traditional machine-learning approaches have low prediction accuracy. To address these issues, we present an ensemble learning prediction method based on virtual sample generation (VSG). To generate valid virtual samples of aging data for polypropylene composites, we first adopted the Gaussian mixture model (GMM) method and then used the generated data set to build an ensemble-learning prediction model comprising the random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) algorithms. The LightGBM and CatBoost algorithms in the ensemble learning model demonstrate the best performance on the test data; the mean square errors are 0.001 3 and 0.000 1, respectively, which are 0.4 and 0.2 higher than those of the RF algorithm and XGBoost algorithm, respectively. The proposed VSG and ensemble learning approach for polypropylene composite aging not only overcomes the long experimental times and the insufficient number of data samples acquired in a single experiment but also outperforms a single machine-learning algorithm.

    Regression modeling and multi-objective optimization for small sample scattered data
    YAO Yu, HU Tao, FU Jianxun, HU Shunbo
    2022, 28(3):  451-462.  doi:10.12066/j.issn.1007-2861.2387

    Regression modeling on small-sample scattered data poses certain challenges. In this study, a Gaussian process is used for regression modeling, and maximum likelihood estimation is performed to learn the hyperparameters of the kernel function. The regression results, i.e., the mean and variance of the objective function, are predicted from the posterior. By combining these results with multi-objective optimization that includes the variance, the uncertainty of material reverse design can be estimated. Experimental verifications are conducted on 1215MS non-quenched and tempered steel and three-point bending concrete datasets. The results show that for the three-point bending concrete, 50% of the experimental data are within the 95% confidence interval of the prediction, and the Gaussian process regression (GPR) model can measure the uncertainty of the scattered small-sample data more effectively and yield reasonable predictions. For the 1215MS dataset, a non-dominated genetic algorithm with an elite strategy is used to perform multi-objective optimization based on the GPR model. The mechanical properties of the material and the corresponding variance are used as optimization objectives, so that optimal mechanical properties are sought while accounting for the effect of uncertainties on the experimental results. The optimal Pareto solution set is obtained and subsequently used as candidate points for the next experiment to assist material design and preparation optimization.
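
    The notion of a Pareto solution set over a predicted property and its variance can be sketched with a plain non-dominated filter. This is not the genetic algorithm used in the paper; it only shows the dominance test that such algorithms rely on. The encoding is an assumption: each candidate is a tuple of objectives to be minimized, so a property to be maximized is stored with its sign flipped, e.g. `(-predicted_strength, predicted_variance)`.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

    With hypothetical candidates `(-600.0, 4.0)`, `(-620.0, 9.0)`, `(-590.0, 5.0)`, and `(-610.0, 6.0)`, the third one is dropped because the first has both a higher predicted property and a lower variance; the remaining three represent different trade-offs between performance and uncertainty.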

    Machine Learning
    Feature selection based on reinforcement learning and its application in material informatics
    ZHANG Peng, ZHANG Rui
    2022, 28(3):  463-475.  doi:10.12066/j.issn.1007-2861.2375

    Owing to the rapid development of big data, artificial intelligence, and high-performance computing, data-driven materials research and development has intensified. During data mining and machine learning on material data, the feature set must be preprocessed by removing redundant and irrelevant features, which not only avoids model overfitting but also improves model interpretability. Herein, a feature selection method based on reinforcement learning, known as FSRL, is proposed. By abstracting the encapsulated feature selection method into the interaction between the machine learning model and the environment, the corresponding features are selected based on the maximum reward and then incorporated into the feature subset. In addition, we propose a feature construction method based on symbolic transformation to generate new high-order features to improve the prediction accuracy of the model. Subsequently, we apply the abovementioned method to the classification task of amorphous alloy materials and the regression task of aluminum matrix composite materials. Experiments show that our proposed method not only successfully achieves feature transformation in the FSRL but also affords a 2.8% prediction improvement in the classification task and a 22.9% prediction improvement in the regression task.

    Phase stability prediction of high entropy alloys in aluminum matrix composites based on feature engineering and machine learning
    HU Rui, LIU Qing, ZHANG Guangjie, LI Junjie, CHEN Xiaoyu, WEI Xiao, DAI Dongbo
    2022, 28(3):  476-484.  doi:10.12066/j.issn.1007-2861.2381

    Aluminum matrix composites offer many excellent properties and wide application prospects. High entropy alloys with a simple and stable phase can be used as reinforcement to prepare aluminum matrix composites with significantly improved performance. Herein, a new method based on feature engineering and machine learning is proposed to investigate the phase stability of high entropy alloys. This method uses feature engineering to determine the important factors affecting the target attributes and then selects the corresponding regression method to predict the phase stability. The model is trained on 50% of the data and then verified on the remaining data. The results show that this method is highly accurate in predicting the phase stability of high entropy alloys ($R^2=0.994$). In addition, this method can be used to identify key factors affecting phase stability.

    Prediction of pitting potential for stainless steel by support vector regression
    MAI Jiaqi, XU Pengcheng, DING Song, SUN Yangting, LU Wencong
    2022, 28(3):  485-491.  doi:10.12066/j.issn.1007-2861.2378

    Pitting corrosion is a primary corrosion type of stainless steel, and pitting potential is often used to evaluate the difficulty of corrosion of stainless steel. The pitting potential is affected by many factors. Based on the elemental composition and process parameters of stainless steel, support vector regression (SVR) was used to establish a model for predicting the pitting potential. The results showed that the correlation coefficient of the independent test set could reach 0.97 with the corresponding root mean square error (RMSE) of only 0.07. From the Pearson correlation analysis and sensitivity analysis, the element contents of Cr and Mo and the temperature had a crucial influence on the pitting potential, and a small amount of rare earth elements could improve the corrosion resistance of stainless steel.
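
    The two evaluation metrics quoted above, the correlation coefficient and the RMSE, can be computed as in the following sketch. The function names are chosen for this example, and the code is independent of the SVR model itself; it only shows how predicted and measured values on a test set would be compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

    The correlation coefficient is scale-free, whereas the RMSE carries the unit of the target, so the two are usually reported together.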

    Multi-modal data representation learning for ceramic coating materials
    WU Xing, HU Mingtao, DING Peng
    2022, 28(3):  492-503.  doi:10.12066/j.issn.1007-2861.2383

    Ceramic coatings have excellent temperature resistance, corrosion resistance, and wear resistance, among other advantages. Their thermal expansion coefficient and thermal conductivity are two properties directly related to their performance. To address the issues of high experimental costs and challenging test conditions, we propose a method to predict the performance of ceramic coating materials based on multimodal data representation learning. To enlarge the data set, this method uses the Gaussian mixture model virtual sample generation (GMMVSG) algorithm to generate samples that match the real ceramic-coating data distribution. The method extracts features from micro-structural image data using the very deep convolutional neural network VGG16, extracts features from structured data using TabNet, and fuses the two sets of features. The final prediction models, based on three machine learning algorithms (K-nearest neighbor (KNN), support vector regression (SVR), and multi-layer perceptron (MLP)), are established using the multimodal data representation to predict the thermal expansion coefficient and thermal conductivity, the performance indices of ceramic coatings. The experimental results show that the proposed multimodal-data representation-learning model has better prediction performance than the single-modal-data machine-learning model, and that the multimodal model based on the MLP predicts ceramic coating performance most accurately. In the test set, the mean absolute and mean square errors for the prediction of the thermal expansion coefficient are 0.026 6 and 0.001 7, respectively, and the mean absolute and mean square errors for the prediction of thermal conductivity are 0.017 9 and 0.000 7, respectively.
Our proposed learning method for multimodal data representation of ceramic coating materials effectively combines structured and unstructured data to learn both types of modal data with potentially shared information and successfully improves the pred.

    Two-stage ensemble learning model for predicting band gaps of composites
    XU Yan, HU Hongqing, LIU Xi, ZHANG Yufeng, DING Guangtai, ZHANG Huiran
    2022, 28(3):  504-511.  doi:10.12066/j.issn.1007-2861.2389

    The band gap is an important parameter that can affect the physical and chemical properties of perovskite oxide composites, such as their conductivity and photo-electricity. To identify new perovskites for different applications, their band gap should be predicted via machine learning. Herein, a two-stage ensemble learning model that can predict the band gap of perovskite oxide composites is proposed by combining multiple individual base learners via a certain strategy. The first stage involves individual test functions produced by multiple regression learners. All individual base learners and some specific descriptors are aggregated into an ensemble model in the second stage. Subsequently, a dataset comprising the data of 210 ABX$_3$-type perovskites is used to evaluate the proposed ensemble learning model. Results show that the proposed two-stage ensemble methodology can improve the generalization performance. Its successful application on ABX$_3$-type perovskites indicates the effectiveness and practicability of ensemble learning in material research.

    Data-driven based properties prediction and reverse design of aluminum matrix composites
    CHEN Shuizhou, WANG Xiaoshu, OUYANG Qiubao, ZHANG Rui
    2022, 28(3):  512-522.  doi:10.12066/j.issn.1007-2861.2386

    A data-driven approach was used to analyze the chemical composition and preparation process of the aluminum matrix composite SiCp(0.5CNT)/7075Al, together with its tensile strength and elongation. An integrated framework comprising eight machine learning algorithms was constructed to automatically perform parameter tuning and optimal model selection. Subsequently, an inverse design of the material was conducted. Experimental results showed that under a solution heat treatment at 470 ${^\circ}$C for 40 min and aging at 120 ${^\circ}$C for 15 h, the predicted tensile strength and elongation of SiCp(0.5CNT)/7075Al-1.0Mg were 617.48 MPa and 2.98%, respectively, whereas the experimental values were 647.0 MPa and 3.31%, respectively. The mean absolute percentage errors (MAPE) of the two mechanical properties between the predicted and experimental results were 4.56% and 9.97%, respectively. These results indicated the effectiveness of the data-driven method for the process optimization and property improvement of aluminum matrix composites.
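
    The reported percentage errors follow directly from the predicted and experimental values quoted above. The short check below uses only those four numbers; the function name is chosen for this example.

```python
def abs_pct_error(predicted, measured):
    """Absolute percentage error of a prediction relative to the measurement."""
    return abs(measured - predicted) / abs(measured) * 100.0


# Tensile strength: 617.48 MPa predicted vs. 647.0 MPa measured.
print(round(abs_pct_error(617.48, 647.0), 2))  # 4.56
# Elongation: 2.98% predicted vs. 3.31% measured.
print(round(abs_pct_error(2.98, 3.31), 2))  # 9.97
```

    Both results match the 4.56% and 9.97% stated in the abstract. With a single validation sample per property, the mean absolute percentage error reduces to this single-sample value.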

    Microstructure Image Recognition and Microstructure Analysis
    Recognition of topographic features of thermal barrier coating based on image processing
    LIU Yuhong, HAN Yuexing, WANG Yuyan, ZENG Yi
    2022, 28(3):  523-533.  doi:10.12066/j.issn.1007-2861.2371

    To address the shortcomings of manual detection of thermal barrier coating topographic features, such as complexity and large errors, a method for automatically identifying topographic features of thermal barrier coatings using machine vision and calculating topographic feature parameters is proposed in this study. First, splat contours are automatically extracted based on mathematical morphology, and the spread morphological parameters are calculated. The maximum interclass variance method is used to obtain the binary segmentation threshold, and the median filter and morphological operations are used to denoise the image and ensure a single splat. The connectivity of the splat is then obtained by contour extraction, and the solidity parameter of the splat is finally calculated according to the extracted contour. In addition, this study realizes automatic identification and length calculation of cracks in thermal barrier coatings based on a traversal search. First, the lamellae in the image are identified and removed, and the fractured crack is repaired by the closing operation. The crack skeleton is next obtained through image thinning, and each crack is then traversed and searched to complete the length calculation. The results show that the method effectively detects the splat profile and identifies cracks, has good resistance to noise interference, and can accurately calculate topographic feature parameters. Thus, this method can play a critical role in promoting the study of the deposition behavior of thermal spray droplets on the surfaces of substrates.
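
    The maximum interclass variance method (Otsu's method) used for the binary segmentation threshold can be sketched as follows. This is a generic pure-Python version operating on a flat list of integer gray levels, not the authors' code; the function name and the convention that pixels at or below the threshold form the background class are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance.

    pixels: flat iterable of integers in [0, levels). Pixels <= threshold
    form one class and pixels > threshold form the other.
    """
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight0, sum0 = 0, 0
    best_threshold, best_variance = 0, -1.0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue  # no pixels in the lower class yet
        weight1 = total - weight0
        if weight1 == 0:
            break  # every pixel is already in the lower class
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight0 * weight1 * (mean0 - mean1) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold
```

    The binary image is then obtained as `[value > threshold for value in pixels]`, after which median filtering and morphological operations can remove residual noise as described above.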

    Anti-counterfeit label detection algorithm based on lightweight network
    ZHANG Hongkun, HAN Yuexing, CHEN Qiaochuan, Wu Jinbo
    2022, 28(3):  534-544.  doi:10.12066/j.issn.1007-2861.2373

    Economic losses caused by counterfeit and pirated products have increased annually; in response, anti-counterfeiting technology has been improved continuously, and researchers are attempting to improve anti-counterfeiting detection. To alleviate the complex computation, high resource requirements, and long detection times of existing anti-counterfeiting detection methods, an anti-counterfeiting label identification and detection model based on a lightweight network is proposed herein. Convolutional neural networks (CNNs) are adopted for shape and texture recognition. During shape recognition, the size of the pooling layer is reduced to enhance the learning ability of the model. A coordinate attention (CA) module is used in texture classification to enhance the information acquired from a single feature map. The loss function is designed to enhance the model's ability to identify authentic samples. Finally, the prediction result is obtained using the maximum feature vector. Experimental results show that the maximum overall detection accuracy achievable by the proposed method is 95.67%, and that the detection time is significantly shorter than that of the conventional method.

    High-throughput X-ray characterization of discrete component samples of rare-earth-doped thermal barrier-coating materials
    WU Guang, SONG Xuemei, ZHANG Yifeng, ZENG Yi, FENG Zhenjie
    2022, 28(3):  545-551.  doi:10.12066/j.issn.1007-2861.2385

    With continuous research and development in materials science, the high-throughput X-ray characterization technique is widely used in material research and development to significantly improve the efficiency of developing new materials. In addition, it is a vital technique for material genome engineering research. A high-throughput X-ray characterization system was designed, and the effect of different rare-earth dopant contents on the structural phase stability of thermal barrier-coating materials was investigated. The X-ray source could perform high-precision $x$-$y$ two-dimensional plane translation. While maintaining the quality of the sample data, multiple discrete samples could be measured within a short time.

    High-throughput X-ray diffraction of La$_{1-x}$Sr$_{x}$TiO$_{3}$ thin films
    ZHANG Yifeng, WANG Yangzhou, WU Guang, CHEN Fei, FENG Zhenjie
    2022, 28(3):  552-557.  doi:10.12066/j.issn.1007-2861.2384

    Different from the traditional low-efficiency "trial-and-error" material development method, the high-throughput material synthesis and material characterization methods significantly accelerate the development and reformation of materials science. In this study, X-ray diffraction patterns of La$_{1-x}$Sr$_{x}$TiO$_{3}$ thin films were obtained through high-throughput X-ray diffraction experiments. The continuous change in the La$_{1-x}$Sr$_{x}$TiO$_{3}$ thin film was analyzed. This study will guide future experimental studies on various types of high-throughput X-ray diffraction.

    Analysis of nozzle clogging in the production of non-quenched and tempered, tellurium-containing 38MnVS6 steel
    SHEN Ping, LI Jie, ZHANG Hao, FU Jianxun
    2022, 28(3):  558-568.  doi:10.12066/j.issn.1007-2861.2374

    To control the morphology and distribution of sulfide inclusions in steel and to improve the product quality, calcium treatment was replaced by tellurium treatment, yielding high-quality non-quenched and tempered, tellurium-containing 38MnVS6 steel. However, the production process was hampered by nozzle clogging. To determine its cause, X-ray diffraction (XRD) analysis, scanning electron microscopy analysis, and thermodynamic calculations were carried out. The relationship between the main phases of the clogs and the inclusions in the steel was established, and the effect of tellurium treatment on nozzle clogging was explored. The results show that the clogs are mainly composed of CaO$\cdot $2Al$_{2}$O$_{3}$ and MgO$\cdot $Al$_{2}$O$_{3}$, have a composition similar to that of the oxide inclusions, and do not contain any tellurium phases. Therefore, nozzle clogging is not directly caused by tellurium. Because the steel is treated with tellurium instead of calcium, the calcium content in the steel is too low to transform Al$_{2}$O$_{3}$ into low-melting-point 12CaO$\cdot $7Al$_{2}$O$_{3}$. The main calcium aluminate inclusion produced at the specified Al and Ca contents is CaO$\cdot $2Al$_{2}$O$_{3}$. In addition, owing to the small amount of residual Mg ($0.25\times10^{-6}\sim 1.46\times10^{-6}$) in the steel, Al$_{2}$O$_{3}$ is converted into MgO$\cdot $Al$_{2}$O$_{3}$. When the molten steel flows through the nozzle, CaO$\cdot $2Al$_{2}$O$_{3}$ and MgO$\cdot $Al$_{2}$O$_{3}$ sinter and adhere to each other on the inner wall of the nozzle. The inclusions continuously accumulate and gradually thicken, finally resulting in nozzle clogging.