Data Collection, Database and Data Processing

Ensemble learning of polypropylene-composite aging data

    1. School of Computer Science and Engineering, Shanghai University, Shanghai 200444, China
    2. Zhejiang Laboratory, Hangzhou 311100, Zhejiang, China
    3. Research Center of Nanoscience and Nanotechnology, College of Sciences, Shanghai University, Shanghai 200444, China
    4. Center of Materials Informatics and Data Science, Materials Genome Institute, Shanghai University, Shanghai 200444, China

Received date: 2022-03-26

Online published: 2022-05-27

Abstract

Aging experiments on polypropylene composites take a long time, and only a limited number of samples can be collected in a single experiment. As a result, traditional machine-learning approaches achieve low prediction accuracy. To address these issues, we present an ensemble-learning prediction method based on virtual sample generation (VSG). We first use a Gaussian mixture model (GMM) to generate valid virtual samples of aging data for polypropylene composites, and then use the generated data set to build an ensemble-learning prediction model comprising the random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) algorithms. Within the ensemble-learning model, LightGBM and CatBoost perform best on the test data, with mean squared errors of 0.0013 and 0.0001, respectively, reported as improvements of 0.4 and 0.2 over the RF and XGBoost algorithms, respectively. The proposed VSG and ensemble-learning approach not only overcomes the long experimental times and the small number of samples available from a single experiment but also outperforms a single machine-learning algorithm.
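The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function name `generate_virtual_samples` and all parameter values are hypothetical, and scikit-learn's `GradientBoostingRegressor` stands in for the XGBoost, LightGBM, and CatBoost libraries so that the example depends only on NumPy and scikit-learn. The sketch fits a GMM to the joint feature-target rows of a small training set, samples virtual rows from it, and compares tree-ensemble regressors trained on the enlarged set by their mean squared error on held-out real data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a small aging data set (assumption, not real data):
# three features and one target per sample.
X_real = rng.uniform(0.0, 1.0, size=(40, 3))
y_real = (1.0 - 0.5 * X_real[:, 0] + 0.2 * X_real[:, 1] ** 2
          + rng.normal(0.0, 0.01, size=40))

# Hold out real samples for testing; virtual samples are used for training only.
X_train, X_test, y_train, y_test = train_test_split(
    X_real, y_real, test_size=0.25, random_state=0
)


def generate_virtual_samples(X, y, n_virtual, n_components=3, seed=0):
    """Fit a GMM to the joint (features, target) rows and draw new rows from it."""
    joint = np.column_stack([X, y])
    gmm = GaussianMixture(
        n_components=n_components, covariance_type="full", random_state=seed
    )
    gmm.fit(joint)
    samples, _ = gmm.sample(n_virtual)  # second value holds component labels
    return samples[:, :-1], samples[:, -1]


X_virt, y_virt = generate_virtual_samples(X_train, y_train, n_virtual=200)

# Enlarged training set: real training samples plus virtual samples.
X_aug = np.vstack([X_train, X_virt])
y_aug = np.concatenate([y_train, y_virt])

# Tree-ensemble regressors; the boosting model is a stand-in (assumption).
models = {
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

mse = {}
for name, model in models.items():
    model.fit(X_aug, y_aug)
    mse[name] = mean_squared_error(y_test, model.predict(X_test))

for name, value in mse.items():
    print(f"{name}: test MSE = {value:.5f}")
```

Sampling from the joint distribution of features and target keeps each virtual target consistent with its virtual features. The number of mixture components and the number of virtual samples are tuning choices that would need to be validated on real data, for example by cross-validation.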

Cite this article

WU Xing, GAO Jin, DING Peng. Ensemble learning of polypropylene-composite aging data[J]. Journal of Shanghai University, 2022, 28(3): 440-450. DOI: 10.12066/j.issn.1007-2861.2382
