Impact of dataset uncertainties on machine learning model predictions: the example of polymer glass transition temperatures

2019, Vol 27 (2), pp. 024002
Author(s): Anurag Jha, Anand Chandrasekaran, Chiho Kim, Rampi Ramprasad
2020, Vol 188, pp. 92-100
Author(s): Edesio Alcobaça, Saulo Martiello Mastelini, Tiago Botari, Bruno Almeida Pimentel, Daniel Roberto Cassar, ...

Materials, 2020, Vol 13 (24), pp. 5701
Author(s): Zhuoying Jiang, Jiajie Hu, Babetta L. Marrone, Ghanshyam Pilania, Xiong (Bill) Yu

The purpose of this study was to develop a data-driven machine learning model to predict the performance properties of polyhydroxyalkanoates (PHAs), a group of biosourced polyesters featuring excellent performance, to guide future design and synthesis experiments. A deep neural network (DNN) machine learning model was built for predicting the glass transition temperature, Tg, of PHA homo- and copolymers. Molecular fingerprints were used to capture the structural and atomic information of PHA monomers. The other input variables included the molecular weight, the polydispersity index, and the percentage of each monomer in the homo- and copolymers. The results indicate that the DNN model achieves high accuracy in estimating the glass transition temperature of PHAs. In addition, the symmetry of the DNN model is ensured by incorporating symmetry data in the training process. The DNN model achieved better performance than the support vector machine (SVM), a nonlinear ML model, and the least absolute shrinkage and selection operator (LASSO), a sparse linear regression model. The relative importance of the factors affecting the DNN model prediction was analyzed. The sensitivity of the DNN model, including strategies to deal with missing data, was also investigated. Compared with commonly used machine learning models incorporating quantitative structure–property relationships (QSPR), the DNN model does not require an explicit descriptor selection step but shows comparable performance. The machine learning model framework can be readily extended to predict other properties.
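The abstract lists the model inputs (monomer fingerprints, molecular weight, polydispersity index, monomer percentages) and states that symmetry was ensured by adding symmetry data to the training set, but it does not say how those data were built. The sketch below shows one plausible reading, assuming "symmetry" means that a copolymer's predicted Tg should not depend on which monomer is listed first. The function names, the record layout, and the feature ordering are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# One training sample: (feature vector, Tg). Layout is an assumption for illustration.
Sample = Tuple[List[float], float]
# One raw record: (fingerprint A, fingerprint B, % A, % B, molecular weight, PDI, Tg).
Record = Tuple[Sequence[float], Sequence[float], float, float, float, float, float]


def build_features(fp_a: Sequence[float], fp_b: Sequence[float],
                   pct_a: float, pct_b: float,
                   mol_weight: float, pdi: float) -> List[float]:
    """Concatenate both monomer fingerprints with composition and chain descriptors."""
    return [*fp_a, *fp_b, pct_a, pct_b, mol_weight, pdi]


def with_symmetric_copies(records: Sequence[Record]) -> List[Sample]:
    """Return each sample plus a copy with monomers A and B swapped (same Tg).

    Assumption: training on both orderings pushes the model toward
    order-invariant predictions; the paper may construct its symmetry data differently.
    """
    samples: List[Sample] = []
    for fp_a, fp_b, pct_a, pct_b, mw, pdi, tg in records:
        original = build_features(fp_a, fp_b, pct_a, pct_b, mw, pdi)
        swapped = build_features(fp_b, fp_a, pct_b, pct_a, mw, pdi)
        samples.append((original, tg))
        if swapped != original:  # skip exact duplicates (identical monomers, 50/50)
            samples.append((swapped, tg))
    return samples


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the study.
    toy = [([1.0, 0.0], [0.0, 1.0], 80.0, 20.0, 1.5e5, 2.0, 3.0)]
    for features, tg in with_symmetric_copies(toy):
        print(features, tg)
```

The augmented samples could then be passed to any regressor. Augmentation of this kind only encourages order invariance; it does not strictly guarantee it.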

