General QSPR Protocol for Atomic/Inter-atomic Properties Predictions: Fragments based Graph Convolutional Neural Network (F-GCN)

2021 ◽  
Author(s):  
Peng Gao ◽  
Jie Zhang ◽  
Hongbo Qiu ◽  
Shuaifei Zhao

In this study, we developed a general quantitative structure-property relationship (QSPR) protocol, the fragment-based graph convolutional neural network (F-GCN), for predicting atomic and inter-atomic properties. We applied this novel artificial intelligence (AI) tool to the prediction of NMR chemical shifts and bond dissociation energies (BDEs). The predicted results were comparable to experimental measurements, while the computational cost was substantially lower than that of pure density functional theory (DFT) calculations. F-GCN has two important features. First, it can use different levels of molecular fragments centered on the target chemical bonds to extract atomic and inter-atomic information. Second, the architecture can incorporate additional descriptors for a more accurate description of the chemical environment, which makes it more efficient at describing local properties. In our tests, the average prediction error for 1H NMR chemical shifts was as small as 0.32 ppm, and the error for C-H BDE estimates was 2.7 kcal/mol. We further demonstrated the applicability of the F-GCN model through several challenging structural assignments. The success of F-GCN in atomic and inter-atomic predictions also indicates that AI tools can substantially improve computational chemistry.
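The idea of bond-centered fragments at different levels can be illustrated with a small graph search. The sketch below is only one plausible reading of the abstract, not the authors' fragmentation scheme: it assumes a molecule is given as an adjacency dictionary and that a "level" means the number of bonds walked outward from either atom of the target bond. The function name `bond_centered_fragment` and the atom indexing are hypothetical.

```python
from collections import deque


def bond_centered_fragment(adjacency, bond, level):
    """Return the atoms within `level` bonds of either atom of `bond`.

    `adjacency` maps an atom index to the indices of its bonded neighbours.
    This is an illustrative assumption, not the scheme used in the paper.
    """
    a, b = bond
    if b not in adjacency[a]:
        raise ValueError("the two atoms are not bonded")
    # Multi-source breadth-first search starting from both bond atoms.
    distance = {a: 0, b: 0}
    queue = deque([a, b])
    while queue:
        atom = queue.popleft()
        if distance[atom] == level:
            continue  # do not expand beyond the requested level
        for neighbour in adjacency[atom]:
            if neighbour not in distance:
                distance[neighbour] = distance[atom] + 1
                queue.append(neighbour)
    return set(distance)


# Toy graph with hypothetical indexing: C0-C1-O2 chain, H3 attached to C0.
ethanol = {0: [1, 3], 1: [0, 2], 2: [1], 3: [0]}

for level in range(3):
    # Fragments of growing size around the C0-H3 bond.
    print(level, sorted(bond_centered_fragment(ethanol, (0, 3), level)))
```

Each level yields a larger neighbourhood of the same bond, so a model could in principle be fed several nested fragments per target.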


2019 ◽  
Author(s):  
Peng Gao ◽  
Jun Zhang ◽  
Qian Peng ◽  
Vassiliki-Alexandra Glezakou

Accurate prediction of NMR chemical shifts at affordable computational cost is of great importance for rigorous structural assignments in experimental studies. However, the most popular computational schemes for NMR calculation, which are based on density functional theory (DFT) and gauge-including atomic orbital (GIAO) methods, still suffer from ambiguities in structural assignments. Using state-of-the-art machine learning (ML) techniques, we have developed a DFT+ML model that is capable of predicting 13C/1H NMR chemical shifts of organic molecules with high accuracy. The input for this generalizable DFT+ML model contains two critical parts: a vector describing the chemical environment, which can be evaluated without knowing the exact geometry of the molecule, and the DFT-calculated isotropic shielding constant. The DFT+ML model was trained on a dataset containing 476 13C and 270 1H experimental chemical shifts. For the DFT methods used here, the root-mean-square deviations (RMSDs) between predicted and experimental 13C/1H chemical shifts are as small as 2.10/0.18 ppm, much lower than those of typical DFT (5.54/0.25 ppm) or DFT+linear regression (4.77/0.23 ppm) approaches. The model also has smaller RMSDs and maximum absolute errors than two previously reported NMR-predicting ML models. We tested the robustness of the model on two classes of organic molecules (TIC10 and hyacinthacines), for which we unambiguously assigned the correct isomers to the experimental ones. This DFT+ML model is a promising way of predicting NMR chemical shifts and can be easily adapted to calculate shifts for any chemical compound.
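For orientation, the DFT+linear regression baseline mentioned above can be sketched as an ordinary least-squares fit of experimental shifts against computed isotropic shieldings, followed by an RMSD evaluation. This is a generic illustration, not the authors' code: the function names are hypothetical and the numbers are invented purely to make the example runnable.

```python
from math import sqrt


def fit_linear_scaling(shieldings, shifts):
    """Least-squares fit of shift = slope * shielding + intercept."""
    n = len(shieldings)
    mean_x = sum(shieldings) / n
    mean_y = sum(shifts) / n
    sxx = sum((x - mean_x) ** 2 for x in shieldings)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(shieldings, shifts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return sqrt(sum(squared) / len(squared))


# Invented values for illustration only; not data from the paper.
sigma = [160.0, 120.0, 60.0, 30.0]        # DFT isotropic shieldings (ppm)
delta_exp = [25.0, 66.0, 128.0, 158.0]    # experimental shifts (ppm)

slope, intercept = fit_linear_scaling(sigma, delta_exp)
delta_pred = [slope * s + intercept for s in sigma]
print(f"slope={slope:.3f}, intercept={intercept:.1f}, "
      f"RMSD={rmsd(delta_pred, delta_exp):.2f} ppm")
```

A DFT+ML model of the kind described would replace the two fitted parameters with a learned function that also takes the chemical-environment vector as input.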


2020 ◽  
Author(s):  
Yasemin Yesiltepe ◽  
Niri Govind ◽  
Thomas O. Metz ◽  
Ryan Renslow

The majority of primary and secondary metabolites in nature have yet to be identified, representing a major challenge for metabolomics studies that currently require reference libraries from analyses of authentic compounds. Using currently available analytical methods, complete chemical characterization of metabolomes is infeasible for both technical and economic reasons. For example, unambiguous identification of metabolites is limited by the availability of authentic chemical standards, which, for the majority of molecules, do not exist. Computationally predicted or calculated data are a viable solution to expand the currently limited metabolite reference libraries, if such methods are shown to be sufficiently accurate. For example, determining nuclear magnetic resonance (NMR) spectra in silico has shown promise in the identification and delineation of metabolite structures. Many researchers have been taking advantage of density functional theory (DFT), a computationally inexpensive yet reputable method for the prediction of carbon and proton NMR spectra of metabolites. However, such methods are expected to have some error in predicted 13C and 1H NMR spectra with respect to experimentally measured values. This leads us to the question: what accuracy is required in predicted 13C and 1H NMR chemical shifts for confident metabolite identification? Using the set of 11,716 small molecules found in the Human Metabolome Database (HMDB), we simulated both experimental and theoretical NMR chemical shift databases. We investigated the level of accuracy required for identification of metabolites in simulated pure and impure samples by matching predicted chemical shifts to experimental data. We found that 90% or more of molecules in simulated pure samples can be successfully identified when errors of 1H and 13C chemical shifts are below 0.6 ppm and 7.1 ppm, respectively, in water, and below 0.5 ppm and 4.6 ppm, respectively, in chloroform. In simulated complex mixtures, greater accuracy of the calculated chemical shifts was required as the complexity of the mixture increased, as expected. However, if the number of molecules in the mixture is known, e.g., when NMR is combined with MS and sample complexity is low, the likelihood of confident molecular identification increased by 90%.
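Matching predicted shifts to experimental data under an error tolerance can be sketched as follows. The abstract does not describe the actual matching procedure, so this is an assumed, simplified scheme: shifts are paired after sorting, a candidate is accepted only if every pair agrees within the tolerance, and the accepted candidate with the lowest mean absolute error wins. The function names, candidate names, and shift values are all hypothetical; only the 0.6 ppm tolerance echoes the 1H threshold quoted for water.

```python
def match_score(predicted, observed, tolerance):
    """Mean absolute error if all sorted pairs agree within tolerance, else None."""
    if len(predicted) != len(observed):
        return None
    errors = [abs(p - o) for p, o in zip(sorted(predicted), sorted(observed))]
    if any(error > tolerance for error in errors):
        return None
    return sum(errors) / len(errors)


def identify(library, observed, tolerance):
    """Return the best-matching library entry, or None if nothing matches."""
    scores = {}
    for name, shifts in library.items():
        score = match_score(shifts, observed, tolerance)
        if score is not None:
            scores[name] = score
    return min(scores, key=scores.get) if scores else None


# Invented 1H shift lists (ppm) for two hypothetical candidates.
library = {
    "candidate_A": [1.20, 3.65, 4.80],
    "candidate_B": [2.10, 2.50, 7.30],
}
observed = [1.05, 3.90, 4.70]

print(identify(library, observed, tolerance=0.6))
```

Tightening `tolerance` in such a scheme mimics the effect of requiring more accurate predictions: fewer candidates pass, and ambiguous matches become rarer.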


Materials ◽  
2018 ◽  
Vol 11 (9) ◽  
pp. 1646 ◽  
Author(s):  
Ilia Ponomarev ◽  
Peter Kroll

We investigate 29Si nuclear magnetic resonance (NMR) chemical shifts, δiso, of silicon nitride. Our goal is to relate the local structure to the NMR signal and, thus, provide the means to extract more information from the experimental 29Si NMR spectra in this family of compounds. We apply structural modeling and the gauge-including projector augmented wave (GIPAW) method within density functional theory (DFT) calculations. Our models comprise known and hypothetical crystalline Si3N4, as well as amorphous Si3N4 structures. We find good agreement with available experimental 29Si NMR data for tetrahedral Si[4] and octahedral Si[6] in crystalline Si3N4, predict the chemical shift of a trigonal-bipyramidal Si[5] to be about −120 ppm, and quantify the impact of Si-N bond lengths on 29Si δiso. We show through computations that experimental 29Si NMR data indicate that silicon dicarbodiimide, Si(NCN)2, exhibits bent Si-N-C units with angles of about 143° in its structure. A detailed investigation of amorphous silicon nitride shows that an observed peak asymmetry relates to the proximity of a fifth N neighbor at a non-bonding distance of between 2.5 and 2.8 Å from Si. We reveal the impact of both the Si-N(H)-Si bond angle and the Si-N bond length on 29Si δiso in a hydrogenated silicon nitride structure, silicon diimide Si(NH)2.


Inventions ◽  
2021 ◽  
Vol 6 (4) ◽  
pp. 70
Author(s):  
Elena Solovyeva ◽  
Ali Abdullah

In this paper, we present the structure of a separable convolutional neural network for binary and multiclass text classification. The network consists of an embedding layer, separable convolutional layers, a convolutional layer, and global average pooling. The advantage of the proposed structure is the absence of multiple fully connected layers, which are used to increase classification accuracy but raise the computational cost. The combination of low-cost separable convolutional layers and a convolutional layer is proposed to achieve high accuracy while reducing the complexity of neural classifiers. The advantages are demonstrated on binary and multiclass classification of written texts using the proposed networks with sigmoid and softmax activation functions in the convolutional layer. In both binary and multiclass classification, the separable convolutional neural networks achieve higher accuracy than some of the investigated recurrent neural networks and fully connected networks.
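The "low-cost" claim can be made concrete with a weight count. The sketch below assumes that "separable" means a depthwise separable 1D convolution (a per-channel depthwise filter followed by a 1x1 pointwise convolution, depth multiplier 1) and ignores biases; the abstract does not state these details, and the layer sizes are illustrative rather than taken from the paper.

```python
def conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a standard 1D convolution (biases ignored)."""
    return kernel_size * in_channels * out_channels


def separable_conv1d_params(kernel_size, in_channels, out_channels):
    """Weights in a depthwise separable 1D convolution (biases ignored)."""
    depthwise = kernel_size * in_channels      # one filter per input channel
    pointwise = in_channels * out_channels     # 1x1 convolution mixing channels
    return depthwise + pointwise


# Illustrative sizes only; not taken from the paper.
kernel_size, in_channels, out_channels = 5, 128, 128

standard = conv1d_params(kernel_size, in_channels, out_channels)
separable = separable_conv1d_params(kernel_size, in_channels, out_channels)
print(f"standard: {standard} weights, separable: {separable} weights, "
      f"ratio: {standard / separable:.1f}x")
```

Under these assumptions the saving grows with kernel size and channel count, which is consistent with using separable layers to reduce classifier complexity.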

