QM7-X, a comprehensive dataset of quantum-mechanical properties spanning the chemical space of small organic molecules

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Johannes Hoja ◽  
Leonardo Medrano Sandonas ◽  
Brian G. Ernst ◽  
Alvaro Vazquez-Mayagoitia ◽  
Robert A. DiStasio ◽  
...  

We introduce QM7-X, a comprehensive dataset of 42 physicochemical properties for ≈4.2 million equilibrium and non-equilibrium structures of small organic molecules with up to seven non-hydrogen (C, N, O, S, Cl) atoms. To span this fundamentally important region of chemical compound space (CCS), QM7-X includes an exhaustive sampling of (meta-)stable equilibrium structures, comprising constitutional/structural isomers and stereoisomers, e.g., enantiomers and diastereomers (including cis-/trans- and conformational isomers), as well as 100 non-equilibrium structural variations thereof to reach a total of ≈4.2 million molecular structures. Computed at the tightly converged quantum-mechanical PBE0+MBD level of theory, QM7-X contains global (molecular) and local (atom-in-a-molecule) properties ranging from ground state quantities (such as atomization energies and dipole moments) to response quantities (such as polarizability tensors and dispersion coefficients). By providing a systematic, extensive, and tightly converged dataset of quantum-mechanically computed physicochemical properties, we expect that QM7-X will play a critical role in the development of next-generation machine-learning-based models for exploring greater swaths of CCS and performing in silico design of molecules with targeted properties.

1973 ◽  
Vol 27 (1) ◽  
pp. 30-40 ◽  
Author(s):  
Joseph Schechter ◽  
Peter C. Jurs

An empirical method employing computerized pattern recognition techniques has been applied to the generation of simulated mass spectra of small organic molecules. Molecular structures are represented in computer-compatible form through the use of a fragmentation code which assigns code designations to specific groups of atoms and/or bonds within the molecules. Using such descriptions of molecules, pattern classifiers have been developed to predict the presence or absence of mass spectral peaks in each of 60 nominal m/e positions and to give a measure of the intensity of peaks in 11 of these. Information in the molecular descriptor lists which correlates with the appearance of specific peaks is shown to be present in relatively few of the descriptors developed. To test the complete system, a number of entire mass spectra were developed; in this test, 93% of the classifications were made correctly.


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Tim Würger ◽  
Di Mei ◽  
Bahram Vaghefinazari ◽  
David A. Winkler ◽  
Sviatlana V. Lamaka ◽  
...  

Small organic molecules that modulate the degradation behavior of Mg constitute benign and useful materials to modify the service environment of light metal materials for specific applications. The vast chemical space of potentially effective compounds can be explored by machine learning-based quantitative structure-property relationship models, accelerating the discovery of potent dissolution modulators. Here, we demonstrate how unsupervised clustering of a large number of potential Mg dissolution modulators by structural similarities and sketch-maps can predict their experimental performance using a kernel ridge regression model. We compare the prediction accuracy of this approach to that of a prior artificial neural networks study. We confirm the robustness of our data-driven model by blind prediction of the dissolution modulating performance of 10 untested compounds. Finally, a workflow is presented that facilitates the automated discovery of chemicals with desired dissolution modulating properties from a commercial database. We subsequently prove this concept by blind validation of five chemicals.


Synlett ◽  
2021 ◽  
Author(s):  
Chuan He ◽  
Jiefeng Zhu

A Rh-catalyzed enantioselective intermolecular dehydrogenative Si−O coupling of dihydrosilanes with alcohols and silanols is demonstrated. A Rh(I) catalyst equipped with a Josiphos ligand enables the highly enantioselective alcoholysis of dihydrosilanes, giving access to a variety of functionalized triorgano-substituted silicon-stereogenic alkoxysilanes and siloxanes in decent yields and ee, which significantly expand the chemical space of silicon-centered chiral molecules. The utility of this methodology is illustrated by the construction of CPL-active (circularly polarized luminescence) chiral alkoxysilane small organic molecules.


Author(s):  
Joshua Horton ◽  
Alice Allen ◽  
Leela Dodda ◽  
Daniel Cole

Modern molecular mechanics force fields are widely used for modelling the dynamics and interactions of small organic molecules using libraries of transferable force field parameters. For molecules outside the training set, parameters may be missing or inaccurate, and in these cases, it may be preferable to derive molecule-specific parameters. Here we present an intuitive parameter derivation toolkit, QUBEKit (QUantum mechanical BEspoke Kit), which enables the automated generation of system-specific small molecule force field parameters directly from quantum mechanics. QUBEKit is written in Python and combines the latest QM parameter derivation methodologies with a novel method for deriving the positions and charges of off-center virtual sites. As a proof of concept, we have re-derived a complete set of parameters for 109 small organic molecules, and assessed the accuracy by comparing computed liquid properties with experiment. QUBEKit gives highly competitive results when compared to standard transferable force fields, with mean unsigned errors of 0.024 g/cm³, 0.79 kcal/mol and 1.17 kcal/mol for the liquid density, heat of vaporization and free energy of hydration, respectively. This indicates that the derived parameters are suitable for molecular modelling applications, including computer-aided drug design.
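The accuracy metric quoted in this abstract, the mean unsigned error (MUE), is the average absolute deviation between computed and experimental values. A minimal sketch of that calculation is given here; the function name and the sample numbers are hypothetical and chosen only for illustration, and they are neither QUBEKit code nor data from the study.

```python
def mean_unsigned_error(computed, experimental):
    """Return the mean of |computed - experimental| over paired values."""
    if len(computed) != len(experimental):
        raise ValueError("computed and experimental must have the same length")
    if not computed:
        raise ValueError("at least one pair of values is required")
    total = sum(abs(c - e) for c, e in zip(computed, experimental))
    return total / len(computed)


# Hypothetical liquid densities in g/cm^3 (illustrative values only).
computed_density = [1.0, 2.0, 4.0]
experimental_density = [1.5, 2.0, 3.0]
mue = mean_unsigned_error(computed_density, experimental_density)
print(f"MUE = {mue:.3f} g/cm^3")
```

Because the deviations enter as absolute values, over- and under-predictions do not cancel, which is why the MUE is commonly reported alongside or instead of a signed mean error.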


2019 ◽  
Author(s):  
Siddhartha Laghuvarapu ◽  
Yashaswi Pathak ◽  
U. Deva Priyakumar

Recent advances in artificial intelligence, along with the development of large datasets of energies calculated using quantum mechanical (QM)/density functional theory (DFT) methods, have enabled prediction of accurate molecular energies at reasonably low computational cost. However, the machine learning models reported so far require atomic positions obtained from geometry optimizations using high-level QM/DFT methods as input in order to predict the energies, and do not allow for geometry optimization. In this paper, a transferable and molecule-size-independent machine learning model (BAND NN) based on a chemically intuitive representation inspired by molecular mechanics force fields is presented. The model predicts the atomization energies of equilibrium and non-equilibrium structures as a sum of energy contributions from bonds (B), angles (A), nonbonds (N) and dihedrals (D) with remarkable accuracy. The robustness of the proposed model is further validated by calculations that span the conformational, configurational and reaction space. The transferability of this model to systems larger than the ones in the dataset is demonstrated by performing calculations on select large molecules. Importantly, employing the BAND NN model, it is possible to perform geometry optimizations starting from non-equilibrium structures along with predicting their energies.
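The additive decomposition described in this abstract can be written compactly as follows. The symbols are illustrative notation introduced here and are not taken from the paper: each term stands for a learned energy contribution evaluated on one bond, angle, nonbonded pair, or dihedral of the molecule.

```latex
E_{\mathrm{atomization}} \;\approx\;
    \sum_{b \,\in\, \text{bonds}} E_{B}(b)
  + \sum_{a \,\in\, \text{angles}} E_{A}(a)
  + \sum_{n \,\in\, \text{nonbonds}} E_{N}(n)
  + \sum_{d \,\in\, \text{dihedrals}} E_{D}(d)
```

Since every term depends only on a local structural feature, the number of terms grows with the molecule while the per-term models stay the same, which is consistent with the molecule-size independence claimed in the abstract.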


ACS Omega ◽  
2021 ◽  
Vol 6 (7) ◽  
pp. 4995-5000 ◽  
Author(s):  
Jiaxiang Zhang ◽  
Junwen Yang ◽  
Ziyue Liu ◽  
Bin Zheng
