Quantum chemical benchmark databases of gold-standard dimer interaction energies

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Alexander G. Donchev ◽  
Andrew G. Taube ◽  
Elizabeth Decolvenaere ◽  
Cory Hargus ◽  
Robert T. McGibbon ◽  
...  

Abstract: Advances in computational chemistry create an ongoing need for larger and higher-quality datasets that characterize noncovalent molecular interactions. We present three benchmark collections of quantum mechanical data, covering approximately 3,700 distinct types of interacting molecule pairs. The first collection, which we refer to as DES370K, contains interaction energies for more than 370,000 dimer geometries. These were computed using the coupled-cluster method with single, double, and perturbative triple excitations [CCSD(T)], which is widely regarded as the gold-standard method in electronic structure theory. Our second benchmark collection, a core representative subset of DES370K called DES15K, is intended for more computationally demanding applications of the data. Finally, DES5M, our third collection, comprises interaction energies for nearly 5,000,000 dimer geometries; these were calculated using SNS-MP2, a machine learning approach that provides results with accuracy comparable to that of our coupled-cluster training data. These datasets may prove useful in the development of density functionals, empirically corrected wavefunction-based approaches, semi-empirical methods, force fields, and models trained using machine learning methods.
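The abstract does not spell out how a dimer interaction energy is defined. A minimal sketch, assuming the standard supermolecular definition (dimer total energy minus the two monomer total energies) and ignoring any counterpoise or basis-set treatment the authors may have applied, could look like this. The function name and the numerical energies are hypothetical and serve only to illustrate the arithmetic.

```python
# Conversion factor from hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def interaction_energy(e_dimer, e_monomer_a, e_monomer_b):
    """Supermolecular interaction energy in kcal/mol.

    All inputs are total electronic energies in hartree. This is the
    textbook definition, assumed here; it is not taken from the paper.
    """
    return (e_dimer - e_monomer_a - e_monomer_b) * HARTREE_TO_KCAL_PER_MOL


# Hypothetical total energies (hartree), chosen only for illustration.
e_int = interaction_energy(-152.750, -76.371, -76.371)
print(f"interaction energy: {e_int:.2f} kcal/mol")
```

A negative value indicates a net attractive interaction at that geometry under this definition.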

Author(s):  
Mihail Bogojeski ◽  
Leslie Vogt-Maranto ◽  
Mark E. Tuckerman ◽  
Klaus-Robert Mueller ◽  
Kieron Burke

Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal/mol with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. We create density functionals from coupled-cluster energies, based only on DFT densities, via machine learning. These functionals attain quantum chemical accuracy (errors below 1 kcal/mol). Moreover, density-based ∆-learning (learning only the correction to a standard DFT calculation, ∆-DFT) significantly reduces the amount of training data required. We demonstrate these concepts for a single water molecule, and then illustrate how to include molecular symmetries with ethanol. Finally, we highlight the robustness of ∆-DFT by correcting DFT simulations of resorcinol on the fly to obtain molecular dynamics (MD) trajectories with coupled-cluster accuracy. Thus ∆-DFT opens the door to running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT is quantitatively incorrect.


2021 ◽  
Author(s):  
Raul Rodriguez-Esteban ◽  
José Duarte ◽  
Priscila C. Teixeira ◽  
Fabien Richard ◽  
Svetlana Koltsova ◽  
...  

Abstract: Background: A key step in clinical flow cytometry data analysis is gating, which involves the identification of cell populations. The process of gating produces a set of reportable results, which are typically described by gating definitions. The non-standardized, non-interpreted nature of gating definitions represents a hurdle for data interpretation and data sharing across and within organizations. Interpreting and standardizing gating definitions for subsequent analysis of gating results requires a curation effort from experts. Machine learning approaches have the potential to help in this process by predicting expert annotations associated with gating definitions. Methods: We created a gold-standard dataset by manually annotating thousands of gating definitions with cell type and functional marker annotations. We used this dataset to train and test a machine learning pipeline able to predict standard cell types and functional marker genes associated with gating definitions. Results: The machine learning pipeline predicted annotations with high accuracy for both cell types and functional marker genes. Accuracy was lower for gating definitions from assays belonging to laboratories for which limited or no prior data were available in the training set. Manual error review ensured that the resulting predicted annotations could subsequently be reused as additional gold-standard training data. Conclusions: Machine learning methods are able to consistently predict annotations associated with gating definitions from flow cytometry assays. However, a hybrid automatic and manual annotation workflow is recommended to achieve optimal results.


2019 ◽  
Author(s):  
Mihail Bogojeski ◽  
Leslie Vogt-Maranto ◽  
Mark E. Tuckerman ◽  
Klaus-Robert Mueller ◽  
Kieron Burke

Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal/mol with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal/mol) on test data. Moreover, density-based ∆-learning (learning only the correction to a standard DFT calculation, termed ∆-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of ∆-DFT is highlighted by correcting "on the fly" DFT-based molecular dynamics (MD) simulations of resorcinol (C6H4(OH)2) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that ∆-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Mihail Bogojeski ◽  
Leslie Vogt-Maranto ◽  
Mark E. Tuckerman ◽  
Klaus-Robert Müller ◽  
Kieron Burke

Abstract: Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal·mol⁻¹ with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal·mol⁻¹) on test data. Moreover, density-based Δ-learning (learning only the correction to a standard DFT calculation, termed Δ-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of Δ-DFT is highlighted by correcting “on the fly” DFT-based molecular dynamics (MD) simulations of resorcinol (C6H4(OH)2) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that Δ-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.
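The idea of learning only the correction to a cheap calculation can be illustrated with a deliberately small toy problem. The sketch below is not the authors' method: it replaces DFT densities with a single bond-length coordinate, replaces their machine-learned functional with a quadratic least-squares fit, and uses made-up analytic curves (`e_low`, `e_high`) as stand-ins for the cheap and expensive energies. All names and numbers are assumptions for illustration.

```python
import numpy as np


def e_low(r):
    """Toy stand-in for the cheap method: a Morse-like curve (arbitrary units)."""
    return 0.2 * (1.0 - np.exp(-2.0 * (r - 1.0))) ** 2


def e_high(r):
    """Toy stand-in for the expensive method: cheap curve plus a smooth correction."""
    return e_low(r) + 0.01 - 0.02 * r + 0.005 * r ** 2


r_train = np.linspace(0.7, 2.0, 6)     # the few geometries with expensive labels
r_test = np.linspace(0.75, 1.95, 50)   # geometries where we want predictions

# Direct learning: fit the expensive energy itself with a quadratic.
direct_coeffs = np.polyfit(r_train, e_high(r_train), deg=2)
direct_pred = np.polyval(direct_coeffs, r_test)

# Delta-learning: fit only the difference, then add it to the cheap energy.
delta_coeffs = np.polyfit(r_train, e_high(r_train) - e_low(r_train), deg=2)
delta_pred = e_low(r_test) + np.polyval(delta_coeffs, r_test)

mae_direct = np.mean(np.abs(direct_pred - e_high(r_test)))
mae_delta = np.mean(np.abs(delta_pred - e_high(r_test)))
print(f"MAE direct: {mae_direct:.2e}   MAE delta: {mae_delta:.2e}")
```

In this contrived setting the correction is exactly quadratic, so the same six training points recover it essentially perfectly, while the direct fit has to approximate the whole anharmonic curve. Real corrections are not this simple, but the example shows why a smoother, smaller target can need less training data.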


2004 ◽  
Vol 69 (1) ◽  
pp. 189-212 ◽  
Author(s):  
Juraj Raab ◽  
Andrej Antušek ◽  
Stanislav Biskupič ◽  
Miroslav Urban

The partially spin-adapted coupled cluster method with the restricted open-shell Hartree-Fock reference was applied to calculations of interaction energies between the helium atom and the three radicals, CN (²Σ), NO (²Π), and O2 (³Σg⁻). Basis set dependences with medium-augmented correlation consistent basis sets were alleviated by using extrapolations to the basis set limit which were based on aug-cc-pVTZ and aug-cc-pVQZ results. The two-dimensional potential energy surfaces were fitted by exponential and polynomial functions. Minima and transition states were located. Potential energy surfaces are very floppy, especially for HeCN. This complex exhibits the weakest van der Waals interaction, the electronic interaction energy being 92 μEh. The interaction energy in HeNO is 122 μEh, almost the same as was found for HeO2 (124 μEh). Considering zero-point vibrational corrections, the dissociation energies of HeCN, HeNO, and HeO2 are 4.6, 6.6, and 7.3 cm⁻¹, respectively. This ordering of the interaction energies and the structural data for global and local minima and transition states were compared with available literature data. No simple link between the magnitude of intermolecular forces and the dipole moments and dipole polarizabilities of CN, NO, and O2 was found. The low-order long-range model based on induction and dispersion forces is of no use in predicting the ordering of interaction strengths in the HeCN, HeNO, and HeO2 complexes.
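Two pieces of arithmetic in this abstract can be made concrete. The sketch below converts the quoted well depths from microhartree to wavenumbers, and shows one common two-point extrapolation to the basis set limit. The abstract does not state which extrapolation formula was used, so the inverse-cubic form E(X) = E_CBS + A·X⁻³ with X = 3 (aug-cc-pVTZ) and X = 4 (aug-cc-pVQZ) is an assumption, and both function names are hypothetical.

```python
HARTREE_TO_WAVENUMBER = 219474.6313632  # cm^-1 per hartree


def microhartree_to_wavenumber(e_micro_hartree):
    """Convert an energy in microhartree to cm^-1."""
    return e_micro_hartree * 1e-6 * HARTREE_TO_WAVENUMBER


def two_point_cbs(e_x, e_y, x=3, y=4):
    """Basis set limit assuming E(X) = E_CBS + A * X**-3 (an assumed form).

    e_x and e_y are energies obtained with cardinal numbers x and y.
    """
    return (y ** 3 * e_y - x ** 3 * e_x) / (y ** 3 - x ** 3)


# Electronic interaction energies quoted in the abstract, in microhartree.
well_depths = {"HeCN": 92.0, "HeNO": 122.0, "HeO2": 124.0}
well_depths_cm = {
    name: microhartree_to_wavenumber(value) for name, value in well_depths.items()
}
for name, value in well_depths_cm.items():
    print(f"{name}: {value:.1f} cm^-1")
```

The converted well depths come out at about 20, 27, and 27 cm⁻¹, which puts the quoted zero-point-corrected dissociation energies of 4.6, 6.6, and 7.3 cm⁻¹ in context: most of each well is taken up by zero-point motion.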


1998 ◽  
Vol 94 (1) ◽  
pp. 181-187 ◽  
Author(s):  
EPHRAIM ELIAV ◽  
UZI KALDOR ◽  
YASUYUKI ISHIKAWA

2020 ◽  
Author(s):  
Soumi Haldar ◽  
Achintya Kumar Dutta

We have presented a multi-layer implementation of the equation-of-motion coupled-cluster method for electron affinities, based on local and pair natural orbitals. The method gives consistent accuracy for both localized and delocalized anionic states. It yields a manyfold speedup in computational time compared to the canonical and DLPNO-based implementations of the EA-EOM-CCSD method. We have also developed an explicit fragment-based approach which can lead to even higher speedup with little loss in accuracy. The multi-layer method can be used to treat environmental effects of both bonded and non-bonded nature on the electron attachment process in large molecules.


2019 ◽  
Author(s):  
Andrew Medford ◽  
Shengchun Yang ◽  
Fuzhu Liu

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N^1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.
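The combination of a Gaussian process with an iterative, uncertainty-driven choice of training data can be sketched in a few lines. This is a toy illustration and not the authors' framework: the structures are reduced to points on a one-dimensional grid, `toy_energy` is a made-up stand-in for a DFT adsorption energy, the kernel and its length scale are arbitrary, and the acquisition rule (label the candidate with the largest predictive standard deviation) is an assumption, since the abstract does not state the selection criterion.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    """Squared-exponential kernel between two sets of 1D points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # The prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def toy_energy(x):
    """Hypothetical stand-in for an expensive DFT adsorption energy."""
    return np.sin(3.0 * x) + 0.5 * x


candidates = np.linspace(0.0, 4.0, 41)  # pool of structures we could compute
labelled = [0, 40]                      # start from the two end points

for _ in range(8):
    x_l = candidates[labelled]
    _, std = gp_predict(x_l, toy_energy(x_l), candidates)
    std[labelled] = -np.inf               # never re-select a labelled candidate
    labelled.append(int(np.argmax(std)))  # label the most uncertain candidate

x_l = candidates[labelled]
mean, std = gp_predict(x_l, toy_energy(x_l), candidates)
mae = float(np.mean(np.abs(mean - toy_energy(candidates))))
print(f"labelled {len(labelled)} of {len(candidates)} candidates, MAE = {mae:.3f}")
```

Only 10 of the 41 candidates are ever evaluated with the expensive function here, which mirrors the abstract's point that a fraction of the total data can be enough when the training set is chosen actively.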

