Machine-learning coupled cluster properties through a density tensor representation

2020 ◽  
Vol 124 (23) ◽  
pp. 4861-4871 ◽  
Author(s):  
Benjamin G. Peyton ◽  
Connor Briggs ◽  
Ruhee D’Cunha ◽  
Johannes T. Margraf ◽  
T. Daniel Crawford


The introduction of machine-learning (ML) algorithms to quantum mechanics enables rapid evaluation of otherwise intractable expressions at the cost of prior training on appropriate benchmarks. Many computational bottlenecks in the evaluation of accurate electronic structure theory could potentially benefit from the application of such models, from reducing the complexity of the underlying wave function parameter space to circumventing the complications of solving the electronic Schrödinger equation entirely. Applications of ML to electronic structure have thus far focused on learning molecular properties (mainly the energy) from geometric representations. While this line of study has been quite successful, highly accurate models typically require a “big data” approach with thousands of training data points. Herein, we propose a general, systematically improvable scheme for wave function-based ML of arbitrary molecular properties, inspired by the underlying equations that govern the canonical approach to computing the properties. To this end, we combine the established ML machinery of the t-amplitude tensor representation with a new reduced density matrix representation. The resulting model provides quantitative accuracy in both the electronic energy and dipoles of small molecules using only a few dozen training points per system.
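To illustrate the general idea of wave function-based property learning — not the authors' actual implementation — a Gaussian-kernel ridge regression mapping a flattened tensor-derived feature vector to a scalar property can be sketched in a few lines. The feature and target data here are synthetic stand-ins:

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    # Pairwise squared distances between feature vectors (rows of X and Y)
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def train_krr(X_train, y_train, sigma=1.0, lam=1e-8):
    # Solve (K + lam*I) alpha = y for the regression weights
    K = gaussian_kernel(X_train, X_train, sigma)
    return np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)

def predict_krr(X_train, alpha, X_new, sigma=1.0):
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

# Toy example: rows stand in for flattened t-amplitude / density-tensor features
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))       # a few dozen training points, as in the paper
y = np.sin(X).sum(axis=1)          # surrogate "property" (energy or dipole component)
alpha = train_krr(X, y, sigma=2.0)
pred = predict_krr(X, alpha, X[:5], sigma=2.0)
```

With the tiny ridge term, the model nearly interpolates its training points; in practice `sigma` and `lam` would be chosen by cross-validation.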


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Alexander G. Donchev ◽  
Andrew G. Taube ◽  
Elizabeth Decolvenaere ◽  
Cory Hargus ◽  
Robert T. McGibbon ◽  
...  

Advances in computational chemistry create an ongoing need for larger and higher-quality datasets that characterize noncovalent molecular interactions. We present three benchmark collections of quantum mechanical data, covering approximately 3,700 distinct types of interacting molecule pairs. The first collection, which we refer to as DES370K, contains interaction energies for more than 370,000 dimer geometries. These were computed using the coupled-cluster method with single, double, and perturbative triple excitations [CCSD(T)], which is widely regarded as the gold-standard method in electronic structure theory. Our second benchmark collection, a core representative subset of DES370K called DES15K, is intended for more computationally demanding applications of the data. Finally, DES5M, our third collection, comprises interaction energies for nearly 5,000,000 dimer geometries; these were calculated using SNS-MP2, a machine learning approach that provides results with accuracy comparable to that of our coupled-cluster training data. These datasets may prove useful in the development of density functionals, empirically corrected wavefunction-based approaches, semi-empirical methods, force fields, and models trained using machine learning methods.


2020 ◽  
Author(s):  
Dakota Folmsbee ◽  
Geoffrey Hutchison

We have performed a large-scale evaluation of current computational methods, including conventional small-molecule force fields, semiempirical, density functional, ab initio electronic structure methods, and current machine learning (ML) techniques to evaluate relative single-point energies. Using up to 10 local minima geometries across ~700 molecules, each optimized by B3LYP-D3BJ with single-point DLPNO-CCSD(T) triple-zeta energies, we consider over 6,500 single points to compare the correlation between different methods for both relative energies and ordered rankings of minima. We find promise from current ML methods and recommend methods at each tier of the accuracy-time tradeoff, particularly the recent GFN2 semiempirical method, the B97-3c density functional approximation, and RI-MP2 for accurate conformer energies. The ANI family of ML methods shows promise, particularly the ANI-1ccx variant trained in part on coupled-cluster energies. Multiple methods suggest continued improvements should be expected in both performance and accuracy.
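The comparison of "ordered rankings of minima" between methods described above is a rank-correlation problem. A minimal, stdlib-only sketch of a Spearman rank correlation between two methods' relative conformer energies (the numbers below are toy values, not data from the study):

```python
def rankdata(values):
    # Rank positions (1-based); ties are not handled, for simplicity
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def spearman_rho(a, b):
    # Spearman correlation via the rank-difference formula (assumes no ties)
    n = len(a)
    ra, rb = rankdata(a), rankdata(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n**2 - 1))

# Toy relative energies (kcal/mol) for 5 conformers from two methods
e_ref = [0.0, 1.2, 0.4, 2.1, 0.9]   # e.g., a DLPNO-CCSD(T) reference
e_ml  = [0.1, 1.0, 0.5, 2.3, 0.8]   # e.g., an ML potential
rho = spearman_rho(e_ref, e_ml)     # 1.0 here: both methods rank the minima identically
```

A production comparison would handle ties (e.g., via `scipy.stats.spearmanr`), but the formula above captures the ranking-agreement metric.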


2020 ◽  
Vol 34 (04) ◽  
pp. 4527-4534
Author(s):  
Sören Laue ◽  
Matthias Mitterreiter ◽  
Joachim Giesen

Computing derivatives of tensor expressions, also known as tensor calculus, is a fundamental task in machine learning. A key concern is the efficiency of evaluating the expressions and their derivatives, which hinges on the representation of these expressions. Recently, an algorithm for computing higher order derivatives of tensor expressions like Jacobians or Hessians has been introduced that is a few orders of magnitude faster than previous state-of-the-art approaches. Unfortunately, the approach is based on Ricci notation and hence cannot be incorporated into automatic differentiation frameworks like TensorFlow, PyTorch, autograd, or JAX that use the simpler Einstein notation. This leaves two options: either change the underlying tensor representation in these frameworks or develop a new, provably correct algorithm based on Einstein notation. Obviously, the first option is impractical. Hence, we pursue the second option. Here, we show that using Ricci notation is not necessary for an efficient tensor calculus and develop an equally efficient method for the simpler Einstein notation. It turns out that switching to Einstein notation enables further improvements that lead to even better efficiency.
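The central observation — that the derivative of an Einstein-notation expression can itself be written in Einstein notation — is easy to demonstrate with a generic example (this is not the authors' algorithm). For the quadratic form f(x) = A[i,j] x[i] x[j], the gradient is df/dx[k] = (A[k,j] + A[j,k]) x[j], and both are single `einsum` calls:

```python
import numpy as np

# Einstein-notation expression: f(x) = A[i,j] * x[i] * x[j], summed over i and j
def f(A, x):
    return np.einsum('ij,i,j->', A, x, x)

# Its derivative is again an einsum expression: df/dx[k] = (A[k,j] + A[j,k]) * x[j]
def grad_f(A, x):
    return np.einsum('kj,j->k', A + A.T, x)

# Validate the symbolic derivative against a central finite difference
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))
x = rng.normal(size=4)
eps = 1e-6
fd = np.array([(f(A, x + eps * e) - f(A, x - eps * e)) / (2 * eps)
               for e in np.eye(4)])
```

Frameworks built on Einstein notation can exploit exactly this closure property: differentiating an `einsum` yields another `einsum`, so higher derivatives stay in the same representation.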

