Machine Learning Accurate Exchange and Correlation Functionals of the Electronic Density

2020 ◽  
Author(s):  
Sebastian Dick ◽  
Marivi Fernandez-Serra

Density Functional Theory (DFT) is the standard formalism to study the electronic structure of matter at the atomic scale. In Kohn-Sham DFT simulations, the balance between accuracy and computational cost depends on the choice of exchange and correlation functional, which only exists in approximate form. Here we propose a framework to create density functionals using supervised machine learning, termed NeuralXC. These machine-learned functionals are designed to lift the accuracy of baseline functionals toward that provided by more accurate methods while maintaining their efficiency. We show that the functionals learn a meaningful representation of the physical information contained in the training data, making them transferable across systems. A NeuralXC functional optimized for water outperforms other methods in characterizing bond breaking and excels when compared against experimental results. This work demonstrates that NeuralXC is a first step towards the design of a universal, highly accurate functional valid for both molecules and solids.
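The central idea, a learned correction that lifts a cheap baseline energy toward a high-accuracy reference, can be sketched in a few lines. Everything below is synthetic: the density descriptors, the "baseline" and "reference" energies, and the tiny network are stand-ins for illustration, not the actual NeuralXC architecture or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each system is summarized by a few density-projection
# descriptors (NeuralXC projects the density onto an atom-centered basis;
# here we simply draw synthetic descriptor vectors).
def baseline_energy(desc):
    # stand-in for a cheap (semi)local functional evaluation
    return desc.sum(axis=1)

def reference_energy(desc):
    # stand-in for an expensive high-accuracy method
    return desc.sum(axis=1) + 0.5 * np.sin(desc[:, 0])

X = rng.uniform(-1, 1, size=(200, 3))
target = reference_energy(X) - baseline_energy(X)  # correction the model learns

# One tanh hidden layer trained by plain gradient descent on squared error.
W1 = rng.normal(scale=0.5, size=(3, 16)); b1 = np.zeros(16)
w2 = rng.normal(scale=0.5, size=16);      b2 = 0.0
lr = 0.05
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)            # hidden activations
    pred = h @ w2 + b2                  # predicted energy correction
    g = 2 * (pred - target) / len(X)    # dLoss/dpred
    w2 -= lr * h.T @ g; b2 -= lr * g.sum()
    gh = np.outer(g, w2) * (1 - h**2)   # backprop through tanh
    W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(axis=0)

corrected = baseline_energy(X) + (np.tanh(X @ W1 + b1) @ w2 + b2)
rmse_base = np.sqrt(np.mean((baseline_energy(X) - reference_energy(X))**2))
rmse_ml   = np.sqrt(np.mean((corrected - reference_energy(X))**2))
print(rmse_base, rmse_ml)
```

The "lifting" is visible in the two printed errors: the corrected energies sit closer to the reference than the baseline does.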


2019 ◽  
Author(s):  
Sebastian Dick ◽  
Marivi Fernandez-Serra

Density Functional Theory (DFT) is the standard formalism to study the electronic structure of matter at the atomic scale. The balance between accuracy and computational cost that DFT-based simulations provide allows researchers to understand the structural and dynamical properties of increasingly large and complex systems at the quantum mechanical level. In Kohn-Sham DFT, this balance depends on the choice of exchange and correlation functional, which only exists in approximate form. Increasing the non-locality of this functional and climbing the figurative Jacob's ladder of DFT, one can systematically reduce the amount of approximation involved and thus approach the exact functional. Doing this, however, comes at the price of increased computational cost, and so, for extensive systems, the predominant methods of choice can still be found within the lower-rung approximations. Here we propose a framework to create highly accurate density functionals by using supervised machine learning, termed NeuralXC. These machine-learned functionals are designed to lift the accuracy of local and semilocal functionals to that provided by more accurate methods while maintaining their efficiency. We show that the functionals learn a meaningful representation of the physical information contained in the training data, making them transferable across systems. We further demonstrate how a functional optimized on water can reproduce experimental results when used in molecular dynamics simulations. Finally, we discuss the effects that our method has on self-consistent electron densities by comparing these densities to benchmark coupled-cluster results.
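The density comparison mentioned at the end of the abstract amounts to an integrated difference between a self-consistent density and a benchmark one. A minimal sketch of such a metric, using synthetic hydrogen-like radial densities rather than real DFT or coupled-cluster output:

```python
import numpy as np

# Radial grid and volume weights for spherically symmetric densities.
r = np.linspace(1e-3, 10.0, 2000)
w = 4 * np.pi * r**2 * np.gradient(r)

def rho(a):
    # normalized 1s-like density with effective exponent a (synthetic)
    return (a**3 / np.pi) * np.exp(-2 * a * r)

def density_error(rho_test, rho_ref):
    # integrated squared deviation from the benchmark density
    return np.sqrt(np.sum(w * (rho_test - rho_ref) ** 2))

rho_ref = rho(1.0)                                  # stand-in for the CC benchmark
err_baseline = density_error(rho(0.95), rho_ref)    # "baseline functional" density
err_ml       = density_error(rho(0.99), rho_ref)    # "ML-corrected" density
print(err_baseline, err_ml)
```

A smaller value means the self-consistent density is closer to the benchmark; here the exponents 0.95 and 0.99 are arbitrary choices that merely illustrate the comparison.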


2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Alberto Hernandez ◽  
Adarsh Balasubramanian ◽  
Fenglin Yuan ◽  
Simon A. M. Mason ◽  
Tim Mueller

The length and time scales of atomistic simulations are limited by the computational cost of the methods used to predict material properties. In recent years there has been great progress in the use of machine-learning algorithms to develop fast and accurate interatomic potential models, but it remains a challenge to develop models that generalize well and are fast enough to be used at extreme time and length scales. To address this challenge, we have developed a machine-learning algorithm based on symbolic regression in the form of genetic programming that is capable of discovering accurate, computationally efficient many-body potential models. The key to our approach is to explore a hypothesis space of models based on fundamental physical principles and select models within this hypothesis space based on their accuracy, speed, and simplicity. The focus on simplicity reduces the risk of overfitting the training data and increases the chances of discovering a model that generalizes well. Our algorithm was validated by rediscovering an exact Lennard-Jones potential and a Sutton-Chen embedded-atom method potential from training data generated using these models. By using training data generated from density functional theory calculations, we found potential models for elemental copper that are simple, as fast as embedded-atom models, and capable of accurately predicting properties outside of their training set. Our approach requires relatively small sets of training data, making it possible to generate training data using highly accurate methods at a reasonable computational cost. We present our approach, the forms of the discovered models, and assessments of their transferability, accuracy, and speed.
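The Lennard-Jones rediscovery test can be imitated with a much cruder search than genetic programming: enumerate a small hypothesis space of two-term power-law pair potentials and score each candidate by accuracy plus a simplicity penalty. This is a simplified stand-in for the paper's method, with an arbitrary penalty weight, not the authors' algorithm.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

# Training data from a known Lennard-Jones potential (eps = sigma = 1),
# mimicking the rediscovery validation described above.
def lj(r):
    return 4 * (r**-12 - r**-6)

r_train = rng.uniform(0.9, 2.5, size=200)
e_train = lj(r_train)

best = None
for m, n in product(range(1, 15), repeat=2):      # candidate exponent pairs
    if m <= n:
        continue
    # fit coefficients in a*r^-m + b*r^-n by least squares
    A = np.column_stack([r_train**-m, r_train**-n])
    coef, *_ = np.linalg.lstsq(A, e_train, rcond=None)
    mse = np.mean((A @ coef - e_train) ** 2)
    score = mse + 1e-6 * (m + n)                  # accuracy + crude simplicity term
    if best is None or score < best[0]:
        best = (score, m, n, coef)

_, m, n, coef = best
print(m, n, coef)
```

Because the (12, 6) form reproduces the training data exactly, it wins despite the simplicity penalty, which is the spirit of selecting models by accuracy and simplicity together.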


2019 ◽  
Author(s):  
Siddhartha Laghuvarapu ◽  
Yashaswi Pathak ◽  
U. Deva Priyakumar

Recent advances in artificial intelligence, along with the development of large datasets of energies calculated using quantum mechanical (QM)/density functional theory (DFT) methods, have enabled prediction of accurate molecular energies at reasonably low computational cost. However, machine learning models that have been reported so far require the atomic positions obtained from geometry optimizations using high-level QM/DFT methods as input in order to predict the energies, and do not allow for geometry optimization. In this paper, a transferable and molecule-size-independent machine learning model (BAND NN) based on a chemically intuitive representation inspired by molecular mechanics force fields is presented. The model predicts the atomization energies of equilibrium and non-equilibrium structures as a sum of energy contributions from bonds (B), angles (A), nonbonds (N) and dihedrals (D) with remarkable accuracy. The robustness of the proposed model is further validated by calculations that span the conformational, configurational and reaction space. The transferability of this model to systems larger than the ones in the dataset is demonstrated by performing calculations on select large molecules. Importantly, employing the BAND NN model, it is possible to perform geometry optimizations starting from non-equilibrium structures along with predicting their energies.
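The additive structure described above can be sketched directly: total energy as a sum over bond, angle, nonbond, and dihedral contributions. In BAND NN each term is produced by a neural network from chemically intuitive features; the closed-form functions below are force-field-style stand-ins with made-up parameters, used only to show the decomposition.

```python
import math

def bond_term(r, r0=1.0, k=300.0):
    return 0.5 * k * (r - r0) ** 2              # stand-in for the B network

def angle_term(theta, theta0=math.radians(104.5), k=50.0):
    return 0.5 * k * (theta - theta0) ** 2      # stand-in for the A network

def nonbond_term(r, eps=0.1, sig=3.0):
    return 4 * eps * ((sig / r) ** 12 - (sig / r) ** 6)  # stand-in for N

def dihedral_term(phi, v=2.0, n=3):
    return 0.5 * v * (1 + math.cos(n * phi))    # stand-in for D

def band_energy(bonds, angles, nonbonds, dihedrals):
    # E = sum of B + A + N + D contributions
    return (sum(bond_term(r) for r in bonds)
            + sum(angle_term(t) for t in angles)
            + sum(nonbond_term(r) for r in nonbonds)
            + sum(dihedral_term(p) for p in dihedrals))

# A water-like toy geometry: two stretched bonds, one equilibrium angle,
# no dihedrals or nonbonded pairs.
e = band_energy([0.96, 0.96], [math.radians(104.5)], [], [])
print(e)
```

Because every term depends smoothly on the geometry, energies of non-equilibrium structures and gradients for geometry optimization come for free from the same decomposition.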


2021 ◽  
Vol 13 (3) ◽  
pp. 368
Author(s):  
Christopher A. Ramezan ◽  
Timothy A. Warner ◽  
Aaron E. Maxwell ◽  
Bradley S. Price

The size of the training data set is a major determinant of classification accuracy. Nevertheless, the collection of a large training data set for supervised classifiers can be a challenge, especially for studies covering a large area, which may be typical of many real-world applied projects. This work investigates how variations in training set size, ranging from a large sample size (n = 10,000) to a very small sample size (n = 40), affect the performance of six supervised machine-learning algorithms applied to classify large-area high-spatial-resolution (HR) (1–5 m) remotely sensed data within the context of a geographic object-based image analysis (GEOBIA) approach. GEOBIA, in which adjacent similar pixels are grouped into image-objects that form the unit of the classification, offers the potential benefit of allowing multiple additional variables, such as measures of object geometry and texture, thus increasing the dimensionality of the classification input data. The six supervised machine-learning algorithms are support vector machines (SVM), random forests (RF), k-nearest neighbors (k-NN), single-layer perceptron neural networks (NEU), learning vector quantization (LVQ), and gradient-boosted trees (GBM). RF, the algorithm with the highest overall accuracy, was notable for its negligible decrease in overall accuracy, 1.0%, when training sample size decreased from 10,000 to 315 samples. GBM provided similar overall accuracy to RF; however, the algorithm was very expensive in terms of training time and computational resources, especially with large training sets. In contrast to RF and GBM, NEU and SVM were particularly sensitive to decreasing sample size, with NEU classifications generally producing overall accuracies that were on average slightly higher than SVM classifications for larger sample sizes, but lower than SVM for the smallest sample sizes. NEU, however, required longer processing times. The k-NN classifier saw less of a drop in overall accuracy than NEU and SVM as training set size decreased; however, the overall accuracies of k-NN were typically lower than those of the RF, NEU, and SVM classifiers. LVQ generally had the lowest overall accuracy of all six methods, but was relatively insensitive to sample size, down to the smallest sample sizes. Overall, due to its relatively high accuracy with small training sample sets, minimal variations in overall accuracy between very large and small sample sets, and relatively short processing time, RF was a good classifier for large-area land-cover classifications of HR remotely sensed data, especially when training data are scarce. However, as the performance of different supervised classifiers varies in response to training set size, investigating multiple classification algorithms is recommended to achieve optimal accuracy for a project.
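The experimental design reduces to a simple loop: train the same classifiers on nested subsets of decreasing size and score each on a fixed test set. The sketch below uses scikit-learn with synthetic features standing in for the GEOBIA object attributes, and scales the sample sizes down (the study's range was 40 to 10,000) so it runs quickly.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Synthetic 4-class problem standing in for the land-cover classes.
X, y = make_classification(n_samples=2600, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
X_pool, X_test = X[:2000], X[2000:]
y_pool, y_test = y[:2000], y[2000:]

results = {}
for n in (2000, 315, 40):                      # shrinking training-set sizes
    idx = np.random.default_rng(0).choice(2000, size=n, replace=False)
    for name, clf in (("RF", RandomForestClassifier(random_state=0)),
                      ("SVM", SVC())):
        clf.fit(X_pool[idx], y_pool[idx])
        results[(name, n)] = clf.score(X_test, y_test)  # overall accuracy

for key in sorted(results):
    print(key, round(results[key], 3))
```

Comparing the accuracy of each algorithm across the sizes reproduces the shape of the study's analysis: how gracefully each classifier degrades as training data become scarce.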


Author(s):  
Victor H. Chávez ◽  
Adam Wasserman

In some sense, quantum mechanics solves all the problems in chemistry: the only thing one has to do is solve the Schrödinger equation for the molecules of interest. Unfortunately, the computational cost of solving this equation grows exponentially with the number of electrons, and for more than ~100 electrons it is impossible to solve it with chemical accuracy (~2 kcal/mol). The Kohn-Sham (KS) equations of density functional theory (DFT) allow us to reformulate the Schrödinger equation using the electronic probability density as the central variable without having to calculate the Schrödinger wave functions. The cost of solving the Kohn-Sham equations grows only as N³, where N is the number of electrons, which has led to the immense popularity of DFT in chemistry. Despite this popularity, even the most sophisticated approximations in KS-DFT result in errors that limit the use of methods based exclusively on the electronic density. By using fragment densities (as opposed to total densities) as the main variables, we discuss here how new methods can be developed that scale linearly with N while providing an appealing answer to the subtitle of the article: What is the shape of atoms in molecules?
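The scaling argument can be made concrete with schematic cost models: exponential for the wavefunction, cubic for Kohn-Sham, and linear for a fragment-based approach where each fixed-size fragment is solved independently. The prefactors below are arbitrary; only the growth rates matter.

```python
def wavefunction_cost(n):
    # exponential growth with electron number (schematic)
    return 2 ** n

def kohn_sham_cost(n):
    # O(N^3) Kohn-Sham scaling (schematic)
    return n ** 3

def fragment_cost(n, frag_size=10):
    # fixed-size fragments, cubic cost within each fragment,
    # so the total cost grows linearly in n
    n_frags = -(-n // frag_size)        # ceiling division
    return n_frags * frag_size ** 3

for n in (10, 50, 100):
    print(n, wavefunction_cost(n), kohn_sham_cost(n), fragment_cost(n))
```

Already at 100 electrons the ordering is dramatic, which is the point of moving from total densities to fragment densities.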


Author(s):  
Tich Phuoc Tran ◽  
Pohsiang Tsai ◽  
Tony Jan ◽  
Xiangjian He

Most of the currently available network security techniques are not able to cope with the dynamic and increasingly complex nature of cyber attacks on distributed computer systems. Therefore, an automated and adaptive defensive tool is imperative for computer networks. Alongside the existing prevention techniques such as encryption and firewalls, the Intrusion Detection System (IDS) has established itself as an emerging technology that is able to detect unauthorized access and abuse of computer systems by both internal users and external offenders. Most of the novel approaches in this field have adopted Artificial Intelligence (AI) technologies such as Artificial Neural Networks (ANN) to improve the performance as well as the robustness of IDS. The true power and advantages of ANN lie in its ability to represent both linear and non-linear relationships and learn these relationships directly from the data being modeled. However, ANNs demand substantial processing power, and their flexibility makes them prone to overfitting, i.e. the network is unable to extrapolate accurately once the input falls outside the range of the training data. These limitations leave IDS with low detection rates, high false alarm rates and excessive computation cost. This chapter proposes a novel Machine Learning (ML) algorithm to alleviate those difficulties of existing AI techniques in the area of computer network security. The Intrusion Detection dataset provided by Knowledge Discovery and Data Mining (KDD-99) is used as a benchmark to compare our model with other existing techniques. Extensive empirical analysis suggests that the proposed method outperforms other state-of-the-art learning algorithms in terms of learning bias, generalization variance and computational cost. It is also reported to significantly improve the overall detection capability for difficult-to-detect novel attacks which are unseen or occur irregularly in the training phase.
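The two headline IDS metrics in this chapter, detection rate and false alarm rate, follow directly from a confusion of predicted versus actual labels. A minimal sketch on synthetic labels (not KDD-99 data):

```python
def ids_metrics(y_true, y_pred):
    # 1 = attack, 0 = normal traffic
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    detection_rate = tp / (tp + fn)       # fraction of attacks flagged
    false_alarm_rate = fp / (fp + tn)     # fraction of normal traffic flagged
    return detection_rate, false_alarm_rate

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]
dr, far = ids_metrics(y_true, y_pred)
print(dr, far)
```

A good IDS pushes the detection rate toward 1 while keeping the false alarm rate near 0; the trade-off between the two is what the chapter's comparisons measure.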


Author(s):  
Kazuko Fuchi ◽  
Eric M. Wolf ◽  
David S. Makhija ◽  
Nathan A. Wukie ◽  
Christopher R. Schrock ◽  
...  

A machine learning algorithm that performs multifidelity domain decomposition is introduced. While the design of complex systems can be facilitated by numerical simulations, the determination of appropriate physics couplings and levels of model fidelity can be challenging. The proposed method automatically divides the computational domain into subregions and assigns the required fidelity level, using a small number of high-fidelity simulations to generate training data and low-fidelity solutions as input data. Unsupervised and supervised machine learning algorithms are used to correlate features from low-fidelity solutions to fidelity assignment. The effectiveness of the method is demonstrated in a problem of viscous fluid flow around a cylinder at Re ≈ 20. Ling et al. built physics-informed invariance and symmetry properties into machine learning models and demonstrated improved model generalizability. Along these lines, we avoid using problem-dependent features such as coordinates of sample points, object geometry or flow conditions as explicit inputs to the machine learning model. Use of pointwise flow features generates large data sets from only one or two high-fidelity simulations, and the fidelity predictor model achieved 99.5% accuracy at training points. The trained model was shown to be capable of predicting a fidelity map for a problem with an altered cylinder radius. A significant improvement in the prediction performance was seen when inputs were expanded to include multiscale features that incorporate neighborhood information.
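The fidelity-assignment step amounts to a supervised classifier: pointwise features from a low-fidelity solution are labeled "needs high fidelity" wherever the low-fidelity error exceeds a tolerance, and a model learns that mapping. The sketch below uses a single synthetic feature and plain logistic regression; the real method uses actual flow features and more capable models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pointwise feature (e.g. a velocity-gradient magnitude) and an
# error proxy; points where the low-fidelity error exceeds a tolerance are
# labeled 1 ("assign high fidelity here").
grad = rng.uniform(0, 1, size=500)
error = grad ** 2 + 0.01 * rng.normal(size=500)
label = (error > 0.25).astype(float)

# Logistic regression on the single feature, trained by gradient descent.
w, b = 0.0, 0.0
for _ in range(5000):
    p = 1 / (1 + np.exp(-(w * grad + b)))       # predicted probability
    gw = np.mean((p - label) * grad)            # gradient wrt w
    gb = np.mean(p - label)                     # gradient wrt b
    w -= 0.5 * gw
    b -= 0.5 * gb

pred = (1 / (1 + np.exp(-(w * grad + b))) > 0.5)
acc = np.mean(pred == label)
print(acc)
```

Because the features are pointwise rather than tied to coordinates or geometry, the same trained predictor can be evaluated on a new problem (e.g. an altered cylinder radius) to produce a fidelity map.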

