Active Machine Learning for Chemical Dynamics Simulations. I. Estimating the Energy Gradient
Ab initio molecular dynamics (AIMD) simulations are a direct way to visualize chemical reactions and help elucidate non-statistical dynamics that do not follow the intrinsic reaction coordinate. However, because AIMD requires an enormous number of ab initio energy gradient calculations, it has largely been restricted to limited sampling and low levels of theory (i.e., density functional theory with small basis sets). To overcome this issue, a number of machine learning (ML) methods have been employed to predict the energy gradient of the system of interest. In this manuscript, we outline the theoretical foundations of a novel ML method, called interpolating moving ridge regression (IMRR), which trains on a varying set of atomic positions and their energy gradients and directly predicts the energy gradient of a new set of atomic positions. Several key theoretical findings are presented regarding the inputs used to train IMRR and the predicted energy gradient. A hyperparameter used to guide IMRR is rigorously examined as well. The method is then applied to three bimolecular reactions studied with AIMD, namely HBr+ + CO2, H2S + CH, and C4H2 + CH, to demonstrate IMRR’s performance on chemical systems of different sizes. This manuscript also compares the computational cost of IMRR and ab initio energy gradient calculations, and the results highlight IMRR as a viable option to greatly increase the efficiency of AIMD.