Simultaneously improving reaction coverage and computational cost in automated reaction prediction tasks

2021 ◽  
Vol 1 (7) ◽  
pp. 479-490
Author(s):  
Qiyuan Zhao ◽  
Brett M. Savoie
2020 ◽  
Author(s):  
Qiyuan Zhao ◽  
Brett Savoie

<p>Automated reaction prediction has the potential to elucidate complex reaction networks for applications ranging from combustion to materials degradation. Although substantial progress has been made in predicting specific reaction pathways and resolving mechanisms, the computational cost and inconsistent reaction coverage of automated prediction are still obstacles to exploring deep reaction networks without using heuristics. Here we show that cost can be reduced and reaction coverage can be increased simultaneously by relatively straightforward modifications of the reaction enumeration, geometry initialization, and transition state convergence algorithms that are common to many emerging prediction methodologies. These changes are implemented in the context of Yet Another Reaction Program (YARP), our reaction prediction package, for which we report a head-to-head comparison with prevailing methods for two benchmark reaction prediction tasks. In all cases, we observe near-perfect recapitulation of established reaction pathways and products by YARP, without the use of heuristics or other domain knowledge to guide reaction selection. In addition, YARP discovers many new kinetically relevant pathways and products reported here for the first time. This is achieved while simultaneously reducing the cost of reaction characterization nearly 100-fold and increasing transition state success rates and intended rates over 2-fold and 10-fold, respectively, compared with recent benchmarks. This combination of ultra-low cost and high reaction coverage creates opportunities to explore the reactivity of larger systems and more complex reaction networks for applications like chemical degradation, where approaches based on domain heuristics fail.</p>




2012 ◽  
Author(s):  
Todd Wareham ◽  
Robert Robere ◽  
Iris van Rooij

2020 ◽  
Vol 2020 (14) ◽  
pp. 378-1-378-7
Author(s):  
Tyler Nuanes ◽  
Matt Elsey ◽  
Radek Grzeszczuk ◽  
John Paul Shen

We present a high-quality sky segmentation model for depth refinement and investigate residual architecture performance to inform optimally shrinking the network. We describe a model that runs in near real-time on a mobile device, present a new, high-quality dataset, and detail a unique weighting to trade off false positives and false negatives in binary classifiers. We show how the optimizations improve bokeh rendering by correcting stereo depth misprediction in sky regions. We detail techniques used to preserve edges, reject false positives, and ensure generalization to the diversity of sky scenes. Finally, we present a compact model and compare the performance of four popular residual architectures (ShuffleNet, MobileNetV2, ResNet-101, and ResNet-34-like) at constant computational cost.
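The abstract does not specify the weighting it uses. As a generic illustration only, one common way to trade off false positives against false negatives in a binary classifier is a class-weighted binary cross-entropy. The function name `weighted_bce` and the parameters `fn_weight` and `fp_weight` below are hypothetical and are not taken from the paper.

```python
import math


def weighted_bce(probs, labels, fn_weight=1.0, fp_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with separate weights for the two error types.

    probs  -- predicted probabilities of the positive class (e.g. "sky")
    labels -- ground-truth labels, 1 for positive and 0 for negative
    fn_weight scales the loss on positive pixels (penalizes false negatives);
    fp_weight scales the loss on negative pixels (penalizes false positives).
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        total -= fn_weight * y * math.log(p) + fp_weight * (1 - y) * math.log(1.0 - p)
    return total / len(probs)
```

Raising `fp_weight` above `fn_weight` pushes such a classifier toward rejecting uncertain pixels, which is one plausible way to favor fewer false positives.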


2012 ◽  
Vol 2 (1) ◽  
pp. 7-9 ◽  
Author(s):  
Satinderjit Singh

Median filtering is a commonly used technique in image processing. The main problem of the median filter is its high computational cost (for sorting N pixels, the time complexity is O(N·log N), even with the most efficient sorting algorithms). When the median filter must be carried out in real time, a software implementation on general-purpose processors does not usually give good results. This paper presents an efficient algorithm for median filtering with a 3x3 filter kernel that needs only about 9 comparisons per pixel by exploiting spatial coherence between neighboring filter computations. The basic algorithm calculates two medians in one step and reuses sorted slices of three vertical neighboring pixels. An extension of this algorithm for 2D spatial coherence is also examined, which calculates four medians per step.
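The idea of reusing sorted vertical slices can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it computes one median per step rather than two, and it relies on the standard identity that the median of nine values arranged as three sorted columns equals the median of (the largest column minimum, the median of the column medians, the smallest column maximum). The names `sort3` and `median3x3` are hypothetical.

```python
def sort3(a, b, c):
    """Return (low, mid, high) of three values using three comparisons."""
    if a > b:
        a, b = b, a
    if b > c:
        b, c = c, b
    if a > b:
        a, b = b, a
    return a, b, c


def median3x3(image):
    """3x3 median filter on a 2D list of numbers; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        # Sort each vertical 3-pixel slice once per row; every slice is then
        # shared by up to three horizontally neighboring windows.
        cols = [sort3(image[y - 1][x], image[y][x], image[y + 1][x])
                for x in range(width)]
        for x in range(1, width - 1):
            left, centre, right = cols[x - 1], cols[x], cols[x + 1]
            max_of_lows = max(left[0], centre[0], right[0])
            mid_of_mids = sort3(left[1], centre[1], right[1])[1]
            min_of_highs = min(left[2], centre[2], right[2])
            out[y][x] = sort3(max_of_lows, mid_of_mids, min_of_highs)[1]
    return out
```

Because each sorted column is computed once and read by three windows, the per-pixel sorting work is amortized across neighbors, which is the kind of spatial coherence the abstract refers to.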


2020 ◽  
Author(s):  
Florencia Klein ◽  
Daniela Cáceres-Rojas ◽  
Monica Carrasco ◽  
Juan Carlos Tapia ◽  
Julio Caballero ◽  
...  

<p>Although molecular dynamics simulations allow for the study of interactions among virtually all biomolecular entities, metal ions still pose significant challenges to achieving an accurate structural and dynamical description of many biological assemblies. This is particularly the case for coarse-grained (CG) models. Although the reduced computational cost of CG methods often makes them the technique of choice for the study of large biomolecular systems, the parameterization of metal ions is still very crude or simply not available for the vast majority of CG force fields. Here, we show that incorporating statistical data retrieved from the Protein Data Bank (PDB) to set specific Lennard-Jones interactions can produce structurally accurate CG molecular dynamics simulations. Using this simple approach, we provide a set of interaction parameters for calcium, magnesium, and zinc ions, which cover more than 80% of the metal-bound structures reported in the PDB. Simulations performed using the SIRAH force field on several protein and DNA systems show that, with the present approach, it is possible to obtain non-bonded interaction parameters that obviate the use of topological constraints.</p>


2020 ◽  
Author(s):  
Shi Jun Ang ◽  
Wujie Wang ◽  
Daniel Schwalbe-Koda ◽  
Simon Axelrod ◽  
Rafael Gomez-Bombarelli

<div>Modeling dynamical effects in chemical reactions, such as post-transition state bifurcation, requires <i>ab initio</i> molecular dynamics simulations due to the breakdown of simpler static models like transition state theory. However, these simulations tend to be restricted to lower-accuracy electronic structure methods and sparse sampling because of their high computational cost. Here, we report the use of statistical learning to accelerate reactive molecular dynamics simulations by combining high-throughput <i>ab initio</i> calculations, graph-convolution interatomic potentials and active learning. This pipeline was demonstrated on an ambimodal trispericyclic reaction involving 8,8-dicyanoheptafulvene and 6,6-dimethylfulvene. With a dataset size of approximately 31,000 M06-2X/def2-SVP quantum mechanical calculations, the computational cost of exploring the reactive potential energy surface was reduced by an order of magnitude. Thousands of virtually costless picosecond-long reactive trajectories suggest that post-transition state bifurcation plays a minor role for the reaction in vacuum. Furthermore, a transfer-learning strategy effectively upgraded the potential energy surface to higher levels of theory ((SMD-)M06-2X/def2-TZVPD in vacuum and three other solvents, as well as the more accurate DLPNO-DSD-PBEP86 D3BJ/def2-TZVPD) using about 10% additional calculations for each surface. Since the larger basis set and the dynamic correlation capture intramolecular non-covalent interactions more accurately, they uncover longer lifetimes for the charge-separated intermediate on the more accurate potential energy surfaces. The character of the intermediate switches from entropic to thermodynamic upon including implicit solvation effects, with lifetimes increasing with solvent polarity. Analysis of 2,000 reactive trajectories on the chloroform PES shows qualitative agreement with the experimentally reported periselectivity for this reaction. This overall approach is broadly applicable and opens a door to the study of dynamical effects in larger, previously intractable reactive systems.</div>


Author(s):  
Yudong Qiu ◽  
Daniel Smith ◽  
Chaya Stern ◽  
Mudong Feng ◽  
Lee-Ping Wang

<div>The parameterization of torsional / dihedral angle potential energy terms is a crucial part of developing molecular mechanics force fields. Quantum mechanical (QM) methods are often used to provide samples of the potential energy surface (PES) for fitting the empirical parameters in these force field terms. To ensure that the sampled molecular configurations are thermodynamically feasible, constrained QM geometry optimizations are typically carried out, which relax the orthogonal degrees of freedom while fixing the target torsion angle(s) on a grid of values. However, the quality of results and computational cost are affected by various factors on a non-trivial PES, such as dependence on the chosen scan direction and the lack of efficient approaches to integrate results started from multiple initial guesses. In this paper we propose a systematic and versatile workflow called <i>TorsionDrive</i> to generate energy-minimized structures on a grid of torsion constraints by means of a recursive wavefront propagation algorithm, which resolves the deficiencies of conventional scanning approaches and generates higher quality QM data for force field development. The capabilities of our method are presented for multi-dimensional scans and multiple initial guess structures, and an integration with the MolSSI QCArchive distributed computing ecosystem is described. The method is implemented in an open-source software package that is compatible with many QM software packages and energy minimization codes.</div>


2020 ◽  
Author(s):  
Ali Raza ◽  
Arni Sturluson ◽  
Cory Simon ◽  
Xiaoli Fern

Virtual screenings can accelerate and reduce the cost of discovering metal-organic frameworks (MOFs) for their applications in gas storage, separation, and sensing. In molecular simulations of gas adsorption/diffusion in MOFs, the adsorbate-MOF electrostatic interaction is typically modeled by placing partial point charges on the atoms of the MOF. For the virtual screening of large libraries of MOFs, it is critical to develop computationally inexpensive methods to assign atomic partial charges to MOFs that accurately reproduce the electrostatic potential in their pores. Herein, we design and train a message passing neural network (MPNN) to predict the atomic partial charges on MOFs under a charge neutral constraint. A set of ca. 2,250 MOFs labeled with high-fidelity partial charges, derived from periodic electronic structure calculations, serves as training examples. In an end-to-end manner, from charge-labeled crystal graphs representing MOFs, our MPNN machine-learns features of the local bonding environments of the atoms and learns to predict partial atomic charges from these features. Our trained MPNN assigns high-fidelity partial point charges to MOFs with orders of magnitude lower computational cost than electronic structure calculations. To enhance the accuracy of virtual screenings of large libraries of MOFs for their adsorption-based applications, we make our trained MPNN model and MPNN-charge-assigned computation-ready, experimental MOF structures publicly available.
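The abstract does not say how the charge neutral constraint is enforced. A minimal sketch of one simple option, assumed here purely for illustration, is to shift every raw per-atom prediction by the mean excess charge so that the corrected charges sum to zero. The function name `neutralize_charges` is hypothetical.

```python
def neutralize_charges(raw_charges):
    """Shift raw per-atom charges uniformly so that they sum to zero.

    This uniform shift is an illustrative assumption, not necessarily the
    scheme used by the authors' MPNN.
    """
    excess_per_atom = sum(raw_charges) / len(raw_charges)
    return [q - excess_per_atom for q in raw_charges]
```

A uniform shift preserves the differences between atomic charges while guaranteeing overall neutrality of the structure.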


2020 ◽  
Author(s):  
Jingbai Li ◽  
Patrick Reiser ◽  
André Eberhard ◽  
Pascal Friederich ◽  
Steven Lopez

<p>Photochemical reactions are increasingly used to construct complex molecular architectures under mild and straightforward reaction conditions. Computational techniques are increasingly important for understanding the reactivities and chemoselectivities of photochemical isomerization reactions because they offer molecular bonding information along the excited state(s) of photodynamics. These photodynamics simulations are resource-intensive and are typically limited to 1–10 picoseconds and 1,000 trajectories due to high computational cost. Most organic photochemical reactions have excited-state lifetimes exceeding 1 picosecond, which places them outside possible computational studies. Westermayr <i>et al.</i> demonstrated that a machine learning approach could significantly lengthen photodynamics simulation times for a model system, methylenimmonium cation (CH<sub>2</sub>NH<sub>2</sub><sup>+</sup>).</p><p>We have developed a Python-based code, Python Rapid Artificial Intelligence <i>Ab Initio</i> Molecular Dynamics (PyRAI<sup>2</sup>MD), to accomplish the unprecedented 10 ns <i>cis-trans</i> photodynamics of <i>trans</i>-hexafluoro-2-butene (CF<sub>3</sub>–CH=CH–CF<sub>3</sub>) in 3.5 days. The same simulation would take approximately 58 years with ground-truth multiconfigurational dynamics. We proposed an innovative scheme combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to effectively sample the initial data, facilitating the adaptive sampling to generate an informative and data-efficient training set with 6,232 data points. Our neural networks achieved chemical accuracy (mean absolute error of 0.032 eV). Our 4,814 trajectories reproduced the S<sub>1</sub> half-life (60.5 fs) and the photochemical product ratio (<i>trans</i>:<i>cis</i> = 2.3:1), and autonomously discovered a pathway towards a carbene. The neural networks have also shown the capability of generalizing the full potential energy surface with chemically incomplete data (<i>trans</i> → <i>cis</i> but not <i>cis</i> → <i>trans</i> pathways), which may offer future automated photochemical reaction discoveries.</p>

