MLSolvA: solvation free energy prediction from pairwise atomistic interactions by machine learning

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Hyuntae Lim ◽  
YounJoon Jung

Abstract: Recent advances in machine learning technologies and their applications have led to the development of diverse structure–property relationship models for crucial chemical properties. The solvation free energy is one of them. Here, we introduce a novel ML-based solvation model, which calculates the solvation energy from pairwise atomistic interactions. The novelty of the proposed model lies in its simple architecture: two encoding functions extract atomic feature vectors from the given chemical structures, while the inner product between two atomistic feature vectors gives their interaction. Evaluated on 6239 experimental measurements, the model achieves outstanding performance and, owing to its solvent-non-specific nature, transferability that allows the training data to be enlarged. An analysis of the interaction map shows that our model has significant potential for producing group contributions to the solvation energy, which indicates that the model provides not only predictions of target properties but also more detailed physicochemical insights.
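The architecture described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the `encode` function is a hypothetical stand-in (one linear map plus `tanh`) for the two encoding functions, the weights and input descriptors are random placeholders, and summing the interaction map to obtain the total energy is an assumption suggested by the "group contributions" remark.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(atom_inputs, weights):
    """Hypothetical encoder: maps raw per-atom descriptors to feature vectors."""
    return np.tanh(atom_inputs @ weights)

def pairwise_solvation_energy(solvent_vecs, solute_vecs):
    """Inner products between every solvent/solute atom pair, then their sum."""
    # Entry (i, j) is the interaction of solvent atom i with solute atom j.
    interaction_map = solvent_vecs @ solute_vecs.T
    # Assumption: the predicted energy is the sum over all atom pairs.
    return float(interaction_map.sum()), interaction_map

# Placeholder inputs: 3 solvent atoms and 4 solute atoms, 5 raw descriptors each.
solvent_inputs = rng.normal(size=(3, 5))
solute_inputs = rng.normal(size=(4, 5))
# Two separate (untrained, random) encoders, one per molecule type.
w_solvent = rng.normal(size=(5, 8))
w_solute = rng.normal(size=(5, 8))

solvent_vecs = encode(solvent_inputs, w_solvent)
solute_vecs = encode(solute_inputs, w_solute)
energy, interaction_map = pairwise_solvation_energy(solvent_vecs, solute_vecs)

# Column sums attribute the total to individual solute atoms (or groups of them).
per_solute_atom = interaction_map.sum(axis=0)
```

Because the total is a plain sum over the map, any partition of the solute atoms into groups yields contributions that add up to the predicted energy, which is what makes the interaction map easy to interpret.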

2021 ◽  
Author(s):  
Hyuntae Lim ◽  
YounJoon Jung

Abstract: Recent advances in machine learning technologies and their applications have led to the development of diverse structure-property relationship models for crucial chemical properties. The solvation free energy is one of them. Here, we introduce a novel ML-based solvation model, which calculates the solvation energy from pairwise atomistic interactions. The novelty of the proposed model lies in its simple architecture: two encoding functions extract atomic feature vectors from the given chemical structures, while the inner product between two atomistic features gives their interaction. Evaluated on 6,493 experimental measurements, the model achieves outstanding performance and, owing to its solvent-non-specific nature, transferability that allows the training data to be enlarged. An analysis of the interaction map shows that our model has significant potential for producing group contributions to the solvation energy, which indicates that the model provides not only predictions of target properties but also more detailed physicochemical insights.


2022 ◽  
Author(s):  
Yunsie Chung ◽  
Florence H. Vermeire ◽  
Haoyang Wu ◽  
Pierre J. Walker ◽  
Michael H. Abraham ◽  
...  

We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain whilst the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to calculate solvation energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Yet, even though the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the 3 models, and when the 3 models are combined together, they can provide even more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software package, and source code.
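The route from solute parameters to a solvation free energy can be illustrated as follows. The sketch assumes the commonly used Abraham form log K = c + eE + sS + aA + bB + lL and the conversion ΔG = −RT ln(10) log K; the abstract only says "linear free energy relationships", so both are assumptions here. All numeric values are made-up placeholders, and averaging the three model outputs is only one possible way to combine them, not necessarily the authors' scheme.

```python
import math
from statistics import mean, pstdev

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol K)
TEMPERATURE = 298.0    # K, the temperature quoted in the abstract

def log_k(solute, solvent):
    """Abraham-type LFER: log K = c + eE + sS + aA + bB + lL (assumed form)."""
    return solvent["c"] + sum(solvent[p.lower()] * solute[p] for p in "ESABL")

def dg_solv(solute, solvent, temperature=TEMPERATURE):
    """Solvation free energy in kcal/mol from the partition coefficient."""
    return -R_KCAL * temperature * math.log(10) * log_k(solute, solvent)

# Placeholder solute parameters and solvent coefficients (not real data).
solute = {"E": 0.5, "S": 0.4, "A": 0.1, "B": 0.2, "L": 2.0}
solvent = {"c": 0.1, "e": 0.2, "s": 1.0, "a": 2.0, "b": 3.0, "l": 0.5}

lfer_log_k = log_k(solute, solvent)
lfer_dg = dg_solv(solute, solvent)

# Hypothetical predictions from three models for the same solute/solvent pair.
predictions = [lfer_dg, -3.10, -3.45]
combined = mean(predictions)   # simple average as the combined estimate
spread = pstdev(predictions)   # a large spread flags an uncertain prediction
```

A threshold on `spread` would then separate confident from uncertain predictions; the appropriate cutoff would have to be calibrated against held-out data.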


2021 ◽  
Author(s):  
Yunsie Chung ◽  
Florence H. Vermeire ◽  
Haoyang Wu ◽  
Pierre J. Walker ◽  
Michael H. Abraham ◽  
...  

We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain whilst the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to calculate solvation energy and enthalpy via linear free energy relationships. Extensive data sets containing 8366 solute parameters, 20253 solvation free energies, and 6322 solvation enthalpies are compiled in this work to train the models. The three models are each evaluated on the same test sets using both random and substructure-based solute splits for solvation energy and enthalpy predictions. The results show that the DirectML model is superior to the SoluteML and SoluteGC models for both predictions and can provide accuracy comparable to that of advanced quantum chemistry methods. Yet, even though the DirectML model performs better in general, all three models are useful for various purposes. Uncertain predicted values can be identified by comparing the 3 models, and when the 3 models are combined together, they can provide even more accurate predictions than any one of them individually. Finally, we present our compiled solute parameter, solvation energy, and solvation enthalpy databases (SoluteDB, dGsolvDBx, dHsolvDB) and provide public access to our final prediction models through a simple web-based tool, software package, and source code.


Author(s):  
Aditya Nandy ◽  
Jiazhou Zhu ◽  
Jon Paul Janet ◽  
Chenru Duan ◽  
Rachel Getman ◽  
...  

Metal-oxo moieties are important catalytic intermediates in the selective partial oxidation of hydrocarbons and in water splitting. Stable metal-oxo species have reactive properties that vary depending on the spin state of the metal, complicating the development of structure-property relationships. To overcome these challenges, we train the first machine learning (ML) models capable of predicting metal-oxo formation energies across a range of first-row metals, oxidation states, and spin states. Using connectivity-only features tailored for inorganic chemistry as inputs to kernel ridge regression or artificial neural network ML models, we achieve good mean absolute errors (4-5 kcal/mol) on set-aside test data across a range of ligand orientations. Analysis of feature importance for oxo formation energy prediction reveals the dominance of non-local, electronic ligand properties in contrast to other transition metal complex properties (e.g., spin-state or ionization potential). We enumerate the theoretical catalyst space with an ANN, revealing both expected trends in oxo formation energetics, such as destabilization of the metal-oxo species with increasing d-filling, as well as exceptions, such as weak correlations with indicators of oxidative stability of the metal in the resting state or unexpected spin-state dependence in reactivity. We carry out uncertainty-aware evolutionary optimization using the ANN to explore a space of more than 37,000 candidate catalysts. New metal and oxidation state combinations are uncovered and validated with density functional theory (DFT), including counter-intuitive oxo-formation energies for oxidatively stable complexes. This approach doubles the density of confirmed DFT leads in originally sparsely populated regions of property space, highlighting the potential of ML-model-driven discovery to uncover catalyst design rules and exceptions.
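For readers unfamiliar with kernel ridge regression, one of the two model types named in this abstract, the following is a generic sketch. It assumes an RBF kernel and uses random placeholder feature vectors and targets; the actual features, kernel, and hyperparameters of the study are not given in the abstract.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

def krr_fit(x_train, y_train, gamma, lam):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    k = rbf_kernel(x_train, x_train, gamma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)

def krr_predict(x_train, alpha, x_new, gamma):
    """Predict as a kernel-weighted sum over the training points."""
    return rbf_kernel(x_new, x_train, gamma) @ alpha

# Placeholder data: 20 "complexes" described by 3 features each.
rng = np.random.default_rng(1)
x_train = rng.normal(size=(20, 3))
y_train = np.sin(x_train.sum(axis=1))
GAMMA, LAM = 0.5, 1e-3   # hypothetical hyperparameters

alpha = krr_fit(x_train, y_train, GAMMA, LAM)
train_pred = krr_predict(x_train, alpha, x_train, GAMMA)
new_pred = krr_predict(x_train, alpha, rng.normal(size=(5, 3)), GAMMA)
```

In practice `gamma` and `lam` would be chosen by cross-validation, and the error would be reported on set-aside test data as the abstract describes.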


Author(s):  
Peiyuan Gao ◽  
Xiu Yang ◽  
Yuhang Tang ◽  
Muqing Zheng ◽  
Amity Andersen ◽  
...  

The solvation free energy of organic molecules is a critical parameter in determining emergent properties such as solubility, liquid-phase equilibrium constants, and pKa and redox potentials in an organic redox...


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yue Wang

Real-time collection of athletes’ abnormal training data can improve the training effect of athletes. This paper studies a machine-learning-based method for collecting such data in real time, so that the training effect can be evaluated and improved promptly. Four sensor nodes are placed on the athletes’ upper and lower limbs to collect angular velocity, acceleration, and magnetic field strength data during training. The data are sent through wireless sensors to the data transmission base station, which forwards them to the data processing terminal. The data processing terminal calculates the difference between the sample values of each sensor to obtain the data dispersion of each sensor. The dispersion is used to obtain time-domain and frequency-domain features of each data dimension, from which 32-dimensional feature vectors are constructed and input into a hidden Markov model. The forward algorithm is used to obtain the probability of the final observation sequence, so as to realize the final collection of athletes’ abnormal training data. The experimental results show that the accuracy and recall rate of the abnormal data collected by this method are higher than 98% and that the method requires less time.
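The forward algorithm mentioned in this abstract computes the probability of an observation sequence under a hidden Markov model. The sketch below assumes a two-state model with discrete observation symbols; the paper's 32-dimensional feature vectors would need either quantization or continuous emission densities, and every probability here is a made-up placeholder.

```python
import numpy as np

def forward_likelihood(start_prob, trans_prob, emit_prob, observations):
    """Probability of an observation sequence under a discrete HMM."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = start_prob * emit_prob[:, observations[0]]
    for obs in observations[1:]:
        # Propagate one step through the transitions, then weight by emission.
        alpha = (alpha @ trans_prob) * emit_prob[:, obs]
    # Marginalize over the final hidden state.
    return float(alpha.sum())

# Hypothetical model: state 0 = normal training, state 1 = abnormal training.
start_prob = np.array([0.6, 0.4])
trans_prob = np.array([[0.7, 0.3],
                       [0.4, 0.6]])
# Rows are states, columns are observation symbols (0 = low, 1 = high dispersion).
emit_prob = np.array([[0.9, 0.1],
                      [0.2, 0.8]])

sequence = [0, 1]
likelihood = forward_likelihood(start_prob, trans_prob, emit_prob, sequence)
```

One way to use such a score, again as an assumption, is to compare the likelihood of a window of sensor data under a model of normal training against a threshold and flag low-likelihood windows as abnormal.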


2019 ◽  
Author(s):  
Aditya Nandy ◽  
Jiazhou Zhu ◽  
Jon Paul Janet ◽  
Chenru Duan ◽  
Rachel Getman ◽  
...  

Metal-oxo moieties are important catalytic intermediates in the selective partial oxidation of hydrocarbons and in water splitting. Stable metal-oxo species have reactive properties that vary depending on the spin state of the metal, complicating the development of structure-property relationships. To overcome these challenges, we train the first machine learning (ML) models capable of predicting metal-oxo formation energies across a range of first-row metals, oxidation states, and spin states. Using connectivity-only features tailored for inorganic chemistry as inputs to kernel ridge regression or artificial neural network ML models, we achieve good mean absolute errors (4-5 kcal/mol) on set-aside test data across a range of ligand orientations. Analysis of feature importance for oxo formation energy prediction reveals the dominance of non-local, electronic ligand properties in contrast to other transition metal complex properties (e.g., spin-state or ionization potential). We enumerate the theoretical catalyst space with an ANN, revealing both expected trends in oxo formation energetics, such as destabilization of the metal-oxo species with increasing d-filling, as well as exceptions, such as weak correlations with indicators of oxidative stability of the metal in the resting state or unexpected spin-state dependence in reactivity. We carry out uncertainty-aware evolutionary optimization using the ANN to explore a space of more than 37,000 candidate catalysts. New metal and oxidation state combinations are uncovered and validated with density functional theory (DFT), including counter-intuitive oxo-formation energies for oxidatively stable complexes. This approach doubles the density of confirmed DFT leads in originally sparsely populated regions of property space, highlighting the potential of ML-model-driven discovery to uncover catalyst design rules and exceptions.


2019 ◽  
Vol 31 (3) ◽  
pp. 376-389 ◽  
Author(s):  
Congying Guan ◽  
Shengfeng Qin ◽  
Yang Long

Purpose: The big challenge in apparel recommendation system research is not the exploration of machine learning technologies in fashion, but to really understand clothes, fashion and people, and know what to learn. The purpose of this paper is to explore an advanced apparel style learning and recommendation system that can recognise deep design-associated features of clothes and learn the connotative meanings conveyed by these features relating to style and the body so that it can make recommendations as a skilled human expert. Design/methodology/approach: This study first proposes a type of new clothes style training data. Second, it designs three intelligent apparel-learning models based on newly proposed training data including ATTRIBUTE, MEANING and the raw image data, and compares the models’ performances in order to identify the best learning model. For deep learning, two models are introduced to train the prediction model, one is a convolutional neural network joint with the baseline classifier support vector machine and the other is with a newly proposed classifier later kernel fusion. Findings: The results show that the most accurate model (with average prediction rate of 88.1 per cent) is the third model that is designed with two steps, one is to predict apparel ATTRIBUTEs through the apparel images, and the other is to further predict apparel MEANINGs based on predicted ATTRIBUTEs. The results indicate that adding the proposed ATTRIBUTE data that captures the deep features of clothes design does improve the model performances (e.g. from 73.5 per cent, Model B to 86 per cent, Model C), and the new concept of apparel recommendation based on style meanings is technically applicable. Originality/value: The apparel data and the design of three training models are originally introduced in this study. The proposed methodology can evaluate the pros and cons of different clothes feature extraction approaches through either images or design attributes and balance different machine learning technologies between the latest CNN and traditional SVM.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Amin Alibakhshi ◽  
Bernd Hartke

Abstract: Theoretical estimation of solvation free energy by continuum solvation models, as a standard approach in computational chemistry, is extensively applied by a broad range of scientific disciplines. Nevertheless, the current widely accepted solvation models are either inaccurate in reproducing experimentally determined solvation free energies or require a number of macroscopic observables which are not always readily available. In the present study, we develop and introduce the Machine-Learning Polarizable Continuum solvation Model (ML-PCM) for a substantial improvement of the predictability of solvation free energy. The performance and reliability of the developed models are validated through a rigorous and demanding validation procedure. The ML-PCM models developed in the present study improve the accuracy of widely accepted continuum solvation models by almost one order of magnitude with almost no additional computational costs. Freely available software is developed and provided for a straightforward implementation of the new approach.

