Exchange Spin Coupling from Gaussian Process Regression

Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.

2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


2020 ◽  
Author(s):  
Marc Philipp Bahlke ◽  
Natnael Mogos ◽  
Jonny Proppe ◽  
Carmen Herrmann

Heisenberg exchange spin coupling between metal centers is essential for describing and understanding the electronic structure of many molecular catalysts, metalloenzymes, and molecular magnets for potential application in information technology. We explore the machine-learnability of exchange spin coupling, which has not been studied yet. We employ Gaussian process regression since it can potentially deal with small training sets (as likely associated with the rather complex molecular structures required for exploring spin coupling) and since it provides uncertainty estimates (“error bars”) along with predicted values. We compare a range of descriptors and kernels for 257 small dicopper complexes and find that a simple descriptor based on chemical intuition, consisting only of copper-bridge angles and copper-copper distances, clearly outperforms several more sophisticated descriptors when it comes to extrapolating towards larger experimentally relevant complexes. Exchange spin coupling is similarly easy to learn as the polarizability, while learning dipole moments is much harder. The strength of the sophisticated descriptors lies in their ability to linearize structure-property relationships, to the point that a simple linear ridge regression performs just as well as the kernel-based machine-learning model for our small dicopper data set. The superior extrapolation performance of the simple descriptor is unique to exchange spin coupling, reinforcing the crucial role of choosing a suitable descriptor, and highlighting the interesting question of the role of chemical intuition vs. systematic or automated selection of features for machine learning in chemistry and material science.


2021 ◽  
Author(s):  
Ryan Volz ◽  
Jorge L. Chau ◽  
Philip J. Erickson ◽  
Juha P. Vierinen ◽  
J. Miguel Urco ◽  
...  

Abstract. Mesoscale dynamics in the mesosphere and lower thermosphere (MLT) region have been difficult to study from either ground- or satellite-based observations. For understanding of atmospheric coupling processes, important spatial scales at these altitudes range between tens to hundreds of kilometers in the horizontal plane. To date, this scale size is challenging observationally, and so structures are usually parameterized in global circulation models. The advent of multistatic specular meteor radar networks allows exploration of MLT mesocale dynamics on these scales using an increased number of detections and a diversity of viewing angles inherent to multistatic networks. In this work, we introduce a four dimensional wind field inversion method that makes use of Gaussian process regression (GPR), a non-parametric and Bayesian approach. The method takes measured projected wind velocities and prior distributions of the wind velocity as a function of space and time, specified by the user or estimated from the data, and produces posterior distributions for the wind velocity. Computation of the predictive posterior distribution is performed on sampled points of interest and is not necessarily regularly sampled. The main benefits of the GPR method include this non-gridded sampling, the built-in statistical uncertainty estimates, and the ability to horizontally-resolve winds on relatively small scales. The performance of the GPR implementation has been evaluated on Monte Carlo simulations with known distributions using the same spatial and temporal sampling as one day of real meteor measurements. Based on the simulation results we find that the GPR implementation is robust, providing wind fields that are statistically unbiased and with statistical variances that depend on the geometry and are proportional to the prior velocity variances. A conservative and fast approach can be straightforwardly implemented by employing overestimated prior variances and distances, while a more robust but computationally intensive approach can be implemented by employing training and fitting of model parameters. The latter GPR approach has been applied to a 24-hour data set and shown to compare well to previously used homogeneous and gradient methods. Small scale features have reasonably low statistical uncertainties, implying geophysical wind field horizontal structures as low as 20–50 km. We suggest that this GPR approach forms a suitable method for MLT regional and weather studies.


2021 ◽  
Vol 13 (3) ◽  
pp. 403
Author(s):  
Luca Pipia ◽  
Eatidal Amin ◽  
Santiago Belda ◽  
Matías Salinero-Delgado ◽  
Jochem Verrelst

For the last decade, Gaussian process regression (GPR) proved to be a competitive machine learning regression algorithm for Earth observation applications, with attractive unique properties such as band relevance ranking and uncertainty estimates. More recently, GPR also proved to be a proficient time series processor to fill up gaps in optical imagery, typically due to cloud cover. This makes GPR perfectly suited for large-scale spatiotemporal processing of satellite imageries into cloud-free products of biophysical variables. With the advent of the Google Earth Engine (GEE) cloud platform, new opportunities emerged to process local-to-planetary scale satellite data using advanced machine learning techniques and convert them into gap-filled vegetation properties products. However, GPR is not yet part of the GEE ecosystem. To circumvent this limitation, this work proposes a general adaptation of GPR formulation to parallel processing framework and its integration into GEE. To demonstrate the functioning and utility of the developed workflow, a GPR model predicting green leaf area index (LAIG) from Sentinel-2 imagery was imported. Although by running this GPR model into GEE any corner of the world can be mapped into LAIG at a resolution of 20 m, here we show some demonstration cases over western Europe with zoom-ins over Spain. Thanks to the computational power of GEE, the mapping takes place on-the-fly. Additionally, a GPR-based gap filling strategy based on pre-optimized kernel hyperparameters is also put forward for the generation of multi-orbit cloud-free LAIG maps with an unprecedented level of detail, and the extraction of regularly-sampled LAIG time series at a pixel level. The ability to plugin a locally-trained GPR model into the GEE framework and its instant processing opens up a new paradigm of remote sensing image processing.


2019 ◽  
Vol 127 (4) ◽  
pp. 959-973 ◽  
Author(s):  
Seung Ho Yeom ◽  
Ji Sung Na ◽  
Hwi-Dong Jung ◽  
Hyung-Ju Cho ◽  
Yoon Jeong Choi ◽  
...  

Obstructive sleep apnea (OSA) is a common sleep breathing disorder. With the use of computational fluid dynamics (CFD), this study provides a quantitative standard for accurate diagnosis and effective surgery based on the investigation of the relationship between airway geometry and aerodynamic characteristics. Based on computed tomography data from patients having normal geometry, 4 major geometric parameters were selected and a total of 160 idealized cases were modeled and simulated. We created a predictive model using Gaussian process regression (GPR) through a data set obtained through numerical method. The results demonstrated that the mean accuracy of the overall GPR model was ~72% with respect to the CFD results for the realistic upper airway model. A support vector machine model was also used to identify the degree of OSA symptoms in patients as normal-mild and moderate and severe. We achieved an accuracy of 82.5% with the training data set and an accuracy of 80% with the test data set. NEW & NOTEWORTHY There have been many studies on the analysis of obstructive sleep apnea (OSA) through computational fluid dynamics and finite element analysis. However, these methods are not useful for practical medical applications because they have limited information for OSA symptom. This study employs the machine learning algorithm to predict flow characteristics quickly and to determine the symptoms of the patient's OSA. The overall Gaussian process regression model's mean accuracy was ~72%, and the accuracy for the classification of OSA was >80%.


2016 ◽  
Vol 2016 ◽  
pp. 1-8 ◽  
Author(s):  
Nhat-Duc Hoang ◽  
Anh-Duc Pham ◽  
Quoc-Lam Nguyen ◽  
Quang-Nhat Pham

This research carries out a comparative study to investigate a machine learning solution that employs the Gaussian Process Regression (GPR) for modeling compressive strength of high-performance concrete (HPC). This machine learning approach is utilized to establish the nonlinear functional mapping between the compressive strength and HPC ingredients. To train and verify the aforementioned prediction model, a data set containing 239 HPC experimental tests, recorded from an overpass construction project in Danang City (Vietnam), has been collected for this study. Based on experimental outcomes, prediction results of the GPR model are superior to those of the Least Squares Support Vector Machine and the Artificial Neural Network. Furthermore, GPR model is strongly recommended for estimating HPC strength because this method demonstrates good learning performance and can inherently express prediction outputs coupled with prediction intervals.


Author(s):  
Sachin Dev Suresh ◽  
Ali Qasim ◽  
Bhajan Lal ◽  
Syed Muhammad Imran ◽  
Khor Siak Foo

The production of oil and natural gas contributes to a significant amount of revenue generation in Malaysia thereby strengthening the country’s economy. The flow assurance industry is faced with impediments during smooth operation of the transmission pipeline in which gas hydrate formation is the most important. It affects the normal operation of the pipeline by plugging it. Under high pressure and low temperature conditions, gas hydrate is a crystalline structure consisting of a network of hydrogen bonds between host molecules of water and guest molecules of the incoming gases. Industry uses different types of chemical inhibitors in pipeline to suppress hydrate formation. To overcome this problem, machine learning algorithm has been introduced as part of risk management strategies. The objective of this paper is to utilize Machine Learning (ML) model which is Gaussian Process Regression (GPR). GPR is a new approach being applied to mitigate the growth of gas hydrate. The input parameters used are concentration and pressure of Carbon Dioxide (CO2) and Methane (CH4) gas hydrates whereas the output parameter is the Average Depression Temperature (ADT). The values for the parameter are taken from available data sets that enable GPR to predict the results accurately in terms of Coefficient of Determination, R2 and Mean Squared Error, MSE. The outcome from the research showed that GPR model provided with highest R2 value for training and testing data of 97.25% and 96.71%, respectively. MSE value for GPR was also found to be lowest for training and testing data of 0.019 and 0.023, respectively.


Symmetry ◽  
2020 ◽  
Vol 12 (4) ◽  
pp. 669 ◽  
Author(s):  
Eunseo Oh ◽  
Hyunsoo Lee

The developments in the fields of industrial Internet of Things (IIoT) and big data technologies have made it possible to collect a lot of meaningful industrial process and quality-based data. The gathered data are analyzed using contemporary statistical methods and machine learning techniques. Then, the extracted knowledge can be used for predictive maintenance or prognostic health management. However, it is difficult to gather complete data due to several issues in IIoT, such as devices breaking down, running out of battery, or undergoing scheduled maintenance. Data with missing values are often ignored, as they may contain insufficient information from which to draw conclusions. In order to overcome these issues, we propose a novel, effective missing data handling mechanism for the concepts of symmetry principles. While other existing methods only attempt to estimate missing parts, the proposed method generates a whole set of data set using Gaussian process regression and a generative adversarial network. In order to prove the effectiveness of the proposed framework, we examine a real-world, industrial case involving an air pressure system (APS), where we use the proposed method to make quality predictions and compare the results with existing state-of-the-art estimation methods.


2015 ◽  
Vol 12 (4) ◽  
pp. 1205-1221 ◽  
Author(s):  
H. Post ◽  
H. J. Hendricks Franssen ◽  
A. Graf ◽  
M. Schmidt ◽  
H. Vereecken

Abstract. The use of eddy covariance (EC) CO2 flux measurements in data assimilation and other applications requires an estimate of the random uncertainty. In previous studies, the (classical) two-tower approach has yielded robust uncertainty estimates, but care must be taken to meet the often competing requirements of statistical independence (non-overlapping footprints) and ecosystem homogeneity when choosing an appropriate tower distance. The role of the tower distance was investigated with help of a roving station separated between 8 m and 34 km from a permanent EC grassland station. Random uncertainty was estimated for five separation distances with the classical two-tower approach and an extended approach which removed systematic differences of CO2 fluxes measured at two EC towers. This analysis was made for a data set where (i) only similar weather conditions at the two sites were included, and (ii) an unfiltered one. The extended approach, applied to weather-filtered data for separation distances of 95 and 173 m gave uncertainty estimates in best correspondence with an independent reference method. The introduced correction for systematic flux differences considerably reduced the overestimation of the two-tower based uncertainty of net CO2 flux measurements and decreased the sensitivity of results to tower distance. We therefore conclude that corrections for systematic flux differences (e.g., caused by different environmental conditions at both EC towers) can help to apply the two-tower approach to more site pairs with less ideal conditions.


Sign in / Sign up

Export Citation Format

Share Document