Multi-scale approach for the prediction of atomic scale properties

2021 ◽  
Author(s):  
Andrea Grisafi ◽  
Jigyasa Nigam ◽  
Michele Ceriotti

Multi-scale equivariant representations overcome the nearsightedness of local machine-learning approaches.

2021 ◽  
Author(s):  
Christopher Feeney ◽  
Jack Cosby ◽  
David Robinson ◽  
Amy Thomas ◽  
Bridget Emmett

Soil organic carbon (SOC) is the largest reservoir of organic carbon in the terrestrial biosphere and is the main constituent of soil organic matter, which underpins key soil functions such as storage and filtration of water, and nutrient cycling. SOC concentrations are controlled by several dynamic variables, ranging from micro-scale properties like particle aggregation, to larger-scale drivers such as climate and land cover. Hence, soils are vulnerable to climate change and human disturbances, with implications for ecosystem services such as agriculture and global warming mitigation. Recent decades have seen greater efforts to monitor SOC dynamics, such as the UKCEH Countryside Survey, and to predict concentrations of SOC where we have no measurements, using geostatistics or machine learning approaches. Yet, there is still much to be understood about what controls spatial patterns of SOC, and how effectively different modelling approaches can capture this. Here, we compare predictions by nine maps of the spatial distribution of topsoil SOC in Great Britain. We found broad similarities in SOC concentrations predicted by all maps, which each showed right-skewed distributions with similar median values (43 to 97 g kg⁻¹). The greatest differences between maps occur at higher latitudes and are reflected in the upper ends of the SOC distributions. While the maps generally exhibit a sharp rise in SOC concentrations with increasing latitude from ~54°N, values predicted by the ISRIC-2017 and FAO-GSOC maps show weaker increases with increasing latitude, and peak at lower values of 332 g kg⁻¹ and 354 g kg⁻¹, respectively. We demonstrate that most of the maps, regardless of the modelling approach taken or the underlying data used, produced similar estimates of SOC concentration, including broad spatial patterns.
This work will form the basis of more detailed future assessments of the sensitivity of SOC mapping to analytical methods versus the data used to drive these methods, and will be used to assess the importance of using stratified random field survey approaches for generating more accurate predictions of areas that cannot be sampled. Exploration of why and where different and coincident SOC predictions occur between maps should shed light on the utility of different modelling techniques and machine-learning meta-analyses of driving variables currently used to map SOC. Understanding how SOC predictions differ across all current national scale GB maps is a first step in improving modelling and assessment of SOC stock and change.


2019 ◽  
Vol 73 (12) ◽  
pp. 972-982 ◽  
Author(s):  
Félix Musil ◽  
Michele Ceriotti

Statistical learning algorithms are finding more and more applications in science and technology. Atomic-scale modeling is no exception, with machine learning becoming commonplace as a tool to predict energy, forces and properties of molecules and condensed-phase systems. This short review summarizes recent progress in the field, focusing in particular on the problem of representing an atomic configuration in a mathematically robust and computationally efficient way. We also discuss some of the regression algorithms that have been used to construct surrogate models of atomic-scale properties. We then show examples of how the optimization of the machine-learning models can both incorporate and reveal insights into the physical phenomena that underlie structure–property relations.


2020 ◽  
pp. 1911-1937 ◽  
Author(s):  
Michele Ceriotti ◽  
Michael J. Willatt ◽  
Gábor Csányi

2019 ◽  
Vol 70 (3) ◽  
pp. 214-224
Author(s):  
Bui Ngoc Dung ◽  
Manh Dzung Lai ◽  
Tran Vu Hieu ◽  
Nguyen Binh T. H.

Video surveillance is an emerging research field in intelligent transport systems. This paper presents techniques that use machine learning and computer vision for vehicle detection and tracking. First, machine learning approaches using Haar-like features and the AdaBoost algorithm for vehicle detection are presented. Second, approaches are given for detecting vehicles using background subtraction based on a Gaussian mixture model and for tracking vehicles using optical flow and multiple Kalman filters. The method has the advantage of distinguishing and tracking multiple vehicles individually. The experimental results demonstrate the high accuracy of the method.
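As an illustration of the tracking step mentioned in this abstract, the following is a minimal sketch of a Kalman filter for one coordinate of one vehicle. It is not the authors' implementation: the constant-velocity motion model, the class and function names, and the noise values are all assumptions chosen for illustration. Tracking several vehicles would mean keeping one such filter per vehicle (and per image axis).

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements.

    All default values below are hypothetical and not taken from the paper.
    """
    dt: float = 1.0             # time between frames (assumed)
    process_var: float = 1e-3   # process noise added to each diagonal term (assumed)
    meas_var: float = 1.0       # measurement noise variance (assumed)
    pos: float = 0.0
    vel: float = 0.0
    cov: List[List[float]] = field(
        default_factory=lambda: [[100.0, 0.0], [0.0, 100.0]]
    )

    def predict(self) -> None:
        """Propagate state and covariance one step: x = F x, P = F P F^T + Q."""
        (p00, p01), (p10, p11) = self.cov
        dt, q = self.dt, self.process_var
        self.pos += dt * self.vel
        self.cov = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11],
            [p10 + dt * p11, p11 + q],
        ]

    def update(self, z: float) -> None:
        """Correct the prediction with a measured position z."""
        (p00, p01), (p10, p11) = self.cov
        innovation = z - self.pos
        s = p00 + self.meas_var          # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain for position and velocity
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        self.cov = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def track(measurements: Iterable[float], **kwargs: float) -> ConstantVelocityKalman1D:
    """Run predict/update over a sequence of measured positions."""
    kf = ConstantVelocityKalman1D(**kwargs)
    for z in measurements:
        kf.predict()
        kf.update(z)
    return kf
```

For example, `track([2.0 * t for t in range(1, 51)])` feeds the filter noiseless positions of an object moving two units per frame; the velocity estimate in `kf.vel` should settle close to 2.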


2017 ◽  
Author(s):  
Sabrina Jaeger ◽  
Simone Fulle ◽  
Samo Turk

Inspired by natural language processing techniques, we here introduce Mol2vec, which is an unsupervised machine learning approach to learn vector representations of molecular substructures. Similarly to the Word2vec models, where vectors of closely related words are in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can finally be encoded as vectors by summing up the vectors of the individual substructures and, for instance, fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pre-trained once, yields dense vector representations and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment independent and can thus also be easily used for proteins with low sequence similarities.
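The encoding step described in this abstract, where a compound vector is the sum of its substructure vectors, can be sketched as follows. This is a toy illustration rather than the Mol2vec code: the substructure identifiers, the three-dimensional embedding values, and the function names are hypothetical, and skipping identifiers that have no embedding is an assumption of this sketch.

```python
import math
from typing import Dict, Iterable, List

Vector = List[float]

# Hypothetical pre-trained substructure embeddings (toy values, 3 dimensions).
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "sub_a": [1.0, 0.0, 2.0],
    "sub_b": [0.5, 1.0, 0.0],
    "sub_c": [0.0, 3.0, 1.0],
}


def compound_vector(substructures: Iterable[str],
                    embeddings: Dict[str, Vector]) -> Vector:
    """Encode a compound as the element-wise sum of its substructure vectors."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for identifier in substructures:
        vec = embeddings.get(identifier)
        if vec is None:
            # Assumption of this sketch: identifiers without an embedding are skipped.
            continue
        total = [a + b for a, b in zip(total, vec)]
    return total


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

The resulting fixed-length vectors could then serve as input features for a supervised model, and `cosine_similarity` gives one way to check whether two vectors point in similar directions.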

