Machine Learning Prediction of the Exfoliation Energies of Two-Dimension Materials via Data-Driven Approach

Author(s):  
Zhongyu Wan ◽  
Quan-De Wang
Author(s):  
Lidong Wu

The No-Free-Lunch theorem is an interesting and important theoretical result in machine learning. Based on the philosophy of the No-Free-Lunch theorem, we discuss at length the limitations of a data-driven approach to solving NP-hard problems.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Trevor David Rhone ◽  
Wei Chen ◽  
Shaan Desai ◽  
Steven B. Torrisi ◽  
Daniel T. Larson ◽  
...  

Abstract We use a data-driven approach to study the magnetic and thermodynamic properties of van der Waals (vdW) layered materials. We investigate monolayers of the form $\mathrm{A_2B_2X_6}$, based on the known material $\mathrm{Cr_2Ge_2Te_6}$, using density functional theory (DFT) calculations and machine learning methods to determine their magnetic properties, such as magnetic order and magnetic moment. We also examine formation energies and use them as a proxy for chemical stability. We show that machine learning tools, combined with DFT calculations, can provide a computationally efficient means to predict properties of such two-dimensional (2D) magnetic materials. Our data analytics approach provides insights into the microscopic origins of magnetic ordering in these systems. For instance, we find that the X site strongly affects the magnetic coupling between neighboring A sites, which drives the magnetic ordering. Our approach opens new ways for rapid discovery of chemically stable vdW materials that exhibit magnetic behavior.


2020 ◽  
Author(s):  
Jung-Hyun Kim ◽  
Simon I. Briceno ◽  
Cedric Y. Justin ◽  
Dimitri Mavris

2020 ◽  
Author(s):  
Adam Soffer ◽  
Morya Ifrach ◽  
Stefan Ilic ◽  
Ariel Afek ◽  
Dan Vilenchik ◽  
...  

Abstract DNA–protein interactions are essential in all aspects of every living cell. Understanding of how features embedded in the DNA sequence affect specific interactions with proteins is challenging but important, since it may contribute to finding the means to regulate metabolic pathways involving DNA–protein interactions. Using a massive experimental benchmark dataset of binding scores for DNA sequences and a machine learning workflow, we describe the binding to DNA of T7 primase, as a model system for specific DNA–protein interactions. Effective binding of T7 primase to its specific DNA recognition sequences triggers the formation of RNA primers that serve as Okazaki fragment start sites during DNA replication.


2020 ◽  
Vol 171 ◽  
pp. 105286 ◽  
Author(s):  
Mohit Taneja ◽  
John Byabazaire ◽  
Nikita Jalodia ◽  
Alan Davy ◽  
Cristian Olariu ◽  
...  

JAMIA Open ◽  
2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Fuchiang R Tsui ◽  
Lingyun Shi ◽  
Victor Ruiz ◽  
Neal D Ryan ◽  
Candice Biernesser ◽  
...  

Abstract
Objective: Limited research exists in predicting first-time suicide attempts that account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (narrative) clinical notes and structured electronic health record (EHR) data.
Methods: This case-control study included patients aged 10–75 years who were seen between 2007 and 2016 from emergency departments and inpatient units. Cases were first-time suicide attempts from coded diagnosis; controls were randomly selected without suicide attempts regardless of demographics, following a ratio of nine controls per case. Four data-driven ML models were evaluated using 2-year historical EHR data prior to suicide attempt or control index visits, with prediction windows from 7 to 730 days. Patients without any historical notes were excluded. Model evaluation on accuracy and robustness was performed on a blind dataset (30% cohort).
Results: The study cohort included 45 238 patients (5099 cases, 40 139 controls) comprising 54 651 variables from 5.7 million structured records and 798 665 notes. Using both unstructured and structured data resulted in significantly greater accuracy compared to structured data alone (area-under-the-curve [AUC]: 0.932 vs. 0.901, P < .001). The best-predicting model utilized 1726 variables with AUC = 0.932 (95% CI, 0.922–0.941). The model was robust across multiple prediction windows and subgroups by demographics, points of historical most recent clinical contact, and depression diagnosis history.
Conclusions: Our large data-driven approach using both structured and unstructured EHR data demonstrated accurate and robust first-time suicide attempt prediction, and has the potential to be deployed across various populations and clinical settings.


2020 ◽  
Author(s):  
Kevin Maik Jablonka ◽  
Daniele Ongari ◽  
Seyed Mohamad Moosavi ◽  
Berend Smit

Knowledge of the oxidation state of a metal centre in a material is essential to understand its properties. Chemists have developed several theories to predict the oxidation state on the basis of the chemical formula. These methods are quite successful for simple compounds but often fail to describe the oxidation states of more complex systems, such as metal-organic frameworks. In this work, we present a data-driven approach to automatically assign oxidation states, using a machine learning algorithm trained on the assignments by chemists encoded in the chemical names in the Cambridge Crystallographic Database. Our approach only considers the immediate local chemical environment around a metal centre and, in this way, is robust to most of the experimental uncertainties in these structures (like incorrect protonation or unbound solvents). We find such excellent accuracy (> 98%) in our predictions that we can use our method to identify a large number of incorrect assignments in the database. The predictions of our model follow chemical intuition, without explicitly having taught the model those heuristics. This work nicely illustrates how powerful the collective knowledge of chemists actually is. Machine learning can harvest this knowledge and convert it into a useful tool for chemists.



