Chemical hardness-driven interpretable machine learning approach for rapid search of photocatalysts

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Ritesh Kumar ◽  
Abhishek K. Singh

Abstract: Strategies combining high-throughput (HT) and machine learning (ML) to accelerate the discovery of promising new materials have garnered immense attention in recent years. Such studies rarely yield new guiding principles, essentially because of the ‘black-box’ nature of the ML models. Therefore, we devised an intuitive method of interpreting such opaque ML models through SHapley Additive exPlanations (SHAP) values and coupled it with the HT approach to find efficient 2D water-splitting photocatalysts. We developed a new database of 3099 2D materials consisting of metals connected to six ligands in an octahedral geometry, termed the 2DO (octahedral 2D materials) database. The ML models were constructed using a combination of composition and chemical hardness-based features to gain insights into the thermodynamic and overall stabilities. Most importantly, the approach distinguished the target properties of isocompositional 2DO materials differing in bond connectivities by combining the advantages of both elemental and structural features. The interpretable ML regression, classification, and data analysis led to a new hypothesis that the highly stable 2DO materials follow the hard and soft acids and bases (HSAB) principle. The most stable 2DO materials were further screened based on suitable band gaps within the visible region and band alignments with respect to standard redox potentials using the GW method, resulting in 21 potential candidates. Moreover, HfSe2 and ZrSe2 were found to have high solar-to-hydrogen efficiencies reaching their theoretical limits. The proposed methodology will enable materials scientists and engineers to formulate predictive models that are accurate, physically interpretable, transferable, and computationally tractable.
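
As an illustration of the additive attribution idea behind SHAP values mentioned in the abstract above, the following minimal sketch computes exact Shapley values for a tiny model by enumerating feature subsets. It is not the authors' pipeline: `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical, and "absent" features are simply replaced by their baseline value, which is only one of several conventions. The SHAP library itself relies on faster, model-specific or approximate estimators.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(x):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] * x[2]


def make_value_fn(model, x, baseline):
    """Evaluate the model with features outside the coalition set to baseline."""
    def value_fn(subset):
        mixed = [x[j] if j in subset else baseline[j] for j in range(len(x))]
        return model(mixed)
    return value_fn


x = [1.0, 2.0, 3.0]          # hypothetical sample to explain
baseline = [0.0, 0.0, 0.0]   # hypothetical reference input
phi = shapley_values(make_value_fn(toy_model, x, baseline), len(x))
print(phi)
```

In this toy case the attributions sum to the difference between the model output at `x` and at `baseline`, and the interaction term is split equally between the two features that take part in it. This additivity is the property that makes such values usable for ranking feature contributions.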

2018 ◽  
Vol 11 (1) ◽  
pp. 34
Author(s):  
Alfan Farizki Wicaksono ◽  
Sharon Raissa Herdiyana ◽  
Mirna Adriani

A person's understanding of and stance on a controversial topic can be influenced by the news or articles they consume every day. Unfortunately, readers usually do not realize that they are reading controversial articles. In this paper, we address the problem of automatically detecting controversial articles in citizen journalism media. To solve the problem, we employ a supervised machine learning approach with several hand-crafted features that exploit linguistic information, the metadata of an article, structural information in the commentary section, and the sentiment expressed in the body of an article. The experimental results show that our proposed method performs the addressed task effectively. The best performance so far is achieved when we use all proposed features with Logistic Regression as our model (82.89% accuracy). Moreover, we found that information from the commentary section (structural features) contributes most to the classification task.


2020 ◽  
Author(s):  
Qi Yang ◽  
Yao Li ◽  
Jin-Dong Yang ◽  
Yidi Liu ◽  
Long Zhang ◽  
...  

The acid dissociation constant p<i>K</i><sub>a</sub> dictates a molecule’s ionic status and is a critical physicochemical property in rationalizing acid-base chemistry in solution and in many biological contexts. Although numerous theoretical approaches have been developed for predicting aqueous p<i>K</i><sub>a</sub>, fast and accurate prediction of non-aqueous p<i>K</i><sub>a</sub>s has remained a major challenge. On the basis of the <i>i</i>BonD experimental p<i>K</i><sub>a</sub> database curated across 39 solvents, a holistic p<i>K</i><sub>a</sub> prediction model was established using a machine learning approach. Structural and physical organic parameters combined descriptors (SPOC) were introduced to represent the electronic and structural features of molecules. With SPOC and ionic status labelling (ISL), the holistic models trained with a neural network or the XGBoost algorithm showed the best prediction performance, with a mean absolute error (MAE) as low as 0.87 p<i>K</i><sub>a</sub> units. The holistic model performed better than all the tested single-solvent models (SSMs), verifying its transfer learning features. The capability of prediction in diverse solvents allows for a comprehensive mapping of all the possible p<i>K</i><sub>a</sub> correlations between different solvents. The <i>i</i>BonD holistic model was validated with high accuracy by predicting the aqueous p<i>K</i><sub>a</sub> and micro-p<i>K</i><sub>a</sub> of pharmaceutical molecules and the p<i>K</i><sub>a</sub>s of organocatalysts in DMSO and MeCN. An online prediction platform (<a href="http://pka.luoszgroup.com/">http://pka.luoszgroup.com</a>) was constructed based on the current model.

