Computational Methods for Training Set Selection and Error Assessment Applied to Catalyst Design: Guidelines for Deciding Which Reactions to Run First and Which to Run Next

Author(s):  
Scott Denmark ◽  
Andrew Zahrt ◽  
William Darrow ◽  
Brennan Rose ◽  
Jeremy Henle

The application of machine learning (ML) to problems in homogeneous catalysis has emerged as a promising avenue for catalyst optimization. An important aspect of such optimization campaigns is determining which reactions to run at the outset of experimentation and which future predictions are the most reliable. Herein, we explore methods for these two tasks in the context of our previously developed chemoinformatics workflow. First, different methods for training set selection are compared, including algorithmic selection and selection informed by unsupervised learning methods. Next, an array of metrics for assessing prediction confidence is examined in multiple catalyst manifolds. These approaches will inform future computer-guided studies to accelerate catalyst selection and reaction optimization. Finally, by applying the Average Steric Occupancy (ASO) and Average Electronic Indicator Field (AEIF) descriptors to transition metal catalysts for the first time, this work demonstrates their generality.
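As an illustration of what "algorithmic selection" of a training set can look like, the following is a minimal sketch of greedy max-min (farthest-point) selection. The abstract does not name the algorithms that were compared, so this technique is a stand-in chosen for illustration, not the authors' method. The function name `maxmin_select` and the toy 2D descriptors are hypothetical; real ASO/AEIF descriptors would be much higher-dimensional.

```python
import math


def maxmin_select(points, k, start=0):
    """Greedy max-min (farthest-point) selection of k diverse points.

    points: list of descriptor vectors (hypothetical catalyst features).
    Returns the indices of the selected points, in selection order.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    chosen = {start}
    # Distance from every point to its nearest already-selected point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # Among unselected points, pick the one farthest from the chosen set.
        candidates = [i for i in range(len(points)) if i not in chosen]
        nxt = max(candidates, key=lambda i: nearest[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return selected


# Toy 2D descriptors (assumed data): two tight pairs and one outlier.
library = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
picked = maxmin_select(library, 3)
print(picked)  # -> [0, 4, 2]
```

The selection skips near-duplicates such as the second point, which is the general motivation for diversity-based training sets: each chosen reaction covers a region of descriptor space that the earlier choices did not.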

2020 ◽  
Author(s):  
Scott Denmark ◽  
Andrew Zahrt ◽  
William Darrow ◽  
Brennan Rose ◽  
Jeremy Henle


Author(s):  
Andrew F. Zahrt ◽  
Brennan T. Rose ◽  
William T. Darrow ◽  
Jeremy J. Henle ◽  
Scott E. Denmark

Different subset selection methods are examined to guide catalyst selection in optimization campaigns. Error assessment methods are used to quantitatively inform selection of new catalyst candidates from in silico libraries of catalyst structures.


2020 ◽  
Author(s):  
Xin Yi See ◽  
Benjamin Reiner ◽  
Xuelan Wen ◽  
T. Alexander Wheeler ◽  
Channing Klein ◽  
...  

Herein, we describe the use of iterative supervised principal component analysis (ISPCA) in de novo catalyst design. The regioselective synthesis of 2,5-dimethyl-1,3,4-triphenyl-1H-pyrrole (C) via Ti-catalyzed formal [2+2+1] cycloaddition of phenyl propyne and azobenzene was targeted as a proof of principle. The initial reaction conditions led to an unselective mixture of all possible pyrrole regioisomers. ISPCA was conducted on a training set of catalysts, and their performance was regressed against the scores from the top three principal components. Component loadings from this PCA space along with k-means clustering were used to inform the design of new test catalysts. The selectivity of a prospective test set was predicted in silico using the ISPCA model, and only optimal candidates were synthesized and tested experimentally. This data-driven predictive-modeling workflow was iterated, and after only three generations the catalytic selectivity was improved from 0.5 (statistical mixture of products) to over 11 (>90% C) by incorporating 2,6-dimethyl-4-(pyrrolidin-1-yl)pyridine as a ligand. The successful development of a highly selective catalyst without resorting to long, stochastic screening processes demonstrates the inherent power of ISPCA in de novo catalyst design and should motivate the general use of ISPCA in reaction development.
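The selectivity figures quoted above can be related to product percentages with a short calculation. The sketch below assumes that "selectivity" means the ratio of C to all other pyrrole regioisomers combined; the abstract does not define it, but this reading is consistent with both quoted values. The helper name `fraction_from_ratio` is hypothetical.

```python
def fraction_from_ratio(ratio):
    """Convert a selectivity ratio (C : all other products) to the fraction of C.

    Assumes ratio = [C] / [everything else], which is an interpretation of the
    abstract rather than a definition it states.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


initial = fraction_from_ratio(0.5)   # one part C to two parts other isomers
final = fraction_from_ratio(11.0)    # eleven parts C to one part other isomers
print(f"{initial:.1%} -> {final:.1%}")  # -> 33.3% -> 91.7%
```

Under this assumption, a ratio of 0.5 corresponds to one third of the product being C, and a ratio of 11 corresponds to about 92% C, in line with the ">90% C" stated in the abstract.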


2019 ◽  
Author(s):  
Seoin Back ◽  
Kevin Tran ◽  
Zachary Ulissi

Developing active and stable oxygen evolution catalysts is key to enabling various future energy technologies, and the state-of-the-art catalysts are Ir-containing oxide materials. Understanding oxygen chemistry on oxide materials is significantly more complicated than studying transition metal catalysts for two reasons: the most stable surface coverage under reaction conditions is extremely important but difficult to understand without many detailed calculations, and there are many possible active sites and configurations on O* or OH* covered surfaces. We have developed an automated and high-throughput approach to solve this problem and predict OER overpotentials for arbitrary oxide surfaces. We demonstrate this for a number of previously unstudied IrO2 and IrO3 polymorphs and their facets. We discovered that low-index surfaces of IrO2 other than rutile (110) are more active than the most stable rutile (110), and we identified promising active sites of IrO2 and IrO3 that outperform rutile (110) by 0.2 V in theoretical overpotential. Based on findings from DFT calculations, we provide catalyst design strategies to improve the catalytic activity of Ir-based catalysts and demonstrate a machine learning model capable of predicting surface coverages and site activity. This work highlights the importance of investigating unexplored chemical space to design promising catalysts.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Joaquin Caro-Astorga ◽  
Kenneth T. Walker ◽  
Natalia Herrera ◽  
Koon-Yang Lee ◽  
Tom Ellis

Engineered living materials (ELMs) based on bacterial cellulose (BC) offer a promising avenue for cheap-to-produce materials that can be programmed with genetically encoded functionalities. Here we explore how ELMs can be fabricated in a modular fashion from millimetre-scale biofilm spheroids grown from shaking cultures of Komagataeibacter rhaeticus. We define a reproducible protocol to produce BC spheroids with the high-yield bacterial cellulose producer K. rhaeticus and demonstrate for the first time their potential as building blocks to grow ELMs in 3D shapes. Using genetically engineered K. rhaeticus, we produce functionalized BC spheroids and use these to make and grow patterned BC-based ELMs that signal within a material and can sense and report on chemical inputs. We also investigate the use of BC spheroids as a method to regenerate damaged BC materials and as a way to fuse together smaller sections of cellulose and synthetic materials into a larger piece. This work improves our understanding of BC spheroid formation and showcases their great potential for fabricating, patterning and repairing ELMs based on bacterial cellulose.


1995 ◽  
Vol 3 (4) ◽  
pp. 279-292 ◽  
Author(s):  
I. T. Cousins ◽  
M. T. D. Cronin ◽  
J. C. Dearden ◽  
C. D. Watts

Author(s):  
Tomasz Kajdanowicz ◽  
Slawomir Plamowski ◽  
Przemyslaw Kazienko

Choosing a proper training set for machine learning tasks is of great importance in complex domain problems. In this paper, a new distance measure for training set selection is presented and thoroughly discussed. The distance between two datasets is computed using the variance of entropy in groups obtained after clustering. The approach is validated using real domain datasets from a debt portfolio valuation process. Finally, prediction performance is examined.

