Representation of Molecular Structures with Persistent Homology Leads to the Discovery of Molecular Groups with Enhanced CO2 Binding

Author(s):  
Jacob Townsend ◽  
Cassie Putman Micucci ◽  
John H. Hymel ◽  
Vasileios Maroulas ◽  
Konstantinos Vogiatzis

<p>Developing alternative strategies for efficient separation of CO2 and N2 is of general interest for the reduction of anthropogenic carbon emissions. In recent years, machine learning and high-throughput computational screening have been valuable tools for accelerating the first-principles discovery of the next generation of functionalized molecules and materials. Applying machine learning to chemistry requires converting molecular structures to a machine-readable format known as a molecular representation. The choice of representation affects the performance and outcomes of chemical machine learning methods. Herein, we present a new concise and size-consistent molecular representation derived from persistent homology, an applied branch of mathematics. We have demonstrated its applicability in a high-throughput computational screening of a large molecular database (GDB-9) with more than 133,000 organic molecules. Our target is to identify novel molecules that selectively interact with CO2. The methodology and performance of the novel molecular fingerprinting method are presented, and the new chemically driven persistence image representation is used to screen the GDB-9 database to suggest molecules and/or functional groups with enhanced properties.</p>
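As a rough illustration of the kind of topological summary the abstract refers to, the sketch below computes only the 0-dimensional persistence diagram (connected components) of a small set of atomic coordinates. It is a minimal sketch under stated assumptions, not the authors' method: the function name `h0_persistence` and the toy coordinates are hypothetical, the filtration value of an edge is taken to be the pairwise Euclidean distance, and the persistence images and chemically driven weighting described in the abstract are not reproduced here.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death values of the finite H0 bars of a Vietoris-Rips filtration.

    Every point is born at scale 0. Two components merge (one bar dies)
    when the shortest edge connecting them enters the filtration. This is
    Kruskal's algorithm with a union-find structure.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (the assumed filtration value).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    # n - 1 finite bars; the last surviving component never dies.
    return deaths


# Hypothetical toy input: three collinear atoms spaced 1.16 apart
# (illustrative coordinates only, not taken from GDB-9).
toy_atoms = [(-1.16, 0.0, 0.0), (0.0, 0.0, 0.0), (1.16, 0.0, 0.0)]
diagram = [(0.0, death) for death in h0_persistence(toy_atoms)]
```

A full pipeline would also track higher-dimensional features (loops, voids) and convert the diagram to a fixed-size image; a dedicated persistent-homology library would normally be used for that step.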

2020 ◽  
Author(s):  
Jacob Townsend ◽  
Cassie Putman Micucci ◽  
John H. Hymel ◽  
Vasileios Maroulas ◽  
Konstantinos Vogiatzis


2019 ◽  
Author(s):  
Jacob Townsend ◽  
Cassie Putman Micucci ◽  
John H. Hymel ◽  
Vasileios Maroulas ◽  
Konstantinos Vogiatzis


2014 ◽  
Vol 43 (16) ◽  
pp. 5735-5749 ◽  
Author(s):  
Yamil J. Colón ◽  
Randall Q. Snurr

High-throughput computational screening of MOFs allows identification of promising candidates, new structure–property relationships, and performance limits.


Nanomaterials ◽  
2022 ◽  
Vol 12 (1) ◽  
pp. 159
Author(s):  
Lifeng Li ◽  
Zenan Shi ◽  
Hong Liang ◽  
Jie Liu ◽  
Zhiwei Qiao

Atmospheric water harvesting by strong adsorbents is a feasible way to address water shortages, especially in arid regions. In this study, machine learning (ML)-assisted high-throughput computational screening is employed to calculate the capture of H2O from N2 and O2 for 6013 computation-ready, experimental metal-organic frameworks (CoRE-MOFs) and 137,953 hypothetical MOFs (hMOFs). Univariate analysis of MOF structure-performance relationships shows Qst to be a key descriptor. Moreover, three ML algorithms (random forest, gradient boosted regression trees, and neighbor component analysis (NCA)) are applied to capture the complicated interrelation between six descriptors and performance. After optimization by grid search and five-fold cross-validation, all three ML algorithms can effectively build a predictive model for CoRE-MOFs, and the accuracy R2 of NCA reaches 0.97. In addition, the relative importance of the descriptors obtained by ML shows quantitatively that Qst is dominant in governing the capture of H2O. Furthermore, the NCA model trained on the 6013 CoRE-MOFs can predict the selectivity of hMOFs with an R2 of 0.86, which makes it more transferable than the other models. Finally, 10 CoRE-MOFs and 10 hMOFs with high performance are identified. The computational screening and ML predictions could provide guidance and inspiration for the development of materials for atmospheric water harvesting.
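The tuning procedure mentioned above, grid search combined with five-fold cross-validation and scored by R2, can be sketched in a few lines. This is a minimal, dependency-free sketch under stated assumptions: a one-parameter ridge model stands in for the random forest, gradient boosted trees, and NCA models named in the abstract, the data are synthetic, and all function names (`r2_score`, `k_fold_indices`, `fit_ridge_1d`, `grid_search_cv`) are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_ridge_1d(x, y, lam):
    """Closed-form slope of y ~ w * x with an L2 penalty lam on w."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + lam)


def grid_search_cv(x, y, lams, k=5):
    """Return (best_lam, mean R2 over the k folds) across the grid."""
    best = None
    for lam in lams:
        scores = []
        for train, test in k_fold_indices(len(x), k):
            w = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], lam)
            preds = [w * x[i] for i in test]
            scores.append(r2_score([y[i] for i in test], preds))
        mean_score = sum(scores) / len(scores)
        if best is None or mean_score > best[1]:
            best = (lam, mean_score)
    return best


# Synthetic stand-in data (not from the paper): y = 2x plus a small
# alternating perturbation.
xs = [i / 10 for i in range(1, 21)]
ys = [2 * x + 0.05 * (-1) ** i for i, x in enumerate(xs, start=1)]
grid = [0.0, 0.1, 1.0, 10.0]
best_lam, best_r2 = grid_search_cv(xs, ys, grid)
```

In practice the folds would usually be shuffled before splitting, and the final model would be refit on the full training set with the selected hyperparameter.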


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jacob Townsend ◽  
Cassie Putman Micucci ◽  
John H. Hymel ◽  
Vasileios Maroulas ◽  
Konstantinos D. Vogiatzis

Processes ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 165
Author(s):  
Hao Qin ◽  
Zihao Wang ◽  
Zhen Song ◽  
Xiang Zhang ◽  
Teng Zhou

The separation of 1,3-butadiene (1,3-C4H6) and 1-butene (n-C4H8) is challenging because of their close boiling points and similar molecular structures. Extractive distillation (ED) is widely regarded as a promising approach for this separation task. For ED processes, the selection of a suitable entrainer is of central importance. Traditional ED processes using organic solvents suffer from high energy consumption. To tackle this issue, ionic liquids (ILs) can serve as a potential alternative. In this work, a high-throughput computational screening of ILs is performed to find suitable entrainers, covering 36,260 IL candidates formed from 370 cations and 98 anions. COSMO-RS is employed to calculate the infinite dilution extractive capacity and selectivity of the 36,260 ILs. The ILs that satisfy the prespecified thermodynamic criteria and physical property constraints are thereby identified. After the screening, the resulting IL candidates are passed on to rigorous process simulation and design. 1,2,3,4,5-pentamethylimidazolium methylcarbonate is found to be the optimal IL solvent. Compared with the benchmark ED process, which uses the organic solvent N-methyl-2-pyrrolidone, energy consumption is reduced by 26%. As a result, this work offers a new IL-based ED process for efficient 1,3-C4H6 production.
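The candidate count follows from pairing every cation with every anion: 370 × 98 = 36,260. The sketch below enumerates that space and applies a generic threshold filter. It is only a hedged illustration: the ion labels are placeholders, and `screen`, `evaluate`, `min_capacity`, and `min_selectivity` are hypothetical names, with `evaluate` standing in for whatever property calculation (COSMO-RS in the abstract) a real workflow would call.

```python
from itertools import product

# Placeholder labels; the real cation and anion identities are not given here.
cations = [f"cation_{i}" for i in range(370)]
anions = [f"anion_{j}" for j in range(98)]

# Every cation paired with every anion.
candidates = list(product(cations, anions))


def screen(pairs, evaluate, min_capacity, min_selectivity):
    """Keep pairs whose predicted properties meet both thresholds.

    `evaluate` is a user-supplied callable (hypothetical here) returning a
    (capacity, selectivity) tuple for a (cation, anion) pair.
    """
    kept = []
    for pair in pairs:
        capacity, selectivity = evaluate(pair)
        if capacity >= min_capacity and selectivity >= min_selectivity:
            kept.append(pair)
    return kept
```

Separating the enumeration from the property model keeps the filter reusable when the thermodynamic criteria or the property calculation change.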


2020 ◽  
Vol 5 (4) ◽  
pp. 725-742 ◽  
Author(s):  
Zenan Shi ◽  
Wenyuan Yang ◽  
Xiaomei Deng ◽  
Chengzhi Cai ◽  
Yaling Yan ◽  
...  

The combination of machine learning and high-throughput computation for the screening of MOFs with high performance.

