Critical Benchmarking of the G4(MP2) Model, the Correlation Consistent Composite Approach and Popular Density Functional Approximations on a Probabilistically Pruned Benchmark Dataset of Formation Enthalpies

2020 ◽  
Author(s):  
Sambit Kumar Das ◽  
Sabyasachi Chakraborty ◽  
Raghunathan Ramakrishnan

First-principles calculation of the standard formation enthalpy, $\Delta H_f^0$ (298 K), on the large scale required by chemical space explorations is feasible only with density functional approximations (DFAs) and some composite wave function theories (cWFTs). Alas, the accuracies of popular range-separated hybrid, 'rung-4' DFAs and of the cWFTs that offer the best accuracy-vs.-cost trade-off have so far been established only for datasets predominantly comprising small molecules; hence, their transferability to larger datasets remains unclear. In this study, we present an extended benchmark dataset of over two thousand values of $\Delta H_f^0$ for structurally and electronically diverse molecules. We apply quartile-ranking based on boundary-corrected kernel density estimation to filter outliers and arrive at Probabilistically Pruned Enthalpies of 1908 compounds (PPE1908). For this dataset, we rank the prediction accuracies of G4(MP2), ccCA and 23 popular DFAs using conventional and probabilistic error metrics. We discuss systematic prediction errors and highlight the role that an empirical higher-level correction (HLC) plays in the G4(MP2) model. Furthermore, we comment on the uncertainties associated with the empirical reference data for atoms and on the systematic errors these introduce, which grow with molecular size. We believe these findings will aid in identifying meaningful application domains for quantum thermochemical methods.
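The remark that atomic reference errors grow with molecular size can be made concrete with the commonly used atomization-energy route to $\Delta H_f^0$. The abstract does not state which scheme the authors follow, so the equations below are a sketch under that assumption. Here $n_A$ is the number of atoms of element $A$ in molecule $M$, $\sum D_0(M)$ is the computed atomization energy, the subscript "st" marks the element in its standard state, and $u_A$ is a hypothetical symbol for the uncertainty in the empirical atomic value.

```latex
\begin{align}
\Delta H_f^0(M, 0\,\mathrm{K}) &= \sum_A n_A\, \Delta H_f^0(A, 0\,\mathrm{K}) - \sum D_0(M), \\
\Delta H_f^0(M, 298\,\mathrm{K}) &= \Delta H_f^0(M, 0\,\mathrm{K})
  + \left[ H_M(298\,\mathrm{K}) - H_M(0\,\mathrm{K}) \right]
  - \sum_A n_A \left[ H_A(298\,\mathrm{K}) - H_A(0\,\mathrm{K}) \right]_{\mathrm{st}}, \\
\left| \delta\, \Delta H_f^0(M) \right|_{\text{atomic refs}} &\le \sum_A n_A\, \left| u_A \right| .
\end{align}
```

Because each atomic reference value enters $n_A$ times, a fixed error per atom accumulates linearly with the number of atoms, which is consistent with the size dependence mentioned in the abstract.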


Molecules ◽  
2021 ◽  
Vol 26 (8) ◽  
pp. 2310
Author(s):  
Nathan C. Frey ◽  
Eric Van Dornshuld ◽  
Charles Edwin Webster

The correlation consistent Composite Approach for transition metals (ccCA-TM) and density functional theory (DFT) computations have been applied to investigate the fluxional mechanisms of cyclooctatetraene tricarbonyl chromium ((COT)Cr(CO)3) and 1,3,5,7-tetramethylcyclooctatetraene tricarbonyl chromium, molybdenum, and tungsten ((TMCOT)M(CO)3 (M = Cr, Mo, and W)) complexes. The geometries of (COT)Cr(CO)3 were fully characterized with the PBEPBE, PBE0, B3LYP, and B97-1 functionals with various basis set/ECP combinations, while all investigated (TMCOT)M(CO)3 complexes were fully characterized with the PBEPBE, PBE0, and B3LYP methods. The energetics of the fluxional dynamics of (COT)Cr(CO)3 were examined using ccCA-TM to provide reliable energy benchmarks for the corresponding DFT results. The PBE0/BS1 results (where BS1 = H, C, O: 6-31G(d′); M: mod-LANL2DZ(f)-ECP) are in semiquantitative agreement with the ccCA-TM results. Various transition states were identified for the fluxional processes of (COT)Cr(CO)3. The PBEPBE/BS1 energetics indicate that the 1,2-shift is the lowest-energy fluxional process, while the B3LYP/BS1 energetics indicate that the 1,3-shift has a lower electronic energy of activation than the 1,2-shift by 2.9 kcal mol−1. Notably, PBE0/BS1 describes the (CO)3 rotation as the lowest-energy process, followed by the 1,3-shift. Six transition states have been identified in the fluxional processes of each of the (TMCOT)M(CO)3 complexes (except for (TMCOT)W(CO)3), two of which are 1,2-shift transition states. The lowest-energy fluxional processes of the (TMCOT)M(CO)3 complexes (computed with the PBE0 functional) have ΔG‡ values of 12.6, 12.8, and 13.2 kcal mol−1 for the Cr, Mo, and W complexes, respectively. Good agreement was observed between the experimental and computed 1H-NMR and 13C-NMR chemical shifts for (TMCOT)Cr(CO)3 and (TMCOT)Mo(CO)3 in three different temperature regimes, with coalescence of chemically equivalent groups at higher temperatures.
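To give a feel for what activation free energies of 12.6 to 13.2 kcal mol−1 imply for exchange on the NMR timescale, the sketch below converts them to first-order rate constants with the Eyring equation. The temperature of 298.15 K and the transmission coefficient of 1 are assumptions made here for illustration; the abstract does not state the temperature at which ΔG‡ was evaluated, and the function name is hypothetical.

```python
import math

# Physical constants (exact SI values; R converted to kcal/(mol*K)).
K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
R_KCAL = 1.98720425864e-3   # gas constant, kcal/(mol*K)


def eyring_rate(dg_kcal_mol, temperature_k=298.15):
    """First-order rate constant (1/s) from an activation free energy.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = K_B * temperature_k / H_PLANCK
    return prefactor * math.exp(-dg_kcal_mol / (R_KCAL * temperature_k))


# PBE0 activation free energies quoted in the abstract, kcal/mol.
barriers = {"Cr": 12.6, "Mo": 12.8, "W": 13.2}
rates = {metal: eyring_rate(dg) for metal, dg in barriers.items()}

for metal, k in rates.items():
    print(f"{metal}: k = {k:.3g} s^-1")
```

Under these assumptions all three processes come out on the order of 10^3 s^-1 at room temperature, with the rate falling as the barrier rises from Cr to W.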


2021 ◽  
Author(s):  
Ryan Kingsbury ◽  
Ayush Gupta ◽  
Christopher Bartel ◽  
Jason Munro ◽  
Shyam Dwaraknath ◽  
...  

Computational materials discovery efforts utilize hundreds or thousands of density functional theory (DFT) calculations to predict material properties. Historically, such efforts have performed calculations at the generalized gradient approximation (GGA) level of theory due to its efficient compromise between accuracy and computational reliability. However, high-throughput calculations at the higher metaGGA level of theory are becoming feasible. The Strongly Constrained and Appropriately Normed (SCAN) metaGGA functional offers superior accuracy to GGA across much of chemical space, making it appealing as a general-purpose metaGGA functional, but it suffers from numerical instabilities that impede its use in high-throughput workflows. The recently developed r2SCAN metaGGA functional promises accuracy similar to SCAN in addition to more robust numerical performance. However, its performance compared to SCAN has yet to be evaluated over a large group of solid materials. In this work, we compared r2SCAN and SCAN predictions for key properties of approximately 6,000 solid materials using a newly developed high-throughput computational workflow. We find that r2SCAN predicts formation energies more accurately than SCAN and PBEsol for both strongly and weakly bound materials and that r2SCAN predicts systematically larger lattice constants than SCAN. We also find that r2SCAN requires modestly fewer computational resources than SCAN and offers significantly more reliable convergence. Thus, our large-scale benchmark confirms that r2SCAN has delivered on its promises of numerical efficiency and accuracy, making it a preferred choice for high-throughput metaGGA calculations.


Materials ◽  
2020 ◽  
Vol 13 (19) ◽  
pp. 4221
Author(s):  
Yongxin Jian ◽  
Zhifu Huang ◽  
Yu Wang ◽  
Jiandong Xing

First-principles calculations based on density functional theory (DFT) have been performed to explore the effects of Si, Cr, W, and Nb on the stability, mechanical properties, and electronic structure of the MoAlB ternary boride. Five crystals, with the formulas Mo4Al4B4, Mo4Al3SiB4, Mo3CrAl4B4, Mo3WAl4B4, and Mo3NbAl4B4, were constructed. All the calculated crystals are thermodynamically stable, as indicated by their negative cohesive energies and formation enthalpies. From the calculated elastic constants, the evolution of the mechanical moduli and ductility of MoAlB with elemental doping can be estimated with the aid of the B/G and Poisson's ratios. Si and W doping not only enhance the Young's modulus of MoAlB but also improve its ductility to some degree. At the same time, the elastic moduli of MoAlB are expected to become more isotropic after Si and W addition. In contrast, Cr and Nb doping degrade the mechanical properties. Analysis of the electronic structures and chemical bonding reveals how the bonding evolves with the addition of each dopant. The B-B, Al/Si-B, and Al/Si-Mo bonds are strengthened after Si substitution, and W addition noticeably strengthens the bonding with B and Al. This strengthening of chemical bonding after Si and W doping accounts for the improvement in the mechanical properties of MoAlB. Additionally, Si doping raises the Debye temperature and melting point of the MoAlB crystal. Overall, Si is predicted to be the best dopant for improving the mechanical properties of MoAlB.
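The B/G and Poisson's ratio analysis mentioned above relies on standard isotropic elasticity relations. The sketch below shows them for a bulk modulus B and shear modulus G. The input values are hypothetical placeholders, not results from the paper, and the ductility threshold of 1.75 is the commonly quoted Pugh criterion, which the abstract does not explicitly name.

```python
def poisson_ratio(bulk, shear):
    """Poisson's ratio of an isotropic solid from bulk and shear moduli."""
    return (3 * bulk - 2 * shear) / (2 * (3 * bulk + shear))


def youngs_modulus(bulk, shear):
    """Young's modulus of an isotropic solid, in the same units as the inputs."""
    return 9 * bulk * shear / (3 * bulk + shear)


def is_ductile_pugh(bulk, shear, threshold=1.75):
    """Pugh's empirical criterion: B/G above the threshold suggests ductility."""
    return bulk / shear > threshold


# Hypothetical moduli in GPa, chosen only to exercise the formulas.
B, G = 230.0, 150.0

print(f"B/G = {B / G:.3f}")
print(f"Poisson's ratio = {poisson_ratio(B, G):.4f}")
print(f"Young's modulus = {youngs_modulus(B, G):.2f} GPa")
print("ductile (Pugh)" if is_ductile_pugh(B, G) else "brittle (Pugh)")
```

A dopant that raises B/G toward or past the threshold would, by this criterion, be read as improving ductility, which is the kind of comparison the abstract describes for Si and W.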


2013 ◽  
Vol 321-324 ◽  
pp. 1761-1765 ◽  
Author(s):  
Jian Ying Li ◽  
Jing Zhang ◽  
Qi Zhi Cao ◽  
Yi Fang Ouyang

The elastic constants of FeP with the orthorhombic structure were calculated using the density-functional theory method. The formation enthalpy, electronic density of states, bulk modulus, and lattice parameters of orthorhombic FeP were also calculated. All of the results are in good agreement with the available experimental data and theoretical results. The results indicate that the orthorhombic FeP intermetallic compound is brittle.


2009 ◽  
Vol 1165 ◽  
Author(s):  
Tsuyoshi Maeda ◽  
Satoshi Nakamura ◽  
Takahiro Wada

We have theoretically evaluated the phase stability and electronic structure of Cu2ZnSnSe4 (CZTSe) and Cu2ZnSnS4 (CZTS). The enthalpies of formation for the kesterite, stannite and wurtz-stannite phases of CZTSe and CZTS were calculated using a plane-wave pseudopotential method within the density functional formalism. For CZTSe, the calculated formation enthalpy (ΔH) of the kesterite phase (−312.7 kJ/mol) is slightly lower than that of the stannite phase (−311.3 kJ/mol) and much lower than that of the wurtz-stannite phase (−305.7 kJ/mol). For CZTS, the ΔH of the kesterite phase (−361.9 kJ/mol) is lower than that of the stannite phase (−359.9 kJ/mol) and much lower than that of the wurtz-stannite phase (−354.6 kJ/mol). The difference in ΔH between the kesterite and stannite phases is greater for CZTS than for CZTSe. This indicates that the kesterite phase is stabilized over the stannite phase more strongly in CZTS than in CZTSe. The valence band maxima (VBMs) of both kesterite- and stannite-type CZTSe (CZTS) are antibonding orbitals of Cu 3d and Se 4p (S 3p). The conduction band minima (CBMs) are antibonding orbitals of Sn 5s and Se 4p (S 3p). The Zn atom does not affect the VBM or the CBM in either CZTSe or CZTS. The theoretical band gap of the kesterite phase calculated with sX-LDA in both CZTSe and CZTS is slightly wider than that of the wurtz-stannite phase and much wider than that of the stannite phase.
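The stability comparison above is simple arithmetic on the quoted formation enthalpies. The short sketch below tabulates each phase relative to kesterite, using only the numbers given in the abstract; the variable and function names are illustrative.

```python
# Formation enthalpies in kJ/mol, as quoted in the abstract.
formation_enthalpy = {
    "CZTSe": {"kesterite": -312.7, "stannite": -311.3, "wurtz-stannite": -305.7},
    "CZTS": {"kesterite": -361.9, "stannite": -359.9, "wurtz-stannite": -354.6},
}


def relative_to_kesterite(phases):
    """Enthalpy of each phase minus that of kesterite, rounded to 0.1 kJ/mol."""
    reference = phases["kesterite"]
    return {name: round(value - reference, 1) for name, value in phases.items()}


relative = {
    compound: relative_to_kesterite(phases)
    for compound, phases in formation_enthalpy.items()
}

for compound, phases in relative.items():
    for name, delta in phases.items():
        print(f"{compound:6s} {name:15s} {delta:+.1f} kJ/mol")
```

The stannite phase lies 1.4 kJ/mol above kesterite in CZTSe and 2.0 kJ/mol above it in CZTS, which is the larger gap the abstract refers to.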


2019 ◽  
Author(s):  
Jon Paul Janet ◽  
Chenru Duan ◽  
Tzuhsiung Yang ◽  
Aditya Nandy ◽  
Heather Kulik

Machine learning (ML) models, such as artificial neural networks, have emerged as a complement to high-throughput screening, enabling characterization of new compounds in seconds instead of hours. The promise of ML models to enable large-scale, chemical space exploration can only be realized if it is straightforward to identify when molecules and materials are outside the model’s domain of applicability. Established uncertainty metrics for neural network models are either costly to obtain (e.g., ensemble models) or rely on feature engineering (e.g., feature space distances), and each has limitations in estimating prediction errors for chemical space exploration. We introduce the distance to available data in the latent space of a neural network ML model as a low-cost, quantitative uncertainty metric that works for both inorganic and organic chemistry. The calibrated performance of this approach exceeds widely used uncertainty metrics and is readily applied to models of increasing complexity at no additional cost. Tightening latent distance cutoffs systematically drives down predicted model errors below training errors, thus enabling predictive error control in chemical discovery or identification of useful data points for active learning.
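The idea of a latent-space distance as an uncertainty metric can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the distance measure, the number of neighbors, or the calibration step, so the Euclidean metric, the choice of k, the toy latent vectors, and the cutoff value are all assumptions.

```python
import math


def latent_distance(train_latent, query, k=3):
    """Mean Euclidean distance from `query` to its k nearest training points.

    `train_latent` holds latent vectors of the training data, for example
    activations of the last hidden layer (an assumption of this sketch).
    """
    distances = sorted(math.dist(query, point) for point in train_latent)
    return sum(distances[:k]) / k


# Toy 2D latent vectors standing in for a trained model's latent space.
train_latent = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]

near_query = (0.4, 0.6)   # lies inside the training cloud
far_query = (5.0, 5.0)    # lies far from any training point

cutoff = 1.0  # hypothetical cutoff; in practice it would be tuned on held-out data

for label, query in [("near", near_query), ("far", far_query)]:
    d = latent_distance(train_latent, query)
    status = "trusted" if d <= cutoff else "flagged as out of domain"
    print(f"{label}: distance = {d:.3f}, {status}")
```

Tightening the cutoff keeps only predictions close to the training data, which mirrors the error control described in the abstract; flagged points are natural candidates for active learning.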

