Quantitative Activity Structure Relationship (QSAR) of a Series of Azetidinones Derived from Dapsone by the Method of Density Functional Theory (DFT)

This QSAR study of a series of azetidinones derived from 4,4'-diaminodiphenylsulfone (dapsone) yielded two models relating molecular descriptors to antibacterial activity against Escherichia coli and Staphylococcus aureus. The molecular descriptors were obtained by quantum chemical calculations at the B3LYP/6-31G(d) level. The statistical indicators of the first model, for activity against Escherichia coli, are: coefficient of determination R2 = 0.992, standard deviation S = 0.342, Fisher coefficient F = 185.088, and cross-validation coefficient Q2CV = 0.992. Those of the second model, for activity against Staphylococcus aureus, are: R2 = 0.987, S = 0.193, F = 114.955, and Q2CV = 0.987. Both models therefore perform well statistically. The quantum descriptors dipole moment (μ), global softness (σ), and electronegativity (χ) are responsible for the antibacterial activity of the azetidinones derived from dapsone, and the dipole moment is the priority descriptor for predicting the antibacterial activity of the studied compounds. The acceptance criteria of Eriksson et al. are satisfied for the test set, and the ratios of theoretical to experimental activity (dtheo/dexp) for the test set tend towards unity.
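The indicators quoted in this abstract (R2, S, F, and a cross-validated Q2) can be illustrated with a small multiple linear regression. The following is a minimal sketch, not the authors' procedure: the abstract does not give the Q2CV formula, so a common leave-one-out definition (1 - PRESS/TSS) is assumed, and the function names and demo numbers are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(k)] for i in range(k)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(k)]
    return solve(A, b)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def qsar_stats(X, y):
    """Return R2, S, F and leave-one-out Q2 for a linear model y ~ X."""
    n, p = len(y), len(X[0])
    coef = fit(X, y)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)
    rss = sum((yi - predict(coef, xi)) ** 2 for xi, yi in zip(X, y))
    r2 = 1 - rss / tss
    s = (rss / (n - p - 1)) ** 0.5            # residual standard deviation
    f = (r2 / p) / ((1 - r2) / (n - p - 1))   # Fisher F statistic
    press = 0.0
    for i in range(n):                        # refit without compound i
        c = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(c, X[i])) ** 2
    return {"R2": r2, "S": s, "F": f, "Q2cv": 1 - press / tss}


# Hypothetical descriptors and activities, invented for illustration only.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
y_demo = [1.1, 3.9, 3.2, 5.8, 5.1, 8.0]
print(qsar_stats(X_demo, y_demo))
```

Under this leave-one-out definition Q2 can never exceed R2, so a Q2CV reported as equal to R2 suggests that a different formula was used in the study.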

An Accurate and Efficient Method to Predict Y-NO Bond Homolysis Bond Dissociation Energies

Mathematical Problems in Engineering ◽ 10.1155/2013/860357 ◽ 2013 ◽ Vol 2013 ◽ pp. 1-10 ◽ Cited By ~ 4

Author(s): Hong Zhi Li ◽ Lin Li ◽ Zi Yan Zhong ◽ Yi Han ◽ LiHong Hu ◽ ...

Keyword(s): DFT Calculations ◽ Density Functional ◽ Molecular Descriptors ◽ Basis Set ◽ Bond Dissociation Energies ◽ Data Set ◽ Test Set ◽ Bond Dissociation ◽ Dissociation Energies ◽ Bond Homolysis

The paper proposes a new method that combines the Kennard and Stone algorithm (Kenstone, KS), hierarchical clustering (HC), and an ant colony optimization (ACO)-based extreme learning machine (ELM) (KS-HC/ACO-ELM) with the density functional theory (DFT) B3LYP/6-31G(d) method to improve the accuracy of DFT calculations of Y-NO homolysis bond dissociation energies (BDE). In this method, Kenstone divides the whole data set into a training set and a test set; HC and ACO perform the cluster analysis on the molecular descriptors; correlation analysis selects the most correlated molecular descriptors in each class; and ELM is the nonlinear model that relates the DFT calculations to the experimental homolysis BDE values. The results show that the standard deviation of the homolysis BDE in the molecular test set is reduced from 4.03 kcal mol−1, as calculated by the DFT B3LYP/6-31G(d) method, to 0.30, 0.28, 0.29, and 0.32 kcal mol−1 by the KS-ELM, KS-HC-ELM, and KS-ACO-ELM methods and the artificial neural network (ANN) combined with KS-HC, respectively. The method predicts accurate values far more efficiently than DFT calculations with a larger basis set and may achieve similarly accurate results for larger molecules.
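The Kennard and Stone step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard algorithm, assuming Euclidean distance on raw descriptor vectors; the function name, the distance choice, and the toy points are assumptions, not details taken from the paper.

```python
import math


def kennard_stone(points, n_train):
    """Split sample indices into (train, test) with the Kennard-Stone rule.

    The two most distant samples seed the training set; each further
    sample is the one whose nearest selected neighbour is farthest away.
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    dist = [[math.dist(a, b) for b in points] for a in points]
    # Seed with the pair of samples separated by the largest distance.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [i for i in range(n) if i not in (i0, j0)]
    while len(selected) < n_train:
        # Max-min criterion: pick the sample least covered so far.
        nxt = max(remaining, key=lambda r: min(dist[r][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy one-descriptor data, invented for illustration.
train_idx, test_idx = kennard_stone([(0,), (10,), (5,), (1,), (9,)], n_train=3)
print(train_idx, test_idx)
```

Because selection is driven by distance alone, the training set spans the descriptor space and the test set falls inside it, which is why this split is often preferred over random sampling for small chemical data sets.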

Physicochemical and Structural Parameters Contributing to the Antibacterial Activity and Efflux Susceptibility of Small Molecule Inhibitors of Escherichia coli

Antimicrobial Agents and Chemotherapy ◽ 10.1128/aac.01925-20 ◽ 2021

Author(s): Sara S. El Zahed ◽ Shawn French ◽ Maya A. Farha ◽ Garima Kumar ◽ Eric D. Brown

Keyword(s): Machine Learning ◽ Escherichia Coli ◽ Antibacterial Activity ◽ Small Molecules ◽ Small Molecule ◽ In Silico ◽ Molecular Descriptors ◽ Structural Parameters ◽ Side Chain ◽ Gram Negative

Discovering new Gram-negative antibiotics has been a challenge for decades. This has been largely attributed to a limited understanding of the molecular descriptors governing Gram-negative permeation and efflux evasion. Herein, we address the contribution of efflux using a novel approach that applies multivariate analysis, machine learning, and structure-based clustering to some 4,500 actives from a small molecule screen in efflux-compromised Escherichia coli. We employed principal-component analysis and trained two decision tree-based machine learning models to investigate descriptors contributing to the antibacterial activity and efflux susceptibility of these actives. This approach revealed that the Gram-negative activity of hydrophobic and planar small molecules with low molecular stability is limited to efflux-compromised E. coli. Further, molecules with reduced branching and compactness showed increased susceptibility to efflux. Given these distinct properties that govern efflux, we developed the first machine learning model, called Susceptibility to Efflux Random Forest (SERF), as a tool to analyze the molecular descriptors of small molecules and predict those that could be susceptible to efflux pumps in silico. Here, SERF demonstrated high accuracy in identifying such molecules. Further, we clustered all 4,500 actives based on their core structures and identified distinct clusters highlighting side chain moieties that cause marked changes in efflux susceptibility. In all, our work reveals a role for physicochemical and structural parameters in governing efflux, presents a machine learning tool for rapid in silico analysis of efflux susceptibility, and provides a proof of principle for the potential of exploiting side chain modification to design novel antimicrobials evading efflux pumps.

Toward a Reliable Evaluation of Forecasting Systems for Plant Diseases: A Case Study Using Fusarium Head Blight of Wheat

Plant Disease ◽ 10.1094/pdis-08-11-0665 ◽ 2012 ◽ Vol 96 (6) ◽ pp. 889-896 ◽ Cited By ~ 18

Author(s): S. Landschoot ◽ W. Waegeman ◽ K. Audenaert ◽ J. Vandepitte ◽ G. Haesaert ◽ ...

Keyword(s): Fusarium Head Blight ◽ Cross Validation ◽ Mean Squared Error ◽ Plant Diseases ◽ Coefficient Of Determination ◽ Linear Regression Models ◽ Head Blight ◽ Evaluation Strategies ◽ The Cross ◽ Validation Strategy

Despite great efforts to forecast plant diseases, many of the existing systems often fall short in providing farmers with accurate predictions. One of the main problems arises from the existence of year and location effects, so that more advanced procedures are required for evaluating existing systems in an unbiased manner. This paper illustrates the case of Fusarium head blight of winter wheat in Belgium. We present a new cross-validation strategy that enables the evaluation of the predictive performance of a forecasting system for years and locations that are different from the years and locations on which the forecast was developed. Four different cross-validation strategies and five regression techniques are used. The results demonstrated that traditional evaluation strategies are too optimistic in their predictions, whereas the cross-year cross-location validation strategy yielded more realistic outcomes. Using this procedure, the mean squared error increased and the coefficient of determination decreased in predicting disease severity and deoxynivalenol content, suggesting that existing evaluation strategies may generate a substantial optimistic bias. The strongest discrepancies between the cross-validation strategies were observed for multiple linear regression models.
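One way to read the cross-year cross-location strategy is sketched below: each test fold holds the records of one (year, location) pair, and the matching training fold excludes every record from that year and every record from that location. The abstract does not specify the fold construction, so this layout, the function name, and the record keys `year` and `location` are assumptions.

```python
def cross_year_location_splits(records):
    """Yield (year, location, train_idx, test_idx) for each held-out pair.

    Training indices share neither the year nor the location of the
    test fold, so year and location effects cannot leak into the fit.
    """
    pairs = sorted({(r["year"], r["location"]) for r in records})
    for year, loc in pairs:
        test = [i for i, r in enumerate(records)
                if r["year"] == year and r["location"] == loc]
        train = [i for i, r in enumerate(records)
                 if r["year"] != year and r["location"] != loc]
        if train:  # skip folds with nothing left to train on
            yield year, loc, train, test


# Hypothetical field records, invented for illustration.
demo = [
    {"year": 2007, "location": "A", "severity": 0.2},
    {"year": 2007, "location": "B", "severity": 0.4},
    {"year": 2008, "location": "A", "severity": 0.1},
    {"year": 2008, "location": "B", "severity": 0.5},
]
for year, loc, train, test in cross_year_location_splits(demo):
    print(year, loc, "train:", train, "test:", test)
```

Ordinary k-fold splitting would let records from the test year or test location into the training set, which is the source of the optimistic bias the abstract describes.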

Quantitative Structure-Activity Study against Plasmodium falciparum of a Series of Derivatives of Azetidine-2-Carbonitriles by the Method of Density Functional Theory

Mediterranean Journal of Chemistry ◽ 10.13171/mjc02103241572mgrk ◽ 2021 ◽ Vol 11 (2) ◽ pp. 162

Author(s): Mamadou Guy-Richard Koné ◽ Jean Stéphane N’dri ◽ Georges Stéphane Dembélé

Keyword(s): Plasmodium Falciparum ◽ Standard Deviation ◽ Partition Coefficient ◽ Correlation Coefficient ◽ Cross Validation ◽ Molecular Descriptors ◽ Regression Coefficient ◽ Log P ◽ Quantitative Structure ◽ Structure Activity

This work presents a Quantitative Structure-Activity study against Plasmodium falciparum of a series of azetidine-2-carbonitrile derivatives. Using the MLR and MNLR methods in Excel and XLSTAT, we developed two QSAR models relating molecular descriptors to antiplasmodial activity. The molecular descriptors were determined at the B3LYP/6-311G(d,p) level. The statistical indicators of the first model, obtained by the MLR method, are: regression coefficient R2 = 0.939, standard deviation S = 0.266, Fisher coefficient F = 82.064, and cross-validation correlation coefficient of 0.935. Those of the second model, developed by the MNLR method, are: R2 = 0.953, S = 0.258, F = 108.957, and cross-validation correlation coefficient of 0.951. Both models therefore show good statistical performance. The energy of the highest occupied molecular orbital (EHOMO), the dipole moment (µD), and the partition coefficient (log P) are the molecular descriptors responsible for the activity of the azetidine-2-carbonitrile derivatives against Plasmodium falciparum, and the partition coefficient is the primary descriptor for predicting the biological activity of the studied compounds. The acceptance criteria of Eriksson et al. and the external validation criteria of Tropsha, applied to the test set, are both satisfied.

Predictive Modeling of Breast Anticancer Activity of a Series of Coumarin Derivatives using Quantum Descriptors

Chemical Science International Journal ◽ 10.9734/csji/2019/v26i430098 ◽ 2019 ◽ pp. 1-10 ◽ Cited By ~ 1

Author(s): Lamoussa Ouattara ◽ Kafoumba Bamba ◽ Mamadou Guy-Richard Koné ◽ Jean Stéphane N’dri ◽ Kouakou Nobel N’Guessan ◽ ...

Keyword(s): Correlation Coefficient ◽ Density Functional ◽ Cross Validation ◽ Molecular Descriptors ◽ QSAR Model ◽ Coumarin Derivatives ◽ Functional Theory ◽ Anti Cancer ◽ Statistical Indicators ◽ MCF-7

This work focuses on a series of coumarin derivatives. Density Functional Theory (DFT) at the B3LYP/6-31G(d,p) level was used to identify the molecular descriptors relevant to the study. Analysis of the statistical indicators yielded a QSAR model, based on quantum descriptors and anti-cancer activity against breast cancer (MCF-7), with good statistical performance. For the model, the statistical indicators were: correlation coefficient R2 = 0.904, standard deviation S = 0.102, Fisher test coefficient F = 18.779 and correlation coefficient of cross validation

Critical Evaluation of NIR and ATR-IR Spectroscopic Quantifications of Rosmarinic Acid in Rosmarini folium Supported by Quantum Chemical Calculations

Planta Medica ◽ 10.1055/s-0043-107032 ◽ 2017 ◽ Vol 83 (12/13) ◽ pp. 1076-1084 ◽ Cited By ~ 8

Author(s): Christian Kirchler ◽ Cornelia Pezzei ◽ Krzysztof Beć ◽ Raphael Henn ◽ Mika Ishigaki ◽ ...

Keyword(s): Standard Error ◽ Rosmarinic Acid ◽ Near Infrared ◽ Cross Validation ◽ Total Reflection ◽ Attenuated Total Reflection ◽ Coefficient Of Determination ◽ Least Squares Regression ◽ Test Set ◽ Wavenumber Region

The present study evaluates the analytical performance of near infrared and attenuated total reflection infrared spectroscopy for determining the rosmarinic acid content of Rosmarini folium. To this end, the recorded near infrared and attenuated total reflection infrared spectra of 42 milled Rosmarini folium samples were correlated with reference data (range: 1.138–2.199 rosmarinic acid %) obtained by HPLC analysis. Partial least squares regression models were established as a quantitative multivariate data analysis tool. Evaluation via full cross-validation and test set validation resulted in comparable performances for both techniques: near infrared [coefficient of determination: 0.90 (test set validation); standard error of cross-validation: 0.060 rosmarinic acid %; standard error of prediction: 0.058 rosmarinic acid %] and attenuated total reflection infrared [coefficient of determination: 0.91 (test set validation); standard error of cross-validation: 0.063 rosmarinic acid %; standard error of prediction: 0.060 rosmarinic acid %]. Furthermore, quantum chemical calculations were applied to obtain a theoretical infrared spectrum of rosmarinic acid. Good agreement with the spectrum of pure rosmarinic acid was achieved in the lower wavenumber region, whereas agreement was poorer in the higher wavenumber region. Knowledge of the vibrational modes of rosmarinic acid was used to interpret the high values in the regression coefficient plots of the established partial least squares regression models.

The cross-validation method for the evaluation of the adequacy, complexity and generality of timing theories

PsycEXTRA Dataset ◽ 10.1037/e604022013-035 ◽ 2003

Author(s): Paulo Guilhardi

Keyword(s): Cross Validation ◽ Validation Method ◽ The Cross

Using Connectionist Modules for Decision Support

Methods of Information in Medicine ◽ 10.1055/s-0038-1634790 ◽ 1990 ◽ Vol 29 (03) ◽ pp. 167-181 ◽ Cited By ~ 6

Author(s): G. Hripcsak

Keyword(s): Decision Support ◽ Standard Deviation ◽ Confidence Interval ◽ Posterior Probability ◽ Back Propagation ◽ Connectionist Model ◽ Test Set ◽ The Third ◽ Independent Test ◽ Better Than

A connectionist model for decision support was constructed out of several back-propagation modules. Manifestations serve as input to the model; they may be real-valued, and the confidence in their measurement may be specified. The model produces as its output the posterior probability of disease. The model was trained on 1,000 cases taken from a simulated underlying population with three conditionally independent manifestations. The first manifestation had a linear relationship between value and posterior probability of disease, the second had a stepped relationship, and the third was normally distributed. An independent test set of 30,000 cases showed that the model estimated the posterior probability of disease better (standard deviation of residuals 0.046, 95% confidence interval 0.046-0.047) than a model constructed using logistic regression (standard deviation of residuals 0.062, 95% confidence interval 0.062-0.063). The model fitted the normal and stepped manifestations better than the linear one. It accommodated intermediate levels of confidence well.

Intrinsic Stacking Interactions of Natural and Artificial Nucleobases

10.26434/chemrxiv.11400405 ◽ 2019

Author(s): Drew P. Harding ◽ Laura J. Kingsley ◽ Glen Spraggon ◽ Steven Wheeler

Keyword(s): Gas Phase ◽ Electrostatic Interactions ◽ Density Functional ◽ Molecular Descriptors ◽ Stacking Interactions ◽ Interaction Energies ◽ Functional Theory ◽ Binding Partner ◽ Heavy Atoms ◽ Wide Range

The intrinsic (gas-phase) stacking energies of natural and artificial nucleobases were explored using density functional theory (DFT) and correlated ab initio methods. Ranking the stacking strength of natural nucleobase dimers revealed a preference in binding partner similar to that seen in experiments, namely G > C > A > T > U. Decomposition of these interaction energies using symmetry-adapted perturbation theory (SAPT) showed that these dispersion-dominated interactions are modulated by electrostatics. Artificial nucleobases showed a similar stacking preference for natural nucleobases and were also modulated by electrostatic interactions. A robust predictive multivariate model was developed that quantitatively predicts the maximum stacking interaction between natural and a wide range of artificial nucleobases using molecular descriptors based on computed electrostatic potentials (ESPs) and the number of heavy atoms. This model should find utility in designing artificial nucleobase analogs that exhibit stacking interactions comparable to those of natural nucleobases. Further analysis of the descriptors in this model unveils the origin of the superior stacking abilities of certain nucleobases, including cytosine and guanine.

Prediction of Ciliwung River Water Quality Using the Decision Tree Algorithm (original title: Prediksi Kualitas Air Sungai Ciliwung dengan Menggunakan Algoritma Pohon Keputusan)

Jurnal Air Indonesia ◽ 10.29122/jai.v12i2.4364 ◽ 2021 ◽ Vol 12 (2)

Author(s): Mohammad Haekal ◽ Henki Bayu Seta ◽ Mayanda Mega Santoni

Keyword(s): Data Mining ◽ Decision Tree ◽ Cross Validation ◽ Online Monitoring ◽ Training Set ◽ Microsoft Excel ◽ Test Set

To predict the water quality of the Ciliwung River, data obtained by online monitoring were processed using a data mining method. In this method, the monitoring data were first tabulated in Microsoft Excel and then processed into a decision tree with the Decision Tree algorithm in the WEKA application. The decision tree method was chosen because it is simpler, easy to understand, and highly accurate. A total of 5,476 Ciliwung River water quality monitoring records were processed. Classification with the decision tree showed that, of these 5,476 records, 1,059 (19.3242%) indicated that the Ciliwung River was Not Polluted and 4,417 (80.6758%) indicated that it was Polluted. The monitoring data were then evaluated using four test options: Use Training Set, Supplied Test Set, 10-fold Cross-Validation, and Percentage Split 66%. All four test options showed very high accuracy, above 99%. From these results it can be predicted that the Ciliwung River is indicated as a polluted river with reference to Government Regulation of the Republic of Indonesia No. 82 of 2001, and that using the WEKA application with the Decision Tree algorithm to process monitoring data based on three parameters (pH, DO, and nitrate) is highly accurate and appropriate. Keywords: river water quality, data mining, Decision Tree algorithm, WEKA application.
