QSAR study of the octanol/water partition coefficient of organophosphorous compounds: The hybrid GA/MLR and GA/ANN approaches

2020 ◽  
Vol 85 (4) ◽  
pp. 467-480 ◽  
Author(s):  
Rana Amiri ◽  
Djelloul Messadi ◽  
Amel Bouakkadia

This study aimed at predicting the n-octanol/water partition coefficient (Kow) of 43 organophosphorous insecticides. Quantitative structure–property relationship analysis was performed on the series of 43 insecticides using two different methods, linear (multiple linear regression, MLR) and non-linear (artificial neural network, ANN), which correlate the Kow values of these chemicals to their structural descriptors. First, the data set was separated with a duplex algorithm into a training set (28 chemicals) and a test set (15 chemicals) for statistical external validation. A model with four descriptors was developed using theoretical descriptors derived from Dragon software as independent variables, selected with the genetic algorithm (GA)–variable subset selection (VSS) procedure. The values of the statistical parameters R2, Q2ext, SDEPext and SDEC for the MLR model (94.09 %, 92.43 %, 0.533 and 0.471, respectively) and the ANN model (97.24 %, 92.17 %, 0.466 and 0.332, respectively) obtained for the two approaches are very similar, which confirms that the four-parameter model is stable, robust and significant.
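The abstract reports R2, Q2ext, SDEPext and SDEC without giving their formulas. The sketch below uses commonly cited definitions and should be read as an assumption, not as the authors' code: SDEC and SDEPext are taken as root-mean-square errors on the training and external test sets, and Q2ext is referenced to the training-set mean. All function names are hypothetical.

```python
import math

def r_squared(y_obs, y_pred):
    """Coefficient of determination on the fitted (training) data."""
    mean_y = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_y) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot

def sdec(y_obs, y_pred):
    """Standard deviation error in calculation (assumed: RMSE on the training set)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    return math.sqrt(ss_res / len(y_obs))

def q2_ext(y_train, y_test, y_test_pred):
    """External Q2 (assumed form: test-set PRESS referenced to the training mean)."""
    mean_train = sum(y_train) / len(y_train)
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_test_pred))
    ss_ref = sum((o - mean_train) ** 2 for o in y_test)
    return 1.0 - press / ss_ref

def sdep_ext(y_test, y_test_pred):
    """Standard deviation error in prediction (assumed: RMSE on the external test set)."""
    press = sum((o - p) ** 2 for o, p in zip(y_test, y_test_pred))
    return math.sqrt(press / len(y_test))
```

In this form, the abstract's percentages for R2 and Q2ext would correspond to the returned fractions multiplied by 100.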

2017 ◽  
Vol 28 (4) ◽  
pp. 579-592 ◽  
Author(s):  
Amel Bouakkadia ◽  
Leila Lourici ◽  
Djelloul Messadi

Purpose – The purpose of this paper is to predict the octanol/water partition coefficient (Kow) of 43 organophosphorous compounds. Design/methodology/approach – A quantitative structure-property relationship analysis was performed on a series of 43 pesticides using multiple linear regression and support vector machines methods, which correlate the octanol-water partition coefficient (Kow) values of these chemicals to their structural descriptors. At first, the data set was randomly separated into a training set (34 chemicals) and a test set (nine chemicals) for statistical external validation. Findings – Models with three descriptors were developed using theoretical descriptors as independent variables derived from Dragon software while applying genetic algorithm-variable subset selection procedure. Originality/value – The robustness and the predictive performance of the proposed linear model were verified using both internal and external statistical validation. One influential point which reinforces the model and an outlier were highlighted.


2016 ◽  
Vol 27 (3) ◽  
pp. 299-312
Author(s):  
Nadia Ziani ◽  
Khadidja Amirat ◽  
Djelloul Messadi

Purpose – The purpose of this paper is to predict the aquatic toxicity (LC50) of 92 substituted benzene derivatives in Pimephales promelas. Design/methodology/approach – Quantitative structure-activity relationship analysis was performed on a series of 92 substituted benzene derivatives using multiple linear regression (MLR), artificial neural network (ANN) and support vector machine (SVM) methods, which correlate the aquatic toxicity (LC50) values of these chemicals to their structural descriptors. At first, the entire data set was split according to the Kennard and Stone algorithm into a training set (74 chemicals) and a test set (18 chemicals) for statistical external validation. Findings – Models with six descriptors were developed using theoretical descriptors derived from Dragon software as independent variables, applying the genetic algorithm – variable subset selection procedure. Originality/value – The values of Q2 and RMSE in internal validation for the MLR, SVM and ANN models were (0.8829; 0.225), (0.8882; 0.222) and (0.8980; 0.214), respectively, and in external validation were (0.9538; 0.141), (0.947; 0.146) and (0.9564; 0.146). The statistical parameters obtained for the three approaches are very similar, which confirms that the six-parameter model is stable, robust and significant.
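The Kennard and Stone split mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance in descriptor space, a start from the two mutually most distant compounds, and a hypothetical function name. The abstract does not say which distance or implementation the authors used.

```python
import math

def kennard_stone(points, n_train):
    """Return indices of n_train points chosen by Kennard-Stone selection."""
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")
    # Full pairwise Euclidean distance matrix (assumed metric).
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Seed the training set with the two mutually most distant points.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    while len(selected) < n_train:
        # Add the candidate whose nearest already-selected point is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

For a 74/18 split as in the abstract, one would call `kennard_stone(descriptor_rows, 74)` and assign the unselected indices to the test set, where `descriptor_rows` is a hypothetical list of per-compound descriptor tuples.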


2020 ◽  
Vol 16 (8) ◽  
pp. 1088-1105
Author(s):  
Nafiseh Vahedi ◽  
Majid Mohammadhosseini ◽  
Mehdi Nekoei

Background: The poly(ADP-ribose) polymerases (PARP) are a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, linear and non-linear methods including multiple linear regression (MLR), support vector machine (SVM) and artificial neural networks (ANN) were used to develop quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set into training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset of descriptors that contribute most significantly to the overall inhibitory activity. Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation through an external test set and Y-randomization (chance correlation) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that the non-linear modeling approaches, SVM and ANN, provided considerably better predictive capability. Conclusion: Among the constructed models, and in terms of root mean square error of prediction (RMSEP), cross-validation coefficients (Q2 LOO and Q2 LGO), as well as R2 and the F-statistic for the training set, the predictive power of the GA-SVM approach was better. However, compared with MLR and SVM, the statistical parameters for the test set were better with the GA-ANN model.
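The Y-randomization check named in the abstract can be illustrated with a small sketch. It assumes a single descriptor and a simple linear fit so that it stays dependency-free; the study itself used multi-descriptor MLR, SVM and ANN models, and the function names here are hypothetical.

```python
import random

def r2_simple(x, y):
    """Squared Pearson correlation, i.e. R2 of a one-descriptor linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def y_randomization(x, y, n_trials=100, seed=0):
    """Return R2 for the true responses and R2 values for shuffled responses."""
    rng = random.Random(seed)
    shuffled_scores = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-response pairing
        shuffled_scores.append(r2_simple(x, shuffled))
    return r2_simple(x, y), shuffled_scores
```

A model is usually considered free of chance correlation when the R2 values for the shuffled responses sit well below the R2 of the true model.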


2020 ◽  
Vol 16 (3) ◽  
pp. 207-221
Author(s):  
Etratsadat Dadfar ◽  
Fatemeh Shafiei ◽  
Tahereh M. Isfahani

Aim and Objective: Sulfonamides (sulfa drugs) are compounds with a wide range of biological activities and they are the basis of several groups of drugs. Quantitative Structure-Property Relationship (QSPR) models are derived to predict the logarithm of the water/1-octanol partition coefficients (logP) of sulfa drugs. Materials and Methods: A data set of 43 sulfa drugs was randomly divided into 3 groups: training, test and validation sets consisting of 70%, 15% and 15% of the data points, respectively. A large number of molecular descriptors were calculated with Dragon software. The Genetic Algorithm - Multiple Linear Regression (GA-MLR) and Genetic Algorithm - Artificial Neural Network (GA-ANN) methods were employed to design the QSPR models. The possible molecular geometries of the sulfa drugs were optimized at the B3LYP/6-31G* level with Gaussian 98 software. The molecular descriptors derived from the Dragon software were used to build a predictive model for the logP of the mentioned compounds. The Genetic Algorithm (GA) method was applied to select the most relevant molecular descriptors. Results: The R2 and MSE values of the MLR model were calculated to be 0.312 and 5.074, respectively. R2 coefficients were 0.9869, 0.9944 and 0.9601 for the training, test and validation sets of the ANN model, respectively. Conclusion: Comparison of the results revealed that the application of the GA-ANN method gave better results than the GA-MLR method.
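A random 70%/15%/15% division like the one described can be sketched as below. The abstract does not state the resulting set sizes, so the rounding rule (round the training and test counts, give the remainder to validation) is an assumption, as is the function name.

```python
import random

def split_70_15_15(n_items, seed=0):
    """Randomly split item indices into training, test and validation sets."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    n_train = round(0.70 * n_items)  # assumed rounding rule
    n_test = round(0.15 * n_items)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    validation = idx[n_train + n_test:]  # remainder
    return train, test, validation
```

With 43 compounds, this rounding gives 30 training, 6 test and 7 validation compounds.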


2014 ◽  
Vol 79 (8) ◽  
pp. 965-975 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
LI. Hua ◽  
Yunxia Wang

The quantitative structure property relationship (QSPR) for gas/particle partition coefficient, Kp, of polychlorinated biphenyls (PCBs) was investigated. Molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCBs. The quantitative relationship between the MDEV index and log Kp was modeled by multivariate linear regression (MLR) and artificial neural network (ANN) respectively. Leave one out cross validation and external validation were carried out to assess the prediction ability of the developed models. When the MLR method is used, the root mean square relative error (RMSRE) of prediction for leave one out cross validation and external validation is 4.72 and 8.62 respectively. When the ANN method is employed, the prediction RMSRE of leave one out cross validation and external validation is 3.87 and 7.47 respectively. It is demonstrated that the developed models are practicable for predicting the Kp of PCBs. The MDEV index is shown to be quantitatively related to the Kp of PCBs.
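Leave-one-out cross-validation scored by RMSRE can be sketched as follows. Two assumptions are made: a single descriptor with a simple linear fit stands in for the multi-component MDEV index and MLR model, and RMSRE is taken as the root mean square of (predicted - observed)/observed, returned as a fraction. The abstract does not say whether its reported values are percentages. Function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def loo_rmsre(x, y):
    """Leave-one-out RMSRE; x and y are lists and y must contain no zeros."""
    rel_sq = []
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        pred = slope * x[i] + intercept
        rel_sq.append(((pred - y[i]) / y[i]) ** 2)
    return math.sqrt(sum(rel_sq) / len(rel_sq))
```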


Author(s):  
Tamiris Maria de Assis ◽  
Teodorico Castro Ramalho ◽  
Elaine Fontes Ferreira da Cunha

Background: The quantitative structure-activity relationship is an analysis method that can be applied for designing new molecules. In 1997, Hopfinger and coworkers developed the 4D-QSAR methodology aiming to eliminate the question of which conformation to use in a QSAR study. In this work, the 4D-QSAR methodology was used to quantitatively determine the influence of structural descriptors on the activity of aryl pyrimidine derivatives as inhibitors of the TGF-β1 receptor. The members of the TGF-β subfamily are interesting molecular targets, since they play an important role in cellular growth and development, including proliferation, apoptosis, differentiation, epithelial-mesenchymal transition (EMT), and migration. In late stages, TGF-β exerts tumor-promoting effects, increasing tumor invasiveness and metastasis. Therefore, TGF-β is an attractive target for cancer therapy. Objective: The major goal of the current research is to develop 4D-QSAR models aiming to propose new structures of aryl pyrimidine derivatives. Materials and Methods: Molecular dynamics simulation was carried out to generate the conformational ensemble profile of a data set of aryl pyrimidine derivatives. The conformations were overlaid into a three-dimensional cubic box, according to the three-ordered atom alignment. The occupation of the grid cells by the interaction pharmacophore elements provides the grid cell occupancy descriptors (GCOD), the independent variables used to build the 4D-QSAR models. The best models were validated (internal and external validation) using several statistical parameters. Molecular docking studies were performed to better understand the binding mode of pyrimidine derivatives inside the TGF-β active site. Results: The 4D-QSAR model presented seven descriptors and acceptable statistical parameters (R2 = 0.89, q2 = 0.68, R2pred = 0.65, r2m = 0.55, R2P = 0.68 and R2rand = 0.21), as well as pharmacophore groups important for the activity of these compounds. The molecular docking studies helped to understand the pharmacophoric groups and to propose substituents that increase the potency of aryl pyrimidine derivatives. Conclusion: The best QSAR model showed adequate statistical parameters that ensure its fitness, robustness, and predictivity. Structural modifications were assessed, and five new structures were proposed as candidates for a drug for cancer treatment.


2011 ◽  
Vol 356-360 ◽  
pp. 83-88 ◽  
Author(s):  
Shu Qiao ◽  
Kun Xie ◽  
Chuan Fu ◽  
Jie Pan

Polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs) are a group of important persistent organic pollutants. Quantitative structure–property relationship (QSPR) modeling is a powerful approach for predicting the properties of environmental organic pollutants from their structure descriptors. In this study, a QSPR model is established for estimating the n-octanol/water partition coefficient (log KOW) of PCDD/Fs. The three-dimensional holographic vector of atomic interaction field (3D-HoVAIF) is used to describe the chemical structures. An SMR-PLS QSAR model has been created, and good correlation coefficients and cross-validated correlation coefficients are obtained. The predictive capability of the models has also been demonstrated by leave-one-out cross-validation. Moreover, the optimum model has been used to estimate values for those PCDD/Fs which lack experimental data.


Author(s):  
B. Elidrissi ◽  
A. Ousaa ◽  
M. Ghamali ◽  
S. Chtita ◽  
M. A. Ajana ◽  
...  

A Quantitative Structure–Activity Relationship (QSAR) study was performed to predict the HIV-1 integrase inhibition activity (pIC50) of thirty-five 5-hydroxy-6-oxo-1,6-dihydropyrimidine-4-carboxamide compounds using electronic and physico-chemical descriptors computed with the Gaussian 03W and ACD/ChemSketch programs, respectively. The structures of all compounds were optimized using the hybrid Density Functional Theory (DFT) at the B3LYP/6-31G(d) level of theory. In both approaches, 28 compounds were assigned as the training set and the rest as the test set. These compounds were analyzed by the principal components analysis (PCA) method, the descendant Multiple Linear Regression (MLR) analyses and the Artificial Neural Network (ANN). The robustness of the obtained models was assessed by leave-many-out cross-validation and by external validation through a test set. This study shows that MLR predicted pIC50 activity marginally better than a (4-3-1) ANN model.
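A (4-3-1) architecture means four input descriptors, three hidden neurons and one output. The forward pass below is a minimal sketch of such a network. The tanh hidden activation, the linear output and the presence of bias terms are assumptions, since the abstract gives only the layer sizes, and the function names are hypothetical.

```python
import math

def forward_4_3_1(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: x has 4 inputs, w_hidden is 3 rows of 4 weights,
    b_hidden has 3 biases, w_out has 3 weights, b_out is a scalar."""
    hidden = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)  # assumed activation
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out  # linear output

def n_parameters(n_in=4, n_hidden=3, n_out=1):
    """Count weights and biases of a single-hidden-layer network."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out
```

Under these assumptions the network has 19 adjustable parameters, to be compared with the 28 compounds in the training set.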

