Optimal Division of Molecules Into Training And Test Sets With A New Tool To Predict Pharmacophore In 3D-QSAR

Author(s):  
Tuğba Alp Tokat ◽  
Burçin Türkmenoğlu ◽  
Yahya Güzel

Abstract Dividing molecules into training and test sets according to the descriptors in the pharmacophore model helps to create a good model. It is difficult to track the Local Reactive Descriptor (LRD) effect of the pharmacophore at each interaction point in the 3D metric system. A subset of clusters of atoms can correspond to all or part of the pharmacophore structure. In this study, the multidimensional system of the subset was reduced to a one-dimensional index, and the Vector Fingerprint Functions (VFF) of the molecules were created. Models were established by dividing molecules with close and similar VFFs into training and test sets. Sub-clusters were examined for all molecules by applying the Genetic Algorithm (GA). The model was evaluated using the Leave-One-Out Cross-Validation (LOO-CV) method and verified with an external test set. The statistical results of the model obtained with the division produced by the newly developed method (Q2 = 0.604 and R2 = 0.760 for training-80 and external test-20 sets, respectively) were compared with random and manual division results.
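The LOO-CV statistic quoted in this abstract can be illustrated with a minimal sketch. It assumes a single descriptor and an ordinary least-squares line, and uses one common definition, Q2 = 1 - PRESS / SS_total; the abstract does not state the authors' exact formula or model, and the names `fit_line` and `loo_q2` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one descriptor (assumed model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_q2(xs, ys):
    """Leave-one-out Q2 = 1 - PRESS / SS_total (one common definition)."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without molecule i, then predict molecule i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = mean(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Illustrative numbers only, not data from the study.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(loo_q2(descriptor, activity), 3))
```

Because each molecule is predicted by a model that never saw it, Q2 is normally lower than the fitted R2 on the same set.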

2016 ◽  
Vol 67 (1) ◽  
pp. 55-60
Author(s):  
Ante Miličević ◽  
Nenad Raos

Abstract Three sets of flavonoid derivatives (N=32, 40, and 74) and logarithms of their dissociation constants (log Kd) that describe flavonoid affinity toward P-glycoprotein were modelled using six connectivity indices. The best results were obtained with the zero-order valence molecular connectivity index (0χv) for all three sets. Standard errors of the calibration models were around 0.3, and of the constants from the test sets even a little lower, 0.22 and 0.24. Despite using only one descriptor, our model proved better in internal (cross-validation) and especially in external (test set) statistics than much more demanding methods used in previous 3D QSAR modelling.


2011 ◽  
Vol 76 (12) ◽  
pp. 1447-1469 ◽  
Author(s):  
Jahan B. Ghasemi ◽  
Somayeh Pirhadi

Using conformations generated from docking analysis with the CDOCKER algorithm, 3D-QSAR models, namely CoMFA region focusing (CoMFA-RF) and CoMSIA, were created for a series of a new class of potent and non-chiral renin inhibitors. Compared with CoMFA, CoMFA-RF and CoMSIA based on docking alignment gave satisfactory predictions. Robustness and predictability of the models were further verified by using the test set, cross validation (leave one out and leave ten out), bootstrapping, and progressive scrambling. An all-orientation search (AOS) strategy was used to acquire the best orientation and minimize the effect of the initial orientation of aligned compounds. The results of the 3D-QSAR models are in agreement with the docking results. Moreover, the resulting 3D CoMFA-RF/CoMSIA contour maps and corresponding models were applied to design new and more active inhibitors.


2012 ◽  
Vol 9 (4) ◽  
pp. 1699-1710 ◽  
Author(s):  
K. Meena Kumari ◽  
L. Yamini ◽  
M. Vijjulatha

Thymidylate synthase (TS) is a crucial enzyme for DNA biosynthesis, and many nonclassical lipophilic antifolates targeting this enzyme are quite efficient and encouraging as antitumor drugs. We report 3D-QSAR analyses on pyrrolo pyrimidine and thieno pyrimidine antifolates to investigate the mechanism of action and structure-activity relationship of these molecules. Leave-one-out (LOO) cross-validation gave cross-validated q2 values of 0.523 and 0.566 for CoMFA ligand-based (LB) and receptor-based (RB) models, and 0.516 and 0.471 for CoMSIA LB and RB models, respectively, while the non-cross-validated r2 values were 0.974 and 0.969 for CoMFA LB and RB, and 0.983 and 0.972 for CoMSIA LB and RB, respectively. The models were graphically interpreted using CoMFA and CoMSIA contour plots. The results obtained from this study were used for rational design of potent inhibitors against thymidylate synthase.


2016 ◽  
Vol 81 (2) ◽  
pp. 209-218 ◽  
Author(s):  
Long Jiao ◽  
Shan Bing ◽  
Xiaofeng Zhang ◽  
Hua Li

The application of interval partial least squares (IPLS) and moving window partial least squares (MWPLS) to the enantiomeric analysis of tryptophan (Trp) was investigated. A UV-Vis spectroscopy method for determining the enantiomeric composition of Trp was developed. Calibration models were built using partial least squares (PLS), IPLS, and MWPLS. Leave-one-out cross validation and external test validation were used to assess the prediction performance of the established models. The validation results demonstrate that the full-spectrum PLS model is impractical for quantifying the relationship between the spectral data and the enantiomeric composition of L-Trp. In contrast, the IPLS and MWPLS models are both practicable for modeling this relationship. For the IPLS model, the root mean square relative error (RMSRE) of external test validation and leave-one-out cross validation is 4.03 and 6.50, respectively. For the MWPLS model, the RMSRE of external test validation and leave-one-out cross validation is 2.93 and 4.73, respectively. The prediction accuracy of the MWPLS model is therefore higher than that of the IPLS model. UV-Vis spectroscopy combined with MWPLS is shown to be a suitable method for determining the enantiomeric composition of Trp, and MWPLS is superior to IPLS for selecting the spectral region in UV-Vis spectroscopy analysis.
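The RMSRE figures above can be read against the usual definition, the square root of the mean squared relative error. The sketch below assumes that definition and returns a fraction; the abstract does not say whether its values are fractions or percentages, and the function name `rmsre` is hypothetical.

```python
import math


def rmsre(actual, predicted):
    """Root mean square relative error: sqrt(mean(((pred - act) / act) ** 2))."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((p - a) / a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative numbers only: each prediction is off by 10 % of the true value.
print(rmsre([1.0, 2.0], [1.1, 1.8]))
```

Since every error is divided by the true value, the metric is undefined when a reference value is zero, and small reference values dominate the result.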


2018 ◽  
Vol 21 (5) ◽  
pp. 381-387 ◽  
Author(s):  
Hossein Atabati ◽  
Kobra Zarei ◽  
Hamid Reza Zare-Mehrjardi

Aim and Objective: Human dihydroorotate dehydrogenase (DHODH) catalyzes the fourth stage of the biosynthesis of pyrimidines in cells. Hence it is important to identify suitable inhibitors of DHODH to prevent virus replication. In this study, a quantitative structure-activity relationship analysis was performed to predict the activity of one group of newly synthesized halogenated pyrimidine derivatives as inhibitors of DHODH. Materials and Methods: Molecular structures of halogenated pyrimidine derivatives were drawn in HyperChem, and molecular descriptors were then calculated with DRAGON software. Finally, the most effective descriptors for 32 halogenated pyrimidine derivatives were selected using the bee algorithm. Results: The descriptors selected using the bee algorithm were applied for modeling. The mean relative error and correlation coefficient were 2.86% and 0.9627, respectively, while the corresponding values for the leave-one-out cross-validation method were 4.18% and 0.9297, respectively. External validation was also conducted using separate training and test sets. The correlation coefficients for the training and test sets were 0.9596 and 0.9185, respectively. Conclusion: The modeling results of the present work showed that the bee algorithm performs well for variable selection in QSAR studies and that its results were better than those of the model constructed with descriptors selected using the genetic algorithm method.


2019 ◽  
Vol 76 (7) ◽  
pp. 2349-2361
Author(s):  
Benjamin Misiuk ◽  
Trevor Bell ◽  
Alec Aitken ◽  
Craig J Brown ◽  
Evan N Edinger

Abstract Species distribution models are commonly used in the marine environment as management tools. The high cost of collecting marine data for modelling limits their availability, especially in remote locations. Underwater image datasets from multiple surveys were leveraged to model the presence–absence and abundance of Arctic soft-shell clam (Mya spp.) to support the management of a local small-scale fishery in Qikiqtarjuaq, Nunavut, Canada. These models were combined to predict Mya abundance, conditional on presence, throughout the study area. Results suggested that water depth was the primary environmental factor limiting Mya habitat suitability, yet seabed topography and substrate characteristics influence their abundance within suitable habitat. Ten-fold cross-validation and spatial leave-one-out cross-validation (LOO CV) were used to assess the accuracy of combined predictions and to test whether this was inflated by the spatial autocorrelation of transect sample data. Results demonstrated that four different measures of predictive accuracy were substantially inflated due to spatial autocorrelation, and the spatial LOO CV results were therefore adopted as the best estimates of performance.
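One common way to build spatial LOO CV folds is to hold out a single sample and also drop every training sample lying within a buffer distance of it, so that near neighbours cannot leak information. The sketch below assumes that buffer variant with planar coordinates; the abstract does not describe the authors' exact procedure, and `spatial_loo_folds` and `buffer_radius` are hypothetical names.

```python
import math


def spatial_loo_folds(coords, buffer_radius):
    """Yield (test_index, train_indices) pairs for buffered leave-one-out.

    Training points at or within buffer_radius of the held-out point are
    excluded (assumed variant of spatial LOO CV).
    """
    for i, (xi, yi) in enumerate(coords):
        train = [
            j
            for j, (xj, yj) in enumerate(coords)
            if j != i and math.hypot(xj - xi, yj - yi) > buffer_radius
        ]
        yield i, train


# Illustrative coordinates only: two tight pairs of samples far apart.
samples = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
for test_index, train_indices in spatial_loo_folds(samples, buffer_radius=2.0):
    print(test_index, train_indices)
```

With a radius of zero this reduces to ordinary leave-one-out for distinct locations, which makes it easy to compare the two accuracy estimates on the same data.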


2014 ◽  
Vol 79 (8) ◽  
pp. 965-975 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
Hua Li ◽  
Yunxia Wang

The quantitative structure property relationship (QSPR) for gas/particle partition coefficient, Kp, of polychlorinated biphenyls (PCBs) was investigated. Molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCBs. The quantitative relationship between the MDEV index and log Kp was modeled by multivariate linear regression (MLR) and artificial neural network (ANN) respectively. Leave one out cross validation and external validation were carried out to assess the prediction ability of the developed models. When the MLR method is used, the root mean square relative error (RMSRE) of prediction for leave one out cross validation and external validation is 4.72 and 8.62 respectively. When the ANN method is employed, the prediction RMSRE of leave one out cross validation and external validation is 3.87 and 7.47 respectively. It is demonstrated that the developed models are practicable for predicting the Kp of PCBs. The MDEV index is shown to be quantitatively related to the Kp of PCBs.

