A simple 2D-QSPR model for the prediction of Setschenow constants of organic compounds

2016 ◽  
Vol 35 (1) ◽  
pp. 53 ◽  
Author(s):  
Qi Xu ◽  
Lingling Fan ◽  
Jie Xu

A quantitative structure-property relationship (QSPR) analysis of the Setschenow constants (Ksalt) of organic compounds in a sodium chloride solution was carried out using only two-dimensional (2D) descriptors as input parameters. The whole set of 101 compounds was split into a training set of 71 compounds and a validation set of 30 compounds by means of the Kennard-Stone algorithm. A general four-parameter equation, with a correlation coefficient (R) of 0.887 and a standard error of estimation (s) of 0.031, was obtained by stepwise multilinear regression analysis (MLRA) on the training set. The reliability and robustness of the present model were verified with leave-one-out cross-validation, randomization tests, and the external validation set. All of the descriptors contained in this model are calculated directly from the molecular 2D structures; thus, this model can be used to easily predict the Ksalt of other compounds not included in the present dataset.
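The Kennard-Stone split mentioned above can be sketched as follows. This is a minimal illustration of the general selection rule (start from the two most distant samples, then repeatedly add the sample farthest from everything already selected), not the authors' implementation; the function name kennard_stone and the toy descriptor values are assumptions made for the example.

```python
from math import dist


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")
    # Seed the selection with the two most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]
    while len(selected) < n_select:
        # Add the candidate whose nearest selected point is farthest away.
        best = max(
            remaining,
            key=lambda i: min(dist(points[i], points[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical 2D descriptor vectors for five compounds (illustration only).
descriptors = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
training_idx = kennard_stone(descriptors, 3)
validation_idx = [i for i in range(len(descriptors)) if i not in training_idx]
```

In a split like the one described in the abstract, the selected indices would form the training set and the unselected compounds the validation set.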

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shoko Kure ◽  
Sera Satoi ◽  
Toshihiko Kitayama ◽  
Yuta Nagase ◽  
Nobuo Nakano ◽  
...  

Safe and noninvasive methods for breast cancer screening with improved accuracy are urgently needed. Volatile organic compounds (VOCs) in biological samples such as breath and blood have been investigated as novel noninvasive markers of cancer. We investigated VOCs in urine to assess their potential for the detection of breast cancer. One hundred and ten women with biopsy-proven breast cancer and 177 healthy volunteers were enrolled. The subjects were divided into two groups: a training set and an external validation set. Urine samples were collected and analyzed by gas chromatography and mass spectrometry. A predictive model was constructed by multivariate analysis, and the sensitivity and specificity of the model were confirmed using both the training set and the external validation set, with reproducibility tests. The training set included 60 breast cancer patients (age 34–88 years, mean 60.3) and 60 healthy controls (age 34–81 years, mean 58.7). The external validation set included 50 breast cancer patients (age 35–85 years, mean 58.8) and 117 healthy controls (age 18–84 years, mean 51.2). One hundred and ninety-one compounds detected in at least 80% of the samples from the training set were used for further analysis. The predictive model that best detected breast cancer at various clinical stages was constructed using a combination of two of the compounds, 2-propanol and 2-butanone. The sensitivity and specificity in the training set were 93.3% and 83.3%, respectively. Triplicate reproducibility tests were performed by randomly choosing ten samples from each group, and the results showed a matching rate of 100% for the breast cancer patient group and 90% for the healthy control group. Our prediction model using two VOCs is a useful complement to the current diagnostic tools. Further studies that include benign tumors and non-breast malignancies are warranted.
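For readers who want to see how the reported sensitivity and specificity relate to patient counts, here is a minimal sketch. The confusion-matrix counts below are not given in the abstract; they are back-calculated assumptions that happen to reproduce 93.3% and 83.3% for 60 patients and 60 controls.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of patients correctly flagged
    specificity = tn / (tn + fp)  # fraction of controls correctly cleared
    return sensitivity, specificity


# Assumed counts for a 60-patient / 60-control training set (illustration only).
sens, spec = sensitivity_specificity(tp=56, fn=4, tn=50, fp=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```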


2014 ◽  
Vol 79 (8) ◽  
pp. 965-975 ◽  
Author(s):  
Long Jiao ◽  
Xiaofei Wang ◽  
LI. Hua ◽  
Yunxia Wang

The quantitative structure-property relationship (QSPR) for the gas/particle partition coefficient, Kp, of polychlorinated biphenyls (PCBs) was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor of PCBs. The quantitative relationship between the MDEV index and log Kp was modeled by multivariate linear regression (MLR) and by an artificial neural network (ANN). Leave-one-out cross-validation and external validation were carried out to assess the predictive ability of the developed models. With the MLR method, the root mean square relative error (RMSRE) of prediction is 4.72 for leave-one-out cross-validation and 8.62 for external validation. With the ANN method, the prediction RMSRE is 3.87 for leave-one-out cross-validation and 7.47 for external validation. The results demonstrate that the developed models are practicable for predicting the Kp of PCBs and that the MDEV index is quantitatively related to the Kp of PCBs.
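The two validation ingredients named above, RMSRE and leave-one-out cross-validation, can be sketched together. This assumes the usual definition RMSRE = sqrt(mean(((predicted - observed) / observed)^2)) and substitutes a one-descriptor least-squares line for the paper's MLR and ANN models; the helper names and the toy data are hypothetical.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(xs, ys):
    """Predict each sample from a model fitted on all the other samples."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions


def rmsre(observed, predicted):
    """Root mean square relative error (as a fraction, not a percentage)."""
    terms = [((p - o) / o) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(terms) / len(terms))


# Toy descriptor/property values (illustration only, not PCB data).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1]
loo_error = rmsre(ys, loo_predictions(xs, ys))
```

Multiplying the result by 100 gives a percentage, which is presumably how the values quoted in the abstract are expressed.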


In this paper, the authors present an effort to enlarge the applicability domain (AD) by retraining models using a database of 701 highly dissimilar molecules with anti-tyrosinase activity and 728 drugs with other uses. Atom-based linear indices and best-subset linear discriminant analysis (LDA) were used to develop individual classification models. Eighteen individual classification-based QSAR models for tyrosinase inhibitory activity were obtained, with global accuracy ranging from 88.15% to 91.60% in the training set and Matthews correlation coefficients (C) ranging from 0.76 to 0.82. On the external validation set, global classification accuracy was above 85.99% and C was above 0.72. All individual models were validated in accordance with OECD principles. A brief analysis of the AD for the training set of 478 compounds and the new active compounds included in the retraining was carried out. Various multiclassifier systems were assembled from the eighteen models using different selection criteria, which makes it possible to select the best strategy for a particular problem. The assembled multiclassifier systems were also used to estimate the potency of the compounds identified as active, using eighteen potency models validated according to OECD principles.
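The Matthews correlation coefficient (C) quoted above is computed from the four cells of a binary confusion matrix. A minimal sketch using the standard formula follows; the counts are made up for illustration and are not taken from the paper.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts: 100 actives and 100 inactives (illustration only).
c_example = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
```

Unlike plain accuracy, C stays informative when the active and inactive classes are of different sizes, which is likely why it is reported alongside global accuracy.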


2019 ◽  
Vol 31 (5) ◽  
pp. 665-673 ◽  
Author(s):  
Maud Menard ◽  
Alexis Lecoindre ◽  
Jean-Luc Cadoré ◽  
Michèle Chevallier ◽  
Aurélie Pagnon ◽  
...  

Accurate staging of hepatic fibrosis (HF) is important for treatment and prognosis of canine chronic hepatitis. HF scores are used in human medicine to indirectly stage and monitor HF, decreasing the need for liver biopsy. We developed a canine HF score to screen for moderate or greater HF. Our study included 96 dogs, 5 of which were healthy. A liver biopsy for histologic examination and a biochemistry profile were performed on all dogs. The dogs were randomly split into a training set of 58 dogs and a validation set of 38 dogs. An HF score that included alanine aminotransferase, alkaline phosphatase, total bilirubin, potassium, and gamma-glutamyl transferase was developed in the training set. Model performance was confirmed using the internal validation set and was similar to the performance in the training set. The overall sensitivity and specificity for the study group were 80% and 70%, respectively, with an area under the curve of 0.80 (0.71–0.90). This HF score could be used for indirect diagnosis of canine HF when biochemistry panels are performed on the Konelab 30i (Thermo Scientific), using reagents as in our study. External validation is required to determine whether the score is sufficiently robust to be used with biochemical results measured in other laboratories with different instruments and methodologies.


Molecules ◽  
2019 ◽  
Vol 24 (10) ◽  
pp. 2006 ◽  
Author(s):  
Liadys Mora Lagares ◽  
Nikola Minovski ◽  
Marjana Novič

P-glycoprotein (P-gp) is a transmembrane protein that actively transports a wide variety of chemically diverse compounds out of the cell. It is highly associated with the ADMET (absorption, distribution, metabolism, excretion and toxicity) properties of drugs/drug candidates and contributes to decreasing toxicity by eliminating compounds from cells, thereby preventing intracellular accumulation. Therefore, in the drug discovery and toxicological assessment process it is advisable to check whether a compound under development could be transported by P-gp. In this study, an in silico multiclass classification model capable of predicting the probability that a compound interacts with P-gp was developed using a counter-propagation artificial neural network (CP ANN) based on a set of 2D molecular descriptors and an extensive dataset of 2512 compounds (1178 P-gp inhibitors, 477 P-gp substrates and 857 P-gp non-active compounds). The model provided good classification performance, producing non-error rate (NER) values of 0.93 for the training set and 0.85 for the test set, while the average precision (AvPr) was 0.93 for the training set and 0.87 for the test set. An external validation set of 385 compounds was used to challenge the model’s performance. On the external validation set, the NER and AvPr values were both 0.70. We believe that this in silico classifier could be used effectively as a reliable virtual screening tool for identifying potential P-gp ligands.
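The NER and AvPr indices reported above can be derived from a multiclass confusion matrix. The sketch below assumes the definitions commonly used in chemometrics (NER as the mean of the per-class sensitivities, AvPr as the mean of the per-class precisions); the abstract does not spell these out, and the 3 x 3 matrix is invented for illustration.

```python
def ner_and_avpr(confusion):
    """Return (NER, AvPr) for a square confusion matrix.

    confusion[i][j] counts samples of true class i assigned to class j.
    """
    k = len(confusion)
    # Per-class sensitivity: correct assignments over the row (true class) total.
    sensitivities = [confusion[c][c] / sum(confusion[c]) for c in range(k)]
    # Per-class precision: correct assignments over the column (predicted) total.
    precisions = [
        confusion[c][c] / sum(row[c] for row in confusion) for c in range(k)
    ]
    return sum(sensitivities) / k, sum(precisions) / k


# Hypothetical counts for inhibitors, substrates, and non-actives.
example_matrix = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 2, 8],
]
ner, avpr = ner_and_avpr(example_matrix)
```

Because both indices average over classes rather than over samples, a small class such as the 477 substrates weighs as much as the larger inhibitor class.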


Cells ◽  
2019 ◽  
Vol 8 (10) ◽  
pp. 1286 ◽  
Author(s):  
Onat Kadioglu ◽  
Thomas Efferth

P-glycoprotein (P-gp) is an important determinant of multidrug resistance (MDR) because its overexpression is associated with increased efflux of various established chemotherapy drugs in many clinically resistant and refractory tumors. This leads to insufficient therapeutic targeting of tumor populations, representing a major drawback of cancer chemotherapy. Therefore, P-gp is a target for pharmacological inhibitors to overcome MDR. In the present study, we utilized machine learning strategies to establish a model for P-gp modulators to predict whether a given compound would behave as a substrate or an inhibitor of P-gp. Random forest feature selection algorithm-based leave-one-out random sampling was used. Testing the model with an external validation set revealed high performance scores. A P-gp modulator list of compounds from the ChEMBL database was used to test the performance, and predictions from both substrate and inhibitor classes were selected for the last step of validation with molecular docking. Predicted substrates showed docking poses similar to that of doxorubicin, and predicted inhibitors showed docking poses similar to that of the known P-gp inhibitor elacridar, supporting the validity of the predictions. We conclude that the machine-learning approach introduced in this investigation may serve as a tool for the rapid detection of P-gp substrates and inhibitors in large chemical libraries.


2016 ◽  
Vol 15 (02) ◽  
pp. 1650011 ◽  
Author(s):  
Xinliang Yu ◽  
Xianwei Huang

The glass transition temperature [Formula: see text] is the most important parameter of an amorphous polymer. A quantitative structure-property relationship (QSPR) was developed for the [Formula: see text]s of 82 polyacrylates by applying stepwise multiple linear regression (MLR) analysis. Molecular descriptors used to describe polymer structures were, for the first time, calculated from the motion units of polymer backbones, which are chain segments 20 carbons in length (10 repeating units). After internal validation with the leave-one-out (LOO) method, external validation was carried out to test the stability of the MLR model of [Formula: see text]s. Compared with models already published in the literature, the MLR model in this paper was accurate and acceptable, even though it was based on larger data sets. The feasibility of calculating molecular descriptors from the motion units of polymer backbones for developing [Formula: see text] models of polyacrylates has been demonstrated.


2017 ◽  
Vol 16 (02) ◽  
pp. 1750014 ◽  
Author(s):  
Xinliang Yu ◽  
Rimeng Zhan ◽  
Jiyong Deng ◽  
Xianwei Huang

Lubricating additives can improve the performance of base oil in reducing friction and wear and minimizing loss of energy. It is of great significance to study the relationship between the chemical structures and lubrication properties of lubricant additives. This paper reports a quantitative structure–property relationship (QSPR) model of the maximum nonseizure loads ([Formula: see text]) of 79 lubricant additives, built by applying an artificial neural network (ANN) based on the backpropagation-of-errors algorithm. Six molecular descriptors appearing in the multiple linear regression (MLR) model were used as input vectors to develop the ANN model. The optimal ANN, with a network structure of [6-4-1], was obtained by adjusting various parameters by trial and error. The root-mean-square (rms) errors from the ANN model are [Formula: see text] ([Formula: see text]) for the training set and [Formula: see text] ([Formula: see text]) for the test set, which are superior to the MLR results of [Formula: see text] ([Formula: see text]) for the training set and [Formula: see text] ([Formula: see text]) for the test set. Compared with the existing model for [Formula: see text], our model has better statistical quality. The results indicate that our ANN model can be applied to predict the [Formula: see text] values of lubricant additives.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e15718-e15718
Author(s):  
Shuichi Mitsunaga ◽  
Shogo Nomura ◽  
Kazuo Hara ◽  
Yukiko Takayama ◽  
Makoto Ueno ◽  
...  

e15718 Background: The diagnostic value of serum microRNAs (miRNAs) measured on a highly sensitive microarray for pancreatobiliary cancer (PBca) has been demonstrated. This study attempted to build and validate a signature comprising multiple serum miRNA markers for discriminating PBca from healthy controls. Methods: A multicenter prospective study on the diagnostic performance of serum miRNAs was conducted. Patients (pts) with treatment-naïve PBca and healthy participants aged ≥60 years were enrolled. Clinical data and sera were collected. The target population was randomly divided into a training or validation cohort with an allocation ratio of 2:1. Twenty-nine serum miRNA markers from the microarray data were analyzed. For each combination of the markers, Fisher’s linear discriminant analysis was performed, and the resulting sensitivity, specificity and AUC of the ROC curve for discriminating PBca from healthy controls were calculated. Marker combinations with a sensitivity/specificity (SN/SP) of ≥80%/90% and a high AUC in comparison with the AUC of CA19-9 were defined as diagnostic miRNA signatures and were selected in the training cohort. Next, the signatures that showed good reproducibility in the validation cohort were retained. As an independent external cohort, PBca pts and healthy participants with pooled frozen sera were enrolled, and the identified miRNA signatures were further validated. Results: A total of 546 participants (80 healthy and 223 PBca in the training set, 40 healthy and 104 PBca in the validation set, 49 healthy and 50 PBca in the external validation set) were analyzed in this study. Four serum miRNA combinations were identified as diagnostic miRNA signatures. In the training set, four miRNA signatures, consisting of 10 miRNAs, were developed. For the best-performing miRNA signature, the SN/SP and AUC in the validation and external validation cohorts were 84/90% and 0.95 (CA19-9: 73/95% and 0.88) and 84/90% and 0.93 (CA19-9: 80/94% and 0.87), respectively. Conclusions: Diagnostic serum miRNA signatures for PBca were identified in this study.


2010 ◽  
Vol 2010 ◽  
pp. 1-5 ◽  
Author(s):  
Chari V. Kandala ◽  
Jaya Sundaram

It would be useful to know the total kernel mass within a given mass of peanuts (mass ratio) when the peanuts are bought or processed. In this work, the possibility of finding the mass ratio while the peanuts were in their shells was investigated. Capacitance, phase angle, and dissipation factor measurements on a parallel-plate capacitor holding in-shell peanut samples were made at frequencies from 1 to 10 MHz in steps of 1 MHz. A calibration equation was developed by multilinear regression analysis correlating the percentage ratio of the kernel weight with the measured capacitance, dissipation factor, and phase angle values of in-shell peanut samples with known kernel weights. The equation was used to predict the percentage mass ratio in the validation groups. The fit of the calibration model was verified using the standard error of calibration, the root mean square error of calibration, and leverage and influence plots. The predictability percentage, within 1% and 2% of the visual determination, was calculated by comparing the kernel mass ratio obtained from the model equation with the reference value obtained by visual determination. Cross-validation gave 96% and 100% predictability, and external validation gave 87% and 98% predictability, within 1% and 2% difference, respectively.
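The predictability percentage described above can be sketched as the share of samples whose predicted mass ratio falls within a given tolerance of the reference value. This reading of the metric is an assumption based on the abstract's wording, and the sample values are invented.

```python
def predictability(predicted, reference, tolerance):
    """Percentage of predictions within +/- tolerance of the reference."""
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return 100.0 * hits / len(reference)


# Hypothetical kernel mass ratios in percent (illustration only).
predicted_ratio = [70.5, 68.2, 72.9, 66.0]
reference_ratio = [70.0, 69.9, 72.0, 66.4]

within_1 = predictability(predicted_ratio, reference_ratio, tolerance=1.0)
within_2 = predictability(predicted_ratio, reference_ratio, tolerance=2.0)
```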

