On determining the statistical parameters for pollution concentration from a truncated data set

1977 ◽  
Vol 11 (7) ◽  
pp. 663
Author(s):  
S. Tauber
2020 ◽  
Vol 16 (8) ◽  
pp. 1088-1105
Author(s):  
Nafiseh Vahedi ◽  
Majid Mohammadhosseini ◽  
Mehdi Nekoei

Background: The poly(ADP-ribose) polymerases (PARPs) are a nuclear enzyme superfamily present in eukaryotes. Methods: In the present report, linear and non-linear methods, including multiple linear regression (MLR), support vector machine (SVM) and artificial neural networks (ANN), were used to develop quantitative structure-activity relationship (QSAR) models capable of predicting pEC50 values of tetrahydropyridopyridazinone derivatives as effective PARP inhibitors. Principal component analysis (PCA) was used for a rational division of the whole data set into training and test sets. A genetic algorithm (GA) variable selection method was employed to select, from the large pool of calculated descriptors, the optimal subset with the most significant contributions to the overall inhibitory activity. Results: The accuracy and predictability of the proposed models were further confirmed using cross-validation, validation through an external test set and Y-randomization (chance correlation) approaches. Moreover, an exhaustive statistical comparison was performed on the outputs of the proposed models. The results revealed that the non-linear modeling approaches, SVM and ANN, provided considerably better predictive capability. Conclusion: Among the constructed models, the GA-SVM approach showed the best predictive power in terms of root mean square error of prediction (RMSEP), cross-validation coefficients (Q2 LOO and Q2 LGO), and the R2 and F-statistic values for the training set. However, compared with MLR and SVM, the statistical parameters for the test set were better with the GA-ANN model.
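As an illustration of the leave-one-out cross-validation coefficient (Q2 LOO) mentioned above, the following is a minimal sketch and not the authors' code. It assumes a single-descriptor least-squares line instead of the multi-descriptor MLR, SVM or ANN models of the paper, and it assumes the common definition Q2 = 1 - PRESS/SS_tot computed with the overall mean of the response. The function names `fit_line` and `q2_loo` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def q2_loo(x, y):
    """Leave-one-out Q2, assuming Q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(x)):
        # Refit the model without sample i, then predict sample i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up descriptor values
    activity = [1.2, 1.9, 3.2, 3.8, 5.1]     # made-up response values
    print(round(q2_loo(descriptor, activity), 3))
```

A Q2 LOO close to the fitted R2 suggests that the model is not overly dependent on any single sample; a leave-group-out variant (Q2 LGO) would hold out several samples per iteration instead of one.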


Author(s):  
Tamiris Maria de Assis ◽  
Teodorico Castro Ramalho ◽  
Elaine Fontes Ferreira da Cunha

Background: The quantitative structure-activity relationship (QSAR) is an analysis method that can be applied to the design of new molecules. In 1997, Hopfinger and coworkers developed the 4D-QSAR methodology, aiming to eliminate the question of which conformation to use in a QSAR study. In this work, the 4D-QSAR methodology was used to quantitatively determine the influence of structural descriptors on the activity of aryl pyrimidine derivatives as inhibitors of the TGF-β1 receptor. The members of the TGF-β subfamily are interesting molecular targets, since they play an important role in cell growth and development, including proliferation, apoptosis, differentiation, epithelial-mesenchymal transition (EMT), and migration. In late stages, TGF-β exerts tumor-promoting effects, increasing tumor invasiveness and metastasis. Therefore, TGF-β is an attractive target for cancer therapy. Objective: The major goal of the current research is to develop 4D-QSAR models in order to propose new structures of aryl pyrimidine derivatives. Materials and Methods: Molecular dynamics simulation was carried out to generate the conformational ensemble profile of a data set of aryl pyrimidine derivatives. The conformations were overlaid into a three-dimensional cubic box, according to the three-ordered atom alignment. The occupation of the grid cells by the interaction pharmacophore elements provides the grid cell occupancy descriptors (GCOD), the independent variables used to build the 4D-QSAR models. The best models were validated (internal and external validation) using several statistical parameters. Molecular docking studies were performed to better understand the binding mode of pyrimidine derivatives inside the TGF-β active site. Results: The 4D-QSAR model presented seven descriptors and acceptable statistical parameters (R2 = 0.89, q2 = 0.68, R2pred = 0.65, r2m = 0.55, R2P = 0.68 and R2rand = 0.21), and identified pharmacophore groups important for the activity of these compounds. The molecular docking studies helped to understand the pharmacophoric groups and to propose substituents that increase the potency of aryl pyrimidine derivatives. Conclusion: The best QSAR model showed adequate statistical parameters that ensure its fitness, robustness, and predictivity. Structural modifications were assessed, and five new structures were proposed as drug candidates for cancer treatment.


1995 ◽  
Vol 78 (2) ◽  
pp. 471-476 ◽  
Author(s):  
Luis Cuadros Rodríguez ◽  
Ana M García Campaña ◽  
Fermin Alés Barrero ◽  
Carlos Jiménez Linares ◽  
Manuel Román Ceba

Abstract: A statistical procedure to validate an analytical method by the standard addition methodology is described. The data set obtained in 3 calibration experiments with standard solutions, standard additions, and portions of sample is used. The accuracy of the analytical results is checked by comparison of analyte contents in the different calibrations and from the recovery. Mathematical expressions to estimate the statistical parameters are proposed. The statistical protocol has been applied to the fluorimetric determination of molybdenum with alizarin S in vegetable tissues.


2016 ◽  
Vol 27 (3) ◽  
pp. 299-312
Author(s):  
Nadia Ziani ◽  
Khadidja Amirat ◽  
Djelloul Messadi

Purpose – The purpose of this paper is to predict the aquatic toxicity (LC50) of 92 substituted benzene derivatives in Pimephales promelas. Design/methodology/approach – Quantitative structure-activity relationship analysis was performed on a series of 92 substituted benzene derivatives using multiple linear regression (MLR), artificial neural network (ANN) and support vector machine (SVM) methods, which correlate the aquatic toxicity (LC50) values of these chemicals with their structural descriptors. First, the entire data set was split according to the Kennard and Stone algorithm into a training set (74 chemicals) and a test set (18 chemicals) for statistical external validation. Findings – Models with six descriptors were developed using, as independent variables, theoretical descriptors derived from Dragon software and selected by a genetic algorithm – variable subset selection procedure. Originality/value – The values of Q2 and RMSE in internal validation for the MLR, SVM and ANN models were (0.8829; 0.225), (0.8882; 0.222) and (0.8980; 0.214), respectively, and in external validation (0.9538; 0.141), (0.947; 0.146) and (0.9564; 0.146). The statistical parameters obtained for the three approaches are very similar, which confirms that the six-parameter model is stable, robust and significant.
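The Kennard and Stone split mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distance between descriptor vectors, starts from the two most distant samples, and then repeatedly adds the sample whose nearest already-selected neighbour is farthest away. The function name `kennard_stone` and the toy data are hypothetical.

```python
import math


def kennard_stone(points, n_train):
    """Return (train_indices, test_indices) using a max-min distance selection."""
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and the number of points")

    # Pairwise Euclidean distances between descriptor vectors.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the training set with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = set(range(n)) - {first, second}

    # Add the sample that is farthest from its closest selected sample.
    while len(selected) < n_train:
        candidate = max(
            sorted(remaining),
            key=lambda k: min(dist[k][s] for s in selected),
        )
        selected.append(candidate)
        remaining.remove(candidate)

    return selected, sorted(remaining)


if __name__ == "__main__":
    toy = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]  # made-up 1-D descriptors
    print(kennard_stone(toy, 3))  # → ([0, 3, 4], [1, 2])
```

Because the selection spreads the training samples across the descriptor space, the test samples tend to fall inside the region covered by the training set.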


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Waqed H. Hassan ◽  
Halah K. Jalal

Abstract: Local scouring around the piers of a bridge is one of the major reasons for bridge failure, potentially resulting in heavy losses in terms of both the economy and human life. Accurate prediction of the depth of local scouring is, however, a difficult task because of the many factors that contribute to this process. The main aim of this study is thus to offer a new formula for predicting the local depth of scouring around the pier of a bridge using a modern fine computing modelling technique known as gene expression programming (GEP), with data obtained from numerical simulations used to compare GEP performance with that of a standard non-linear regression (NLR) model. The best technique for prediction of the local scouring depth is then determined based on three statistical parameters: the determination coefficient (R2), mean absolute error (MAE), and root mean squared error (RMSE). To achieve this, a total data set of 243 measurements, obtained by numerical simulation in Flow-3D, for intensity of flow, ratio of pier width, ratio of flow depth, pier Froude number, and pier shape factor is divided into training and validation (testing) datasets. The results suggest that the formula from the GEP model performs better in predicting the local depth of scouring than conventional regression with the NLR model, with R2 = 0.901, MAE = 0.111, and RMSE = 0.142. The sensitivity analysis results further suggest that the ratio of the depth of flow has the greatest impact on the prediction of local scour depth compared with the other input parameters. The formula obtained from the GEP model is the best predictor of depth of scouring and, in addition, GEP offers the special feature of providing both explicit and compressed arithmetical terms for calculating that depth.
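The three statistical parameters named above can be computed as in the following minimal sketch. It is an illustration only: it assumes R2 is defined as 1 - SS_res/SS_tot (the paper may instead use a squared correlation coefficient), and the function name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MAE and RMSE for paired observed and predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares

    return {
        "r2": 1.0 - ss_res / ss_tot,             # assumed definition of R2
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": math.sqrt(ss_res / n),
    }


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]     # made-up scour depths
    predicted = [1.1, 1.9, 3.2, 3.8]    # made-up model outputs
    print(regression_metrics(observed, predicted))
```

MAE and RMSE are expressed in the units of the predicted quantity, whereas R2 is dimensionless, so the three values are usually reported together when comparing models such as GEP and NLR.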


2003 ◽  
Vol 11 (2) ◽  
pp. 123-136 ◽  
Author(s):  
Athanasia M. Goula ◽  
Konstantinos G. Adamopoulos

The use of near infrared (NIR) reflectance spectroscopy for the rapid and accurate measurement of moisture, sugar, acid, protein and salt was explored in a diverse group of tomato juice products. Partial and overall calibrations were performed on four different tomato juice products. The partial calibration for each product included samples of that specific product only, whereas the overall calibration used samples of all the products. Samples were analysed by traditional chemical methods and scanned using an Instalab 600-Dickey-John NIR apparatus to obtain NIR spectra. Calibrations were obtained by multilinear regression between the chemical and spectral data of each calibration data set. A separate set of samples was used to validate the calibrations. Linear regression was applied to compare the results obtained by NIR spectroscopy for all constituents of the validation set with those obtained by the reference methods. In addition, the root mean square error of prediction (RMSEP), the bias and the correlation coefficients (r and r′) were calculated. All of the statistical parameters were better with the overall than with the partial calibrations. The prediction ability of the overall calibration was very good for all the constituents: r and r′ values were higher than 0.9488 and 0.9453, respectively, RMSEP values were smaller than 0.1067, and bias varied from −0.020 to 0.016. The partial calibrations were considerably less variable, with the correlation coefficients r and r′ ranging from 0.8890 to 0.9477 and from 0.7202 to 0.8518, respectively, RMSEP varying from 0.0647 to 0.4942 and bias from −0.365 to 0.071. NIR measurement as performed by the Dickey-John Analyser proved to be a rapid and accurate method for the analysis of tomato juice samples and may be used as a replacement for conventional, expensive and time-consuming wet chemistry methods.
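For readers unfamiliar with the validation statistics above, here is a minimal sketch of the bias and the correlation coefficient r between reference and NIR-predicted values. It is an assumption-laden illustration, not the authors' procedure: bias is taken as the mean of (predicted minus reference), r is the Pearson correlation coefficient, and r′ is not reproduced because the abstract does not define it. The function name `bias_and_pearson_r` is hypothetical.

```python
import math


def bias_and_pearson_r(reference, predicted):
    """Return (bias, r) for paired reference and predicted values."""
    n = len(reference)
    if n < 2 or n != len(predicted):
        raise ValueError("need at least two paired values")

    # Assumed sign convention: positive bias means over-prediction.
    bias = sum(p - r for p, r in zip(predicted, reference)) / n

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_ref == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant data")

    return bias, cov / math.sqrt(var_ref * var_pred)


if __name__ == "__main__":
    wet_chemistry = [4.8, 5.1, 5.6, 6.0]   # made-up reference values
    nir = [4.9, 5.0, 5.7, 6.1]             # made-up NIR predictions
    print(bias_and_pearson_r(wet_chemistry, nir))
```

A bias near zero together with r near 1 indicates that the predictions agree with the reference method both on average and sample by sample.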


2020 ◽  
Vol 85 (4) ◽  
pp. 467-480 ◽  
Author(s):  
Rana Amiri ◽  
Djelloul Messadi ◽  
Amel Bouakkadia

This study aimed at predicting the n-octanol/water partition coefficient (Kow) of 43 organophosphorus insecticides. Quantitative structure–property relationship analysis was performed on the series of 43 insecticides using two different methods, linear (multiple linear regression, MLR) and non-linear (artificial neural network, ANN), which correlate the Kow values of these chemicals with their structural descriptors. First, the data set was separated with a duplex algorithm into a training set (28 chemicals) and a test set (15 chemicals) for statistical external validation. A model with four descriptors was developed using, as independent variables, theoretical descriptors derived from Dragon software and selected by a genetic algorithm (GA)–variable subset selection (VSS) procedure. The values of the statistical parameters R2, Q2 ext, SDEPext and SDEC for the MLR model (94.09 %, 92.43 %, 0.533 and 0.471, respectively) and the ANN model (97.24 %, 92.17 %, 0.466 and 0.332, respectively) are very similar for the two approaches, which confirms that the four-parameter model is stable, robust and significant.

