A Method for Selecting Between Linear and Quadratic Classification Models in Discriminant Analysis

1995 ◽  
Vol 63 (3) ◽  
pp. 263-273 ◽  
Author(s):  
Alice Meshbane ◽  
John D. Morris


Foods ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 2723
Author(s):  
Evgenia D. Spyrelli ◽  
Christina Papachristou ◽  
George-John E. Nychas ◽  
Efstathios Z. Panagou

Fourier transform infrared spectroscopy (FT-IR) and multispectral imaging (MSI) were evaluated for the prediction of the microbiological quality of poultry meat via regression and classification models. Chicken thigh fillets (n = 402) were subjected to spoilage experiments at eight isothermal and two dynamic temperature profiles. Samples were analyzed microbiologically (total viable counts (TVCs) and Pseudomonas spp.), while MSI and FT-IR spectra were acquired simultaneously. The organoleptic quality of the samples was also evaluated by a sensory panel, establishing a TVC spoilage threshold at 6.99 log CFU/cm². Partial least squares regression (PLS-R) models were employed in the assessment of TVCs and Pseudomonas spp. counts on the chicken surface. Furthermore, classification models (linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs), and quadratic support vector machines (QSVMs)) were developed to discriminate the samples into two quality classes (fresh vs. spoiled). PLS-R models developed on MSI data predicted TVCs and Pseudomonas spp. counts satisfactorily, with root mean squared error (RMSE) values of 0.987 and 1.215 log CFU/cm², respectively. The SVM model coupled with MSI data exhibited the highest performance, with an overall accuracy of 94.4%, while in the case of FT-IR, improved classification was obtained with the QDA model (overall accuracy 71.4%). These results confirm the efficacy of MSI and FT-IR as rapid methods for assessing the quality of poultry products.
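
The two performance measures reported in this abstract, RMSE for the regression models and overall accuracy for the fresh vs. spoiled classes, can be sketched as follows. This is only an illustration of the metrics, not the authors' pipeline: the sample values are invented, and treating a count at or above the 6.99 threshold as "spoiled" is an assumption about how the boundary case is handled.

```python
import math

# Sensory-derived TVC spoilage threshold quoted in the abstract (log CFU/cm2).
SPOILAGE_THRESHOLD = 6.99


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted counts."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def quality_class(tvc):
    """Label a sample; counting the threshold itself as spoiled is an assumption."""
    return "spoiled" if tvc >= SPOILAGE_THRESHOLD else "fresh"


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical observed and predicted TVCs (log CFU/cm2), for illustration only.
observed = [4.2, 5.8, 6.5, 7.4, 8.1]
predicted = [4.9, 5.1, 7.2, 7.0, 8.6]

error = rmse(observed, predicted)
accuracy = overall_accuracy(
    [quality_class(v) for v in observed],
    [quality_class(v) for v in predicted],
)
print(f"RMSE = {error:.3f} log CFU/cm2, overall accuracy = {accuracy:.1%}")
```

In the study itself the classifiers were trained directly on the spectral data; deriving classes from predicted counts is used here only to keep the sketch short.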


Metabolites ◽  
2020 ◽  
Vol 10 (7) ◽  
pp. 278 ◽  
Author(s):  
Marta Bevilacqua ◽  
Rasmus Bro

In this paper, we discuss the validity of using score plots of component models such as partial least squares regression, especially when these models are used for building classification models, and models derived from partial least squares regression for discriminant analysis (PLS-DA). Using examples and simulations, it is shown that the currently accepted practice of showing score plots from calibration models may give misleading interpretations. It is suggested and shown that the problem can be solved by replacing the currently used calibrated score plots with cross-validated score plots.


Author(s):  
Brian Carnahan ◽  
Gérard Meyer ◽  
Lois-Ann Kuntz

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches - genetic programming and decision tree induction - were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.


2004 ◽  
Vol 1 (1) ◽  
pp. 143-161
Author(s):  
Maja Pohar ◽  
Mateja Blas ◽  
Sandra Turk

Two of the most widely used statistical methods for analyzing categorical outcome variables are linear discriminant analysis and logistic regression. While both are appropriate for the development of linear classification models, linear discriminant analysis makes more assumptions about the underlying data. Hence, it is assumed that logistic regression is the more flexible and more robust method in case of violations of these assumptions. In this paper we consider the problem of choosing between the two methods, and set some guidelines for proper choice. The comparison between the methods is based on several measures of predictive accuracy. The performance of the methods is studied by simulations. We start with an example where all the assumptions of the linear discriminant analysis are satisfied and observe the impact of changes regarding the sample size, covariance matrix, Mahalanobis distance and direction of distance between group means. Next, we compare the robustness of the methods towards categorisation and non-normality of explanatory variables in a closely controlled way. We show that the results of LDA and LR are close whenever the normality assumptions are not too badly violated, and set some guidelines for recognizing these situations. We discuss the inappropriateness of LDA in all other cases.
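
As a rough illustration of why the Mahalanobis distance between group means matters in such simulations: when the LDA assumptions hold exactly (two normal groups, a common covariance matrix) and the priors are equal, the best achievable misclassification rate is Φ(−Δ/2), where Δ is the Mahalanobis distance between the means. The sketch below computes this for two explanatory variables. The means and covariance matrix are hypothetical values, not settings taken from the paper.

```python
import math


def mahalanobis_2d(mean_a, mean_b, cov):
    """Mahalanobis distance between two 2-D group means under a common covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = mean_a[0] - mean_b[0]
    dy = mean_a[1] - mean_b[1]
    # Quadratic form v^T Sigma^{-1} v, using the closed-form 2x2 inverse.
    quad = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(quad)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def optimal_error_rate(delta):
    """Bayes error for two equal-prior normal groups with a shared covariance."""
    return normal_cdf(-delta / 2.0)


# Hypothetical simulation setting: unit variances, uncorrelated variables.
delta = mahalanobis_2d((0.0, 0.0), (2.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
print(f"Mahalanobis distance = {delta:.3f}, optimal error = {optimal_error_rate(delta):.4f}")
```

Increasing the distance between the means lowers this bound, so it is a natural quantity to vary when comparing how close LDA and logistic regression come to the best possible rule.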


Molecules ◽  
2019 ◽  
Vol 24 (8) ◽  
pp. 1550 ◽  
Author(s):  
Liang Xu ◽  
Wen Sun ◽  
Cui Wu ◽  
Yucui Ma ◽  
Zhimao Chao

Near infrared (NIR) spectroscopy with chemometric techniques was applied to discriminate the geographical origins of crude drugs (i.e., dried ripe fruits of Trichosanthes kirilowii) and prepared slices of Trichosanthis Fructus in this work. The crude drug samples (120 batches) from four growing regions (i.e., Shandong, Shanxi, Hebei, and Henan Provinces) were collected, dried, and used, and the prepared slice samples (30 batches) were purchased from different drug stores. The raw NIR spectra were acquired and preprocessed with multiplicative scatter correction (MSC). Principal component analysis (PCA) was used to extract relevant information from the spectral data and revealed visible clustering trends. Four different classification models, namely K-nearest neighbor (KNN), soft independent modeling of class analogy (SIMCA), partial least squares-discriminant analysis (PLS-DA), and support vector machine-discriminant analysis (SVM-DA), were constructed and their performances were compared. The corresponding classification model parameters were optimized by cross-validation (CV). Among the four classification models, the SVM-DA model was superior to the other models, with a classification accuracy of up to 100% for both the calibration set and the prediction set. The optimal SVM-DA model was achieved when C = 100, γ = 0.00316, and the number of principal components (PCs) = 6. The PLS-DA model had a classification accuracy of 95% for the calibration set and 98% for the prediction set. The KNN model had a classification accuracy of 92% for the calibration set and 94% for the prediction set. The non-linear classification method was superior to the linear ones. Generally, the results demonstrated that the crude drugs from different geographical origins, and the crude drugs and prepared slices of Trichosanthis Fructus, could be distinguished rapidly, nondestructively, and reliably by NIR spectroscopy coupled with the SVM-DA model.
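
The MSC preprocessing step mentioned above can be sketched in a few lines. This is the textbook form of multiplicative scatter correction, assuming the mean spectrum is used as the reference (the abstract does not say which reference the authors used): each spectrum is regressed on the reference, and the fitted offset and slope are then removed. The two short "spectra" are invented values.

```python
def mean_spectrum(spectra):
    """Column-wise mean of a list of equally long spectra."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum s is modelled as s = intercept + slope * reference
    (ordinary least squares) and corrected as (s - intercept) / slope.
    """
    ref = reference if reference is not None else mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    if ref_ss == 0:
        raise ValueError("reference spectrum is flat")

    corrected = []
    for spectrum in spectra:
        s_mean = sum(spectrum) / len(spectrum)
        slope = sum((r - ref_mean) * (s - s_mean) for r, s in zip(ref, spectrum)) / ref_ss
        if slope == 0:
            raise ValueError("spectrum is uncorrelated with the reference")
        intercept = s_mean - slope * ref_mean
        corrected.append([(s - intercept) / slope for s in spectrum])
    return corrected


# Hypothetical spectra: the second is the first with an offset and a scale change.
raw_spectra = [[1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]]
corrected = msc(raw_spectra)
print(corrected)
```

Because the two toy spectra differ only by an additive offset and a multiplicative factor, both collapse onto the mean spectrum after correction, which is the scatter effect MSC is meant to remove before PCA or classification.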


2016 ◽  
Vol 62 (2) ◽  
pp. 173-179
Author(s):  
V.Yu. Grigorev ◽  
S.L. Solodova ◽  
D.E. Polianczyk ◽  
O.A. Raevsky

Thirty-three classification models of the substrate specificity of 177 drugs to P-glycoprotein were created using linear discriminant analysis, random forest, and support vector machine methods. QSAR modeling was carried out using two strategies. The first strategy consisted of searching all possible combinations of 1 to 5 descriptors drawn from the 7 most significant molecular descriptors with a clear physico-chemical interpretation. The second strategy used a forward selection procedure of up to 5 descriptors, starting from the best single descriptor, and was applied to a set of 387 DRAGON descriptors. Only one of the 33 models was found to have the necessary statistical parameters. This model was built by linear discriminant analysis on the basis of a single H-bond descriptor (SCad). The model has good statistical characteristics, as evidenced by the results of both internal cross-validation and external validation on 44 new chemicals. This confirms the important role of hydrogen bonding in the processes connected with the penetration of chemical compounds through the blood-brain barrier.
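
To give a sense of the size of the first search strategy, the sketch below enumerates every subset of 1 to 5 descriptors drawn from a pool of 7. The descriptor names are placeholders, since the abstract names only SCad, and the scoring of each subset by a classifier is left out.

```python
from itertools import combinations

# Placeholder names; the abstract does not list the 7 descriptors.
descriptors = [f"descriptor_{i}" for i in range(1, 8)]


def candidate_subsets(names, max_size):
    """Yield every combination of 1..max_size names, smallest subsets first."""
    for size in range(1, max_size + 1):
        yield from combinations(names, size)


subsets = list(candidate_subsets(descriptors, 5))
print(len(subsets))  # 7 + 21 + 35 + 35 + 21 = 119 candidate descriptor sets
```

An exhaustive search of this kind is feasible for 7 descriptors but not for the 387 DRAGON descriptors, which is consistent with the abstract's use of forward selection for the larger set.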


2020 ◽  
Vol 36 (3) ◽  
pp. 257-270
Author(s):  
Jean Frederic Isingizwe Nturambirwe ◽  
Helene H Nieuwoudt ◽  
Willem Jacobus Perold ◽  
Umezuruike Linus Opara

Highlights
In the Emission Head (EH) configuration differences in apple bruise severity were well captured.
A good representation of new samples variability, in calibration, ensured robust quantitative PLS-DA models.
EH mode with PLS-DA is an attractive spectroscopic option for inline apple sorting based on bruise damage.

Abstract. Bruise damage in apples is very common and undesirable because it hinders consumer satisfaction and greatly contributes to food loss. Fast detection of bruise damage in fruit using spectroscopic systems is still problematic, especially in terms of quantitative and objective assessments of mechanical damage and standardization of bruise measurement method, among other issues. Non-destructive techniques among which is near infrared (NIR) spectroscopy are under development as a potential solution carrier for such issues. A study of bruise damage was conducted on three apple cultivars using Fourier transform (FT) near infrared spectroscopy in two configurations (‘emission head’ of Bruker’s Matrix-F and ‘integrating sphere’ of Bruker’s multipurpose analyzer, MPA). The emission head (EH) allows for contactless large sample (100 mm) exposure that simulates on-line applications, while the MPA (sample size: 22 mm) is commonly used for in-laboratory analysis of inhomogeneous material such as fruit. Bruise damages were mechanically induced in apples, bruise sizes measured physically and destructively. Partial least squares discriminant analysis (PLS-DA) was used to determine the differences captured by the scanning spectrometers in apple fruit tissues. Discriminant analysis revealed that in both sample acquisition modes, distinction between bruised and non-bruised apple fruit tissue was achieved with high (from 78% to 93%) accuracy of classification (ACcl) based solely on spectral data. The classification accuracy improved when individual cultivars were considered and ranged from 94% to 96%. Classification models were tested for robustness and showed that both cultivar and bruise severity had influence on classification models’ performance. The results showed ability of the emission head configuration in detecting bruises and differentiating between severity of bruises in apple fruit, thus making it a good candidate for use in rapid detection and quantitative assessment of bruising in apple on sorting lines. Possibilities for improving the classification model performance and ensuring their robustness for the EH were suggested.

Keywords: Apple bruise, Discriminant analysis, Model performance, Model threshold, NIR spectroscopy.

