A Method for Selecting Between Linear and Quadratic Classification Models in Discriminant Analysis

Fourier transform infrared spectroscopy (FT-IR) and multispectral imaging (MSI) were evaluated for the prediction of the microbiological quality of poultry meat via regression and classification models. Chicken thigh fillets (n = 402) were subjected to spoilage experiments at eight isothermal and two dynamic temperature profiles. Samples were analyzed microbiologically (total viable counts (TVCs) and Pseudomonas spp.), while MSI and FT-IR spectra were acquired simultaneously. The organoleptic quality of the samples was also evaluated by a sensory panel, establishing a TVC spoilage threshold at 6.99 log CFU/cm². Partial least squares regression (PLS-R) models were employed to assess TVCs and Pseudomonas spp. counts on the chicken surface. Furthermore, classification models (linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machines (SVMs), and quadratic support vector machines (QSVMs)) were developed to discriminate the samples into two quality classes (fresh vs. spoiled). PLS-R models developed on MSI data predicted TVCs and Pseudomonas spp. counts satisfactorily, with root mean squared error (RMSE) values of 0.987 and 1.215 log CFU/cm², respectively. The SVM model coupled to MSI data exhibited the highest performance, with an overall accuracy of 94.4%, while in the case of FT-IR, improved classification was obtained with the QDA model (overall accuracy 71.4%). These results confirm the efficacy of MSI and FT-IR as rapid methods to assess the quality of poultry products.
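The two figures of merit quoted in this abstract, RMSE for the regression models and overall accuracy for the fresh vs. spoiled classifiers, can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not data from the study; treating counts at or above the 6.99 threshold as "spoiled" is also an assumption, since the abstract does not state how the boundary value itself is assigned.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def to_class(tvc, threshold=6.99):
    """Map a TVC value (log CFU/cm2) to a quality class.

    Assumption: values at or above the threshold count as spoiled.
    """
    return "spoiled" if tvc >= threshold else "fresh"

def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted class matches the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

# Made-up illustrative TVC values (log CFU/cm2), not measurements from the paper.
observed = [4.0, 6.0, 7.5, 8.0]
predicted = [5.0, 6.0, 6.5, 8.0]

error = rmse(observed, predicted)
accuracy = overall_accuracy(
    [to_class(v) for v in observed],
    [to_class(v) for v in predicted],
)
print(f"RMSE = {error:.3f} log CFU/cm2, overall accuracy = {accuracy:.1%}")
```

Note that a regression model can have a moderate RMSE and still misclassify samples whose true counts lie close to the threshold, which is one reason the two metrics are reported separately.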


Can We Trust Score Plots?

Metabolites, 2020, Vol 10 (7), pp. 278. DOI: 10.3390/metabo10070278. Cited By ~ 1

Author(s): Marta Bevilacqua, Rasmus Bro

Keyword(s): Discriminant Analysis, Least Squares, Partial Least Squares, Partial Least Squares Regression, Classification Models, Least Squares Regression, Calibration Models, Accepted Practice, Component Models

In this paper, we discuss the validity of using score plots from component models such as partial least squares regression and from models derived from it for discriminant analysis (PLS-DA), especially when these models are used to build classification models. Using examples and simulations, it is shown that the currently accepted practice of showing score plots from calibration models may give misleading interpretations. It is suggested and shown that the problem can be solved by replacing the currently used calibrated score plots with cross-validated score plots.
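The difference between calibrated and cross-validated scores can be sketched in a few lines. The code is a simplified illustration and not the authors' procedure: it uses only the first PLS component (whose weight vector is proportional to X'y for centered data), leave-one-out cross-validation, and pure-noise data with arbitrary class labels. All function names and the simulated data are assumptions made for the example.

```python
import math
import random

def first_pls_weights(X, y):
    """Column means of X and the unit-length first PLS weight vector (w ~ Xc'yc)."""
    n = len(X)
    x_means = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    w = [0.0] * len(x_means)
    for row, target in zip(X, y):
        for j, value in enumerate(row):
            w[j] += (value - x_means[j]) * (target - y_mean)
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("weight vector is undefined when X'y is zero")
    return x_means, [v / norm for v in w]

def project(row, x_means, w):
    """Score of one sample: centered row projected onto the weight vector."""
    return sum((v - m) * wj for v, m, wj in zip(row, x_means, w))

def calibrated_scores(X, y):
    """Scores of every sample under a model fitted on all samples."""
    x_means, w = first_pls_weights(X, y)
    return [project(row, x_means, w) for row in X]

def cross_validated_scores(X, y):
    """Leave-one-out scores: each sample is projected by a model that never saw it."""
    scores = []
    for i in range(len(X)):
        x_means, w = first_pls_weights(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(project(X[i], x_means, w))
    return scores

def class_gap(scores, y):
    """Mean score of the +1 class minus mean score of the -1 class."""
    pos = [s for s, label in zip(scores, y) if label > 0]
    neg = [s for s, label in zip(scores, y) if label < 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

# Pure noise: 20 samples, 1000 variables, labels unrelated to the data.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(20)]
y = [1.0 if i % 2 == 0 else -1.0 for i in range(20)]

cal = calibrated_scores(X, y)
cv = cross_validated_scores(X, y)
print("calibrated class gap:     ", round(class_gap(cal, y), 2))
print("cross-validated class gap:", round(class_gap(cv, y), 2))
```

With many more variables than samples, the calibrated scores separate the two arbitrary classes even though the data contain no class information, whereas the cross-validated scores show a far smaller gap. Later components and rotation or sign alignment between cross-validation segments are deliberately left out of this sketch.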


Comparing Statistical and Machine Learning Classifiers: Alternatives for Predictive Modeling in Human Factors Research

Human Factors: The Journal of the Human Factors and Ergonomics Society, 2003, Vol 45 (3), pp. 408-423. DOI: 10.1518/hfes.45.3.408.27248. Cited By ~ 6

Author(s): Brian Carnahan, Gérard Meyer, Lois-Ann Kuntz

Keyword(s): Machine Learning, Logistic Regression, Discriminant Analysis, Human Factors, Predictive Accuracy, Performance Outcomes, Learning Approaches, Classification Models, Machine Learning Classification, Human Factors Research

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches - genetic programming and decision tree induction - were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.


Comparison of logistic regression and linear discriminant analysis

Advances in Methodology and Statistics, 2004, Vol 1 (1), pp. 143-161. DOI: 10.51936/ayrt6204

Author(s): Maja Pohar, Mateja Blas, Sandra Turk

Keyword(s): Logistic Regression, Discriminant Analysis, Linear Discriminant Analysis, Predictive Accuracy, Classification Models, Linear Classification, Robust Method, Linear Discriminant, Explanatory Variables, The Impact

Two of the most widely used statistical methods for analyzing categorical outcome variables are linear discriminant analysis and logistic regression. While both are appropriate for the development of linear classification models, linear discriminant analysis makes more assumptions about the underlying data. Hence, it is assumed that logistic regression is the more flexible and more robust method in case of violations of these assumptions. In this paper we consider the problem of choosing between the two methods, and set some guidelines for proper choice. The comparison between the methods is based on several measures of predictive accuracy. The performance of the methods is studied by simulations. We start with an example where all the assumptions of the linear discriminant analysis are satisfied and observe the impact of changes regarding the sample size, covariance matrix, Mahalanobis distance and direction of distance between group means. Next, we compare the robustness of the methods towards categorisation and non-normality of explanatory variables in a closely controlled way. We show that the results of LDA and LR are close whenever the normality assumptions are not too badly violated, and set some guidelines for recognizing these situations. We discuss the inappropriateness of LDA in all other cases.
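The simulation idea in this abstract can be sketched for the simplest possible case: one explanatory variable, two Gaussian classes with equal variance, so that the LDA assumptions hold. The sketch is an illustration under those assumptions and not the authors' simulation design; the function names, sample sizes, class means, and the plain gradient-descent fit for logistic regression are all choices made for the example.

```python
import math
import random

def fit_lda_1d(x, y):
    """Two-class LDA in one dimension with a pooled variance estimate.

    Returns (slope, intercept) of the log-odds function slope * x + intercept.
    """
    x0 = [v for v, c in zip(x, y) if c == 0]
    x1 = [v for v, c in zip(x, y) if c == 1]
    m0, m1 = sum(x0) / len(x0), sum(x1) / len(x1)
    ss = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
    var = ss / (len(x) - 2)
    slope = (m1 - m0) / var
    intercept = -(m1 ** 2 - m0 ** 2) / (2 * var) + math.log(len(x1) / len(x0))
    return slope, intercept

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic_1d(x, y, lr=0.5, steps=3000):
    """Logistic regression fitted by batch gradient descent on the mean log-loss."""
    slope, intercept = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g_slope = g_intercept = 0.0
        for v, c in zip(x, y):
            err = sigmoid(slope * v + intercept) - c
            g_slope += err * v
            g_intercept += err
        slope -= lr * g_slope / n
        intercept -= lr * g_intercept / n
    return slope, intercept

def accuracy(model, x, y):
    """Share of samples classified correctly by the sign of the log-odds."""
    slope, intercept = model
    hits = sum((slope * v + intercept > 0) == (c == 1) for v, c in zip(x, y))
    return hits / len(x)

def simulate(rng, n_per_class, mean0=0.0, mean1=2.0):
    """Draw n_per_class samples from N(mean0, 1) and from N(mean1, 1)."""
    x = [rng.gauss(mean0, 1.0) for _ in range(n_per_class)]
    x += [rng.gauss(mean1, 1.0) for _ in range(n_per_class)]
    y = [0] * n_per_class + [1] * n_per_class
    return x, y

rng = random.Random(1)
x_train, y_train = simulate(rng, 200)
x_test, y_test = simulate(rng, 200)

lda = fit_lda_1d(x_train, y_train)
logit = fit_logistic_1d(x_train, y_train)
acc_lda = accuracy(lda, x_test, y_test)
acc_lr = accuracy(logit, x_test, y_test)
print(f"LDA accuracy: {acc_lda:.3f}, logistic regression accuracy: {acc_lr:.3f}")
```

Both methods estimate a linear log-odds function here, so their decision thresholds and test accuracies should be close. Replacing the Gaussian draws in `simulate` with a skewed or categorised variable is one way to probe the robustness question the abstract raises.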


Discrimination of Trichosanthis Fructus from Different Geographical Origins Using Near Infrared Spectroscopy Coupled with Chemometric Techniques

Molecules, 2019, Vol 24 (8), pp. 1550. DOI: 10.3390/molecules24081550. Cited By ~ 3

Author(s): Liang Xu, Wen Sun, Cui Wu, Yucui Ma, Zhimao Chao

Keyword(s): Discriminant Analysis, Classification Accuracy, Near Infrared, NIR Spectroscopy, Classification Model, Support Vector, Model Parameters, Classification Models, Crude Drugs, Chemometric Techniques

Near infrared (NIR) spectroscopy with chemometric techniques was applied to discriminate the geographical origins of crude drugs (i.e., dried ripe fruits of Trichosanthes kirilowii) and prepared slices of Trichosanthis Fructus in this work. The crude drug samples (120 batches) from four growing regions (i.e., Shandong, Shanxi, Hebei, and Henan Provinces) were collected, dried, and used, and the prepared slice samples (30 batches) were purchased from different drug stores. The raw NIR spectra were acquired and preprocessed with multiplicative scatter correction (MSC). Principal component analysis (PCA) was used to extract relevant information from the spectral data and gave visible cluster trends. Four different classification models, namely K-nearest neighbor (KNN), soft independent modeling of class analogy (SIMCA), partial least squares-discriminant analysis (PLS-DA), and support vector machine-discriminant analysis (SVM-DA), were constructed and their performances were compared. The corresponding classification model parameters were optimized by cross-validation (CV). Among the four classification models, the SVM-DA model was superior to the others, with a classification accuracy of up to 100% for both the calibration set and the prediction set. The optimal SVM-DA model was achieved when C = 100, γ = 0.00316, and the number of principal components (PCs) = 6. The PLS-DA model had a classification accuracy of 95% for the calibration set and 98% for the prediction set. The KNN model had a classification accuracy of 92% for the calibration set and 94% for the prediction set. The non-linear classification method was superior to the linear ones. Overall, the results demonstrated that the crude drugs from different geographical origins, as well as the crude drugs and prepared slices of Trichosanthis Fructus, could be distinguished rapidly, nondestructively, and reliably by NIR spectroscopy coupled with the SVM-DA model.
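The reported γ is the kind of parameter that controls a radial basis function (RBF) kernel. The abstract does not name the kernel, so treating the SVM-DA model as RBF-based is an assumption here, and the two six-component score vectors are made up for illustration. A minimal sketch of how γ scales the similarity between two samples:

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel value exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    if len(x) != len(z):
        raise ValueError("vectors must have the same length")
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

GAMMA = 0.00316  # value quoted in the abstract

# Hypothetical PCA score vectors with 6 components each (not data from the study).
sample_a = [1.2, -0.4, 0.3, 0.0, 2.1, -1.5]
sample_b = [0.8, 0.1, -0.2, 0.5, 1.9, -1.0]

similarity = rbf_kernel(sample_a, sample_b, GAMMA)
print(f"kernel similarity: {similarity:.4f}")
```

A small γ such as this one makes the kernel decay slowly with distance, so even fairly distant samples keep a similarity close to 1 and the resulting decision boundary is comparatively smooth. The penalty parameter C, which trades margin width against training errors, does not enter the kernel itself.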


Classification models of structure - P-glycoprotein activity of drugs

Biomeditsinskaya Khimiya, 2016, Vol 62 (2), pp. 173-179. DOI: 10.18097/pbmc20166202173

Author(s): V.Yu. Grigorev, S.L. Solodova, D.E. Polianczyk, O.A. Raevsky

Keyword(s): Discriminant Analysis, Linear Discriminant Analysis, External Validation, Statistical Characteristics, Support Vector, Classification Models, QSAR Modeling, Linear Discriminant, P Glycoprotein, Dragon Descriptors

Thirty-three classification models of the substrate specificity of 177 drugs to P-glycoprotein were created using linear discriminant analysis, random forest, and support vector machine methods. QSAR modeling was carried out using two strategies. The first strategy consisted of a search of all possible combinations of 1 to 5 descriptors drawn from the 7 most significant molecular descriptors with a clear physico-chemical interpretation. In the second, a forward selection procedure of up to 5 descriptors, starting from the best single descriptor, was used. This strategy was applied to a set of 387 DRAGON descriptors. It was found that only one of the 33 models has the necessary statistical parameters. This model was built by linear discriminant analysis on the basis of a single H-bond descriptor (SCad). The model has good statistical characteristics, as evidenced by the results of both internal cross-validation and external validation with 44 new chemicals. This confirms the important role of hydrogen bonding in the processes connected with the penetration of chemical compounds through the blood-brain barrier.
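The size of the search space in the first strategy can be worked out directly, reading the range in the abstract as subsets of 1 to 5 descriptors chosen from 7 (an interpretation, since the range symbol in the source is garbled). The descriptor names below are placeholders, not the descriptors used in the paper.

```python
from itertools import combinations
from math import comb

# Placeholder names for the 7 most significant descriptors (hypothetical).
descriptors = [f"descriptor_{i}" for i in range(1, 8)]

# Every subset containing between 1 and 5 of the 7 descriptors.
subsets = [
    subset
    for size in range(1, 6)
    for subset in combinations(descriptors, size)
]

# Cross-check against the closed-form count: C(7,1) + ... + C(7,5).
expected = sum(comb(len(descriptors), size) for size in range(1, 6))
print(len(subsets), expected)
```

The count is 7 + 21 + 35 + 35 + 21 = 119 candidate descriptor sets, small enough for an exhaustive search. The same exhaustive approach would be impractical for the 387 DRAGON descriptors, which is consistent with the forward selection used in the second strategy.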


Discriminant Analysis and Other Linear Classification Models

Applied Predictive Modeling, 2013, pp. 275-328. DOI: 10.1007/978-1-4614-6849-3_12. Cited By ~ 9

Author(s): Max Kuhn, Kjell Johnson

Keyword(s): Discriminant Analysis, Classification Models, Linear Classification


Uncertainty estimation and misclassification probability for classification models based on discriminant analysis and support vector machines

Analytica Chimica Acta, 2019, Vol 1063, pp. 40-46. DOI: 10.1016/j.aca.2018.09.022. Cited By ~ 11

Author(s): Camilo L.M. Morais, Kássio M.G. Lima, Francis L. Martin

Keyword(s): Support Vector Machines, Discriminant Analysis, Uncertainty Estimation, Support Vector, Classification Models, Vector Machines, Misclassification Probability


Detecting Bruise Damage and Level of Severity in Apples Using a Contactless NIR Spectrometer

Applied Engineering in Agriculture, 2020, Vol 36 (3), pp. 257-270. DOI: 10.13031/aea.13218

Author(s): Jean Frederic Isingizwe Nturambirwe, Helene H Nieuwoudt, Willem Jacobus Perold, Umezuruike Linus Opara

Keyword(s): Discriminant Analysis, Near Infrared, Model Performance, NIR Spectroscopy, Apple Fruit, Performance Model, Classification Model, Good Representation, List Type, Classification Models

Highlights. In the Emission Head (EH) configuration, differences in apple bruise severity were well captured. A good representation of the variability of new samples in calibration ensured robust quantitative PLS-DA models. EH mode with PLS-DA is an attractive spectroscopic option for inline apple sorting based on bruise damage.

Abstract. Bruise damage in apples is very common and undesirable because it hinders consumer satisfaction and greatly contributes to food loss. Fast detection of bruise damage in fruit using spectroscopic systems is still problematic, especially in terms of quantitative and objective assessment of mechanical damage and standardization of the bruise measurement method, among other issues. Non-destructive techniques, among them near infrared (NIR) spectroscopy, are under development as a potential solution to such issues. A study of bruise damage was conducted on three apple cultivars using Fourier transform (FT) near infrared spectroscopy in two configurations (‘emission head’ of Bruker’s Matrix-F and ‘integrating sphere’ of Bruker’s multipurpose analyzer, MPA). The emission head (EH) allows for contactless large sample (100 mm) exposure that simulates on-line applications, while the MPA (sample size: 22 mm) is commonly used for in-laboratory analysis of inhomogeneous material such as fruit. Bruise damage was mechanically induced in apples, and bruise sizes were measured physically and destructively. Partial least squares discriminant analysis (PLS-DA) was used to determine the differences captured by the scanning spectrometers in apple fruit tissues. Discriminant analysis revealed that in both sample acquisition modes, distinction between bruised and non-bruised apple fruit tissue was achieved with high (from 78% to 93%) accuracy of classification (ACcl) based solely on spectral data. The classification accuracy improved when individual cultivars were considered and ranged from 94% to 96%. Classification models were tested for robustness, and the tests showed that both cultivar and bruise severity influenced the performance of the classification models. The results showed the ability of the emission head configuration to detect bruises and to differentiate between severities of bruises in apple fruit, thus making it a good candidate for use in rapid detection and quantitative assessment of bruising in apples on sorting lines. Possibilities for improving the performance of the classification models and ensuring their robustness for the EH were suggested.

Keywords: Apple bruise, Discriminant analysis, Model performance, Model threshold, NIR spectroscopy.
