Linear Discriminant Analysis for Prediction of Group Membership: A User-Friendly Primer

In psychology, researchers are often interested in the predictive classification of individuals. Various models exist for such a purpose, but which model is considered a best practice is conditional on attributes of the data. Under certain conditions, linear discriminant analysis (LDA) has been shown to perform better than other predictive methods, such as logistic regression, multinomial logistic regression, random forests, support-vector machines, and the K-nearest neighbor algorithm. The purpose of this Tutorial is to provide researchers who already have a basic level of statistical training with a general overview of LDA and an example of its implementation and interpretation. Decisions that must be made when conducting an LDA (e.g., prior specification, choice of cross-validation procedures) and methods of evaluating case classification (posterior probability, typicality probability) and overall classification (hit rate, Huberty’s I index) are discussed. LDA for prediction is described from a modern Bayesian perspective, as opposed to its original derivation. A step-by-step example of implementing and interpreting LDA results is provided. All analyses were conducted in R, and the script is provided; the data are available online.
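The overall classification measures named in this abstract (hit rate, Huberty's I index) can be illustrated with a small sketch. It assumes the usual definition I = (H_o - H_e) / (1 - H_e), where H_o is the observed hit rate and H_e the hit rate expected by proportional chance. The confusion matrix and the function names are hypothetical and are not taken from the Tutorial's R script.

```python
# Hypothetical confusion matrix: rows = actual group, columns = predicted group.
confusion = [
    [40, 10],  # group 0: 40 classified correctly, 10 misclassified
    [5, 45],   # group 1: 5 misclassified, 45 classified correctly
]

def hit_rate(confusion):
    """Observed hit rate H_o: correctly classified cases / all cases."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[g][g] for g in range(len(confusion)))
    return correct / total

def chance_hit_rate(confusion, priors=None):
    """Expected hit rate H_e under proportional chance: sum_g prior_g * n_g / N.

    If no priors are given, the observed group proportions are used
    (an assumption of this sketch).
    """
    sizes = [sum(row) for row in confusion]
    total = sum(sizes)
    if priors is None:
        priors = [n / total for n in sizes]
    return sum(p * n / total for p, n in zip(priors, sizes))

def huberty_i(confusion, priors=None):
    """Improvement over chance: (H_o - H_e) / (1 - H_e)."""
    h_o = hit_rate(confusion)
    h_e = chance_hit_rate(confusion, priors)
    return (h_o - h_e) / (1 - h_e)

print(f"hit rate = {hit_rate(confusion):.2f}")      # 0.85
print(f"Huberty's I = {huberty_i(confusion):.2f}")  # 0.70
```

Here the classifier is right 85% of the time, and chance alone would give 50%, so it removes about 70% of the errors that chance classification would make.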


Face Recognition Using the Linear Discriminant Analysis and k Nearest Neighbor Method (Pengenalan Wajah Menggunakan Metode Linear Discriminant Analysis dan k Nearest Neighbor)

Jurnal ULTIMATICS ◽

10.31937/ti.v9i1.557 ◽

2017 ◽

Vol 9 (1) ◽

pp. 1-9

Author(s):

Fandiansyah Fandiansyah ◽

Jayanti Yusmah Sari ◽

Ika Putri Ningrum

Keyword(s):

Feature Extraction ◽

Face Recognition ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Nearest Neighbor ◽

Experimental Result ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Nearest Neighbor Classifier ◽

Neighbor Classifier

Face recognition is one of the biometric systems most widely used for individual recognition in attendance machines or access control. This is because the face is the most visible part of human anatomy and serves as the first distinguishing factor of a human being. Feature extraction and classification are the key to face recognition, as they are to any pattern classification task. In this paper, we describe a face recognition method based on Linear Discriminant Analysis (LDA) and a k-Nearest Neighbor classifier. LDA is used for feature extraction; it directly extracts the proper features from image matrices with the objective of maximizing between-class variation and minimizing within-class variation. The features of a test image are compared with the features of the database images using the k-Nearest Neighbor classifier. The experiments in this paper were performed using 66 face images of 22 different people. The experimental results show that the recognition accuracy is up to 98.33%. Index Terms: face recognition, k nearest neighbor, linear discriminant analysis.
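The pipeline this abstract describes (LDA for feature extraction, then k-nearest-neighbor matching) can be sketched on a toy problem. The sketch below is not the authors' implementation: it assumes only two classes and two-dimensional inputs instead of image matrices, uses the classical Fisher direction w = S_w^{-1}(m_1 - m_0), and all data, labels, and function names are hypothetical.

```python
from collections import Counter

def mean(vectors):
    """Component-wise mean of a list of 2-D points."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]

def within_scatter(groups):
    """Within-class scatter matrix S_w (2x2), summed over all classes."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for vectors in groups:
        m = mean(vectors)
        for v in vectors:
            d = [v[0] - m[0], v[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """w = S_w^{-1} (m1 - m0): large between-class, small within-class spread."""
    s = within_scatter([class0, class1])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    m0, m1 = mean(class0), mean(class1)
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

def project(w, v):
    """LDA feature: scalar projection of a point onto the direction w."""
    return w[0] * v[0] + w[1] * v[1]

def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training features closest to the query."""
    ranked = sorted(zip(train_features, train_labels),
                    key=lambda pair: abs(pair[0] - query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Hypothetical "database" of two people, four samples each.
person_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
person_b = [(5.0, 6.0), (5.5, 5.8), (6.0, 6.2), (5.2, 6.5)]

w = fisher_direction(person_a, person_b)
features = [project(w, v) for v in person_a + person_b]
labels = ["person_a"] * len(person_a) + ["person_b"] * len(person_b)

print(knn_predict(features, labels, project(w, (1.3, 2.1))))  # person_a
print(knn_predict(features, labels, project(w, (5.4, 6.1))))  # person_b
```

With more than two classes, LDA yields several discriminant directions and the nearest-neighbor search runs in that reduced space; the two-class case is used here only to keep the sketch short.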


Prediction of the autism spectrum disorder diagnosis with linear discriminant analysis classifier and K-nearest neighbor in children

2018 6th International Symposium on Digital Forensic and Security (ISDFS) ◽

10.1109/isdfs.2018.8355354 ◽

2018 ◽

Cited By ~ 12

Author(s):

Osman Altay ◽

Mustafa Ulas

Keyword(s):

Autism Spectrum Disorder ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Nearest Neighbor ◽

Autism Spectrum ◽

Spectrum Disorder ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Autism Spectrum Disorder Diagnosis ◽

Linear Discriminant Analysis Classifier


Comparison of k-nearest neighbor, quadratic discriminant and linear discriminant analysis in classification of electromyogram signals based on the wrist-motion directions

Current Applied Physics ◽

10.1016/j.cap.2010.11.051 ◽

2011 ◽

Vol 11 (3) ◽

pp. 740-745 ◽

Cited By ~ 150

Author(s):

Kang Soo Kim ◽

Heung Ho Choi ◽

Chang Soo Moon ◽

Chi Woong Mun

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Wrist Motion


PLS Generalized Linear Regression and Kernel Multilogit Algorithm (KMA) for Microarray Data Classification Problem

Revista Colombiana de Estadística ◽

10.15446/rce.v43n2.81811 ◽

2020 ◽

Vol 43 (2) ◽

pp. 233-249

Author(s):

Adolphus Wagala ◽

Graciela González-Farías ◽

Rogelio Ramos ◽

Oscar Dalmau

Keyword(s):

Logistic Regression ◽

Discriminant Analysis ◽

Linear Regression ◽

Linear Discriminant Analysis ◽

Least Squares ◽

Partial Least Squares ◽

Classification Error ◽

Support Vector ◽

Linear Discriminant ◽

Generalized Linear Regression

This study implements extensions of partial least squares generalized linear regression (PLSGLR) by combining it with logistic regression and linear discriminant analysis, yielding a partial least squares generalized linear regression-logistic regression model (PLSGLR-log) and a partial least squares generalized linear regression-linear discriminant analysis model (PLSGLRDA). The resulting classifiers are then compared with classical methods such as k-nearest neighbours (KNN), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), ridge partial least squares (RPLS), and support vector machines (SVM). Furthermore, a new methodology known as the kernel multilogit algorithm (KMA) is implemented and its performance compared with that of the other classifiers. The KMA emerged as the best classifier, with the lowest classification error rates, on both types of data considered: unpreprocessed and preprocessed.


Protein sub-cellular localization based on noise-intensity-weighted linear discriminant analysis and an improved k-nearest-neighbor classifier

2016 9th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) ◽

10.1109/cisp-bmei.2016.7853022 ◽

2016 ◽

Cited By ~ 2

Author(s):

Zhenfeng Lei ◽

Shunfang Wang ◽

Dongshu Xu

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Noise Intensity ◽

Cellular Localization ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Nearest Neighbor Classifier ◽

Neighbor Classifier


New approaches for predicting outcomes in patients with atrial fibrillation

Journal of arrhythmology ◽

10.35336/va-2020-e-28-32 ◽

2020 ◽

Vol 27 ◽

pp. 28-32

Author(s):

N. A. Novikova ◽

M. Yu. Gilyarov ◽

A. Yu. Suvorov ◽

A. Yu. Kuchina

Keyword(s):

Neural Network ◽

Machine Learning ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Test Group ◽

Learning Technologies ◽

Support Vector ◽

K Nearest Neighbor ◽

Cardiovascular Risks ◽

Linear Discriminant

Aim. To assess the capabilities of “machine learning” methods in predicting remote outcomes in patients with non-valvular atrial fibrillation (AF). Methods. From 2015 to 2016, 234 patients with non-valvular AF were included in the study (median age 72 (65; 79) years; 50.0% men). During the median follow-up of 2.9 (2.7; 3.2) years, 42 patients died, 9 patients had non-fatal acute cerebral circulatory disorders, and 3 patients had non-fatal myocardial infarction (MI). These events in 52 subjects (22.2% of all patients included) were combined into a composite endpoint (death or a non-fatal cardiovascular accident at the stage of remote observation). The first 184 patients comprised the “training” group. The next 50 patients formed the “test” group. The following “machine learning” methods were used in the analysis: classification trees, linear discriminant analysis, the k-nearest neighbor method, the support vector method, and a neural network. Results. Long-term outcomes were influenced by age, known traditional risk factors for cardiovascular diseases, the presence of these diseases, changes in intracardiac hemodynamics and heart chambers as evaluated by echocardiography, the presence of concomitant anemia, advanced stages of chronic kidney disease, and the administration of drugs associated with a more severe cardiovascular disease progression (amiodarone, digoxin). The best prognosis was obtained using the linear discriminant analysis model, the complex neural network model, and the support vector machine. Conclusion. Modern methods aimed at prognosis estimation seem to be of importance in cardiology. These methods include big data analysis and machine learning technologies. The methods require further evaluation and confirmation, and in the future they may allow correcting cardiovascular risks, using data from real clinical practice and evidence-based medicine at the same time.


NEW METHODS FOR PREDICTING OUTCOMES AND COMPLICATIONS IN PATIENTS WITH ATRIAL FIBRILLATION

Journal of arrhythmology ◽

10.35336/va-2019-2-49-50 ◽

2019 ◽

Vol 26 (2(96)) ◽

pp. 45-50

Author(s):

N. A. Novikova ◽

M. Yu. Gilyarov ◽

A. Yu. Suvorov ◽

A. Yu. Kuchina

Keyword(s):

Neural Network ◽

Machine Learning ◽

Atrial Fibrillation ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Test Group ◽

Learning Technologies ◽

Support Vector ◽

K Nearest Neighbor ◽

Linear Discriminant

Aim. To assess the capabilities of “machine learning” methods in predicting remote outcomes in patients with non-valvular atrial fibrillation (AF). Methods. From 2015 to 2016, 234 patients with non-valvular AF were included in the study (median age 72 (65; 79) years; 50.0% men). During the median follow-up of 2.9 (2.7; 3.2) years, 42 patients died, 9 patients had non-fatal acute cerebral circulatory disorders, and 3 patients had non-fatal myocardial infarction (MI). These events in 52 subjects (22.2% of all patients included) were combined into a composite endpoint (death or a non-fatal cardiovascular accident at the stage of remote observation). The first 184 patients comprised the “training” group. The next 50 patients formed the “test” group. The following “machine learning” methods were used in the analysis: classification trees, linear discriminant analysis, the k-nearest neighbor method, the support vector method, and a neural network. Results. Long-term outcomes were influenced by age, known traditional risk factors for cardiovascular diseases, the presence of these diseases, changes in intracardiac hemodynamics and heart chambers as evaluated by echocardiography, the presence of concomitant anemia, advanced stages of chronic kidney disease, and the administration of drugs associated with a more severe cardiovascular disease progression (amiodarone, digoxin). The best prognosis was obtained using the linear discriminant analysis model, the complex neural network model, and the support vector machine. Conclusion. Modern methods aimed at prognosis estimation seem to be of great potential for cardiology. These methods include big data analysis and machine learning technologies. The methods require further evaluation and confirmation, and in the future they may allow correcting cardiovascular risks, using data from real clinical practice and evidence-based medicine at the same time.


Comparisons of classification methods for viral genomes and protein families using alignment-free vectorization

Statistical Applications in Genetics and Molecular Biology ◽

10.1515/sagmb-2018-0004 ◽

2018 ◽

Vol 17 (4) ◽

Author(s):

Hsin-Hsiung Huang ◽

Shuai Hao ◽

Saul Alarcon ◽

Jie Yang

Keyword(s):

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Nearest Neighbor ◽

Genomic Analysis ◽

Support Vector ◽

Comparative Genomic ◽

Classification Methods ◽

Statistical Classification ◽

Linear Discriminant ◽

Viral Genomes

In this paper, we propose a statistical classification method based on discriminant analysis using the first and second moments of positions of each nucleotide of the genome sequences as features, and compare its performance with other classification methods as well as with the natural vector approach for comparative genomic analysis. We examine the normality of the proposed features. The statistical classification models used include linear discriminant analysis, quadratic discriminant analysis, diagonal linear discriminant analysis, the k-nearest-neighbor classifier, logistic regression, support vector machines, and classification trees. All these classifiers are tested on a viral genome dataset and a protein dataset for predicting viral Baltimore labels, viral family labels, and protein family labels.
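The alignment-free features mentioned in this abstract can be illustrated roughly as follows. The sketch assumes that the "first and second moments of positions" are the mean position and the central second moment (variance) of the 1-based positions of each nucleotide; the abstract does not state the exact definition or any normalization, so this is an assumption, and the function name is hypothetical.

```python
def position_moment_features(sequence):
    """Return [mean_A, var_A, mean_C, var_C, mean_G, var_G, mean_T, var_T].

    For each nucleotide, the mean and the (population) variance of its
    1-based positions in the sequence. A nucleotide that never occurs
    contributes (0.0, 0.0), which is a convention chosen for this sketch.
    """
    sequence = sequence.upper()
    features = []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(sequence) if ch == base]
        n = len(positions)
        if n == 0:
            features.extend([0.0, 0.0])
            continue
        mean_pos = sum(positions) / n
        var_pos = sum((p - mean_pos) ** 2 for p in positions) / n
        features.extend([mean_pos, var_pos])
    return features

# A sits at positions 1 and 5 (mean 3, variance 4), C at 2 and 6,
# G only at 3, T only at 4.
print(position_moment_features("ACGTAC"))
# [3.0, 4.0, 4.0, 4.0, 3.0, 0.0, 4.0, 0.0]
```

Every sequence maps to a vector of the same length regardless of genome size, which is what lets classifiers such as LDA or k-nearest neighbor be applied without aligning the sequences first.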


Statistical sex determination from craniometrics: Comparison of linear discriminant analysis, logistic regression, and support vector machines

Forensic Science International ◽

10.1016/j.forsciint.2014.10.010 ◽

2014 ◽

Vol 245 ◽

pp. 204.e1-204.e8 ◽

Cited By ~ 34

Author(s):

Frédéric Santos ◽

Pierre Guyomarc’h ◽

Jaroslav Bruzek

Keyword(s):

Logistic Regression ◽

Support Vector Machines ◽

Discriminant Analysis ◽

Sex Determination ◽

Linear Discriminant Analysis ◽

Support Vector ◽

Linear Discriminant ◽

Vector Machines


Crime Detection with ICA and Artificial Intelligent Approach

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.816-817.616 ◽

2013 ◽

Vol 816-817 ◽

pp. 616-622

Author(s):

Ahmad Kadri Junoh ◽

Muhammad Naufal Mansor ◽

Alezar Mat Ya'acob ◽

Farah Adibah Adnan ◽

Syafawati Ab. Saad ◽

...

Keyword(s):

Discriminant Analysis ◽

Independent Component Analysis ◽

Linear Discriminant Analysis ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Violent Crimes ◽

Artificial Intelligent ◽

Linear Discriminant ◽

Crime Detection ◽

Intelligent Approach

The Rise of Crime in Malaysia reported that violent crimes comprised only 10% of reported crimes each year and that the majority of crimes, 90%, were classified as property crimes. However, the ratio of police to population in Malaysia is 3.6 officers per 1,000 citizens. Such manpower ratios alone are not a comprehensive measure of crime-fighting capability. We therefore propose an artificial intelligence technique to determine the behaviour of a burglar using Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), and a k Nearest Neighbor (k-NN) classifier. The system shows promise as a supplementary monitoring tool for the Malaysian police force.
