Linear Discriminant Analysis for Prediction of Group Membership: A User-Friendly Primer

2019 ◽  
Vol 2 (3) ◽  
pp. 250-263 ◽  
Author(s):  
Peter Boedeker ◽  
Nathan T. Kearns

In psychology, researchers are often interested in the predictive classification of individuals. Various models exist for such a purpose, but which model is considered a best practice is conditional on attributes of the data. Under certain conditions, linear discriminant analysis (LDA) has been shown to perform better than other predictive methods, such as logistic regression, multinomial logistic regression, random forests, support-vector machines, and the K-nearest neighbor algorithm. The purpose of this Tutorial is to provide researchers who already have a basic level of statistical training with a general overview of LDA and an example of its implementation and interpretation. Decisions that must be made when conducting an LDA (e.g., prior specification, choice of cross-validation procedures) and methods of evaluating case classification (posterior probability, typicality probability) and overall classification (hit rate, Huberty’s I index) are discussed. LDA for prediction is described from a modern Bayesian perspective, as opposed to its original derivation. A step-by-step example of implementing and interpreting LDA results is provided. All analyses were conducted in R, and the script is provided; the data are available online.
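The quantities named in this abstract (priors, posterior probabilities, hit rate, Huberty's I index) can be illustrated with a minimal sketch. This is not the authors' R script: it assumes a single predictor, normally distributed groups with a shared variance, priors proportional to group size, and made-up data. All function names are hypothetical. The hit rate shown is a resubstitution rate (computed on the training cases), not a cross-validated one.

```python
import math
from statistics import mean


def fit_lda_1d(xs, labels, priors=None):
    """Estimate class means, pooled variance, and priors for one predictor."""
    classes = sorted(set(labels))
    groups = {c: [x for x, y in zip(xs, labels) if y == c] for c in classes}
    means = {c: mean(g) for c, g in groups.items()}
    # Pooled within-class variance: LDA assumes every class shares it.
    ss = sum((x - means[c]) ** 2 for c, g in groups.items() for x in g)
    pooled_var = ss / (len(xs) - len(classes))
    if priors is None:
        # Assumption: priors proportional to observed group sizes.
        priors = {c: len(g) / len(xs) for c, g in groups.items()}
    return means, pooled_var, priors


def posterior(x, means, pooled_var, priors):
    """Posterior probability of each class for one case, via Bayes' rule."""
    # Normal density up to a constant that is the same for every class.
    weighted = {
        c: priors[c] * math.exp(-((x - m) ** 2) / (2 * pooled_var))
        for c, m in means.items()
    }
    total = sum(weighted.values())
    return {c: w / total for c, w in weighted.items()}


def classify(x, means, pooled_var, priors):
    """Assign the case to the class with the largest posterior."""
    post = posterior(x, means, pooled_var, priors)
    return max(post, key=post.get)


def hit_rate(predicted, actual):
    """Proportion of cases classified into their true group."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def huberty_i(observed_hit_rate, chance_hit_rate):
    """Improvement over chance: (H_o - H_e) / (1 - H_e)."""
    return (observed_hit_rate - chance_hit_rate) / (1 - chance_hit_rate)


# Made-up illustrative data: one predictor, two equally sized groups.
xs = [1.0, 1.5, 2.0, 2.5, 4.0, 4.5, 5.0, 5.5]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

means, pooled_var, priors = fit_lda_1d(xs, labels)
predicted = [classify(x, means, pooled_var, priors) for x in xs]
h_o = hit_rate(predicted, labels)
# Chance hit rate under the proportional chance criterion: sum of squared priors.
h_e = sum(p ** 2 for p in priors.values())
print("hit rate:", h_o, "Huberty's I:", huberty_i(h_o, h_e))
```

Passing a different `priors` dictionary to `fit_lda_1d` shifts the classification boundary toward the group with the smaller prior, which is one reason prior specification is among the decisions the abstract highlights.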

2017 ◽  
Vol 9 (1) ◽  
pp. 1-9
Author(s):  
Fandiansyah Fandiansyah ◽  
Jayanti Yusmah Sari ◽  
Ika Putri Ningrum

Face recognition is one of the biometric systems most widely used for individual recognition in attendance machines and access control. This is because the face is the most visible part of human anatomy and serves as the first distinguishing factor of a human being. Feature extraction and classification are the key to face recognition, as they are to any pattern classification task. In this paper, we describe a face recognition method based on Linear Discriminant Analysis (LDA) and a k-Nearest Neighbor classifier. LDA is used for feature extraction; it directly extracts the proper features from image matrices with the objective of maximizing between-class variation and minimizing within-class variation. The features of a test image are compared with the features of the database images using the k-Nearest Neighbor classifier. The experiments in this paper are performed using 66 face images of 22 different people. The experimental results show that the recognition accuracy is up to 98.33%. Index Terms: face recognition, k nearest neighbor, linear discriminant analysis.
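The two-stage idea in this abstract, LDA for feature extraction followed by k-nearest-neighbor matching, can be sketched on toy data. The example below is an assumption-laden illustration and not the authors' method: it uses two classes of made-up 2-D points instead of face images, the classical two-class Fisher direction w = Sw^{-1}(m1 - m0), and hypothetical function names.

```python
from collections import Counter


def mean_vec(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def fisher_direction(class0, class1):
    """Two-class Fisher direction w = Sw^{-1} (m1 - m0) for 2-D points."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    # Within-class scatter matrix Sw = [[a, b], [b, d]], summed over classes.
    a = b = d = 0.0
    for pts, m in ((class0, m0), (class1, m1)):
        for x, y in pts:
            dx, dy = x - m[0], y - m[1]
            a += dx * dx
            b += dx * dy
            d += dy * dy
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = (m1[0] - m0[0], m1[1] - m0[1])
    # Closed-form 2x2 inverse applied to the difference of class means.
    return [(d * diff[0] - b * diff[1]) / det,
            (-b * diff[0] + a * diff[1]) / det]


def project(point, w):
    """Extract the 1-D discriminant feature of a point."""
    return point[0] * w[0] + point[1] * w[1]


def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training features closest to the query."""
    nearest = sorted(zip(train_feats, train_labels),
                     key=lambda fl: abs(fl[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Made-up training points for two hypothetical classes (0 and 1).
class0 = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (2.0, 2.5)]
class1 = [(5.0, 6.0), (6.0, 5.0), (5.5, 6.5), (6.5, 6.0)]

w = fisher_direction(class0, class1)
train_feats = [project(p, w) for p in class0 + class1]
train_labels = [0] * len(class0) + [1] * len(class1)

# Classify two unseen points by comparing their projected features.
pred_a = knn_predict(train_feats, train_labels, project((1.8, 1.9), w))
pred_b = knn_predict(train_feats, train_labels, project((6.0, 5.5), w))
print(pred_a, pred_b)
```

With more than two classes, as in a 22-person face database, LDA yields several discriminant directions and the nearest-neighbor distance is computed in that multi-dimensional feature space; the single direction here is a simplification.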


2020 ◽  
Vol 43 (2) ◽  
pp. 233-249
Author(s):  
Adolphus Wagala ◽  
Graciela González-Farías ◽  
Rogelio Ramos ◽  
Oscar Dalmau

This study involves the implementation of extensions of partial least squares generalized linear regression (PLSGLR), combining it with logistic regression and linear discriminant analysis to obtain a partial least squares generalized linear regression-logistic regression model (PLSGLR-log) and a partial least squares generalized linear regression-linear discriminant analysis model (PLSGLRDA). A comparative study of the resulting classifiers against classical methodologies such as k-nearest neighbours (KNN), linear discriminant analysis (LDA), partial least squares discriminant analysis (PLSDA), ridge partial least squares (RPLS), and support vector machines (SVM) is then carried out. Furthermore, a new methodology known as the kernel multilogit algorithm (KMA) is also implemented and its performance compared with that of the other classifiers. The KMA emerged as the best classifier, with the lowest classification error rates of all the methods when applied to both types of data considered: unpreprocessed and preprocessed.


2020 ◽  
Vol 27 ◽  
pp. 28-32
Author(s):  
N. A. Novikova ◽  
M. Yu. Gilyarov ◽  
A. Yu. Suvorov ◽  
A. Yu. Kuchina

Aim. To assess the capabilities of “machine learning” methods in predicting remote outcomes in patients with non-valvular atrial fibrillation (AF). Methods. From 2015 to 2016, 234 patients with non-valvular AF were included in the study (median age 72 (65; 79) years; 50.0% men). During the median follow-up of 2.9 (2.7; 3.2) years, 42 patients died, 9 patients had non-fatal acute cerebral circulatory disorders, and 3 patients had non-fatal myocardial infarction (MI). These events in 52 subjects (22.2% of all patients included) were combined into a composite endpoint (death and a non-fatal cardiovascular accident at the stage of remote observation). The first 184 patients comprised a “training” group. The next 50 patients formed the “test” group. The following “machine learning” methods were used in the analysis: classification trees, linear discriminant analysis, the k-nearest neighbor method, the support vector method, and a neural network. Results. Long-term outcomes were influenced by age, known traditional risk factors for cardiovascular diseases, the presence of these diseases, changes in intracardiac hemodynamics and heart chambers as evaluated by echocardiography, the presence of concomitant anemia, advanced stages of chronic kidney disease, and the administration of drugs associated with a more severe cardiovascular disease progression (amiodarone, digoxin). The best prognosis was obtained using the linear discriminant analysis model, the complex neural network model, and the support vector machine. Conclusion. Modern methods aimed at prognosis estimation seem to be of importance in cardiology. These methods include big data analysis and machine learning technologies. The methods require further evaluation and confirmation, and in the future they may allow correcting cardiovascular risks, using data from real clinical practice and evidence-based medicine at the same time.


2019 ◽  
Vol 26 (2(96)) ◽  
pp. 45-50
Author(s):  
N. A. Novikova ◽  
M. Yu. Gilyarov ◽  
A. Yu. Suvorov ◽  
A. Yu. Kuchina

Aim. Assessment of the capabilities of “machine learning” methods in predicting remote outcomes in patients with non-valvular atrial fibrillation (AF). Methods. From 2015 to 2016, 234 patients with non-valvular AF were included in the study (median age 72 (65; 79) years; 50.0% men). During the median follow-up of 2.9 (2.7; 3.2) years, 42 patients died, 9 patients had non-fatal acute cerebral circulatory disorders, and 3 patients had non-fatal myocardial infarction (MI). These events in 52 subjects (22.2% of all patients included) were combined into a composite endpoint (death and a non-fatal cardiovascular accident at the stage of remote observation). The first 184 patients comprised a “training” group. The next 50 patients formed the “test” group. The following “machine learning” methods were used in the analysis: classification trees, linear discriminant analysis, the k-nearest neighbor method, the support vector method, and a neural network. Results. Long-term outcomes were influenced by age, known traditional risk factors for cardiovascular diseases, the presence of these diseases, changes in intracardiac hemodynamics and heart chambers as evaluated by echocardiography, the presence of concomitant anemia, advanced stages of chronic kidney disease, and the administration of drugs associated with a more severe cardiovascular disease progression (amiodarone, digoxin). The best prognosis was obtained using the linear discriminant analysis model, the complex neural network model, and the support vector machine. Conclusion. Modern methods aimed at prognosis estimation seem to be of great potential for cardiology. These methods include big data analysis and machine learning technologies. The methods require further evaluation and confirmation, and in the future they may allow correcting cardiovascular risks, using data from real clinical practice and evidence-based medicine at the same time.


Author(s):  
Hsin-Hsiung Huang ◽  
Shuai Hao ◽  
Saul Alarcon ◽  
Jie Yang

In this paper, we propose a statistical classification method based on discriminant analysis using the first and second moments of the positions of each nucleotide of the genome sequences as features, and compare its performance with that of other classification methods as well as the natural vector for comparative genomic analysis. We examine the normality of the proposed features. The statistical classification models used include linear discriminant analysis, quadratic discriminant analysis, diagonal linear discriminant analysis, the k-nearest-neighbor classifier, logistic regression, support vector machines, and classification trees. All these classifiers are tested on a viral genome dataset and a protein dataset for predicting viral Baltimore labels, viral family labels, and protein family labels.


2013 ◽  
Vol 816-817 ◽  
pp. 616-622
Author(s):  
Ahmad Kadri Junoh ◽  
Muhammad Naufal Mansor ◽  
Alezar Mat Ya'acob ◽  
Farah Adibah Adnan ◽  
Syafawati Ab. Saad ◽  
...  

The Rise of Crime in Malaysia reported that violent crimes comprised only 10% of reported crimes each year and that the majority of crimes, 90%, were classified as property crimes. However, the ratio of police to population in Malaysia is 3.6 officers to 1,000 citizens. Manpower ratios alone are not a comprehensive measure of crime-fighting capability. Thus, we propose an artificial intelligence technique to determine the behaviour of burglars using Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), and a k-Nearest Neighbor (k-NN) classifier. The system provides good justification for its use as a supplementary monitoring tool for the Malaysian police force.

