NMR pattern recognition of peracetylated mono- and oligosaccharide structures. Classification of residues using principal-component analysis, K-nearest neighbor analysis, and SIMCA class modeling

Pears are a widely available fruit, grown in regions such as western Europe, Asia, and Africa, including Indonesia. Indonesia has many types of pears, which can be distinguished by color, size, and shape, but it is still difficult for non-specialists to tell them apart. This motivated research on image processing to classify three types of pears (Abate, Red, and Williams) in order to help identify the type of a pear. Classification is performed by matching a pear image against existing training data. The method consists of preprocessing, image segmentation with morphological operations, and feature extraction using Principal Component Analysis (PCA). The classification algorithm is K-Nearest Neighbor (KNN). Using adequate training data further improves the classification of pear types. The study reports a final classification result of 87.5%.
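The abstract does not say which morphological operations were used for segmentation. As a minimal sketch, assuming a binary opening (erosion followed by dilation) with a 3x3 square structuring element, the following shows how such a step removes isolated specks from a segmentation mask. The function names and the toy mask are illustrative and are not taken from the paper.

```python
# Hypothetical sketch: binary opening with a 3x3 square structuring element.
# Pixels outside the image are treated as background (0).

OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def _neighbors(mask, r, c):
    """Yield the 3x3 neighborhood values of (r, c), using 0 outside the image."""
    h, w = len(mask), len(mask[0])
    for dr, dc in OFFSETS:
        rr, cc = r + dr, c + dc
        yield mask[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0

def erode(mask):
    """Keep a pixel only if its whole 3x3 neighborhood is foreground."""
    return [[int(all(_neighbors(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def dilate(mask):
    """Set a pixel if any pixel in its 3x3 neighborhood is foreground."""
    return [[int(any(_neighbors(mask, r, c))) for c in range(len(mask[0]))]
            for r in range(len(mask))]

def opening(mask):
    """Erosion followed by dilation: removes small specks, keeps larger blobs."""
    return dilate(erode(mask))

# Toy mask: a 3x3 object plus one isolated noise pixel in the top-right corner.
mask = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
opened = opening(mask)
```

In this toy case the opening keeps the 3x3 object intact and clears the noise pixel. A real pipeline would more likely use an image-processing library than nested lists.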

Classification of Wines Using Principal Component Analysis

Journal of Wine Economics ◽

10.1017/jwe.2020.35 ◽

2021 ◽

Vol 16 (1) ◽

pp. 56-67 ◽

Cited By ~ 1

Author(s):

Jackson Barth ◽

Duwani Katumullage ◽

Chenyu Yang ◽

Jing Cao

Keyword(s):

Principal Component Analysis ◽

Principal Components ◽

Nearest Neighbor ◽

Predictive Accuracy ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor ◽

Explanatory Variables ◽

Highly Correlated

Abstract: Classification of wines with a large number of correlated covariates may lead to classification results that are difficult to interpret. In this study, we use a publicly available dataset on wines from three known cultivars, with 13 highly correlated variables measuring chemical compounds of the wines. The goal is to produce an efficient classifier with a straightforward interpretation that sheds light on which features of wines matter for the classification. To achieve this, we incorporate principal component analysis (PCA) into k-nearest neighbor (kNN) classification to deal with the serious multicollinearity among the explanatory variables. PCA can identify the underlying dominant features and provide a more succinct and straightforward summary of the correlated covariates. The study shows that kNN combined with PCA yields a much simpler and more interpretable classifier whose performance is comparable to that of kNN based on all 13 variables. The number of principal components is chosen to strike a balance between predictive accuracy and simplicity of interpretation. Our final classifier is based on only two principal components, which can be interpreted as the strength of taste and the level of alcohol and fermentation in wines, respectively. (JEL Classifications: C10, C14, D83)
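The PCA-then-kNN pipeline described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: it uses NumPy only, standardizes the variables before PCA, keeps two components, and runs on synthetic stand-in data (3 classes, 13 correlated features) rather than the wine dataset. The helper names `fit_pca`, `transform`, and `knn_predict` are made up for this example.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on standardized data; return (mean, scale, components)."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0)
    scale[scale == 0] = 1.0                      # guard against constant columns
    Z = (X - mean) / scale
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, scale, Vt[:n_components]

def transform(X, mean, scale, components):
    """Project data onto the retained principal components."""
    return ((X - mean) / scale) @ components.T

def knn_predict(train_X, train_y, query_X, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    preds = []
    for q in query_X:
        dist = np.linalg.norm(train_X - q, axis=1)
        nearest = train_y[np.argsort(dist)[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Synthetic stand-in data (NOT the wine dataset): two latent factors
# mixed into 13 highly correlated observed variables.
rng = np.random.default_rng(0)
centers = ([0, 0], [4, 0], [0, 4])
latent = np.vstack([rng.normal(loc=c, scale=0.5, size=(40, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 40)
X = latent @ rng.normal(size=(2, 13)) + rng.normal(scale=0.1, size=(120, 13))

# Fit PCA on the training split only, then classify in the 2-D score space.
idx = rng.permutation(120)
train, test = idx[:90], idx[90:]
mean, scale, comps = fit_pca(X[train], n_components=2)
Z_train = transform(X[train], mean, scale, comps)
Z_test = transform(X[test], mean, scale, comps)
pred = knn_predict(Z_train, labels[train], Z_test, k=5)
accuracy = float((pred == labels[test]).mean())
```

Fitting the PCA on the training split only keeps information from the test samples out of the projection. The choice of k = 5 and the 90/30 split are arbitrary here.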

MATLAB-Based Classification of Hypokalemia by Principal Component Analysis and k Nearest Neighbor

Proceedings of the 2020 10th International Conference on Biomedical Engineering and Technology ◽

10.1145/3397391.3397416 ◽

2020 ◽

Author(s):

Ernesto Vergara ◽

Glenn V. Magwili ◽

Leonardo D. Valiente ◽

Jill Angielaine S. Morales ◽

Ma. Carmela P. Sapalaran ◽

...

Keyword(s):

Principal Component Analysis ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor

Wide-Area Monitoring of Power Systems Using Principal Component Analysis and $k$-Nearest Neighbor Analysis

IEEE Transactions on Power Systems ◽

10.1109/tpwrs.2017.2783242 ◽

2018 ◽

Vol 33 (5) ◽

pp. 4913-4923 ◽

Cited By ~ 18

Author(s):

Lianfang Cai ◽

Nina F. Thornhill ◽

Stefanie Kuenzel ◽

Bikash C. Pal

Keyword(s):

Principal Component Analysis ◽

Power Systems ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

Wide Area ◽

K Nearest Neighbor ◽

Nearest Neighbor Analysis ◽

Wide Area Monitoring ◽

Area Monitoring

Decision Support System for Classification of Early Childhood Diseases Using Principal Component Analysis and K-Nearest Neighbors Classifier

Journal of Information Systems Engineering and Business Intelligence ◽

10.20473/jisebi.5.1.13-22 ◽

2019 ◽

Vol 5 (1) ◽

pp. 13

Author(s):

Damar Dananjaya ◽

Indah Werdiningsih ◽

Rini Semiati

Keyword(s):

Principal Component Analysis ◽

Early Childhood ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

Major Influence ◽

K Nearest Neighbor ◽

Childhood Diseases ◽

Childhood Disease

Background: Data on early childhood diseases collected in clinics has accumulated into big data. These data can be used to classify early childhood diseases and so help medical staff diagnose diseases affecting young children. Objective: This study aims to apply Principal Component Analysis (PCA) and the K-Nearest Neighbor (K-NN) classifier to the classification of early childhood diseases. Methods: Data analysis was performed using PCA to obtain the variables that had a major influence on the classification of early childhood diseases. PCA was done by observing the correlation between variables and eliminating variables that have little influence on classification. The early childhood disease data was then classified using the K-NN classifier. Results: System evaluation using 150 test data indicated that the classification system applying PCA and the K-NN classifier had an accuracy of 86%. Conclusion: PCA can be used to reduce the number of variables involved, which improves system performance in terms of efficiency. In addition, applying PCA and K-NN can also improve accuracy in the classification of early childhood diseases.

Analisis Metode Pengenalan Wajah Two Dimensial Principal Component Analysis (2DPCA) dan Kernel Fisher Discriminant Analysis Menggunakan Klasifikasi KNN (K- Nearest Neighbor)

Jurnal Teknologi dan Rekayasa Manufaktur ◽

10.48182/jtrm.v2i2.30 ◽

2020 ◽

Vol 2 (2) ◽

pp. 29-38

Author(s):

Abdur Rohman Harits Martawireja ◽

Hilman Mujahid Purnama ◽

Atika Nur Rahmawati

Keyword(s):

Principal Component Analysis ◽

Discriminant Analysis ◽

Cross Validation ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor ◽

Fisher Discriminant Analysis ◽

Fisher Discriminant ◽

Kernel Fisher Discriminant Analysis

Human face recognition is an important research area, and many recent applications use it in both commercial and law-enforcement settings. Face recognition is a biometric system that identifies a person from facial features with high accuracy, and it can be applied to security systems. Many methods can be used in face recognition applications for system security, but this article discusses two: Two-Dimensional Principal Component Analysis (2DPCA) and Kernel Fisher Discriminant Analysis, each with K-Nearest Neighbor as the classification method. Both methods are tested using cross-validation. Previous research showed that face recognition with 2DPCA achieved an accuracy of 88.73% with 5-fold cross-validation and 89.25% with 2-fold validation. Testing the Kernel Fisher Discriminant method with 2-fold cross-validation gave an average accuracy of 83.10%.
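As a rough illustration of the 2DPCA feature-extraction step named in this abstract, the sketch below follows the commonly cited formulation: build an image covariance matrix directly from the 2-D image matrices, take its leading eigenvectors, and project each image onto them. It assumes NumPy and random stand-in images. The function names are hypothetical, and nothing here reproduces the article's actual implementation or results.

```python
import numpy as np

def fit_2dpca(images, n_components):
    """images: array of shape (n, h, w). Returns (mean_image, projection (w, d))."""
    mean_image = images.mean(axis=0)
    centered = images - mean_image
    # Image covariance matrix G (w x w): average of A^T A over centered images.
    G = np.einsum('nij,nik->jk', centered, centered) / len(images)
    eigvals, eigvecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean_image, eigvecs[:, order]

def project_2dpca(images, projection):
    """Map each (h, w) image to an (h, d) feature matrix Y = A X."""
    return images @ projection

# Random stand-in "face images": 10 images of 8 x 6 pixels (not real data).
rng = np.random.default_rng(0)
faces = rng.normal(size=(10, 8, 6))
mean_image, W = fit_2dpca(faces, n_components=2)
features = project_2dpca(faces, W)
```

Unlike ordinary PCA, the images are never flattened into vectors, so the covariance matrix is only w x w. A nearest-neighbor classifier can then compare the resulting feature matrices, for example by Frobenius distance.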

Discrimination of Chinese Liquors Based on Electronic Nose and Fuzzy Discriminant Principal Component Analysis

Foods ◽

10.3390/foods8010038 ◽

2019 ◽

Vol 8 (1) ◽

pp. 38 ◽

Cited By ~ 2

Author(s):

Xiaohong Wu ◽

Jin Zhu ◽

Bin Wu ◽

Chao Zhao ◽

Jun Sun ◽

...

Keyword(s):

Principal Component Analysis ◽

Feature Extraction ◽

Electronic Nose ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

K Nearest Neighbor ◽

Knn Classifier ◽

Extraction Algorithm ◽

Leave One Out

The detection of liquor quality is an important process in the liquor industry, and the quality of Chinese liquors is partly determined by the aromas of the liquors. The electronic nose (e-nose) refers to an artificial olfactory technology. The e-nose system can quickly detect different types of Chinese liquors according to their aromas. In this study, an e-nose system was designed to identify six types of Chinese liquors, and a novel feature extraction algorithm, called fuzzy discriminant principal component analysis (FDPCA), was developed for feature extraction from e-nose signals by combining discriminant principal component analysis (DPCA) and fuzzy set theory. In addition, principal component analysis (PCA), DPCA, K-nearest neighbor (KNN) classifier, leave-one-out (LOO) strategy and k-fold cross-validation (k = 5, 10, 20, 25) were employed in the e-nose system. The maximum classification accuracy of feature extraction for Chinese liquors was 98.378% using FDPCA, showing this algorithm to be extremely effective. The experimental results indicate that an e-nose system coupled with FDPCA is a feasible method for classifying Chinese liquors.
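The evaluation strategy mentioned in this abstract, a KNN classifier scored by leave-one-out and k-fold cross-validation, can be sketched as below. This is a minimal standard-library illustration on toy two-class data. It does not implement FDPCA or use e-nose signals, and the fold count, the neighbor count, and the function names are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train: list of (features, label). Majority label among the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, n_folds, k):
    """Interleaved n_folds-fold cross-validation; n_folds == len(data) is leave-one-out."""
    correct = 0
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)

# Toy data: two well-separated clusters (illustrative only).
data = [
    ((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"), ((0.3, 0.2), "A"),
    ((5.0, 5.1), "B"), ((5.2, 4.9), "B"), ((4.8, 5.3), "B"), ((5.1, 5.0), "B"),
]
loo_accuracy = cross_val_accuracy(data, n_folds=len(data), k=3)
four_fold_accuracy = cross_val_accuracy(data, n_folds=4, k=3)
```

Leave-one-out is the special case where the number of folds equals the number of samples, so both strategies share one function. With real data the samples would normally be shuffled before being split into folds.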

Facial recognition using two-dimensional principal component analysis and k-nearest neighbor: a case analysis of facial images

Journal of Physics Conference Series ◽

10.1088/1742-6596/1567/3/032028 ◽

2020 ◽

Vol 1567 ◽

pp. 032028

Author(s):

E Sugiharti ◽

A T Putra ◽

Subhan

Keyword(s):

Principal Component Analysis ◽

Nearest Neighbor ◽

Facial Recognition ◽

Principal Component ◽

Case Analysis ◽

Component Analysis ◽

Two Dimensional ◽

K Nearest Neighbor ◽

Facial Images

Multi-mode operation of principal component analysis with k-nearest neighbor algorithm to monitor compressors for liquefied natural gas mixed refrigerant processes

Computers & Chemical Engineering ◽

10.1016/j.compchemeng.2017.05.029 ◽

2017 ◽

Vol 106 ◽

pp. 96-105 ◽

Cited By ~ 19

Author(s):

Daegeun Ha ◽

Usama Ahmed ◽

Hahyung Pyun ◽

Chul-Jin Lee ◽

Kye Hyun Baek ◽

...

Keyword(s):

Principal Component Analysis ◽

Natural Gas ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

Liquefied Natural Gas ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Multi Mode

Application of Fuzzy K-Nearest Neighbor (FKNN) to Detect the Parkinson’s Disease

InPrime: Indonesian Journal of Pure and Applied Mathematics ◽

10.15408/inprime.v1i1.12827 ◽

2019 ◽

Vol 1 (1) ◽

Author(s):

L.N. Desinaini ◽

Azizatul Mualimah ◽

Dian C. R. Novitasari ◽

Moh. Hafiyusholeh

Keyword(s):

Machine Learning ◽

Principal Component Analysis ◽

Parkinson's Disease ◽

Nearest Neighbor ◽

Principal Component ◽

Component Analysis ◽

Training Data ◽

K Nearest Neighbor ◽

Positive Data

Abstract: Parkinson's disease is a neurological disorder in which there is a gradual loss of the brain cells that make and store dopamine. Researchers estimate that four to six million people worldwide are living with Parkinson's. The average age of patients is 60, but some are diagnosed at 40 or younger, and, worse, some patients find out late that they have the disease. In this paper, we present a diagnosis system based on Fuzzy K-Nearest Neighbor (FKNN) to detect Parkinson's disease. We use the Parkinson's disease dataset from the UCI Machine Learning Repository, which has been widely applied to classification problems. The first step is to normalize the dataset and analyze it using Principal Component Analysis (PCA). The result shows four new factors that influence Parkinson's disease, with a total variance of 85.719% (the Indonesian version of the abstract gives 87.719%). In the classification step, we use several percentages of training data (50%, 60%, 70%, 75%, 80%, and 90%) and k = 3, 5, 7, and 9. The highest accuracy is obtained with 90% training data and k = 5, where 19 samples are correctly classified (14 positive and 5 negative) and 1 positive sample is misclassified. Keywords: Parkinson's disease; Fuzzy K-Nearest Neighbor; Principal Component Analysis.
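The abstract does not spell out the Fuzzy K-Nearest Neighbor rule it uses. A minimal sketch, assuming the common inverse-distance formulation with crisp training labels and fuzzifier m = 2, is given below. The toy points, the function name `fuzzy_knn`, and the `eps` guard are illustrative assumptions, not details from the paper.

```python
import math
from collections import defaultdict

def fuzzy_knn(train, query, k=5, m=2.0, eps=1e-12):
    """Return class memberships for `query` as a dict that sums to 1.

    train: list of (features, label). Each of the k nearest neighbors votes
    for its own label with weight 1 / d**(2 / (m - 1)); eps avoids division
    by zero when the query coincides with a training point.
    """
    nearest = sorted(((math.dist(x, query), y) for x, y in train),
                     key=lambda pair: pair[0])[:k]
    weights = defaultdict(float)
    for d, label in nearest:
        weights[label] += 1.0 / (d ** (2.0 / (m - 1.0)) + eps)
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Toy two-class data (illustrative only, not the UCI Parkinson's dataset).
train = [
    ((0.0, 0.0), "negative"), ((0.0, 1.0), "negative"),
    ((3.0, 3.0), "positive"), ((3.0, 4.0), "positive"), ((4.0, 3.0), "positive"),
]
memberships = fuzzy_knn(train, (1.0, 1.0), k=3)
predicted = max(memberships, key=memberships.get)
```

Instead of a hard vote, the query receives a graded membership in each class, and the class with the largest membership is taken as the prediction. For the counts reported in the abstract (19 correct and 1 incorrect out of 20 test samples), accuracy works out to 19/20 = 95%.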