Rapid and Nondestructive On-Site Classification Method for Consumer-Grade Plastics Based on Portable NIR Spectrometer and Machine Learning

Classifying plastic waste before recycling is essential for effective recycling. To achieve rapid, nondestructive, on-site detection, a portable near-infrared (NIR) spectrometer was used in this study to obtain diffuse reflectance spectra of both standard and commercial plastics made of ABS, PC, PE, PET, PP, PS, and PVC. After a series of pretreatments, principal component analysis (PCA) was used to analyze the clustering trend. K-nearest neighbor (KNN), support vector machine (SVM), and back propagation neural network (BPNN) classification models were then developed and evaluated. The results showed that, after pretreatment, the different plastics were well separated in the space of the first three principal components, and that the classification models achieved excellent classification results and high generalization capability. This study indicates that a portable NIR spectrometer, combined with chemometrics, can achieve excellent performance and has great potential for commercial plastic identification.
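
As a rough illustration of the k-nearest neighbor step named in the abstract, the sketch below classifies a feature vector by majority vote among its k closest training samples. The two-dimensional scores, the labels, and the function names `euclidean` and `knn_predict` are illustrative assumptions, not data or code from the study.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: euclidean(train_x[i], query))
    # Count the labels of the k nearest neighbors and take the most common one.
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical principal-component scores for two plastic types (assumed values).
train_x = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_y = ["PE", "PE", "PE", "PET", "PET", "PET"]
print(knn_predict(train_x, train_y, (0.1, 0.1)))
```

In practice the inputs would be pretreated spectra or their principal-component scores, and k would be tuned on validation data.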

Authenticity Detection of Black Rice by Near-Infrared Spectroscopy and Support Vector Data Description

International Journal of Analytical Chemistry ◽ 10.1155/2018/8032831 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-8 ◽ Cited By ~ 5

Author(s): Hui Chen ◽ Chao Tan ◽ Zan Lin

Keyword(s): Near Infrared ◽ Nearest Neighbor ◽ Principal Component ◽ Support Vector ◽ Support Vector Data Description ◽ Vector Data ◽ K Nearest Neighbor ◽ Black Rice ◽ Target Class ◽ Data Description

Black rice is an important rice species in Southeast Asia. Passing off low-priced black rice as high-priced varieties for economic benefit is common, especially in some remote towns. There is an increasing need for fast, easy-to-use, and low-cost analytical methods for authenticity detection. The feasibility of using near-infrared (NIR) spectroscopy and support vector data description (SVDD) for this purpose is explored. Principal component analysis (PCA) is used for exploratory analysis and feature extraction. Two other data description methods, k-nearest neighbor data description (KNNDD) and the GAUSS method, are used as references. A total of 142 samples from three brands were collected for spectral analysis. In each run, the samples of one brand serve as the target class while the other samples serve as the outlier class. Three types of data descriptions were constructed, based on both the first two principal components (PCs) and the original variables. On average, the optimized SVDD model achieves acceptable performance, with a specificity of 100% and a sensitivity of 94.2% on the independent test set with a tight boundary. This indicates that SVDD combined with NIR is feasible and effective for authenticity detection of black rice.
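
To make the reported metrics concrete, the sketch below computes sensitivity and specificity for a one-class setting, where sensitivity is the fraction of target-class samples accepted and specificity is the fraction of outlier samples rejected. The function name and the ten accept/reject decisions are hypothetical; they do not reproduce the study's data or its SVDD model.

```python
def sensitivity_specificity(is_target, accepted):
    """Return (sensitivity, specificity) for a one-class model.

    is_target[i] is True if sample i truly belongs to the target class.
    accepted[i] is True if the model placed sample i inside its boundary.
    """
    pairs = list(zip(is_target, accepted))
    tp = sum(1 for t, a in pairs if t and a)          # targets accepted
    fn = sum(1 for t, a in pairs if t and not a)      # targets rejected
    tn = sum(1 for t, a in pairs if not t and not a)  # outliers rejected
    fp = sum(1 for t, a in pairs if not t and a)      # outliers accepted
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical decisions: five target samples followed by five outliers.
is_target = [True] * 5 + [False] * 5
accepted = [True, True, True, True, False, False, False, False, False, False]
sens, spec = sensitivity_specificity(is_target, accepted)
print(sens, spec)
```

A tighter boundary typically raises specificity at the cost of sensitivity, which is why both figures are reported together.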

Automatic Detection and Staging of Lung Tumors using Locational Features and Double-Staged Classifications

Applied Sciences ◽ 10.3390/app9112329 ◽ 2019 ◽ Vol 9 (11) ◽ pp. 2329 ◽ Cited By ~ 3

Author(s): May Phu Paing ◽ Kazuhiko Hamamoto ◽ Supan Tungjitkusolmun ◽ Chuchart Pintavirooj

Keyword(s): Lung Cancer ◽ Nearest Neighbor ◽ Treatment Options ◽ Back Propagation ◽ Back Propagation Neural Network ◽ Classification Performance ◽ Clinical Staging ◽ Support Vector ◽ K Nearest Neighbor ◽ Experimental Findings

Lung cancer is a life-threatening disease with the highest morbidity and mortality rates of any cancer worldwide. Clinical staging of lung cancer can significantly reduce the mortality rate, because effective treatment options strongly depend on the specific stage of cancer. Unfortunately, manual staging remains a challenge due to the intensive effort required. This paper presents a computer-aided diagnosis (CAD) method for detecting and staging lung cancer from computed tomography (CT) images. The CAD works in three fundamental phases: segmentation, detection, and staging. In the first phase, lung anatomical structures are segmented from the input CT scans using gray-level thresholding. In the second, tumor nodules inside the lungs are detected using features extracted from the segmented tumor candidates. In the last phase, the clinical stages of the detected tumors are determined by extracting locational features. For accurate and robust predictions, the CAD applies a double-staged classification: the first stage detects tumors and the second stages them. In both classification stages, five alternative classifiers, namely Decision Tree (DT), K-nearest neighbor (KNN), Support Vector Machine (SVM), Ensemble Tree (ET), and Back Propagation Neural Network (BPNN), are applied and compared to ensure high classification performance. Average accuracies of 92.8% for detection and 90.6% for staging are achieved using BPNN. Experimental findings reveal that the proposed CAD method provides better results than previous methods; thus, it is applicable as a clinical diagnostic tool for lung cancer.
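
The first phase, gray-level thresholding, can be sketched as follows. This is a minimal illustration that marks pixels darker than a fixed level, on the assumption that air-filled lung regions appear darker than surrounding tissue. The tiny image, the threshold value, and the function name `threshold_mask` are hypothetical, and the paper's actual segmentation pipeline is not described in the abstract.

```python
def threshold_mask(image, level):
    """Return a binary mask with 1 where a pixel is darker than level, else 0."""
    return [[1 if px < level else 0 for px in row] for row in image]

# Hypothetical 3x4 grayscale slice: low values stand in for air-filled lung.
image = [
    [200, 190, 210, 205],
    [195,  40,  35, 200],
    [205,  30, 180, 198],
]
mask = threshold_mask(image, level=100)
for row in mask:
    print(row)
```

A real pipeline would usually choose the threshold from the image histogram and clean the mask with morphological operations before looking for tumor candidates.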

Identification of Pilots’ Fatigue Status Based on Electrocardiogram Signals

Sensors ◽ 10.3390/s21093003 ◽ 2021 ◽ Vol 21 (9) ◽ pp. 3003

Author(s): Ting Pan ◽ Haibo Wang ◽ Haiqing Si ◽ Yao Li ◽ Lei Shang

Keyword(s): Back Propagation ◽ Principal Component ◽ Back Propagation Neural Network ◽ Practical Significance ◽ Support Vector ◽ State Identification ◽ Feature Parameter ◽ The Time Domain ◽ Pilot Fatigue ◽ Identification Model

Fatigue is an important factor affecting modern flight safety. It can easily lead to a decline in pilots’ operational ability, misjudgments, and flight illusions, and can even trigger serious flight accidents. In this paper, a wearable wireless physiological device was used to obtain pilots’ electrocardiogram (ECG) data in a simulated flight experiment, and 1440 effective samples were obtained. The Friedman test was adopted to select, from the time-domain, frequency-domain, and non-linear characteristics of the effective samples, the characteristic indexes that reflect the pilot's fatigue state. Furthermore, the variation rules of the characteristic indexes were analyzed. Principal component analysis (PCA) was applied to the selected indexes to extract features, and a feature parameter set representing the pilot's fatigue state was established. This feature parameter set was used as the input of the learning vector quantization (LVQ) algorithm to train the pilots’ fatigue state identification model. Results show that the recognition accuracy of the LVQ model reached 81.94%, which is 12.84% and 9.02% higher than that of the traditional back propagation neural network (BPNN) and support vector machine (SVM) models, respectively. The LVQ-based identification model established in this paper is suitable for identifying pilots’ fatigue states. This is of great practical significance for reducing flight accidents caused by pilot fatigue, and provides a theoretical foundation for pilot fatigue risk management and the development of intelligent aircraft autopilot systems.
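
For readers unfamiliar with learning vector quantization, the sketch below implements the basic LVQ1 update rule: the prototype nearest to a training sample is pulled toward it when their labels match and pushed away otherwise. The abstract does not say which LVQ variant the study used, so LVQ1 is an assumption here, as are the toy feature vectors, the labels, the learning rate, and all function names.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def nearest(protos, x):
    """Index of the prototype closest to x."""
    return min(range(len(protos)), key=lambda i: euclidean(protos[i], x))

def lvq1_train(samples, labels, prototypes, proto_labels, lr=0.1, epochs=20):
    """Return prototypes updated with the LVQ1 rule (inputs are not modified)."""
    protos = [list(p) for p in prototypes]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            j = nearest(protos, x)
            # Attract the winning prototype if labels match, repel it otherwise.
            sign = 1.0 if proto_labels[j] == y else -1.0
            protos[j] = [p + sign * lr * (xi - p) for p, xi in zip(protos[j], x)]
    return protos

def lvq_predict(protos, proto_labels, x):
    """Label of the prototype nearest to x."""
    return proto_labels[nearest(protos, x)]

# Hypothetical two-dimensional feature vectors for two states (assumed values).
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0), (3.1, 2.8), (2.9, 3.2)]
labels = ["alert", "alert", "alert", "fatigued", "fatigued", "fatigued"]
proto_labels = ["alert", "fatigued"]
protos = lvq1_train(samples, labels, [(0.5, 0.5), (2.5, 2.5)], proto_labels)
print(lvq_predict(protos, proto_labels, (0.2, 0.1)))
```

In a realistic setting there would be several prototypes per class and a decaying learning rate.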

Two-parameter KNN algorithm and its application in recognition of brand rice

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-210584 ◽ 2021 ◽ pp. 1-7

Author(s): Zhu Siyu ◽ He Chongnan ◽ Song Mingjuan ◽ Li Linna

Keyword(s): Near Infrared ◽ Nearest Neighbor ◽ Recognition Accuracy ◽ Principal Component ◽ Kernel Principal Component Analysis ◽ K Nearest Neighbor ◽ Fisher Discriminant Analysis ◽ Pattern Class ◽ Fisher Discriminant ◽ Two Parameter

In response to the frequent counterfeiting of Wuchang rice in the market, an effective method to identify brand rice is proposed. Taking the near-infrared spectroscopy data of a total of 373 grains of rice from four origins (Wuchang, Shangzhi, Yanshou, and Fangzheng) as the observations, kernel principal component analysis (KPCA) was employed to reduce the dimensionality, and Fisher discriminant analysis (FDA) and the k-nearest neighbor algorithm (KNN) were each used to identify brand rice. Both recognition methods perform very well, with KNN performing somewhat better. However, the shortcomings of KNN are obvious. For instance, it examines only one dimension, and its examination of samples is not fine-grained enough. To further improve the recognition accuracy, a fuzzy k-nearest neighbor set is defined and fuzzy probability theory is employed to obtain a new recognition method, the Two-Parameter KNN discrimination method. Compared with the KNN algorithm, this method adds an examination dimension. It examines not only the proportion of samples from each pattern class in the k-nearest neighbor set, but also the degree of similarity between the center of each pattern class and the sample to be identified. Therefore, the recognition process is more fine-grained and the recognition accuracy is higher. In the identification of brand rice, the discriminant accuracy of the Two-Parameter KNN algorithm is significantly higher than that of FDA and that of the KNN algorithm.

Feature and Decision Level Fusion in Children Multimodal Biometrics

International Journal of Recent Technology and Engineering - 2 ◽ 10.35940/ijrte.e6396.018520 ◽ 2020 ◽ Vol 8 (5) ◽ pp. 2522-2527

Keyword(s): Nearest Neighbor ◽ Principal Component ◽ Identification Accuracy ◽ Support Vector ◽ Biometric System ◽ K Nearest Neighbor ◽ Decision Level ◽ Multimodal Biometric System ◽ Decision Level Fusion ◽ Level Fusion

In this paper, we design a method for fingerprint and iris recognition using feature-level fusion and decision-level fusion in a multimodal biometric system for children. Initially, Histogram of Oriented Gradients (HOG), Gabor, and maximum filter response features are extracted from both the fingerprint and iris domains and evaluated for identification accuracy. The feature vectors of all the extracted features are combined across the biometric traits for fusion. Principal Component Analysis (PCA) is applied to the fused vector to select features. The reduced features are fed into the fusion classifiers: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Naive Bayes (NB). The most suitable combination of features and fusion classifiers for the children's multimodal biometric system is identified. Experiments conducted on a children’s fingerprint and iris database reveal that the fused combination outperforms the individual modalities. In addition, the proposed model improves on the unimodal biometric system.

Classification of fish species from different ecosystems using the near infrared diffuse reflectance spectra of otoliths

Journal of Near Infrared Spectroscopy ◽ 10.1177/0967033520935999 ◽ 2020 ◽ Vol 28 (4) ◽ pp. 224-235

Author(s): Irina M Benson ◽ Beverly K Barnett ◽ Thomas E Helser

Keyword(s): Discriminant Analysis ◽ Spatial Variability ◽ Near Infrared ◽ Nir Spectroscopy ◽ Principal Component ◽ Classification Model ◽ Support Vector ◽ Otolith Chemistry ◽ K Nearest Neighbor ◽ Fisheries Science

Applications of Fourier transform near infrared (FT-NIR) spectroscopy in fisheries science are currently limited. This analysis of otolith spectral data demonstrates the potential applicability of FT-NIR spectroscopy to otolith chemistry and spatial variability in fisheries science. The objective of this study was to examine the use of NIR spectroscopy as a tool to differentiate among marine fishes in four large marine ecosystems. We examined otoliths from 13 different species, with three of these species coming from different regions. Principal component analysis described the main directions along which the specimens were separated. The separation of species and their ecosystems may suggest interactions between fish phylogeny, ontogeny, and environmental conditions that can be evaluated using NIR spectroscopy. To discriminate spectra across ecosystems and species, four supervised classification techniques were used: soft independent modelling of class analogies, support vector machine discriminant analysis, partial least squares discriminant analysis, and k-nearest neighbor analysis (KNN). This study showed that the best performing model for classifying combined ecosystems, all four ecosystems, and species was the KNN model, which had overall accuracy rates of 99.9%, 97.6%, and 91.5%, respectively. Results from this study suggest that further investigations are needed to determine applications of NIR spectroscopy to otolith chemistry and spatial variability.

ARRHYTHMIA DISEASE DIAGNOSIS USING NEURAL NETWORK, SVM, AND GENETIC ALGORITHM-OPTIMIZED k-MEANS CLUSTERING

Journal of Mechanics in Medicine and Biology ◽ 10.1142/s0219519411004101 ◽ 2011 ◽ Vol 11 (04) ◽ pp. 897-915 ◽ Cited By ~ 18

Author(s): ROSHAN JOY MARTIS ◽ CHANDAN CHAKRABORTY

Keyword(s): Neural Network ◽ Genetic Algorithm ◽ Normal Sinus Rhythm ◽ Optimization Technique ◽ Back Propagation ◽ Principal Component ◽ Disease Diagnosis ◽ Back Propagation Neural Network ◽ Polynomial Kernel ◽ Support Vector

This work presents a methodology for electrocardiogram (ECG)-based arrhythmia disease detection using genetic algorithm (GA)-optimized k-means clustering. The open-source ECG data from the MIT-BIH arrhythmia database and the MIT-BIH normal sinus rhythm database are subjected to a sequence of steps including segmentation using R-point detection, extraction of features using principal component analysis (PCA), and pattern classification. Here, the classical classifiers, viz., k-means clustering, error back propagation neural network (EBPNN), and support vector machine (SVM), are attempted first, and m-fold (m = 3) cross validation is then used to reduce bias during training of the classifier. The average classification accuracy is computed as the average over all three folds. It is observed that EBPNN and SVM with polynomial kernels of different orders provide significantly higher accuracies than k-means. This is because the parameters (centroids) of the k-means algorithm are only locally optimized when its objective function is minimized. To overcome this limitation, a global optimization technique, viz., GA, is suggested and implemented to find more robust parameters for k-means clustering. Finally, it is shown that the GA-optimized k-means algorithm raises its accuracy to a level close to that of the other classifiers. The results are discussed and compared. It is concluded that the GA-optimized k-means algorithm is an alternative approach for classification whose accuracy is close to that of supervised classifiers (viz., EBPNN and SVM).

Biometric authenticator algorithm based on multiresolution analysis

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v20.i3.pp1332-1341 ◽ 2020 ◽ Vol 20 (3) ◽ pp. 1332

Author(s): Soumia Kerrache ◽ Beladgham Mohammed ◽ Hamza Aymen ◽ Kadri Ibrahim

Keyword(s): Feature Extraction ◽ Multiresolution Analysis ◽ Nearest Neighbor ◽ Curvelet Transform ◽ Principal Component ◽ Image Features ◽ Support Vector ◽ K Nearest Neighbor ◽ Feature Extraction Method ◽ Fusion Approach

Feature extraction is an essential process in biometric person identification because the effectiveness of the system depends on it. Multiresolution analysis can be used successfully in person identification and pattern recognition systems. In this paper, we present a feature extraction method for two-dimensional face and iris authentication. Our approach combines principal component analysis (PCA) and the curvelet transform as an improved fusion approach for feature extraction. The proposed fusion approach involves image denoising using the 2D curvelet transform to achieve compact representations of curve singularities. This is followed by the application of PCA as a fusion rule to improve the spatial resolution. The limitations of PCA alone are poor recognition speed and a heavy computational load; to reduce these limitations, we apply the curvelet transform. To assess the performance of the presented method, we employed three classification techniques: neural networks (NN), K-Nearest Neighbor (KNN), and support vector machines (SVM). The results reveal that the extraction of image features is more efficient using Curvelet/PCA.

A Systematic Methodology to Evaluate Prediction Models for Driving Style Classification

Sensors ◽ 10.3390/s20061692 ◽ 2020 ◽ Vol 20 (6) ◽ pp. 1692 ◽ Cited By ~ 6

Author(s): Iván Silva ◽ José Eugenio Naranjo

Keyword(s): Machine Learning ◽ Nearest Neighbor ◽ Performance Metrics ◽ Prediction Models ◽ Statistical Tests ◽ Area Under The Curve ◽ The Other ◽ Support Vector ◽ Classification Models ◽ K Nearest Neighbor

Identifying driving styles using classification models with in-vehicle data can provide automated feedback to drivers on their driving behavior, particularly on whether they are driving safely. Although several classification models have been developed for this purpose, there is no consensus on which classifier performs better at identifying driving styles. Therefore, more research is needed to evaluate classification models by comparing performance metrics. In this paper, a data-driven machine-learning methodology for classifying driving styles is introduced. This methodology is grounded in well-established machine-learning (ML) methods and literature related to driving-styles research. The methodology is illustrated through a study involving data collected from 50 drivers from two different cities in a naturalistic setting. Five features were extracted from the raw data. Fifteen experts were involved in the data labeling to derive the ground truth of the dataset. The dataset was fed to five different models: Support Vector Machines (SVM), Artificial Neural Networks (ANN), fuzzy logic, k-Nearest Neighbor (kNN), and Random Forests (RF). These models were evaluated in terms of a set of performance metrics and statistical tests. The experimental results from performance metrics showed that SVM outperformed the other four models, achieving an average accuracy of 0.96, F1-Score of 0.9595, Area Under the Curve (AUC) of 0.9730, and Kappa of 0.9375. In addition, Wilcoxon tests indicated that ANN predicts differently from the other four models. These promising results demonstrate that the proposed methodology may support researchers in making informed decisions about which ML model performs better for driving-styles classification.
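
Two of the metrics reported above, accuracy and Cohen's kappa, can be computed as in the sketch below. Kappa corrects the observed agreement p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The class names and the eight predictions are hypothetical and are not taken from the study.

```python
from collections import Counter

def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    n = len(y_true)
    # Observed agreement: fraction of samples where prediction equals truth.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return p_o, (p_o - p_e) / (1 - p_e)

# Hypothetical expert labels and model predictions for eight trips.
y_true = ["calm"] * 4 + ["aggressive"] * 4
y_pred = ["calm", "calm", "calm", "aggressive",
          "aggressive", "aggressive", "aggressive", "calm"]
acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(acc, kappa)
```

With balanced classes, as in this toy case, chance agreement is 0.5, so an accuracy of 0.75 corresponds to a kappa of 0.5.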

Novel Mathematical Model of Breast Cancer Diagnostics Using an Associative Pattern Classification

Diagnostics ◽ 10.3390/diagnostics10030136 ◽ 2020 ◽ Vol 10 (3) ◽ pp. 136 ◽ Cited By ~ 2

Author(s): Raúl Santiago-Montero ◽ Humberto Sossa ◽ David A. Gutiérrez-Hernández ◽ Víctor Zamudio ◽ Ignacio Hernández-Bautista ◽ ...

Keyword(s): Breast Cancer ◽ Nearest Neighbor ◽ Early Stage ◽ Back Propagation ◽ Cancer Diagnostics ◽ Support Vector ◽ K Nearest Neighbor ◽ Breast Cancer Death ◽ Positron Emission ◽ The Government

Breast cancer is a disease that has emerged as the second leading cause of cancer deaths in women worldwide. The annual mortality rate is estimated to continue growing. Cancer detection at an early stage could significantly reduce breast cancer death rates in the long term. Many investigators have studied different breast diagnostic approaches, such as mammography, magnetic resonance imaging, ultrasound, computerized tomography, positron emission tomography, and biopsy. However, these techniques have limitations, such as being expensive, time consuming, and not suitable for women of all ages. Proposing techniques that support the effective medical diagnosis of this disease has undoubtedly become a priority for the government, for health institutions, and for civil society in general. In this paper, an associative pattern classifier (APC) was used for the diagnosis of breast cancer. The rate of efficiency obtained on the Wisconsin breast cancer database was 97.31%. The APC’s performance was compared with the performance of a support vector machine (SVM) model, back-propagation neural networks, C4.5, naive Bayes, k-nearest neighbor (k-NN), and minimum distance classifiers. According to our results, the APC performed best. The APC algorithm was written and executed on a Java platform, as were the experiments and the comparisons between algorithms.
