Data Classification Methodology for Electronic Noses Using Uniform Manifold Approximation and Projection and Extreme Learning Machine

Jersson X. Leon-Medina; Núria Parés; Maribel Anaya; Diego A. Tibaduiza; Francesc Pozo

doi:10.3390/math10010029

Mathematics ◽

10.3390/math10010029 ◽

2021 ◽

Vol 10 (1) ◽

pp. 29

Author(s):

Jersson X. Leon-Medina ◽

Núria Parés ◽

Maribel Anaya ◽

Diego A. Tibaduiza ◽

Francesc Pozo

Keyword(s):

Machine Learning ◽

Extreme Learning Machine ◽

Data Reduction ◽

Classification Accuracy ◽

Sensor Array ◽

Data Preprocessing ◽

Electronic Noses ◽

Average Classification Accuracy ◽

Learning Classifier ◽

Learning Machine

The classification of sensor array data from electronic noses (ENs) with robust methodologies remains an open problem. Among the steps used in such methodologies, data preprocessing improves the classification accuracy of this type of sensor. Data preprocessing methods, such as data transformation and data reduction, enable the treatment of data with anomalies, such as outliers and features that do not provide quality information; in addition, they reduce the dimensionality of the data, thereby facilitating the tasks of a machine learning classifier. To help solve this problem, this study introduces a machine learning methodology to improve signal processing and classification when an EN is used. The proposed methodology involves a normalization stage to scale the data from the sensors, using both the well-known min-max approach and the more recent mean-centered unitary group scaling (MCUGS). Next, a manifold learning algorithm, uniform manifold approximation and projection (UMAP), is applied for data reduction. The reduced-dimension data are then passed to an extreme learning machine (ELM), which serves as the machine learning classifier. To validate the EN classification methodology, three EN datasets were used. The first dataset was composed of 3600 measurements of 6 volatile organic compounds performed by employing 16 metal-oxide gas sensors. The second dataset was composed of 235 measurements of 3 different qualities of wine, namely, high, average, and low, as evaluated by using an EN sensor array composed of 6 different sensors. The third dataset was composed of 309 measurements of 3 different gases obtained by using an EN sensor array of 2 sensors. A 5-fold cross-validation approach was used to evaluate the proposed methodology, and a test set consisting of 25% of the data was used to validate it with unseen data. The results showed an average classification accuracy of 1 (every sample classified correctly) when the MCUGS, UMAP, and ELM methods were used. Finally, the effect of changing the number of target dimensions in the data reduction stage was assessed in terms of the highest average classification accuracy.
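
The scale-then-classify part of the pipeline described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers only min-max scaling, an ELM, and a 25% hold-out split; MCUGS, UMAP, and the 5-fold cross-validation are left out; the data are synthetic; and the names min_max_scale, ELMClassifier, and holdout_accuracy, the tanh activation, the hidden-layer size, and the small ridge term are all choices made here for illustration. The classic ELM formulation solves for the output weights with a Moore-Penrose pseudoinverse; the ridge-regularized solve below is a common, numerically steadier variant.

```python
import numpy as np


def min_max_scale(X, lo=None, hi=None):
    """Scale columns to [0, 1]; pass lo/hi from the training set to scale test data."""
    lo = X.min(axis=0) if lo is None else lo
    hi = X.max(axis=0) if hi is None else hi
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (X - lo) / span, lo, hi


class ELMClassifier:
    """Single hidden layer: random fixed hidden weights, least-squares output weights."""

    def __init__(self, n_hidden=50, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden  # assumed size, not taken from the paper
        self.ridge = ridge        # assumed regularization, not taken from the paper
        self.seed = seed

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        self.classes_ = np.unique(y)
        targets = (y[:, None] == self.classes_[None, :]).astype(float)  # one-hot
        self.W = rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Only the output weights are learned, in a single linear solve.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ targets)
        return self

    def predict(self, X):
        return self.classes_[np.argmax(self._hidden(X) @ self.beta, axis=1)]


def holdout_accuracy(seed=0):
    """Train on 75% of synthetic 3-class data; return accuracy on the held-out 25%."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0, 0, 0, 0], [4, 0, 4, 0], [0, 4, 0, 4]], dtype=float)
    X = np.vstack([c + 0.5 * rng.normal(size=(100, 4)) for c in centers])
    y = np.repeat(np.arange(3), 100)
    order = rng.permutation(len(y))
    test_idx, train_idx = order[:75], order[75:]
    # Scaling limits come from the training split only, so no test data leaks in.
    X_train, lo, hi = min_max_scale(X[train_idx])
    X_test, _, _ = min_max_scale(X[test_idx], lo, hi)
    model = ELMClassifier(seed=seed).fit(X_train, y[train_idx])
    return float(np.mean(model.predict(X_test) == y[test_idx]))


print(f"held-out accuracy: {holdout_accuracy():.3f}")
```

In a fuller version, a dimensionality reduction step such as UMAP would sit between the scaling and the classifier, fitted on the training split and then applied to the test split.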

Online Inertial Machine Learning for Sensor Array Long-Term Drift Compensation

Chemosensors ◽

10.3390/chemosensors9120353 ◽

2021 ◽

Vol 9 (12) ◽

pp. 353

Author(s):

Xiaorui Dong ◽

Shijing Han ◽

Ancheng Wang ◽

Kai Shang

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Sensor Array ◽

Data Preprocessing ◽

Production Costs ◽

Support Vector ◽

Base Classifier ◽

Preprocessing Method ◽

Drift Compensation

Sensor drift is inherent and unavoidable, which makes drift compensation an important research problem. For long-term drift, we propose a data preprocessing method that differs from conventional approaches, together with a machine learning framework that supports online self-training and data analysis without additional sensor production costs. The proposed data preprocessing method can effectively handle sign errors, decimal point errors, and outliers in data samples. The framework, which we call inertial machine learning, takes advantage of the recent inertia of high classification accuracy to extend the reliability of sensors. We establish a memory and forgetting mechanism for the framework, and the choice of base classifier is not limited. In this paper, we use a support vector machine as the base classifier and run experiments on the gas sensor array drift dataset from the UCI machine learning repository. The experimental results show that the classification accuracy is greatly improved, the effective time of the sensor array is extended by 4–10 months, and the time for a single response and model adjustment is less than 300 ms, which suits practical application scenarios. The research ideas and results in this paper may serve as a reference for research in related fields.

An Expert Diagnosis System for Parkinson Disease Based on Genetic Algorithm-Wavelet Kernel-Extreme Learning Machine

Parkinson's Disease ◽

10.1155/2016/5264743 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 14

Author(s):

Derya Avci ◽

Akif Dogantekin

Keyword(s):

Genetic Algorithm ◽

Parkinson Disease ◽

Extreme Learning Machine ◽

Classification Accuracy ◽

Disease Diagnosis ◽

Major Public Health Problem ◽

Diagnosis System ◽

Kernel Extreme Learning Machine ◽

Learning Machine ◽

Hidden Neurons

Parkinson disease is a major public health problem all around the world. This paper proposes an expert disease diagnosis system for Parkinson disease based on genetic algorithm- (GA-) wavelet kernel- (WK-) Extreme Learning Machines (ELM). The classifier used in this paper is a single layer neural network (SLNN) trained by the ELM learning method. The Parkinson disease datasets are obtained from the UCI machine learning database. In the wavelet kernel-Extreme Learning Machine (WK-ELM) structure, the wavelet kernel has three adjustable parameters. These parameters and the number of hidden neurons play a major role in the performance of ELM. In this study, the optimum values of these parameters and the number of hidden neurons of ELM were obtained by using a genetic algorithm (GA). The performance of the proposed GA-WK-ELM method is evaluated using statistical methods such as classification accuracy, sensitivity and specificity analysis, and ROC curves. The highest classification accuracy obtained with the proposed GA-WK-ELM method is 96.81%.

Multi-Classification of Fetal Health Status Using Extreme Learning Machine

ICONTECH INTERNATIONAL JOURNAL ◽

10.46291/icontechvol5iss2pp62-70 ◽

2021 ◽

Vol 5 (2) ◽

pp. 62-70

Author(s):

Ömer KASIM

Keyword(s):

Extreme Learning Machine ◽

Classification Accuracy ◽

Binary Classification ◽

Clinical Decision ◽

Support Vector ◽

Data Set ◽

Multiple Classification ◽

Learning Machine ◽

Multi Class Classification ◽

The University

Cardiotocography (CTG) is used for monitoring the fetal heart rate signals during pregnancy. Evaluation of these signals by specialists provides information about fetal status. A clinical decision support system that classifies these signals automatically helps experts examine CTG data more sensitively. In this study, CTG data were analysed with the Extreme Learning Machine (ELM) algorithm and classified both into three classes (normal, suspicious, and pathological) and into two classes (benign and malicious). The proposed method is validated with the University of California International CTG data set. The performance of the proposed method is evaluated with accuracy, F1 score, Cohen's kappa, precision, and recall metrics. In the experiments, binary classification reached an accuracy of 99.29% with only 1 false positive. Multi-class classification reached an accuracy of 98.12% with 2 false positives. The training and testing times of the ELM algorithm were considerably shorter than those of the support vector machine and multi-layer perceptron. These results show that a high classification accuracy can be obtained by analysing the CTG data with both binary and multi-class classification.
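
The evaluation metrics named in this abstract can all be derived from a confusion matrix. The sketch below is illustrative only: the function names and the four-sample toy labels are made up here and are not CTG data, and the helpers do not guard against empty classes (which would divide by zero).

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    return np.trace(cm) / cm.sum()


def precision_recall_f1(cm, k):
    """Precision, recall, and F1 score when class k is treated as positive."""
    tp = cm[k, k]
    precision = tp / cm[:, k].sum()  # of everything predicted as k, how much is k
    recall = tp / cm[k, :].sum()     # of everything truly k, how much was found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cohen_kappa(cm):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = cm.sum()
    p_observed = np.trace(cm) / total
    p_expected = (cm.sum(axis=0) @ cm.sum(axis=1)) / total**2
    return (p_observed - p_expected) / (1 - p_expected)


# Toy labels: one sample of class 1 is misclassified as class 0.
y_true = [0, 0, 1, 1]
y_pred = [0, 0, 1, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=2)
print("accuracy:", accuracy(cm))
print("precision, recall, F1 for class 1:", precision_recall_f1(cm, 1))
print("Cohen's kappa:", cohen_kappa(cm))
```

Cohen's kappa is worth reporting alongside accuracy on imbalanced data, because it discounts the agreement that a classifier would reach by chance given the class frequencies.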

Assessing the Applicability of Random Forest, Stochastic Gradient Boosted Model, and Extreme Learning Machine Methods to the Quantitative Precipitation Estimation of the Radar Data: A Case Study to Gwangdeoksan Radar, South Korea, in 2018

Advances in Meteorology ◽

10.1155/2019/6542410 ◽

2019 ◽

Vol 2019 ◽

pp. 1-17

Author(s):

Ju-Young Shin ◽

Yonghun Ro ◽

Joo-Wan Cha ◽

Kyu-Rang Kim ◽

Jong-Chul Ha

Keyword(s):

Machine Learning ◽

Random Forest ◽

South Korea ◽

Extreme Learning Machine ◽

Radar Data ◽

Machine Learning Algorithms ◽

Quantitative Precipitation Estimation ◽

Precipitation Estimation ◽

Learning Machine ◽

Estimation Models

Machine learning algorithms should be tested for use in quantitative precipitation estimation models of rain radar data in South Korea because such an application can provide a more accurate estimate of rainfall than the conventional ZR relationship-based model. The applicability of random forest, stochastic gradient boosted model, and extreme learning machine methods to quantitative precipitation estimation models was investigated using case studies with polarization radar data from Gwangdeoksan radar station. Various combinations of input variable sets were tested, and results showed that machine learning algorithms can be applied to build the quantitative precipitation estimation model of the polarization radar data in South Korea. The machine learning-based quantitative precipitation estimation models led to better performances than ZR relationship-based models, particularly for heavy rainfall events. The extreme learning machine is considered the best of the algorithms used based on evaluation criteria.

A novel randomized machine learning approach: Reservoir computing extreme learning machine

Applied Soft Computing ◽

10.1016/j.asoc.2020.106433 ◽

2020 ◽

Vol 94 ◽

pp. 106433

Author(s):

Ömer Faruk Ertuğrul

Keyword(s):

Machine Learning ◽

Extreme Learning Machine ◽

Learning Approach ◽

Reservoir Computing ◽

Machine Learning Approach ◽

Learning Machine

Retracted: Medical Dataset Classification: A Machine Learning Paradigm Integrating Particle Swarm Optimization with Extreme Learning Machine Classifier

The Scientific World JOURNAL ◽

10.1155/2016/7137054 ◽

2016 ◽

Vol 2016 ◽

pp. 1-1

Keyword(s):

Machine Learning ◽

Particle Swarm Optimization ◽

Extreme Learning Machine ◽

Particle Swarm ◽

Learning Paradigm ◽

Swarm Optimization ◽

Learning Machine ◽

Medical Dataset

Automatic Detection of Epilepsy and Seizure Using Multiclass Sparse Extreme Learning Machine Classification

Computational and Mathematical Methods in Medicine ◽

10.1155/2017/6849360 ◽

2017 ◽

Vol 2017 ◽

pp. 1-10 ◽

Cited By ~ 9

Author(s):

Yuanfa Wang ◽

Zunchao Li ◽

Lichen Feng ◽

Chuang Zheng ◽

Wenhao Zhang

Keyword(s):

Computational Complexity ◽

Extreme Learning Machine ◽

Classification System ◽

Classification Accuracy ◽

Detection System ◽

Automatic Detection ◽

Seizure Detection ◽

Discrete Wavelet ◽

Eeg Signals ◽

Learning Machine

An automatic detection system for distinguishing normal, ictal, and interictal electroencephalogram (EEG) signals is of great help in clinical practice. This paper presents a three-class classification system based on discrete wavelet transform (DWT) and the nonlinear sparse extreme learning machine (SELM) for epilepsy and epileptic seizure detection. Three-level lifting DWT using the Daubechies order 4 wavelet is introduced to decompose EEG signals into delta, theta, alpha, and beta subbands. Considering classification accuracy and computational complexity, the maximum and standard deviation values of each subband are computed to create an eight-dimensional feature vector. After comparing five multiclass SELM strategies, the one-against-one strategy, which has the highest accuracy, is chosen for the three-class classification system. The performance of the designed three-class classification system is tested with a publicly available epilepsy dataset. The results show that the system achieves sufficiently high classification accuracy by combining the SELM and DWT and reduces training and testing time by decreasing computational complexity and feature dimension. With excellent classification performance and low computational complexity, this three-class classification system can be used for practical epileptic EEG detection, and it offers great potential for portable automatic epilepsy and seizure detection systems in future hardware implementations.
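
The feature construction described in this abstract (a three-level decomposition into four subbands, then the maximum and standard deviation of each) can be sketched as below. Note the substitution: for a self-contained NumPy example, the Haar wavelet stands in for the lifting-based Daubechies order 4 transform used in the paper, so the coefficients differ from the authors'. The function names and the random test signal are illustrative assumptions, and no mapping from subbands to EEG frequency bands is implied, since that depends on the sampling rate.

```python
import numpy as np


def haar_step(x):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    x = x[: len(x) // 2 * 2]  # drop a trailing sample if the length is odd
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)
    detail = (even - odd) / np.sqrt(2)
    return approx, detail


def subband_features(signal, levels=3):
    """Max and standard deviation of each subband: 2 * (levels + 1) features."""
    approx = np.asarray(signal, dtype=float)
    subbands = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        subbands.append(detail)  # one detail band per level
    subbands.append(approx)      # plus the final approximation band
    return np.array([stat(band) for band in subbands for stat in (np.max, np.std)])


# Synthetic stand-in for one EEG segment (not real data).
rng = np.random.default_rng(0)
segment = rng.normal(size=256)
features = subband_features(segment)
print(features.shape)
```

With three levels the function returns eight values per segment, matching the feature dimension stated above; these vectors would then be the classifier input.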

Review Process on URL Phishing

International Journal of Scientific Research in Science and Technology ◽

10.32628/ijsrst218344 ◽

2021 ◽

pp. 241-244

Author(s):

Vivek Sharma S ◽

Hemalatha R ◽

Kavyashree Y B

Keyword(s):

Machine Learning ◽

Extreme Learning Machine ◽

Review Process ◽

Learning Machine

Phishing is the most common and most dangerous attack among cybercrimes. The aim of these attacks is to steal the data that people and organizations use to perform transactions, or other vital information. The goal of this work is to apply an Extreme Learning Machine (ELM) to the classification of features from the Phishing Websites data in the UC Irvine Machine Learning Repository. For the assessment of results, ELM was compared with other machine learning methods, namely support vector machine (SVM) and Naive Bayes (NB), and was found to have the highest accuracy.

An introductory study to machine learning and its application to employee turnover prediction

Revista dos Trabalhos de Iniciação Científica da UNICAMP ◽

10.20396/revpibic262018679 ◽

2019 ◽

Author(s):

João Pedro Pazinato Cruz de Oliveira ◽

Leonardo Tomazeli Duarte

Keyword(s):

Machine Learning ◽

Extreme Learning Machine ◽

Test Data ◽

Employee Turnover ◽

Learning Machine ◽

Elm Classifier

The objective of this paper is to study the problem of employee turnover prediction and to develop a classifier that uses employee data to identify those who have a greater tendency to leave the company voluntarily. For this purpose, the data of 8724 employees from a real Brazilian beverage company were used to train an Extreme Learning Machine (ELM) classifier, assigning to each sample a weight inversely proportional to the size of its class. After training, the classifier reached an overall accuracy of 79% on the test data.
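
The weighting scheme mentioned in this abstract (each sample weighted inversely to its class size) can be folded into the ELM output-weight solve as a weighted least-squares problem. The sketch below is one plausible reading, not the authors' code: the function names, the tanh activation, the hidden-layer size, the ridge term, and the synthetic imbalanced data are all assumptions made for illustration.

```python
import numpy as np


def class_balanced_weights(y):
    """Weight each sample by 1 / (number of samples in its class)."""
    _, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]


def fit_weighted_elm(X, y, n_hidden=30, ridge=1e-3, seed=0):
    """Random hidden layer, then a sample-weighted ridge solve for the output weights."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    targets = (y[:, None] == classes[None, :]).astype(float)  # one-hot
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    w = class_balanced_weights(y)
    # Weighted normal equations: (H^T D H + ridge * I) beta = H^T D T, with D = diag(w).
    A = H.T @ (H * w[:, None]) + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ (targets * w[:, None]))
    return classes, W, b, beta


def predict_elm(model, X):
    classes, W, b, beta = model
    return classes[np.argmax(np.tanh(X @ W + b) @ beta, axis=1)]


# Synthetic imbalanced data: 180 samples of class 0 and 20 of class 1.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(180, 3)), rng.normal(2.0, 1.0, size=(20, 3))])
y = np.array([0] * 180 + [1] * 20)
model = fit_weighted_elm(X, y)
predictions = predict_elm(model, X)
print("predicted class counts:", np.bincount(predictions, minlength=2))
```

Because the weights within each class sum to 1, both classes contribute equally to the fit regardless of how many samples each has, which keeps the majority class from dominating the solution.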

Multi-step metal prices forecasting based on a data preprocessing method and an optimized extreme learning machine by marine predators algorithm

Resources Policy ◽

10.1016/j.resourpol.2021.102335 ◽

2021 ◽

Vol 74 ◽

pp. 102335

Author(s):

Pei Du ◽

Ju’e Guo ◽

Shaolong Sun ◽

Shouyang Wang ◽

Jing Wu

Keyword(s):

Extreme Learning Machine ◽

Data Preprocessing ◽

Marine Predators ◽

Preprocessing Method ◽

Learning Machine