Online Inertial Machine Learning for Sensor Array Long-Term Drift Compensation

Chemosensors ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 353
Author(s):  
Xiaorui Dong ◽  
Shijing Han ◽  
Ancheng Wang ◽  
Kai Shang

Sensor drift is an objective and inevitable problem, and drift compensation is of essential research significance. For long-term drift, we propose a data preprocessing method that differs from conventional research methods, together with a machine learning framework that supports online self-training and data analysis without additional sensor production costs. The proposed preprocessing method can effectively solve the problems of sign errors, decimal point errors, and outliers in data samples. The framework, which we call inertial machine learning, takes advantage of the recent inertia of high classification accuracy to extend the reliability of sensors. We establish a reasonable memory and forgetting mechanism for the framework, and the choice of base classifier is not limited. In this paper, we use a support vector machine as the base classifier and run experiments on the gas sensor array drift dataset from the UCI Machine Learning Repository. Analysis of the experimental results shows that classification accuracy is greatly improved, the effective time of the sensor array is extended by 4–10 months, and the time for a single response and model adjustment is less than 300 ms, which fits practical application scenarios well. The research ideas and results in this paper offer a useful reference for related fields.
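The abstract does not spell out the memory and forgetting mechanism, but the core idea of retraining an SVM base classifier on a sliding window of recent samples can be sketched as follows. This is a hypothetical simplification on synthetic drifting data; `make_batch`, `WINDOW`, and the drift model are illustrative assumptions, not the authors' code:

```python
# Sliding-window "memory and forgetting" retraining sketch with an SVM
# base classifier, evaluated on synthetic two-class data whose feature
# distribution drifts over time.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_batch(t, n=60):
    """Two-class synthetic batch whose class means drift linearly with t."""
    drift = 0.05 * t
    X0 = rng.normal(0.0 + drift, 1.0, (n, 4))
    X1 = rng.normal(2.0 + drift, 1.0, (n, 4))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

WINDOW = 3                       # "memory": keep only the most recent batches
memory_X, memory_y = [], []
accuracies = []

for t in range(10):
    X, y = make_batch(t)
    if memory_X:                 # score the incoming batch with the old model
        accuracies.append(model.score(X, y))
    memory_X.append(X)           # memorize the new batch ...
    memory_y.append(y)
    memory_X = memory_X[-WINDOW:]   # ... and forget batches beyond the window
    memory_y = memory_y[-WINDOW:]
    model = SVC(kernel="rbf", C=1.0)        # retrain on the retained window
    model.fit(np.vstack(memory_X), np.concatenate(memory_y))

mean_acc = float(np.mean(accuracies))
```

The window length controls the memory/forgetting trade-off: a longer window gives more training data but lets stale, drifted samples linger.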

Mathematics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 29
Author(s):  
Jersson X. Leon-Medina ◽  
Núria Parés ◽  
Maribel Anaya ◽  
Diego A. Tibaduiza ◽  
Francesc Pozo

The classification and use of robust methodologies in sensor array applications of electronic noses (ENs) remain an open problem. Among the several steps used in the developed methodologies, data preprocessing improves the classification accuracy of this type of sensor. Data preprocessing methods, such as data transformation and data reduction, enable the treatment of data with anomalies, such as outliers and features that do not provide quality information; in addition, they reduce the dimensionality of the data, thereby facilitating the tasks of a machine learning classifier. To help solve this problem, this study introduces a machine learning methodology to improve signal processing and develop classification methodologies for ENs. The proposed methodology involves a normalization stage to scale the data from the sensors, using both the well-known min-max approach and the more recent mean-centered unitary group scaling (MCUGS). Next, a manifold learning algorithm for data reduction is applied using uniform manifold approximation and projection (UMAP). The dimensionality of the data at the input of the classifier is thus reduced, and an extreme learning machine (ELM) is used as the machine learning classification algorithm. To validate the EN classification methodology, three EN datasets were used. The first dataset consisted of 3600 measurements of 6 volatile organic compounds performed with 16 metal-oxide gas sensors. The second dataset consisted of 235 measurements of 3 different qualities of wine (high, average, and low), evaluated with an EN sensor array composed of 6 different sensors. The third dataset consisted of 309 measurements of 3 different gases obtained with an EN sensor array of 2 sensors. A 5-fold cross-validation approach was used to evaluate the proposed methodology, and a test set consisting of 25% of the data was used to validate it with unseen data. The results showed an average classification accuracy of 1 (fully correct classification) when the MCUGS, UMAP, and ELM methods were used. Finally, the effect of the number of target dimensions of the data reduction was assessed in terms of the highest average classification accuracy.
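The extreme learning machine is the least standard piece of this pipeline: a random, untrained hidden layer followed by a least-squares output layer. A minimal NumPy sketch is shown below; the random blobs stand in for UMAP-reduced EN features, and the `ELM` class and its parameters are illustrative assumptions, not the authors' implementation:

```python
# Minimal extreme learning machine (ELM) classifier: random hidden layer,
# ridge-regularized least-squares output layer on one-hot targets.
import numpy as np

rng = np.random.default_rng(1)

class ELM:
    def __init__(self, n_hidden=50, reg=1e-3, seed=0):
        self.n_hidden, self.reg = n_hidden, reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        # hidden weights are random and never trained
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(y.max() + 1)[y]                    # one-hot targets
        # only the output weights are solved for, in closed form
        self.beta = np.linalg.solve(H.T @ H + self.reg * np.eye(self.n_hidden),
                                    H.T @ T)
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)

# three well-separated blobs standing in for UMAP-reduced EN features
X = np.vstack([rng.normal(c, 0.3, (40, 3)) for c in (0.0, 2.0, 4.0)])
y = np.repeat([0, 1, 2], 40)
Xs = (X - X.mean(axis=0)) / X.std(axis=0)             # standardize features

elm = ELM().fit(Xs, y)
acc = (elm.predict(Xs) == y).mean()
```

Because only the output layer is trained, fitting an ELM is a single linear solve, which is what makes it attractive after a dimensionality-reduction stage.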


Sensor Review ◽  
2014 ◽  
Vol 34 (3) ◽  
pp. 304-311 ◽  
Author(s):  
Pengfei Jia ◽  
Fengchun Tian ◽  
Shu Fan ◽  
Qinghua He ◽  
Jingwei Feng ◽  
...  

Purpose – When an electronic nose (E-nose) is used to detect wound infection, the optimization of the sensor array and the parameter settings of the classifier have a strong impact on classification accuracy. The purpose of this paper is to propose a new optimization algorithm that optimizes the sensor array and the classifier synchronously, to improve the performance of the E-nose in detecting wound infection.
Design/methodology/approach – An enhanced quantum-behaved particle swarm optimization based on a genetic algorithm, genetic quantum-behaved particle swarm optimization (G-QPSO), is proposed to realize the synchronous optimization of sensor array and classifier. The importance-factor (I-F) method is used to weight the sensors of the E-nose by their degree of importance in classification. Both a radial basis function network and a support vector machine are used for classification.
Findings – The classification accuracy of the E-nose is highest when the weighting coefficients of the I-F method and the classifier's parameters are optimized by G-QPSO. All results make it clear that the proposed method is an effective optimization method for an E-nose in the detection of wound infection.
Research limitations/implications – To make the proposed optimization method more effective, the key point of further research is to enhance the classifier of the E-nose.
Practical implications – In this paper, an E-nose is used to distinguish the class of wound infection, and G-QPSO is used to optimize the sensor array and classifier of the E-nose synchronously. Both are important steps toward the clinical application of the E-nose in wound monitoring.
Originality/value – The proposed method improves the performance of the E-nose in wound monitoring and paves the way for its clinical use.
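The quantum-behaved PSO core of G-QPSO can be sketched as below. The genetic crossover/mutation step that makes it "G-"QPSO is omitted, and a toy sphere objective stands in for the E-nose classification accuracy; in the paper, the decision vector would instead encode the I-F weighting coefficients and the classifier parameters:

```python
# Minimal quantum-behaved PSO (QPSO) sketch minimizing a toy objective.
# Each particle collapses around a local attractor p, with a step scaled
# by its distance to the mean best position (mbest).
import numpy as np

rng = np.random.default_rng(2)

def objective(x):
    return np.sum(x ** 2)          # stand-in for 1 - classification accuracy

def qpso(dim=5, n_particles=20, iters=200, beta=0.75):
    X = rng.uniform(-5, 5, (n_particles, dim))
    pbest = X.copy()
    pcost = np.array([objective(x) for x in X])
    for _ in range(iters):
        gbest = pbest[pcost.argmin()]
        mbest = pbest.mean(axis=0)                      # mean best position
        for i in range(n_particles):
            phi = rng.uniform(size=dim)
            p = phi * pbest[i] + (1 - phi) * gbest      # local attractor
            u = rng.uniform(size=dim)
            sign = np.where(rng.uniform(size=dim) < 0.5, 1.0, -1.0)
            X[i] = p + sign * beta * np.abs(mbest - X[i]) * np.log(1 / u)
            c = objective(X[i])
            if c < pcost[i]:                            # update personal best
                pbest[i], pcost[i] = X[i].copy(), c
    return pbest[pcost.argmin()], pcost.min()

best_x, best_cost = qpso()
```

The contraction-expansion coefficient `beta` plays the role that inertia and acceleration constants play in classical PSO; the genetic hybridization in G-QPSO would add recombination among the `pbest` positions each generation.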


2018 ◽  
Vol 50 (2) ◽  
pp. 655-671
Author(s):  
Tian Liu ◽  
Yuanfang Chen ◽  
Binquan Li ◽  
Yiming Hu ◽  
Hui Qiu ◽  
...  

Due to the large uncertainties of long-term precipitation prediction and reservoir operation, it is difficult to forecast long-term streamflow for large basins with cascade reservoirs. In this paper, a framework coupling the original Climate Forecasting System (CFS) precipitation with the Soil and Water Assessment Tool (SWAT) was proposed to forecast the nine-month streamflow for the Cascade Reservoir System of Han River (CRSHR), including the Shiquan, Ankang, and Danjiangkou reservoirs. First, CFS precipitation was tested against the observation and post-processed through two machine learning algorithms, random forest and support vector regression. Results showed that the correlation coefficients between the monthly areal CFS precipitation (post-processed) and the observation were 0.91–0.96, confirming that CFS precipitation post-processing using machine learning was not affected by the extended forecast period. Additionally, two precipitation spatio-temporal distribution models, the original CFS distribution and a similar historical observed distribution, were adopted to disaggregate the processed monthly areal CFS precipitation to daily subbasin-scale precipitation. Based on the reservoir restoring flow, the regional SWAT was calibrated for the CRSHR. The Nash–Sutcliffe efficiencies of the flow simulations for the three reservoirs were 0.86, 0.88, and 0.84, respectively, meeting the accuracy requirement. The experimental forecast showed that, for all three reservoirs, the long-term streamflow forecast based on the similar historical observed distribution was more accurate than that based on the original CFS.
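The post-processing step, correcting biased forecast precipitation toward gauge observations with one of the two machine learning algorithms (here, a random forest), might look like the following sketch; the gamma-distributed "observations" and the bias model are synthetic stand-ins, not the study's data:

```python
# Post-process biased, noisy forecast precipitation toward observations
# with a random forest regressor, and compare errors before and after.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 240                                    # e.g. 20 years of monthly values
obs = rng.gamma(shape=2.0, scale=50.0, size=n)      # "observed" precip (mm)
cfs = 0.7 * obs + 20 + rng.normal(0, 10, size=n)    # biased, noisy forecast

X = cfs.reshape(-1, 1)
train, test = slice(0, 180), slice(180, None)       # hold out the last 60
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X[train], obs[train])
corrected = rf.predict(X[test])

rmse_raw = np.sqrt(np.mean((cfs[test] - obs[test]) ** 2))
rmse_corr = np.sqrt(np.mean((corrected - obs[test]) ** 2))
```

In practice the regression would use more predictors than the raw forecast alone (season, lead time, regional covariates), but the structure of the correction is the same.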


2021 ◽  
Author(s):  
Zhong Zhao ◽  
Haiming Tang ◽  
Xiaobin Zhang ◽  
Xingda Qu ◽  
Jianping Lu

BACKGROUND
Abnormal gaze behavior is a prominent feature of autism spectrum disorder (ASD). Previous eye-tracking studies had participants watch images (i.e., pictures, videos, and webpages), and the application of machine learning (ML) to these data showed promising results in identifying ASD individuals. However, although gaze behavior in face-to-face interaction differs from that in image-viewing tasks, no study has investigated whether natural social gaze behavior could accurately identify ASD.
OBJECTIVE
The objective of this study was to examine whether, and which, area of interest (AOI)-based features extracted from natural social gaze behavior could identify ASD.
METHODS
Children with ASD and typically developing (TD) children were eye-tracked while engaged in a face-to-face conversation with an interviewer. Four ML classifiers (support vector machine, SVM; linear discriminant analysis, LDA; decision tree, DT; and random forest, RF) were used to determine the maximum classification accuracy and the corresponding features.
RESULTS
A maximum classification accuracy of 84.62% was achieved with three classifiers (LDA, DT, and RF). Results showed that the mouth AOI, but not the eyes AOI, was a powerful feature for detecting ASD.
CONCLUSIONS
Natural gaze behavior could be leveraged to identify ASD, suggesting that ASD might be objectively screened with eye-tracking technology in everyday social interaction. In addition, the comparison between our findings and previous ones suggests that the eye-tracking features that identify ASD might be culture dependent and context sensitive.
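The four-classifier comparison can be sketched with scikit-learn on synthetic AOI features. The feature definitions and group means below are illustrative assumptions that loosely echo the finding that the mouth AOI separates the groups; they are not the study's data:

```python
# Compare the four classifiers from the study (SVM, LDA, DT, RF) on
# synthetic AOI-based gaze features: fixation fractions on mouth and eyes.
import numpy as np
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n = 40  # participants per group
# hypothetical effect: the ASD group fixates less on the mouth AOI
asd = np.column_stack([rng.normal(0.20, 0.08, n),    # mouth fixation fraction
                       rng.normal(0.30, 0.10, n)])   # eyes fixation fraction
td  = np.column_stack([rng.normal(0.45, 0.08, n),
                       rng.normal(0.32, 0.10, n)])
X = np.vstack([asd, td])
y = np.array([1] * n + [0] * n)                      # 1 = ASD, 0 = TD

scores = {}
for name, clf in [("SVM", SVC()),
                  ("LDA", LinearDiscriminantAnalysis()),
                  ("DT", DecisionTreeClassifier(random_state=0)),
                  ("RF", RandomForestClassifier(random_state=0))]:
    scores[name] = cross_val_score(clf, X, y, cv=5).mean()
```

With group means this far apart on the mouth feature, every classifier does well; the interesting comparisons in the study come from which AOI features carry the signal.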


2021 ◽  
Vol 12 ◽  
Author(s):  
Bidhan Lamichhane ◽  
Andy G. S. Daniel ◽  
John J. Lee ◽  
Daniel S. Marcus ◽  
Joshua S. Shimony ◽  
...  

Glioblastoma multiforme (GBM) is the most frequently occurring brain malignancy. Due to its poor prognosis with currently available treatments, there is a pressing need for easily accessible, non-invasive techniques that help inform pre-treatment planning and patient counseling and improve outcomes. In this study, we determined the feasibility of using resting-state functional connectivity (rsFC) to classify GBM patients into short-term and long-term survival groups with respect to the reported median survival (14.6 months). We used a support vector machine with rsFC between regions of interest as predictive features. We employed a novel hybrid feature selection method whereby features were first filtered using correlations between rsFC and overall survival (OS), and then the established method of recursive feature elimination (RFE) was used to select the optimal feature subset. Leave-one-subject-out cross-validation was used to evaluate the performance of the models. Classification accuracy between short- and long-term survival groups was 71.9%. Sensitivity and specificity were 77.1% and 65.5%, respectively. The area under the receiver operating characteristic curve was 0.752 (95% CI, 0.62–0.88). These findings suggest that highly specific features of rsFC may predict GBM survival. Taken together, the findings of this study support that resting-state fMRI and machine learning analytics could enable a radiomic biomarker for GBM, augmenting care and planning for individual patients.
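The two-stage hybrid feature selection, a correlation filter followed by RFE around a linear SVM, can be sketched as follows. The synthetic features stand in for rsFC values, and the thresholds (keep the top 30, then select 5) are assumptions for illustration:

```python
# Hybrid feature selection sketch: (1) filter features by absolute
# correlation with the label, (2) refine with recursive feature
# elimination (RFE) wrapped around a linear SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE

rng = np.random.default_rng(5)
n, p, informative = 120, 200, 5
X = rng.normal(size=(n, p))
y = (X[:, :informative].sum(axis=1) > 0).astype(int)  # first 5 features matter

# stage 1: keep the 30 features most correlated with the label
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
keep = np.argsort(corr)[-30:]

# stage 2: RFE with a linear SVM down to the final feature subset
rfe = RFE(SVC(kernel="linear"), n_features_to_select=5).fit(X[:, keep], y)
selected = keep[rfe.support_]
```

The cheap univariate filter shrinks the search space so that the expensive wrapper (repeated SVM fits inside RFE) only has to rank a manageable number of candidates.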


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Lingyu Dong

In recent years, wireless sensor network technology has continued to develop and has become one of the research hotspots in the information field. Requirements for the communication rate and coverage of communication networks keep rising, which makes the limited coverage of wireless mobile communication networks and the insufficient utilization of wireless resources increasingly prominent problems. This article studies a support vector regression method for long-term prediction in the context of wireless network communication and applies the method to the regional economy. It uses a contrast-experiment method and a space-occupancy-rate algorithm, combined with the support vector regression algorithm of machine learning. Studying how machine learning behaves with little sample data addresses the lack of a unified framework for machine learning with limited samples. In the experiments, the distance between AP1 and AP2 is 0.4 m and the distance between AP2 and Client2 is 0.6 m. BPSK is used for OFDM modulation, with 2500 MHz as the USRP center frequency and 0.5 MHz as the USRP bandwidth. AP1 sends data packets of 100 bytes each, 100 packets in total; the gain of Client2 is 0–38, the receiving gain of AP2 is 0, and the receiving gain of AP1 is 19. The support vector regression method based on wireless network communication performed well for regional economic mid- and long-term prediction.
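Setting aside the networking experiment, the core method, support vector regression for multi-step series prediction, can be sketched with lagged features on a synthetic seasonal series. The series, lag count, and SVR parameters below are assumptions, not the article's setup:

```python
# Support vector regression for series prediction: use the previous
# LAGS values as features, hold out the final year as the forecast horizon.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(6)
t = np.arange(120)                                      # 10 "years" monthly
series = np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.1, t.size)

LAGS = 12
X = np.array([series[i - LAGS:i] for i in range(LAGS, t.size)])
y = series[LAGS:]

svr = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X[:-12], y[:-12])
pred = svr.predict(X[-12:])                             # last 12 months held out
mae = float(np.mean(np.abs(pred - y[-12:])))
```

For genuinely long horizons the prediction would be iterated (each forecast fed back as a lag feature), which compounds error; that is exactly where the small-sample behavior the article studies matters.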


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6491
Author(s):  
Le Zhang ◽  
Jeyan Thiyagalingam ◽  
Anke Xue ◽  
Shuwen Xu

Classification of clutter, especially in the context of shore-based radars, plays a crucial role in several applications. However, the task of distinguishing and classifying sea clutter from land clutter has historically been performed using clutter models and/or coastal maps. In this paper, we propose two machine learning approaches, in particular neural network-based approaches, for sea-land clutter separation: the regularized randomized neural network (RRNN) and the kernel ridge regression neural network (KRR). We use a number of features, such as energy variation, discrete signal amplitude change frequency, autocorrelation performance, and other statistical characteristics of the respective clutter distributions, to improve classification performance. Our evaluation, based on a unique mixed dataset comprising partially synthetic clutter data for land and real clutter data from sea, shows improved classification accuracy. More specifically, the RRNN and KRR methods achieve 98.50% and 98.75% accuracy, respectively, outperforming conventional support vector machine and extreme learning machine-based solutions.
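The exact RRNN and KRR formulations are not given in the abstract, but one common way to use kernel ridge regression as a binary classifier, regressing on ±1 labels and taking the sign of the prediction, is sketched below on synthetic stand-ins for the statistical clutter features listed above:

```python
# Kernel ridge regression as a binary sea/land clutter classifier:
# regress on +1 (sea) / -1 (land) targets and threshold at zero.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(7)
n = 100
# two synthetic features per sample (e.g. energy variation, autocorrelation)
sea  = np.column_stack([rng.normal(1.0, 0.3, n), rng.normal(0.8, 0.2, n)])
land = np.column_stack([rng.normal(0.2, 0.3, n), rng.normal(0.3, 0.2, n)])
X = np.vstack([sea, land])
y = np.array([1.0] * n + [-1.0] * n)

krr = KernelRidge(kernel="rbf", alpha=0.1, gamma=1.0).fit(X, y)
pred = np.sign(krr.predict(X))
acc = float((pred == y).mean())
```

Like the ELM, KRR training reduces to a single linear solve (in the kernel space), which is part of the appeal of both methods over iteratively trained networks.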


2019 ◽  
Vol 26 (12) ◽  
pp. 1493-1504 ◽  
Author(s):  
Jihyun Park ◽  
Dimitrios Kotzias ◽  
Patty Kuo ◽  
Robert L Logan IV ◽  
Kritzia Merced ◽  
...  

Objective
Amid electronic health records, laboratory tests, and other technology, office-based patient and provider communication is still the heart of primary medical care. Patients typically present multiple complaints, requiring physicians to decide how to balance competing demands. How this time is allocated has implications for patient satisfaction, payments, and quality of care. We investigate the effectiveness of machine learning methods for automated annotation of medical topics in patient-provider dialog transcripts.
Materials and Methods
We used dialog transcripts from 279 primary care visits to predict talk-turn topic labels. Different machine learning models were trained to operate on single or multiple local talk-turns (logistic classifiers, support vector machines, gated recurrent units) as well as sequential models that integrate information across talk-turn sequences (conditional random fields, hidden Markov models, and hierarchical gated recurrent units).
Results
Evaluation was performed using cross-validation to measure (1) classification accuracy for talk-turns and (2) precision, recall, and F1 scores at the visit level. Experimental results showed that sequential models had higher classification accuracy at the talk-turn level and higher precision at the visit level. Independent models had higher recall scores at the visit level compared with sequential models.
Conclusions
Incorporating sequential information across talk-turns improves the accuracy of topic prediction in patient-provider dialog by smoothing out noisy information from talk-turns. Although the results are promising, more advanced prediction techniques and larger labeled datasets will likely be required to achieve prediction performance appropriate for real-world clinical applications.
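The benefit of sequential context can be illustrated with a much simpler stand-in than CRFs or HMMs: compare a per-talk-turn classifier with one that also sees the neighboring turns, on a synthetic "sticky topic" sequence. Everything below is an illustrative assumption, not the study's models or data:

```python
# Per-turn vs. context-window topic classification on a synthetic dialog:
# topics persist across talk-turns, so neighboring turns carry signal.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
T = 600
topics = np.zeros(T, dtype=int)
for i in range(1, T):                        # topics tend to persist
    topics[i] = topics[i - 1] if rng.uniform() < 0.9 else rng.integers(3)
feats = topics[:, None] + rng.normal(0, 1.2, (T, 1))   # noisy turn features

X_single = feats                                        # turn t only
X_context = np.column_stack([feats[:-2],                # turns t-1, t, t+1
                             feats[1:-1],
                             feats[2:]])
y_single, y_context = topics, topics[1:-1]

acc_single = cross_val_score(LogisticRegression(), X_single, y_single, cv=5).mean()
acc_context = cross_val_score(LogisticRegression(), X_context, y_context, cv=5).mean()
```

The sequential models in the paper exploit the same persistence, but model it explicitly through label transitions rather than through concatenated features.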


Author(s):  
Maria Mohammad Yousef

Medical dataset classification has become one of the biggest problems in data mining research. Every database has a given number of features, but some of these features can be redundant, can be harmful, and can disrupt the classification process; this is known as the high-dimensionality problem. Dimensionality reduction in data preprocessing is critical for increasing the performance of machine learning algorithms. In addition, feature subset selection contributes to dimensionality reduction and gives a significant improvement in classification accuracy. In this paper, we propose a new hybrid feature selection approach, based on a genetic algorithm (GA) assisted by the k-Nearest Neighbor (KNN) method, to deal with the high dimensionality of biomedical data classification. The proposed method first applies the combination of GA and KNN for feature selection to find the optimal subset of features, with the classification accuracy of the KNN method used as the fitness function of the GA. After selecting the best subset of features, a Support Vector Machine (SVM) is used as the classifier. The proposed method was evaluated on five medical datasets from the UCI Machine Learning Repository. The suggested technique performs admirably on these databases, achieving higher classification accuracy while using fewer features.
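A small GA with KNN cross-validation accuracy as the fitness function, followed by an SVM on the winning subset, might look like the sketch below. The GA operators, population size, and synthetic data are assumptions; the paper's exact operators are not specified in the abstract:

```python
# GA feature selection with KNN accuracy as fitness, then an SVM on the
# selected subset. Chromosomes are boolean masks over the feature set.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
n, p = 150, 20
X = rng.normal(size=(n, p))
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)   # 3 informative features

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(knn, X[:, mask], y, cv=3).mean()

pop = rng.uniform(size=(12, p)) < 0.3               # random initial subsets
for _ in range(15):                                 # a few GA generations
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[::-1][:6]]     # truncation selection
    children = []
    for _ in range(6):
        a, b = parents[rng.integers(6)], parents[rng.integers(6)]
        cut = rng.integers(1, p)                    # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.uniform(size=p) < 0.05           # bit-flip mutation
        children.append(child ^ flip)
    pop = np.vstack([parents, children])            # elitist replacement

best = pop[np.argmax([fitness(m) for m in pop])]
svm_acc = cross_val_score(SVC(), X[:, best], y, cv=5).mean()
```

Keeping the parents each generation (elitism) guarantees the best fitness never degrades, which matters with so few generations.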

