Evaluating the Impact of a Two-Stage Multivariate Data Cleansing Approach to Improve to the Performance of Machine Learning Classifiers: A Case Study in Human Activity Recognition

Sensors, 2020, Vol 20 (7), pp. 1858
Author(s): Dionicio Neira-Rodado, Chris Nugent, Ian Cleland, Javier Velasquez, Amelec Viloria

Human activity recognition (HAR) is a popular field of study. Outcomes of projects in this area have the potential to improve the quality of life of people with conditions such as dementia. HAR focuses primarily on applying machine learning classifiers to data from low-level sensors such as accelerometers. The performance of these classifiers can be improved through an adequate training process. To that end, multivariate outlier detection was used to improve the quality of the data in the training set and, subsequently, the performance of the classifier. The impact of the technique was evaluated with k-nearest neighbors (KNN) and random forest (RF) classifiers. In the case of KNN, the performance of the classifier improved from 55.9% to 63.59%.
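The abstract does not say which multivariate outlier detector was used or what the two stages consist of, so the following is only a minimal sketch of the general idea: filter multivariate outliers out of the training set before fitting a classifier. It assumes a per-class squared Mahalanobis distance with a quantile cutoff; the function names and the `keep_quantile` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X from `mean`."""
    diff = X - mean
    # The pseudo-inverse tolerates a singular covariance matrix.
    inv_cov = np.linalg.pinv(cov)
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)


def clean_training_set(X, y, keep_quantile=0.95):
    """Assumed cleansing rule (not the paper's): within each class, drop
    the samples whose squared distance exceeds that class's quantile."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = np.zeros(len(y), dtype=bool)
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        Xc = X[idx]
        d2 = mahalanobis_sq(Xc, Xc.mean(axis=0), np.cov(Xc, rowvar=False))
        keep[idx[d2 <= np.quantile(d2, keep_quantile)]] = True
    return X[keep], y[keep]
```

The cleaned arrays would then be passed to the classifier's training step in place of the raw training set. The test set is left untouched so that the evaluation still reflects unfiltered data.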

Human activity recognition (HAR), and assisting users on the basis of their context, has attracted researchers for a decade. Researchers are working to increase detection accuracy by various means. A challenging issue is determining the correct supervised classifier for detection. This paper examines the methodology used for HAR and the impact of the classifiers used in training and testing. We also try to identify a suitable supervised machine learning model for HAR. Data from 30 users, with 561 features derived from smartphone accelerometer and gyroscope sensors, taken from the UCI repository, are used for evaluation. Nine different supervised machine learning models are trained and tested on the dataset. The results indicate that HAR performance depends on the classifier used. They also indicate that, of the nine machine learning models, ANN performs well, followed by SVM, kNN, Random Forest and Extra Tree, which are equally good models for HAR, with accuracy and execution time as the performance evaluation metrics.
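A comparison of this kind, scoring several classifiers on accuracy and execution time, can be sketched as a small harness. This is an illustrative assumption, not the authors' code: `benchmark` is a hypothetical helper that accepts any object exposing `fit` and `predict`, and `NearestNeighbor` is a toy stand-in included only so the example runs without extra dependencies.

```python
import time

import numpy as np


class NearestNeighbor:
    """Toy 1-NN classifier used as a placeholder model."""

    def fit(self, X, y):
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distance from every test row to every training row.
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        return self.y_[d2.argmin(axis=1)]


def benchmark(models, X_train, y_train, X_test, y_test):
    """Return {name: (accuracy, seconds)} for each model.

    The timing covers both fitting and prediction.
    """
    y_test = np.asarray(y_test)
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        elapsed = time.perf_counter() - start
        results[name] = (float(np.mean(predictions == y_test)), elapsed)
    return results
```

Any estimator with the same `fit`/`predict` interface could be added to the `models` dictionary and compared on the same train/test split.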


2021, Vol 191, pp. 367-372
Author(s): Ariza-Colpas Paola, Oñate-Bowen Alvaro Agustín, Suarez-Brieva Eydy del Carmen, Oviedo-Carrascal Ana, Urina Triana Miguel, ...
