Role of Feature Engineering and Classifier Selection for Machine Learning Predictions

Author(s):  
Makarand Velankar ◽  
Vaibhav Khatavkar ◽  
Vinayak Jagtap ◽  
Parag Kulkarni

Features play a crucial role in many computational tasks. Feature values are the input to machine learning algorithms for prediction. Prediction accuracy depends on several factors, including the choice of dataset, features, and machine learning classifier. Various feature selection and reduction approaches have been tried to obtain better accuracy and reduce computational overhead. Feature engineering is the design of new features suited to a specific task with the help of domain knowledge. The challenges in feature engineering are presented using the computational music domain, specifically music emotion recognition, as a case study. Experiments are performed with different combinations of feature sets and machine learning classifiers to test the accuracy of the proposed model. The experimental results provide insights into the role of features and classifiers in prediction accuracy. Different machine learning classifiers produced varied results, so the choice of classifier is also an important decision in the proposed model. The engineered features designed with the help of domain experts improved the results, which emphasizes the need for domain-specific feature engineering to improve prediction accuracy. Approaches to design an optimized model with the appropriate feature set and classifier for machine learning tasks are presented.
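The idea of testing combinations of feature sets and classifiers can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy data, the column meanings, the two simple classifiers (nearest centroid and 1-nearest-neighbour), and the use of leave-one-out accuracy as the score.

```python
import math
from itertools import product

def nearest_centroid(train_x, train_y, x):
    """Predict the label whose class mean is closest to x."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, v in enumerate(row):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: math.dist(centroids[lab], x))

def one_nn(train_x, train_y, x):
    """Predict the label of the single closest training sample."""
    i = min(range(len(train_x)), key=lambda j: math.dist(train_x[j], x))
    return train_y[i]

def loo_accuracy(classify, rows, labels, cols):
    """Leave-one-out accuracy using only the feature columns in cols."""
    hits = 0
    for i in range(len(rows)):
        train_x = [[r[c] for c in cols] for j, r in enumerate(rows) if j != i]
        train_y = [lab for j, lab in enumerate(labels) if j != i]
        test_x = [rows[i][c] for c in cols]
        hits += classify(train_x, train_y, test_x) == labels[i]
    return hits / len(rows)

# Made-up data: columns 0 and 1 stand for generic audio features,
# column 2 for a hypothetical engineered feature.
rows = [
    [0.2, 0.5, 0.1], [0.3, 0.4, 0.2], [0.4, 0.6, 0.1],
    [0.3, 0.5, 0.9], [0.2, 0.6, 0.8], [0.4, 0.4, 0.9],
]
labels = ["calm", "calm", "calm", "energetic", "energetic", "energetic"]

feature_sets = {"generic": (0, 1), "generic+engineered": (0, 1, 2)}
classifiers = {"nearest-centroid": nearest_centroid, "1-NN": one_nn}

# Score every (feature set, classifier) pair.
results = {
    (fs, clf): loo_accuracy(classifiers[clf], rows, labels, feature_sets[fs])
    for fs, clf in product(feature_sets, classifiers)
}
for (fs, clf), acc in sorted(results.items()):
    print(f"{fs:20s} {clf:17s} accuracy={acc:.2f}")
```

In this toy setup the engineered column separates the two classes by construction, so it only shows the mechanics of the comparison, not any real effect size.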

2019 ◽  
Vol 8 (4) ◽  
pp. 2187-2191

Music is an essential part of life, and the emotion it carries is key to its perception and usage. Music Emotion Recognition (MER) is the task of identifying the emotion in musical tracks and classifying them accordingly. The objective of this research paper is to check the effectiveness of popular machine learning classifiers such as XGBoost, Random Forest, Decision Trees, Support Vector Machine (SVM), K-Nearest-Neighbour (KNN) and Gaussian Naive Bayes on the task of MER. These classifiers were tested on the MIREX-like dataset [17], and the effects of oversampling algorithms such as Synthetic Minority Oversampling Technique (SMOTE) [22] and Random Oversampling (ROS) were also verified. Overall, the Gaussian Naive Bayes classifier gave the maximum accuracy of 40.33%. The other classifiers gave accuracies between 20.44% and 38.67%. Thus, a limit on the classification accuracy has been reached using these classifiers and using traditional musical or statistical metrics derived from the music as input features. In view of this, deep learning-based approaches using Convolutional Neural Networks (CNNs) [13] and spectrograms of the music clips are a promising alternative for MER.
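Random Oversampling (ROS), one of the two oversampling techniques mentioned above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `random_oversample`, the toy feature vectors, and the emotion labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw (with replacement) as many extra copies as the class is short.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y

# Made-up data: four "happy" clips and one "sad" clip, two features each.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.2, 0.9], [0.9, 0.1]]
y = ["happy", "happy", "happy", "happy", "sad"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation. SMOTE differs in that it synthesizes new minority samples by interpolating between neighbours rather than copying existing ones.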


2019 ◽  
Vol 141 (8) ◽  
Author(s):  
Tae Hyong Kim ◽  
Ahnryul Choi ◽  
Hyun Mu Heo ◽  
Kyungran Kim ◽  
Kyungsuk Lee ◽  
...  

Pre-impact fall detection can trigger alarm services sooner, reducing long-lie conditions and decreasing the risk of hospitalization. Detecting various types of fall to determine the impact site or direction prior to impact is important because it increases the chance of decreasing the incidence or severity of fall-related injuries. In this study, a robust pre-impact fall detection model was developed to classify various activities and falls as a multiclass problem, and its performance was compared with that of previously developed models. Twelve healthy subjects participated in this study. Each subject wore an inertial measurement unit module fixed on a belt near the left iliac crest to collect accelerometer data for each activity. The proposed model consists of feature calculation with the infinite latent feature selection (ILFS) algorithm, automatic labeling of activities, and application of machine learning classifiers for discrete and continuous time series data. Nine machine learning classifiers were applied to detect falls prior to impact, and the final detection results were derived by sorting the classifiers. The proposed multiclass model showed significantly higher average classification accuracy than previous models (p < 0.01): 99.57 ± 0.01% for discrete data-based classifiers and 99.84 ± 0.02% for continuous time series-based classifiers. In the future, multiclass pre-impact fall detection models can be applied to fall protector devices, detecting various activities in order to send alerts or trigger immediate feedback reactions that prevent falls.


2021 ◽  
Vol 14 (1) ◽  
pp. 16
Author(s):  
Chandrashekar Jatoth ◽  
Rishabh Jain ◽  
Ugo Fiore ◽  
Subrahmanyam Chatharasupalli

Although blockchain technology is gaining widespread adoption across multiple sectors, its most popular application is in cryptocurrency. The decentralized and anonymous nature of transactions in a cryptocurrency blockchain has attracted a multitude of participants, and significant amounts of money are now exchanged every day. This creates a need to analyze the blockchain to discover information about the nature of the participants in transactions. This study focuses on the identification of risky and non-risky blocks in a blockchain. The proposed approach uses ensemble learning with and without correlation-based feature selection. Ensemble learning yielded good results in the experiments, but class-wise analysis reveals that ensemble learning with feature selection improves the results even further. After training machine learning classifiers on the dataset, we observe an improvement in accuracy of 2–3% and in F-score of 7–8%.
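Correlation-based feature selection is commonly described through a subset "merit" score, k·r_cf / sqrt(k + k(k−1)·r_ff), which rewards features that correlate with the class (r_cf) and penalizes features that correlate with each other (r_ff). The sketch below is a simplified illustration under assumptions, not the method used in the study: it uses absolute Pearson correlation on numeric features, a greedy forward search, and made-up feature names and values.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists (0.0 if constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def cfs_merit(subset, columns, target):
    """Merit = k * r_cf / sqrt(k + k*(k-1)*r_ff) for a feature subset."""
    k = len(subset)
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_cfs(columns, target):
    """Add the feature that most improves merit; stop when nothing improves it."""
    selected, best = [], 0.0
    remaining = list(columns)
    while remaining:
        merit, name = max(
            (cfs_merit(selected + [f], columns, target), f) for f in remaining
        )
        if merit <= best:
            break
        selected.append(name)
        remaining.remove(name)
        best = merit
    return selected, best

# Made-up block-level features; target 1 = risky, 0 = non-risky.
columns = {
    "amount": [1, 2, 1, 8, 9, 7],
    "fee": [1, 1, 2, 7, 9, 9],
    "noise": [5, 1, 4, 2, 5, 3],
}
target = [0, 0, 0, 1, 1, 1]

selected, merit = greedy_cfs(columns, target)
print(selected, round(merit, 3))
```

The selected columns would then be passed to whichever ensemble classifier is being trained; that step is omitted here.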

