Effect of feature selection methods on machine learning classifiers for detecting email spams

Feature selection is essential in the medical domain; however, the process is complicated by censoring, the distinguishing characteristic of survival analysis. Most survival feature selection methods are based on Cox’s proportional hazards model, even though machine learning classifiers are often preferred elsewhere. Such classifiers are rarely employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that employ machine learning classifiers, the partial logistic artificial neural network with automatic relevance determination is a well-known method that handles censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, this article proposes a new hybrid feature selection method that addresses high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce the number of features. Two endovascular aortic repair datasets containing 91% censored patients, collected from two centers, were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed that the proposed technique outperformed individual classifiers and variable selection methods based on Cox’s model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in terms of p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients’ future follow-up plans.
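The difference between simple and weighted majority voting can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the survival-metric weights are computed, so the labels, the `scores` values, and the function name `majority_vote` below are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


def majority_vote(predictions, weights=None):
    """Combine one predicted label per classifier into a single label.

    With weights=None every classifier counts equally (simple majority).
    Otherwise each vote is scaled by that classifier's weight.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the label that received a vote first wins.
    return max(totals, key=totals.get)


# Hypothetical votes from the SVM, neural network, and KNN for one patient.
votes = ["low_risk", "high_risk", "low_risk"]
# Hypothetical per-classifier weights, e.g. a validation score for each model.
scores = [0.55, 0.90, 0.30]

simple = majority_vote(votes)            # two of three classifiers agree
weighted = majority_vote(votes, scores)  # one strong vote can outweigh two weak ones
```

In this toy case the simple vote follows the two agreeing classifiers, while the weighted vote follows the single classifier with the highest weight, because its weight (0.90) exceeds the combined weight of the other two (0.85).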

A Feature Selection Approach for Fall Detection Using Various Machine Learning Classifiers

IEEE Access ◽ 10.1109/access.2021.3105581 ◽ 2021 ◽ pp. 1-1

Author(s): Tuan Le Minh ◽ Ly Van Tran ◽ Son Vu Truong Dao

Keyword(s): Machine Learning ◽ Feature Selection ◽ Fall Detection ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Selection Approach ◽ Feature Selection Approach

Hybrid Machine Learning Classifiers for Indoor User Localization Problem

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.c8375.0110321 ◽ 2021 ◽ Vol 10 (3) ◽ pp. 49-53

Author(s): Hamza Turabieh ◽ Ahmad S. Alghamdi

Keyword(s): Machine Learning ◽ Feature Selection ◽ Indoor Localization ◽ Signal Strength ◽ Access Point ◽ Support Vector ◽ Linear Discriminant ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ User Location

Wi-Fi technology is now available almost everywhere, both inside and outside buildings, and it enables indoor localization services (ILS). Determining indoor user location is a hard and complex problem. Several applications highlight the importance of indoor user localization, such as disaster management, health care zones, Internet of Things (IoT) applications, and public settlement planning. Measurements of Wi-Fi signal strength, i.e., the Received Signal Strength Indicator (RSSI), can be used to determine indoor user location. In this paper, we propose a hybrid model that combines a wrapper feature selection algorithm with machine learning classifiers to determine indoor user location. We employ the Minimum Redundancy Maximum Relevance (mRMR) algorithm as the feature selection step to select the most active access points (APs) based on RSSI values. Six machine learning classifiers are used in this work: Decision Tree (DT), Support Vector Machine (SVM), k-nearest neighbors (kNN), Linear Discriminant Analysis (LDA), Ensemble Bagged Tree (EBaT), and Ensemble Boosted Tree (EBoT). We evaluate all classifiers on a public dataset obtained from the UCI repository. The results show that EBoT outperforms all other classifiers in terms of accuracy.
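The greedy idea behind mRMR (pick features that are informative about the class but not redundant with features already chosen) can be sketched as follows. This is an illustrative sketch only, not the paper's code: it assumes RSSI readings have already been binned into discrete levels, uses the common "relevance minus mean redundancy" scoring variant, and the access point names and toy readings are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def mrmr(columns, labels, k):
    """Greedily pick k column names by relevance minus mean redundancy."""
    relevance = {
        name: mutual_information(col, labels) for name, col in columns.items()
    }
    selected = []
    remaining = list(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the earlier column
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: room label per sample and binned RSSI per access point.
rooms = [0, 0, 1, 1, 2, 2, 3, 3]
rssi = {
    "ap1": [0, 0, 0, 0, 1, 1, 1, 1],
    "ap2": [0, 0, 0, 0, 1, 1, 1, 1],  # exact duplicate of ap1 (fully redundant)
    "ap3": [0, 0, 1, 1, 0, 0, 1, 1],  # independent of ap1, equally informative
}

chosen = mrmr(rssi, rooms, 2)
```

Here all three access points are equally relevant on their own, but after `ap1` is chosen the duplicate `ap2` adds nothing new, so the second pick is `ap3`.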

Feature Selection and Performance Comparison of Various Machine Learning Classifiers for Analyzing Students’ Performance Using Rapid Miner

Lecture Notes in Electrical Engineering - Applications of Artificial Intelligence and Machine Learning ◽ 10.1007/978-981-16-3067-5_2 ◽ 2021 ◽ pp. 13-22

Author(s): Vikas Rattan ◽ Varun Malik ◽ Ruchi Mittal ◽ Jaiteg Singh ◽ Pawan Kumar Chand

Keyword(s): Machine Learning ◽ Feature Selection ◽ Performance Comparison ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ And Performance

Hybrid Model of Correlation Based Filter Feature Selection and Machine Learning Classifiers Applied on Smart Meter Data Set

2019 IEEE/ACM Symposium on Software Engineering in Africa (SEiA) ◽ 10.1109/seia.2019.00009 ◽ 2019

Author(s): Janvier Omar Sinayobye ◽ Kyanda Swaib Kaawaase ◽ Fred N. Kiwanuka ◽ Richard Musabe

Keyword(s): Machine Learning ◽ Feature Selection ◽ Hybrid Model ◽ Smart Meter ◽ Data Set ◽ Machine Learning Classifiers ◽ Learning Classifiers

Improved Classification of Blockchain Transactions Using Feature Engineering and Ensemble Learning

Future Internet ◽ 10.3390/fi14010016 ◽ 2021 ◽ Vol 14 (1) ◽ pp. 16

Author(s): Chandrashekar Jatoth ◽ Rishabh Jain ◽ Ugo Fiore ◽ Subrahmanyam Chatharasupalli

Keyword(s): Machine Learning ◽ Feature Selection ◽ Ensemble Learning ◽ Feature Engineering ◽ Machine Learning Classifiers ◽ Blockchain Technology ◽ Widespread Adoption ◽ Learning Classifiers ◽ Correlation Based Feature Selection

Although blockchain technology is gaining widespread adoption across multiple sectors, its most popular application is in cryptocurrency. The decentralized and anonymous nature of transactions in a cryptocurrency blockchain has attracted a multitude of participants, and significant amounts of money are now exchanged every day. This raises the need to analyze the blockchain to discover information about the nature of the participants in transactions. This study focuses on the identification of risky and non-risky blocks in a blockchain. The proposed approach uses ensemble learning, both with and without correlation-based feature selection. Ensemble learning yielded good results in the experiments, but class-wise analysis reveals that ensemble learning with feature selection improves the results even further. After training machine learning classifiers on the dataset, we observe an improvement in accuracy of 2–3% and in F-score of 7–8%.
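One common formulation of correlation-based feature selection scores a subset of k features by the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff the mean feature-to-feature correlation. The sketch below is a hedged illustration of that heuristic with a greedy forward search. It is not the paper's pipeline: it assumes Pearson correlation as the correlation measure, and the feature names and toy values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def cfs_merit(subset, features, target):
    """Merit of a feature subset: high class correlation, low inter-correlation."""
    k = len(subset)
    r_cf = sum(abs(pearson(features[f], target)) for f in subset) / k
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward(features, target):
    """Add the feature that most improves merit until no feature improves it."""
    selected, best_merit = [], 0.0
    candidates = list(features)
    while candidates:
        best_name = None
        for name in candidates:
            merit = cfs_merit(selected + [name], features, target)
            if merit > best_merit:  # strict: earlier candidates win ties
                best_name, best_merit = name, merit
        if best_name is None:
            break
        selected.append(best_name)
        candidates.remove(best_name)
    return selected


# Hypothetical toy data: 1 = risky block, 0 = non-risky block.
risky = [1, 1, 1, 1, 0, 0, 0, 0]
block_features = {
    "amount":      [1, 1, 1, 0, 0, 0, 0, 1],
    "amount_copy": [1, 1, 1, 0, 0, 0, 0, 1],  # duplicate of "amount"
    "fee":         [1, 1, 0, 1, 0, 0, 1, 0],  # uncorrelated with "amount"
    "noise":       [1, 0, 1, 0, 1, 0, 1, 0],  # uncorrelated with the label
}

kept = cfs_forward(block_features, risky)
```

On this toy data the search keeps `amount` and `fee`: the duplicate column raises redundancy without adding class information, and the noise column lowers the mean class correlation, so neither improves the merit.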
