A Novel Dynamic Hybridization Method for Best Feature Selection

Nassima Dif; Zakaria Elberrichi

doi:10.4018/ijamc.2021040106

A Novel Dynamic Hybridization Method for Best Feature Selection

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2021040106 ◽

2021 ◽

Vol 12 (2) ◽

pp. 85-99

Author(s):

Nassima Dif ◽

Zakaria Elberrichi

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

Optimization Problems ◽

Learning Algorithm ◽

Accuracy Score ◽

Hybridization Method ◽

K Nearest Neighbor ◽

Feature Selection Problem ◽

Combinatorial Optimization Problems ◽

The Comparative Study

Hybrid metaheuristics have received a lot of attention lately for solving combinatorial optimization problems. The purpose of hybridization is to make metaheuristics cooperate in order to reach better solutions. Most previous work has focused on static hybridization. The objective of this work is to propose a novel dynamic hybridization method (GPBD) that generates the most suitable sequential hybridization of the GA, PSO, BAT, and DE metaheuristics for each problem. The authors test this approach on the best feature selection problem using a wrapper approach, applied to face image recognition datasets with the k-nearest neighbor (KNN) learning algorithm. The comparative study of the individual metaheuristics and their hybridization GPBD shows that the proposed approach achieved the best results. It was also competitive with filter approaches proposed in the literature, and it achieved a perfect accuracy score of 100% on the Orl10P, Pix10P, and PIE10P datasets.
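To make the wrapper idea concrete, the following is a minimal sketch and not the authors' GPBD implementation. It assumes a candidate feature subset is encoded as a binary mask and scored by the leave-one-out accuracy of a 1-nearest-neighbor classifier restricted to the selected features. The names `masked_distance` and `wrapper_fitness` and the toy data are hypothetical; a metaheuristic such as GA or PSO would search over masks using a score of this kind.

```python
import math


def masked_distance(a, b, mask):
    """Euclidean distance using only the features where mask is 1."""
    return math.sqrt(sum((x - y) ** 2 for x, y, m in zip(a, b, mask) if m))


def wrapper_fitness(X, y, mask):
    """Leave-one-out 1-NN accuracy on the features selected by mask.

    Hypothetical fitness function for illustration. It returns 0.0 for an
    empty mask so that a search never prefers selecting nothing.
    """
    if not any(mask):
        return 0.0
    correct = 0
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Nearest neighbor among all other samples.
        j = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: masked_distance(xi, X[k], mask),
        )
        correct += y[j] == yi
    return correct / len(X)


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 9.0], [0.1, 1.0], [1.0, 8.5], [1.1, 0.5]]
y = [0, 0, 1, 1]

print(wrapper_fitness(X, y, [1, 0]))  # score with the informative feature
print(wrapper_fitness(X, y, [0, 1]))  # score with the noise feature
```

On this toy data the mask that keeps only the informative feature scores higher than the mask that keeps only the noise feature, which is the signal a wrapper search exploits.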

A Hybrid GA-LDA Scheme for Feature Selection in Content-Based Image Retrieval

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2018040103 ◽

2018 ◽

Vol 9 (2) ◽

pp. 48-71 ◽

Cited By ~ 3

Author(s):

Khadidja Belattar ◽

Sihem Mostefai ◽

Amer Draa

Keyword(s):

Feature Selection ◽

Image Retrieval ◽

Nearest Neighbor ◽

Skin Lesions ◽

Processing Technique ◽

Input Image ◽

Content Based Image Retrieval ◽

K Nearest Neighbor ◽

Feature Selection Problem ◽

Linear Discriminant

Feature selection is an important pre-processing technique in the pattern recognition domain. This article proposes a hybridization between a Genetic Algorithm (GA) and Linear Discriminant Analysis (LDA) for solving the feature selection problem in Content-Based Image Retrieval (CBIR) applied to dermatological images. In the first step, we preprocess and segment the input image, then we derive color and texture features characterizing healthy skin and the segmented skin lesion. At this stage, a binary GA is used to evolve chromosome subsets whose fitness is evaluated by a Logistic Regression classifier. The optimal identified features are then fed to LDA for a CBIR system based on K-Nearest Neighbor classification. To assess the proposed approach, the authors opted for a K-fold cross-validation method on a database of 1097 images of melanomas and other skin lesions. As a result, the authors obtained a reduced number of features and an improved CBDIR system compared to the PCA, LDA, and ICA methods.

Product Review Based Customer Sentiment Analysis using an Ensemble of mRMR and Forest Optimization Algorithm (FOA)

International Journal of Applied Metaheuristic Computing ◽

10.4018/ijamc.2022010107 ◽

2022 ◽

Vol 13 (1) ◽

pp. 0-0

Keyword(s):

Feature Selection ◽

Sentiment Analysis ◽

Optimization Algorithm ◽

Nearest Neighbor ◽

Hybrid Approach ◽

Support Vector ◽

K Nearest Neighbor ◽

Feature Selection Technique ◽

Feature Selection Problem

This research presents an approach to the feature selection problem for sentiment classification using an ensemble-based classifier. It includes a hybrid feature selection approach that combines the minimum redundancy and maximum relevance (mRMR) technique with the Forest Optimization Algorithm (FOA), i.e. mRMR-FOA. Before being applied to sentiment analysis, FOA was used as a feature selection technique on 10 different classification datasets publicly available from the UCI Machine Learning Repository. Classifiers such as k-Nearest Neighbor (k-NN), Support Vector Machine (SVM), and Naïve Bayes (NB) were combined in an ensemble-based algorithm on the available datasets. The mRMR-FOA approach selects the significant features from Blitzer’s dataset (a survey of customer reviews on electronic products). Sentiment classification was observed to improve by 12 to 18%. The results are further enhanced by the ensemble of k-NN, NB, and SVM, which reaches an accuracy of 88.47% on the sentiment classification task.

Genetic Algorithm to Optimize k-Nearest Neighbor Parameter for Benchmarked Medical Datasets Classification

Jurnal Online Informatika ◽

10.15575/join.v5i2.656 ◽

2020 ◽

Vol 5 (2) ◽

pp. 153

Author(s):

Rizki Tri Prasetio

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Feature Selection ◽

Nearest Neighbor ◽

Learning Algorithm ◽

P Value ◽

Computer Assisted ◽

K Nearest Neighbor ◽

Forward Selection ◽

Backward Elimination

Computer-assisted medical diagnosis is a major machine learning problem that has been actively researched recently. General classifiers learn from the data itself through a training process, because experts often lack experience in determining parameters. This research proposes a methodology based on the machine learning paradigm that integrates the genetic algorithm, a search heuristic inspired by natural evolution, with the simplest and most widely used learning algorithm, k-nearest neighbor. The genetic algorithm was used for feature selection and parameter optimization, while k-nearest neighbor was used as the classifier. The proposed method was evaluated on five benchmark medical datasets from the University of California, Irvine Machine Learning Repository and compared with the original k-NN and with other feature selection algorithms, i.e., forward selection, backward elimination, and greedy feature selection. Experimental results show that the proposed method achieves good performance with a significant improvement (t-test p-value of 0.0011).
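One way a genetic algorithm can cover both feature selection and parameter optimization is to pack both into a single chromosome. The sketch below is an illustration under stated assumptions, not the paper's encoding: it assumes the optimized parameter is k, that the first `n_features` bits form the feature mask, and that the last two bits index into a hypothetical `k_choices` tuple. The operators shown are the textbook one-point crossover and bit-flip mutation.

```python
import random


def decode(chromosome, n_features, k_choices=(1, 3, 5, 7)):
    """Split a bit list into a feature mask and a value of k.

    Hypothetical encoding: the first n_features bits are the mask and the
    next two bits index into k_choices.
    """
    mask = chromosome[:n_features]
    k_bits = chromosome[n_features:n_features + 2]
    k = k_choices[k_bits[0] * 2 + k_bits[1]]
    return mask, k


def one_point_crossover(parent_a, parent_b, rng):
    """Child takes the head of parent_a and the tail of parent_b."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


rng = random.Random(0)  # seeded so the run is repeatable
child = one_point_crossover([1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0], rng)
child = bit_flip_mutation(child, rate=0.1, rng=rng)
mask, k = decode(child, n_features=4)
print(mask, k)
```

In a full search, each decoded `(mask, k)` pair would be scored by the classifier's validation accuracy, and selection would favor the higher-scoring chromosomes.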

Efficient detection of hacker community based on twitter data using complex networks and machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210458 ◽

2021 ◽

pp. 1-17

Author(s):

Ahmed Al-Tarawneh ◽

Ja’afer Al-Saraireh

Keyword(s):

Machine Learning ◽

Complex Networks ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Efficient Detection ◽

Suggested Keywords

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, and thus they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection for the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects, from a hackers’ community on Twitter, a list of users and their followers who share posts with similar interests. The list is built from a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is then fed to a machine learning process that applies different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a future risk to institutions and individuals, providing early warning of possible attacks.

k-Nearest Neighbor Learning with Graph Neural Networks

Mathematics ◽

10.3390/math9080830 ◽

2021 ◽

Vol 9 (8) ◽

pp. 830

Author(s):

Seokho Kang

Keyword(s):

Neural Network ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Weighting Function ◽

High Sensitivity ◽

Training Data ◽

K Nearest Neighbor ◽

Main Challenge ◽

Benchmark Datasets ◽

Graph Neural Networks

k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.
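The three hyperparameters named in this abstract can be seen in a plain kNN classifier. The sketch below is ordinary weighted-vote kNN, not the kNNGNN method. All function names and the toy data are illustrative assumptions, chosen so that changing only the weighting function changes the prediction.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def uniform_weight(d):
    """Every neighbor gets the same vote."""
    return 1.0


def inverse_distance_weight(d, eps=1e-9):
    """Closer neighbors get larger votes; eps avoids division by zero."""
    return 1.0 / (d + eps)


def knn_classify(X, y, query, k=3, distance=euclidean, weight=uniform_weight):
    """Weighted-vote kNN; k, distance, and weight are the hyperparameters."""
    neighbors = sorted((distance(x, query), label) for x, label in zip(X, y))[:k]
    votes = defaultdict(float)
    for d, label in neighbors:
        votes[label] += weight(d)
    return max(votes, key=votes.get)


# Toy data (assumed): one very close "a" point and two farther "b" points.
X = [[0.0], [1.0], [1.2], [5.0]]
y = ["a", "b", "b", "a"]

print(knn_classify(X, y, [0.1], k=3))
print(knn_classify(X, y, [0.1], k=3, weight=inverse_distance_weight))
print(knn_classify(X, y, [0.1], k=1))
```

With the same query and the same k, the uniform vote and the inverse-distance vote disagree on this data, which illustrates the sensitivity to hyperparameters that the abstract describes.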

Intelligent Dynamic Identification Technique of Industrial Products in a Robotic Workplace

Sensors ◽

10.3390/s21051797 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1797

Author(s):

Ján Vachálek ◽

Dana Šišmišová ◽

Pavol Vašek ◽

Jan Rybář ◽

Juraj Slovák ◽

...

Keyword(s):

Machine Learning ◽

Control Charts ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Conveyor Belt ◽

Standard Uncertainty ◽

K Nearest Neighbor ◽

Industrial Products ◽

Dynamic Identification ◽

Identification Technique

The article deals with aspects of identifying industrial products in motion based on their color. An automated robotic workplace with a conveyor belt, a robot, and an industrial color sensor was created for this purpose. Measured data are processed in a database and then statistically evaluated in the form of type A and type B standard uncertainties in order to obtain combined standard uncertainties. Based on the acquired data, control charts of the RGB color components are created for the identified products. The influence of product speed on the identification measurement process and on process stability is monitored. In case of identification uncertainty, i.e., when measured values fall outside the limits of the control charts, the K-nearest neighbor machine learning algorithm is used. This algorithm, based on the Euclidean distances to the classified value, estimates its most accurate iteration. The result is a comprehensive system for identifying products moving on a conveyor belt, whose reliability for industrial use is demonstrated through data collection and statistical analysis combined with machine learning.
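The uncertainty evaluation mentioned here can be illustrated with the usual formulas: a type A standard uncertainty taken as the standard deviation of the mean of repeated readings, and a combined standard uncertainty taken as the root sum of squares of the components. The sketch below is an illustration only. The abstract does not say how the type B component was obtained, so the rectangular-distribution rule, the half-width of 1.0, and the readings are all assumptions.

```python
import math
import statistics


def type_a_uncertainty(readings):
    """Standard uncertainty of the mean: sample std dev / sqrt(n)."""
    return statistics.stdev(readings) / math.sqrt(len(readings))


def type_b_uncertainty(half_width):
    """Assumes a rectangular distribution with the given half-width."""
    return half_width / math.sqrt(3)


def combined_uncertainty(u_a, u_b):
    """Root sum of squares of uncorrelated components."""
    return math.sqrt(u_a ** 2 + u_b ** 2)


# Hypothetical repeated readings of one color channel for one product.
readings = [120, 122, 121, 119, 123]

u_a = type_a_uncertainty(readings)
u_b = type_b_uncertainty(1.0)  # assumed sensor resolution half-width
u_c = combined_uncertainty(u_a, u_b)
print(u_a, u_b, u_c)
```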

A Feature Selection Approach for Network Intrusion Detection Based on Tree-Seed Algorithm and K-Nearest Neighbor

2018 IEEE 4th International Symposium on Wireless Systems within the International Conferences on Intelligent Data Acquisition and Advanced Computing Systems (IDAACS-SWS) ◽

10.1109/idaacs-sws.2018.8525522 ◽

2018 ◽

Cited By ~ 3

Author(s):

Feng Chen ◽

Zhiwei Ye ◽

Chunzhi Wang ◽

Lingyu Yan ◽

Ruoxi Wang

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Nearest Neighbor ◽

Network Intrusion Detection ◽

K Nearest Neighbor ◽

Network Intrusion ◽

Selection Approach ◽

Feature Selection Approach ◽

Tree Seed

KNN classifier based approach for multi-class sentiment analysis of twitter data

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.12656 ◽

2018 ◽

Vol 7 (3) ◽

pp. 1372

Author(s):

Soudamini Hota ◽

Sudhir Pathak

Keyword(s):

Sentiment Analysis ◽

Opinion Mining ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Online News ◽

Classification Algorithm ◽

Sentiment Classification ◽

Supervised Machine Learning ◽

K Nearest Neighbor ◽

News Reports

‘Sentiment’ literally means ‘Emotions’. Sentiment analysis, synonymous with opinion mining, is a type of data mining that refers to the analysis of data obtained from microblogging sites, social media updates, online news reports, user reviews, etc., in order to study people's sentiments towards an event, organization, product, brand, person, etc. In this work, sentiment classification is done into multiple classes. The proposed methodology, based on the KNN classification algorithm, shows an improvement over one of the existing methodologies, which is based on the SVM classification algorithm. The data used for analysis were taken from Twitter, the most popular microblogging site, and extracted using Python’s Tweepy. The N-gram modeling technique was used for feature extraction, and the supervised machine learning algorithm k-nearest neighbor was used for sentiment classification. The performance of the proposed and existing techniques is compared in terms of accuracy, precision, and recall. The analysis concludes that the proposed technique performs better on all the standard evaluation parameters.
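As an illustration of N-gram feature extraction, the sketch below turns a text into counts of word unigrams and bigrams. It is a minimal example under assumptions: the abstract does not state whether word or character N-grams were used or how tweets were tokenized, so the whitespace tokenizer, the lowercasing, and the function names are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return the list of word n-grams in a lowercased, space-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, n_values=(1, 2)):
    """Count n-grams for each n in n_values; the counts act as features."""
    counts = Counter()
    for n in n_values:
        counts.update(word_ngrams(text, n))
    return counts


features = ngram_features("Not good at all")
print(features)
```

Feature counts of this kind could then be compared with a distance function inside a k-nearest neighbor classifier.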

Accelerating wrapper-based feature selection with K-nearest-neighbor

Knowledge-Based Systems ◽

10.1016/j.knosys.2015.03.009 ◽

2015 ◽

Vol 83 ◽

pp. 81-91 ◽

Cited By ~ 54

Author(s):

Aiguo Wang ◽

Ning An ◽

Guilin Chen ◽

Lian Li ◽

Gil Alterovitz

Keyword(s):

Feature Selection ◽

Nearest Neighbor ◽

K Nearest Neighbor

Using k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering

Methods and Applications of Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-24674-9_21 ◽

2004 ◽

pp. 191-200 ◽

Cited By ~ 3

Author(s):

Phivos Mylonas ◽

Manolis Wallace ◽

Stefanos Kollias

Keyword(s):

Feature Selection ◽

Hierarchical Clustering ◽

Nearest Neighbor ◽

K Nearest Neighbor
