wrapper feature selection
Recently Published Documents



2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Abdullateef O. Balogun ◽  
Shuib Basri ◽  
Saipunidzam Mahamad ◽  
Luiz Fernando Capretz ◽  
Abdullahi Abubakar Imam ◽  
...  

The high dimensionality of software metric features has long been noted as a data quality problem that degrades the performance of software defect prediction (SDP) models. This drawback makes it necessary to apply feature selection (FS) algorithms in SDP processes. FS approaches fall into three categories: filter FS (FFS), wrapper FS (WFS), and hybrid FS (HFS). HFS has been established as superior because it combines the strengths of both FFS and WFS methods. However, selecting the most appropriate FFS method for HFS (the filter rank selection problem) is a challenge because the performance of FFS methods depends on the choice of datasets and classifiers. In addition, the HFS method inherits the local optima stagnation and high computational costs that WFS incurs in large search spaces. As a solution, this study proposes a novel rank aggregation-based hybrid multifilter wrapper feature selection (RAHMFWFS) method for selecting relevant and non-redundant features from software defect datasets. The proposed RAHMFWFS is divided into two stepwise stages. The first stage is a rank aggregation-based multifilter feature selection (RMFFS) method that addresses the filter rank selection problem by aggregating individual rank lists from multiple filter methods, using a novel rank aggregation method to generate a single, robust, and non-disjoint rank list. In the second stage, the aggregated ranked features are further processed by an enhanced wrapper feature selection (EWFS) method based on a dynamic reranking strategy that guides the feature subset selection process of the HFS method. This, in turn, reduces the number of evaluation cycles while maintaining or improving prediction performance. The feasibility of the proposed RAHMFWFS was demonstrated on benchmark software defect datasets with Naïve Bayes and Decision Tree classifiers, based on accuracy, area under the curve (AUC), and F-measure values. The experimental results showed the effectiveness of RAHMFWFS in addressing the filter rank selection and local optima stagnation problems in HFS, as well as its ability to select optimal features from SDP datasets while maintaining or enhancing the performance of SDP models. In conclusion, the proposed RAHMFWFS improved the prediction performance of SDP models across the selected datasets compared to existing state-of-the-art HFS methods.
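The two-stage idea in this abstract (aggregate several filter rankings, then let a wrapper walk the aggregated ranking) can be illustrated with a small sketch. The abstract does not specify the paper's rank aggregation method or its dynamic reranking strategy, so the code below substitutes a plain mean-position (Borda-style) aggregation and a single greedy pass. The feature names, the three rank lists, and the `toy_score` function are all hypothetical stand-ins; in practice `score` would be a cross-validated classifier metric.

```python
from statistics import mean

def aggregate_ranks(rank_lists):
    """Merge best-first rank lists into one list ordered by mean position.

    This is a Borda-style stand-in, NOT the aggregation method of the paper.
    """
    features = set(rank_lists[0])
    # Every filter is assumed to rank the same set of features.
    assert all(set(r) == features for r in rank_lists)
    mean_pos = {f: mean(r.index(f) for r in rank_lists) for f in features}
    # Ties on mean position are broken alphabetically for determinism.
    return sorted(features, key=lambda f: (mean_pos[f], f))

def ranked_forward_selection(ranking, score):
    """Visit features in ranked order; keep one only if it raises the score."""
    selected, best = [], float("-inf")
    for f in ranking:
        s = score(selected + [f])
        if s > best:
            selected, best = selected + [f], s
    return selected, best

# Hypothetical rank lists from three filter methods (best feature first).
info_gain_rank = ["loc", "cyclomatic", "fan_out", "comments"]
chi2_rank = ["cyclomatic", "loc", "comments", "fan_out"]
relief_rank = ["loc", "fan_out", "cyclomatic", "comments"]

aggregated = aggregate_ranks([info_gain_rank, chi2_rank, relief_rank])

def toy_score(subset):
    # Hypothetical score: reward two "useful" features, penalise subset size.
    useful = {"loc", "fan_out"}
    return len(useful & set(subset)) - 0.1 * len(subset)

selected, best = ranked_forward_selection(aggregated, toy_score)
print(aggregated, selected)
```

Because the wrapper evaluates each ranked feature only once, the number of classifier evaluations is linear in the number of features, which is the kind of saving in evaluation cycles the abstract refers to.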


2021 ◽  
Vol 11 (21) ◽  
pp. 10237
Author(s):  
Thaer Thaher ◽  
Atef Zaguia ◽  
Sana Al Azwari ◽  
Majdi Mafarja ◽  
Hamouda Chantar ◽  
...  

The students’ performance prediction (SPP) problem is a challenging problem that managers at any institution face. Collecting quantitative and qualitative educational data from many sources, such as exam centers, virtual courses, and e-learning educational systems, is not a simple task. Even after the data are collected, they may be imbalanced, incomplete, or biased, and may mix different data types such as strings, numbers, and letters. One of the most common challenges in this area is the large number of attributes (features). Identifying the most valuable features is needed to improve the prediction of students’ performance. This paper proposes an evolutionary-based SPP model that uses an enhanced form of the Whale Optimization Algorithm (EWOA) as a wrapper feature selection method to keep the most informative features and enhance the prediction quality. The proposed EWOA combines the Whale Optimization Algorithm (WOA) with the Sine Cosine Algorithm (SCA) and the Logistic Chaotic Map (LCM) to improve the overall performance of WOA. The SCA strengthens the exploitation process inside WOA and reduces the probability of being stuck in local optima. The main idea is to enhance the worst half of the population in WOA using SCA. In addition, the LCM strategy is employed to control the population diversity and improve the exploration process. We also handled the imbalanced data using the Adaptive Synthetic (ADASYN) sampling technique and converted WOA to a binary variant using transfer functions (TFs) that belong to different families (S-shaped and V-shaped). Two real educational datasets are used, and five different classifiers are employed: Decision Trees (DT), k-Nearest Neighbors (k-NN), Naive Bayes (NB), Linear Discriminant Analysis (LDA), and LogitBoost (LB). The obtained results show that LDA is the most reliable classifier on both datasets. In addition, the proposed EWOA with the selected transfer functions outperforms other wrapper feature selection methods in the literature.
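To make the binarization step concrete, here is a minimal sketch of one representative S-shaped and one V-shaped transfer function, plus the logistic map. The abstract does not say which members of each family were used or how the chaotic sequence is wired into EWOA, so the specific functions (sigmoid and |tanh|), the update rules, and the parameter `r = 4.0` are assumptions drawn from common practice in binary metaheuristics.

```python
import math
import random

def s_shaped(x):
    """Sigmoid: maps a continuous position to a probability of the bit being 1."""
    # Assumes moderate |x|; very large negative x would overflow math.exp.
    return 1.0 / (1.0 + math.exp(-x))

def v_shaped(x):
    """|tanh|: maps a continuous position to a probability of flipping the bit."""
    return abs(math.tanh(x))

def binarize_s(position, rng):
    # S-shaped rule: set each bit to 1 with probability s_shaped(x).
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]

def binarize_v(position, current_bits, rng):
    # V-shaped rule: flip the current bit with probability v_shaped(x).
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]

def logistic_map(x0, steps, r=4.0):
    """Chaotic sequence x_{n+1} = r * x_n * (1 - x_n); stays in [0, 1] for r <= 4."""
    xs, x = [], x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

rng = random.Random(0)
mask_s = binarize_s([50.0, -50.0], rng)            # near-certain 1, near-certain 0
mask_v = binarize_v([0.0, 0.0, 0.0], [1, 0, 1], rng)  # zero step: no bit flips
chaos = logistic_map(0.3, 5)
print(mask_s, mask_v, chaos)
```

Each resulting bit vector is read as a feature mask: a 1 keeps the corresponding attribute and a 0 drops it before the classifier is trained.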


2021 ◽  
Vol 11 (5) ◽  
pp. 7714-7719
Author(s):  
S. Nuanmeesri ◽  
W. Sriurai

The goal of the current study is to develop a model for diagnosing chili pepper diseases by applying filter and wrapper feature selection methods together with a Multi-Layer Perceptron Neural Network (MLPNN). The data used for developing the model include 1) types, 2) causative agents, 3) areas of infection, 4) growth stages of infection, 5) conditions, 6) symptoms, and 7) 14 types of chili pepper diseases. Three feature selection techniques were applied to these data: information gain, gain ratio, and wrapper. After the key features were selected, the reduced datasets were used to develop the diagnosis model with MLPNN. According to the effectiveness evaluation, estimated by 10-fold cross-validation, the diagnosis model developed by applying the wrapper method along with MLPNN was the most effective, with an accuracy of 98.91%, precision of 98.92%, and recall of 98.89%. The findings showed that the developed model is applicable.
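The two filter criteria named in this abstract have standard definitions: information gain is IG(Y; X) = H(Y) - H(Y | X), and gain ratio divides that by the entropy of the feature itself, H(X). A minimal sketch for categorical features follows. The toy records (symptom, growth stage, disease label) are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """H(labels) minus the expected entropy of labels within each feature value."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def gain_ratio(feature, labels):
    """Information gain normalised by the feature's own entropy (split information)."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy records, one entry per plant.
disease = ["anthracnose", "anthracnose", "wilt", "wilt"]
symptom = ["fruit_spot", "fruit_spot", "wilting", "wilting"]  # separates the classes
stage = ["seedling", "fruiting", "seedling", "fruiting"]      # carries no signal

ig_symptom = information_gain(symptom, disease)
ig_stage = information_gain(stage, disease)
gr_symptom = gain_ratio(symptom, disease)
print(ig_symptom, ig_stage, gr_symptom)
```

A filter method ranks features by such scores without training the classifier, whereas the wrapper method in the study evaluates candidate subsets through the learning model itself.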


Author(s):  
Chiranjeevi Karri ◽  
M.S.R. Naidu ◽  
Vuppula Manohar ◽  
B. Suribabu Naick ◽  
G Rameshbabu

Swarm intelligence (SI) has been a preferred choice for improving wrapper feature selection. This research is motivated by the use of a binary whale optimization algorithm (BWOA) to handle the molecular descriptor selection problem in amphetamine-type stimulants (ATS) drug categorization. This work aims to improve the classifier's learning and prediction abilities in order to produce better classification results. BWOA variants are generated using S-shaped transfer functions and are then combined with a k-Nearest Neighbor (k-NN) classifier in the wrapper feature selection. Our goal is to see how different sigmoid transfer functions affect significant feature selection and classification in BWOA. For performance assessment, several indicators and Wilcoxon's rank-sum test are used. According to the experimental data, BWOA-S3 delivers performance improvements with the lowest fitness value, fast convergence, good classification accuracy, and a compact feature subset. Three distinct classifiers also confirm that the best feature subset generalizes.
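A wrapper of this kind needs a fitness value for each candidate feature mask. The abstract does not give the fitness formula, so the sketch below assumes a weighted sum that is common in this literature: `alpha * error + (1 - alpha) * (selected / total)`, with the error estimated by a leave-one-out 1-nearest-neighbour classifier. The weight `alpha = 0.99`, the use of k = 1, and the four toy rows are all assumptions for illustration.

```python
def loo_1nn_error(rows, labels, mask):
    """Leave-one-out error of a 1-nearest-neighbour classifier on masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0  # no feature selected: treat as the worst case
    errors = 0
    for i, row in enumerate(rows):
        # Nearest other row by squared Euclidean distance over selected columns.
        nearest = min(
            (k for k in range(len(rows)) if k != i),
            key=lambda k: sum((row[j] - rows[k][j]) ** 2 for j in cols),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(rows)

def fitness(rows, labels, mask, alpha=0.99):
    """Lower is better: mostly classification error, plus a small size penalty."""
    error = loo_1nn_error(rows, labels, mask)
    return alpha * error + (1 - alpha) * sum(mask) / len(mask)

# Hypothetical data: column 0 separates the classes, column 1 is misleading noise.
rows = [(0.0, 5.0), (0.1, 1.0), (1.0, 4.9), (1.1, 1.1)]
labels = [0, 0, 1, 1]

fit_informative = fitness(rows, labels, [1, 0])
fit_noise = fitness(rows, labels, [0, 1])
fit_all = fitness(rows, labels, [1, 1])
print(fit_informative, fit_noise, fit_all)
```

Under this assumed formula, the mask that keeps only the informative column scores lowest, which matches the abstract's pairing of a low fitness value with a compact feature subset.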


2021 ◽  
pp. 115756
Author(s):  
Debanshu Banerjee ◽  
Bitanu Chatterjee ◽  
Pratik Bhowal ◽  
Trinav Bhattacharyya ◽  
Samir Malakar ◽  
...  
