Genetic Algorithm-based Feature Selection Approach for Enhancing the Effectiveness of Similarity Searching in Ligand-based Virtual Screening

Background: In recent years, similarity searching has gained wide popularity as a method for performing Ligand-Based Virtual Screening (LBVS). This screening technique compares the features of the target compound with those of each compound in a database of compounds. It is well known that no single similarity measure gives the best performance for every active compound structure across all types of activity classes. Several techniques and strategies have been proposed in the literature to improve the overall effectiveness of ligand-based virtual screening approaches. Objective: In this work, our main objective is to propose a feature selection approach based on a genetic algorithm (FSGASS) to improve similarity searching in ligand-based virtual screening. Methods: Our contribution identifies the most important and relevant characteristics of chemical compounds and minimizes the number of features used in their representations. This reduces the feature space, eliminates redundancy, shortens training execution time, and increases the performance of the screening process. Results: The obtained results demonstrate superior performance compared with those obtained with the Tanimoto coefficient, which is considered the most widely used coefficient for quantifying similarity in the domain of LBVS. Conclusion: Our results show that significant improvements can be obtained by using molecular similarity searching methods based on feature selection.
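As an illustration of the idea described in this abstract, the following is a minimal sketch, not the authors' FSGASS implementation. It assumes compounds are binary fingerprints stored as Python lists of 0/1, and every name here (`tanimoto`, `fitness`, `select_features`), the fitness definition (number of known actives in the top-ranked compounds), and the genetic operators (tournament selection, one-point crossover, bit-flip mutation, elitism) are illustrative assumptions that the abstract does not specify.

```python
import random


def tanimoto(a, b, mask=None):
    """Tanimoto coefficient between two binary fingerprints (lists of 0/1).

    If a mask is given, only positions where mask[i] == 1 are compared.
    """
    if mask is None:
        mask = [1] * len(a)
    both = sum(1 for x, y, m in zip(a, b, mask) if m and x and y)
    either = sum(1 for x, y, m in zip(a, b, mask) if m and (x or y))
    return both / either if either else 0.0


def fitness(mask, query, database, labels, top_k):
    """Assumed fitness: count of known actives among the top_k ranked compounds."""
    ranked = sorted(
        range(len(database)),
        key=lambda i: tanimoto(query, database[i], mask),
        reverse=True,
    )
    return sum(labels[i] for i in ranked[:top_k])


def select_features(query, database, labels, top_k=2, pop_size=20,
                    generations=30, mutation_rate=0.05, seed=0):
    """Search for a feature mask with a simple genetic algorithm (a sketch)."""
    rng = random.Random(seed)
    n = len(query)

    def score(mask):
        return fitness(mask, query, database, labels, top_k)

    # Random initial masks, plus the all-ones mask (plain Tanimoto) as a baseline.
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    population[0] = [1] * n
    best = max(population, key=score)

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, n)
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        population = next_pop
        best = max(population, key=score)
    return best
```

Because the all-ones mask is placed in the initial population and the best individual is always carried forward, the returned mask scores at least as well as unmasked Tanimoto on the data used for the search. Whether that gain carries over to unseen compounds would need a held-out evaluation.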

Genetic algorithm based feature selection approach for effective intrusion detection system

2015 International Conference on Computer Communication and Informatics (ICCCI) ◽

10.1109/iccci.2015.7218109 ◽

2015 ◽

Cited By ~ 12

Author(s):

Ketan Sanjay Desale ◽

Roshani Ade

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Selection Approach ◽

Feature Selection Approach

An effective feature selection approach driven genetic algorithm wrapped Bayes naïve

International Journal of Data Analysis Techniques and Strategies ◽

10.1504/ijdats.2016.079056 ◽

2016 ◽

Vol 8 (3) ◽

pp. 220

Author(s):

Sidahmed Mokeddem ◽

Baghdad Atmani ◽

Mostefa Mokaddem

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Selection Approach ◽

Feature Selection Approach

A Composite Hybrid Feature Selection Learning-Based Optimization of Genetic Algorithm For Breast Cancer Detection

10.20944/preprints202003.0298.v1 ◽

2020 ◽

Author(s):

Ahmed Abdullah Farid ◽

Gamal Selim ◽

Hatem Khater

Keyword(s):

Breast Cancer ◽

Genetic Algorithm ◽

Feature Selection ◽

Early Stage ◽

Fitness Function ◽

Support Vector ◽

Initial Population ◽

Tree Classifier ◽

Selection Approach ◽

Feature Selection Approach

Breast cancer is a significant health issue across the world. It is the most widely diagnosed cancer in women, and early-stage diagnosis and treatment improve patient safety. This paper proposes a composite hybrid feature selection model based on an optimized genetic algorithm (CHFS-BOGA) to predict breast cancer. This hybrid feature selection approach combines the advantages of three filter feature selection approaches with an optimized Genetic Algorithm (OGA) to select the best features and so improve the performance and scalability of the classification process. We propose OGA by improving the generation of the initial population and the genetic operators, using the results of the filter approaches as prior information and using the C4.5 decision tree classifier as the fitness function instead of probability and random selection. The authors collected the available updated Wisconsin breast cancer data from the UCI Machine Learning Repository, with a total of 569 rows and 32 columns. The dataset was evaluated using the Explorer interface of the Weka open-source data mining software. The results show that the proposed hybrid feature selection approach significantly outperforms the single filter approaches and principal component analysis (PCA) for optimum feature selection. These characteristics are good indicators for the return prediction. The highest accuracy achieved before applying CHFS-BOGA, using the support vector machine (SVM) classifier, was 97.3%. The highest accuracy after applying it (CHFS-BOGA-SVM) was 98.25% on a split of 70.0% for training with the remainder for testing, and 100% on the full training set. Moreover, the area under the receiver operating characteristic (ROC) curve was 1.0. The results showed that the proposed CHFS-BOGA-SVM system was able to accurately classify the type of breast tumor as malignant or benign.
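One way to picture the use of filter results as prior information for the initial population is sketched below. This is an assumption-laden illustration, not the CHFS-BOGA procedure itself: the function name `seeded_population`, the voting scheme, the `keep_fraction` parameter, and the 0.1 to 0.9 inclusion probabilities are all hypothetical choices that the abstract does not describe.

```python
import random


def seeded_population(filter_rankings, n_features, pop_size,
                      keep_fraction=0.5, seed=0):
    """Build an initial GA population biased by filter rankings (a sketch).

    filter_rankings: list of rankings, each a list of feature indices
    ordered from best to worst according to one filter method.
    Features that several filters place near the top get a higher
    probability of being switched on in each individual.
    """
    rng = random.Random(seed)
    top = max(1, int(n_features * keep_fraction))

    # Count how many filters rank each feature within their top slice.
    votes = [0] * n_features
    for ranking in filter_rankings:
        for idx in ranking[:top]:
            votes[idx] += 1

    # Assumed mapping: inclusion probability rises from 0.1 to 0.9 with votes.
    n_filters = len(filter_rankings)
    probs = [0.1 + 0.8 * v / n_filters for v in votes]

    # Each individual is a binary mask over the features.
    return [[1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)]
```

Each resulting mask could then be scored by a classifier-based fitness function (the abstract names a C4.5 decision tree) and evolved with the usual genetic operators.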
