Optimization for Gene Selection and Cancer Classification

Proceedings ◽  
2021 ◽  
Vol 74 (1) ◽  
pp. 21
Author(s):  
Hülya Başeğmez ◽  
Emrah Sezer ◽  
Çiğdem Selçukcan Erol

Gene selection has recently come to play an important role in cancer diagnosis and classification. This study aims to select highly descriptive genes from microarray data in order to develop a classification analysis for cancer diagnosis. To this end, six feature subset selection methods, obtained by combining two feature selection algorithms with three search algorithms, are compared and their intersections are presented. The results indicate that attention can be focused on 24 genes instead of 15,155. Cancer diagnosis may therefore be possible using these 24 candidate genes, rather than the larger feature sets used in similar studies. However, the diagnostic success of these candidate genes still needs to be examined in a wet laboratory.
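The idea of intersecting the subsets returned by six methods can be sketched as follows. This is only an illustration: the abstract does not name the algorithms, so the evaluator and search names, the gene identifiers, and the subsets themselves are all made up.

```python
from collections import Counter
from itertools import product

# Hypothetical placeholders: the abstract does not name the two feature
# selection algorithms or the three search algorithms.
evaluators = ["evaluator_A", "evaluator_B"]
searches = ["search_1", "search_2", "search_3"]
methods = list(product(evaluators, searches))  # 2 x 3 = 6 combined methods

# Made-up gene subsets, one per method, for illustration only.
subsets = [
    {"g1", "g2", "g3", "g7"},
    {"g1", "g2", "g4"},
    {"g1", "g2", "g3", "g5"},
    {"g1", "g2", "g6"},
    {"g1", "g2", "g3"},
    {"g1", "g2", "g7", "g8"},
]
selected = dict(zip(methods, subsets))

# Genes chosen by every method: the intersection of all six subsets.
common = set.intersection(*selected.values())

# How many methods chose each gene, useful for looser agreement thresholds.
votes = Counter(gene for subset in selected.values() for gene in subset)

print(sorted(common))
print(votes.most_common(3))
```

Counting votes alongside the strict intersection makes it easy to relax the criterion, for example to genes chosen by at least four of the six methods.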

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1374
Author(s):  
Jemal Abawajy ◽  
Abdulbasit Darem ◽  
Asma A. Alhashmi

Malicious software (“malware”) has become one of the most serious cybersecurity issues in the Android ecosystem. Given the fast evolution of Android malware releases, it is practically infeasible to detect malware apps in the Android ecosystem manually. As a result, machine learning has become an emerging approach for malware detection. Since machine learning performance is largely influenced by the availability of high-quality, relevant features, feature selection approaches play a key role in machine-learning-based detection of malware. In this paper, we formulate the feature selection problem as a quadratic programming problem and analyse how commonly used filter-based feature selection methods work, with an emphasis on Android malware detection. We compare and contrast several feature selection methods along several factors, including the composition of the relevant features selected. We empirically evaluate the feature subset selection algorithms and compare their predictive accuracy and execution time using several learning algorithms. The results of the experiments confirm that feature selection is necessary for improving the accuracy of the learning models as well as decreasing the run time. The results also show that the performance of the feature selection algorithms varies from one learning algorithm to another, and that no single feature selection approach performs better than the others all the time.
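A filter-based method scores each feature independently of any classifier and keeps the top-ranked ones. The sketch below uses mutual information as the score, which is an assumption: the abstract does not say which filters were compared. The feature columns and labels are invented toy data.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative about the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (made up): 1 = app requests the permission, label 1 = malware.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "INTERNET": [1, 1, 1, 1, 1, 1, 1, 1],       # constant, carries no information
    "READ_CONTACTS": [1, 1, 1, 0, 0, 0, 0, 1],  # agrees with the label 6 times out of 8
    "SEND_SMS": [1, 1, 1, 1, 0, 0, 0, 0],       # matches the label exactly
}

ranking = rank_features(columns, labels)
print(ranking)
```

Because the score never consults a learning algorithm, the same ranking is reused for every classifier, which is what makes filters cheap compared with wrapper methods.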


Author(s):  
ROSA BLANCO ◽  
PEDRO LARRAÑAGA ◽  
IÑAKI INZA ◽  
BASILIO SIERRA

Although cancer classification has improved considerably, a general method that classifies known types of cancer has not yet been developed. In this work, we propose the use of supervised classification techniques, coupled with feature subset selection algorithms, to perform this classification automatically in gene expression datasets. Because gene expression datasets have a large number of features, the search for a highly accurate combination of features is carried out by means of the new Estimation of Distribution Algorithms paradigm. To assess the accuracy of the proposed approach, the naïve-Bayes classification algorithm is employed in a wrapper form. Promising results are achieved, together with a considerable reduction in the number of genes. By stating the optimal selection of genes as a search task, the approach makes an automatic and robust choice of the genes finally selected, in contrast to previous work on the same types of problems.
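As a rough illustration of searching feature subsets with an Estimation of Distribution Algorithm, the sketch below implements UMDA, the simplest member of that family. Whether the authors used UMDA is not stated in the abstract, so treat that choice as an assumption. In a wrapper setting the `fitness` callback would return the estimated accuracy of naïve Bayes on the selected genes; here a made-up toy score stands in for it, and all names and parameters are hypothetical.

```python
import random


def umda_select(n_features, fitness, pop_size=60, n_keep=20, generations=30, seed=0):
    """Search binary feature masks with a univariate EDA (UMDA)."""
    rng = random.Random(seed)
    probs = [0.5] * n_features  # marginal probability that each feature is selected
    best_mask, best_score = None, float("-inf")
    for _ in range(generations):
        # Sample a population of masks from the current marginals.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        scored = sorted(
            ((fitness(m), m) for m in population), key=lambda t: t[0], reverse=True
        )
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0]
        elite = [m for _, m in scored[:n_keep]]
        # Re-estimate each marginal from the elite, clamped so that no
        # feature is permanently fixed on or off.
        probs = [
            min(0.95, max(0.05, sum(m[i] for m in elite) / n_keep))
            for i in range(n_features)
        ]
    return best_mask, best_score


# Toy stand-in for wrapper accuracy (assumption): features 0, 3 and 7 are
# informative, and every selected feature carries a small cost.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return 10 * hits - sum(mask)


mask, score = umda_select(12, toy_fitness)
print(mask, score)
```

The clamp on the marginals is a common safeguard against premature convergence; the bounds 0.05 and 0.95 are arbitrary choices for this sketch.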


2018 ◽  
Vol 7 (2.32) ◽  
pp. 39
Author(s):  
Dr Swarna Kuchibhotla ◽  
Mr Niranjan M.S.R

This paper focuses on the classification of various acoustic emotional corpora with frequency-domain features using feature subset selection methods. The emotional speech samples are classified into neutral, happy, fear, anger, disgust and sad states using statistical properties of spectral features estimated from Berlin and Spanish emotional utterances. The Sequential Forward Selection (SFS) and Sequential Floating Forward Selection (SFFS) feature subset selection algorithms are used to extract more informative features. In both the Berlin and Spanish corpora, the number of emotional speech samples available for training is smaller than the number of features extracted from each speech sample, a situation known as the curse of dimensionality. Because of this high-dimensional feature vector, the efficiency of the classifier decreases while the computation time increases. To further improve the efficiency of the classifier, an optimal subset of features is needed, and it is obtained using feature subset selection methods. This enhances the performance of the system, giving higher efficiency and lower computation time. The classifier used in this work is the standard K Nearest Neighbour (KNN) classifier. Experimental evaluation showed that the performance of the classifier is enhanced with SFFS because it eliminates the nesting effect suffered by SFS. The results also showed that an optimal feature subset is a better choice for classification than the full feature set.
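The nesting effect can be seen in a small sketch: once SFS adds a feature it can never drop it, whereas SFFS may remove a feature after each addition if doing so beats the best subset of that size found so far. The criterion below is a made-up lookup table standing in for a classifier-based score such as KNN accuracy; the feature names and values are hypothetical.

```python
def sfs(features, criterion, k):
    """Sequential Forward Selection: greedily add the best feature until k are chosen."""
    selected = []
    while len(selected) < k:
        best = max(
            (f for f in features if f not in selected),
            key=lambda f: criterion(selected + [f]),
        )
        selected.append(best)
    return selected


def sffs(features, criterion, k):
    """Sequential Floating Forward Selection: forward steps plus conditional removals."""
    selected = []
    best_by_size = {}  # subset size -> (score, subset)
    while len(selected) < k:
        # Forward step: add the feature that helps most.
        add = max(
            (f for f in features if f not in selected),
            key=lambda f: criterion(selected + [f]),
        )
        selected = selected + [add]
        score = criterion(selected)
        size = len(selected)
        if size not in best_by_size or score > best_by_size[size][0]:
            best_by_size[size] = (score, list(selected))
        # Floating step: drop features while that beats the best smaller subset seen.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            reduced_score = criterion(reduced)
            if reduced_score <= best_by_size[len(reduced)][0]:
                break
            selected = reduced
            best_by_size[len(reduced)] = (reduced_score, list(reduced))
    return best_by_size[k][1]


# Made-up criterion values: "a" is the best single feature, but the best
# triple does not contain it.
SCORES = {
    "a": 0.60, "b": 0.50, "c": 0.45, "d": 0.40,
    "ab": 0.70, "ac": 0.65, "ad": 0.62, "bc": 0.80, "bd": 0.55, "cd": 0.50,
    "abc": 0.82, "abd": 0.72, "acd": 0.70, "bcd": 0.95,
}


def toy_criterion(subset):
    return SCORES["".join(sorted(subset))]


sfs_result = sorted(sfs("abcd", toy_criterion, 3))
sffs_result = sorted(sffs("abcd", toy_criterion, 3))
print(sfs_result, toy_criterion(sfs_result))
print(sffs_result, toy_criterion(sffs_result))
```

In this toy table SFS stays locked onto `a` and ends with `a, b, c`, while SFFS drops `a` after the third addition and reaches `b, c, d`.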

