Optimal combination of feature selection and classification via local hyperplane based learning strategy

2015 ◽ Vol 16 (1)
Author(s): Xiaoping Cheng ◽ Hongmin Cai ◽ Yue Zhang ◽ Bo Xu ◽ Weifeng Su

Complexity ◽ 2020 ◽ Vol 2020 ◽ pp. 1-13
Author(s): Xianghua Chu ◽ Shuxiang Li ◽ Da Gao ◽ Wei Zhao ◽ Jianshuang Cui ◽ ...

This paper proposes an improved learning algorithm for feature selection, termed binary superior tracking artificial bee colony with dynamic Cauchy mutation (BSTABC-DCM). To enhance exploitation capacity, a binary learning strategy is proposed that enables each bee to learn from the superior individuals in each dimension. A dynamic Cauchy mutation is introduced to diversify the population distribution. Ten datasets from the UCI repository are adopted as test problems, and the average cross-validation results of BSTABC-DCM are compared with those of seven other popular swarm intelligence metaheuristics. Experimental results demonstrate that BSTABC-DCM can obtain the best classification accuracy and select the most representative features for the UCI problems.
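
The abstract does not spell out how the dynamic Cauchy mutation acts on a binary feature mask, so the following is only a minimal sketch of one plausible reading, not the authors' BSTABC-DCM operator. The linear decay of the Cauchy scale, the `|tanh|` transfer function, and every name and parameter (`dynamic_cauchy_mutation`, `scale0`, and so on) are assumptions made for illustration.

```python
import math
import random


def cauchy_sample(rng, scale):
    """Draw from a zero-centred Cauchy distribution by inverse-CDF sampling."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def dynamic_cauchy_mutation(mask, iteration, max_iter, rng, scale0=1.0):
    """Return a mutated copy of a binary feature mask (1 = feature selected).

    Assumption: the Cauchy scale shrinks linearly from scale0 to 0 over the
    run, and each bit flips with probability |tanh(step)|, where step is a
    Cauchy draw. Heavy tails give occasional large steps early in the run.
    """
    scale = scale0 * (1.0 - iteration / max_iter)
    mutated = []
    for bit in mask:
        step = cauchy_sample(rng, scale)
        flip_prob = abs(math.tanh(step))
        mutated.append(1 - bit if rng.random() < flip_prob else bit)
    return mutated


rng = random.Random(42)
mask = [1, 0, 1, 1, 0, 0, 1, 0]
early = dynamic_cauchy_mutation(mask, 0, 100, rng)    # wide scale, more flips likely
late = dynamic_cauchy_mutation(mask, 100, 100, rng)   # scale is 0, so no flips
```

Under these assumptions the mutation explores broadly at the start and leaves the mask untouched once the scale reaches zero, which is one simple way to make the operator "dynamic".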


2017 ◽ pp. 108-115
Author(s): Є.В. БОДЯНСЬКИЙ ◽ І.Г. ПЕРОВА ◽ Г.В. СТОЙКА

The Feature Selection task is one of the most complicated and pressing in the Data Mining area. Approaches to solving it are based on non-mathematical and presentative hypotheses. A new approach is proposed for evaluating the information content of medical features, based on an optimal combination of Feature Selection and Feature Extraction methods. This approach yields an optimally reduced number of features, each with a linguistic interpretation. A hybrid Feature Selection/Extraction system is proposed. The system is numerically simple and can perform Feature Selection/Extraction for any number of features using the standard method of principal component analysis and the distance between the first principal component and each medical feature.


PeerJ ◽ 2020 ◽ Vol 8 ◽ pp. e8812
Author(s): Tao Jin ◽ Chi Wang ◽ Suyan Tian

Multiple sclerosis (MS) is one of the most common neurological disabilities of the central nervous system. Immune-modulatory therapy with Interferon-β (IFN-β) is a commonly used first-line treatment to prevent MS patients from relapses. Nevertheless, a large proportion of MS patients on IFN-β therapy experience their first relapse within 2 years of treatment initiation. Feature selection, a machine learning strategy, is routinely used in the fields of bioinformatics and computational biology to determine which subset of genes is most relevant to an outcome of interest. The majority of feature selection methods focus on alterations in gene expression levels. In this study, we sought to determine which genes are most relevant to relapse of MS patients on IFN-β therapy. Rather than the usual focus on alterations in gene expression levels, we devised a feature selection method based on alterations in gene-to-gene interactions. In this study, we applied the proposed method to a longitudinal microarray dataset and evaluated the IFN-β effect on MS patients to identify gene pairs with differentially correlated edges that are consistent over time in the responder group compared to the non-responder group. The resulting gene list had a good predictive ability on an independent validation set and explicit biological implications related to MS. To conclude, it is anticipated that the proposed method will gain widespread interest and application in personalized treatment research to facilitate prediction of which patients may respond to a specific regimen.
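
To make the idea of differentially correlated edges concrete, here is a minimal sketch, not the authors' method: it computes the Pearson correlation of each gene pair separately in the responder and non-responder groups and keeps pairs whose correlations differ by at least a threshold. The gene names, toy values, the `threshold` parameter, and the function names are hypothetical. The sketch also ignores the consistency-over-time requirement and any significance testing, and it assumes no gene has constant expression within a group.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def differential_edges(responders, non_responders, threshold):
    """Rank gene pairs by the change in correlation between the two groups.

    Both arguments map a gene name to its expression values across the
    patients of that group. Pairs with |difference| below threshold are
    dropped (the threshold is an illustrative assumption).
    """
    edges = []
    for g1, g2 in combinations(sorted(responders), 2):
        r_resp = pearson(responders[g1], responders[g2])
        r_non = pearson(non_responders[g1], non_responders[g2])
        diff = r_resp - r_non
        if abs(diff) >= threshold:
            edges.append((g1, g2, diff))
    return sorted(edges, key=lambda e: -abs(e[2]))


# Toy data with hypothetical gene names: G1 and G2 move together in
# responders but in opposite directions in non-responders.
responders = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [1, 3, 2, 4]}
non_responders = {"G1": [1, 2, 3, 4], "G2": [8, 6, 4, 2], "G3": [1, 3, 2, 4]}
edges = differential_edges(responders, non_responders, threshold=1.0)
```

The pair (G1, G3) is dropped because its correlation is identical in both groups, even though each gene could still differ in expression level; this is the distinction between interaction-based and expression-based selection that the abstract draws.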


2015 ◽ Vol 24 (05) ◽ pp. 1540023
Author(s): Ioannis Tsamardinos ◽ Amin Rakhshani ◽ Vincenzo Lagani

In a typical supervised data analysis task, one needs to perform the following two tasks: (a) select an optimal combination of learning methods (e.g., for variable selection and classifier) and tune their hyper-parameters (e.g., K in K-NN), also called model selection, and (b) provide an estimate of the performance of the final, reported model. Combining the two tasks is not trivial because when one selects the set of hyper-parameters that seem to provide the best estimated performance, this estimation is optimistic (biased/overfitted) due to performing multiple statistical comparisons. In this paper, we discuss the theoretical properties of performance estimation when model selection is present and we confirm that the simple Cross-Validation with model selection is indeed optimistic (overestimates performance) in small sample scenarios and should be avoided. We present in detail and investigate the theoretical properties of the Nested Cross Validation and a method by Tibshirani and Tibshirani for removing the estimation bias. In computational experiments with real datasets both protocols provide conservative estimation of performance and should be preferred. These statements hold true even if feature selection is performed as preprocessing.
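
The Nested Cross Validation protocol mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' experimental setup: the classifier is a toy one-dimensional K-NN, the data are synthetic, and all function names and fold counts are hypothetical. The key point it shows is that K is chosen using only the outer training part, so the outer test fold never influences model selection. Any feature selection step would belong inside the same inner loop.

```python
import random


def k_folds(indices, k):
    """Split a list of indices into k nearly equal, disjoint folds."""
    return [indices[i::k] for i in range(k)]


def knn_predict(train, x, k):
    """Majority vote among the k nearest 1-D training points (labels 0/1)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def split(data, fold):
    """Return (train, test) where test holds the rows indexed by fold."""
    held = set(fold)
    train = [row for i, row in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test


def cv_accuracy(data, k_neighbors, n_folds):
    """Plain cross-validated accuracy for one fixed hyper-parameter value."""
    folds = k_folds(list(range(len(data))), n_folds)
    scores = [accuracy(*split(data, fold), k_neighbors) for fold in folds]
    return sum(scores) / len(scores)


def nested_cv(data, candidates, outer_folds=5, inner_folds=4):
    """Estimate the performance of 'select K by CV, then fit'."""
    outer_scores = []
    for fold in k_folds(list(range(len(data))), outer_folds):
        train, test = split(data, fold)
        # Model selection sees only the outer training part.
        best_k = max(candidates, key=lambda k: cv_accuracy(train, k, inner_folds))
        outer_scores.append(accuracy(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Synthetic two-class data: class 1 is shifted by two standard deviations.
rng = random.Random(0)
data = [(rng.gauss(label * 2.0, 1.0), label) for label in [0, 1] * 30]
rng.shuffle(data)
estimate = nested_cv(data, candidates=[1, 3, 5, 7])
```

By contrast, reporting the best `cv_accuracy` over all candidates on the full data would reuse the same folds for selection and estimation, which is the optimistic protocol the abstract warns against.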


Author(s): Bing Li ◽ Yong-Ping Zhao

The lack of management of simultaneous faults is one of the limitations of condition monitoring for gas turbines, which is critical for the safety and decision-making of aircraft operation. To this end, this paper employs a multi-label (ML) learning strategy to address simultaneous faults. Moreover, a feature selection algorithm is proposed, based on the viewpoint that different class labels may be distinguished by specific characteristics of their own. The proposed algorithm achieves label-specific feature selection by iteratively optimizing the weight reconstruction matrix, and the learned label-specific features for each label can be used for multi-label classification. Thus, the sensor data relevant to different components of aircraft engines can be determined by the proposed algorithm to support simultaneous fault diagnosis. Finally, comprehensive experiments on benchmark multi-label learning data sets validate the advantages and feasibility of the presented approaches, and extensive experiments also demonstrate their effectiveness for simultaneous fault diagnosis of aircraft engines.

