Preserving Ordinal Consensus: Towards Feature Selection for Unlabeled Data

2020 ◽  
Vol 34 (01) ◽  
pp. 75-82
Author(s):  
Jun Guo ◽  
Heng Chang ◽  
Wenwu Zhu

To better pre-process unlabeled data, most existing feature selection methods remove redundant and noisy information by exploring intrinsic structures embedded in the samples. However, these unsupervised studies focus too much on the relations among samples, entirely neglecting feature-level geometric information. This paper proposes an unsupervised triplet-induced graph to explore a new type of potential structure at the feature level, and incorporates it into simultaneous feature selection and clustering. In the feature selection part, we design an ordinal consensus preserving term based on the triplet-induced graph. This term constrains the projection vectors to preserve the relative proximity of the original features, which contributes to selecting more relevant features. In the clustering part, Self-Paced Learning (SPL) is introduced to gradually learn from ‘easy’ to ‘complex’ samples; SPL alleviates the risk of falling into bad local minima caused by noise and outliers. Specifically, we propose a compelling regularizer for SPL to obtain a robust loss. Finally, an alternating minimization algorithm is developed to optimize the proposed model efficiently. Extensive experiments on different benchmark datasets consistently demonstrate the superiority of the proposed method.
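As an illustration of the self-paced learning component only (the paper's own regularizer is not reproduced here), the following minimal Python/NumPy sketch shows the classic hard-threshold SPL weighting, in which samples whose loss falls below a pace parameter are admitted first and the threshold is then relaxed; the losses and parameter values are placeholders.

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weights: include a sample only if its loss is below lam.

    Corresponds to the classic SPL regularizer f(v) = -lam * sum(v); the
    closed-form minimizer is v_i = 1 if loss_i < lam else 0.
    """
    return (losses < lam).astype(float)

# Toy alternating loop: reweight samples, then relax the pace parameter so
# progressively 'harder' (higher-loss) samples are admitted.
losses = np.array([0.1, 0.4, 2.5, 0.3, 5.0])   # per-sample losses from some model
lam = 0.5
for step in range(3):
    v = spl_weights(losses, lam)
    print(f"step {step}: lam={lam:.2f}, active samples = {np.flatnonzero(v)}")
    lam *= 2.0   # grow the pace parameter: learn from easy to complex samples
```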

2015 ◽  
Vol 1 (311) ◽  
Author(s):  
Katarzyna Stąpor

Discriminant analysis can best be defined as a technique which allows the classification of an individual into one of several distinct populations on the basis of a set of measurements. Stepwise discriminant analysis (SDA) is concerned with selecting the most important variables while retaining the highest possible discrimination power. The process of selecting a smaller number of variables is often necessary for a variety of reasons. In existing statistical software packages, SDA is based on classic feature selection methods, and many problems with such stepwise procedures have been identified. In this work a new method based on the tabu search metaheuristic is presented, together with experimental results on selected benchmark datasets. The results are promising.
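As a hedged sketch of the general idea (not the algorithm described in the paper), the following Python example runs a simple tabu search over binary feature masks, scoring each candidate subset by cross-validated LDA accuracy; the dataset, tabu tenure, and iteration count are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def tabu_feature_search(X, y, n_iter=30, tabu_tenure=5, seed=0):
    """Tabu search over binary feature masks, scored by cross-validated LDA accuracy."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    mask = rng.integers(0, 2, n_features).astype(bool)
    if not mask.any():
        mask[rng.integers(n_features)] = True

    def score(m):
        return cross_val_score(LinearDiscriminantAnalysis(), X[:, m], y, cv=5).mean()

    best_mask, best_score = mask.copy(), score(mask)
    tabu = {}  # feature index -> iteration until which flipping it is forbidden
    current = mask.copy()
    for it in range(n_iter):
        candidates = []
        for j in range(n_features):
            neighbour = current.copy()
            neighbour[j] = not neighbour[j]      # move = flip one feature in/out
            if not neighbour.any():
                continue
            s = score(neighbour)
            # Aspiration criterion: a tabu move is allowed if it beats the best so far.
            if tabu.get(j, -1) < it or s > best_score:
                candidates.append((s, j, neighbour))
        if not candidates:
            break
        s, j, current = max(candidates, key=lambda c: c[0])
        tabu[j] = it + tabu_tenure
        if s > best_score:
            best_mask, best_score = current.copy(), s
    return best_mask, best_score

X, y = load_wine(return_X_y=True)
mask, acc = tabu_feature_search(X, y)
print("selected features:", np.flatnonzero(mask), "cv accuracy: %.3f" % acc)
```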


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 988 ◽  
Author(s):  
Fazakis ◽  
Kanas ◽  
Aridas ◽  
Karlos ◽  
Kotsiantis

One of the major aspects affecting the performance of classification algorithms is the amount of labeled data available during the training phase. It is widely accepted that labeling vast amounts of data is both expensive and time-consuming, since it requires human expertise. In a wide variety of scientific fields, unlabeled examples are easy to collect but hard to exploit in a way that adds useful information to a dataset. In this context, a variety of learning methods have been studied in the literature that aim to efficiently utilize the vast amounts of unlabeled data during the learning process. The most common approaches tackle problems of this kind by applying either active learning or semi-supervised learning methods individually. In this work, a combination of active learning and semi-supervised learning methods is proposed, under a common self-training scheme, in order to efficiently utilize the available unlabeled data. Entropy and the distribution of class probabilities over the unlabeled set are used as effective and robust criteria for selecting the most suitable unlabeled examples with which to augment the initial labeled set. The superiority of the proposed scheme is validated by comparing it against the baseline approaches of supervised, semi-supervised, and active learning on a wide range of fifty-five benchmark datasets.
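To make the entropy criterion concrete, here is a minimal Python sketch (not the authors' exact scheme) in which a classifier's predictive entropy over the unlabeled pool drives both halves of the idea: low-entropy examples are pseudo-labeled to augment the labeled set, while high-entropy examples are the ones an active learner would send to an annotator. The dataset, classifier, and selection counts are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy pools: a small labeled set and a larger unlabeled pool.
X, y = make_classification(n_samples=600, n_informative=5, random_state=0)
X_lab, y_lab, X_unlab = X[:50], y[:50], X[50:]

clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
proba = clf.predict_proba(X_unlab)
entropy = -(proba * np.log(proba + 1e-12)).sum(axis=1)

# Semi-supervised half: the lowest-entropy (most confident) examples are
# pseudo-labeled and appended to the labeled set.
confident = np.argsort(entropy)[:20]
X_aug = np.vstack([X_lab, X_unlab[confident]])
y_aug = np.concatenate([y_lab, clf.predict(X_unlab[confident])])

# Active-learning half: the highest-entropy examples would be sent to a
# human annotator; here we only report their indices.
to_query = np.argsort(entropy)[-10:]
print("pseudo-labeled:", len(confident), "queried for labels:", to_query)
```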


BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Zhixun Zhao ◽  
Xiaocai Zhang ◽  
Fang Chen ◽  
Liang Fang ◽  
Jinyan Li

Abstract Background DNA N4-methylcytosine (4mC) is a critical epigenetic modification and has various roles in the restriction-modification system. Due to the high cost of experimental laboratory detection, computational methods using sequence characteristics and machine learning algorithms have been explored to identify 4mC sites from DNA sequences. However, state-of-the-art methods have limited performance because of the lack of effective sequence features and the ad hoc choice of learning algorithms. This paper aims to propose a new sequence feature space and a machine learning algorithm with a feature selection scheme to address the problem. Results The feature importance score distributions in datasets of six species are first reported and analyzed. The impact of feature selection on model performance is then evaluated by independent testing on benchmark datasets, where the ACC and MCC measurements after feature selection increase by 2.3% to 9.7% and 0.05 to 0.19, respectively. The proposed method is compared with three state-of-the-art predictors using independent tests and 10-fold cross-validation, and it outperforms them on all datasets, improving ACC by 3.02% to 7.89% and MCC by 0.06 to 0.15 in the independent test. Two detailed case studies confirm the excellent overall performance of the proposed method, which correctly identified 24 of 26 4mC sites from the C. elegans gene and 126 out of 137 4mC sites from the D. melanogaster gene. Conclusions The results show that the proposed feature space and learning algorithm with feature selection can improve the performance of DNA 4mC prediction on the benchmark datasets. The two case studies prove the effectiveness of our method in practical situations.
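A hedged sketch of the kind of importance-based feature selection and ACC/MCC evaluation described above, with synthetic stand-in features rather than the paper's 4mC sequence feature space, and a random forest standing in for the paper's learning algorithm:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, matthews_corrcoef
from sklearn.model_selection import train_test_split

# Stand-in data for sequence-derived features (the paper's 4mC feature space
# is not reproduced here).
X, y = make_classification(n_samples=1000, n_features=200, n_informative=20,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Rank features by importance on the training split, then keep the top k.
ranker = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
top_k = np.argsort(ranker.feature_importances_)[::-1][:50]

# Retrain on the selected features and report ACC and MCC on the held-out set.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr[:, top_k], y_tr)
pred = clf.predict(X_te[:, top_k])
print("ACC: %.3f  MCC: %.3f" % (accuracy_score(y_te, pred),
                                matthews_corrcoef(y_te, pred)))
```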


2018 ◽  
Vol 6 (1) ◽  
pp. 58-72
Author(s):  
Omar A. M. Salem ◽  
Liwei Wang

Building classification models from real-world datasets has become a difficult task, especially for datasets with high-dimensional features. Unfortunately, these datasets may include irrelevant or redundant features which have a negative effect on classification performance. Selecting the significant features and eliminating undesirable ones can improve the classification models. Fuzzy mutual information is a widely used feature selection criterion for finding the best feature subset before the classification process; however, it requires considerable computation and storage space. To overcome these limitations, this paper proposes an improved fuzzy mutual information feature selection method based on representative samples. Experiments on benchmark datasets show that the proposed method achieves better results in terms of classification accuracy, selected feature subset size, storage, and stability.
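As a rough analogue only (classical rather than fuzzy mutual information, and a random subsample standing in for the representative samples), the following Python sketch estimates feature relevance on a reduced sample set to limit computation and storage:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Estimate mutual information on a random subsample rather than the full
# dataset, echoing the idea of working with representative samples to cut
# computation and storage.
rng = np.random.default_rng(0)
idx = rng.choice(len(X), size=200, replace=False)
mi = mutual_info_classif(X[idx], y[idx], random_state=0)

top = np.argsort(mi)[::-1][:10]
print("top-10 features by mutual information:", top)
```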


Author(s):  
J Qu ◽  
Z Liu ◽  
M J Zuo ◽  
H-Z Huang

Feature selection is an effective way of improving classification, reducing feature dimension, and speeding up computation. This work studies a reported support vector machine (SVM) based method of feature selection. Our results reveal discrepancies in both its feature ranking and feature selection schemes. Modifications are thus made, on the basis of which our SVM-based method of feature selection is proposed. Using a weighting fusion technique and the one-against-all approach, our binary model has been extended to multi-class classification problems. Three benchmark datasets are employed to demonstrate the performance of the proposed method. The multi-class model of the proposed method is also used for feature selection in planetary gear damage degree classification. The results on all datasets exhibit the consistently effective classification made possible by the proposed method.
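A minimal sketch of SVM-weight-based feature ranking with a one-against-all multi-class model, assuming a simple sum-of-magnitudes weighting fusion (the paper's exact ranking and fusion rules may differ):

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)

# One-vs-rest linear SVMs; each row of coef_ is the weight vector of one
# binary "class versus all" problem.
svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

# Simple weighting fusion: combine the per-class weight magnitudes into a
# single relevance score per feature, then rank.
scores = np.abs(svm.coef_).sum(axis=0)
ranking = np.argsort(scores)[::-1]
print("feature ranking (best first):", ranking)
```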


2019 ◽  
Vol 16 (8) ◽  
pp. 3603-3607 ◽  
Author(s):  
Shraddha Khonde ◽  
V. Ulagamuthalvi

In the current network scenario, hackers and intruders have become a major threat. As new technologies emerge rapidly and computers are used ever more extensively, security plays a critical role: most computers in a network can easily be compromised by attacks, and the increase in new types of attack is a serious concern. Security of sensitive data is a high-priority issue that must be addressed immediately. Highly efficient Intrusion Detection Systems (IDS) are available nowadays that detect various types of attacks on a network, but an IDS is needed that is intelligent enough to detect and analyze all types of new threats, and maximum accuracy is expected from any such intelligent intrusion detection system. An Intrusion Detection System can be hardware or software that monitors and analyzes all network activities to detect malicious activity inside the network. It also informs and helps the administrator to deal with malicious packets which, if they enter the network, can harm many of the connected computers. In our work we have implemented an intelligent IDS that helps the administrator analyze real-time network traffic by classifying packets entering the system as normal or malicious. This paper mainly focuses on the techniques used for feature selection to reduce the number of features from the KDD-99 dataset. It also explains the classification algorithm used, Random Forest, which builds a forest of trees to classify a real-time packet as normal or malicious. Random Forest uses ensembling: the final output is derived by combining the outputs of the trees that make up the forest. The dataset used in the experiments is KDD-99, which is used to train all the trees to obtain higher accuracy with the random forest. The results show that the random forest algorithm gives higher accuracy in a distributed network with a reduced false alarm rate.
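A hedged sketch of the overall pipeline, with synthetic data standing in for the 41 KDD-99 features (loading and encoding the real dataset is omitted) and a simple filter-based feature selection step feeding a random forest that votes packets as normal or malicious:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-in for the 41 KDD-99 features; class 1 plays the role of "malicious".
X, y = make_classification(n_samples=5000, n_features=41, n_informative=12,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0,
                                          stratify=y)

# Feature selection trims the 41 inputs, then an ensemble of trees whose
# votes are combined into the final normal / malicious decision.
ids = make_pipeline(SelectKBest(mutual_info_classif, k=15),
                    RandomForestClassifier(n_estimators=100, random_state=0))
ids.fit(X_tr, y_tr)
print(classification_report(y_te, ids.predict(X_te),
                            target_names=["normal", "malicious"]))
```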


2019 ◽  
Vol 28 (05) ◽  
pp. 1950016
Author(s):  
H. Benjamin Fredrick David ◽  
A. Suruliandi ◽  
S. P. Raja

Data mining integrates statistical analysis, machine learning and database technology to extract hidden patterns and relationships from data. The presence of irrelevant, redundant and inconsistent attributes in the data leads to poor classification accuracy. In this paper, a novel bio-inspired heuristic swarm optimization algorithm for feature selection, namely the Constructive Lazy Wolf Search Algorithm, is proposed on the backbone of the Wolf Search Algorithm. It is based on the behavior of real wolves, which search for food while surviving threats by avoiding them. Based on a study of wolf behavior, two natural factors, laziness and health, are introduced to attain the highest efficiency. Restricting and controlling the wolves' behavior by allowing only healthy and constructive lazy wolves to take part in the search reduces the search time and the complexity required to find the best fitness. The proposed algorithm is applied to a prisoner dataset for crime propensity prediction, along with a few benchmark datasets, to demonstrate the stability of the improved performance compared with other bio-inspired optimization algorithms. The accuracy achieved by fine-tuning the proposed algorithm was 98.19%, supporting accurate crime prevention.
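The Constructive Lazy Wolf Search Algorithm itself is not reproduced here; as a loose illustration of wrapper-style, population-based feature selection, the following Python sketch keeps a small population of candidate feature masks and accepts random bit-flip moves only when cross-validated fitness improves. All parameters and the dataset are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)
n_feat = X.shape[1]

def fitness(mask):
    """Wrapper fitness: cross-validated accuracy of a classifier on the subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(), X[:, mask], y, cv=3).mean()

# A small population of candidate feature masks (the "agents"); each iteration,
# every agent flips a few bits and keeps the move only if fitness improves.
pop = rng.integers(0, 2, size=(6, n_feat)).astype(bool)
scores = np.array([fitness(m) for m in pop])
for _ in range(10):
    for i in range(len(pop)):
        trial = pop[i].copy()
        flips = rng.choice(n_feat, size=3, replace=False)
        trial[flips] = ~trial[flips]
        s = fitness(trial)
        if s > scores[i]:
            pop[i], scores[i] = trial, s

best = pop[scores.argmax()]
print("best CV accuracy %.3f with %d features" % (scores.max(), best.sum()))
```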


2020 ◽  
Vol 9 (2) ◽  
pp. 59-79
Author(s):  
Heisnam Rohen Singh ◽  
Saroj Kr Biswas

Recent trends in data mining and machine learning focus on knowledge extraction and explanation, to make crucial decisions from data; however, data is virtually enormous in size and mostly associated with noise. Neuro-fuzzy systems are well suited to representing knowledge in a data-driven environment. Many neuro-fuzzy systems have been proposed for feature selection and classification, but they focus more on the quantitative aspect (accuracy) than the qualitative one (transparency). Such neuro-fuzzy systems for feature selection and classification include Enhance Neuro-Fuzzy (ENF) and Adaptive Dynamic Clustering Neuro-Fuzzy (ADCNF). Here a neuro-fuzzy system is proposed for feature selection and classification with improved accuracy and transparency. The novelty of the proposed system lies in determining a significant number of linguistic features for each input and in suggesting a compelling order of classification rules using the importance of the input features and the certainty of the rules. The performance of the proposed system is tested on 8 benchmark datasets. 10-fold cross-validation is used to compare the accuracy of the systems, and other performance measures such as false positive rate, precision, recall, f-measure, Matthews correlation coefficient and Nauck's index are also used for comparison. The experimental results show that the proposed system is superior to the existing neuro-fuzzy systems.
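The neuro-fuzzy systems themselves are not sketched here, but the evaluation protocol described above can be illustrated in a few lines of Python: 10-fold cross-validation reporting accuracy, precision, recall, f-measure and Matthews correlation coefficient, with an ordinary classifier standing in for the compared systems (Nauck's index has no off-the-shelf scorer and is omitted).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# A decision tree stands in for the compared systems so the 10-fold protocol
# and the metric set can be shown.
scoring = {"acc": "accuracy", "precision": "precision", "recall": "recall",
           "f1": "f1", "mcc": make_scorer(matthews_corrcoef)}
cv = cross_validate(DecisionTreeClassifier(random_state=0), X, y,
                    cv=10, scoring=scoring)
for name in scoring:
    print("%s: %.3f" % (name, cv["test_" + name].mean()))
```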


2014 ◽  
Vol 536-537 ◽  
pp. 450-453 ◽  
Author(s):  
Jiang Jiang ◽  
Xi Chen ◽  
Hai Tao Gan

In this paper, a sparsity-based model is proposed for feature selection in kernel minimum squared error (KMSE). By imposing a sparsity shrinkage term, we formulate the procedure of subset selection as an optimization problem. With the chosen small portion of training examples, the computational burden of feature extraction is largely alleviated. Experimental results on several benchmark datasets indicate the effectiveness and efficiency of our method.
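As a hedged illustration of the idea (not the paper's exact formulation), the following Python sketch fits a kernel minimum squared error model with an L1 shrinkage term on the expansion coefficients, so that only a small portion of training examples keep nonzero coefficients and need to be evaluated at prediction time; the kernel width and penalty strength are placeholders.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import Lasso
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
t = np.where(y == 1, 1.0, -1.0)          # +/-1 targets, as in minimum squared error

# Least-squares fit of the kernel expansion K(X, X) @ alpha ~= t with an L1
# shrinkage term, so only a sparse subset of expansion coefficients survive.
K = rbf_kernel(X, X, gamma=0.05)
model = Lasso(alpha=0.01, max_iter=50000).fit(K, t)

# Prediction only needs kernel evaluations against the retained examples.
support = np.flatnonzero(model.coef_)
pred = np.sign(rbf_kernel(X, X[support], gamma=0.05) @ model.coef_[support]
               + model.intercept_)
print("retained examples: %d / %d, training accuracy: %.3f"
      % (len(support), len(X), (pred == t).mean()))
```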

