Combining clustering of variables and feature selection using random forests

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2018.1563145 ◽

2019 ◽

pp. 1-20 ◽

Author(s):

Marie Chavent ◽

Robin Genuer ◽

Jérôme Saracco

Keyword(s):

Feature Selection ◽

Random Forests ◽

Clustering Of Variables

Download Full-text

Feature selection using different evaluate strategy and random forests

10.1109/icceai52939.2021.00062 ◽

2021 ◽

Author(s):

Zhuo Wang ◽

Huan Li ◽

Bin Nie ◽

Jianqiang Du ◽

Yuwen Du ◽

...

Keyword(s):

Feature Selection ◽

Download Full-text

Random Forests with Latent Variables to Foster Feature Selection in the Context of Highly Correlated Variables. Illustration with a Bioinformatics Application.

Advances in Intelligent Data Analysis XVII - Lecture Notes in Computer Science ◽

10.1007/978-3-030-01768-2_24 ◽

2018 ◽

pp. 290-302

Author(s):

Christine Sinoquet ◽

Kamel Mekhnacha

Keyword(s):

Feature Selection ◽

Random Forests ◽

Latent Variables ◽

Correlated Variables ◽

Bioinformatics Application ◽

Highly Correlated

Download Full-text

Random KNN feature selection - a fast and stable alternative to Random Forests

BMC Bioinformatics ◽

10.1186/1471-2105-12-450 ◽

2011 ◽

Vol 12 (1) ◽

Author(s):

Shengqiao Li ◽

E James Harner ◽

Donald A Adjeroh

Keyword(s):

Feature Selection ◽

Download Full-text

Random Forests Followed by Computed ABC Analysis as a Feature Selection Method for Machine Learning in Biomedical Data

Studies in Classification, Data Analysis, and Knowledge Organization - Advanced Studies in Classification and Data Science ◽

10.1007/978-981-15-3311-2_5 ◽

2020 ◽

pp. 57-69

Author(s):

Jörn Lötsch ◽

Alfred Ultsch

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forests ◽

Feature Selection Method ◽

Selection Method ◽

Biomedical Data ◽

Download Full-text

A Hybrid Improved Ant Colony Optimization and Random Forests Feature Selection Method for Microarray Data

2009 Fifth International Joint Conference on INC, IMS and IDC ◽

10.1109/ncm.2009.66 ◽

2009 ◽

Author(s):

Wen Xiong ◽

Cong Wang

Keyword(s):

Feature Selection ◽

Ant Colony Optimization ◽

Random Forests ◽

Microarray Data ◽

Feature Selection Method ◽

Selection Method ◽

Download Full-text

A Feature Selection Algorithm Based on Graph Theory and Random Forests for Protein Secondary Structure Prediction

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-540-72031-7_54 ◽

2007 ◽

pp. 590-600 ◽

Author(s):

Gulsah Altun ◽

Hae-Jin Hu ◽

Stefan Gremalschi ◽

Robert W. Harrison ◽

Yi Pan

Keyword(s):

Feature Selection ◽

Graph Theory ◽

Secondary Structure ◽

Random Forests ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Protein Secondary Structure Prediction

Download Full-text

Combining Multiple Feature-Ranking Techniques and Clustering of Variables for Feature Selection

IEEE Access ◽

10.1109/access.2019.2947701 ◽

2019 ◽

Vol 7 ◽

pp. 151482-151492 ◽

Author(s):

Anwar Ul Haq ◽

Defu Zhang ◽

He Peng ◽

Sami Ur Rahman

Keyword(s):

Feature Selection ◽

Feature Ranking ◽

Multiple Feature ◽

Clustering Of Variables

Download Full-text

On the Interpretability of Machine Learning Models and Experimental Feature Selection in Case of Multicollinear Data

Electronics ◽

10.3390/electronics9050761 ◽

2020 ◽

Vol 9 (5) ◽

pp. 761

Author(s):

Franc Drobnič ◽

Andrej Kos ◽

Matevž Pustišek

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forests ◽

Experimental Approach ◽

Feature Selection Method ◽

Model Quality ◽

Human In The Loop ◽

Model Interpretation ◽

Small Feature ◽

Original Feature

In the field of machine learning, a considerable amount of research is involved in the interpretability of models and their decisions. The interpretability contradicts the model quality. Random Forests are among the best quality technologies of machine learning, but their operation is of “black box” character. Among the quantifiable approaches to the model interpretation, there are measures of association of predictors and response. In case of the Random Forests, this approach usually consists of calculating the model’s feature importances. Known methods, including the built-in one, are less suitable in settings with strong multicollinearity of features. Therefore, we propose an experimental approach to the feature selection task, a greedy forward feature selection method with least-trees-used criterion. It yields a set of most informative features that can be used in a machine learning (ML) training process with similar prediction quality as the original feature set. We verify the results of the proposed method on two known datasets, one with small feature multicollinearity and another with large feature multicollinearity. The proposed method also allows for a domain expert help with selecting among equally important features, which is known as the human-in-the-loop approach.

Download Full-text

Hybrid biogeography based simultaneous feature selection and MHC class I peptide binding prediction using support vector machines and random forests

Journal of Immunological Methods ◽

10.1016/j.jim.2012.09.013 ◽

2013 ◽

Vol 387 (1-2) ◽

pp. 284-292 ◽

Author(s):

Atulji Srivastava ◽

Shameek Ghosh ◽

N. Anantharaman ◽

V.K. Jayaraman

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Mhc Class I ◽

Random Forests ◽

Peptide Binding ◽

Class I ◽

Support Vector ◽

Binding Prediction ◽

Vector Machines ◽

Peptide Binding Prediction

Download Full-text

Random forests for feature selection in QSPR Models - an application for predicting standard enthalpy of formation of hydrocarbons

Journal of Cheminformatics ◽

10.1186/1758-2946-5-9 ◽

2013 ◽

Vol 5 (1) ◽

Author(s):

Ana L Teixeira ◽

João P Leal ◽

Andre O Falcao

Keyword(s):

Feature Selection ◽

Enthalpy Of Formation ◽

Random Forests ◽

Standard Enthalpy ◽

Standard Enthalpy Of Formation

Download Full-text