A novel bagging approach for variable ranking and selection via a mixed importance measure

2017 ◽  
Vol 45 (10) ◽  
pp. 1734-1755
Author(s):  
Chun-Xia Zhang ◽  
Jiang-She Zhang ◽  
Guan-Wei Wang ◽  
Nan-Nan Ji
2016 ◽  
Vol 31 (4) ◽  
pp. 1237-1262 ◽  
Author(s):  
Chun-Xia Zhang ◽  
Jiang-She Zhang ◽  
Sang-Woon Kim

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sofia Kapsiani ◽  
Brendan J. Howlin

Abstract: Ageing is a major risk factor for many conditions, including cancer, cardiovascular and neurodegenerative diseases. Pharmaceutical interventions that slow down ageing and delay the onset of age-related diseases are a growing research area. The aim of this study was to build a machine learning model based on the data of the DrugAge database to predict whether a chemical compound will extend the lifespan of Caenorhabditis elegans. Five predictive models were built using the random forest algorithm with molecular fingerprints and/or molecular descriptors as features. The best performing classifier, built using molecular descriptors, achieved an area under the curve (AUC) score of 0.815 for classifying the compounds in the test set. The features of the model were ranked using the Gini importance measure of the random forest algorithm. The top 30 features included descriptors related to atom and bond counts, topological and partial charge properties. The model was applied to predict the class of compounds in an external database consisting of 1738 small molecules. The chemical compounds of the screening database with a predictive probability of ≥ 0.80 for increasing the lifespan of Caenorhabditis elegans were broadly separated into (1) flavonoids, (2) fatty acids and conjugates, and (3) organooxygen compounds.
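The Gini importance mentioned in this abstract is built from decreases in Gini impurity at tree splits. The following is a minimal sketch of that building block only, not the study's pipeline: it scores each feature by the largest impurity decrease a single threshold split can achieve, whereas a random forest sums weighted decreases over all nodes and trees. All function names and the data layout are hypothetical choices made for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def best_split_decrease(values, labels):
    """Largest impurity decrease over all threshold splits of one feature."""
    pairs = sorted(zip(values, labels))
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        best = max(best, impurity_decrease(labels, left, right))
    return best


def rank_features(columns, labels):
    """Rank features (name -> list of values) by single-split Gini decrease."""
    scores = {
        name: best_split_decrease(values, labels)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_features({"a": [...], "b": [...]}, labels)` returns `(name, score)` pairs sorted from most to least informative under this simplified single-split criterion.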


2019 ◽  
Vol 35 (19) ◽  
pp. 3663-3671 ◽  
Author(s):  
Stephan Seifert ◽  
Sven Gundlach ◽  
Silke Szymczak

Abstract
Motivation: It has been shown that the machine learning approach random forest can be successfully applied to omics data, such as gene expression data, for classification or regression and to select variables that are important for prediction. However, the complex relationships between predictor variables, in particular between causal predictor variables, make the interpretation of currently applied variable selection techniques difficult.
Results: Here we propose a new variable selection approach called surrogate minimal depth (SMD) that incorporates surrogate variables into the concept of minimal depth (MD) variable importance. Applying SMD, we show that simulated correlation patterns can be reconstructed and that the increased consideration of variable relationships improves variable selection. When compared with existing state-of-the-art methods and MD, SMD has higher empirical power to identify causal variables while the resulting variable lists are equally stable. In conclusion, SMD is a promising approach to get more insight into the complex interplay of predictor variables and outcome in a high-dimensional data setting.
Availability and implementation: https://github.com/StephanSeifert/SurrogateMinimalDepth
Supplementary information: Supplementary data are available at Bioinformatics online.
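As a rough illustration of the plain minimal depth (MD) idea that this abstract builds on, the sketch below records, for each variable, the depth of its shallowest split in a tree (root depth 0) and averages that over a forest. It does not implement the surrogate-variable extension (SMD). The tuple-based tree encoding, the function names, and the choice to skip trees in which a variable never appears are all assumptions made for this example; published MD implementations handle absent variables differently.

```python
from collections import deque


def minimal_depths(tree):
    """Return {variable: depth of its shallowest split} for one tree.

    A tree is a nested tuple (variable, left_subtree, right_subtree);
    a leaf is None. This encoding is hypothetical.
    """
    depths = {}
    queue = deque([(tree, 0)])
    while queue:
        node, depth = queue.popleft()
        if node is None:
            continue
        variable, left, right = node
        # Breadth-first order visits shallow nodes first, so the first
        # depth recorded for a variable is its minimal depth.
        if variable not in depths:
            depths[variable] = depth
        queue.append((left, depth + 1))
        queue.append((right, depth + 1))
    return depths


def mean_minimal_depth(forest):
    """Average each variable's minimal depth over the trees that use it.

    Simplification: trees in which a variable never splits are skipped.
    """
    totals, counts = {}, {}
    for tree in forest:
        for variable, depth in minimal_depths(tree).items():
            totals[variable] = totals.get(variable, 0) + depth
            counts[variable] = counts.get(variable, 0) + 1
    return {variable: totals[variable] / counts[variable] for variable in totals}
```

Under this measure, a smaller mean minimal depth suggests that a variable tends to split closer to the root and is therefore treated as more important.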


2013 ◽  
Vol 842 ◽  
pp. 746-749
Author(s):  
Bo Yang ◽  
Liang Zhang

A novel sparse weighted LSSVM classifier is proposed in this paper, based on Suykens' weighted LSSVM. Unlike Suykens' weighted LSSVM, the proposed weighting method is better suited to classification. The distance between a sample and the classification border is used as the sample importance measure in our weighting method. Based on this importance measure, a new weight calculation function is designed that can adjust the sparseness of the weights. To address the class imbalance problem, a normalized weight calculation method is proposed. Finally, the proposed method is applied to digit recognition. Comparative experimental results show that the proposed sparse weighted LSSVM can effectively improve recognition accuracy.


2017 ◽  
Vol 65 (1) ◽  
pp. 54-62
Author(s):  
Justin Newton Scanlan ◽  
Natasha A. Lannin ◽  
Tammy Hoffmann ◽  
Mandy Stanley ◽  
Rachael McDonald
