Optimal Bayesian Feature Selection with Bounded False Discovery Rate

Author(s):  
Ali Foroughi pour ◽  
Lori A. Dalton
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
David Chardin ◽  
Olivier Humbert ◽  
Caroline Bailleux ◽  
Fanny Burel-Vandenbos ◽  
Valerie Rigau ◽  
...  

Abstract Background: Supervised classification methods have been used for many years for feature selection in metabolomics and other omics studies. We developed a novel primal-dual based classification method (PD-CR) that can perform classification with rejection and feature selection on high-dimensional datasets. PD-CR projects data onto a low-dimensional space and performs classification by minimizing an appropriate quadratic cost. It simultaneously optimizes the selected features and the prediction accuracy with a new tailored, constrained primal-dual method. The primal-dual framework is general enough to encompass various robust losses and to allow for convergence analysis. Here, we compare PD-CR to three commonly used methods: partial least squares discriminant analysis (PLS-DA), random forests and support vector machines (SVM). We analyzed two metabolomics datasets: one urinary metabolomics dataset concerning lung cancer patients and healthy controls, and a metabolomics dataset obtained from frozen glial tumor samples with mutated isocitrate dehydrogenase (IDH) or wild-type IDH. Results: PD-CR was more accurate than PLS-DA, random forests and SVM for classification using the two metabolomics datasets. It also selected biologically relevant metabolites. PD-CR has the advantage of providing a confidence score for each prediction, which can be used to perform classification with rejection. This substantially reduces the false discovery rate. Conclusion: PD-CR is an accurate method for classification of metabolomics datasets which can outperform PLS-DA, random forests and SVM while selecting biologically relevant features. Furthermore, the confidence score provided with PD-CR can be used to perform classification with rejection and reduce the false discovery rate.


2008 ◽  
Vol 1 (2) ◽  
pp. 57-66 ◽  
Author(s):  
Seoung Bum Kim ◽  
Victoria C. P. Chen ◽  
Youngja Park ◽  
Thomas R. Ziegler ◽  
Dean P. Jones

2018 ◽  
Vol 15 (4) ◽  
pp. 1066-1078 ◽  
Author(s):  
Alexej Gossmann ◽  
Shaolong Cao ◽  
Damian Brzyski ◽  
Lan-Juan Zhao ◽  
Hong-Wen Deng ◽  
...  

Genetics ◽  
2003 ◽  
Vol 164 (2) ◽  
pp. 829-833
Author(s):  
Chiara Sabatti ◽  
Susan Service ◽  
Nelson Freimer

Abstract We explore the implications of the false discovery rate (FDR) controlling procedure in disease gene mapping. With the aid of simulations, we show how, under models commonly used, the simple step-down procedure introduced by Benjamini and Hochberg controls the FDR for the dependent tests on which linkage and association genome screens are based. This adaptive multiple comparison procedure may offer an important tool for mapping susceptibility genes for complex diseases.
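
As an illustration of the procedure named in the abstract above, here is a minimal sketch of the Benjamini-Hochberg rule in its usual formulation: sort the m p-values, find the largest rank k with p_(k) <= k * alpha / m, and reject the k hypotheses with the smallest p-values. The function name `benjamini_hochberg` and the example p-values are assumptions made for this sketch and do not come from the cited paper; the FDR guarantee is standard for independent or positively dependent tests, and the paper's simulations concern the dependent tests arising in genome screens.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Hypothetical helper written for illustration only.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank k whose p-value is at or below k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank

    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Made-up p-values, not data from any study listed here.
    example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(example, alpha=0.05))
```

Note that the rule looks for the largest qualifying rank, so a p-value that misses its own threshold can still be rejected when a larger p-value further down the sorted list meets its threshold.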


2019 ◽  
Vol 21 (Supplement_3) ◽  
pp. iii71-iii71
Author(s):  
T Kaisman-Elbaz ◽  
Y Elbaz ◽  
V Merkin ◽  
L Dym ◽  
A Noy ◽  
...  

Abstract BACKGROUND: Glioblastoma is known for its dismal prognosis, though its dependency on readily available red blood cell (RBC) parameters that define the patient’s anemic status, such as hemoglobin level and red blood cell distribution width (RDW), is not fully established. Several works demonstrated a connection between low hemoglobin level or high RDW values and overall survival of glioblastoma patients, but in other works a clear connection was not found. This study addresses this lack of clarity. MATERIAL AND METHODS: In this work, 170 glioblastoma patients diagnosed and treated at Soroka University Medical Center (SUMC) in the last 12 years were retrospectively inspected for the dependency of their survival on pre-operative RBC parameters, using multivariate analysis followed by a false discovery rate procedure to account for multiple hypothesis testing. A survival stratification tree and Kaplan-Meier survival curves that indicate the patient’s prognosis according to these parameters were prepared. RESULTS: Besides KPS>70 and tumor resection supplemented by oncological treatment, age<70 (HR=0.4, 95% CI 0.24–0.65), low hemoglobin level (HR=1.79, 95% CI 1.06–2.99) and RDW<14% (HR=0.57, 95% CI 0.37–0.88) were found to be prognostic of patients’ overall survival in multivariate analysis, accounting for a false discovery rate of less than 5%. CONCLUSION: Survival stratification highlighted a non-anemic subgroup of nearly 30% of the cohort’s patients whose median overall survival was 21.1 months (95% CI 16.2–27.2), higher than the average Stupp protocol overall median survival of about 15 months. A discussion on the beneficial or detrimental effect of RBC parameters on glioblastoma prognosis and its possible causes is given.


2020 ◽  
Vol 223 (1) ◽  
pp. 19-22
Author(s):  
Jingjing Zhu ◽  
Chong Wu ◽  
Lang Wu

Abstract It is critical to identify potential causal targets for SARS-CoV-2, which may guide drug repurposing options. We assessed the associations between genetically predicted protein levels and COVID-19 severity. Leveraging data from the COVID-19 Host Genetics Initiative comparing 6492 hospitalized COVID-19 patients and 1 012 809 controls, we identified 18 proteins with genetically predicted levels to be associated with COVID-19 severity at a false discovery rate of <0.05, including 12 that showed an association even after Bonferroni correction. Of the 18 proteins, 6 showed positive associations and 12 showed inverse associations. In conclusion, we identified 18 candidate proteins for COVID-19 severity.

