Fast approximate inference for variable selection in Dirichlet process mixtures, with an application to pan-cancer proteomics

Author(s):  
Oliver M. Crook ◽  
Laurent Gatto ◽  
Paul D. W. Kirk

Abstract The Dirichlet Process (DP) mixture model has become a popular choice for model-based clustering, largely because it allows the number of clusters to be inferred. The sequential updating and greedy search (SUGS) algorithm (Wang & Dunson, 2011) was proposed as a fast method for performing approximate Bayesian inference in DP mixture models, by posing clustering as a Bayesian model selection (BMS) problem and avoiding the use of computationally costly Markov chain Monte Carlo methods. Here we consider how this approach may be extended to permit variable selection for clustering, and also demonstrate the benefits of Bayesian model averaging (BMA) in place of BMS. Through an array of simulation examples and well-studied examples from cancer transcriptomics, we show that our method performs competitively with the current state-of-the-art, while also offering computational benefits. We apply our approach to reverse-phase protein array (RPPA) data from The Cancer Genome Atlas (TCGA) in order to perform a pan-cancer proteomic characterisation of 5157 tumour samples. We have implemented our approach, together with the original SUGS algorithm, in an open-source R package named sugsvarsel, which accelerates analysis by performing intensive computations in C++ and provides automated parallel processing. The R package is freely available from: https://github.com/ococrook/sugsvarsel
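The contrast the abstract draws between Bayesian model selection (BMS) and Bayesian model averaging (BMA) can be illustrated with a minimal sketch. This is not the sugsvarsel implementation; the function names, the uniform model prior, and the numeric values below are all assumptions made for illustration. Given a log marginal likelihood per candidate model, BMS keeps only the best-scoring model, whereas BMA weights every model's estimate by its posterior probability.

```python
import math


def posterior_model_probs(log_marginals, log_priors=None):
    """Turn log marginal likelihoods into posterior model probabilities."""
    if log_priors is None:
        # Assumption: a uniform prior over the candidate models.
        log_priors = [0.0] * len(log_marginals)
    scores = [lm + lp for lm, lp in zip(log_marginals, log_priors)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def bms_estimate(log_marginals, estimates):
    """Model selection: report the estimate from the single best model."""
    best = max(range(len(log_marginals)), key=lambda i: log_marginals[i])
    return estimates[best]


def bma_estimate(log_marginals, estimates):
    """Model averaging: weight each model's estimate by its posterior probability."""
    probs = posterior_model_probs(log_marginals)
    return sum(p * e for p, e in zip(probs, estimates))


# Hypothetical values: three candidate models, each reporting whether a
# given variable is selected (1.0) or not (0.0).
log_marginals = [-102.3, -101.9, -110.0]
estimates = [0.0, 1.0, 1.0]

selected = bms_estimate(log_marginals, estimates)
averaged = bma_estimate(log_marginals, estimates)
```

With these made-up numbers, BMS reports the variable as selected outright, while BMA returns a value strictly between 0 and 1, reflecting that the runner-up model, which excludes the variable, has a comparable marginal likelihood.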

Nutrients ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 1098
Author(s):  
Ewelina Łukaszyk ◽  
Katarzyna Bień-Barkowska ◽  
Barbara Bień

Identifying factors that affect mortality requires a robust statistical approach. This study’s objective is to identify an optimal set of variables that are independently associated with the mortality risk of 433 older comorbid adults who had been discharged from the geriatric ward. We applied both stepwise backward variable selection and iterative Bayesian model averaging (BMA) to Cox proportional hazards models. Potential predictors of the mortality rate were drawn from a broad range of clinical data and functional and laboratory tests, including the geriatric nutritional risk index (GNRI), lymphocyte count, vitamin D level, and the age-weighted Charlson comorbidity index. The multivariable analysis identified seven explanatory variables that are independently associated with the length of survival. The mortality rate was higher in males than in females; it increased with the comorbidity level and the C-reactive protein plasma level, and decreased with better mobility and with higher GNRI, lymphocyte count, and vitamin D plasma level.


2001 ◽  
Vol 20 (21) ◽  
pp. 3215-3230 ◽  
Author(s):  
Valerie Viallefont ◽  
Adrian E. Raftery ◽  
Sylvia Richardson

Author(s):  
Lorenzo Bencivelli ◽  
Massimiliano Giuseppe Marcellino ◽  
Gianluca Moretti
