Improving Practices for Selecting a Subset of Important Predictors in Psychology: An Application to Predicting Pain

2019 ◽  
Author(s):  
Sierra Bainter ◽  
Thomas Granville McCauley ◽  
Tor D Wager ◽  
Elizabeth Reynolds Losin

In this paper we address the problem of selecting important predictors from some larger set of candidate predictors. Standard techniques are limited by lack of power and high false positive rates. A Bayesian variable selection approach used widely in biostatistics, stochastic search variable selection, can be used instead to combat these issues by accounting for uncertainty in the other predictors of the model. In this paper we present Bayesian variable selection to aid researchers facing this common scenario, along with an online application (https://ssvsforpsych.shinyapps.io/ssvsforpsych/) to perform the analysis and visualize the results. Using an application to predict pain ratings, we demonstrate how this approach quickly identifies reliable predictors, even when the set of possible predictors is larger than the sample size. This technique is widely applicable to research questions that may be relatively data-rich, but with limited information or theory to guide variable selection.
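As a rough illustration of the idea summarized above, the following is a minimal sketch of a stochastic search variable selection Gibbs sampler, using the normal-mixture (spike-and-slab) prior commonly attributed to George and McCulloch. It is not the authors' implementation, nor the code behind the online application. The function name `ssvs_inclusion_probs` and the hyperparameters `tau`, `c`, `p_incl`, `a`, and `b` are assumptions chosen for illustration. The sketch also assumes standardized predictors, a roughly centered outcome, and no intercept.

```python
import math
import random


def log_normal_pdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * (x / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)


def ssvs_inclusion_probs(X, y, n_iter=2000, burn_in=500, tau=0.05, c=100.0,
                         p_incl=0.5, a=1.0, b=1.0, seed=0):
    """Estimate marginal inclusion probabilities with a single-site Gibbs sampler.

    Assumed (illustrative) model:
      y_i      ~ N(x_i' beta, sigma2)
      beta_j   ~ N(0, tau^2)       if gamma_j = 0  (spike)
      beta_j   ~ N(0, (c*tau)^2)   if gamma_j = 1  (slab)
      gamma_j  ~ Bernoulli(p_incl)
      sigma2   ~ Inverse-Gamma(a, b)
    X is a list of rows, y a list of outcomes.
    """
    rng = random.Random(seed)
    n, k = len(y), len(X[0])
    beta = [0.0] * k
    gamma = [1] * k
    sigma2 = 1.0
    resid = list(y)  # residuals y - X beta (beta starts at zero)
    col_ss = [sum(row[j] ** 2 for row in X) for j in range(k)]
    counts = [0] * k
    prior_log_odds = math.log(p_incl / (1.0 - p_incl))

    for it in range(n_iter):
        for j in range(k):
            # Partial residual: remove predictor j's current contribution.
            partial = [resid[i] + X[i][j] * beta[j] for i in range(n)]
            # Conditional posterior of beta_j is normal given gamma_j.
            prior_var = (c * tau) ** 2 if gamma[j] else tau ** 2
            precision = col_ss[j] / sigma2 + 1.0 / prior_var
            xr = sum(X[i][j] * partial[i] for i in range(n))
            mean = xr / sigma2 / precision
            beta[j] = rng.gauss(mean, math.sqrt(1.0 / precision))
            resid = [partial[i] - X[i][j] * beta[j] for i in range(n)]
            # Conditional posterior of gamma_j depends only on beta_j.
            log_odds = (prior_log_odds
                        + log_normal_pdf(beta[j], c * tau)
                        - log_normal_pdf(beta[j], tau))
            prob_slab = 1.0 / (1.0 + math.exp(-log_odds))
            gamma[j] = 1 if rng.random() < prob_slab else 0
        # Conditional posterior of sigma2 is inverse-gamma.
        sse = sum(r * r for r in resid)
        sigma2 = 1.0 / rng.gammavariate(a + n / 2.0, 1.0 / (b + sse / 2.0))
        if it >= burn_in:
            for j in range(k):
                counts[j] += gamma[j]

    kept = n_iter - burn_in
    return [cnt / kept for cnt in counts]
```

Calling `ssvs_inclusion_probs(X, y)` returns one value per predictor: the share of retained iterations in which that predictor was assigned to the slab. Predictors with values near 1 are the candidates such a search would flag as reliable. Because every coefficient stays in the model and only its prior variance switches, the sampler can also be run when the number of predictors exceeds the sample size, although the results then depend more heavily on the assumed hyperparameters.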

2011 ◽  
Vol 93 (4) ◽  
pp. 303-318 ◽  
Author(s):  
TIMO KNÜRR ◽  
ESA LÄÄRÄ ◽  
MIKKO J. SILLANPÄÄ

Summary: A new estimation-based Bayesian variable selection approach is presented for genetic analysis of complex traits based on linear or logistic regression. By assigning a mixture of uniform priors (MU) to genetic effects, the approach provides an intuitive way of specifying hyperparameters controlling the selection of multiple influential loci. It aims at avoiding the difficulty of interpreting assumptions made in the specifications of priors. The method is compared in two real datasets with two other approaches, stochastic search variable selection (SSVS) and a re-formulation of Bayes B utilizing indicator variables and adaptive Student's t-distributions (IAt). The Markov Chain Monte Carlo (MCMC) sampling performance of the three methods is evaluated using the publicly available software OpenBUGS (model scripts are provided in the Supplementary material). The sensitivity of MU to the specification of hyperparameters is assessed in one of the data examples.


2019 ◽  
Vol 158 (5) ◽  
pp. 210 ◽
Author(s):  
Bo Ning ◽  
Alexander Wise ◽  
Jessi Cisewski-Kehe ◽  
Sarah Dodson-Robinson ◽  
Debra Fischer

2003 ◽  
Vol 19 (1) ◽  
pp. 90-97 ◽  
Author(s):  
K. E. Lee ◽  
N. Sha ◽  
E. R. Dougherty ◽  
M. Vannucci ◽  
B. K. Mallick

Metrika ◽  
2016 ◽  
Vol 80 (3) ◽  
pp. 289-308 ◽  
Author(s):  
Hengzhen Huang ◽  
Shuangshuang Zhou ◽  
Min-Qian Liu ◽  
Zong-Feng Qi

2009 ◽  
Vol 91 (5) ◽  
pp. 307-311 ◽  
Author(s):  
KLARA L. VERBYLA ◽  
BEN J. HAYES ◽  
PHILIP J. BOWMAN ◽  
MICHAEL E. GODDARD

Summary: Genomic selection describes a selection strategy based on genomic breeding values predicted from dense single nucleotide polymorphism (SNP) data. Multiple methods have been proposed, but the critical issue is how to decide whether an SNP should be included in the predictive set to estimate breeding values. One major disadvantage of the traditional Bayes B approach is its high computational demands caused by the changing dimensionality of the models. The use of stochastic search variable selection (SSVS) retains the same assumptions about the distribution of SNP effects as Bayes B, while maintaining constant dimensionality. When Bayesian SSVS was used to predict genomic breeding values for real dairy data over a range of traits, it produced accuracies higher than or equivalent to those of other genomic selection methods, with significantly lower computational and time demands than Bayes B.


2019 ◽  
Author(s):  
An-Shun Tai ◽  
George C. Tseng ◽  
Wen-Ping Hsieh

Abstract: Gene expression deconvolution is a powerful tool for exploring the microenvironment of complex tissues composed of multiple cell groups using transcriptomic data. Characterizing cell activities for a particular condition has been regarded as a primary mission against diseases. For example, cancer immunology aims to clarify the role of the immune system in the progression and development of cancer through analyzing the immune cell components of tumors. To that end, many deconvolution methods have been proposed for inferring cell subpopulations within tissues. Nevertheless, two problems limit the practicality of current approaches. First, all approaches use external purified data to preselect cell type-specific genes that contribute to deconvolution. However, some types of cells cannot be found in purified profiles, and the genes specifically over- or under-expressed in them cannot be identified. This is particularly a problem in cancer studies. Hence, a preselection strategy that is independent of deconvolution is inappropriate. The second problem is that existing approaches do not recover the expression profiles of unknown cells present in bulk tissues, which results in biased estimation of unknown cell proportions. Furthermore, it causes the shift-invariant property of deconvolution to fail, which then affects the estimation performance. To address these two problems, we propose a novel deconvolution approach, BayICE, which employs hierarchical Bayesian modeling with stochastic search variable selection. We develop a comprehensive Markov chain Monte Carlo procedure through Gibbs sampling to estimate cell proportions, gene expression profiles, and signature genes. Simulation and validation studies illustrate that BayICE outperforms existing deconvolution approaches in estimating cell proportions. Subsequently, we demonstrate an application of BayICE in the RNA sequencing of patients with non-small cell lung cancer.
The model is implemented in the R package “BayICE” and the algorithm is available for download.

