Genetic analysis of complex traits via Bayesian variable selection: the utility of a mixture of uniform priors

Summary: A new estimation-based Bayesian variable selection approach is presented for genetic analysis of complex traits based on linear or logistic regression. By assigning a mixture of uniform priors (MU) to genetic effects, the approach provides an intuitive way of specifying hyperparameters controlling the selection of multiple influential loci. It aims at avoiding the difficulty of interpreting assumptions made in the specifications of priors. The method is compared in two real datasets with two other approaches, stochastic search variable selection (SSVS) and a re-formulation of Bayes B utilizing indicator variables and adaptive Student's t-distributions (IAt). The Markov Chain Monte Carlo (MCMC) sampling performance of the three methods is evaluated using the publicly available software OpenBUGS (model scripts are provided in the Supplementary material). The sensitivity of MU to the specification of hyperparameters is assessed in one of the data examples.
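
As a rough illustration of the idea in the summary above, a two-component mixture of uniform priors on a genetic effect could be written as follows. The symbols \gamma_j, a, b and p are assumptions introduced here for illustration only; the summary does not state the exact parameterization used in the paper.

```latex
\beta_j \mid \gamma_j \;\sim\; (1-\gamma_j)\,\mathrm{U}(-a,\,a) \;+\; \gamma_j\,\mathrm{U}(-b,\,b),
\qquad 0 < a < b,
\qquad \gamma_j \sim \mathrm{Bernoulli}(p)
```

Under this assumed form, a would be the largest absolute effect still regarded as negligible and b the largest plausible effect, which is one sense in which such hyperparameters can be read directly on the scale of the effects.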

Improving Practices for Selecting a Subset of Important Predictors in Psychology: An Application to Predicting Pain

10.31234/osf.io/j8t7s ◽

2019 ◽

Author(s):

Sierra Bainter ◽

Thomas Granville McCauley ◽

Tor D Wager ◽

Elizabeth Reynolds Losin

Keyword(s):

Variable Selection ◽

Bayesian Variable Selection ◽

Limited Information ◽

Online Application ◽

Stochastic Search Variable Selection ◽

Selection Approach ◽

Pain Ratings ◽

Research Questions ◽

Standard Techniques ◽

Search Variable

In this paper we address the problem of selecting important predictors from some larger set of candidate predictors. Standard techniques are limited by lack of power and high false positive rates. A Bayesian variable selection approach used widely in biostatistics, stochastic search variable selection, can be used instead to combat these issues by accounting for uncertainty in the other predictors of the model. In this paper we present Bayesian variable selection to aid researchers facing this common scenario, along with an online application (https://ssvsforpsych.shinyapps.io/ssvsforpsych/) to perform the analysis and visualize the results. Using an application to predict pain ratings, we demonstrate how this approach quickly identifies reliable predictors, even when the set of possible predictors is larger than the sample size. This technique is widely applicable to research questions that may be relatively data-rich, but with limited information or theory to guide variable selection.

Stochastic Search Variable Selection for Identifying Multiple Quantitative Trait Loci

Genetics ◽

10.1093/genetics/164.3.1129 ◽

2003 ◽

Vol 164 (3) ◽

pp. 1129-1138 ◽

Cited By ~ 17

Author(s):

Nengjun Yi ◽

Varghese George ◽

David B Allison

Keyword(s):

Quantitative Trait Loci ◽

Variable Selection ◽

Quantitative Trait ◽

Complex Traits ◽

Stochastic Search ◽

Model Parameters ◽

Data Set ◽

Stochastic Search Variable Selection ◽

Trait Loci ◽

Search Variable

Abstract: In this article, we utilize stochastic search variable selection methodology to develop a Bayesian method for identifying multiple quantitative trait loci (QTL) for complex traits in experimental designs. The proposed procedure entails embedding multiple regression in a hierarchical normal mixture model, where latent indicators for all markers are used to identify the multiple markers. The markers with significant effects can be identified as those with a higher posterior probability of inclusion in the model. A simple and easy-to-use Gibbs sampler is employed to generate samples from the joint posterior distribution of all unknowns including the latent indicators, genetic effects for all markers, and other model parameters. The proposed method was evaluated using simulated data and illustrated using a real data set. The results demonstrate that the proposed method works well under typical situations of most QTL studies in terms of number of markers and marker density.
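
The general mechanism described in this abstract (a normal mixture prior on each effect, a latent inclusion indicator per marker, and a Gibbs sampler over all unknowns) can be sketched in a few lines. This is a minimal generic SSVS sketch for linear regression, not the authors' implementation: the function name `ssvs_gibbs`, the hyperparameters `tau`, `c`, `p_incl`, `a0`, `b0` and their default values are all assumptions chosen for illustration.

```python
import numpy as np


def ssvs_gibbs(X, y, n_iter=2000, burn_in=500, tau=0.05, c=100.0,
               p_incl=0.5, a0=1.0, b0=1.0, seed=0):
    """Generic SSVS Gibbs sampler for y = X @ beta + noise (illustrative only).

    Assumed prior: beta_j ~ N(0, tau^2) if gamma_j = 0 (spike),
                   beta_j ~ N(0, (c*tau)^2) if gamma_j = 1 (slab),
                   gamma_j ~ Bernoulli(p_incl), sigma^2 ~ InvGamma(a0, b0).
    Returns posterior inclusion probabilities and posterior mean effects.
    """
    rng = np.random.default_rng(seed)
    n, k = X.shape
    XtX = X.T @ X
    Xty = X.T @ y
    gamma = np.ones(k, dtype=int)      # start with every predictor included
    sigma2 = float(np.var(y))
    incl_sum = np.zeros(k)
    beta_sum = np.zeros(k)

    for it in range(n_iter):
        # 1) Effects given indicators and residual variance (multivariate normal).
        prior_var = np.where(gamma == 1, (c * tau) ** 2, tau ** 2)
        cov = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / prior_var))
        cov = (cov + cov.T) / 2.0      # enforce symmetry against round-off
        mean = cov @ Xty / sigma2
        beta = rng.multivariate_normal(mean, cov)

        # 2) Residual variance given effects (inverse gamma).
        resid = y - X @ beta
        sigma2 = 1.0 / rng.gamma(a0 + n / 2.0, 1.0 / (b0 + resid @ resid / 2.0))

        # 3) Indicators given effects: compare slab and spike densities.
        log_slab = np.log(p_incl) - np.log(c * tau) - 0.5 * (beta / (c * tau)) ** 2
        log_spike = np.log1p(-p_incl) - np.log(tau) - 0.5 * (beta / tau) ** 2
        diff = np.clip(log_spike - log_slab, -700.0, 700.0)  # avoid overflow in exp
        prob_incl = 1.0 / (1.0 + np.exp(diff))
        gamma = (rng.random(k) < prob_incl).astype(int)

        if it >= burn_in:
            incl_sum += gamma
            beta_sum += beta

    kept = n_iter - burn_in
    return incl_sum / kept, beta_sum / kept
```

Calling `ssvs_gibbs(X, y)` on a predictor matrix and a centered outcome vector returns one inclusion probability per column; columns with values near 1 correspond to the "higher posterior probability" markers mentioned above. Because every effect stays in the model and only its prior variance switches, the dimension of the sampled parameter vector never changes.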

Stochastic search variable selection in vector error correction models with an application to a model of the UK macroeconomy

Journal of Applied Econometrics ◽

10.1002/jae.1238 ◽

2011 ◽

Vol 28 (1) ◽

pp. 62-81 ◽

Cited By ~ 11

Author(s):

Markus Jochmann ◽

Gary Koop ◽

Roberto Leon-Gonzalez ◽

Rodney W. Strachan

Keyword(s):

Variable Selection ◽

Error Correction ◽

Stochastic Search ◽

Error Correction Models ◽

Vector Error Correction ◽

Stochastic Search Variable Selection ◽

Vector Error ◽

The UK ◽

Vector Error Correction Models ◽

Search Variable

The Dawn of the Age of Multi-Parent MAGIC Populations in Plant Breeding: Novel Powerful Next-Generation Resources for Genetic Analysis and Selection of Recombinant Elite Material

Biology ◽

10.3390/biology9080229 ◽

2020 ◽

Vol 9 (8) ◽

pp. 229 ◽

Cited By ~ 1

Author(s):

Andrea Arrones ◽

Santiago Vilanova ◽

Mariola Plazas ◽

Giulio Mangino ◽

Laura Pascual ◽

...

Keyword(s):

Genetic Analysis ◽

Plant Breeding ◽

Complex Traits ◽

Genetic Recombination ◽

Phenotypic Diversity ◽

Recombinant Inbred Lines ◽

Plant Genetic Resources ◽

Diallel Cross ◽

Genetic Mosaic ◽

Selection Of

The compelling need to increase global agricultural production requires new breeding approaches that facilitate exploiting the diversity available in the plant genetic resources. Multi-parent advanced generation inter-cross (MAGIC) populations are large sets of recombinant inbred lines (RILs) that are a genetic mosaic of multiple founder parents. MAGIC populations display emerging features over experimental bi-parental and germplasm populations in combining significant levels of genetic recombination, a lack of genetic structure, and high genetic and phenotypic diversity. The development of MAGIC populations can be performed using “funnel” or “diallel” cross-designs, for which choosing appropriate parents and defining optimal population sizes are of great relevance. Significant advances in specific software development are facilitating the genetic analysis of the complex genetic constitutions of MAGIC populations. Despite the complexity and the resources required in their development, due to their potential and interest for breeding, the number of MAGIC populations available and under development is continuously growing, with 45 MAGIC populations in different crops being reported here. Though cereals are by far the crop group in which the most MAGIC populations have been developed, MAGIC populations have also started to become available in other crop groups. The results obtained so far demonstrate that MAGIC populations are a very powerful tool for the dissection of complex traits, as well as a resource for the selection of recombinant elite breeding material and cultivars. In addition, some new MAGIC approaches that can make significant contributions to breeding, such as the development of inter-specific MAGIC populations, the development of MAGIC-like populations in crops where pure lines are not available, and the establishment of strategies for the straightforward incorporation of MAGIC materials in breeding pipelines, have barely been explored. The evidence that is already available indicates that MAGIC populations will play a major role in the coming years in allowing for impressive gains in plant breeding for developing new generations of dramatically improved cultivars.

Acceleration of the stochastic search variable selection via componentwise Gibbs sampling

Metrika ◽

10.1007/s00184-016-0604-x ◽

2016 ◽

Vol 80 (3) ◽

pp. 289-308 ◽

Cited By ~ 2

Author(s):

Hengzhen Huang ◽

Shuangshuang Zhou ◽

Min-Qian Liu ◽

Zong-Feng Qi

Keyword(s):

Variable Selection ◽

Gibbs Sampling ◽

Stochastic Search ◽

Stochastic Search Variable Selection ◽

Search Variable

Stochastic search variable selection for log-linear models

Journal of Statistical Computation and Simulation ◽

10.1080/00949650008812054 ◽

2000 ◽

Vol 68 (1) ◽

pp. 23-37 ◽

Cited By ~ 15

Author(s):

Ioannis Ntzoufras ◽

Jonathan J. Forster ◽

Petros Dellaportas

Keyword(s):

Variable Selection ◽

Linear Models ◽

Stochastic Search ◽

Stochastic Search Variable Selection ◽

Selection For ◽

Log Linear ◽

Search Variable

Accuracy of genomic selection using stochastic search variable selection in Australian Holstein Friesian dairy cattle

Genetics Research ◽

10.1017/s0016672309990243 ◽

2009 ◽

Vol 91 (5) ◽

pp. 307-311 ◽

Cited By ~ 93

Author(s):

Klara L. Verbyla ◽

Ben J. Hayes ◽

Philip J. Bowman ◽

Michael E. Goddard

Keyword(s):

Variable Selection ◽

Genomic Selection ◽

Critical Issue ◽

Stochastic Search ◽

Selection Strategy ◽

Genomic Breeding ◽

Breeding Values ◽

Stochastic Search Variable Selection ◽

SNP Data ◽

Search Variable

Summary: Genomic selection describes a selection strategy based on genomic breeding values predicted from dense single nucleotide polymorphism (SNP) data. Multiple methods have been proposed but the critical issue is how to decide whether an SNP should be included in the predictive set to estimate breeding values. One major disadvantage of the traditional Bayes B approach is its high computational demands caused by the changing dimensionality of the models. The use of stochastic search variable selection (SSVS) retains the same assumptions about the distribution of SNP effects as Bayes B, while maintaining constant dimensionality. When Bayesian SSVS was used to predict genomic breeding values for real dairy data over a range of traits, it produced accuracies higher than or equivalent to those of other genomic selection methods, with significantly lower computational and time demands than Bayes B.

BayICE: A hierarchical Bayesian deconvolution model with stochastic search variable selection

10.1101/732743 ◽

2019 ◽

Author(s):

An-Shun Tai ◽

George C. Tseng ◽

Wen-Ping Hsieh

Keyword(s):

Gene Expression ◽

Variable Selection ◽

Immune Cell ◽

Expression Profiles ◽

Gene Expression Profiles ◽

R Package ◽

Stochastic Search ◽

Hierarchical Bayesian ◽

Stochastic Search Variable Selection ◽

Search Variable

Abstract: Gene expression deconvolution is a powerful tool for exploring the microenvironment of complex tissues composed of multiple cell groups using transcriptomic data. Characterizing cell activities for a particular condition has been regarded as a primary mission against diseases. For example, cancer immunology aims to clarify the role of the immune system in the progression and development of cancer through analyzing the immune cell components of tumors. To that end, many deconvolution methods have been proposed for inferring cell subpopulations within tissues. Nevertheless, two problems limit the practicality of current approaches. First, all approaches use external purified data to preselect cell type-specific genes that contribute to deconvolution. However, some types of cells cannot be found in purified profiles and the genes specifically over- or under-expressed in them cannot be identified. This is particularly a problem in cancer studies. Hence, a preselection strategy that is independent from deconvolution is inappropriate. The second problem is that existing approaches do not recover the expression profiles of unknown cells present in bulk tissues, which results in biased estimation of unknown cell proportions. Furthermore, it causes the shift-invariant property of deconvolution to fail, which then affects the estimation performance. To address these two problems, we propose a novel deconvolution approach, BayICE, which employs hierarchical Bayesian modeling with stochastic search variable selection. We develop a comprehensive Markov chain Monte Carlo procedure through Gibbs sampling to estimate cell proportions, gene expression profiles, and signature genes. Simulation and validation studies illustrate that BayICE outperforms existing deconvolution approaches in estimating cell proportions. Subsequently, we demonstrate an application of BayICE in the RNA sequencing of patients with non-small cell lung cancer. The model is implemented in the R package “BayICE” and the algorithm is available for download.

Improving Practices for Selecting a Subset of Important Predictors in Psychology: An Application to Predicting Pain

Advances in Methods and Practices in Psychological Science ◽

10.1177/2515245919885617 ◽

2020 ◽

Vol 3 (1) ◽

pp. 66-80 ◽

Cited By ~ 1

Author(s):

Sierra A. Bainter ◽

Thomas G. McCauley ◽

Tor Wager ◽

Elizabeth A. Reynolds Losin

Keyword(s):

Variable Selection ◽

Multiple Testing ◽

Selection Procedure ◽

Experimental Pain ◽

Bayesian Variable Selection ◽

Large Set ◽

Web Based ◽

Stochastic Search Variable Selection ◽

Variable Selection Procedure ◽

Multivariate Relationships

Frequently, researchers in psychology are faced with the challenge of narrowing down a large set of predictors to a smaller subset. There are a variety of ways to do this, but commonly it is done by choosing predictors with the strongest bivariate correlations with the outcome. However, when predictors are correlated, bivariate relationships may not translate into multivariate relationships. Further, any attempts to control for multiple testing are likely to result in extremely low power. Here we introduce a Bayesian variable-selection procedure frequently used in other disciplines, stochastic search variable selection (SSVS). We apply this technique to choosing the best set of predictors of the perceived unpleasantness of an experimental pain stimulus from among a large group of sociocultural, psychological, and neurobiological (functional MRI) individual-difference measures. Using SSVS provides information about which variables predict the outcome, controlling for uncertainty in the other variables of the model. This approach yields new, useful information to guide the choice of relevant predictors. We have provided Web-based open-source software for performing SSVS and visualizing the results.

Two-Level Stochastic Search Variable Selection in GLMs with Missing Predictors

The International Journal of Biostatistics ◽

10.2202/1557-4679.1173 ◽

2010 ◽

Vol 6 (1) ◽

Cited By ~ 6

Author(s):

Robin Mitra ◽

David Dunson

Keyword(s):

Variable Selection ◽

Stochastic Search ◽

Stochastic Search Variable Selection ◽

Search Variable
