How to improve parameter estimates in GLM-based fMRI data analysis: cross-validated Bayesian model averaging

2016 ◽  
Author(s):  
Joram Soch ◽  
Achim Pascal Meyer ◽  
John-Dylan Haynes ◽  
Carsten Allefeld

Abstract
In functional magnetic resonance imaging (fMRI), the quality of general linear models (GLMs) used for first-level analysis is rarely assessed. In recent work (Soch et al., 2016: "How to avoid mismodelling in GLM-based fMRI data analysis: cross-validated Bayesian model selection", NeuroImage, vol. 141, pp. 469-489; DOI: 10.1016/j.neuroimage.2016.07.047), we introduced cross-validated Bayesian model selection (cvBMS) to infer the best model for a group of subjects and to use it to guide second-level analysis. While this is the optimal approach when the same GLM has to be used for all subjects, a much more efficient procedure exists when model selection only concerns nuisance variables and the regressors of interest are included in all candidate models. In this work, we propose cross-validated Bayesian model averaging (cvBMA) to improve parameter estimates for these regressors of interest by combining information from all models, weighted by their posterior probabilities. This is particularly useful because different models can lead to different conclusions regarding experimental effects, and the most complex model is not necessarily the best choice. We find that cvBMS can prevent failures to detect established effects and that cvBMA can be more sensitive to experimental effects than using even the best model in each subject or the model that is best across a group of subjects.
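Conceptually, the averaging step reduces to a posterior-probability-weighted sum of per-model parameter estimates. A minimal numerical sketch (illustrative names and values, not the authors' implementation; a uniform model prior is assumed so that posterior model probabilities are a softmax of the log model evidences):

```python
import numpy as np

def bma_estimates(betas, log_evidences):
    """Average parameter estimates across models, weighted by posterior
    model probabilities derived from log model evidences."""
    betas = np.asarray(betas, dtype=float)        # shape: (n_models, n_params)
    lme = np.asarray(log_evidences, dtype=float)  # shape: (n_models,)
    # Softmax of log evidences = posterior model probabilities (uniform prior).
    w = np.exp(lme - lme.max())
    w /= w.sum()
    return w @ betas                              # shape: (n_params,)

# Two candidate models sharing one regressor of interest (first column):
betas = [[1.2, 0.0], [0.8, 0.3]]
print(bma_estimates(betas, [0.0, 0.0]))  # equal evidence -> simple mean [1.0, 0.15]
```

With strongly unequal evidences the weights concentrate on one model and the averaged estimate approaches that model's estimate, which is how cvBMA interpolates between model averaging and model selection.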

NeuroImage ◽  
2017 ◽  
Vol 158 ◽  
pp. 186-195 ◽  
Author(s):  
Joram Soch ◽  
Achim Pascal Meyer ◽  
John-Dylan Haynes ◽  
Carsten Allefeld

Author(s):  
Eduardo A. Aponte ◽  
Yu Yao ◽  
Sudhir Raman ◽  
Stefan Frässle ◽  
Jakob Heinzle ◽  
...  

Abstract
In generative modeling of neuroimaging data, such as dynamic causal modeling (DCM), one typically considers several alternative models, either to determine the most plausible explanation for observed data (Bayesian model selection) or to account for model uncertainty (Bayesian model averaging). Both procedures rest on estimates of the model evidence, a principled trade-off between model accuracy and complexity. In the context of DCM, the log evidence is usually approximated using variational Bayes. Although this approach is highly efficient, it makes distributional assumptions and is vulnerable to local extrema. This paper introduces the use of thermodynamic integration (TI) for Bayesian model selection and averaging in the context of DCM. TI is based on Markov chain Monte Carlo sampling, which is asymptotically exact but orders of magnitude slower than variational Bayes. We explain the theoretical foundations of TI, starting from its historical origin in statistical physics and covering key concepts such as the free energy. In addition, we demonstrate the practical application of TI via a series of examples that serve to guide the user in applying the method. These examples also demonstrate that, given an efficient implementation and hardware capable of parallel processing, the challenge of high computational demand can be overcome. The TI implementation presented in this paper is freely available as part of the open-source software TAPAS.
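The TI identity states that the log evidence equals the integral, over an inverse temperature β running from 0 (prior) to 1 (posterior), of the expected log-likelihood under the power posterior p_β(θ) ∝ p(y|θ)^β p(θ). A minimal sketch on a conjugate toy model, where a dense grid stands in for MCMC sampling and the result can be checked against the analytic marginal likelihood (this is an illustration of the identity, not the TAPAS implementation):

```python
import numpy as np

# Toy conjugate model: theta ~ N(0, 1) prior, y | theta ~ N(theta, 1), observed y = 1.5.
y = 1.5
theta = np.linspace(-10, 10, 20001)           # dense grid stands in for MCMC samples
log_prior = -0.5 * np.log(2 * np.pi) - 0.5 * theta**2
log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta)**2

betas = np.linspace(0, 1, 101)                # temperature schedule
expected_ll = []
for b in betas:
    log_pb = b * log_lik + log_prior          # unnormalized power posterior
    w = np.exp(log_pb - log_pb.max())
    w /= w.sum()
    expected_ll.append(np.sum(w * log_lik))   # E_beta[log p(y|theta)]

# Trapezoidal integration over the temperature path gives the log evidence.
ell = np.array(expected_ll)
log_z_ti = float(np.sum((betas[1:] - betas[:-1]) * (ell[1:] + ell[:-1]) / 2))

# Analytic check: marginally, y ~ N(0, 2).
log_z_exact = -0.5 * np.log(2 * np.pi * 2) - y**2 / 4
print(log_z_ti, log_z_exact)                  # the two values should agree closely
```

In real DCM applications the per-temperature expectations come from parallel MCMC chains rather than a grid, which is where the parallel-hardware requirement mentioned above arises.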


2020 ◽  
Vol 3 (2) ◽  
pp. 200-215
Author(s):  
Max Hinne ◽  
Quentin F. Gronau ◽  
Don van den Bergh ◽  
Eric-Jan Wagenmakers

Many statistical scenarios initially involve several candidate models that describe the data-generating process. Analysis often proceeds by first selecting the best model according to some criterion and then learning about the parameters of this selected model. Crucially, however, in this approach the parameter estimates are conditioned on the selected model, and any uncertainty about the model-selection process is ignored. An alternative is to learn the parameters for all candidate models and then combine the estimates according to the posterior probabilities of the associated models. This approach is known as Bayesian model averaging (BMA). BMA has several important advantages over all-or-none selection methods, but has been used only sparingly in the social sciences. In this conceptual introduction, we explain the principles of BMA, describe its advantages over all-or-none model selection, and showcase its utility in three examples: analysis of covariance, meta-analysis, and network analysis.
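To make the averaging step concrete, posterior model probabilities can be approximated by BIC weights and used to combine per-model estimates of a shared parameter. A minimal sketch with simulated data (all names and values are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
z = rng.normal(size=n)                        # a covariate of uncertain relevance
y = 0.5 * x + 0.1 * z + rng.normal(size=n)

def fit_ols(X, y):
    """OLS fit; returns coefficients and BIC under a Gaussian likelihood."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / len(y)
    loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
    bic = X.shape[1] * np.log(len(y)) - 2 * loglik
    return beta, bic

# Model 1: y ~ x; Model 2: y ~ x + z. Both contain the effect of interest (x).
X1 = np.column_stack([np.ones(n), x])
X2 = np.column_stack([np.ones(n), x, z])
(b1, bic1), (b2, bic2) = fit_ols(X1, y), fit_ols(X2, y)

# BIC weights approximate posterior model probabilities (uniform model prior).
w = np.exp(-0.5 * (np.array([bic1, bic2]) - min(bic1, bic2)))
w /= w.sum()
beta_x_bma = w[0] * b1[1] + w[1] * b2[1]      # model-averaged slope for x
print(dict(weights=w.round(3), slope=round(beta_x_bma, 3)))
```

Unlike all-or-none selection, the averaged slope carries over the uncertainty about whether z belongs in the model, instead of conditioning on a single winner.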


2016 ◽  
Vol 27 (1) ◽  
pp. 250-268 ◽  
Author(s):  
Rachel Carroll ◽  
Andrew B Lawson ◽  
Christel Faes ◽  
Russell S Kirby ◽  
Mehreteab Aregay ◽  
...  

In disease mapping where predictor effects are to be modeled, it is often the case that sets of predictors are fixed and the aim is to choose between these fixed model sets. Bayesian model selection (BMS) and Bayesian model averaging (BMA) are approaches within the Bayesian paradigm for achieving this aim. In the spatial context, model selection can have a spatial component, in the sense that some models may be more appropriate for certain areas of a study region than others. In this work, we examine the use of spatially referenced BMA and BMS via a large-scale simulation study accompanied by a small-scale case study. Our results suggest that BMS performs well when a strong regression signature is found.




2019 ◽  
Author(s):  
Max Hinne ◽  
Quentin Frederik Gronau ◽  
Don van den Bergh ◽  
Eric-Jan Wagenmakers

Many statistical scenarios initially involve several candidate models that describe the data-generating process. Analysis often proceeds by first selecting the best model according to some criterion, and then learning about the parameters of this selected model. Crucially, however, in this approach the parameter estimates are conditioned on the selected model, and any uncertainty about the model-selection process is ignored. An alternative is to learn the parameters for all candidate models, and then combine the estimates according to the posterior probabilities of the associated models. The result is known as Bayesian model averaging (BMA). BMA has several important advantages over all-or-none selection methods, but has been used only sparingly in the social sciences. In this conceptual introduction we explain the principles of BMA, describe its advantages over all-or-none model selection, and showcase its utility for three examples: ANCOVA, meta-analysis, and network analysis.




2021 ◽  
Author(s):  
Carlos R Oliveira ◽  
Eugene D Shapiro ◽  
Daniel M Weinberger

Vaccine effectiveness (VE) studies are often conducted after the introduction of new vaccines to ensure they provide protection in real-world settings. Although susceptible to confounding, the test-negative case-control design is the most efficient method to assess VE post-licensure. Control of confounding is often needed during analysis, which is most efficiently done through multivariable modeling. When a large number of potential confounders are being considered, it can be challenging to know which variables need to be included in the final model. This paper highlights the importance of considering model uncertainty by re-analyzing a Lyme VE study using several confounder-selection methods. We propose an intuitive Bayesian model averaging (BMA) framework for this task and compare the performance of BMA to that of traditional single-best-model selection methods. We demonstrate how BMA can be advantageous in situations in which there is uncertainty about model selection, by systematically considering alternative models and increasing transparency.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Kwangbom Choi ◽  
Yang Chen ◽  
Daniel A. Skelly ◽  
Gary A. Churchill

Abstract
Background: Single-cell RNA sequencing is a powerful tool for characterizing cellular heterogeneity in gene expression. However, high variability and a large number of zero counts present challenges for analysis and interpretation. There is substantial controversy over the origins and proper treatment of zeros and no consensus on whether zero-inflated count distributions are necessary or even useful. While some studies assume the existence of zero inflation due to technical artifacts and attempt to impute the missing information, other recent studies argue that there is no zero inflation in scRNA-seq data.
Results: We apply a Bayesian model selection approach to unambiguously demonstrate zero inflation in multiple biologically realistic scRNA-seq datasets. We show that the primary causes of zero inflation are not technical but rather biological in nature. We also demonstrate that parameter estimates from the zero-inflated negative binomial distribution are an unreliable indicator of zero inflation.
Conclusions: Despite the existence of zero inflation in scRNA-seq counts, we recommend the generalized linear model with negative binomial count distribution, not zero-inflated, as a suitable reference model for scRNA-seq analysis.
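The kind of comparison described here can be sketched by fitting both count distributions by maximum likelihood and comparing an information criterion. A toy illustration with simulated data (a BIC-based stand-in for the paper's Bayesian model selection; not the authors' pipeline or datasets):

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(1)
# Simulate zero-inflated negative binomial counts: with probability pi a
# structural zero, otherwise a draw from NB(r, p).
n, pi_true, r_true, p_true = 2000, 0.3, 2.0, 0.3
counts = stats.nbinom.rvs(r_true, p_true, size=n, random_state=rng)
counts[rng.random(n) < pi_true] = 0

def nll_nb(params):
    """Negative log-likelihood of a plain negative binomial model."""
    r, p = params
    return -stats.nbinom.logpmf(counts, r, p).sum()

def nll_zinb(params):
    """Negative log-likelihood of a zero-inflated negative binomial model."""
    r, p, pi = params
    lik = (1 - pi) * stats.nbinom.pmf(counts, r, p) + pi * (counts == 0)
    return -np.log(lik).sum()

fit_nb = optimize.minimize(nll_nb, x0=[1.0, 0.5],
                           bounds=[(1e-3, None), (1e-6, 1 - 1e-6)])
fit_zinb = optimize.minimize(nll_zinb, x0=[1.0, 0.5, 0.1],
                             bounds=[(1e-3, None), (1e-6, 1 - 1e-6), (1e-6, 1 - 1e-6)])

bic_nb = 2 * np.log(n) + 2 * fit_nb.fun       # 2 free parameters
bic_zinb = 3 * np.log(n) + 2 * fit_zinb.fun   # 3 free parameters
print({'BIC NB': round(bic_nb, 1), 'BIC ZINB': round(bic_zinb, 1)})
# On data with genuine zero inflation, the ZINB model should attain the lower BIC.
```

The paper's caution applies here too: a ZINB fit can look adequate even without zero inflation, which is why its parameter estimates alone are an unreliable indicator and a formal model comparison is needed.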

