Bayesian Uncertainty Estimation for Gaussian Graphical Models and Centrality Indices

2021 ◽  
Author(s):  
Joran Jongerling ◽  
Sacha Epskamp ◽  
Donald Ray Williams

Gaussian Graphical Models (GGMs) are often estimated using regularized estimation and the graphical LASSO (GLASSO). However, the GLASSO has difficulty estimating (uncertainty in) centrality indices of nodes. Regularized Bayesian estimation might provide a solution, as it is better suited to deal with bias in the sampling distribution of centrality indices. This study therefore compares estimation of GGMs with a Bayesian GLASSO- and a Horseshoe prior to estimation using the frequentist GLASSO in an extensive simulation study. Results showed that out of the two Bayesian estimation methods, the Bayesian GLASSO performed best. In addition, the Bayesian GLASSO performed better than the frequentist GLASSO with respect to bias in edge weights, centrality measures, correlation between estimated and true partial correlations, and specificity. With respect to sensitivity the frequentist GLASSO performs better. However, sensitivity of the Bayesian GLASSO is close to that of the frequentist GLASSO (except for the smallest N used in the simulations) and tends to be favored over the frequentist GLASSO in terms of F1. With respect to uncertainty in the centrality measures, the Bayesian GLASSO shows good coverage for strength and closeness centrality. Uncertainty in betweenness centrality is estimated less well, and typically overestimated by the Bayesian GLASSO.
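To make "uncertainty in centrality indices" concrete, the following is a minimal sketch, not the authors' implementation. It assumes posterior draws of the partial correlation matrix are already available as an array of shape (draws, p, p); the function names `strength` and `strength_interval` are hypothetical. Strength centrality is computed per draw as the sum of absolute edge weights, and a percentile interval is taken across draws.

```python
import numpy as np

def strength(pcor):
    """Strength centrality: sum of absolute edge weights attached to each node."""
    weights = np.abs(np.asarray(pcor, dtype=float))
    np.fill_diagonal(weights, 0.0)  # a node has no edge to itself
    return weights.sum(axis=0)

def strength_interval(draws, level=0.95):
    """Posterior mean and percentile interval of strength, one value per node.

    `draws` is assumed to hold posterior samples of the partial correlation
    matrix with shape (n_draws, p, p); producing them is outside this sketch.
    """
    per_draw = np.array([strength(d) for d in draws])
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(per_draw, [alpha, 1.0 - alpha], axis=0)
    return per_draw.mean(axis=0), lower, upper
```

Coverage, as used in the abstract, would then be the proportion of simulated data sets in which such an interval contains the true strength value.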

2020 ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites or proteins) inter-connected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (‘high dimensional problem’). Shrinkage methods address this issue by learning a regularized GGM. However, it is an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlation in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as the ‘un-shrunk’ partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. We apply the ‘un-shrunk’ method to two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the “high-dimensional” problem. Besides its advantages, we have identified that the shrinkage introduces a non-linear bias in the partial correlations. Ignoring this type of effect caused by the shrinkage can obscure the interpretation of the network, and impede the validation of earlier reported results.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Peter Horvatovich ◽  
Victor Guryev ◽  
Marco Grzegorczyk

Abstract Background: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites or proteins) inter-connected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes (‘high dimensional problem’). Shrinkage methods address this issue by learning a regularized GGM. However, it remains an open question how the shrinkage affects the final result and its interpretation. Results: We show that the shrinkage biases the partial correlation in a non-linear way. This bias does not only change the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as ‘un-shrinking’ the partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. This is demonstrated on two gene expression datasets from Escherichia coli and Mus musculus. Conclusions: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the ‘high-dimensional problem’. Besides its advantages, we have identified that the shrinkage introduces a non-linear bias in the partial correlations. Ignoring this type of effect caused by the shrinkage can obscure the interpretation of the network, and impede the validation of earlier reported results.
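The dependence of partial correlations on the shrinkage value can be explored with a small sketch. It assumes linear shrinkage of the correlation matrix toward the identity, R(λ) = (1 − λ)R + λI, which the abstract does not specify; the function name `shrunk_partial_correlations` is hypothetical, and this is not the ‘un-shrinking’ correction itself.

```python
import numpy as np

def shrunk_partial_correlations(corr, lam):
    """Partial correlations implied by a correlation matrix shrunk toward the identity.

    Assumption: linear shrinkage (1 - lam) * R + lam * I with 0 <= lam <= 1.
    """
    corr = np.asarray(corr, dtype=float)
    p = corr.shape[0]
    shrunk = (1.0 - lam) * corr + lam * np.eye(p)
    precision = np.linalg.inv(shrunk)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)  # standardize and reverse the sign
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

Evaluating the function over a grid of `lam` values for a fixed `corr` shows how each partial correlation moves as the shrinkage changes; because the matrix inverse is involved, the path is not a simple rescaling.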


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Ginette Lafit ◽  
Francis Tuerlinckx ◽  
Inez Myin-Germeys ◽  
Eva Ceulemans

Abstract Gaussian Graphical Models (GGMs) are extensively used in many research areas, such as genomics, proteomics, neuroimaging, and psychology, to study the partial correlation structure of a set of variables. This structure is visualized by drawing an undirected network, in which the variables constitute the nodes and the partial correlations the edges. In many applications, it makes sense to impose sparsity (i.e., some of the partial correlations are forced to zero) as sparsity is theoretically meaningful and/or because it improves the predictive accuracy of the fitted model. However, as we will show by means of extensive simulations, state-of-the-art estimation approaches for imposing sparsity on GGMs, such as the Graphical lasso, ℓ1 regularized nodewise regression, and joint sparse regression, fall short because they often yield too many false positives (i.e., partial correlations that are not properly set to zero). In this paper we present a new estimation approach that allows better control of the false positive rate. Our approach consists of two steps: First, we estimate an undirected network using one of the three state-of-the-art estimation approaches. Second, we try to detect the false positives by flagging the partial correlations that are smaller in absolute value than a given threshold, which is determined through cross-validation; the flagged correlations are set to zero. Applying this new approach to the same simulated data shows that it indeed performs better. We also illustrate our approach by using it to estimate (1) a gene regulatory network for breast cancer data, (2) a symptom network of patients with a diagnosis within the nonaffective psychotic spectrum and (3) a symptom network of patients with PTSD.
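The second step described above, setting small partial correlations to zero, can be sketched as follows. This is an illustration only: the function name `threshold_network` is hypothetical, the input is assumed to be a partial correlation matrix from any first-step estimator, and the threshold is passed in by hand rather than chosen by cross-validation as in the paper.

```python
import numpy as np

def threshold_network(pcor, threshold):
    """Set partial correlations with absolute value below `threshold` to zero.

    The diagonal is left untouched. In the described approach the threshold
    would come from cross-validation, which this sketch does not implement.
    """
    pcor = np.asarray(pcor, dtype=float)
    sparse = np.where(np.abs(pcor) < threshold, 0.0, pcor)
    np.fill_diagonal(sparse, np.diag(pcor))
    return sparse

estimated = np.array([[1.00, 0.05, 0.30],
                      [0.05, 1.00, -0.02],
                      [0.30, -0.02, 1.00]])
pruned = threshold_network(estimated, threshold=0.1)
```

With the hand-picked threshold of 0.1, only the 0.30 edge between the first and third variable survives in `pruned`.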


2020 ◽  
Author(s):  
Josue E. Rodriguez ◽  
Donald Ray Williams ◽  
Philippe Rast ◽  
Joris Mulder

Network theory has emerged as a popular framework for conceptualizing psychological constructs and mental disorders. Initially, network analysis was motivated in part by the thought that it can be used for hypothesis generation. Although the customary approach for network modeling is inherently exploratory, we argue that there is untapped potential for confirmatory hypothesis testing. In this work, we bring to fruition the potential of Gaussian graphical models for generating testable hypotheses. This is accomplished by merging exploratory and confirmatory analyses into a cohesive framework built around Bayesian hypothesis testing of partial correlations. We first present a motivating example based on a customary, exploratory analysis, where it is made clear how information encoded by the conditional (in)dependence structure can be used to formulate hypotheses. Building upon this foundation, we then provide several empirical examples that unify exploratory and confirmatory testing in psychopathology symptom networks. In particular, we (1) estimate exploratory graphs; (2) derive hypotheses based on the most central structures; and (3) test those hypotheses in a confirmatory setting. Our confirmatory results uncovered an intricate web of relations, including an order to edge weights within comorbidity networks. This illuminates the rich and informative inferences that can be drawn with the proposed approach. We conclude with recommendations for applied researchers, in addition to discussing how our methodology answers recent calls to begin developing formal models related to the conditional (in)dependence structure of psychological networks.


2018 ◽  
Author(s):  
Donald Ray Williams ◽  
Philippe Rast

Gaussian graphical models are an increasingly popular technique in psychology to characterize relationships among observed variables. These relationships are represented as covariances in the precision matrix. Standardizing this covariance matrix and reversing the sign yields corresponding partial correlations that imply pairwise dependencies in which the effects of all other variables have been controlled for. In order to estimate the precision matrix, the graphical lasso (glasso) has emerged as the default estimation method, which uses ℓ1-based regularization. Glasso was developed and optimized for high dimensional settings where the number of variables (p) exceeds the number of observations (n), which are uncommon in psychological applications. Here we propose to go “back to the basics”, wherein the precision matrix is first estimated with non-regularized maximum likelihood and then Fisher Z-transformed confidence intervals are used to determine non-zero relationships. We first show the exact correspondence between the confidence level and specificity, which is due to 1 - specificity denoting the false positive rate (i.e., alpha). With simulations in low-dimensional settings (p << n), we then demonstrate superior performance compared to glasso for determining conditional relationships, in addition to frequentist risk measured with various loss functions. Further, our results indicate that glasso is inconsistent for the purpose of model selection, whereas the proposed method converged on the true model with a probability that approached 100%. We end by discussing implications for estimating Gaussian graphical models in psychology.
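The two ingredients named in this abstract, standardizing the precision matrix with a sign reversal and selecting edges with Fisher Z intervals, can be sketched as below. This is a rough illustration under stated assumptions, not the authors' code: the function names are hypothetical, the data are assumed to have more observations than variables (n > p + 1), and the standard error 1/sqrt(n − (p − 2) − 3) is the usual large-sample value for a partial correlation that conditions on p − 2 variables.

```python
import numpy as np

def partial_correlations(precision):
    """Standardize a precision matrix and reverse the sign of the off-diagonals."""
    precision = np.asarray(precision, dtype=float)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def fisher_z_edges(data, z_crit=1.96):
    """Non-regularized estimate plus Fisher Z intervals (z_crit=1.96 for 95%).

    Returns the partial correlation matrix and a boolean adjacency matrix
    marking edges whose interval excludes zero.
    """
    n, p = data.shape
    # The scale of the covariance estimate cancels in the standardization,
    # so the sample covariance gives the same partial correlations as the MLE.
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    pcor = partial_correlations(precision)
    off_diag = ~np.eye(p, dtype=bool)
    z = np.zeros((p, p))
    z[off_diag] = np.arctanh(pcor[off_diag])  # Fisher Z transform
    se = 1.0 / np.sqrt(n - (p - 2) - 3)
    lower, upper = np.tanh(z - z_crit * se), np.tanh(z + z_crit * se)
    adjacency = ((lower > 0) | (upper < 0)) & off_diag
    return pcor, adjacency
```

Raising `z_crit` widens the intervals and so trades sensitivity for specificity, which mirrors the correspondence between confidence level and false positive rate noted in the abstract.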

