A partial graphical model with a structural prior on the direct links between predictors and responses

Author(s):  
Eunice Okome Obiang ◽  
Pascal Jézéquel ◽  
Frédéric Proïa

This paper is devoted to the estimation of a partial graphical model with a structural Bayesian penalization. More precisely, we are interested in the linear regression setting where the estimation is made through the direct links between potentially high-dimensional predictors and multiple responses, since it is known that Gaussian graphical models exhibit direct links only, whereas coefficients in linear regressions contain both direct and indirect relations (due, e.g., to strong correlations among the variables). A smooth penalty reflecting a generalized Gaussian Bayesian prior on the covariates is added, either enforcing patterns (like row structures) in the direct links or regulating the joint influence of predictors. We give a theoretical guarantee for our method, taking the form of an upper bound on the estimation error that holds with high probability, provided that the model is suitably regularized. Empirical studies on synthetic data and a real dataset are conducted.
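The difference between direct links and regression coefficients can be made concrete with a standard Gaussian identity. The sketch below assumes zero-mean, jointly Gaussian predictors $x$ and responses $y$; the block notation for the precision matrix $\Omega$ is an assumption of this illustration and not necessarily the notation used in the paper.

```latex
\begin{aligned}
\begin{pmatrix} x \\ y \end{pmatrix} &\sim \mathcal{N}\!\left(0,\ \Omega^{-1}\right),
\qquad
\Omega = \begin{pmatrix} \Omega_{xx} & \Omega_{xy} \\ \Omega_{yx} & \Omega_{yy} \end{pmatrix}, \\
y \mid x &\sim \mathcal{N}\!\left(-\Omega_{yy}^{-1}\,\Omega_{yx}\,x,\ \Omega_{yy}^{-1}\right), \\
y &= B^{\top} x + \varepsilon,
\qquad
B = -\Omega_{xy}\,\Omega_{yy}^{-1}.
\end{aligned}
```

In this notation the block $\Omega_{xy}$ carries the direct links: a zero entry means that the corresponding predictor and response are conditionally independent given all other variables. The coefficient matrix $B$ mixes $\Omega_{xy}$ with $\Omega_{yy}^{-1}$, so $B$ can be dense even when $\Omega_{xy}$ is sparse.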


2019 ◽  
Vol 35 (23) ◽  
pp. 5011-5017 ◽  
Author(s):  
Victor Bernal ◽  
Rainer Bischoff ◽  
Victor Guryev ◽  
Marco Grzegorczyk ◽  
Peter Horvatovich

Abstract Motivation: One of the main goals in systems biology is to learn molecular regulatory networks from quantitative profile data. In particular, Gaussian graphical models (GGMs) are widely used network models in bioinformatics where variables (e.g. transcripts, metabolites or proteins) are represented by nodes, and pairs of nodes are connected with an edge according to their partial correlation. Reconstructing a GGM from data is a challenging task when the sample size is smaller than the number of variables. The main problem consists in finding the inverse of the covariance estimator, which is ill-conditioned in this case. Shrinkage-based covariance estimators are a popular approach, producing an invertible ‘shrunk’ covariance. However, a proper significance test for the ‘shrunk’ partial correlation (i.e. the GGM edges) is an open challenge, as a probability density including the shrinkage is unknown. In this article, we present (i) a geometric reformulation of the shrinkage-based GGM, and (ii) a probability density that naturally includes the shrinkage parameter. Results: Our results show that the inference using this new ‘shrunk’ probability density is as accurate as Monte Carlo estimation (an unbiased non-parametric method) for any shrinkage value, while being computationally more efficient. We show on synthetic data how the novel test for significance allows an accurate control of the Type I error and outperforms the network reconstruction obtained by the widely used R package GeneNet. This is further highlighted in two gene expression datasets, one on stress response in Escherichia coli and one on the effect of influenza infection in Mus musculus. Availability and implementation: https://github.com/V-Bernal/GGM-Shrinkage Supplementary information: Supplementary data are available at Bioinformatics online.
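To illustrate why shrinkage makes the inversion feasible when the sample size is smaller than the number of variables, here is a minimal Python sketch. It assumes the common choice of shrinking the sample correlation matrix toward the identity with a fixed intensity `lam`; the function name, the fixed `lam` and the simulated data are assumptions of this example, and it does not reproduce the significance test proposed in the article.

```python
import numpy as np

def shrunk_partial_correlations(X, lam):
    """Partial correlations from a correlation matrix shrunk toward the identity.

    X   : (n, p) data matrix, rows are samples.
    lam : shrinkage intensity in (0, 1]; hypothetical fixed value for illustration.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    R = np.corrcoef(X, rowvar=False)               # p x p sample correlation
    p = R.shape[0]
    R_shrunk = (1.0 - lam) * R + lam * np.eye(p)   # eigenvalues >= lam, so invertible
    P = np.linalg.inv(R_shrunk)                    # 'shrunk' precision matrix
    d = np.sqrt(np.diag(P))
    pcor = -P / np.outer(d, d)                     # rho_ij = -P_ij / sqrt(P_ii * P_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 30))                  # n = 10 samples, p = 30 variables
pcor = shrunk_partial_correlations(X, lam=0.3)
```

With `lam = 0` the sample correlation matrix in this example would be singular (rank at most 9 for 30 variables), which is why the function requires a strictly positive intensity.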



2021 ◽  
Vol 17 (3) ◽  
pp. 1-33
Author(s):  
Beilun Wang ◽  
Jiaqi Zhang ◽  
Yan Zhang ◽  
Meng Wang ◽  
Sen Wang

Recently, the Internet of Things (IoT) has received significant interest due to its rapid development. However, IoT applications still face two challenges: the heterogeneity and the large scale of IoT data. How to efficiently integrate and process these complicated data therefore becomes an essential problem. In this article, we focus on the problem of analyzing variable dependencies in data collected from different edge devices in the IoT network. Because data from different devices are heterogeneous and the variable dependencies can be characterized by a graphical model, we focus on the problem of jointly estimating multiple, high-dimensional, and sparse Gaussian Graphical Models for many related tasks (edge devices). This is an important goal in many fields. Many IoT networks have collected massive multi-task data and require the analysis of heterogeneous data in many scenarios. Past works on joint estimation are non-distributed and involve computationally expensive and complex non-smooth optimizations. To address these problems, we propose a novel approach: Multi-FST. Multi-FST can be efficiently implemented on a cloud-server-based IoT network. The cloud server has a low computational load, and IoT devices communicate asynchronously with the server, which makes the approach efficient. Multi-FST shows significant improvement over baselines when tested on various datasets.



Biometrika ◽  
2021 ◽  
Author(s):  
J Zapata ◽  
S Y Oh ◽  
A Petersen

Abstract The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, making extensions of statistical methods for standard multivariate data to the functional data setting challenging. For example, Gaussian graphical models have recently been extended to the setting of multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, a key difficulty compared to multivariate data is that the covariance operator is compact, and thus not invertible. The methodology in this paper addresses the general problem of covariance modelling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for the covariance operator of multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen–Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in order to provide a well-defined functional Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of identical fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. Empirical performance of the method for graphical model estimation is assessed through simulation and analysis of functional brain connectivity during a motor task.



2017 ◽  
Vol 42 (2) ◽  
Author(s):  
Vilda Purutçuoğlu ◽  
Ezgi Ayyıldız ◽  
Ernst Wit

Abstract Introduction: The Gaussian Graphical Model (GGM) is one of the well-known probabilistic models and is based on the conditional independence of nodes in the biological system. Here, we compare the estimates of the GGM parameters obtained by the graphical lasso (glasso) method and the threshold gradient descent (TGD) algorithm. Methods: We evaluate the performance of both techniques via measures such as specificity, F-measure and AUC (area under the curve). The analyses are conducted by Monte Carlo runs for systems of different dimensions. Results: The results indicate that the TGD algorithm is more accurate than the glasso method on all selected criteria, but it is also more computationally demanding. Discussion and conclusion: Therefore, in high-dimensional systems, we recommend glasso for its computational efficiency in spite of its loss in accuracy, and we believe that the computational cost of the TGD algorithm can be improved by suggesting alternative steps in the inference of the network.
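As a minimal sketch of how two of the measures named above can be computed for an estimated network, the Python snippet below compares a true and an estimated adjacency matrix over the upper triangle (each undirected edge counted once). The function name and the toy matrices are assumptions of this example and are not taken from the article; AUC is left out because it requires edge scores over a range of thresholds rather than a single estimated graph.

```python
import numpy as np

def edge_metrics(true_adj, est_adj):
    """Specificity and F-measure for an estimated undirected graph."""
    true_adj = np.asarray(true_adj, dtype=bool)
    est_adj = np.asarray(est_adj, dtype=bool)
    iu = np.triu_indices(true_adj.shape[0], k=1)   # count each node pair once
    t, e = true_adj[iu], est_adj[iu]
    tp = int(np.sum(t & e))      # true edges that were recovered
    fp = int(np.sum(~t & e))     # estimated edges that do not exist
    tn = int(np.sum(~t & ~e))    # absent edges correctly left out
    fn = int(np.sum(t & ~e))     # true edges that were missed
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    denom = 2 * tp + fp + fn
    f_measure = 2 * tp / denom if denom else float("nan")
    return {"specificity": specificity, "f_measure": f_measure}

# Toy example: true edges (0,1) and (1,2); estimated edges (0,1) and (2,3).
true_adj = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0]]
est_adj = [[0, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 1],
           [0, 0, 1, 0]]
metrics = edge_metrics(true_adj, est_adj)
# metrics == {"specificity": 0.75, "f_measure": 0.5}
```

In the toy example one edge is recovered, one is missed and one is spurious, so three of the four absent edges are correctly excluded (specificity 0.75) and the F-measure is 2·1 / (2·1 + 1 + 1) = 0.5.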



2020 ◽  
Author(s):  
Donald Ray Williams

Studying complex relations in multivariate datasets is a common task across the sciences. Recently, the Gaussian graphical model has emerged as an increasingly popular model for characterizing the conditional dependence structure of random variables. Although the graphical lasso ($\ell_1$-regularization) is the most well-known estimator, it has several drawbacks that make it less than ideal for model selection. There are now alternative forms of regularization that were developed specifically to overcome issues inherent to the $\ell_1$-penalty. To date, however, these alternatives have been slow to work their way into software for research workers. To address this dearth of software, I developed the package GGMncv that includes a variety of nonconvex penalties, two algorithms for their estimation, plotting capabilities, and an approach for making statistical inference. As an added bonus, GGMncv can be used for nonconvex penalized least squares. After describing the various nonconvex penalties, the functionality of GGMncv is demonstrated through examples using a dataset from personality psychology.



Biometrics ◽  
2019 ◽  
Vol 75 (4) ◽  
pp. 1288-1298
Author(s):  
Gwenaël G. R. Leday ◽  
Sylvia Richardson


Author(s):  
Cody Mazza-Anthony ◽  
Bogdan Mazoure ◽  
Mark Coates


Biometrika ◽  
2020 ◽  
Author(s):  
S Na ◽  
M Kolar ◽  
O Koyejo

Abstract Differential graphical models are designed to represent the difference between the conditional dependence structures of two groups, and are thus of particular interest for scientific investigation. Motivated by modern applications, this manuscript considers an extended setting where each group is generated by a latent variable Gaussian graphical model. Due to the existence of latent factors, the differential network is decomposed into sparse and low-rank components, both of which are symmetric indefinite matrices. We estimate these two components simultaneously using a two-stage procedure: (i) an initialization stage, which computes a simple, consistent estimator, and (ii) a convergence stage, implemented using a projected alternating gradient descent algorithm applied to a nonconvex objective, initialized using the output of the first stage. We prove that, given the initialization, the estimator converges linearly with a nontrivial, minimax optimal statistical error. Experiments on synthetic and real data illustrate that the proposed nonconvex procedure outperforms existing methods.



2013 ◽  
Vol 14 (2) ◽  
pp. 107-117
Author(s):  
Yi-zhou He ◽  
Xi Chen ◽  
Hao Wang

