Doubly functional graphical models in high dimensions

Biometrika ◽  
2020 ◽  
Vol 107 (2) ◽  
pp. 415-431
Author(s):  
Xinghao Qiao ◽  
Cheng Qian ◽  
Gareth M James ◽  
Shaojun Guo

Summary We consider estimating a functional graphical model from multivariate functional observations. In functional data analysis, the classical assumption is that each function has been measured over a densely sampled grid. However, in practice the functions have often been observed, with measurement error, at a relatively small number of points. We propose a class of doubly functional graphical models to capture the evolving conditional dependence relationship among a large number of sparsely or densely sampled functions. Our approach first implements a nonparametric smoother to perform functional principal components analysis for each curve, then estimates a functional covariance matrix and finally computes sparse precision matrices, which in turn provide the doubly functional graphical model. We derive some novel concentration bounds, uniform convergence rates and model selection properties of our estimator for both sparsely and densely sampled functional data in the high-dimensional large-$p$, small-$n$ regime. We demonstrate via simulations that the proposed method significantly outperforms possible competitors. Our proposed method is applied to a brain imaging dataset.
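The last step of the pipeline described above, reading a graph off a sparse precision matrix, can be sketched as follows. This is a minimal illustration, not the authors' estimator: it assumes each of `p` functions is summarized by `m` principal component scores at one time point, and that two functions are joined by an edge when the corresponding `m`-by-`m` block of the precision matrix is nonzero. The block-norm rule, the function names and the matrix `theta_hat` are all assumptions made for this example.

```python
import math


def block_frobenius_norm(theta, j, l, m):
    """Frobenius norm of the m-by-m block of theta linking nodes j and l."""
    total = 0.0
    for a in range(m):
        for b in range(m):
            total += theta[j * m + a][l * m + b] ** 2
    return math.sqrt(total)


def edges_from_precision(theta, p, m, tol=1e-8):
    """Return the pairs (j, l), j < l, whose precision block is nonzero."""
    return [
        (j, l)
        for j in range(p)
        for l in range(j + 1, p)
        if block_frobenius_norm(theta, j, l, m) > tol
    ]


# Hypothetical estimated precision matrix: p = 3 functions, m = 2 scores each.
theta_hat = [
    [2.0, 0.0, 0.5, 0.1, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.4, 0.0, 0.0],
    [0.5, 0.0, 2.0, 0.0, 0.0, 0.3],
    [0.1, 0.4, 0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.3, 0.0, 0.0, 2.0],
]

edges = edges_from_precision(theta_hat, p=3, m=2)
print(edges)  # functions 0-1 and 1-2 are linked; 0-2 are not
```

In a doubly functional setting the same rule would presumably be applied to a separate precision matrix at each time point, so that the edge set is allowed to evolve.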

Biometrika ◽  
2020 ◽  
Author(s):  
S Na ◽  
M Kolar ◽  
O Koyejo

Abstract Differential graphical models are designed to represent the difference between the conditional dependence structures of two groups, and are thus of particular interest for scientific investigation. Motivated by modern applications, this manuscript considers an extended setting where each group is generated by a latent variable Gaussian graphical model. Due to the existence of latent factors, the differential network is decomposed into sparse and low-rank components, both of which are symmetric indefinite matrices. We estimate these two components simultaneously using a two-stage procedure: (i) an initialization stage, which computes a simple, consistent estimator, and (ii) a convergence stage, implemented using a projected alternating gradient descent algorithm applied to a nonconvex objective, initialized using the output of the first stage. We prove that given the initialization, the estimator converges linearly with a nontrivial, minimax optimal statistical error. Experiments on synthetic and real data illustrate that the proposed nonconvex procedure outperforms existing methods.


Biometrika ◽  
2021 ◽  
Author(s):  
J Zapata ◽  
S Y Oh ◽  
A Petersen

Abstract The covariance structure of multivariate functional data can be highly complex, especially if the multivariate dimension is large, making extensions of statistical methods for standard multivariate data to the functional data setting challenging. For example, Gaussian graphical models have recently been extended to the setting of multivariate functional data by applying multivariate methods to the coefficients of truncated basis expansions. However, a key difficulty compared to multivariate data is that the covariance operator is compact, and thus not invertible. The methodology in this paper addresses the general problem of covariance modelling for multivariate functional data, and functional Gaussian graphical models in particular. As a first step, a new notion of separability for the covariance operator of multivariate functional data is proposed, termed partial separability, leading to a novel Karhunen–Loève-type expansion for such data. Next, the partial separability structure is shown to be particularly useful in order to provide a well-defined functional Gaussian graphical model that can be identified with a sequence of finite-dimensional graphical models, each of identical fixed dimension. This motivates a simple and efficient estimation procedure through application of the joint graphical lasso. Empirical performance of the method for graphical model estimation is assessed through simulation and analysis of functional brain connectivity during a motor task.


2020 ◽  
Author(s):  
Donald Ray Williams

Studying complex relations in multivariate datasets is a common task across the sciences. Recently, the Gaussian graphical model has emerged as an increasingly popular model for characterizing the conditional dependence structure of random variables. Although the graphical lasso ($\ell_1$-regularization) is the most well-known estimator, it has several drawbacks that make it less than ideal for model selection. There are now alternative forms of regularization that were developed specifically to overcome issues inherent to the $\ell_1$-penalty. To date, however, these alternatives have been slow to work their way into software for research workers. To address this dearth of software, I developed the package GGMncv that includes a variety of nonconvex penalties, two algorithms for their estimation, plotting capabilities, and an approach for making statistical inference. As an added bonus, GGMncv can be used for nonconvex penalized least squares. After describing the various nonconvex penalties, the functionality of GGMncv is demonstrated through examples using a dataset from personality psychology.
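To make the contrast between the $\ell_1$-penalty and nonconvex alternatives concrete, the sketch below evaluates two widely used nonconvex penalties, SCAD and MCP, next to the lasso penalty. It is written in Python for illustration and does not call the R package; the abstract does not say which penalties the package provides, so the choice of SCAD and MCP, the function names and the default tuning constants are assumptions.

```python
def lasso_penalty(x, lam):
    """l1 penalty: grows linearly without bound, so large effects are shrunk too."""
    return lam * abs(x)


def scad_penalty(x, lam, a=3.7):
    """SCAD penalty: matches the lasso near zero, then flattens out."""
    ax = abs(x)
    if ax <= lam:
        return lam * ax
    if ax <= a * lam:
        return (2 * a * lam * ax - ax ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2


def mcp_penalty(x, lam, gamma=3.0):
    """Minimax concave penalty: tapers off and is constant beyond gamma * lam."""
    ax = abs(x)
    if ax <= gamma * lam:
        return lam * ax - ax ** 2 / (2 * gamma)
    return gamma * lam ** 2 / 2


for value in (0.5, 2.0, 10.0):
    print(value, lasso_penalty(value, 1.0), scad_penalty(value, 1.0), mcp_penalty(value, 1.0))
```

Both nonconvex penalties become constant for large arguments, so large partial correlations are no longer penalized more heavily as they grow, which is the usual motivation for preferring them to the lasso in model selection.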


Author(s):  
Zachary D. Kurtz ◽  
Richard Bonneau ◽  
Christian L. Müller

Abstract Detecting community-wide statistical relationships from targeted amplicon-based and metagenomic profiling of microbes in their natural environment is an important step toward understanding the organization and function of these communities. We present a robust and computationally tractable latent graphical model inference scheme that allows simultaneous identification of parsimonious statistical relationships among microbial species and unobserved factors that influence the prevalence and variability of the abundance measurements. Our method comes with theoretical performance guarantees and is available within the SParse InversE Covariance estimation for Ecological ASsociation Inference (SPIEC-EASI) framework (‘SpiecEasi’ R-package). Using simulations, as well as a comprehensive collection of amplicon-based gut microbiome datasets, we illustrate the method’s ability to jointly identify compositional biases, latent factors that correlate with observed technical covariates, and robust statistical microbial associations that replicate across different gut microbial data sets.
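The reason unobserved factors matter for graph estimation can be seen in a small Gaussian example. For jointly Gaussian variables, the precision matrix of the observed variables alone is the Schur complement of the full precision matrix: a sparse observed block minus a low-rank term contributed by the hidden variables. The sketch below is a toy calculation with one hidden factor, not the SpiecEasi implementation; the function name and all numbers are made up for illustration.

```python
def marginal_precision(k_oo, k_oh, k_hh):
    """Precision of the observed variables after marginalizing one hidden variable.

    k_oo: observed-observed block (list of lists).
    k_oh: observed-hidden column (list).
    k_hh: hidden-hidden entry (scalar).
    Returns (k_oo - k_oh k_oh^T / k_hh, the rank-one term k_oh k_oh^T / k_hh).
    """
    n = len(k_oo)
    low_rank = [[k_oh[i] * k_oh[j] / k_hh for j in range(n)] for i in range(n)]
    marginal = [[k_oo[i][j] - low_rank[i][j] for j in range(n)] for i in range(n)]
    return marginal, low_rank


# Conditional on the hidden factor, only observed variables 0 and 1 interact.
k_oo = [
    [2.0, 0.5, 0.0],
    [0.5, 2.0, 0.0],
    [0.0, 0.0, 2.0],
]
k_oh = [0.6, 0.6, 0.6]  # every observed variable loads on the hidden factor
k_hh = 2.0

marginal, low_rank = marginal_precision(k_oo, k_oh, k_hh)
for row in marginal:
    print([round(v, 2) for v in row])
```

Although `k_oo` has a zero in position (0, 2), the marginal precision does not, so a purely sparse estimator fitted to the observed variables would report a spurious edge. Modelling the precision as sparse plus low-rank is what lets the two contributions be separated.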


2015 ◽  
Author(s):  
Scott M. Lundberg ◽  
William B. Tu ◽  
Brian Raught ◽  
Linda Z. Penn ◽  
Michael M. Hoffman ◽  
...  

Introduction: A cell's epigenome arises from interactions among regulatory factors (transcription factors, histone modifications, and other DNA-associated proteins) co-localized at particular genomic regions. Identifying the network of interactions among regulatory factors, the chromatin network, is of paramount importance in understanding epigenome regulation. Methods: We developed a novel computational approach, ChromNet, to infer the chromatin network from a set of ChIP-seq datasets. ChromNet has four key features that enable its use on large collections of ChIP-seq data. First, rather than using pairwise co-localization of factors along the genome, ChromNet identifies conditional dependence relationships that better discriminate direct and indirect interactions. Second, our novel statistical technique, the group graphical model, improves inference of conditional dependence on highly correlated datasets. Such datasets are common because some transcription factors form a complex and the same transcription factor is often assayed in different laboratories or cell types. Third, ChromNet's computationally efficient method and the group graphical model enable the learning of a joint network across all cell types, which greatly increases the scope of possible interactions. We have shown that this results in a significantly higher fold enrichment for validated protein interactions. Fourth, ChromNet provides an efficient way to identify the genomic context that drives a particular network edge, which provides a more comprehensive understanding of regulatory factor interactions. Results: We applied ChromNet to all available ChIP-seq data from the ENCODE Project, consisting of 1451 ChIP-seq datasets, which revealed previously known physical interactions better than alternative approaches. ChromNet also identified previously unreported regulatory factor interactions. We experimentally validated one of these interactions, between the MYC and HCFC1 transcription factors. Discussion: ChromNet provides a useful tool for understanding the interactions among regulatory factors and identifying novel interactions. We have provided an interactive web-based visualization of the full ENCODE chromatin network and the ability to incorporate custom datasets at http://chromnet.cs.washington.edu.
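The difference between pairwise co-localization and conditional dependence can be illustrated with the standard first-order partial correlation formula. This is a generic three-variable example, not ChromNet's group graphical model; the variable names and correlation values are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after conditioning on a third variable z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Indirect case: x and y are both driven by z and have no link of their own,
# so their pairwise correlation is exactly the product of the two z-links.
r_xz, r_yz = 0.8, 0.8
r_xy_indirect = r_xz * r_yz
indirect = partial_correlation(r_xy_indirect, r_xz, r_yz)

# Direct case: x and y are more correlated than z alone can explain.
r_xy_direct = 0.9
direct = partial_correlation(r_xy_direct, r_xz, r_yz)

print(round(r_xy_indirect, 2), round(indirect, 2))
print(round(r_xy_direct, 2), round(direct, 2))
```

In the indirect case the pairwise correlation is substantial but the partial correlation vanishes, which is why conditional dependence discriminates direct from indirect interactions better than pairwise measures do.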


2018 ◽  
Vol 373 (1758) ◽  
pp. 20170377 ◽  
Author(s):  
Hexuan Liu ◽  
Jimin Kim ◽  
Eli Shlizerman

We propose an approach to represent neuronal network dynamics as a probabilistic graphical model (PGM). To construct the PGM, we collect time series of neuronal responses produced by the neuronal network and use singular value decomposition to obtain a low-dimensional projection of the time-series data. We then extract dominant patterns from the projections to get pairwise dependency information and create a graphical model for the full network. The outcome model is a functional connectome that captures how stimuli propagate through the network and thus represents causal dependencies between neurons and stimuli. We apply our methodology to a model of the Caenorhabditis elegans somatic nervous system to validate and show an example of our approach. The structure and dynamics of the C. elegans nervous system are well studied and a model that generates neuronal responses is available. The resulting PGM enables us to obtain and verify underlying neuronal pathways for known behavioural scenarios and detect possible pathways for novel scenarios. This article is part of a discussion meeting issue ‘Connectome to behaviour: modelling C. elegans at cellular resolution’.


2003 ◽  
Vol 6 (3) ◽  
pp. 201-211 ◽  
Author(s):  
INGRID K. CHRISTOFFELS ◽  
ANNETTE M. B. DE GROOT ◽  
LOURENS J. WALDORP

Simultaneous interpreting (SI) is a complex skill, where language comprehension and production take place at the same time in two different languages. In this study we identified some of the basic cognitive skills involved in SI, focusing on the roles of memory and lexical retrieval. We administered a reading span task in two languages and a verbal digit span task in the native language to assess memory capacity, and a picture naming and a word translation task to tap the retrieval time of lexical items in two languages, and we related performance on these four tasks to interpreting skill in untrained bilinguals. The results showed that word translation and picture naming latencies correlate with interpreting performance. Also digit span and reading span were associated with SI performance, only less strongly so. A graphical models analysis indicated that specifically word translation efficiency and working memory form independent subskills of SI performance in untrained bilinguals.


2009 ◽  
Vol 21 (11) ◽  
pp. 3010-3056 ◽  
Author(s):  
Shai Litvak ◽  
Shimon Ullman

In this letter, we develop and simulate a large-scale network of spiking neurons that approximates the inference computations performed by graphical models. Unlike previous related schemes, which used sum and product operations in either the log or linear domains, the current model uses an inference scheme based on the sum and maximization operations in the log domain. Simulations show that using these operations, a large-scale circuit, which combines populations of spiking neurons as basic building blocks, is capable of finding close approximations to the full mathematical computations performed by graphical models within a few hundred milliseconds. The circuit is general in the sense that it can be wired for any graph structure, it supports multistate variables, and it uses standard leaky integrate-and-fire neuronal units. Following previous work, which proposed relations between graphical models and the large-scale cortical anatomy, we focus on the cortical microcircuitry and propose how anatomical and physiological aspects of the local circuitry may map onto elements of the graphical model implementation. We discuss in particular the roles of three major types of inhibitory neurons (small fast-spiking basket cells, large layer 2/3 basket cells, and double-bouquet neurons), subpopulations of strongly interconnected neurons with their unique connectivity patterns in different cortical layers, and the possible role of minicolumns in the realization of the population-based maximum operation.
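The sum and maximization operations in the log domain mentioned above correspond to the max-sum algorithm for finding the most probable joint assignment. The sketch below runs max-sum on a small chain-structured model and checks it against exhaustive search. It illustrates the mathematical computation only, not the spiking-neuron circuit, and the potentials are made-up numbers.

```python
import itertools
import math


def max_sum_chain(log_unary, log_pair):
    """Max-sum MAP inference on a chain, using only sums and maxima of logs.

    log_unary[i][s]: log-potential of variable i in state s.
    log_pair[i][s][t]: log-potential linking variable i (state s) to i + 1 (state t).
    Returns (best log-score, best state assignment).
    """
    n, k = len(log_unary), len(log_unary[0])
    msg = list(log_unary[0])  # best score of any prefix ending in each state
    back = []  # back-pointers for recovering the best assignment
    for i in range(1, n):
        new_msg, ptr = [], []
        for t in range(k):
            cands = [msg[s] + log_pair[i - 1][s][t] for s in range(k)]
            best_s = max(range(k), key=lambda s: cands[s])
            new_msg.append(cands[best_s] + log_unary[i][t])
            ptr.append(best_s)
        msg = new_msg
        back.append(ptr)
    last = max(range(k), key=lambda s: msg[s])
    states = [last]
    for ptr in reversed(back):
        states.append(ptr[states[-1]])
    states.reverse()
    return msg[last], states


def brute_force(log_unary, log_pair):
    """Exhaustive search over all assignments, used only as a check."""
    n, k = len(log_unary), len(log_unary[0])

    def score(a):
        unary = sum(log_unary[i][a[i]] for i in range(n))
        pair = sum(log_pair[i][a[i]][a[i + 1]] for i in range(n - 1))
        return unary + pair

    best = max(itertools.product(range(k), repeat=n), key=score)
    return score(best), list(best)


# Three binary variables; neighbouring variables prefer to agree.
log_unary = [[math.log(v) for v in row] for row in ([0.7, 0.3], [0.5, 0.5], [0.2, 0.8])]
agree = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
log_pair = [agree, agree]

best_score, best_states = max_sum_chain(log_unary, log_pair)
print(best_states, math.exp(best_score))
```

Working in the log domain turns products of potentials into sums, so the whole computation reduces to additions and maximum operations, the two primitives the abstract attributes to the proposed circuit.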


Biometrika ◽  
2016 ◽  
Vol 103 (4) ◽  
pp. 761-777 ◽  
Author(s):  
Kean Ming Tan ◽  
Yang Ning ◽  
Daniela M. Witten ◽  
Han Liu
