Scalable Estimator for Multi-task Gaussian Graphical Models Based in an IoT Network

2021 ◽  
Vol 17 (3) ◽  
pp. 1-33
Author(s):  
Beilun Wang ◽  
Jiaqi Zhang ◽  
Yan Zhang ◽  
Meng Wang ◽  
Sen Wang

Recently, the Internet of Things (IoT) has received significant interest due to its rapid development. However, IoT applications still face two challenges: the heterogeneity and the large scale of IoT data. How to efficiently integrate and process such complicated data is therefore an essential problem. In this article, we focus on analyzing the variable dependencies of data collected from different edge devices in an IoT network. Because data from different devices are heterogeneous and the variable dependencies can be characterized by a graphical model, we focus on jointly estimating multiple high-dimensional, sparse Gaussian Graphical Models for many related tasks (edge devices). This is an important goal in many fields: many IoT networks have collected massive multi-task data and require the analysis of heterogeneous data in many scenarios. Past works on joint estimation are non-distributed and involve computationally expensive and complex non-smooth optimizations. To address these problems, we propose a novel approach, Multi-FST, which can be efficiently implemented on a cloud-server-based IoT network. The cloud server has a low computational load, and the IoT devices communicate with the server asynchronously, which makes the method efficient. Multi-FST shows significant improvement over baselines when tested on various datasets.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Vincent Bessonneau ◽  
Roy R. Gerona ◽  
Jessica Trowbridge ◽  
Rachel Grashow ◽  
Thomas Lin ◽  
...  

Abstract: Given the complex exposures from both exogenous and endogenous sources that an individual experiences during life, exposome-wide association studies that interrogate levels of small molecules in biospecimens have been proposed for discovering causes of chronic diseases. We conducted a study to explore associations between environmental chemicals and endogenous molecules using Gaussian graphical models (GGMs) of non-targeted metabolomics data measured in a cohort of California women firefighters and office workers. GGMs revealed many exposure-metabolite associations, including that exposures to mono-hydroxyisononyl phthalate, ethyl paraben and 4-ethylbenzoic acid were associated with metabolites involved in steroid hormone biosynthesis, and perfluoroalkyl substances were linked to bile acids (hormones that regulate cholesterol and glucose metabolism) and inflammatory signaling molecules. Some hypotheses generated from these findings were confirmed by analysis of data from the National Health and Nutrition Examination Survey. Taken together, our findings demonstrate a novel approach to discovering associations between chemical exposures and biological processes of potential relevance for disease causation.



Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2105
Author(s):  
Claudia Angelini ◽  
Daniela De Canditiis ◽  
Anna Plaksienko

In this paper, we consider the problem of estimating multiple Gaussian Graphical Models from high-dimensional datasets. We assume that these datasets are sampled from different distributions with the same conditional independence structure, but not the same precision matrix. We propose jewel, a joint data estimation method that uses a node-wise penalized regression approach. In particular, jewel uses a group Lasso penalty to simultaneously guarantee the resulting adjacency matrix’s symmetry and the graphs’ joint learning. We solve the minimization problem using the group descent algorithm and propose two procedures for estimating the regularization parameter. Furthermore, we establish the estimator’s consistency property. Finally, we illustrate our estimator’s performance through simulated and real data examples on gene regulatory networks.
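To make the role of the group Lasso penalty concrete, the sketch below shows block soft-thresholding, the proximal operator of a Euclidean-norm penalty on a group of coefficients. It is not jewel's implementation: the function name `group_soft_threshold`, the example coefficients, and the choice to group one candidate edge's coefficients across three datasets are all illustrative assumptions.

```python
import math

def group_soft_threshold(group, lam):
    """Proximal operator of lam * ||group||_2 (block soft-thresholding).

    The whole group is set to zero when its Euclidean norm is at most lam;
    otherwise every entry is shrunk by the same factor.
    """
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * v for v in group]

# Hypothetical regression coefficients tied to one candidate edge,
# one entry per dataset (K = 3). Values are made up for illustration.
weak_edge = [0.05, -0.02, 0.03]
strong_edge = [0.9, 1.2, -0.4]

print(group_soft_threshold(weak_edge, lam=0.1))    # whole group removed together
print(group_soft_threshold(strong_edge, lam=0.1))  # whole group kept, slightly shrunk
```

Because the threshold acts on the norm of the whole group, an edge is either dropped in every dataset or retained in every dataset, which is how a group penalty can enforce a shared graph structure while leaving the individual coefficient values free to differ.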



2009 ◽  
Vol 21 (11) ◽  
pp. 3010-3056 ◽  
Author(s):  
Shai Litvak ◽  
Shimon Ullman

In this letter, we develop and simulate a large-scale network of spiking neurons that approximates the inference computations performed by graphical models. Unlike previous related schemes, which used sum and product operations in either the log or linear domains, the current model uses an inference scheme based on the sum and maximization operations in the log domain. Simulations show that using these operations, a large-scale circuit, which combines populations of spiking neurons as basic building blocks, is capable of finding close approximations to the full mathematical computations performed by graphical models within a few hundred milliseconds. The circuit is general in the sense that it can be wired for any graph structure, it supports multistate variables, and it uses standard leaky integrate-and-fire neuronal units. Following previous work, which proposed relations between graphical models and the large-scale cortical anatomy, we focus on the cortical microcircuitry and propose how anatomical and physiological aspects of the local circuitry may map onto elements of the graphical model implementation. We discuss in particular the roles of three major types of inhibitory neurons (small fast-spiking basket cells, large layer 2/3 basket cells, and double-bouquet neurons), subpopulations of strongly interconnected neurons with their unique connectivity patterns in different cortical layers, and the possible role of minicolumns in the realization of the population-based maximum operation.
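For readers unfamiliar with inference based on sum and maximization operations in the log domain, the following is a minimal conventional (non-spiking) sketch of max-sum message passing on a chain-shaped graphical model. It says nothing about the neural circuit itself; the function names and the log-potentials are hypothetical, and a brute-force search is included only as a check.

```python
import itertools

def chain_score(states, unary, pairwise):
    """Total log-score of one joint assignment on a chain."""
    score = sum(unary[t][s] for t, s in enumerate(states))
    score += sum(pairwise[a][b] for a, b in zip(states, states[1:]))
    return score

def max_sum_chain(unary, pairwise):
    """Max-sum on a chain: only additions and maximizations of log-potentials."""
    k = len(unary[0])
    msg = [list(unary[0])]  # msg[t][s]: best log-score of a prefix ending in state s
    back = []               # back-pointers used to recover the best assignment
    for t in range(1, len(unary)):
        cur, ptr = [], []
        for s in range(k):
            best_prev = max(range(k), key=lambda r: msg[-1][r] + pairwise[r][s])
            cur.append(msg[-1][best_prev] + pairwise[best_prev][s] + unary[t][s])
            ptr.append(best_prev)
        msg.append(cur)
        back.append(ptr)
    last = max(range(k), key=lambda s: msg[-1][s])
    best = msg[-1][last]
    states = [last]
    for ptr in reversed(back):
        states.append(ptr[states[-1]])
    states.reverse()
    return best, states

def brute_force(unary, pairwise):
    """Exhaustive maximum over all assignments, for checking small examples."""
    k = len(unary[0])
    return max(chain_score(c, unary, pairwise)
               for c in itertools.product(range(k), repeat=len(unary)))

# Hypothetical log-potentials: 3 variables, 2 states each.
unary = [[0.0, -1.0], [-0.5, 0.0], [-2.0, 0.0]]
pairwise = [[0.0, -1.5], [-1.5, 0.0]]  # neighbours prefer to agree

best, states = max_sum_chain(unary, pairwise)
print(best, states, brute_force(unary, pairwise))
```

Working in the log domain turns products of potentials into sums, so the only operations needed are addition and maximization, the two operations the abstract refers to.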



Biometrics ◽  
2017 ◽  
Vol 73 (3) ◽  
pp. 769-779 ◽  
Author(s):  
Zhixiang Lin ◽  
Tao Wang ◽  
Can Yang ◽  
Hongyu Zhao


2019 ◽  
Author(s):  
Arshdeep Sekhon ◽  
Beilun Wang ◽  
Yanjun Qi

Abstract: We focus on integrating different types of extra knowledge (other than the observed samples) for estimating the sparse structure change between two p-dimensional Gaussian Graphical Models (i.e. differential GGMs). Previous differential GGM estimators either fail to include additional knowledge or cannot scale up to a high-dimensional (large p) situation. This paper proposes a novel method, KDiffNet, that incorporates Additional Knowledge in identifying Differential Networks via an Elementary Estimator. We design a novel hybrid norm as a superposition of two structured norms guided by the extra edge information and the additional node group knowledge. KDiffNet is solved through a fast parallel proximal algorithm, enabling it to work in large-scale settings. KDiffNet can incorporate various combinations of existing knowledge without re-designing the optimization. Through rigorous statistical analysis we show that, while considering more evidence, KDiffNet achieves the same convergence rate as the state-of-the-art. Empirically, on multiple synthetic datasets and one real-world fMRI brain dataset, KDiffNet significantly outperforms the cutting-edge baselines with regard to prediction performance, while achieving the same level of time cost or less.



2019 ◽  
Vol 26 (11) ◽  
pp. 1195-1202 ◽  
Author(s):  
Jelena Gligorijevic ◽  
Djordje Gligorijevic ◽  
Martin Pavlovski ◽  
Elizabeth Milkovits ◽  
Lucas Glass ◽  
...  

Abstract
Objective: Clinical trials, prospective research studies on human participants carried out by a distributed team of clinical investigators, play a crucial role in the development of new treatments in health care. This is a complex and expensive process where investigators aim to enroll volunteers with predetermined characteristics, administer treatment(s), and collect safety and efficacy data. Therefore, choosing top-enrolling investigators is essential for efficient clinical trial execution and is 1 of the primary drivers of drug development cost.
Materials and Methods: To facilitate clinical trials optimization, we propose DeepMatch (DM), a novel approach that builds on top of advances in deep learning. DM is designed to learn from both investigator and trial-related heterogeneous data sources and rank investigators based on their expected enrollment performance on new clinical trials.
Results: Large-scale evaluation conducted on 2618 studies provides evidence that the proposed ranking-based framework improves the current state-of-the-art by up to 19% on ranking investigators and up to 10% on detecting top/bottom performers when recruiting investigators for new clinical trials.
Discussion: The extensive experimental section suggests that DM can provide substantial improvement over current industry standards in several regards: (1) the enrollment potential of the investigator list, (2) the time it takes to generate the list, and (3) data-informed decisions about new investigators.
Conclusion: Due to the great significance of the problem at hand, related research efforts are set to shift the paradigm of how investigators are chosen for clinical trials, thereby optimizing and automating them and reducing the cost of new therapies.



Genes ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 167 ◽  
Author(s):  
Qingyang Zhang

The nonparanormal graphical model has emerged as an important tool for modeling dependency structure between variables because it is flexible to non-Gaussian data while maintaining the good interpretability and computational convenience of Gaussian graphical models. In this paper, we consider the problem of detecting differential substructure between two nonparanormal graphical models with false discovery rate control. We construct a new statistic based on a truncated estimator of the unknown transformation functions, together with a bias-corrected sample covariance. Furthermore, we show that the new test statistic converges to the same distribution as its oracle counterpart does. Both synthetic data and real cancer genomic data are used to illustrate the promise of the new method. Our proposed testing framework is simple and scalable, facilitating its applications to large-scale data. The computational pipeline has been implemented in the R package DNetFinder, which is freely available through the Comprehensive R Archive Network.
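As a rough illustration of what a truncated estimator of the transformation functions can look like, the sketch below applies a Winsorized empirical-CDF normal-score transform to one variable. It does not reproduce the paper's test statistic or the DNetFinder code: the function name `normal_scores`, the sample values, and the default truncation level (a choice commonly used in the nonparanormal literature) are assumptions made here.

```python
import math
from statistics import NormalDist

def normal_scores(xs, delta=None):
    """Map one variable to approximately Gaussian scores.

    Each value is replaced by the standard normal quantile of its empirical
    CDF value, after clipping the CDF to [delta, 1 - delta] so that the
    largest observation does not map to infinity.
    """
    n = len(xs)
    if delta is None:
        # Assumed default truncation level; requires n >= 2.
        delta = 1.0 / (4.0 * n ** 0.25 * math.sqrt(math.pi * math.log(n)))
    std_normal = NormalDist()
    scores = []
    for x in xs:
        ecdf = sum(1 for v in xs if v <= x) / n
        clipped = min(max(ecdf, delta), 1.0 - delta)
        scores.append(std_normal.inv_cdf(clipped))
    return scores

# Hypothetical heavily skewed measurements of a single variable.
sample = [1.0, 10.0, 100.0, 1000.0, 10000.0]
print(normal_scores(sample))
```

The transform depends only on ranks, so any monotone distortion of the original variable yields the same scores; Gaussian graphical model machinery can then be applied to the transformed data.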



2020 ◽  
Vol 14 (1) ◽  
pp. 2439-2483
Author(s):  
Yuhao Wang ◽  
Santiago Segarra ◽  
Caroline Uhler


Author(s):  
Eugene Santos Jr. ◽  
Eunice E. Santos ◽  
Hien Nguyen ◽  
Long Pan ◽  
John Korah

With the proliferation of the Internet and rapid development of information and communication infrastructure, E-governance has become a viable option for effective deployment of government services and programs. Areas of E-governance such as homeland security and disaster relief have to deal with vast amounts of dynamic heterogeneous data. Providing rapid real-time search capabilities for such databases/sources is a challenge. Intelligent Foraging, Gathering, and Matching (I-FGM) is an established framework developed to help analysts find information quickly and effectively by incrementally collecting, processing and matching information nuggets. This framework has previously been used to develop a distributed, free text information retrieval application. In this chapter, we provide a comprehensive solution for the E-GOV analyst by extending the I-FGM framework to image collections and creating a “live” version of I-FGM deployable for real-world use. We present a Content Based Image Retrieval (CBIR) technique that incrementally processes the images, extracts low-level features and maps them to higher-level concepts. Our empirical evaluation of the algorithm shows that our approach performs competitively compared to some existing approaches in terms of retrieving relevant images, while offering the speed advantages of a distributed and incremental process and a unified framework for both text and images. We describe our production-level prototype, which has a sophisticated user interface that can also deal with multiple queries from multiple users. The interface provides real-time updating of the search results and provides “under the hood” details of I-FGM processes as the queries are being processed.


