Adding Extra Knowledge in Scalable Learning of Sparse Differential Gaussian Graphical Models

2019 ◽  
Author(s):  
Arshdeep Sekhon ◽  
Beilun Wang ◽  
Yanjun Qi

Abstract: We focus on integrating different types of extra knowledge (other than the observed samples) for estimating the sparse structure change between two p-dimensional Gaussian Graphical Models (i.e., differential GGMs). Previous differential GGM estimators either fail to include additional knowledge or cannot scale up to a high-dimensional (large p) situation. This paper proposes a novel method, KDiffNet, that incorporates Additional Knowledge in identifying Differential Networks via an Elementary Estimator. We design a novel hybrid norm as a superposition of two structured norms guided by the extra edge information and the additional node group knowledge. KDiffNet is solved through a fast parallel proximal algorithm, enabling it to work in large-scale settings. KDiffNet can incorporate various combinations of existing knowledge without re-designing the optimization. Through rigorous statistical analysis we show that, while considering more evidence, KDiffNet achieves the same convergence rate as the state-of-the-art. Empirically, on multiple synthetic datasets and one real-world fMRI brain dataset, KDiffNet significantly outperforms the cutting-edge baselines in prediction performance at the same or lower time cost.
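
The general idea of a sparse differential network guided by prior edge knowledge can be illustrated with a very small sketch. This is not KDiffNet itself: it only thresholds the difference of two ridge-regularised inverse covariances with per-edge weights, and it ignores node group knowledge entirely. The function names, the weight matrix `W`, the penalty `lam`, and the `ridge` term are all assumptions made for illustration.

```python
import numpy as np


def weighted_soft_threshold(B, W, lam):
    """Entry-wise soft-thresholding; a larger weight in W shrinks that entry more."""
    return np.sign(B) * np.maximum(np.abs(B) - lam * W, 0.0)


def naive_differential_ggm(X_case, X_control, W, lam, ridge=1e-2):
    """Estimate a sparse difference of two precision matrices.

    X_case, X_control: (n_samples, p) data arrays for the two conditions.
    W: (p, p) non-negative weights encoding prior edge knowledge (assumed form).
    """
    p = X_case.shape[1]
    # Ridge-regularised inverse covariances keep the inversion stable when p is large.
    prec_case = np.linalg.inv(np.cov(X_case, rowvar=False) + ridge * np.eye(p))
    prec_ctrl = np.linalg.inv(np.cov(X_control, rowvar=False) + ridge * np.eye(p))
    # Edges with small weights (trusted by prior knowledge) survive thresholding more easily.
    return weighted_soft_threshold(prec_case - prec_ctrl, W, lam)
```

Setting an entry of `W` below 1 makes the corresponding edge change cheaper to keep, which is one simple way extra edge information could enter such an estimator.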

Author(s):  
Feng Huang ◽  
Xiang Yue ◽  
Zhankun Xiong ◽  
Zhouxin Yu ◽  
Shichao Liu ◽  
...  

Abstract: MicroRNAs (miRNAs) play crucial roles in multifarious biological processes associated with human diseases. Identifying potential miRNA-disease associations contributes to understanding the molecular mechanisms of miRNA-related diseases. Most existing computational methods focus on predicting whether a miRNA-disease association exists or not. However, the roles of miRNAs in diseases are markedly diverse: for instance, genetic variants of a miRNA (mir-15) may affect the expression level of miRNAs, leading to B cell chronic lymphocytic leukemia, while circulating miRNAs (including mir-1246, mir-1307-3p, etc.) have the potential to detect breast cancer at an early stage. In this paper, we aim to predict multi-type miRNA-disease associations instead of treating them as binary. To this end, we represent miRNA-disease-type triples as a tensor and introduce tensor decomposition methods to solve the prediction task. Experimental results on two widely adopted miRNA-disease datasets, HMDD v2.0 and HMDD v3.2, show that tensor decomposition methods improve on a recent baseline by a large margin (up to 38% in Top-1 F1). We then propose a novel method, Tensor Decomposition with Relational Constraints (TDRC), which incorporates biological features as relational constraints to further improve the existing tensor decomposition methods. Compared with two existing tensor decomposition methods, TDRC produces better performance while being more efficient.
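
To make the tensor formulation concrete, the sketch below stores (miRNA, disease, type) triples in a binary three-way tensor and factorises it with a plain CP decomposition fitted by alternating least squares. It is a generic baseline under stated assumptions, not TDRC: no relational constraints or biological features are used, and the function names, the rank, and the integer index encoding of the triples are hypothetical.

```python
import numpy as np


def build_tensor(triples, n_mirna, n_disease, n_type):
    """Binary tensor with X[i, j, k] = 1 when miRNA i is linked to disease j via type k."""
    X = np.zeros((n_mirna, n_disease, n_type))
    for i, j, k in triples:
        X[i, j, k] = 1.0
    return X


def cp_als(X, rank, n_iter=50, seed=0):
    """Plain CP decomposition by alternating least squares (no side information)."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((dim, rank)) for dim in X.shape)
    for _ in range(n_iter):
        # Each update is a linear least-squares solve with the other two factors held fixed.
        A = np.einsum("ijk,jr,kr->ir", X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum("ijk,ir,kr->jr", X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum("ijk,ir,jr->kr", X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C


def reconstruct(A, B, C):
    """Predicted score for every (miRNA, disease, type) triple."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)
```

After fitting, unobserved entries of `reconstruct(A, B, C)` with high scores would be read as candidate associations of the corresponding type.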


2021 ◽  
Vol 17 (3) ◽  
pp. 1-33
Author(s):  
Beilun Wang ◽  
Jiaqi Zhang ◽  
Yan Zhang ◽  
Meng Wang ◽  
Sen Wang

Recently, the Internet of Things (IoT) has received significant interest due to its rapid development. However, IoT applications still face two challenges: the heterogeneity and the large scale of IoT data. How to efficiently integrate and process these complicated data has therefore become an essential problem. In this article, we focus on analyzing the variable dependencies of data collected from different edge devices in an IoT network. Because data from different devices are heterogeneous and the variable dependencies can be characterized by a graphical model, we focus on jointly estimating multiple high-dimensional, sparse Gaussian Graphical Models for many related tasks (edge devices). This is an important goal in many fields: many IoT networks have collected massive multi-task data and require the analysis of heterogeneous data in many scenarios. Past works on joint estimation are non-distributed and involve computationally expensive and complex non-smooth optimizations. To address these problems, we propose a novel approach, Multi-FST. Multi-FST can be efficiently implemented on a cloud-server-based IoT network. The cloud server has a low computational load, and IoT devices communicate with the server asynchronously, which makes the method efficient. Multi-FST shows significant improvement over baselines when tested on various datasets.


2019 ◽  
Vol 5 (11) ◽  
pp. eaau4996 ◽  
Author(s):  
Jakob Runge ◽  
Peer Nowack ◽  
Marlene Kretschmer ◽  
Seth Flaxman ◽  
Dino Sejdinovic

Identifying causal relationships and quantifying their strength from observational time series data are key problems in disciplines dealing with complex dynamical systems such as the Earth system or the human body. Data-driven causal inference in such systems is challenging since datasets are often high dimensional and nonlinear with limited sample sizes. Here, we introduce a novel method that flexibly combines linear or nonlinear conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. We validate the method on time series of well-understood physical mechanisms in the climate system and the human heart and using large-scale synthetic datasets mimicking the typical properties of real-world data. The experiments demonstrate that our method outperforms state-of-the-art techniques in detection power, which opens up entirely new possibilities to discover and quantify causal networks from time series across a range of research fields.
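
As an illustration of the linear case, a conditional independence test of the kind such a causal discovery algorithm could call is the partial correlation test sketched below. It is a minimal example, not the authors' implementation: the function name is hypothetical, the p-value uses Fisher's z-transform under a Gaussian assumption, and in a time series setting the conditioning set `Z` would hold lagged variables.

```python
import math
import numpy as np


def partial_corr_test(x, y, Z=None):
    """Linear conditional independence test of x and y given the columns of Z.

    Returns (partial correlation, two-sided p-value from Fisher's z-transform).
    """
    n = len(x)
    k = 0 if Z is None else Z.shape[1]
    if k > 0:
        # Regress Z (plus an intercept) out of both variables, then correlate the residuals.
        design = np.column_stack([np.ones(n), Z])
        x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
        y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    # Clip so that atanh stays finite for (near-)perfect correlations.
    r = float(np.clip(np.corrcoef(x, y)[0, 1], -0.999999, 0.999999))
    z_stat = math.atanh(r) * math.sqrt(n - k - 3)
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return r, p_value


# Toy chain x -> z -> y: x and y are dependent, but independent given z.
rng = np.random.default_rng(0)
x = rng.standard_normal(5000)
z = x + rng.standard_normal(5000)
y = z + rng.standard_normal(5000)
print(partial_corr_test(x, y))             # marginal dependence
print(partial_corr_test(x, y, z[:, None])) # dependence after conditioning on z
```

A causal discovery procedure would remove the link between `x` and `y` once it finds a conditioning set, here `z`, that renders them independent.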


2019 ◽  
Author(s):  
Elisa Benedetti ◽  
Nathalie Gerstner ◽  
Maja Pučić-Baković ◽  
Toma Keser ◽  
Karli R. Reiding ◽  
...  

Abstract: Glycomics measurements, like all other high-throughput technologies, are subject to technical variation due to fluctuations in the experimental conditions. The removal of this non-biological signal from the data is referred to as normalization. Contrary to other omics data types, a systematic evaluation of normalization options for glycomics data has not been published so far. In this paper, we assess the quality of different normalization strategies for glycomics data with an innovative approach. It has been shown previously that Gaussian Graphical Models (GGMs) inferred from glycomics data are able to identify enzymatic steps in the glycan synthesis pathways in a data-driven fashion. Based on this finding, we here quantify the quality of a given normalization method according to how well a GGM inferred from the respective normalized data reconstructs known synthesis reactions in the glycosylation pathway. The method therefore exploits a biological measure of goodness. We analyzed 23 different normalization combinations applied to six large-scale glycomics cohorts across three experimental platforms (LC-ESI-MS, UHPLC-FLD and MALDI-FTICR-MS). Based on our results, we recommend normalizing glycan data using the ‘Probabilistic Quotient’ method followed by log-transformation, irrespective of the measurement platform.
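
The recommended pipeline can be sketched in a few lines. The version below follows the commonly described form of Probabilistic Quotient normalization and rests on several assumptions not stated in the abstract: the reference profile is the per-feature median across samples, no preliminary total-area normalization is applied, intensities are strictly positive, and the natural logarithm is used. The function name is hypothetical.

```python
import numpy as np


def pqn_log(X):
    """Probabilistic Quotient normalization followed by a natural-log transform.

    X: (n_samples, n_glycans) array of strictly positive intensities.
    """
    X = np.asarray(X, dtype=float)
    reference = np.median(X, axis=0)         # reference profile: per-feature median
    quotients = X / reference                # deviation of each sample from the reference
    dilution = np.median(quotients, axis=1)  # one scaling factor per sample
    return np.log(X / dilution[:, None])
```

Because each sample is divided by its median quotient, samples that differ from the reference only by an overall dilution factor are mapped to the same profile.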


Author(s):  
S. Pragati ◽  
S. Kuldeep ◽  
S. Ashok ◽  
M. Satheesh

One of the challenges in the treatment of disease is the delivery of efficacious medication at an appropriate concentration to the site of action in a controlled and continual manner. Nanoparticles represent an important particulate carrier system developed for this purpose. Nanoparticles are solid colloidal particles ranging in size from 1 to 1000 nm and composed of macromolecular material. They can be polymeric or lipidic (solid lipid nanoparticles, SLNs). Industry estimates suggest that approximately 40% of lipophilic drug candidates fail due to solubility and formulation stability issues, prompting significant research activity in advanced lipophile delivery technologies. Solid lipid nanoparticle technology represents a promising new approach to lipophile drug delivery, and SLNs are an important advancement in this area. The bioacceptable and biodegradable nature of SLNs makes them less toxic than polymeric nanoparticles. Their small size, which prolongs circulation time in the blood, together with feasible scale-up for large-scale production and the absence of a burst effect, makes them interesting candidates for study. In this review, this new approach is discussed in terms of preparation, advantages, characterization and special features.


2008 ◽  
Vol 59 (11) ◽  
Author(s):  
Iulia Lupan ◽  
Sergiu Chira ◽  
Maria Chiriac ◽  
Nicolae Palibroda ◽  
Octavian Popescu

Amino acids are obtained by bacterial fermentation, extraction from natural protein or enzymatic synthesis from specific substrates. With the introduction of recombinant DNA technology, it has become possible to apply more rational approaches to the enzymatic synthesis of amino acids. Aspartase (L-aspartate ammonia-lyase) catalyzes the reversible deamination of L-aspartic acid to yield fumaric acid and ammonia. It is one of the most important industrial enzymes used to produce L-aspartic acid on a large scale. Here we describe a novel method for [15N] L-aspartic acid synthesis from fumarate and ammonia (15NH4Cl) using a recombinant aspartase.


2020 ◽  
Vol 27 (2) ◽  
pp. 105-110 ◽  
Author(s):  
Niaz Ahmad ◽  
Muhammad Aamer Mehmood ◽  
Sana Malik

In recent years, microalgae have emerged as an alternative platform for large-scale production of recombinant proteins for different commercial applications. As a production platform, they have several advantages, including rapid growth, easy scale-up and the ability to grow with or without an external carbon source. Genetic transformation of several species has been established. Of these, Chlamydomonas reinhardtii has become particularly attractive for its potential to express foreign proteins inexpensively. All three of its genomes – nuclear, mitochondrial and chloroplastic – have been sequenced. As a result, a wealth of information about its genetic machinery and protein expression mechanisms (transcription, translation and post-translational modifications) is available. Over the years, various molecular tools have been developed for the manipulation of all these genomes. Various studies show that transformation of the chloroplast genome has several advantages over nuclear transformation from the biopharming point of view. According to a recent survey, over 100 recombinant proteins have been expressed in algal chloroplasts. However, the expression levels achieved in the algal chloroplast genome are generally lower than those in the chloroplasts of higher plants. Work is therefore needed to make algal chloroplast transformation commercially competitive. In this review, we discuss some examples from algal research that could help make the algal chloroplast commercially successful.

