Fundamental limitations of network reconstruction from temporal data

2017, Vol 14 (127), pp. 20160966
Author(s): Marco Tulio Angulo, Jaime A. Moreno, Gabor Lippner, Albert-László Barabási, Yang-Yu Liu

Inferring properties of the interaction matrix that characterizes how nodes in a networked system directly interact with each other is a well-known network reconstruction problem. Despite a decade of extensive studies, network reconstruction remains an outstanding challenge. The fundamental limitations governing which properties of the interaction matrix (e.g. adjacency pattern, sign pattern or degree sequence) can be inferred from given temporal data of individual nodes remain unknown. Here, we rigorously derive the necessary conditions to reconstruct any property of the interaction matrix. Counterintuitively, we find that reconstructing any property of the interaction matrix is generically as difficult as reconstructing the interaction matrix itself, requiring equally informative temporal data. Revealing these fundamental limitations sheds light on the design of better network reconstruction algorithms that offer practical improvements over existing methods.
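
As a toy illustration of the setting described above (and not the paper's derivation), the sketch below assumes the nodes follow linear dynamics dx/dt = Ax, fits the interaction matrix from a sampled trajectory by least squares, and then reads off one property, the sign pattern. All dimensions, parameters, and the simulation itself are made up for illustration.

```python
import numpy as np

# Toy illustration of the reconstruction setting (not the paper's method):
# assume linear dynamics dx/dt = A x, record a trajectory, and estimate A
# by least squares from the observed states and derivatives.
rng = np.random.default_rng(0)
n, T, dt = 5, 400, 0.01
A_true = rng.normal(size=(n, n)) * (rng.random((n, n)) < 0.4)  # sparse interaction matrix

x = rng.normal(size=n)
states, derivs = [], []
for _ in range(T):
    dx = A_true @ x
    states.append(x.copy())
    derivs.append(dx)
    x = x + dt * dx                      # forward-Euler step

X, dX = np.array(states), np.array(derivs)
# Solve dX ≈ X @ A.T for the interaction matrix A (one least-squares fit per node).
A_hat = np.linalg.lstsq(X, dX, rcond=None)[0].T

# One property of the interaction matrix: its sign pattern.
print(np.array_equal(np.sign(np.round(A_hat, 6)), np.sign(A_true)))
```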

10.37236/3414
2013, Vol 20 (4)
Author(s): Sarah Behrens, Catherine Erbes, Michael Ferrara, Stephen G. Hartke, Benjamin Reiniger, ...

A sequence of nonnegative integers is $k$-graphic if it is the degree sequence of a $k$-uniform hypergraph. The only known characterization of $k$-graphic sequences is due to Dewdney in 1975. As this characterization does not yield an efficient algorithm, it is a fundamental open question to determine a more practical characterization. While several necessary conditions appear in the literature, there are few conditions that imply a sequence is $k$-graphic. In light of this, we present sharp sufficient conditions for $k$-graphicality based on a sequence's length and degree sum. Kocay and Li gave a family of edge exchanges (an extension of 2-switches) that could be used to transform one realization of a 3-graphic sequence into any other realization. We extend their result to $k$-graphic sequences for all $k \geq 3$. Finally, we give several applications of edge exchanges in hypergraphs, including generalizing a result of Busch et al. on packing graphic sequences.
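
As a small illustration of the necessary side of the picture (not the paper's sharp sufficient conditions), the sketch below checks two elementary conditions that any $k$-graphic sequence must satisfy; the function name and examples are ours.

```python
from math import comb

def passes_basic_k_graphic_checks(degrees, k):
    """Two elementary necessary conditions for a sequence to be k-graphic
    (they do not certify k-graphicality): the degree sum must be divisible
    by k, and no degree may exceed the number of (k-1)-subsets of the
    remaining vertices."""
    n = len(degrees)
    if sum(degrees) % k != 0:
        return False
    return all(0 <= d <= comb(n - 1, k - 1) for d in degrees)

# (3, 3, 3, 3) is 3-graphic: it is realized by the complete 3-uniform hypergraph on 4 vertices.
print(passes_basic_k_graphic_checks([3, 3, 3, 3], 3))   # True
print(passes_basic_k_graphic_checks([6, 2, 2, 2], 3))   # False: degree 6 exceeds comb(3, 2) = 3
```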


2021, Vol 348, pp. 01011
Author(s): Aicha Allag, Redouane Drai, Tarek Boutkedjirt, Abdessalam Benammar, Wahiba Djerir

Computed tomography (CT) aims to reconstruct the internal distribution of an object from projection measurements. With a limited number of projections, the reconstruction problem becomes significantly ill-posed, so reconstruction algorithms play a crucial role in overcoming this difficulty. When data are missing or incomplete, a sparse regularization obtained by adding an l1-norm term is needed to improve the quality of the reconstructed image. The reconstruction is then carried out using proximal operators. We are interested in the Douglas-Rachford method and employ total variation (TV) regularization. An efficient technique based on these concepts is proposed in this study. The primary goal is to achieve high-quality reconstructed images in terms of PSNR and relative error. Numerical simulation results demonstrate that the suggested technique reduces noise and artifacts while preserving structural information. The results are encouraging and indicate the effectiveness of the proposed strategy.
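
As context for the splitting scheme referred to above, here is a minimal, generic Douglas-Rachford iteration for a problem of the form min f(x) + g(x), written with an l1 prox (soft thresholding) standing in for the TV prox to keep it short. It is a sketch of the general method, not the authors' implementation, and the parameter values are illustrative.

```python
import numpy as np

def prox_l2(z, b, gamma):
    # prox of gamma * 0.5 * ||x - b||^2  (data-fidelity term)
    return (z + gamma * b) / (1.0 + gamma)

def prox_l1(z, tau):
    # prox of tau * ||x||_1: soft thresholding (sparsity-promoting term)
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def douglas_rachford(b, lam=0.2, gamma=1.0, iters=200):
    """Douglas-Rachford splitting for min_x 0.5*||x - b||^2 + lam*||x||_1,
    a generic sketch of the splitting scheme, with an l1 prox standing in
    for the TV prox used in the paper."""
    z = np.zeros_like(b)
    for _ in range(iters):
        x = prox_l2(z, b, gamma)                       # prox of the smooth term
        z = z + prox_l1(2 * x - z, gamma * lam) - x    # reflected prox of the l1 term
    return prox_l2(z, b, gamma)

noisy = np.array([0.05, 1.2, -0.03, 0.9, 0.01])
print(douglas_rachford(noisy))   # small entries shrink toward zero
```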


2016, Vol 31, pp. 541-548
Author(s): Yu Ber-Lin, Huang Ting-Zhu, Jie Cui, Deng Chunhua

An $n$-by-$n$ real matrix $A$ is eventually positive if there exists a positive integer $k_{0}$ such that $A^{k}>0$ for all $k\geq k_{0}$. An $n$-by-$n$ sign pattern $\mathcal{A}$ is potentially eventually positive (PEP) if there exists an eventually positive real matrix $A$ with the same sign pattern as $\mathcal{A}$. An $n$-by-$n$ sign pattern $\mathcal{A}$ is a minimal potentially eventually positive sign pattern (MPEP sign pattern) if $\mathcal{A}$ is PEP and no proper subpattern of $\mathcal{A}$ is PEP. Berman, Catral, DeAlba, et al. [Sign patterns that allow eventual positivity, {\it ELA}, 19(2010): 108-120] established some sufficient and some necessary conditions for an $n$-by-$n$ sign pattern to allow eventual positivity and classified the potentially eventually positive sign patterns of order $n\leq 3$. However, the identification and classification of PEP sign patterns of order $n\geq 4$ remain open. In this paper, all the $n$-by-$n$ PEP star sign patterns are classified by identifying all the MPEP star sign patterns.
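
For readers unfamiliar with the definition, the sketch below tests eventual positivity of a concrete matrix numerically by powering it. It relies on the known characterization that $A$ is eventually positive exactly when $A^{k}>0$ and $A^{k+1}>0$ for some $k$; the bound on the search is an arbitrary choice, so this illustrates the definition rather than deciding the sign-pattern questions studied in the paper.

```python
import numpy as np

def is_eventually_positive(A, max_power=50):
    """Numerically test eventual positivity by powering A.  A known
    characterization states that A is eventually positive exactly when
    A^k > 0 and A^(k+1) > 0 for some k, so a True answer is reliable;
    False only means no such k was found below max_power."""
    P = np.eye(A.shape[0])
    for k in range(1, max_power + 1):
        P = P @ A
        if np.all(P > 0) and np.all(P @ A > 0):
            return True, k
    return False, None

# A matrix with a negative entry that is nonetheless eventually positive.
A = np.array([[1.0, 1.0],
              [1.0, -0.1]])
print(is_eventually_positive(A))   # (True, 2): A^2 and A^3 are entrywise positive
```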


2021, Vol 8
Author(s): M. Kaan Arici, Nurcan Tuncbag

Beyond the list of molecules, there is a need to consider multiple sets of omic data collectively and to reconstruct the connections between the molecules. In particular, pathway reconstruction is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data accurately. In this study, we aim to comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on the coverage of each interactome in terms of cancer driver proteins, structural information of protein interactions, and the bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms, including all-pairs shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches. The coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
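
To make one of the listed approaches concrete, here is a minimal personalized PageRank sketch on a toy interactome. The proteins, edges, and seed choices are purely illustrative, and this is not the benchmarking pipeline used in the study.

```python
import networkx as nx

# Minimal sketch of one listed approach (personalized PageRank) on a toy
# interactome; the proteins, edges, and seeds are chosen only for illustration.
G = nx.Graph()
G.add_edges_from([
    ("EGFR", "GRB2"), ("GRB2", "SOS1"), ("SOS1", "KRAS"),
    ("KRAS", "BRAF"), ("BRAF", "MAP2K1"), ("MAP2K1", "MAPK1"),
    ("EGFR", "PLCG1"), ("PLCG1", "PRKCA"),
])

# Restart only from the seed nodes (e.g., proteins of interest in a pathway).
seeds = {"EGFR": 1.0, "MAPK1": 1.0}
scores = nx.pagerank(G, alpha=0.85, personalization=seeds)

# Keep the highest-scoring proteins as the reconstructed subnetwork.
top = sorted(scores, key=scores.get, reverse=True)[:5]
print(list(G.subgraph(top).edges()))
```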


2017
Author(s): Princy Parsana, Claire Ruberman, Andrew E. Jaffe, Michael C. Schatz, Alexis Battle, ...

Background: Gene co-expression networks capture diverse biological relationships between genes and are important tools for predicting gene function and understanding disease mechanisms. Functional interactions between genes have not been fully characterized for most organisms, so reconstruction of gene co-expression networks is of common interest in a variety of settings. However, methods routinely used for reconstructing gene co-expression networks do not account for confounding artifacts known to affect high-dimensional gene expression measurements.

Results: In this study, we show that artifacts such as batch effects in gene expression data confound commonly used network reconstruction algorithms. Both theoretically and empirically, we demonstrate that removing the effects of the top principal components from gene expression measurements prior to network inference can reduce false discoveries, especially when well-annotated technical covariates are not available. Using expression data from the GTEx project across multiple tissues and hundreds of individuals, we show that this latent factor residualization approach often reduces false discoveries in the reconstructed networks.

Conclusion: Network reconstruction is susceptible to confounders that affect measurements of gene expression. Even controlling for the major known technical covariates fails to fully eliminate confounding variation from the data. In studies where a wide range of annotated technical factors are measured and available, correcting gene expression data with multiple covariates can also improve network reconstruction, but such extensive annotations are not always available. Our study shows that principal component correction, which does not depend on study design or annotation of all relevant confounders, removes patterns of artifactual variation and improves network reconstruction in both simulated data and gene expression data from the GTEx project. We have implemented our PC correction approach in the Bioconductor package sva, which can be used prior to network reconstruction with a range of methods.
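
A minimal sketch of the idea of principal component correction prior to network inference (not the sva implementation): remove the part of the expression matrix explained by the top principal components, then build the co-expression network from the residualized data. The matrix sizes, the number of removed PCs, and the simulated batch effect below are illustrative.

```python
import numpy as np

def remove_top_pcs(expr, n_pcs=5):
    """Residualize the top principal components from a samples-by-genes
    expression matrix before network inference (a simplified sketch of
    PC correction, not the sva implementation)."""
    X = expr - expr.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Reconstruct the part of X explained by the top PCs and subtract it.
    top = (U[:, :n_pcs] * S[:n_pcs]) @ Vt[:n_pcs, :]
    return X - top

rng = np.random.default_rng(1)
n_samples, n_genes = 100, 50
batch = rng.normal(size=(n_samples, 1)) @ rng.normal(size=(1, n_genes))  # shared artifact
expr = rng.normal(size=(n_samples, n_genes)) + 3 * batch

corrected = remove_top_pcs(expr, n_pcs=1)
# Mean absolute gene-gene correlation before and after correction:
# the artifact-driven correlations typically drop once the top PC is removed.
print(np.abs(np.corrcoef(expr, rowvar=False)).mean(),
      np.abs(np.corrcoef(corrected, rowvar=False)).mean())
```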


2021, Vol 22 (1)
Author(s): L. F. Signorini, T. Almozlino, R. Sharan

Background: ANAT is a Cytoscape plugin for the inference of functional protein–protein interaction networks in yeast and human. It is a flexible graphical tool for scientists to explore and elucidate the protein–protein interaction pathways of a process under study.

Results: Here we present ANAT3.0, which comes with updated PPI network databases of 544,455 (human) and 155,504 (yeast) interactions, and a new machine-learning layer for refined network elucidation. Together they yield a more than twofold increase in the quality of reconstructing known signaling pathways from KEGG.

Conclusions: ANAT3.0 includes improved network reconstruction algorithms and more comprehensive protein–protein interaction networks than previous versions. ANAT is available for download from the Cytoscape App Store and at https://www.cs.tau.ac.il/~bnet/ANAT/.


2021
Author(s): Marieke Lydia Kuijjer, Kimberly Glass

We recently developed LIONESS, a method to estimate sample-specific networks based on the output of an aggregate network reconstruction approach. In this manuscript, we describe how to apply LIONESS to different network reconstruction algorithms and data types. We highlight how decisions related to data preprocessing may affect the output networks, discuss expected outcomes, and give examples of how to analyze and compare single sample networks.
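
To illustrate how LIONESS wraps around an aggregate reconstruction method, here is a sketch that applies its linear-interpolation idea to a plain Pearson correlation network. The choice of correlation as the aggregate method and the data dimensions are ours; as described above, any method that returns a complete weighted adjacency matrix could be substituted.

```python
import numpy as np

def lioness_correlation(expr):
    """LIONESS-style single-sample networks, sketched with Pearson correlation
    as the aggregate reconstruction method.  For sample q, the interpolation
    commonly stated for LIONESS is:
        net_q = N * (net_all - net_without_q) + net_without_q
    where N is the number of samples and net_without_q leaves sample q out."""
    n_samples = expr.shape[0]
    net_all = np.corrcoef(expr, rowvar=False)
    nets = []
    for q in range(n_samples):
        loo = np.delete(expr, q, axis=0)           # leave sample q out
        net_loo = np.corrcoef(loo, rowvar=False)
        nets.append(n_samples * (net_all - net_loo) + net_loo)
    return np.array(nets)

rng = np.random.default_rng(2)
expr = rng.normal(size=(20, 6))                    # 20 samples x 6 genes (toy data)
single_sample_nets = lioness_correlation(expr)
print(single_sample_nets.shape)                    # (20, 6, 6): one network per sample
```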


2020
Author(s): Omer Sabary, Alexander Yucovich, Guy Shapira, Eitan Yaakobi

In the trace reconstruction problem, a length-n string x yields a collection of noisy copies, called traces, y1, …, yt, where each yi is independently obtained from x by passing through a deletion channel, which deletes every symbol with some fixed probability. The main goal under this paradigm is to determine the minimum number of i.i.d. traces required to reconstruct x with high probability. The trace reconstruction problem can be extended to the model where each trace is the result of x passing through a deletion-insertion-substitution channel, which also introduces insertions and substitutions. Motivated by the DNA storage channel, this work focuses on another variation of the trace reconstruction problem, referred to as the DNA reconstruction problem. A DNA reconstruction algorithm is a mapping which receives t traces y1, …, yt as input and produces x̂, an estimation of x. The goal in the DNA reconstruction problem is to minimize the edit distance between the original string and the algorithm's estimation. For the deletion channel case, the problem is referred to as the deletion DNA reconstruction problem and the goal is to minimize the Levenshtein distance between x and x̂.

In this work, we present several new algorithms for these reconstruction problems. Our algorithms look globally at the entire sequence of traces and use dynamic programming algorithms for the shortest common supersequence and the longest common subsequence problems in order to decode the original sequence. Our algorithms do not require any limitations on the input or the number of traces; moreover, they perform well even for error probabilities as high as 0.27. The algorithms have been tested on simulated data as well as on data from previous DNA experiments and are shown to outperform all previous algorithms.
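
As a small illustration of the dynamic-programming building blocks mentioned above (and not the authors' full decoding algorithm), here is the classic longest-common-subsequence recurrence applied to two made-up traces of a short DNA string.

```python
def lcs_length(s, t):
    """Classic longest-common-subsequence dynamic program; LCS/SCS-style
    recurrences of this kind are the building blocks the abstract mentions."""
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

# Two made-up traces of the (unknown) string "ACGTACGT" after independent deletions.
print(lcs_length("ACGACGT", "ACGTACT"))   # 6 symbols survive in both traces
```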


2019
Author(s): Marieke L. Kuijjer, John Quackenbush, Kimberly Glass

Summary: We recently developed LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples), a method that can be used together with network reconstruction algorithms to extract networks for individual samples in a population. LIONESS was originally made available as a function within the PANDA (Passing Attributes between Networks for Data Assimilation) regulatory network reconstruction framework. In this application note, we describe lionessR, an R implementation of LIONESS that can be applied to any network reconstruction method in R that outputs a complete, weighted adjacency matrix. As an example, we use lionessR to model single-sample co-expression networks on a bone cancer dataset, and show how lionessR can be used to identify differential co-expression between two groups of patients.

Availability and implementation: The lionessR open source R package, which includes a vignette of the application, is freely available at https://github.com/mararie/lionessR.


Author(s): Pamela Fleischmann, Marie Lejeune, Florin Manea, Dirk Nowotka, Michel Rigo

A reconstruction problem of words from scattered factors asks for the minimal information, such as multisets of scattered factors of a given length or the numbers of occurrences of scattered factors from a given set, necessary to uniquely determine a word. We show that a word [Formula: see text] can be reconstructed from the number of occurrences of at most [Formula: see text] scattered factors of the form [Formula: see text], where [Formula: see text] is the number of occurrences of the letter [Formula: see text] in [Formula: see text]. Moreover, we generalise the result to alphabets of the form [Formula: see text] by showing that at most [Formula: see text] scattered factors suffice to reconstruct [Formula: see text]. Both results improve on the upper bounds known so far. Time complexity bounds on reconstruction algorithms are also considered.
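
Since the results are stated in terms of numbers of occurrences of scattered factors (i.e., subsequences), here is a short, standard dynamic-programming sketch for computing that quantity for a given word and factor; the function name and examples are ours, not the paper's algorithms.

```python
def count_scattered_factor(word, factor):
    """Count occurrences of `factor` as a scattered factor (subsequence)
    of `word` via standard dynamic programming."""
    dp = [1] + [0] * len(factor)          # dp[j] = number of ways to match factor[:j]
    for ch in word:
        # Traverse j backwards so each position of `word` is used at most once per match.
        for j in range(len(factor), 0, -1):
            if factor[j - 1] == ch:
                dp[j] += dp[j - 1]
    return dp[-1]

# Occurrence counts of the kind the reconstruction results are stated in terms of:
print(count_scattered_factor("abab", "ab"))   # 3: positions (1,2), (1,4), (3,4)
print(count_scattered_factor("aabb", "ab"))   # 4
```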

