PathExt: a general framework for path-based mining of omics-integrated biological networks

Author(s):  
Narmada Sambaturu ◽  
Vaidehi Pusadkar ◽  
Sridhar Hannenhalli ◽  
Nagasuma Chandra

Abstract Motivation Transcriptomes are routinely used to prioritize genes underlying specific phenotypes. Current approaches largely focus on differentially expressed genes (DEGs), despite the recognition that phenotypes emerge via a network of interactions between genes and proteins, many of which may not be differentially expressed. Furthermore, many practical applications lack sufficient samples or an appropriate control to robustly identify statistically significant DEGs. Results We provide a computational tool, PathExt, which, in contrast to differential genes, identifies differentially active paths when a control is available, and most active paths otherwise, in an omics-integrated biological network. The sub-network comprising such paths, referred to as the TopNet, captures the most relevant genes and processes underlying the specific biological context. The TopNet forms a well-connected graph, reflecting the tight orchestration in biological systems. Two key advantages of PathExt are (i) it can extract characteristic genes and pathways even when only a single sample is available, and (ii) it can be used to study a system even in the absence of an appropriate control. We demonstrate the utility of PathExt via two diverse sets of case studies, to characterize (i) Mycobacterium tuberculosis response upon exposure to 18 antibacterial drugs where only one transcriptomic sample is available for each exposure; and (ii) tissue-relevant genes and processes using transcriptomic data for 39 human tissues. Overall, PathExt is a general tool for prioritizing context-relevant genes in any omics-integrated biological network for any condition(s) of interest, even with a single sample or in the absence of appropriate controls. Availability and implementation The source code for PathExt is available at https://github.com/NarmadaSambaturu/PathExt. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
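The idea of ranking paths rather than individual genes can be illustrated with a minimal sketch. It is not PathExt's implementation: the abstract does not give the scoring scheme, so the edge cost (the inverse of the product of the two node activities), the per-edge normalization, and the names `shortest_path` and `top_net` are all assumptions made for illustration.

```python
import heapq
from itertools import combinations


def shortest_path(adj, cost, src, dst):
    """Dijkstra on an undirected graph; returns (total_cost, path) or None."""
    heap = [(0.0, src, [src])]
    seen = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return dist, path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adj[node]:
            if nxt not in seen:
                heapq.heappush(heap, (dist + cost(node, nxt), nxt, path + [nxt]))
    return None


def top_net(edges, activity, k):
    """Union of the edges on the k cheapest shortest paths (hypothetical scoring)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def cost(u, v):
        # Assumption: highly active endpoints make an edge cheap to traverse.
        return 1.0 / (activity[u] * activity[v])

    scored = []
    for s, t in combinations(sorted(adj), 2):
        found = shortest_path(adj, cost, s, t)
        if found is not None:
            total, path = found
            # Normalize by path length so long paths are not penalized.
            scored.append((total / (len(path) - 1), path))
    scored.sort()

    net = set()
    for _, path in scored[:k]:
        for u, v in zip(path, path[1:]):
            net.add(tuple(sorted((u, v))))
    return net


edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
activity = {"A": 10.0, "B": 10.0, "C": 10.0, "D": 0.1}
print(top_net(edges, activity, k=3))
```

In this toy network the weakly active node D is excluded from the resulting sub-network, even though no control sample or differential-expression test is involved.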

2020 ◽  
Author(s):  
Narmada Sambaturu ◽  
Vaidehi Pusadkar ◽  
Sridhar Hannenhalli ◽  
Nagasuma Chandra

Abstract Motivation Large scale transcriptomic data are routinely used to prioritize genes underlying specific phenotypes. Current approaches largely focus on differentially expressed genes (DEGs), despite the recognition that phenotypes emerge via a network of interactions between genes and proteins, many of which may not be differentially expressed. Furthermore, many practical applications lack sufficient samples or an appropriate control to robustly identify statistically significant DEGs. Results We provide a computational tool, PathExt, which, in contrast to differential genes, identifies differentially active paths when a control is available, and most active paths otherwise, in an omics-integrated biological network. The sub-network comprising such paths, referred to as the TopNet, captures the most relevant genes and processes underlying the specific biological context. The TopNet forms a well-connected graph, reflecting the tight orchestration in biological systems. Two key advantages of PathExt are (i) it can extract characteristic genes and pathways even when only a single sample is available, and (ii) it can be used to study a system even in the absence of an appropriate control. We demonstrate the utility of PathExt via two diverse sets of case studies, to characterize (a) Mycobacterium tuberculosis (M.tb) response upon exposure to 18 antibacterial drugs where only one transcriptomic sample is available for each exposure; and (b) tissue-relevant genes and processes using transcriptomic data from GTEx (Genotype-Tissue Expression) for 39 human tissues. Overall, PathExt is a general tool for prioritizing context-relevant genes in any omics-integrated biological network for any condition(s) of interest, even with a single sample or in the absence of appropriate controls. Availability The source code for PathExt is available at https://github.com/NarmadaSambaturu/[email protected], [email protected]


Author(s):  
Lun Hu ◽  
Jun Zhang ◽  
Xiangyu Pan ◽  
Hong Yan ◽  
Zhu-Hong You

Abstract Motivation Clustering analysis in a biological network aims to group biological entities into functional modules, thus providing valuable insight into complex biological systems. Existing clustering techniques make use of lower-order connectivity patterns at the level of individual biological entities and their connections, but few of them can take into account higher-order connectivity patterns at the level of small network motifs. Results Here, we present a novel clustering framework, namely HiSCF, to identify functional modules based on the higher-order structure information available in a biological network. Taking advantage of a higher-order Markov stochastic process, HiSCF is able to perform the clustering analysis by exploiting a variety of network motifs. When compared with several state-of-the-art clustering models, HiSCF yields the best performance in terms of accuracy for two practical clustering applications, i.e. protein complex identification and gene co-expression module detection. The promising performance of HiSCF demonstrates that considering higher-order network motifs yields new insight into the analysis of biological networks, such as the identification of overlapping protein complexes and the inference of new signaling pathways, and also reveals the rich higher-order organizational structures present in biological networks. Availability and implementation HiSCF is available at https://github.com/allenv5/HiSCF. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


Genes ◽  
2021 ◽  
Vol 12 (8) ◽  
pp. 1113
Author(s):  
Michael Schwabe ◽  
Sven Griep ◽  
Henrike Schmidtberg ◽  
Rudy Plarre ◽  
Alexander Goesmann ◽  
...  

The clothes moth Tineola bisselliella is one of a few insects that can digest keratin, leading to the destruction of clothing, textiles and artwork. The mechanism of keratin digestion is not yet fully understood, partly reflecting the lack of publicly available genomic and transcriptomic data. Here we present a high-quality gut transcriptome of T. bisselliella generated from larvae reared on keratin-rich and keratin-free diets. The overall transcriptome consists of 428,221 contigs that were functionally annotated and screened for candidate enzymes involved in keratin utilization. As a mechanism for keratin digestion, we identified cysteine synthases, cystathionine β-synthases and cystathionine γ-lyases. These enzymes release hydrogen sulfite, which may reduce the disulfide bonds in keratin. The dataset also included 27 differentially expressed contigs with trypsin domains, among which 20 were associated with keratin feeding. Finally, we identified seven collagenases that were upregulated on the keratin-rich diet. In addition to this enzymatic repertoire potentially involved in breaking down keratin, our analysis of poly(A)-enriched and poly(A)-depleted transcripts suggested that T. bisselliella larvae possess an unstable intestinal microbiome that may nevertheless contribute to keratin digestion.


Author(s):  
Peter Ebert ◽  
Marcel H Schulz

Abstract Motivation The generation of genome-wide maps of histone modifications using chromatin immunoprecipitation sequencing (ChIP-seq) is a standard approach to dissect the complexity of the epigenome. Interpretation and differential analysis of histone datasets remain challenging due to regulatorily meaningful co-occurrences of histone marks and their differences in genomic spread. To ease interpretation, chromatin state segmentation maps are a commonly employed abstraction combining individual histone marks. We developed the tool SCIDDO as a fast, flexible, and statistically sound method for the differential analysis of chromatin state segmentation maps. Results We demonstrate the utility of SCIDDO in a comparative analysis that identifies differential chromatin domains (DCDs) in various regulatory contexts and with only moderate computational resources. We show that the identified DCDs correlate well with observed changes in gene expression and can recover a substantial number of differentially expressed genes. We showcase SCIDDO’s ability to directly interrogate chromatin dynamics such as enhancer switches in downstream analysis, which simplifies exploring specific questions about regulatory changes in chromatin. By comparing SCIDDO to competing methods, we provide evidence that SCIDDO’s performance in identifying differentially expressed genes (DEGs) via differential chromatin marking is more stable across a range of cell-type comparisons and parameter cut-offs. Availability The SCIDDO source code is openly available at github.com/ptrebert/sciddo. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Jose Lugo-Martinez ◽  
Daniel Zeiberg ◽  
Thomas Gaudelet ◽  
Noël Malod-Dognin ◽  
Natasa Przulj ◽  
...  

Abstract Motivation Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins and drugs) and edges represent relational ties between these objects (binds-to, interacts-with and regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. Results We present a hypergraph-based approach for modeling biological systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs. We then introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of hypergraphlets; i.e. small hypergraphs rooted at a vertex of interest. We empirically evaluate this method on fifteen biological networks and show its potential use in a positive-unlabeled setting to estimate the interactome sizes in various species. Availability and implementation https://github.com/jlugomar/hypergraphlet-kernels Supplementary information Supplementary data are available at Bioinformatics online.
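The dual hypergraph mentioned above swaps the roles of vertices and hyperedges: each hyperedge becomes a vertex, and each original vertex becomes a hyperedge containing the hyperedges it belonged to. A minimal sketch of that standard construction follows; the dictionary representation and the name `dual_hypergraph` are assumptions for illustration and are not taken from the authors' code.

```python
def dual_hypergraph(hyperedges):
    """Return the dual of a hypergraph.

    `hyperedges` maps a hyperedge name to the set of vertices it contains.
    The result maps each vertex to the set of hyperedge names containing it.
    """
    dual = {}
    for name, members in hyperedges.items():
        for vertex in members:
            dual.setdefault(vertex, set()).add(name)
    return dual


# Toy example: two protein complexes sharing one protein (hypothetical names).
complexes = {"complex1": {"p1", "p2", "p3"}, "complex2": {"p3", "p4"}}
print(dual_hypergraph(complexes))
```

Under this view, a question about hyperedges (for example, classifying complexes) becomes a question about vertices of the dual, which is the kind of reduction the abstract describes.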


2020 ◽  
Vol 36 (9) ◽  
pp. 2934-2935 ◽  
Author(s):  
Yi Zheng ◽  
Fangqing Zhao

Abstract Summary Circular RNAs (circRNAs) have been shown to have unique compositions and splicing events distinct from canonical mRNAs. However, there is no visualization tool designed for the exploration of complex splicing patterns in circRNA transcriptomes. Here, we present CIRI-vis, a Java command-line tool for quantifying and visualizing circRNAs by integrating the alignments and junctions of circular transcripts. CIRI-vis can be applied to visualize the internal structure and isoform abundance of circRNAs and perform circRNA transcriptome comparison across multiple samples. Availability and implementation https://sourceforge.net/projects/ciri/files/CIRI-vis. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Reagon Karki ◽  
Alpha Tom Kodamullil ◽  
Charles Tapley Hoyt ◽  
Martin Hofmann-Apitius

Abstract Background Literature-derived knowledge assemblies have been used as an effective way of representing biological phenomena and understanding disease etiology in systems biology. These include canonical pathway databases such as KEGG, Reactome and WikiPathways and disease-specific network inventories such as the causal biological networks database, PD map and NeuroMMSig. The knowledge represented in these resources is qualitative and focuses mainly on the causal relationships between biological entities. Genes, the major constituents of knowledge representations, tend to be expressed differentially across conditions such as cell types, brain regions and disease stages. A classical approach to interpreting a knowledge assembly is to explore the expression patterns of the individual genes. However, an approach that enables quantification of the overall impact of differentially expressed genes in the corresponding network is still lacking. Results Using the concept of heat diffusion, we have devised an algorithm that is able to calculate the magnitude of regulation of a biological network using expression datasets. We have demonstrated that molecular mechanisms specific to Alzheimer's disease (AD) and Parkinson's disease (PD) are regulated with different intensities across spatial and temporal resolutions. Our approach shows that mitochondrial dysfunction in PD is severe in the cortex and in advanced-stage PD patients. Similarly, we have shown that the intensity of aggregation of neurofibrillary tangles (NFTs) in AD increases as the disease progresses. This finding is in concordance with previous studies that explain the burden of NFTs in stages of AD. Conclusions This study is one of the first attempts to enable quantification of mechanisms represented as biological networks. We have been able to quantify the magnitude of regulation of a biological network and illustrate that the magnitudes differ across spatial and temporal resolutions.
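The general concept of heat diffusion on a network can be sketched as follows. This is not the authors' algorithm, which the abstract does not specify: the explicit update rule (each node exchanges heat with its neighbours in proportion to the difference), the parameters `alpha` and `steps`, and the name `diffuse_heat` are assumptions chosen for illustration. Initial heat could stand in for, say, an expression change per gene.

```python
def diffuse_heat(edges, initial_heat, alpha=0.1, steps=50):
    """Explicit heat diffusion on an undirected graph.

    Each step applies h <- h - alpha * L h, where L is the graph Laplacian.
    Every node named in `edges` must have an entry in `initial_heat`.
    """
    neighbors = {node: set() for node in initial_heat}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    heat = dict(initial_heat)
    for _ in range(steps):
        # The comprehension reads the previous `heat` and builds the next state.
        heat = {
            node: h + alpha * sum(heat[n] - h for n in neighbors[node])
            for node, h in heat.items()
        }
    return heat


# Toy path network A - B - C with all heat initially on A.
edges = [("A", "B"), ("B", "C")]
start = {"A": 3.0, "B": 0.0, "C": 0.0}
print(diffuse_heat(edges, start, alpha=0.1, steps=1))
print(diffuse_heat(edges, start, alpha=0.1, steps=200))
```

Total heat is conserved at every step, and on a connected network the values approach a uniform distribution; how far heat has spread after a fixed number of steps is one way such a scheme could summarize the influence of perturbed genes on their network neighbourhood.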


Author(s):  
Congting Ye ◽  
Qian Zhou ◽  
Xiaohui Wu ◽  
Chen Yu ◽  
Guoli Ji ◽  
...  

Abstract Motivation Alternative polyadenylation (APA) plays a key post-transcriptional regulatory role in mRNA stability and function in eukaryotes. Single-cell RNA-seq (scRNA-seq) is a powerful tool to discover cellular heterogeneity at the gene expression level. Given its 3′-enriched strategy in library construction, the most commonly used scRNA-seq protocol, 10× Genomics, enables us to improve the resolution of APA studies to the single-cell level. However, there is currently no computational tool available for investigating APA profiles from scRNA-seq data. Results Here, we present a package, scDAPA, for detecting and visualizing dynamic APA from scRNA-seq data. Taking bam/sam files and cell cluster labels as inputs, scDAPA detects APA dynamics using a histogram-based method and the Wilcoxon rank-sum test, and visualizes candidate genes with dynamic APA. Benchmarking results demonstrated that scDAPA can effectively identify genes with dynamic APA among different cell groups from scRNA-seq data. Availability and implementation The scDAPA package is implemented in Shell and R, and is freely available at https://scdapa.sourceforge.io. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Srikanth Ravichandran ◽  
András Hartmann ◽  
Antonio del Sol

Abstract Summary Single-cell RNA-sequencing is increasingly employed to characterize disease or ageing cell subpopulation phenotypes. Despite the exponential increase in data generation, systematic identification of key regulatory factors for controlling cellular phenotype to enable cell rejuvenation in disease or ageing remains a challenge. Here, we present SigHotSpotter, a computational tool to predict hotspots of signaling pathways responsible for the stable maintenance of cell subpopulation phenotypes, by integrating signaling and transcriptional networks. Targeted perturbation of these signaling hotspots can enable precise control of cell subpopulation phenotypes. SigHotSpotter correctly predicts the signaling hotspots with known experimental validations in different cellular systems. The tool is simple and user-friendly, and is available as a web server or as stand-alone software. We believe SigHotSpotter will serve as a general-purpose tool for the systematic prediction of signaling hotspots based on single-cell RNA-seq data, and potentiate novel cell rejuvenation strategies in the context of disease and ageing. Availability and implementation SigHotSpotter is available as a web tool at https://SigHotSpotter.lcsb.uni.lu. Source code, example datasets and other information are available at https://gitlab.com/srikanth.ravichandran/sighotspotter. Supplementary information Supplementary data are available at Bioinformatics online.

