Causal network perturbations for instance-specific analysis of single cell and disease samples

2019 ◽  
Vol 36 (8) ◽  
pp. 2515-2521 ◽  
Author(s):  
Kristina L Buschur ◽  
Maria Chikina ◽  
Panayiotis V Benos

Abstract Motivation Complex diseases involve perturbation in multiple pathways and a major challenge in clinical genomics is characterizing pathway perturbations in individual samples. This can lead to patient-specific identification of the underlying mechanism of disease thereby improving diagnosis and personalizing treatment. Existing methods rely on external databases to quantify pathway activity scores. This approach ignores dependencies in the data and the fact that curated pathways are incomplete or condition-specific. Results ssNPA is a new approach for subtyping samples based on deregulation of their gene networks. ssNPA learns a causal graph directly from control data. Sample-specific network neighborhood deregulation is quantified via the error incurred in predicting the expression of each gene from its Markov blanket. We evaluate the performance of ssNPA on liver development single-cell RNA-seq data, where the correct cell timing is recovered; and two TCGA datasets, where ssNPA patient clusters have significant survival differences. In all analyses ssNPA consistently outperforms alternative methods, highlighting the advantage of network-based approaches. Availability and implementation http://www.benoslab.pitt.edu/Software/ssnpa/. Supplementary information Supplementary data are available at Bioinformatics online.
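The deregulation score described in this abstract can be illustrated with a minimal sketch. It is not the ssNPA implementation: it assumes that the Markov blanket of each gene is already known (the causal graph learning step is not shown), that an ordinary least-squares model is an acceptable predictor, and that squared error is the deregulation measure. The function names `fit_blanket_models` and `deregulation_features` and the `blankets` input are hypothetical.

```python
import numpy as np


def fit_blanket_models(controls, blankets):
    """Fit one least-squares model per gene from its Markov blanket.

    controls: (n_samples, n_genes) array of control expression values.
    blankets: dict mapping a gene index to the list of gene indices in
              its Markov blanket (hypothetical input; graph learning is
              assumed to have happened elsewhere).
    """
    models = {}
    for gene, blanket in blankets.items():
        # Design matrix: intercept column plus the blanket genes.
        X = np.column_stack([np.ones(len(controls)), controls[:, blanket]])
        coef, *_ = np.linalg.lstsq(X, controls[:, gene], rcond=None)
        models[gene] = (list(blanket), coef)
    return models


def deregulation_features(samples, models):
    """Return the squared prediction error for every sample and modelled gene."""
    errors = np.zeros((len(samples), len(models)))
    for col, (gene, (blanket, coef)) in enumerate(models.items()):
        X = np.column_stack([np.ones(len(samples)), samples[:, blanket]])
        errors[:, col] = (samples[:, gene] - X @ coef) ** 2
    return errors
```

Under these assumptions, each row of the returned matrix is a per-sample feature vector that could be passed to any clustering method; samples whose gene relationships match the controls score near zero, while samples that break those relationships score high.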

2019 ◽  
Author(s):  
Kristina L. Buschur ◽  
Maria Chikina ◽  
Panayiotis V. Benos

Abstract Complex diseases involve perturbation in multiple pathways and a major challenge in clinical genomics is characterizing pathway perturbations in individual samples. This can lead to patient-specific identification of the underlying mechanism of disease thereby improving diagnosis and personalizing treatment. Existing methods rely on external databases to quantify pathway activity scores. This approach ignores dependencies in the data and the fact that curated pathways are incomplete or condition-specific. ssNPA is a new approach for subtyping samples based on deregulation of their gene networks. ssNPA learns a causal graph directly from control data. Sample-specific network neighborhood deregulation is quantified via the error incurred in predicting the expression of each gene from its Markov blanket. We evaluate the performance of ssNPA on liver development single-cell RNA-seq data, where the correct cell timing is recovered. In all analyses ssNPA consistently outperforms alternative methods, highlighting the advantage of network-based approaches.


2018 ◽  
Vol 35 (16) ◽  
pp. 2843-2846 ◽  
Author(s):  
Hung Nguyen ◽  
Sangam Shrestha ◽  
Sorin Draghici ◽  
Tin Nguyen

Abstract Summary Since cancer is a heterogeneous disease, tumor subtyping is crucial for improved treatment and prognosis. We have developed a subtype discovery tool, called PINSPlus, that is: (i) robust against noise and unstable quantitative assays, (ii) able to integrate multiple types of omics data in a single analysis and (iii) dramatically superior to established approaches in identifying known subtypes and novel subgroups with significant survival differences. Our validation on 12,158 samples from 44 datasets shows that PINSPlus vastly outperforms other approaches. The software is easy-to-use and can partition hundreds of patients in a few minutes on a personal computer. Availability and implementation The package is available at https://cran.r-project.org/package=PINSPlus. Data and R script used in this manuscript are available at https://bioinformatics.cse.unr.edu/software/PINSPlus/. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Sumit Mukherjee ◽  
Alberto Carignano ◽  
Georg Seelig ◽  
Su-In Lee

Abstract Identifying the gene regulatory networks that control development or disease is one of the most important problems in biology. Here, we introduce a computational approach, called PIPER (ProgressIve network PERturbation), to identify the perturbed genes that drive differences in the gene regulatory network across different points in a biological progression. PIPER employs algorithms tailor-made for single cell RNA sequencing (scRNA-seq) data to jointly identify gene networks for multiple progressive conditions. It then performs differential network analysis along the identified gene networks to identify master regulators. We demonstrate that PIPER outperforms state-of-the-art alternative methods on simulated data and is able to predict known key regulators of differentiation on real scRNA-seq datasets.


Gut ◽  
2021 ◽  
pp. gutjnl-2020-322835
Author(s):  
Elias Orouji ◽  
Ayush T Raman ◽  
Anand K Singh ◽  
Alexey Sorokin ◽  
Emre Arslan ◽  
...  

Objective Enhancer aberrations are beginning to emerge as a key epigenetic feature of colorectal cancers (CRC); however, chromatin state patterns in tumour progression, the heterogeneity of these patterns and the therapeutic opportunities they impart remain poorly described. Design We performed comprehensive epigenomic characterisation by mapping 222 chromatin profiles from 69 samples (33 colorectal adenocarcinomas, 4 adenomas, 21 matched normal tissues and 11 colon cancer cell lines) for six histone modification marks: H3K4me3 for Pol II-bound and CpG-rich promoters, H3K4me1 for poised enhancers, H3K27ac for enhancers and transcriptionally active promoters, H3K79me2 for transcribed regions, H3K27me3 for polycomb repressed regions and H3K9me3 for heterochromatin. Results We demonstrate that the H3K27ac-marked active enhancer state could distinguish between different stages of CRC progression. By epigenomic editing, we present evidence that gains of tumour-specific enhancers for crucial oncogenes, such as ASCL2 and FZD10, were required for excessive proliferation. Consistently, combination of MEK plus bromodomain inhibition was found to have synergistic effects in CRC patient-derived xenograft models. Probing intertumour heterogeneity, we identified four distinct enhancer subtypes (EPIgenome-based Classification, EpiC), three of which correlate well with previously defined transcriptomic subtypes (consensus molecular subtypes, CMSs). Importantly, CMS2 can be divided into two EpiC subgroups with significant survival differences. Leveraging such correlation, we devised a combinatorial therapeutic strategy of enhancer-blocking bromodomain inhibitors with pathway-specific inhibitors (PARPi, EGFRi, TGFβi, mTORi and SRCi) for EpiC groups. Conclusion Our data suggest that the dynamics of active enhancers underlie CRC progression and that patient-specific enhancer patterns can be leveraged for precision combination therapy.


Author(s):  
Yuzhou Chang ◽  
Carter Allen ◽  
Changlin Wan ◽  
Dongjun Chung ◽  
Chi Zhang ◽  
...  

Abstract Motivation Single-cell RNA-Seq (scRNA-Seq) data is useful in discovering cell heterogeneity and signature genes in specific cell populations in cancer and other complex diseases. Specifically, the investigation of condition-specific functional gene modules (FGM) can help to understand interactive gene networks and complex biological processes in different cell clusters. QUBIC2 is recognized as one of the most efficient and effective biclustering tools for condition-specific FGM identification from scRNA-Seq data. However, because it is available only as a C implementation, its application has been restricted to a few downstream analysis functionalities. We developed an R package named IRIS-FGM (Integrative scRNA-Seq Interpretation System for Functional Gene Module analysis) to support the investigation of FGMs and cell clustering using scRNA-Seq data. Empowered by QUBIC2, IRIS-FGM can effectively identify condition-specific FGMs, predict cell types/clusters, uncover differentially expressed genes, and perform pathway enrichment analysis. It is noteworthy that IRIS-FGM can also take Seurat objects as input, facilitating easy integration with the existing analysis pipeline. Availability and Implementation IRIS-FGM is implemented in the R environment (as of version 3.6) with the source code freely available at https://github.com/BMEngineeR/IRISFGM. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Cesim Erten ◽  
Aissa Houdjedj ◽  
Hilal Kazan

Abstract Background Recent cancer genomic studies have generated detailed molecular data on a large number of cancer patients. A key remaining problem in cancer genomics is the identification of driver genes. Results We propose BetweenNet, a computational approach that integrates genomic data with a protein-protein interaction network to identify cancer driver genes. BetweenNet utilizes a measure based on betweenness centrality on patient-specific networks to identify the so-called outlier genes that correspond to dysregulated genes for each patient. Setting up the relationship between the mutated genes and the outliers through a bipartite graph, it employs a random-walk process on the graph, which provides the final prioritization of the mutated genes. We compare BetweenNet against state-of-the-art cancer gene prioritization methods on lung, breast, and pan-cancer datasets. Conclusions Our evaluations show that BetweenNet is better at recovering known cancer genes based on multiple reference databases. Additionally, we show that the GO terms and the reference pathways enriched in BetweenNet ranked genes and those that are enriched in known cancer genes overlap significantly when compared to the overlaps achieved by the rankings of the alternative methods.
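The prioritization step in this abstract, a random walk over a bipartite graph linking mutated genes to outlier genes, can be sketched as follows. The abstract does not specify the walk, so a random walk with restart is swapped in here as an assumption; the function name, the `restart_prob` default and the choice of restart distribution are all hypothetical and not taken from BetweenNet.

```python
import numpy as np


def random_walk_with_restart(adjacency, restart, restart_prob=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W @ p + r * p0 until the change is below tol.

    adjacency: symmetric (n, n) 0/1 matrix of the graph.
    restart:   length-n non-negative weights for the restart distribution.
    """
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    # Column-normalise so each column is a transition distribution;
    # isolated nodes keep an all-zero column.
    W = A / np.where(col_sums == 0, 1.0, col_sums)
    p0 = np.asarray(restart, dtype=float)
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        nxt = (1 - restart_prob) * (W @ p) + restart_prob * p0
        if np.abs(nxt - p).sum() < tol:
            return nxt
        p = nxt
    return p
```

In this sketch, mutated genes would be ranked by their entries in the returned vector; a mutated gene connected to more outlier genes accumulates more probability mass than one connected to fewer.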


Author(s):  
Irzam Sarfraz ◽  
Muhammad Asif ◽  
Joshua D Campbell

Abstract Motivation R Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for storing one or more matrix-like assays along with associated row and column data. These objects have been used to facilitate the storage and analysis of high-throughput genomic data generated from technologies such as single-cell RNA sequencing. One common computational task in many genomics analysis workflows is to perform subsetting of the data matrix before applying downstream analytical methods. For example, one may need to subset the columns of the assay matrix to exclude poor-quality samples or subset the rows of the matrix to select the most variable features. Traditionally, a second object is created that contains the desired subset of assay from the original object. However, this approach is inefficient as it requires the creation of an additional object containing a copy of the original assay and leads to challenges with data provenance. Results To overcome these challenges, we developed an R package called ExperimentSubset, which is a data container that implements classes for efficient storage and streamlined retrieval of assays that have been subsetted by rows and/or columns. These classes are able to inherently provide data provenance by maintaining the relationship between the subsetted and parent assays. We demonstrate the utility of this package on a single-cell RNA-seq dataset by storing and retrieving subsets at different stages of the analysis while maintaining a lower memory footprint. Overall, the ExperimentSubset is a flexible container for the efficient management of subsets. Availability and implementation The ExperimentSubset package is available at Bioconductor: https://bioconductor.org/packages/ExperimentSubset/ and GitHub: https://github.com/campbio/ExperimentSubset. Supplementary information Supplementary data are available at Bioinformatics online.


Cancers ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 173
Author(s):  
Maria Adamaki ◽  
Vassilios Zoumpourlis

Prostate cancer (PCa) is the most frequently diagnosed type of cancer among Caucasian males over the age of 60 and is characterized by remarkable heterogeneity and clinical behavior, ranging from decades of indolence to highly lethal disease. Despite the significant progress in PCa systemic therapy, therapeutic response is usually transient, and invasive disease is associated with high mortality rates. Immunotherapy has emerged as an efficacious and non-toxic treatment alternative that perfectly fits the rationale of precision medicine, as it aims to treat patients on the basis of patient-specific, immune-targeted molecular traits, so as to achieve the maximum clinical benefit. Antibodies acting as immune checkpoint inhibitors and vaccines entailing tumor-specific antigens seem to be the most promising immunotherapeutic strategies in offering a significant survival advantage. Even though patients with localized disease and favorable prognostic characteristics seem to be the ones that markedly benefit from such interventions, there is substantial evidence to suggest that the survival benefit may also be extended to patients with more advanced disease. The identification of biomarkers that can be immunologically targeted in patients with disease progression is potentially amenable in this process and in achieving significant advances in the decision for precision treatment of PCa.


Author(s):  
Givanna H Putri ◽  
Irena Koprinska ◽  
Thomas M Ashhurst ◽  
Nicholas J C King ◽  
Mark N Read

Abstract Motivation Many ‘automated gating’ algorithms now exist to cluster cytometry and single-cell sequencing data into discrete populations. Comparative algorithm evaluations on benchmark datasets rely either on a single performance metric, or a few metrics considered independently of one another. However, single metrics emphasize different aspects of clustering performance and do not rank clustering solutions in the same order. This underlies the lack of consensus between comparative studies regarding optimal clustering algorithms and undermines the translatability of results onto other non-benchmark datasets. Results We propose the Pareto fronts framework as an integrative evaluation protocol, wherein individual metrics are instead leveraged as complementary perspectives. Judged superior are algorithms that provide the best trade-off between the multiple metrics considered simultaneously. This yields a more comprehensive and complete view of clustering performance. Moreover, by broadly and systematically sampling algorithm parameter values using the Latin Hypercube sampling method, our evaluation protocol minimizes (un)fortunate parameter value selections as confounding factors. Furthermore, it reveals how meticulously each algorithm must be tuned in order to obtain good results, vital knowledge for users with novel data. We exemplify the protocol by conducting a comparative study between three clustering algorithms (ChronoClust, FlowSOM and Phenograph) using four common performance metrics applied across four cytometry benchmark datasets. To our knowledge, this is the first time Pareto fronts have been used to evaluate the performance of clustering algorithms in any application domain. Availability and implementation Implementation of our Pareto front methodology and all scripts and datasets to reproduce this article are available at https://github.com/ghar1821/ParetoBench. Supplementary information Supplementary data are available at Bioinformatics online.
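The trade-off idea in this abstract can be made concrete with a small sketch of Pareto-front selection. It assumes every metric is oriented so that higher is better; the function name `pareto_front` and the example scores are hypothetical and are not taken from the ParetoBench repository.

```python
def pareto_front(scores):
    """Return the names of solutions that no other solution dominates.

    scores: dict mapping a solution name to a tuple of metric values,
            all oriented so that higher is better (an assumption).
    """
    def dominates(a, b):
        # a dominates b if it is at least as good on every metric
        # and strictly better on at least one.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return [
        name for name, s in scores.items()
        if not any(dominates(other, s)
                   for other_name, other in scores.items()
                   if other_name != name)
    ]
```

For example, with the made-up scores `{"A": (0.9, 0.5), "B": (0.6, 0.8), "C": (0.5, 0.4)}`, solutions A and B are both on the front because neither beats the other on every metric, while C is dominated by A and drops out.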


Author(s):  
Martin Pirkl ◽  
Niko Beerenwinkel

Abstract Motivation Cancer is one of the most prevalent diseases in the world. Tumors arise due to important genes changing their activity, e.g. when inhibited or over-expressed. But these gene perturbations are difficult to observe directly. Molecular profiles of tumors can provide indirect evidence of gene perturbations. However, inferring perturbation profiles from molecular alterations is challenging due to error-prone molecular measurements and incomplete coverage of all possible molecular causes of gene perturbations. Results We have developed a novel mathematical method to analyze cancer driver genes and their patient-specific perturbation profiles. We combine genetic aberrations with gene expression data in a causal network derived across patients to infer unobserved perturbations. We show that our method can predict perturbations in simulations, CRISPR perturbation screens and breast cancer samples from The Cancer Genome Atlas. Availability and implementation The method is available as the R-package nempi at https://github.com/cbg-ethz/nempi and http://bioconductor.org/packages/nempi. Supplementary information Supplementary data are available at Bioinformatics online.

