Disease characterization using a partial correlation-based sample-specific network

Abstract A single-sample network (SSN) is a biological molecular network constructed from single-sample data given a reference dataset and can provide insights into the mechanisms of individual diseases and aid in the development of personalized medicine. In this study, we proposed a computational method, a partial correlation-based single-sample network (P-SSN), which not only infers a network from each single-sample data given a reference dataset but also retains the direct interactions by excluding indirect interactions (https://github.com/hyhRise/P-SSN). By applying P-SSN to analyze tumor data from the Cancer Genome Atlas and single cell data, we validated the effectiveness of P-SSN in predicting driver mutation genes (DMGs), producing network distance, identifying subtypes and further classifying single cells. In particular, P-SSN is highly effective in predicting DMGs based on single-sample data. P-SSN is also efficient for subtyping complex diseases and for clustering single cells by introducing network distance between any two samples.
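
As a rough illustration of what retaining direct interactions by excluding indirect ones can mean, the sketch below computes a first-order partial correlation between two genes while controlling for a third. It is not the P-SSN implementation from the repository above; the function names, the toy data and the choice to condition on a single gene are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): genes x and y are both driven by gene z.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]

print(round(pearson(x, y), 2))          # strong apparent association
print(round(partial_corr(x, y, z), 2))  # much weaker once z is controlled for
```

In this toy case the plain correlation between x and y is high only because both follow z, and the partial correlation drops toward zero, which is the kind of indirect edge a partial correlation-based network would be expected to discard.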

Transposable element expression in tumors is associated with immune infiltration and increased antigenicity

Nature Communications ◽

10.1038/s41467-019-13035-2 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 16

Author(s):

Yu Kong ◽

Christopher M. Rose ◽

Ashley A. Cass ◽

Alexander G. Williams ◽

Martine Darwish ◽

...

Keyword(s):

Dna Methylation ◽

De Novo ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Potential Consequence ◽

Sequencing Data ◽

Antiviral Responses ◽

Genome Wide ◽

Cancer Genome Atlas ◽

Demethylation Agent

Abstract Profound global loss of DNA methylation is a hallmark of many cancers. One potential consequence of this is the reactivation of transposable elements (TEs) which could stimulate the immune system via cell-intrinsic antiviral responses. Here, we develop REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observe increased expression of over 400 TE subfamilies, of which 262 appear to result from a proximal loss of DNA methylation. The most recurrent TEs are among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent results in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing inflammation and the display of potentially immunogenic neoantigens.

Personalized characterization of diseases using sample-specific networks

10.1101/042838 ◽

2016 ◽

Author(s):

Xiaoping Liu ◽

Yuetong Wang ◽

Hongbin Ji ◽

Kazuyuki Aihara ◽

Luonan Chen

Keyword(s):

Drug Resistance ◽

Molecular Mechanisms ◽

Complex Disease ◽

The Cancer Genome Atlas ◽

Single Sample ◽

System Level ◽

Driver Genes ◽

Network Information ◽

Cancer Genome Atlas ◽

Network Patterns

Abstract A complex disease generally results not from malfunction of individual molecules but from dysfunction of the relevant system or network, which dynamically changes with time and conditions. Thus, estimating a condition-specific network from a sample is crucial to elucidating the molecular mechanisms of complex diseases at the system level. However, there is currently no effective way to construct such an individual-specific network by expression profiling of a single sample, because computing correlations requires multiple samples. Here we developed a statistical method, the sample-specific network method, which allows us to construct individual-specific networks based on the molecular expression of a single sample. Using this method, we can characterize various human diseases at a network level. In particular, such sample-specific networks can lead to the identification of individual-specific disease modules as well as driver genes, even without gene sequencing information. Extensive analysis using The Cancer Genome Atlas data not only demonstrated the effectiveness of the method, but also found new individual-specific driver genes and network patterns for various cancers. Biological experiments on drug resistance further validated one important advantage of our method over traditional methods: owing to the additional network information, we identified even those drug resistance genes that show no clear differential expression between samples with and without the resistance.
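
One way to read the idea of building a network from a single sample, assumed here rather than stated in the abstract, is to measure how much the correlation between two genes in a set of reference samples shifts when the single sample is added. The sketch below computes that shift for one gene pair; the function names and toy values are hypothetical, and the statistical test needed to decide whether a shift is significant is left out.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def delta_pcc(ref_x, ref_y, sample_x, sample_y):
    """Change in correlation when one new sample joins the reference set."""
    before = pearson(ref_x, ref_y)
    after = pearson(ref_x + [sample_x], ref_y + [sample_y])
    return after - before


# Hypothetical reference expression for two genes across four samples.
ref_gene_a = [1.0, 2.0, 3.0, 4.0]
ref_gene_b = [1.0, 2.0, 3.0, 4.0]

consistent = delta_pcc(ref_gene_a, ref_gene_b, 5.0, 5.0)  # follows the reference trend
perturbed = delta_pcc(ref_gene_a, ref_gene_b, 5.0, 0.0)   # breaks the reference trend
```

A sample that follows the reference trend leaves the correlation essentially unchanged, while one that breaks it produces a large shift; edges with large shifts would be the candidates for that sample's network.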

An instance-specific causal framework for learning intercellular communication networks that define microenvironments of individual tumors

10.1101/2021.11.11.467838 ◽

2021 ◽

Author(s):

Xueer Chen ◽

Lujia Chen ◽

Cornelius H.L. Kurten ◽

Fattaneh Jabbari ◽

Lazar Vujanovic ◽

...

Keyword(s):

Communication Networks ◽

Intercellular Communication ◽

Single Cells ◽

The Cancer Genome Atlas ◽

Analysis Framework ◽

Sequencing Data ◽

Network Learning ◽

Cancer Genome Atlas ◽

Causal Framework ◽

Intercellular Communications

Cells within a tumor microenvironment (TME) dynamically communicate and influence each other's cellular states through an intercellular communication network (ICN). In cancers, intercellular communications underlie immune evasion mechanisms of individual tumors. We developed an instance-specific causal analysis framework for discovering tumor-specific ICNs. Using head and neck squamous cell carcinoma (HNSCC) tumors as a testbed, we first mined single-cell RNA-sequencing data to discover gene expression modules (GEMs) that reflect the states of transcriptomic processes within tumor and stromal single cells. By deconvoluting bulk transcriptomes of HNSCC tumors profiled by The Cancer Genome Atlas (TCGA), we estimated the activation states of these transcriptomic processes in individual tumors. Finally, we applied instance-specific causal network learning to discover an ICN within each tumor. Our results show that cellular states of cells in TMEs are coordinated through ICNs that enable multi-way communications among epithelial, fibroblast, endothelial, and immune cells. Further analyses of individual ICNs revealed structural patterns that were shared across subsets of tumors, leading to the discovery of 4 different subtypes of networks that underlie disparate TMEs of HNSCC. Patients with distinct TMEs exhibited significantly different clinical outcomes. Our results show that the capability of estimating instance-specific ICNs reveals heterogeneity of ICNs and sheds light on the importance of intercellular communication in impacting disease development and progression.

Comparing cancer cell lines and tumor samples by genomic profiles

10.1101/028159 ◽

2015 ◽

Cited By ~ 2

Author(s):

Rileen Sinha ◽

Nikolaus Schultz ◽

Chris Sander

Keyword(s):

Cell Lines ◽

Cancer Cell ◽

Cancer Cell Line ◽

Cancer Cell Lines ◽

Response To Therapy ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Cancer Genome Atlas ◽

Cancer Types ◽

Tumor Material

Cancer cell lines are often used in laboratory experiments as models of tumors, although they can have substantially different genetic and epigenetic profiles compared to tumors. We have developed a general computational method, TumorComparer, to systematically quantify similarities and differences between tumor material when detailed genetic and molecular profiles are available. The comparisons can be flexibly tailored to a particular biological question by placing a higher weight on functional alterations of interest (weighted similarity). In a first pan-cancer application, we have compared 260 cell lines from the Cancer Cell Line Encyclopaedia (CCLE) and 1914 tumors of six different cancer types from The Cancer Genome Atlas (TCGA), using weights to emphasize genomic alterations that frequently recur in tumors. We report the potential suitability of particular cell lines as tumor models and identify apparently unsuitable outlier cell lines, some of which are in wide use, for each of the six cancer types. In future, this weighted similarity method may be generalized for use in a clinical setting to compare patient profiles consisting of genomic patterns combined with clinical attributes, such as diagnosis, treatment and response to therapy.
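
To make the notion of weighted similarity concrete, the sketch below scores two binary alteration profiles by the weighted fraction of features on which they agree. This is only one simple possibility and is not claimed to be the TumorComparer formula; the function name, the profiles and the weights are hypothetical.

```python
def weighted_similarity(profile_a, profile_b, weights):
    """Weighted fraction of features on which two profiles agree."""
    if not (len(profile_a) == len(profile_b) == len(weights)):
        raise ValueError("profiles and weights must have the same length")
    total = sum(weights)
    agree = sum(w for a, b, w in zip(profile_a, profile_b, weights) if a == b)
    return agree / total


# Hypothetical alteration calls (1 = altered, 0 = not altered) for four genes.
cell_line = [1, 0, 1, 0]
tumor = [1, 0, 0, 0]

# Equal weights treat every gene alike.
uniform = weighted_similarity(cell_line, tumor, [1, 1, 1, 1])
# A higher weight on the first gene emphasizes an alteration of interest.
emphasized = weighted_similarity(cell_line, tumor, [3, 1, 1, 1])
```

Raising the weight of a feature on which the two profiles agree increases the score, and raising the weight of a mismatched feature would lower it, which is how the comparison can be tailored to a particular biological question.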

CloneSig can jointly infer intra-tumor heterogeneity and mutational signature activity in bulk tumor sequencing data

Nature Communications ◽

10.1038/s41467-021-24992-y ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Judith Abécassis ◽

Fabien Reyal ◽

Jean-Philippe Vert

Keyword(s):

Tumor Heterogeneity ◽

Cancer Genomics ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Sequencing Data ◽

Cancer Dataset ◽

Whole Exome Sequencing Data ◽

Cancer Genome Atlas ◽

Pan Cancer ◽

Mutational Processes

Abstract Systematic DNA sequencing of cancer samples has highlighted the importance of two aspects of cancer genomics: intra-tumor heterogeneity (ITH) and mutational processes. These two aspects may not always be independent, as different mutational processes could be involved in different stages or regions of the tumor, but existing computational approaches to study them largely ignore this potential dependency. Here, we present CloneSig, a computational method to jointly infer ITH and mutational processes in a tumor from bulk-sequencing data. Extensive simulations show that CloneSig outperforms current methods for ITH inference and detection of mutational processes when the distribution of mutational signatures changes between clones. Applied to a large cohort of 8,951 tumors with whole-exome sequencing data from The Cancer Genome Atlas, and on a pan-cancer dataset of 2,632 whole-genome sequencing tumor samples from the Pan-Cancer Analysis of Whole Genomes initiative, CloneSig obtains results overall coherent with previous studies.

Transposable Element Expression in Tumors is Associated with Immune Infiltration and Increased Antigenicity

10.1101/388215 ◽

2018 ◽

Cited By ~ 1

Author(s):

Yu Kong ◽

Chris Rose ◽

Ashley A. Cass ◽

Martine Darwish ◽

Steve Lianoglou ◽

...

Keyword(s):

Dna Methylation ◽

Transposable Element ◽

De Novo ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Sequencing Data ◽

Genome Wide ◽

Dna Damage Responses ◽

Cancer Genome Atlas ◽

Demethylation Agent

Abstract Profound loss of DNA methylation is a well-recognized hallmark of cancer. Given its role in silencing transposable elements (TEs), we hypothesized that extensive TE expression occurs in tumors with highly demethylated DNA. We developed REdiscoverTE, a computational method for quantifying genome-wide TE expression in RNA sequencing data. Using The Cancer Genome Atlas database, we observed increased expression of over 400 TE subfamilies, of which 262 appeared to result from a proximal loss of DNA methylation. The most recurrent TEs were among the evolutionarily youngest in the genome, predominantly expressed from intergenic loci, and associated with antiviral or DNA damage responses. Treatment of glioblastoma cells with a demethylation agent resulted in both increased TE expression and de novo presentation of TE-derived peptides on MHC class I molecules. Therapeutic reactivation of tumor-specific TEs may synergize with immunotherapy by inducing both inflammation and the display of potentially immunogenic neoantigens. One Sentence Summary: Transposable element expression in tumors is associated with increased immune response and provides tumor-associated antigens.

CloneSig: Joint inference of intra-tumor heterogeneity and signature deconvolution in tumor bulk sequencing data

10.1101/825778 ◽

2019 ◽

Cited By ~ 2

Author(s):

Judith Abécassis ◽

Fabien Reyal ◽

Jean-Philippe Vert

Keyword(s):

Cancer Progression ◽

Tumor Heterogeneity ◽

Cancer Genomics ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Sequencing Data ◽

Joint Inference ◽

Nucleotide Context ◽

Cancer Genome Atlas ◽

Mutational Processes

The ability to sequence DNA in cancer samples has recently triggered much effort to identify the forces at the genomic level that shape tumorigenesis and cancer progression. This has yielded new understanding or clarification of two important aspects of cancer genomics: (i) intra-tumor heterogeneity (ITH), as captured by the variability in observed prevalences of somatic mutations within a tumor, and (ii) mutational processes, as revealed by the distribution of the types of somatic mutation and their immediate nucleotide context. These two aspects are not independent of each other, as different mutational processes can be involved in different subclones, but current computational approaches to study them largely ignore this dependency. In particular, sequential methods that first estimate subclones and then analyze the mutational processes active in each clone can easily miss changes in mutational processes if the clonal decomposition step fails, and conversely information regarding mutational signatures is overlooked during the subclonal reconstruction. To address these limitations, we present CloneSig, a new computational method to jointly infer ITH and mutational processes in a tumor from bulk-sequencing data, including whole-exome sequencing (WES) data, by leveraging their dependency. We show through an extensive benchmark on simulated samples that CloneSig is always as good as or better than state-of-the-art methods for ITH inference and detection of mutational processes. We then apply CloneSig to a large cohort of 8,954 tumors with WES data from The Cancer Genome Atlas (TCGA), where we obtain results coherent with previous studies on whole-genome sequencing (WGS) data, as well as new promising findings. This validates the applicability of CloneSig to WES data, paving the way to its use in clinical settings, where WES is increasingly deployed.

An Improved, Assay Platform Agnostic, Absolute Single Sample Breast Cancer Subtype Classifier

Cancers ◽

10.3390/cancers12123506 ◽

2020 ◽

Vol 12 (12) ◽

pp. 3506

Author(s):

Mi-kyoung Seo ◽

Soonmyung Paik ◽

Sangwoo Kim

Keyword(s):

Breast Cancer ◽

Cross Validation ◽

Molecular Subtype ◽

Breast Cancer Subtype ◽

The Cancer Genome Atlas ◽

Single Sample ◽

Study Cohort ◽

Average Accuracy ◽

Number Of Genes ◽

Cancer Genome Atlas

While intrinsic molecular subtypes provide important biological classification of breast cancer, the subtype assignment of individuals is influenced by assay technology and study cohort composition. We sought to develop a platform-independent absolute single-sample subtype classifier based on a minimal number of genes. Pairwise ratios for subtype-specific differentially expressed genes from un-normalized expression data from 432 breast cancer (BC) samples of The Cancer Genome Atlas (TCGA) were used as inputs for machine learning. The subtype classifier with the fewest number of genes and maximal classification power was selected during cross-validation. The final model was evaluated on 5816 samples from 10 independent studies profiled with four different assay platforms. Upon cross-validation within the TCGA cohort, a random forest classifier (MiniABS) with 11 genes achieved the best accuracy of 88.2%. Applying MiniABS to five validation sets of RNA-seq and microarray data showed an average accuracy of 85.15% (vs. 77.72% for Absolute Intrinsic Molecular Subtype (AIMS)). Only MiniABS could be applied to five low-throughput datasets, showing an average accuracy of 87.93%. The MiniABS can absolutely subtype BC using the raw expression levels of only 11 genes, regardless of assay platform, with higher accuracy than existing methods.
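
The appeal of pairwise ratios as classifier inputs can be seen in a small sketch: a ratio between two genes in the same sample does not change when the whole sample is rescaled, which is one plausible reason such features transfer across assay platforms. The code below only builds the ratio features; it does not reproduce MiniABS or its random forest, and the gene names and values are placeholders. It also assumes every expression value is non-zero.

```python
from itertools import combinations
from math import isclose


def pairwise_ratios(expr):
    """Map each gene pair (a, b) to expr[a] / expr[b], in sorted gene order."""
    genes = sorted(expr)
    return {(a, b): expr[a] / expr[b] for a, b in combinations(genes, 2)}


# Hypothetical raw expression values for one sample (gene names are placeholders).
sample = {"GENE_A": 120.0, "GENE_B": 30.0, "GENE_C": 60.0}
# The same sample as if measured on a platform with a 10x larger overall scale.
rescaled = {gene: 10.0 * value for gene, value in sample.items()}

features = pairwise_ratios(sample)
features_rescaled = pairwise_ratios(rescaled)

# The ratio features agree even though the raw values differ by a factor of ten.
same = all(isclose(features[p], features_rescaled[p]) for p in features)
```

Real platforms differ by more than a single scale factor, so this invariance is only a partial explanation, but it shows why un-normalized data can still yield comparable inputs.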

Explore synergistic and competitive miRNA regulation mechanisms in the miRNA-mRNA regulatory network from the information decomposition perspective

10.1101/2021.12.20.473520 ◽

2021 ◽

Author(s):

Chu Pan

Keyword(s):

Binding Site ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Untranslated Regions ◽

Mirna Regulation ◽

Information Measurement ◽

Microrna Regulation ◽

Functional Correlation ◽

Cancer Genome Atlas ◽

Regulation Mechanisms

Since multiple microRNAs can target 3' untranslated regions of the same mRNA transcript, it is likely that these endogenous microRNAs may form synergistic alliances, or compete for the same mRNA harbouring overlapping binding site matches. Synergistic and competitive microRNA regulation is an intriguing yet poorly elucidated mechanism. We here introduce a computational method based on the multivariate information measurement to quantify such implicit interaction effects between microRNAs. Our informatics method of integrating sequence and expression data is designed to establish the functional correlation between microRNAs. To demonstrate our method, we exploited TargetScan and The Cancer Genome Atlas data. As a result, we indeed observed that the microRNA pair with neighbouring binding site(s) on the mRNA is likely to trigger synergistic events, while the microRNA pair with overlapping binding site(s) on the mRNA is likely to cause competitive events, provided that the pair of microRNAs has a high functional similarity and the corresponding triplet presents a positive/negative 'synergy-redundancy' score.
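
The abstract does not define its 'synergy-redundancy' score. One standard multivariate information measure with that flavor is interaction information, and the sketch below assumes that choice, with the sign convention that positive values indicate synergy and negative values indicate redundancy. The function names are hypothetical, and the inputs are assumed to be already discretized, which real expression data would not be.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy in bits of a sequence of discrete observations."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def mutual_information(a, b):
    """I(A; B) = H(A) + H(B) - H(A, B)."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def synergy_score(x, y, z):
    """I(X, Y; Z) - I(X; Z) - I(Y; Z): positive = synergy, negative = redundancy."""
    joint = list(zip(x, y))
    return mutual_information(joint, z) - mutual_information(x, z) - mutual_information(y, z)


# Synergistic toy case: z is the XOR of x and y, so neither alone informs z.
xor_score = synergy_score([0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0])
# Redundant toy case: x and y carry the same information about z.
copy_score = synergy_score([0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1])
```

In this reading, x and y would stand for two microRNAs and z for their shared target mRNA, so a positive score would suggest the pair acts jointly and a negative score would suggest overlapping effects.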

Identification of HCC-Related Genes Based on Differential Partial Correlation Network

Frontiers in Genetics ◽

10.3389/fgene.2021.672117 ◽

2021 ◽

Vol 12 ◽

Author(s):

Yuyao Gao ◽

Xiao Chang ◽

Jie Xia ◽

Shaoyan Sun ◽

Zengchao Mu ◽

...

Keyword(s):

Functional Module ◽

Disease Diagnosis ◽

Computational Method ◽

The Cancer Genome Atlas ◽

Control Group ◽

Disease Genes ◽

Mechanism Study ◽

Cancer Genome Atlas ◽

Disease Related Genes ◽

Systematic Identification

Hepatocellular carcinoma (HCC) is one of the most common causes of cancer-related death, but its pathogenesis is still unclear. As the disease involves multiple biological processes, systematic identification of disease genes and module biomarkers can provide a better understanding of disease mechanisms. In this study, we present a network-based approach to integrate multi-omics data and discover disease-related genes. We applied our method to HCC data from The Cancer Genome Atlas (TCGA) database and obtained a functional module with 15 disease-related genes as network biomarkers. The results of classification and hierarchical clustering demonstrate that the identified functional module can effectively distinguish between the disease and the control group with both supervised and unsupervised methods. In brief, this computational method for identifying potential functional disease modules could be useful for disease diagnosis and for further mechanistic study of complex diseases.
