Modelling cancer progression using Mutual Hazard Networks

Rudolf Schill; Stefan Solbrig; Tilo Wettig; Rainer Spang

doi:10.1093/bioinformatics/btz513

Bioinformatics ◽

10.1093/bioinformatics/btz513 ◽

2019 ◽

Vol 36 (1) ◽

pp. 241-249 ◽

Cited By ~ 2

Author(s):

Rudolf Schill ◽

Stefan Solbrig ◽

Tilo Wettig ◽

Rainer Spang

Keyword(s):

Cancer Progression ◽

Learning Algorithm ◽

Directed Acyclic Graphs ◽

The Cancer Genome Atlas ◽

Supplementary Information ◽

Cross Sectional ◽

Acyclic Graphs ◽

Cancer Genome Atlas ◽

Occurrence State ◽

Occurrence Patterns

Abstract:

Motivation: Cancer progresses by accumulating genomic events, such as mutations and copy number alterations, whose chronological order is key to understanding the disease but difficult to observe. Instead, cancer progression models use co-occurrence patterns in cross-sectional data to infer epistatic interactions between events and thereby uncover their most likely order of occurrence. State-of-the-art progression models, however, are limited by mathematical tractability and only allow events to interact in directed acyclic graphs, to promote but not inhibit subsequent events, or to be mutually exclusive in distinct groups that cannot overlap. Results: Here we propose Mutual Hazard Networks (MHN), a new Machine Learning algorithm to infer cyclic progression models from cross-sectional data. MHN model events by their spontaneous rate of fixation and by multiplicative effects they exert on the rates of successive events. MHN compared favourably to acyclic models in cross-validated model fit on four datasets tested. In application to the glioblastoma dataset from The Cancer Genome Atlas, MHN proposed a novel interaction in line with consecutive biopsies: IDH1 mutations are early events that promote subsequent fixation of TP53 mutations. Availability and implementation: Implementation and data are available at https://github.com/RudiSchill/MHN. Supplementary information: Supplementary data are available at Bioinformatics online.
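The parametrisation described in the Results (a spontaneous rate per event, scaled by multiplicative effects from events already present) can be illustrated with a minimal Python sketch. The matrix layout (`theta[i][i]` as the spontaneous rate of event `i`, `theta[i][j]` as the effect of event `j` on event `i`), the function name `transition_rate` and all numbers are assumptions made for illustration; they are not taken from the abstract or from the MHN repository.

```python
from math import prod


def transition_rate(theta, state, i):
    """Rate at which event i occurs next, given which events are present.

    theta[i][i] is the assumed spontaneous rate of event i;
    theta[i][j] (j != i) is the assumed multiplicative effect that a
    present event j exerts on the rate of event i
    (> 1 promotes, < 1 inhibits, 1 means no effect).
    state is a tuple of 0/1 flags, one per event.
    """
    if state[i]:
        return 0.0  # an event that is already present cannot occur again
    effects = prod(
        theta[i][j] for j, present in enumerate(state) if present and j != i
    )
    return theta[i][i] * effects


# Hypothetical two-event example: index 0 is "A", index 1 is "B".
theta = [
    [0.5, 0.5],   # A: spontaneous rate 0.5, halved when B is present
    [4.0, 0.25],  # B: spontaneous rate 0.25, quadrupled when A is present
]

rate_a_alone = transition_rate(theta, (0, 0), 0)    # nothing present yet
rate_b_after_a = transition_rate(theta, (1, 0), 1)  # A present, promotes B
rate_a_after_b = transition_rate(theta, (0, 1), 0)  # B present, inhibits A
```

With these made-up values A promotes B while B inhibits A, a mutual pattern of promotion and inhibition that a directed acyclic graph of promoting relations cannot represent.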


Modelling cancer progression using Mutual Hazard Networks

10.1101/450841 ◽

2018 ◽

Author(s):

Rudolf Schill ◽

Stefan Solbrig ◽

Tilo Wettig ◽

Rainer Spang

Keyword(s):

Cancer Progression ◽

Learning Algorithm ◽

Directed Acyclic Graphs ◽

Model Fit ◽

The Cancer Genome Atlas ◽

Cross Sectional ◽

Epistatic Interactions ◽

Acyclic Graphs ◽

Cancer Genome Atlas ◽

Multiplicative Effects

Abstract:

Motivation: Cancer progresses by accumulating genomic events, such as mutations and copy number alterations, whose chronological order is key to understanding the disease but difficult to observe. Instead, cancer progression models use co-occurrence patterns in cross-sectional data to infer epistatic interactions between events and thereby uncover their most likely order of occurrence. State-of-the-art progression models, however, are limited by mathematical tractability and only allow events to interact in directed acyclic graphs, to promote but not inhibit subsequent events, or to be mutually exclusive in distinct groups that cannot overlap. Results: Here we propose Mutual Hazard Networks (MHN), a new Machine Learning algorithm to infer cyclic progression models from cross-sectional data. MHN model events by their spontaneous rate of fixation and by multiplicative effects they exert on the rates of successive events. MHN compared favourably to acyclic models in cross-validated model fit on four datasets tested. In application to the glioblastoma dataset from The Cancer Genome Atlas, MHN proposed a novel interaction in line with consecutive biopsies: IDH1 mutations are early events that promote subsequent fixation of TP53 mutations. Availability: Implementation and data are available at https://github.com/RudiSchill/MHN.


CAPRI: Efficient Inference of Cancer Progression Models from Cross-sectional Data

10.1101/008110 ◽

2014 ◽

Cited By ~ 1

Author(s):

Daniele Ramazzotti ◽

Giulio Caravagna ◽

Loes Olde Loohuis ◽

Alex Graudenzi ◽

Ilya Korsunsky ◽

...

Keyword(s):

Cancer Progression ◽

Directed Acyclic Graphs ◽

The Cancer Genome Atlas ◽

Scoring Method ◽

Cross Sectional ◽

Progression Model ◽

Acyclic Graphs ◽

Cancer Genome Atlas ◽

Atypical Chronic Myeloid Leukemia ◽

Insight Into

We devise a novel inference algorithm to effectively solve the cancer progression model reconstruction problem. Our empirical analysis of the accuracy and convergence rate of our algorithm, CAncer PRogression Inference (CAPRI), shows that it outperforms the state-of-the-art algorithms addressing similar problems. Motivation: Several cancer-related genomic datasets have become available (e.g., The Cancer Genome Atlas, TCGA), typically involving hundreds of patients. At present, most of these data are aggregated in a cross-sectional fashion, providing all measurements at the time of diagnosis. Our goal is to infer cancer “progression” models from such data. These models are represented as directed acyclic graphs (DAGs) of collections of “selectivity” relations, where a mutation in a gene A “selects” for a later mutation in a gene B. Gaining insight into the structure of such progressions has the potential to improve both the stratification of patients and personalized therapy choices. Results: The CAPRI algorithm relies on a scoring method based on a probabilistic theory developed by Suppes, coupled with bootstrap and maximum likelihood inference. The resulting algorithm is efficient, achieves high accuracy, and has good complexity, including in terms of convergence properties. CAPRI performs especially well in the presence of noise in the data and with limited sample sizes. Moreover, CAPRI, in contrast to other approaches, robustly reconstructs different types of confluent trajectories despite irregularities in the data. We also report on an ongoing investigation using CAPRI to study atypical Chronic Myeloid Leukemia, in which we uncovered non-trivial selectivity relations and exclusivity patterns among key genomic events.
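As a rough illustration of what a Suppes-style score checks, the Python sketch below tests two conditions on binary cross-sectional data: the candidate cause is more frequent than its effect (used here as a stand-in for temporal priority), and the effect is more probable when the cause is present than when it is absent (probability raising). This is a simplified reading under stated assumptions; the function name `prima_facie` and the toy samples are hypothetical, and the bootstrap and maximum likelihood steps mentioned in the abstract are not shown.

```python
def prima_facie(samples, a, b):
    """Return True if event a passes both checks as a candidate cause of b.

    samples is a list of dicts mapping event names to 0/1.
    Check 1 (assumed proxy for temporal priority): P(a) > P(b).
    Check 2 (probability raising): P(b | a) > P(b | not a).
    """
    n = len(samples)
    p_a = sum(s[a] for s in samples) / n
    p_b = sum(s[b] for s in samples) / n

    with_a = [s for s in samples if s[a]]
    without_a = [s for s in samples if not s[a]]
    if not with_a or not without_a:
        return False  # conditional probabilities are undefined

    p_b_given_a = sum(s[b] for s in with_a) / len(with_a)
    p_b_given_not_a = sum(s[b] for s in without_a) / len(without_a)
    return p_a > p_b and p_b_given_a > p_b_given_not_a


# Hypothetical toy cohort: B is only ever seen together with A.
samples = [
    {"A": 1, "B": 1},
    {"A": 1, "B": 1},
    {"A": 1, "B": 0},
    {"A": 1, "B": 0},
    {"A": 0, "B": 0},
    {"A": 0, "B": 0},
]

a_selects_b = prima_facie(samples, "A", "B")
b_selects_a = prima_facie(samples, "B", "A")
```

In this toy cohort the relation holds from A to B but not from B to A, because B is the rarer event.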


Cancer progression models and fitness landscapes: a many-to-many relationship

10.1101/141465 ◽

2017 ◽

Author(s):

Ramon Diaz-Uriarte

Keyword(s):

Cancer Progression ◽

Fitness Landscape ◽

Simulated Data ◽

Directed Acyclic Graphs ◽

Fitness Landscapes ◽

Gene Interactions ◽

Cross Sectional ◽

Data Set ◽

Large Variability ◽

Acyclic Graphs

Abstract:

The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to try to identify these constraints, and return Directed Acyclic Graphs (DAGs) of genes. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes (e.g., those with reciprocal sign epistasis) cannot be represented by CPMs. Using simulated data under 500 fitness landscapes, I show that CPMs’ performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime, and fitness landscape features, in ways that depend on CPM method. And the same DAG is often observed in very different landscapes, which differ in more than 50% of their accessible genotypes. Using a pancreatic data set, I show that this many-to-many relationship affects the analysis of empirical data. Fitness landscapes that are widely different from each other can, when evolutionary processes run repeatedly on them, both produce data similar to the empirically observed one, and lead to DAGs that are very different among themselves. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs.


HNRNPH1-stabilized LINC00662 promotes ovarian cancer progression by activating the GRP78/p38 pathway

Oncogene ◽

10.1038/s41388-021-01884-5 ◽

2021 ◽

Author(s):

Yong Wu ◽

Qinhao Guo ◽

Xingzhu Ju ◽

Zhixiang Hu ◽

Lingfang Xia ◽

...

Keyword(s):

Ovarian Cancer ◽

Cancer Progression ◽

Prognostic Indicator ◽

Mapk Signaling ◽

The Cancer Genome Atlas ◽

Cancer Genome Atlas ◽

Proliferation And Metastasis ◽

Nuclear Ribonucleoprotein

Abstract:

Numerous studies suggest an important role for copy number alterations (CNAs) in cancer progression. However, CNAs of long intergenic noncoding RNAs (lincRNAs) in ovarian cancer (OC) and their potential functions have not been fully investigated. Here, based on analysis of The Cancer Genome Atlas (TCGA) database, we identified an oncogenic lincRNA, termed LINC00662, that exhibited a significant correlation between its CNA and its increased expression. LINC00662 overexpression is highly associated with malignant features in OC patients and is a prognostic indicator. LINC00662 significantly promotes OC cell proliferation and metastasis in vitro and in vivo. Mechanistically, LINC00662 is stabilized by heterogeneous nuclear ribonucleoprotein H1 (HNRNPH1). Moreover, LINC00662 exerts oncogenic effects by interacting with glucose-regulated protein 78 (GRP78) and preventing its ubiquitination in OC cells, leading to activation of the oncogenic p38 MAPK signaling pathway. Taken together, our results define an oncogenic role for LINC00662 in OC progression mediated via GRP78/p38 signaling, with potential implications regarding therapeutic targets for OC.


Cross-Sectional Model-Building for Research on Subjective Well-Being: Gaining Clarity on Control Variables

Social Indicators Research ◽

10.1007/s11205-020-02586-3 ◽

2021 ◽

Author(s):

David Bartram

Keyword(s):

Model Building ◽

Well Being ◽

Directed Acyclic Graphs ◽

Subjective Well Being ◽

Social Scientists ◽

Cross Sectional ◽

Acyclic Graphs ◽

Standard Set ◽

Substantial Bias ◽

Parsimonious Model

Abstract:

Happiness/well-being researchers who use quantitative analysis often do not give persuasive reasons why particular variables should be included as controls in their cross-sectional models. One commonly sees notions of a “standard set” of controls, or the “usual suspects”, etc. These notions are not coherent and can lead to results that are significantly biased with respect to a genuine causal relationship. This article presents some core principles for making more effective decisions of that sort. The contribution is to introduce a framework (the “causal revolution”, e.g. Pearl and Mackenzie 2018) unfamiliar to many social scientists (though well established in epidemiology) and to show how it can be put into practice for empirical analysis of causal questions. In simplified form, the core principles are: control for confounding variables, and do not control for intervening variables or colliders. A more comprehensive approach uses directed acyclic graphs (DAGs) to discern models that meet a minimum/efficient criterion for identification of causal effects. The article demonstrates this mode of analysis via a stylized investigation of the effect of unemployment on happiness. Most researchers would include other determinants of happiness as controls for this purpose. One such determinant is income, but income is an intervening variable in the path from unemployment to happiness, and including it leads to substantial bias. Other commonly-used variables are simply unnecessary, e.g. religiosity and sex. From this perspective, identifying the effect of unemployment on happiness requires controlling only for age and education; a small (parsimonious) model is evidently preferable to a more complex one in this instance.
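The point about intervening variables can be illustrated with a small simulation. The Python sketch below generates data in which unemployment lowers happiness both directly and through income, then compares the unemployment coefficient with and without income as a control. The coefficients (-2.0, -1.0, 0.5), the sample size and the helper names are hypothetical choices made for illustration and are not taken from the article; confounders such as age and education are left out to keep the example short.

```python
import random


def cov(xs, ys):
    """Population covariance of two equally long lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def slope_simple(x, y):
    """Least-squares slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)


def slope_controlling(x, z, y):
    """Least-squares coefficient on x when y is regressed on x and z."""
    sxx, szz, sxz = cov(x, x), cov(z, z), cov(x, z)
    sxy, szy = cov(x, y), cov(z, y)
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20000

# Assumed data-generating process (all numbers are made up):
# unemployment -> income -> happiness, plus unemployment -> happiness.
unemployed = [1.0 if rng.random() < 0.2 else 0.0 for _ in range(n)]
income = [-2.0 * u + rng.gauss(0, 1) for u in unemployed]
happiness = [
    -1.0 * u + 0.5 * i + rng.gauss(0, 1) for u, i in zip(unemployed, income)
]

# True total effect under these assumptions: -1.0 + 0.5 * (-2.0) = -2.0
total_effect = slope_simple(unemployed, happiness)
with_income = slope_controlling(unemployed, income, happiness)
```

Under these assumptions the regression without income recovers roughly the total effect of -2.0, while adding income as a control leaves only the direct effect of about -1.0 and so understates the overall impact of unemployment.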


Identification of genes associated with cancer progression and prognosis in lung adenocarcinoma: Analyses based on microarray from Oncomine and The Cancer Genome Atlas databases

Molecular Genetics & Genomic Medicine ◽

10.1002/mgg3.528 ◽

2018 ◽

Vol 7 (2) ◽

Cited By ~ 8

Author(s):

Wei Liu ◽

Songyun Ouyang ◽

Zhigang Zhou ◽

Meng Wang ◽

Tingting Wang ◽

...

Keyword(s):

Lung Adenocarcinoma ◽

Cancer Progression ◽

Cancer Genome ◽

The Cancer Genome Atlas ◽

Cancer Genome Atlas ◽

Genome Atlas


Pan-Cancer Analysis of the Genomic Alterations and Mutations of the Matrisome

Cancers ◽

10.3390/cancers12082046 ◽

2020 ◽

Vol 12 (8) ◽

pp. 2046 ◽

Cited By ~ 2

Author(s):

Valerio Izzi ◽

Martin N. Davis ◽

Alexandra Naba

Keyword(s):

Cancer Progression ◽

Protein Function ◽

The Cancer Genome Atlas ◽

Copy Number Alterations ◽

Cellular Functions ◽

Genomic Alterations ◽

Genes Encoding ◽

Cancer Genome Atlas ◽

Biomechanical Changes ◽

Pan Cancer

The extracellular matrix (ECM) is a master regulator of all cellular functions and a major component of the tumor microenvironment. We previously defined the “matrisome” as the ensemble of genes encoding ECM proteins and proteins modulating ECM structure or function. While compositional and biomechanical changes in the ECM regulate cancer progression, no study has investigated the genomic alterations of matrisome genes in cancers and their consequences. Here, mining The Cancer Genome Atlas (TCGA) data, we found that copy number alterations and mutations are frequent in matrisome genes, even more so than in the rest of the genome. We also found that these alterations are predicted to significantly impact gene expression and protein function. Moreover, we identified matrisome genes whose mutational burden is an independent predictor of survival. We propose that studying genomic alterations of matrisome genes will further our understanding of the roles of this compartment in cancer progression and will lead to the development of innovative therapeutic strategies targeting the ECM.


IsoformSwitchAnalyzeR: analysis of changes in genome-wide patterns of alternative splicing and its functional consequences

Bioinformatics ◽

10.1093/bioinformatics/btz247 ◽

2019 ◽

Vol 35 (21) ◽

pp. 4469-4471 ◽

Cited By ~ 21

Author(s):

Kristoffer Vitting-Seerup ◽

Albin Sandelin

Keyword(s):

Alternative Splicing ◽

The Cancer Genome Atlas ◽

Supplementary Information ◽

Rna Seq ◽

Genome Wide ◽

Functional Consequences ◽

Cancer Genome Atlas ◽

Health And Disease ◽

Splicing Patterns

Abstract:

Summary: Alternative splicing is an important mechanism involved in health and disease. Recent work highlights the importance of investigating genome-wide changes in splicing patterns and the subsequent functional consequences. Current computational methods only support such analysis on a gene-by-gene basis. Therefore, we extended the IsoformSwitchAnalyzeR R library to enable analysis of genome-wide changes in specific types of alternative splicing and predicted functional consequences of the resulting isoform switches. As a case study, we analyzed RNA-seq data from The Cancer Genome Atlas and found systematic changes in alternative splicing and the consequences of the associated isoform switches. Availability and implementation: Windows, Linux and Mac OS: http://bioconductor.org/packages/IsoformSwitchAnalyzeR. Supplementary information: Supplementary data are available at Bioinformatics online.


An R Implementation of Tumor-Stroma-Immune Transcriptome Deconvolution Pipeline using DeMixT

10.1101/566075 ◽

2019 ◽

Author(s):

Shaolong Cao ◽

Zeya Wang ◽

Fan Gao ◽

Jingxiao Chen ◽

Feng Zhang ◽

...

Keyword(s):

Cancer Progression ◽

Expression Profiles ◽

Progression Free Survival ◽

Tumor Stroma ◽

R Package ◽

The Cancer Genome Atlas ◽

Biological Information ◽

Computationally Efficient ◽

Multiple Cancer ◽

Cancer Genome Atlas

Abstract:

The deconvolution of transcriptomic data from heterogeneous tissues in cancer studies remains challenging. Available software has difficulty accurately estimating both component-specific proportions and expression profiles for individual samples. To address these challenges, we present a new R-implementation pipeline for more accurate and efficient transcriptome deconvolution of high-dimensional data from mixtures of more than two components. The pipeline utilizes the computationally efficient DeMixT R-package with OpenMP and additional cancer-specific biological information to perform three-component deconvolution without requiring data from the immune profiles. It enables a wide application of DeMixT to gene expression datasets available from cancer consortia such as The Cancer Genome Atlas (TCGA) projects, where, other than the mixed tumor samples, a handful of normal samples are profiled in multiple cancer types. We have applied this pipeline to two TCGA datasets in colorectal adenocarcinoma (COAD) and prostate adenocarcinoma (PRAD). In COAD, we found varying distributions of immune proportions across the Consensus Molecular Subtypes, from the highest to the lowest being CMS1, CMS3, CMS4 and CMS2. In PRAD, we found the immune proportions are associated with progression-free survival (p<0.01) and negatively correlated with Gleason scores (p<0.001). Our DeMixT-centered analysis protocol opens up new opportunities to investigate the tumor-stroma-immune microenvironment, by providing both proportions and component-specific expressions, and thus better define the underlying biology of cancer progression. Availability and implementation: An R package, scripts and data are available: https://github.com/wwylab/DeMixTallmaterials.


The tissue differentiation and cancer manifolds in gene and protein expression spaces

10.1101/2021.08.20.457160 ◽

2021 ◽

Author(s):

J Nieves ◽

A Gonzalez

Keyword(s):

Protein Expression ◽

Cancer Progression ◽

Tissue Differentiation ◽

The Cancer Genome Atlas ◽

Normal Tissues ◽

Gene And Protein Expression ◽

Human Protein Atlas ◽

Cancer Genome Atlas ◽

Tissue Of Origin ◽

Using Data

Abstract:

It is well known that, for a particular tissue, the homeostatic and cancer attractors are well apart both in gene expression and in protein expression spaces. By using data for 15 tissues and the corresponding tumors from The Cancer Genome Atlas, and for 49 normal tissues and 20 tumors from The Human Protein Atlas, we show that the set of normal attractors is also well separated from the set of tumors. Roughly speaking, one may say that there is a cancer progression axis orthogonal to the normal tissue differentiation and cancer manifolds. This separation suggests that therapies targeting common genes, which define the cancer axis, may be effective, irrespective of the tissue of origin.
