Conditional prediction of consecutive tumor evolution using cancer progression models: What genotype comes next?

2021 ◽  
Vol 17 (12) ◽  
pp. e1009055
Author(s):  
Juan Diaz-Colunga ◽  
Ramon Diaz-Uriarte

Accurate prediction of tumor progression is key for adaptive therapy and precision medicine. Cancer progression models (CPMs) can be used to infer dependencies in mutation accumulation from cross-sectional data and provide predictions of tumor progression paths. However, their performance when predicting complete evolutionary trajectories is limited by violations of assumptions and the size of available data sets. Instead of predicting full tumor progression paths, here we focus on short-term predictions, more relevant for diagnostic and therapeutic purposes. We examine whether five distinct CPMs can be used to answer the question “Given that a genotype with n mutations has been observed, what genotype with n + 1 mutations is next in the path of tumor progression?” or, shortly, “What genotype comes next?”. Using simulated data we find that under specific combinations of genotype and fitness landscape characteristics CPMs can provide predictions of short-term evolution that closely match the true probabilities, and that some genotype characteristics can be much more relevant than global features. Application of these methods to 25 cancer data sets shows that their use is hampered by a lack of information needed to make principled decisions about method choice. Fruitful use of these methods for short-term predictions requires adapting method’s use to local genotype characteristics and obtaining reliable indicators of performance; it will also be necessary to clarify the interpretation of the method’s results when key assumptions do not hold.
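As a rough illustration of the "what genotype comes next?" question (not the authors' implementation, and not any of the five CPMs), the sketch below assumes a small hypothetical fitness table. It gives each genotype with one extra mutation a probability proportional to its fitness gain over the observed genotype; neighbors that do not improve fitness get probability zero. The gene names, the fitness values, and the proportional-to-gain rule are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Iterable

Genotype = FrozenSet[str]

# Hypothetical genes and fitness values, for illustration only.
GENES = ["A", "B", "C"]
FITNESS: Dict[Genotype, float] = {
    frozenset(): 1.0,
    frozenset({"A"}): 1.2,
    frozenset({"B"}): 1.1,
    frozenset({"C"}): 0.9,
}


def next_genotype_probs(
    current: Genotype,
    fitness: Dict[Genotype, float],
    genes: Iterable[str],
) -> Dict[Genotype, float]:
    """Return P(next genotype | current genotype) over n + 1 mutation neighbors.

    Assumed rule: probability is proportional to the fitness gain;
    genotypes missing from the table are treated as having fitness 0.
    """
    gains: Dict[Genotype, float] = {}
    for gene in sorted(genes):
        if gene in current:
            continue  # mutation already present
        candidate = current | {gene}
        gain = fitness.get(candidate, 0.0) - fitness[current]
        if gain > 0:
            gains[candidate] = gain
    total = sum(gains.values())
    if total == 0:
        return {}  # current genotype is a local fitness maximum
    return {genotype: gain / total for genotype, gain in gains.items()}


probs = next_genotype_probs(frozenset(), FITNESS, GENES)
for genotype, p in probs.items():
    print(sorted(genotype), round(p, 3))
```

An empty result marks the observed genotype as a local fitness maximum under this assumed rule, which is one of the genotype characteristics the abstract identifies as relevant to prediction quality.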

2020 ◽  
Author(s):  
Juan Diaz-Colunga ◽  
Ramon Diaz-Uriarte

Accurate prediction of tumor progression is key for adaptive therapy and precision medicine. Cancer progression models (CPMs) can be used to infer dependencies in mutation accumulation from cross-sectional data and provide predictions of tumor progression paths. But their performance when predicting the complete evolutionary paths is limited by violations of assumptions and the size of available data sets. Instead of predicting full tumor progression paths, we can focus on short-term predictions, more relevant for diagnostic and therapeutic purposes. Here we examine if five distinct CPMs can be used to answer the question “Given that a genotype with n mutations has been observed, what genotype with n + 1 mutations is next in the path of tumor progression?” or, shortly, “What genotype comes next?”. Using simulated data we find that under specific combinations of genotype and fitness landscape characteristics CPMs can provide predictions of short-term evolution that closely match the true probabilities, and that some genotype characteristics (fitness and probability of being a local fitness maximum) can be much more relevant than global features. Thus, CPMs can provide short-term predictions even when global, long-term predictions are not possible because fitness landscape- and evolutionary model-specific assumptions are violated. When good performance is possible, we observe significant variation in the quality of predictions of different methods. Genotype-specific and global fitness landscape characteristics are required to determine which method provides the best results in each case. Application of these methods to 25 cancer data sets shows that their use is hampered by lack of the information needed to make principled decisions about method choice and what predictions to trust. Fruitful use of these methods for short-term predictions requires adapting method’s use to local genotype characteristics and obtaining reliable indicators of performance; it will also be necessary to clarify the interpretation of the method’s results when key assumptions do not hold.


2018 ◽  
Author(s):  
Ramon Diaz-Uriarte ◽  
Claudia Vasallo

Successful prediction of the likely paths of tumor progression is valuable for diagnostic, prognostic, and treatment purposes. Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations and thus CPMs encode the paths of tumor progression. Here we analyze the performance of four CPMs to examine whether they can be used to predict the true distribution of paths of tumor progression and to estimate evolutionary unpredictability. Employing simulations we show that if fitness landscapes are single peaked (have a single fitness maximum) there is good agreement between true and predicted distributions of paths of tumor progression when sample sizes are large, but performance is poor with the currently common much smaller sample sizes. Under multi-peaked fitness landscapes (i.e., those with multiple fitness maxima), performance is poor and improves only slightly with sample size. In all cases, detection regime (when tumors are sampled) is a key determinant of performance. Estimates of evolutionary unpredictability from the best performing CPM, among the four examined, tend to overestimate the true unpredictability and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of twenty-two cancer data sets shows low evolutionary unpredictability for several of the data sets. But most of the predictions of paths of tumor progression are very unreliable, and unreliability increases with the number of features analyzed. Our results indicate that CPMs could be valuable tools for predicting cancer progression but that, currently, obtaining useful predictions of paths of tumor progression from CPMs is dubious, and emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer.

Author Summary: Knowing the likely paths of tumor progression is instrumental for cancer precision medicine as it would allow us to identify genetic targets that block disease progression and to improve therapeutic decisions. Direct information about paths of tumor progression is scarce, but cancer progression models (CPMs), which use as input cross-sectional data on genetic alterations, can be used to predict these paths. CPMs, however, make assumptions about fitness landscapes (genotype-fitness maps) that might not be met in cancer. We examine if four CPMs can be used to predict successfully the distribution of tumor progression paths; we find that some CPMs work well when sample sizes are large and fitness landscapes have a single fitness maximum, but in fitness landscapes with multiple fitness maxima prediction is poor. However, the best performing CPM in our study could be used to estimate evolutionary unpredictability. When we apply the best performing CPM in our study to twenty-two cancer data sets we find that predictions are generally unreliable but that some cancer data sets show low unpredictability. Our results highlight that CPMs could be valuable tools for predicting disease progression, but emphasize the need for methodological work to account for multi-peaked fitness landscapes.


2014 ◽  
Author(s):  
Ramon Diaz-Uriarte

Cancer progression is caused by the sequential accumulation of mutations, but not all orders of accumulation of mutations are equally likely. When the fixation of some mutations depends on the presence of previous ones, identifying restrictions in the order of accumulation of mutations can lead to the discovery of therapeutic targets and diagnostic markers. Using simulated data sets, I conducted a comprehensive comparison of the performance of all available methods to identify these restrictions from cross-sectional data. In contrast to previous work, I embedded restrictions within evolutionary models of tumor progression that included passengers (mutations not responsible for the development of cancer, known to be very common). This allowed me to assess the effects of having to filter out passengers, of sampling schemes, and of deviations from order restrictions. Poor choices of method, filtering, and sampling lead to large errors in all performance metrics. Having to filter passengers led to decreased performance, especially because true restrictions were missed. Overall, the best method for identifying order restrictions was Oncogenetic Trees, a fast and easy to use method that, although unable to recover dependencies of mutations on more than one mutation, showed good performance in most scenarios, superior to Conjunctive Bayesian Networks and Progression Networks. Single cell sampling provided no advantage, but sampling in the final stages of the disease vs. sampling at different stages had severe effects. Evolutionary model and deviations from order restrictions had major, and sometimes counterintuitive, interactions with other factors that affected performance. This paper provides practical recommendations for using these methods with experimental data. Moreover, it shows that it is both possible and necessary to embed assumptions about order restrictions and the nature of driver status within evolutionary models of cancer progression to evaluate the performance of inferential approaches.


2019 ◽  
Author(s):  
Runpu Chen ◽  
Steve Goodison ◽  
Yijun Sun

The interpretation of accumulating genomic data with respect to tumor evolution and cancer progression requires integrated models. We developed a computational approach that enables the construction of disease progression models using static sample data. Application to breast cancer data revealed a linear, branching evolutionary model with two distinct trajectories for malignant progression. Here, we used the progression model as a foundation to investigate the relationships between matched primary and metastasis breast tumor samples. Mapping paired data onto the model confirmed that molecular breast cancer subtypes can shift during progression, and supported directional tumor evolution through luminal subtypes to increasingly malignant states. Cancer progression modeling through the analysis of available static samples represents a promising breakthrough. Further refinement of a roadmap of breast cancer progression will facilitate the development of improved cancer diagnostics, prognostics and targeted therapeutics.


2019 ◽  
Vol 116 (19) ◽  
pp. 9501-9510 ◽  
Author(s):  
Noam Auslander ◽  
Yuri I. Wolf ◽  
Eugene V. Koonin

Cancer arises through the accumulation of somatic mutations over time. Understanding the sequence of mutation occurrence during cancer progression can assist early and accurate diagnosis and improve clinical decision-making. Here we employ long short-term memory (LSTM) networks, a class of recurrent neural network, to learn the evolution of a tumor through an ordered sequence of mutations. We demonstrate the capacity of LSTMs to learn complex dynamics of the mutational time series governing tumor progression, allowing accurate prediction of the mutational burden and the occurrence of mutations in the sequence. Using the probabilities learned by the LSTM, we simulate mutational data and show that the simulation results are statistically indistinguishable from the empirical data. We identify passenger mutations that are significantly associated with established cancer drivers in the sequence and demonstrate that the genes carrying these mutations are substantially enriched in interactions with the corresponding driver genes. Breaking the network into modules consisting of driver genes and their interactors, we show that these interactions are associated with poor patient prognosis, thus likely conferring growth advantage for tumor progression. Thus, application of LSTM provides for prediction of numerous additional conditional drivers and reveals hitherto unknown aspects of cancer evolution.


2021 ◽  
Author(s):  
Kim Philipp Jablonski ◽  
Martin Franz-Xaver Pirkl ◽  
Domagoj Cevid ◽  
Peter Buehlmann ◽  
Niko Beerenwinkel

Signaling pathways control cellular behavior. Dysregulated pathways, for example due to mutations that cause genes and proteins to be expressed abnormally, can lead to diseases, such as cancer. We introduce a novel computational approach, called Differential Causal Effects (dce), which compares normal to cancerous cells using the statistical framework of causality. The method allows to detect individual edges in a signaling pathway that are dysregulated in cancer cells, while accounting for confounding. Hence, artificial signals from, for example, batch effects have less influence on the result and dce has a higher chance to detect the biological signals. We show that dce outperforms competing methods on synthetic data sets and on CRISPR knockout screens. In an exploratory analysis on breast cancer data from TCGA, we recover known and discover new genes involved in breast cancer progression.


2020 ◽  
Author(s):  
Phillip B. Nicol ◽  
Dániel L. Barabási ◽  
Amir Asiaee ◽  
Kevin R. Coombes

Motivation: Cancer progression, including the development of intratumor heterogeneity, is inherently a spatial process. Mathematical models of tumor evolution can provide insights into patterns of heterogeneity that can emerge in the presence of spatial growth. Summary: We develop SITH, an R package that implements a lattice-based stochastic model of tumor growth and mutation. SITH provides 3D interactive visualizations of the simulated tumor and highlights heavily mutated regions. SITH can produce synthetic bulk and single-cell sequencing data sets by sampling from the tumor. The streamlined API will make SITH a useful tool for investigating the relationship between spatial growth and intratumor heterogeneity. Availability and Implementation: SITH is a part of CRAN and can thus be installed by running install.packages("SITH") from the R console. See https://CRAN.R-project.org/package=SITH for the user manual and package vignette.


2017 ◽  
Author(s):  
Ramon Diaz-Uriarte

The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to try to identify these constraints, and return Directed Acyclic Graphs (DAGs) of genes. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes (e.g., those with reciprocal sign epistasis) cannot be represented by CPMs. Using simulated data under 500 fitness landscapes, I show that CPMs’ performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime, and fitness landscape features, in ways that depend on CPM method. And the same DAG is often observed in very different landscapes, which differ in more than 50% of their accessible genotypes. Using a pancreatic data set, I show that this many-to-many relationship affects the analysis of empirical data. Fitness landscapes that are widely different from each other can, when evolutionary processes run repeatedly on them, both produce data similar to the empirically observed one, and lead to DAGs that are very different among themselves. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs.
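For readers unfamiliar with the term, reciprocal sign epistasis can be checked on a two-locus landscape with a few comparisons. The sketch below uses the standard two-locus definition: the sign of each mutation's fitness effect flips depending on whether the other mutation is present. It is not the analysis pipeline of the paper, and the fitness values in the usage lines are hypothetical.

```python
def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def has_reciprocal_sign_epistasis(
    w00: float, w10: float, w01: float, w11: float
) -> bool:
    """Two-locus check for reciprocal sign epistasis.

    w00: fitness of the wild type
    w10: fitness with only the first mutation
    w01: fitness with only the second mutation
    w11: fitness with both mutations
    """
    # Effect of the first mutation without vs. with the second one.
    first_flips = sign(w10 - w00) * sign(w11 - w01) < 0
    # Effect of the second mutation without vs. with the first one.
    second_flips = sign(w01 - w00) * sign(w11 - w10) < 0
    return first_flips and second_flips


# Hypothetical landscapes, for illustration only.
valley = has_reciprocal_sign_epistasis(1.0, 0.8, 0.8, 1.5)
smooth = has_reciprocal_sign_epistasis(1.0, 1.2, 1.1, 1.5)
print(valley, smooth)
```

In the first landscape each single mutant is less fit than the wild type while the double mutant is fitter, so reaching the double mutant requires crossing a fitness valley. A DAG of order restrictions has no way to encode that pattern.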


2020 ◽  
Author(s):  
Phillip B. Nicol ◽  
Kevin R. Coombes ◽  
Courtney Deaver ◽  
Oksana A. Chkrebtii ◽  
Subhadeep Paul ◽  
...  

Cancer is the process of accumulating genetic alterations that confer selective advantages to tumor cells. The order in which aberrations occur is not arbitrary, and inferring the order of events is a challenging problem due to the lack of longitudinal samples from tumors. Moreover, a network model of oncogenesis should capture biological facts such as distinct progression trajectories of cancer subtypes and patterns of mutual exclusivity of alterations in the same pathways. In this paper, we present the Disjunctive Bayesian Network (DBN), a novel cancer progression model. Unlike previous models of oncogenesis, DBN naturally captures mutually exclusive alterations. Besides, DBN is flexible enough to represent progression trajectories of cancer subtypes, therefore allowing one to learn the progression network from unstratified data, i.e., mixed samples from multiple subtypes. We provide a scalable genetic algorithm to learn the structure of DBN from cross-sectional cancer data. To test our model, we simulate synthetic data from known progression networks and show that our algorithm infers the ground truth network with high accuracy. Finally, we apply our model to copy number data for colon cancer and mutation data for bladder cancer and observe that the recovered progression network matches known biological facts.


Author(s):  
Donata Grimm ◽  
Sofia Mathes ◽  
Linn Woelber ◽  
Caroline Van Aken ◽  
Barbara Schmalfeldt ◽  
...  

Purpose: The aim of this multicenter cross-sectional study was to analyze a cohort of breast (BC) and gynecological cancer (GC) patients regarding their interest in, perception of and demand for integrative therapeutic health approaches. Methods: BC and GC patients were surveyed at their first integrative clinic visit using validated standardized questionnaires. Treatment goals and potential differences between the two groups were evaluated. Results: 340 patients (272 BC, 68 GC) participated in the study. The overall interest in integrative medicine (IM) was 95.3% and correlated with older age, recent chemotherapy, and higher education. A total of 89.4% were using integrative methods at the time of enrolment, primarily exercise therapy (57.5%) and vitamin supplementation (51.4%). The major short-term goal of the BC patients was a reduction in the side effects of conventional therapy (70.4%); the major long-term goal was the delay of a potential tumor progression (69.3%). In the GC group, major short-term and long-term goals were slowing tumor progression (73.1% and 79.1%) and prolonging survival (70.1% and 80.6%). GC patients were significantly more impaired by the side effects of conventional treatment than BC patients [pain (p = 0.006), obstipation (p < 0.005)]. Conclusion: Our data demonstrate a high overall interest in and use of IM in BC and GC patients. This supports the need for specialized IM counseling and the implementation of integrative treatments into conventional oncological treatment regimes in both patient groups. Primary tumor site, cancer diagnosis, treatment phase, and side effects had a relevant impact on the demand for IM in our study population.

