Estimation of cancer cell fractions and clone trees from multi-region sequencing of tumors

2021 ◽  
Author(s):  
Lily Zheng ◽  
Laura Wood ◽  
Rachel Karchin ◽  
Robert B Scharpf

Multi-region sequencing of one or multiple biopsies of solid tumors from a patient can be used to improve our understanding of the diversity of subclones in the patient's tumor and shed light on the evolutionary history of the disease. Due to the large number of possible evolutionary relationships between clones and the fundamental uncertainty of the mutational composition of subclones, elucidating the most probable evolutionary relationships poses statistical and computational challenges. We developed a Bayesian hierarchical model called PICTograph to model uncertainty in the assignment of mutations to subclones and an approach to reduce the space of possible graphical models that postulate their evolutionary origin. Compared to available methods, our approach provided more consistent and accurate estimates of cancer cell fractions and better tree topology reconstruction over a range of simulated clonal diversity. Application of PICTograph to whole exome sequencing data of individuals with pancreatic cancer precursor lesions confirmed known early occurring mutations and indicated substantial molecular diversity, including multiple distinct subclones (range 6 - 12) and intra-sample mixing of subclones. As the complete evolutionary history for some patients was not identifiable, we used ensemble-based visualizations to distinguish between highly probable evolutionary relationships recovered in multiple models from uncertain relationships occurring in a small subset of models. These analyses indicate that PICTograph provides a useful approximation to evolutionary inference, particularly when the evolutionary course of a patient's cancer is complex.
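
As a point of reference for the term "cancer cell fraction", the following minimal sketch applies the widely used point-estimate conversion from variant allele frequency (VAF) to cancer cell fraction. It is not PICTograph's Bayesian hierarchical model; the function name, parameter names, and default copy numbers are assumptions made for illustration.

```python
def cancer_cell_fraction(vaf, purity, tumor_cn=2, normal_cn=2, multiplicity=1):
    """Point estimate of the cancer cell fraction (CCF) for one mutation.

    Assumes the standard relation
        VAF = purity * multiplicity * CCF
              / (purity * tumor_cn + (1 - purity) * normal_cn)
    and solves it for CCF. All argument names are illustrative.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the mixed sample.
    average_copies = purity * tumor_cn + (1 - purity) * normal_cn
    ccf = vaf * average_copies / (purity * multiplicity)
    # Sampling noise can push the estimate above 1, so cap it.
    return min(ccf, 1.0)


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region of a 60%-pure sample.
    print(cancer_cell_fraction(vaf=0.15, purity=0.6))
```

A point estimate like this ignores sequencing noise and uncertainty in mutation-to-clone assignment, which is the uncertainty the abstract says the hierarchical model is meant to capture.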

2018 ◽  
Author(s):  
Ke Yuan ◽  
Geoff Macintyre ◽  
Wei Liu ◽  
Florian Markowetz ◽  

Estimating and clustering cancer cell fractions of genomic alterations are central tasks for studying intratumour heterogeneity. We present Ccube, a probabilistic framework for inferring the cancer cell fraction of somatic point mutations and the subclonal composition from whole-genome sequencing data. We develop a variational inference method for model fitting, which allows us to handle samples with a large number of variants (more than 2 million) while quantifying uncertainty in a Bayesian fashion. Ccube is available at https://github.com/keyuan/ccube.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e13053-e13053
Author(s):  
Tiancheng Han ◽  
Jianing Yu ◽  
Xiaojing Lin ◽  
Hongyu Xie ◽  
Xue Song ◽  
...  

Background: Circulating tumor DNA (ctDNA) has shown potential in early- and late-stage cancer detection, tumor genotyping, and monitoring of post-operative recurrence. The fraction of ctDNA in cell-free DNA (denoted ccf here), in addition to standard SNV/INDEL/CNV analysis, has also been shown to be associated with tumor progression and prognosis. In theory, an accurate ccf can also be used to correct and improve SNV/INDEL/CNV results. Existing tools capable of calculating ccf (PureCN, FACETS, Sequenza, etc.) use coverage data in targeted regions and SNP allele frequencies to calculate the tumor fraction, and fail to give accurate estimates at relatively low ctDNA concentrations. Methods: A maximum likelihood model was built to estimate ccf. We first selected informative SNPs with significantly different variant allele frequency (VAF) in the case and paired-control samples. The mutation type of an informative SNP was determined by its VAF in the paired samples and the copy number of the case sample. The likelihood of each SNP given a specific ccf was then calculated. After clustering SNPs into clones, the ccf of each clone was estimated using a global likelihood. Results: Performance of the method was validated by ctDNA dilution series analysis. Six cfDNA samples from cancer patients were diluted (concentrations: 1/3 - 1/81). The detection limit of the method is ~2%, and the correlation between estimated and expected ccf ranged from 0.93 to 0.98. Conclusions: We have developed a novel method to better estimate cancer cell fractions in cell-free DNA. Results showed that our method is able to calculate ccf at lower ctDNA concentrations with higher accuracy and stability than the benchmarked tools. We describe here a method for targeted-sequencing data that is more sensitive, accurate and stable than currently available tools.
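
The idea of scoring candidate ccf values by a likelihood summed over sites can be illustrated with a deliberately simplified sketch. It assumes clonal, heterozygous variants in diploid regions, so that the expected VAF is ccf / 2, and a binomial read-count model with a grid search; the abstract's actual model also uses mutation type, copy number, and clone clustering, none of which is reproduced here. All function and parameter names are hypothetical.

```python
import math


def log_likelihood(fraction, alt_counts, depths):
    """Binomial log-likelihood of a tumor fraction, up to an additive constant.

    Simplifying assumption: every site is a clonal heterozygous variant in a
    diploid region, so the expected VAF equals fraction / 2.
    """
    p = fraction / 2.0
    total = 0.0
    for alt, depth in zip(alt_counts, depths):
        # The binomial coefficient does not depend on `fraction`, so it is omitted.
        total += alt * math.log(p) + (depth - alt) * math.log(1.0 - p)
    return total


def estimate_fraction(alt_counts, depths, grid_size=1000):
    """Grid-search maximum likelihood estimate of the tumor fraction in (0, 1]."""
    grid = [(i + 1) / grid_size for i in range(grid_size)]
    return max(grid, key=lambda f: log_likelihood(f, alt_counts, depths))


if __name__ == "__main__":
    # Two sites with pooled VAF 30 / 600 = 0.05.
    print(estimate_fraction(alt_counts=[10, 20], depths=[200, 400]))
```

Pooling evidence across sites in one global likelihood is what lets an estimate stay stable when each individual site carries only a few variant reads.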


IMA Fungus ◽  
2021 ◽  
Vol 12 (1) ◽  
Author(s):  
João P. M. Araújo ◽  
Mitsuru G. Moriguchi ◽  
Shigeru Uchiyama ◽  
Noriko Kinjo ◽  
Yu Matsuura

The entomopathogenic genus Ophiocordyceps includes a highly diverse group of fungal species, predominantly parasitizing insects in the orders Coleoptera, Hemiptera, Hymenoptera and Lepidoptera. However, other insect orders are also parasitized by these fungi, for example the Blattodea (termites and cockroaches). Despite their ubiquity in nearly all environments where insects occur, blattodeans are rarely found infected by filamentous fungi, and thus their ecology and evolutionary history remain obscure. In this study, we propose a new species of Ophiocordyceps infecting the social cockroaches Salganea esakii and S. taiwanensis, based on 16 years of collections and field observations in Japan, especially in the Ryukyu Archipelago. We found a high degree of genetic similarity between specimens from different islands infecting these two Salganea species, and that this relationship is ancient, likely not originating from a recent host jump. Furthermore, we found that Ophiocordyceps lineages infecting cockroaches evolved at least twice, around the same time, once from beetles and once from termites. We have also investigated the evolutionary relationships between Ophiocordyceps and termites and present the phylogenetic placement of O. cf. blattae. Our analyses also show that O. sinensis could have originated from an ancestor infecting termites, rather than beetle larvae as previously proposed.


Author(s):  
Olga Kozhar ◽  
Mee-Sook Kim ◽  
Jorge Ibarra Caballero ◽  
Ned Klopfenstein ◽  
Phil Cannon ◽  
...  

Emerging pathogens have been increasing exponentially over the last century. Knowledge of whether these organisms are native to ecosystems or have been recently introduced is often of great importance. Understanding the ecological and evolutionary processes promoting emergence can help to control their spread and forecast epidemics. Using restriction site-associated DNA sequencing data, we studied genetic relationships, pathways of spread, and evolutionary history of Phellinus noxius, an emerging root-rotting fungus of unknown origin, in eastern Asia, Australia, and the Pacific Islands. We analyzed patterns of genetic variation using Bayesian inference, maximum likelihood phylogeny, population splits and mixtures measuring correlations in allele frequencies and genetic drift, and finally applied coalescent-based theory using approximate Bayesian computation (ABC) with supervised machine learning. Population structure analyses revealed five genetic groups with signatures of complex recent and ancient migration histories. The most probable scenario of ancient pathogen spread is movement from west to east: from Malaysia to the Pacific Islands, with subsequent spread to Taiwan and Australia. Furthermore, ABC analyses indicate that P. noxius spread occurred thousands of generations ago, contradicting previous assumptions that it was recently introduced in multiple areas. Our results suggest that recent emergence of P. noxius in east Asia, Australia, and the Pacific Islands is likely driven by anthropogenic and natural disturbances, including deforestation, land-use change, severe weather events, and introduction of exotic plants. This study provides a novel example of the use of genome-wide allele frequency data to unravel the dynamics of pathogen emergence under conditions of changing ecosystems.


2019 ◽  
Author(s):  
Aaron P. Ragsdale ◽  
Simon Gravel

Linkage disequilibrium is used to infer evolutionary history and to identify regions under selection or associated with a given trait. In each case, we require accurate estimates of linkage disequilibrium from sequencing data. Unphased data present a challenge because the co-occurrence of alleles at different loci is ambiguous. Commonly used estimators for the common statistics r² and D² exhibit large and variable upward biases that complicate interpretation and comparison across cohorts. Here, we show how to find unbiased estimators for a wide range of two-locus statistics, including D², for both single and multiple randomly mating populations. These provide accurate estimates over three orders of magnitude in LD. We also use these estimators to construct an estimator for r² that is less biased than commonly used estimators, but nevertheless argue for using σ_d² rather than r² for population size estimates.
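
For readers unfamiliar with the two-locus statistics named in this abstract, the sketch below computes the textbook plug-in values of D and r^2 from phased haplotype counts. This is the naive estimator for phased data, not the unbiased estimators for unphased data that the abstract describes; the function and argument names are illustrative assumptions.

```python
def ld_from_haplotype_counts(n_AB, n_Ab, n_aB, n_ab):
    """Plug-in estimates of D and r^2 from phased two-locus haplotype counts.

    D   = p_AB - p_A * p_B
    r^2 = D^2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second locus
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d, d * d / denom


if __name__ == "__main__":
    # Perfect association between the two loci.
    print(ld_from_haplotype_counts(50, 0, 0, 50))
```

With finite samples, plug-in values like these are biased, and with unphased genotypes the haplotype counts are not directly observed, which is the difficulty the abstract addresses.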


2019 ◽  
Vol 11 (9) ◽  
pp. 2531-2541 ◽  
Author(s):  
Valeria Mateo-Estrada ◽  
Lucía Graña-Miraglia ◽  
Gamaliel López-Leal ◽  
Santiago Castillo-Ramírez

The Gram-negative Acinetobacter genus has several species of clear medical relevance. Many fully sequenced genomes belonging to the genus have been published in recent years; however, there has not been a recent attempt to infer the evolutionary history of Acinetobacter with that vast amount of information. Here, through a phylogenomic approach, we established the most up-to-date view of the evolutionary relationships within this genus and highlighted several cases of poor classification, especially for the very closely related species within the Acinetobacter calcoaceticus–Acinetobacter baumannii complex (Acb complex). Furthermore, we determined appropriate phylogenetic markers for this genus and showed that concatenation of the top 13 gives a very decent reflection of the evolutionary relationships for the genus Acinetobacter. The intersection between our top markers and previously defined universal markers is very small. In general, our study shows that, although there seem to be hardly any universal markers, bespoke phylogenomic approaches can be used to infer the phylogeny of different bacterial genera. We expect that ad hoc phylogenomic approaches will be the standard in the years to come and will provide enough information to resolve intricate evolutionary relationships like those observed in the Acb complex.


2006 ◽  
Vol 04 (01) ◽  
pp. 59-74 ◽  
Author(s):  
YING-JUN HE ◽  
TRINH N. D. HUYNH ◽  
JESPER JANSSON ◽  
WING-KIN SUNG

To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) branching information is lost. However, little work has been done so far on inferring a phylogenetic tree/network from a specified set of trees when in addition, certain evolutionary relationships among the species are known to be highly unlikely. In this paper, we consider the problem of constructing a phylogenetic tree/network which is consistent with all of the rooted triplets in a given set [Formula: see text] and none of the rooted triplets in another given set [Formula: see text]. Although NP-hard in the general case, we provide some efficient exact and approximation algorithms for a number of biologically meaningful variants of the problem.
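
The consistency condition in this abstract can be made concrete with a small checker. A rooted triplet ab|c is consistent with a rooted tree when the lowest common ancestor of a and b lies strictly below the lowest common ancestor of a and c. The sketch below only verifies a given tree against a required and a forbidden set of triplets; it does not construct a tree and is not one of the paper's algorithms. The child-to-parent dictionary representation and all names are assumptions made for illustration.

```python
def ancestors(parent, node):
    """Return the path from `node` up to the root, starting with `node`."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def lca(parent, x, y):
    """Lowest common ancestor of x and y in a tree given as a child -> parent map."""
    seen = set(ancestors(parent, x))
    for node in ancestors(parent, y):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def depth(parent, node):
    """Number of edges between `node` and the root."""
    return len(ancestors(parent, node)) - 1


def is_consistent(parent, triplet):
    """True if the tree displays the rooted triplet (a, b, c), read as ab|c."""
    a, b, c = triplet
    # Both LCAs are ancestors of a, so the deeper one is a proper descendant.
    return depth(parent, lca(parent, a, b)) > depth(parent, lca(parent, a, c))


def satisfies(parent, required, forbidden):
    """True if the tree displays every required triplet and no forbidden one."""
    return (all(is_consistent(parent, t) for t in required)
            and not any(is_consistent(parent, t) for t in forbidden))


if __name__ == "__main__":
    # Tree ((a, b), c): leaves a and b share the internal node x.
    tree = {"a": "x", "b": "x", "x": "root", "c": "root"}
    print(satisfies(tree, required=[("a", "b", "c")], forbidden=[("a", "c", "b")]))
```

Checking a candidate tree is straightforward; the hard part, per the abstract, is finding a tree or network that meets both sets of constraints.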


Author(s):  
Bo R. Rueda ◽  
Anne M. Friel ◽  
Ling Zhang ◽  
Michael D. Curley ◽  
Gayatry Mohapatra ◽  
...  
