Conifer: Clonal Tree Inference for Tumor Heterogeneity With Single-cell and Bulk Sequencing Data

Author(s):  
Leila Baghaarabani ◽  
Sama Goliaei ◽  
Mohammad-Hadi Foroughmand-Araabi ◽  
Seyed Peyman Shariatpanahi ◽  
Bahram Goliaei

Abstract Background: An important and effective step in cancer treatment is understanding the clonal evolution of cancer tumors. Clones are cell populations with different genotypes, resulting from the differences in the somatic mutations that occur and accumulate during cancer development. An appropriate approach for better understanding a tumor population is determining the variant allele frequency with which the mutation occurs in the entire population. Bulk sequencing data can be used to provide that information, but the frequencies are not informative enough for identifying different clones and their evolutionary relationships. On the other hand, single-cell sequencing data provides valuable information about branching events in the evolution of a cancerous tumor. However, in single-cell sequencing data, the total population of sequenced cells is naturally much smaller than in bulk sequencing, so it is not precise enough for calculating cell prevalence. Result: In this study, a new method called Conifer (ClONal tree Inference For hEterogeneity of tumoR) is proposed which combines aggregated variant allele frequency from bulk sequencing data with branch evolution information from single-cell sequencing data, in order to better understand clones and their evolutionary relationships. It is proven that the accuracy of clone identification is increased by using Conifer compared to other existing methods in both real and simulated data. Also, it is shown that the approach of Conifer in using single-cell sequencing data together with bulk sequencing data has reduced the possibility of cloning mutations with similar frequency but belonging to different clones. Conclusions: In this study, we provided an accurate and robust method to identify clones of tumor heterogeneity and their evolutionary history by combining single-cell and bulk sequencing data.

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Leila Baghaarabani ◽  
Sama Goliaei ◽  
Mohammad-Hadi Foroughmand-Araabi ◽  
Seyed Peyman Shariatpanahi ◽  
Bahram Goliaei

Abstract Background: Genetic heterogeneity of a cancer tumor that develops during clonal evolution is one of the reasons for cancer treatment failure, by increasing the chance of drug resistance. Clones are cell populations with different genotypes, resulting from differences in somatic mutations that occur and accumulate during cancer development. An appropriate approach for identifying clones is determining the variant allele frequency of mutations that occurred in the tumor. Although bulk sequencing data can be used to provide that information, the frequencies are not informative enough for identifying different clones with the same prevalence and their evolutionary relationships. On the other hand, single-cell sequencing data provides valuable information about branching events in the evolution of a cancerous tumor. However, the temporal order of mutations may be determined with ambiguities using only single-cell data, while variant allele frequencies from bulk sequencing data can provide beneficial information for inferring the temporal order of mutations with fewer ambiguities. Result: In this study, a new method called Conifer (ClONal tree Inference For hEterogeneity of tumoR) is proposed which combines aggregated variant allele frequency from bulk sequencing data with branching event information from single-cell sequencing data to more accurately identify clones and their evolutionary relationships. It is proven that the accuracy of clone identification and clonal tree inference is increased by using Conifer compared to other existing methods on various sets of simulated data. In addition, it is discussed that the evolutionary tree provided by Conifer on real cancer data sets is highly consistent with information in both bulk and single-cell data. Conclusions: In this study, we have provided an accurate and robust method to identify clones of tumor heterogeneity and their evolutionary history by combining single-cell and bulk sequencing data.


2021 ◽  
Author(s):  
Taro Matsutani ◽  
Michiaki Hamada

Intra-tumor heterogeneity is a phenomenon in which mutation profiles differ from cell to cell within the same tumor and is observed in almost all tumors. Understanding intra-tumor heterogeneity is essential from the clinical perspective. Numerous methods have been developed to predict this phenomenon based on variant allele frequency. Among the methods, CloneSig models the variant allele frequency and mutation signatures simultaneously and provides an accurate clone decomposition. However, this method has limitations in terms of clone number selection and modeling. We propose SigTracer, a novel hierarchical Bayesian approach for analyzing intra-tumor heterogeneity based on mutation signatures to tackle these issues. We show that SigTracer predicts more reasonable clone decompositions than the existing methods on artificial data that mimic cancer genomes. We applied SigTracer to whole-genome sequences of blood cancer samples. The results were consistent with past findings that single base substitutions caused by a specific signature (previously reported as SBS9) related to the activation-induced cytidine deaminase intensively lie within immunoglobulin-coding regions for chronic lymphocytic leukemia samples. Furthermore, we showed that this signature mutates regions responsible for cell-cell adhesion. Accurate assignments of mutations to signatures by SigTracer can provide novel insights into signature origins and mutational processes.


2021 ◽  
Vol 3 (4) ◽  
Author(s):  
Taro Matsutani ◽  
Michiaki Hamada

Abstract Intra-tumor heterogeneity is a phenomenon in which mutation profiles differ from cell to cell within the same tumor and is observed in almost all tumors. Understanding intra-tumor heterogeneity is essential from the clinical perspective. Numerous methods have been developed to predict this phenomenon based on variant allele frequency. Among the methods, CloneSig models the variant allele frequency and mutation signatures simultaneously and provides an accurate clone decomposition. However, this method has limitations in terms of clone number selection and modeling. We propose SigTracer, a novel hierarchical Bayesian approach for analyzing intra-tumor heterogeneity based on mutation signatures to tackle these issues. We show that SigTracer predicts more reasonable clone decompositions than the existing methods against artificial data that mimic cancer genomes. We applied SigTracer to whole-genome sequences of blood cancer samples. The results were consistent with past findings that single base substitutions caused by a specific signature (previously reported as SBS9) related to the activation-induced cytidine deaminase intensively lie within immunoglobulin-coding regions for chronic lymphocytic leukemia samples. Furthermore, we showed that this signature mutates regions responsible for cell–cell adhesion. Accurate assignments of mutations to signatures by SigTracer can provide novel insights into signature origins and mutational processes.


2018 ◽  
Author(s):  
Pavel Skums ◽  
Vyacheslau Tsivina ◽  
Alex Zelikovsky

Abstract Intra-tumor heterogeneity is one of the major factors influencing cancer progression and treatment outcome. However, evolutionary dynamics of cancer clone populations remain poorly understood. Quantification of clonal selection and inference of fitness landscapes of tumors is a key step to understanding evolutionary mechanisms driving cancer. These problems could be addressed using single cell sequencing, which provides an unprecedented insight into intra-tumor heterogeneity, making it possible to study and quantify selective advantages of individual clones. Here we present SCIFIL, a computational tool for inference of fitness landscapes of heterogeneous cancer clone populations from single cell sequencing data. SCIFIL makes it possible to estimate maximum likelihood fitnesses of clone variants and to measure their selective advantages and order of appearance by fitting an evolutionary model to the tumor phylogeny. We demonstrate the accuracy and utility of our approach on simulated and experimental data. SCIFIL can be used to provide new insight into the evolutionary dynamics of cancer. Its source code is available at https://github.com/compbel/SCIFIL


2019 ◽  
Vol 35 (14) ◽  
pp. i398-i407 ◽  
Author(s):  
Pavel Skums ◽  
Viachaslau Tsyvina ◽  
Alex Zelikovsky

Abstract Summary: Intra-tumor heterogeneity is one of the major factors influencing cancer progression and treatment outcome. However, evolutionary dynamics of cancer clone populations remain poorly understood. Quantification of clonal selection and inference of fitness landscapes of tumors is a key step to understanding evolutionary mechanisms driving cancer. These problems could be addressed using single-cell sequencing (scSeq), which provides an unprecedented insight into intra-tumor heterogeneity, making it possible to study and quantify selective advantages of individual clones. Here, we present Single Cell Inference of FItness Landscape (SCIFIL), a computational tool for inference of fitness landscapes of heterogeneous cancer clone populations from scSeq data. SCIFIL makes it possible to estimate maximum likelihood fitnesses of clone variants and to measure their selective advantages and order of appearance by fitting an evolutionary model to the tumor phylogeny. We demonstrate the accuracy of our approach, and show how it could be applied to experimental tumor data to study clonal selection and infer evolutionary history. SCIFIL can be used to provide new insight into the evolutionary dynamics of cancer. Availability and implementation: Its source code is available at https://github.com/compbel/SCIFIL.


2016 ◽  
Author(s):  
Hamim Zafar ◽  
Anthony Tzen ◽  
Nicholas Navin ◽  
Ken Chen ◽  
Luay Nakhleh

Abstract Single-cell sequencing (SCS) enables the inference of tumor phylogenies that provide insights on intra-tumor heterogeneity and evolutionary trajectories. Recently introduced methods perform this task under the infinite-sites assumption, violations of which, due to chromosomal deletions and loss of heterozygosity, necessitate the development of inference methods that utilize finite-sites models. We propose a statistical inference method for tumor phylogenies from noisy SCS data under a finite-sites model. The performance of our method on synthetic datasets and on experimental datasets from two colorectal cancer patients, used to trace evolutionary lineages in primary and metastatic tumors, suggests that employing a finite-sites model leads to improved inference of tumor phylogenies.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 2730-2730
Author(s):  
Maher Albitar ◽  
Marina Y Konopleva ◽  
Ivan De Dios ◽  
Jeffrey Justin Estella ◽  
Spiraggelos Antzoulatos ◽  
...  

Introduction: Isocitrate dehydrogenase 1 and 2 (IDH1/2) are homodimeric enzymes that play an important role in cellular metabolism, epigenetic regulation, and DNA repair. Early studies suggested that mutations in IDH1/2 were loss-of-function mutations associated with a tumor suppressor function. However, biallelic mutations are extremely rare, and studies demonstrate that mutant IDH1/2 enzymes are responsible for NADPH-dependent reduction of αKG to the oncometabolite d-2-hydroxyglutarate (D2HG), suggesting an oncoprotein. Cellular RNA levels are tightly regulated by very complex cellular processes, and the regulation of mutant mRNA in cancer cells is rarely studied. We explored the effects of IDH1/2 mutations on mRNA levels in patients with acute myeloid leukemia (AML). Using next generation sequencing (NGS) and variant allele frequency (VAF) of mutant RNA, we compared relative mutant mRNA or variant allele frequency (RNA-VAF) with variant allele frequency of mutant DNA (DNA-VAF) in the same samples from patients with AML. Methods: RNA and DNA were extracted from 48 bone marrow and peripheral blood samples from patients with confirmed AML, including 12 patients with IDH1 mutations, 2 with IDH2 mutation and 34 samples from AML without IDH1/2 mutations. Samples were collected pretherapy as well as while on therapy. We sequenced the DNA using a 177-gene panel and the RNA using a 1408-gene panel. The DNA sequencing is based on Single Primer Extension (SPE) library preparation with unique molecular identifier (UMI) (Qiagen, Germantown, MD). Average coverage of DNA sequencing was >1000X. The RNA sequencing is based on hybrid capture and the number of reads ranged from 5 to 10 million. Sequencing data of DNA is analyzed using the DRAGEN Platform. Sequence duplicates were removed before calculating VAF. The RNA sequencing data is analyzed using Illumina basespace. RNA VAF is also calculated after removing duplicates, using the Isaac variant caller. Only mutations detected by both DNA and RNA variant callers are compared. Results: A total of 176 mutations were detected using the DNA panel and 122 mutations using the RNA panel. Some mutations were called by the RNA variant caller but not by the DNA variant caller, and vice versa. All mutations detected in IDH1 and IDH2 were detected in both DNA and RNA. When the IDH1/2 mutations are considered (n=14), the VAF in RNA (median: 41%, range: 13%-74%) was significantly higher (P=0.006, Wilcoxon matched pairs test) as compared with DNA (median: 28%, range: 13%-74%). The VAF of the other 31 mutations that were detected in both DNA and RNA varied depending on the gene. ASXL1, DNMT3A, RUNX1, PTPN11, SRSF2, STAG2 and U2AF1 mutations showed no significant difference between DNA and RNA in VAF (P=0.71). Although the number is small, mutations in NRAS and NPM1 showed significantly higher VAF in RNA as compared with that of DNA (P=0.008). Conclusion: This data suggests that, in general, stability of mutant RNA varies between genes and between the mutations in the same gene. Mutant IDH1/2 RNA is significantly more stable in myeloid leukemic cells as compared with the wild-type mRNA. Most likely this reflects increased levels of mutant IDH1/2 as compared with wild-type IDH1/2, confirming that IDH1/2 is an oncoprotein, and may explain the efficacy of therapeutic inhibition of IDH1/2 in treating cancers. Furthermore, this suggests that mRNA testing might be more sensitive in monitoring minimal residual disease in patients with IDH1/2 mutations.

Disclosures: Albitar: Genomic Testing Cooperative: Employment, Equity Ownership. Konopleva: Forty-Seven: Consultancy, Honoraria; Stemline Therapeutics: Consultancy, Honoraria, Research Funding; Calithera: Research Funding; Eli Lilly: Research Funding; AbbVie: Consultancy, Honoraria, Research Funding; Cellectis: Research Funding; Amgen: Consultancy, Honoraria; F. Hoffman La-Roche: Consultancy, Honoraria, Research Funding; Genentech: Honoraria, Research Funding; Ascentage: Research Funding; Kisoji: Consultancy, Honoraria; Reata Pharmaceuticals: Equity Ownership, Patents & Royalties; Ablynx: Research Funding; Astra Zeneca: Research Funding; Agios: Research Funding. Loghavi: GLG Consultants: Consultancy; AlphaSights: Consultancy; MDACC: Employment. Takahashi: Symbio Pharmaceuticals: Consultancy. Kantarjian: Jazz Pharma: Research Funding; Pfizer: Honoraria, Research Funding; Ariad: Research Funding; Cyclacel: Research Funding; Novartis: Research Funding; Astex: Research Funding; Takeda: Honoraria; Agios: Honoraria, Research Funding; BMS: Research Funding; Actinium: Honoraria, Membership on an entity's Board of Directors or advisory committees; AbbVie: Honoraria, Research Funding; Amgen: Honoraria, Research Funding; Daiichi-Sankyo: Research Funding; Immunogen: Research Funding. DiNardo: medimmune: Honoraria; agios: Consultancy, Honoraria; notable labs: Membership on an entity's Board of Directors or advisory committees; jazz: Honoraria; abbvie: Consultancy, Honoraria; celgene: Consultancy, Honoraria; daiichi sankyo: Honoraria; syros: Honoraria.


2020 ◽  
Vol 8 (Suppl 3) ◽  
pp. A520-A520
Author(s):  
Son Pham ◽  
Tri Le ◽  
Tan Phan ◽  
Minh Pham ◽  
Huy Nguyen ◽  
...  

Background: Single-cell sequencing technology has opened an unprecedented ability to interrogate cancer. It reveals significant insights into intratumoral heterogeneity, metastasis, and therapeutic resistance, which facilitates target discovery and validation in cancer treatment. With rapid advancements in throughput and strategies, a particular immuno-oncology study can produce multi-omics profiles for several thousands of individual cells. This overflow of single-cell data poses formidable challenges, including standardizing data formats across studies, performing reanalysis for individual datasets and meta-analysis. Methods: N/A. Results: We present BioTuring Browser, an interactive platform for accessing and reanalyzing published single-cell omics data. The platform is currently hosting a curated database of more than 10 million cells from 247 projects, covering more than 120 immune cell types and subtypes, and 15 different cancer types. All data are processed and annotated with standardized labels of cell types, diseases, therapeutic responses, etc. to be instantly accessed and explored in a uniform visualization and analytics interface. Based on this massive curated database, BioTuring Browser supports searching similar expression profiles, querying a target across datasets and automatic cell type annotation. The platform supports single-cell RNA-seq, CITE-seq and TCR-seq data. BioTuring Browser is now available for download at www.bioturing.com. Conclusions: N/A


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Atsushi Kondo ◽  
China Nagano ◽  
Shinya Ishiko ◽  
Takashi Omori ◽  
Yuya Aoto ◽  
...  

Abstract Gitelman syndrome is an autosomal recessive inherited salt-losing tubulopathy. It has a prevalence of around 1 in 40,000 people, and heterozygous carriers are estimated at approximately 1%, although the exact prevalence is unknown. We estimated the predicted prevalence of Gitelman syndrome based on multiple genome databases, HGVD and jMorp for the Japanese population and gnomAD for other ethnicities, and included all 274 pathogenic missense or nonsense variants registered in HGMD Professional. The frequencies of all these alleles were summed to calculate the total variant allele frequency in SLC12A3. The carrier frequency and the disease prevalence were assumed to be twice and the square of the total allele frequency, respectively, according to the Hardy–Weinberg principle. In the Japanese population, the total carrier frequencies were 0.0948 (9.5%) and 0.0868 (8.7%) and the calculated prevalence was 0.00225 (2.3 in 1000 people) and 0.00188 (1.9 in 1000 people) in HGVD and jMorp, respectively. Other ethnicities showed a prevalence varying from 0.000012 to 0.00083. These findings indicate that the prevalence of Gitelman syndrome in the Japanese population is higher than expected and that some other ethnicities also have a higher prevalence than has previously been considered.
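
The Hardy–Weinberg arithmetic described in this abstract can be illustrated with a minimal Python sketch. It assumes only what the abstract states: the total allele frequency q is the sum of the per-variant frequencies, the carrier frequency is taken as 2q, and the prevalence as q squared. The function names and the example per-variant frequencies are hypothetical and are not taken from the paper or from any database.

```python
from typing import Iterable, Tuple


def hardy_weinberg_estimates(allele_frequencies: Iterable[float]) -> Tuple[float, float, float]:
    """Return (total allele frequency q, carrier frequency, prevalence).

    Follows the simplification stated in the abstract: carrier = 2q and
    prevalence = q^2 (the exact heterozygote term 2q(1 - q) is approximated
    by 2q, which is close when q is small).
    """
    q = sum(allele_frequencies)
    carrier_frequency = 2 * q
    prevalence = q ** 2
    return q, carrier_frequency, prevalence


def prevalence_from_carrier_frequency(carrier_frequency: float) -> float:
    """Invert carrier = 2q to recover q, then return the prevalence q^2."""
    q = carrier_frequency / 2
    return q ** 2


if __name__ == "__main__":
    # Hypothetical per-variant allele frequencies, for illustration only.
    print(hardy_weinberg_estimates([0.01, 0.02]))

    # Carrier frequencies reported in the abstract for the two Japanese databases.
    for database, carrier in (("HGVD", 0.0948), ("jMorp", 0.0868)):
        print(database, round(prevalence_from_carrier_frequency(carrier), 5))
```

Applied to the reported carrier frequencies of 0.0948 and 0.0868, this arithmetic agrees with the prevalence values of 0.00225 and 0.00188 quoted in the abstract after rounding.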

