Targeted Gene Profiling Identifies Differential Genetic Make-up Depending On Chronic Lymphocytic Leukaemia Subtype

Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 1383-1383
Author(s):  
Adam Burns ◽  
Ruth Clifford ◽  
Helene Dreau ◽  
Chris S.R Hatton ◽  
Shirley Henderson ◽  
...  

Abstract 1383. Explorative genome-wide next-generation sequencing of leukaemias and lymphomas has revealed a wide spectrum of acquired mutations and considerable tumour heterogeneity that might be responsible for disease initiation, resistance to treatments and relapse. There is, therefore, a clinical need to identify these genetic abnormalities in a diagnostic setting. Here, we present the development and validation of a targeted next-generation mutation analysis tool. To compare the distribution pattern of genetic abnormalities in chronic lymphocytic leukaemia (CLL), we performed targeted deep sequencing on CLL samples using a TruSeq custom-designed targeted amplicon assay (TSCA, Illumina). We reveal differential mutation distribution patterns depending on clinical CLL subgroups. The TSCA panel was designed to amplify 21 genes (Table 1) with known or suspected roles either in the development of CLL or as response predictors, including TP53, SF3B1 (Puente, Nature, 2011; Quesada et al, 2012) and NOTCH1 (Rossi, Blood, 2012). Where genes have known mutational hotspots in CLL, only these regions were included in our panel, for example exons 5–8 of TP53. For genes such as MAP2K1, where mutations are distributed throughout the coding region, every exon was targeted. In total, we designed an amplicon panel covering 99% of our desired 36,035 bp target region.

Table 1. List of genes included in the CLL custom amplicon panel: ASXL1, ATM, CHD2, DDX3X, FBXW7, HMCN1, IRF4, KLHL6, LRP1B, MAP2K1, MAPK1, MED12, NOTCH1, PCLO, POT1, SAMHD1, SF3B1, TP53, XPO1, ZFPM2, ZMYM3.

To validate our approach, we used samples previously subjected to whole genome sequencing as controls. Of the 13 individual mutations in the control cohort, we detected 10 (77%) with our custom assay, at an average depth of 1380x. 
A 19 bp deletion in TP53 was not picked up by the variant calling software, and 2 point mutations in ATM were not detected due to the targeted nature of the assay. There was a single false positive mutation across all samples in ZFPM2, caused by a sequencing error in a homopolymer region. The sample group consisted of 45 representative CLL cases, split into two cohorts. The first cohort consisted of 11 cases that have yet to receive any treatment, whilst the second cohort comprised 34 relapsed/refractory cases. Analysis of further samples is in progress. We performed library preparation according to the manufacturer's instructions. Each sample was dual indexed with two 8 bp "barcodes" prior to equimolar pooling, and the final pooled library was processed on an Illumina MiSeq instrument using the TruSeq 2×150 bp paired-end sequencing protocol. The run produced 1.6 Gb of passed-filter sequence data, with 92.8% of bases above the quality threshold of Q30. The average depth of coverage across all samples was 849x. Primary analysis of the sequencing data was performed using the cloud-based data analysis package from Illumina, which carried out the alignment and variant calling. A conservative quality score threshold of >99 was set, with all variants above this carried forward for further analysis. Our custom amplicon panel detected mutations in 35 of the samples, comprising 8 indels and 45 point mutations. Of the 54 mutations, 40 were missense, 8 were frame-shifts, 1 was a nonsense mutation and 5 are predicted to have functional effects on splicing domains. The most frequently mutated gene was TP53, followed by SF3B1, PCLO and NOTCH1 (Figure 1).

Fig 1. Frequency of genes with somatic mutations in our CLL cohort.

Importantly, there was good correlation between mutation allele frequencies from whole genome sequencing, targeted deep sequencing and TSCA, demonstrating that the high sensitivity of large-scale genome sequencers can be reliably applied in a diagnostic setting. We describe mutation hotspots and mutation distribution patterns and link them to clinical behaviour. For example: SF3B1 mutations occurred in 15% of patients and were linked to reduced progression free survival. In conclusion, our technique allows for rapid mutation detection of the most frequently mutated genes in CLL. Further refinements in amplicon design and variant calling will lead to added precision. TSCA design and validation for other haematological diseases is in progress. Disclosures: No relevant conflicts of interest to declare.
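
The two numerical steps described in this abstract, the quality-score cut-off of >99 and the validation detection rate of 10 out of 13 control mutations, can be illustrated with a minimal Python sketch. The `Variant` class, the function names and the example calls (including their positions and scores) are hypothetical and are not taken from the Illumina software or the authors' pipeline; only the threshold and the 10/13 counts come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A hypothetical, simplified variant call (not an Illumina data type)."""
    gene: str
    position: int
    quality: float


def passes_filter(variant: Variant, min_quality: float = 99.0) -> bool:
    """Keep only calls strictly above the quality threshold (>99 in the abstract)."""
    return variant.quality > min_quality


def detection_rate(detected: int, expected: int) -> float:
    """Fraction of known control mutations recovered by the assay."""
    if expected <= 0:
        raise ValueError("expected must be a positive count")
    return detected / expected


# Made-up example calls; positions and scores are illustrative only.
calls = [
    Variant("TP53", 101, 150.0),
    Variant("ZFPM2", 202, 42.0),
    Variant("SF3B1", 303, 99.0),  # exactly 99 does not exceed the threshold
]
kept = [v for v in calls if passes_filter(v)]

# Counts reported in the abstract: 10 of 13 control mutations detected.
rate = detection_rate(10, 13)
print(f"kept genes: {[v.gene for v in kept]}, detection rate: {rate:.0%}")
```

The strict inequality mirrors the wording "threshold of >99"; whether the original software treats the boundary value as passing is not stated in the abstract.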

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Luciano Calderón ◽  
Nuria Mauri ◽  
Claudio Muñoz ◽  
Pablo Carbonell-Bejerano ◽  
Laura Bree ◽  
...  

Abstract Grapevine cultivars are clonally propagated to preserve their varietal attributes. However, genetic variations accumulate due to the occurrence of somatic mutations. This process is influenced by human activity through plant transportation, clonal propagation and selection. Malbec is a cultivar that is highly valued for red wine production. It originated in Southwestern France and was introduced in Argentina during the 1850s. In order to study the clonal genetic diversity of Malbec grapevines, we generated whole-genome resequencing data for four accessions with different clonal propagation records. A stringent variant calling procedure was established to identify reliable polymorphisms among the analyzed accessions. This procedure retrieved 941 single nucleotide variants (SNVs). A reduced set of the detected SNVs was corroborated through Sanger sequencing, and employed to custom-design a genotyping experiment. We successfully genotyped 214 Malbec accessions using 41 SNVs, and identified 14 genotypes that clustered in two genetically divergent clonal lineages. These lineages were associated with the time span of clonal propagation of the analyzed accessions in Argentina and Europe. Our results show the usefulness of this approach for the study of the scarce intra-cultivar genetic diversity in grapevines. We also provide evidence on how human actions might have driven the accumulation of different somatic mutations, ultimately shaping the Malbec genetic diversity pattern.


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 3214-3214 ◽  
Author(s):  
Andreas Agathangelidis ◽  
Viktor Ljungström ◽  
Lydia Scarfò ◽  
Claudia Fazi ◽  
Maria Gounari ◽  
...  

Abstract Chronic lymphocytic leukemia (CLL) is preceded by monoclonal B cell lymphocytosis (MBL), characterized by the presence of monoclonal CLL-like B cells in the peripheral blood, yet at lower numbers than those required for the diagnosis of CLL. MBL is divided into low-count (LC-MBL) and high-count (HC-MBL), based on the number of circulating CLL-like cells. While the former virtually never progresses into a clinically relevant disease, the latter may evolve into CLL at a rate of 1% per year. In CLL, genomic studies have led to the discovery of recurrent gene mutations that drive disease progression. These driver mutations may be detected in HC-MBL and even in multipotent hematopoietic progenitor cells from CLL patients, suggesting that they may be essential for CLL onset. Using whole-genome sequencing (WGS) we profiled LC-MBL and HC-MBL cases but also CLL patients with stable lymphocytosis (range: 39.8-81.8 × 10^9 CLL cells/l) for >10 years (hereafter termed indolent CLL). This would refine our understanding of the type of genetic aberrations that may be involved in the initial transformation rather than linked to clinical progression as is the case for most, if not all, CLL driver mutations. To this end, we whole-genome sequenced CD19+CD5+CD20dim cells from 6 LC-MBL, 5 HC-MBL and 5 indolent CLL cases; buccal control DNA and polymorphonuclear (PMN) cells were analysed in all cases. We also performed targeted deep-sequencing on 11 known driver genes (ATM, BIRC3, MYD88, NOTCH1, SF3B1, TP53, EGR2, POT1, NFKBIE, XPO1, FBXW7) in 8 LC-MBL, 13 HC-MBL and 7 indolent CLL cases and paired PMN samples. 
Overall similar mutation signatures/frequencies were observed for LC/HC-MBL and CLL concerning i) the entire genome; with an average of 2040 somatic mutations observed for LC-MBL, 2558 for HC-MBL and 2400 for CLL (186 for PMN samples), as well as ii) in the exome; with an average of non-synonymous mutations of 8.9 for LC-MBL, 14.6 for HC-MBL, 11.6 for indolent CLL (0.9 for PMN samples). Regarding putative CLL driver genes, WGS analysis revealed only 2 somatic mutations within NOTCH1, and FBXW7 in one HC-MBL case each. After stringent filtering, 106 non-coding variants (NCVs) of potential relevance to CLL were identified in all MBL/CLL samples and 4 NCVs in 2/24 PMN samples. Seventy-two of 110 NCVs (65.5%) caused a potential breaking event in transcription factor binding motifs (TFBM). Of these, 29 concerned cancer-associated genes, including BTG2, BCL6 and BIRC3 (4, 2 and 2 samples, respectively), while 16 concerned genes implicated in pathways critical for CLL e.g. the NF-κB and spliceosome pathways. Shared mutations between MBL/CLL and their paired PMN samples were identified in all cases: 2 mutations were located within exons, whereas an average of 15.8 mutations/case for LC-MBL, 8.2 for HC-MBL and 9 for CLL, respectively, concerned the non-coding part. Finally, 16 sCNAs were identified in 9 MBL/CLL samples; of the Döhner model aberrations, only del(13q) was detected in 7/9 cases bearing sCNAs (2 LC-MBL, 3 HC-MBL, 2 indolent CLL). Targeted deep-sequencing analysis (coverage 3000x) confirmed the 2 variants detected by WGS, i.e. in NOTCH1 (n=1) and FBXW7 (n=1), while 4 subclonal likely damaging variants were detected with a VAF <10% in POT1 (n=2), TP53 (n=1), and SF3B1 (n=1) in 4 HC-MBL samples. In conclusion, LC-MBL and CLL with stable lymphocytosis for >10 years display similar low genomic complexity and absence of exonic driver mutations, assessed both with WGS and deep-sequencing, underscoring their common low propensity to progress. 
On the other hand, HC-MBL comprising cases that may ultimately evolve into clinically relevant CLL can acquire exonic driver mutations associated with more dismal prognosis, as exemplified by subclonal driver mutations detected by deep-sequencing. The existence of NCVs in TFBMs targeting pathways critical for CLL prompts further investigation into their actual relevance to the clinical behavior. Shared mutations between CLL and PMN cells indicate that some somatic mutations may occur before CLL onset, likely at the hematopoietic stem-cell level. Their potential oncogenic role likely depends on the cellular context and/or microenvironmental stimuli to which the affected cells are exposed. Disclosures Stamatopoulos: Novartis: Honoraria, Research Funding; Janssen: Honoraria, Other: Travel expenses, Research Funding; Gilead: Consultancy, Honoraria, Research Funding; Abbvie: Honoraria, Other: Travel expenses. Ghia: Adaptive: Consultancy; Gilead: Consultancy, Honoraria, Research Funding, Speakers Bureau; Abbvie: Consultancy, Honoraria; Janssen: Consultancy, Honoraria, Speakers Bureau; Roche: Honoraria, Research Funding.


2018 ◽  
Author(s):  
Shuto Hayashi ◽  
Rui Yamaguchi ◽  
Shinichi Mizuno ◽  
Mitsuhiro Komura ◽  
Satoru Miyano ◽  
...  

Abstract Although human leukocyte antigen (HLA) genotyping based on amplicon, whole exome sequence (WES), and RNA sequence data has been achieved in recent years, accurate genotyping from whole genome sequence (WGS) data remains a challenge due to the low depth. Furthermore, there is no method to identify the sequences of unknown HLA types not registered in HLA databases. We developed a Bayesian model, called ALPHLARD, that collects reads potentially generated from HLA genes and accurately determines a pair of HLA types for each of the HLA-A, -B, -C, -DPA1, -DPB1, -DQA1, -DQB1, and -DRB1 genes at 6-digit resolution. Furthermore, ALPHLARD can detect rare germline variants not stored in HLA databases and call somatic mutations from paired normal and tumor sequence data. We illustrate the capability of ALPHLARD using 253 WES datasets and 25 WGS datasets from Illumina platforms. By comparing the results of HLA genotyping from SBT and amplicon sequencing methods, ALPHLARD achieved 98.8% for WES data and 98.5% for WGS data at 4-digit resolution. We also detected three somatic point mutations and one case of loss of heterozygosity in the HLA genes from the WGS data. ALPHLARD showed good performance for HLA genotyping even from low-coverage data. It also has the potential to detect rare germline variants and somatic mutations in HLA genes. It would help to fill in the current gaps in HLA reference databases and unveil the immunological significance of somatic mutations identified in HLA genes.


2021 ◽  
pp. gr.275579.121
Author(s):  
Daniel P Cooke ◽  
David C Wedge ◽  
Gerton Lunter

Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, where genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method - Octopus - that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid datasets created using in silico mixtures of diploid Genome In a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths, but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana datasets.
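
As a rough illustration of why in silico mixtures of diploid samples yield a known polyploid truth set, the following Python sketch combines diploid genotypes at a single site into one higher-ploidy genotype. It is a minimal sketch under stated assumptions: alleles are encoded as 0 (reference) and 1 (alternate) in the style of VCF genotype fields, and the function names are hypothetical. It does not reproduce how the authors mixed sequencing reads or how Octopus calls variants.

```python
from typing import Tuple

Genotype = Tuple[int, ...]


def mix_genotypes(*diploid_genotypes: Genotype) -> Genotype:
    """Combine diploid genotypes at one site into a single polyploid genotype.

    Two diploid samples give a tetraploid genotype, three give a hexaploid one.
    """
    alleles = []
    for genotype in diploid_genotypes:
        if len(genotype) != 2:
            raise ValueError("each input genotype must be diploid")
        alleles.extend(genotype)
    # Sort so the result is unphased and order-independent.
    return tuple(sorted(alleles))


def alt_dosage(genotype: Genotype) -> int:
    """Number of non-reference allele copies in a genotype."""
    return sum(1 for allele in genotype if allele != 0)


# Hypothetical single-site examples.
tetraploid = mix_genotypes((0, 1), (1, 1))
hexaploid = mix_genotypes((0, 0), (0, 1), (1, 1))
print(tetraploid, alt_dosage(tetraploid))
print(hexaploid, alt_dosage(hexaploid))
```

A caller's error at such a site can then be scored by comparing its reported allele dosage with the dosage of the mixed truth genotype.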


2021 ◽  
Author(s):  
Daniel P Cooke ◽  
David C Wedge ◽  
Gerton Lunter

Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, where genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method – Octopus – that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid datasets created using in silico mixtures of diploid Genome In a Bottle samples. We find that genotyping errors are abundant for typical sequencing depths, but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana datasets.


2021 ◽  
Vol 9 (8) ◽  
pp. 1585
Author(s):  
Ana C. Reis ◽  
Liliana C. M. Salvador ◽  
Suelee Robbe-Austerman ◽  
Rogério Tenreiro ◽  
Ana Botelho ◽  
...  

Classical molecular analyses of Mycobacterium bovis based on spoligotyping and Variable Number Tandem Repeat (MIRU-VNTR) brought the first insights into the epidemiology of animal tuberculosis (TB) in Portugal, showing high genotypic diversity of circulating strains that mostly cluster within the European 2 clonal complex. Previous surveillance provided valuable information on the prevalence and spatial occurrence of TB and highlighted prevalent genotypes in areas where livestock and wild ungulates are sympatric. However, links at the wildlife–livestock interfaces were established mainly via classical genotype associations. Here, we apply whole genome sequencing (WGS) to cattle, red deer and wild boar isolates to reconstruct the M. bovis population structure in a multi-host, multi-region disease system and to explore links at a fine genomic scale between M. bovis from wildlife hosts and cattle. Whole genome sequences of 44 representative M. bovis isolates, obtained between 2003 and 2015 from three TB hotspots, were compared through single nucleotide polymorphism (SNP) variant calling analyses. Consistent with previous results combining classical genotyping with Bayesian population admixture modelling, SNP-based phylogenies support the branching of this M. bovis population into five genetic clades, three with apparent geographic specificities, as well as the establishment of an SNP catalogue specific to each clade, which may be explored in the future as phylogenetic markers. The core genome alignment of SNPs was integrated within a spatiotemporal metadata framework to further structure this M. bovis population by host species and TB hotspots, providing a baseline for network analyses in different epidemiological and disease control contexts. WGS of M. bovis isolates from Portugal is reported for the first time in this pilot study, refining the spatiotemporal context of TB at the wildlife–livestock interface and providing further support to the key role of red deer and wild boar on disease maintenance. The SNP diversity observed within this dataset supports the natural circulation of M. bovis for a long time period, as well as multiple introduction events of the pathogen in this Iberian multi-host system.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shumaila Sayyab ◽  
Anders Lundmark ◽  
Malin Larsson ◽  
Markus Ringnér ◽  
Sara Nystedt ◽  
...  

Abstract The mechanisms driving clonal heterogeneity and evolution in relapsed pediatric acute lymphoblastic leukemia (ALL) are not fully understood. We performed whole genome sequencing of samples collected at diagnosis, relapse(s) and remission from 29 Nordic patients. Somatic point mutations and large-scale structural variants were called using individually matched remission samples as controls, and allelic expression of the mutations was assessed in ALL cells using RNA-sequencing. We observed an increased burden of somatic mutations at relapse, compared to diagnosis, and at second relapse compared to first relapse. In addition to 29 known ALL driver genes, of which nine genes carried recurrent protein-coding mutations in our sample set, we identified putative non-protein coding mutations in regulatory regions of seven additional genes that have not previously been described in ALL. Cluster analysis of hundreds of somatic mutations per sample revealed three distinct evolutionary trajectories during ALL progression from diagnosis to relapse. The evolutionary trajectories provide insight into the mutational mechanisms leading to relapse in ALL and could offer biomarkers for improved risk prediction in individual patients.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Agata Stodolna ◽  
Miao He ◽  
Mahesh Vasipalli ◽  
Zoya Kingsbury ◽  
Jennifer Becq ◽  
...  

Abstract Background Clinical-grade whole-genome sequencing (cWGS) has the potential to become the standard of care within the clinic because of its breadth of coverage and lack of bias towards certain regions of the genome. Colorectal cancer presents a difficult treatment paradigm, with over 40% of patients presenting at diagnosis with metastatic disease. We hypothesised that cWGS coupled with 3′ transcriptome analysis would give new insights into colorectal cancer. Methods Patients underwent PCR-free whole-genome sequencing and alignment and variant calling using a standardised pipeline to output SNVs, indels, SVs and CNAs. Additional insights into the mutational signatures and tumour biology were gained by the use of 3′ RNA-seq. Results Fifty-four patients were studied in total. Driver analysis identified the Wnt pathway gene APC as the only consistently mutated driver in colorectal cancer. Alterations in the PI3K/mTOR pathways were seen as previously observed in CRC. Multiple private CNAs, SVs and gene fusions were unique to individual tumours. Approximately 30% of patients had a tumour mutational burden of > 10 mutations/Mb of DNA, suggesting suitability for immunotherapy. Conclusions Clinical whole-genome sequencing offers a potential avenue for the identification of private genomic variation that may confer sensitivity to targeted agents and offer patients new options for targeted therapies.
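
The tumour mutational burden figure quoted in the Results (> 10 mutations/Mb) is a simple ratio, sketched below in Python. This is an illustrative calculation only: the function names, the example mutation count and the callable genome size are hypothetical, and the abstract does not state how the authors counted mutations or which denominator they used.

```python
def tumour_mutational_burden(mutation_count: int, callable_megabases: float) -> float:
    """Mutations per megabase of sequence that could be assessed."""
    if callable_megabases <= 0:
        raise ValueError("callable_megabases must be positive")
    return mutation_count / callable_megabases


def is_high_tmb(tmb: float, threshold: float = 10.0) -> bool:
    """Apply the > 10 mutations/Mb cut-off mentioned in the abstract."""
    return tmb > threshold


# Hypothetical tumour: 45,000 somatic mutations over 3,000 callable Mb.
example_tmb = tumour_mutational_burden(45_000, 3_000.0)
print(example_tmb, is_high_tmb(example_tmb))
```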


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Kelley Paskov ◽  
Jae-Yoon Jung ◽  
Brianna Chrisman ◽  
Nate T. Stockham ◽  
Peter Washington ◽  
...  

Abstract Background As next-generation sequencing technologies make their way into the clinic, knowledge of their error rates is essential if they are to be used to guide patient care. However, sequencing platforms and variant-calling pipelines are continuously evolving, making it difficult to accurately quantify error rates for the particular combination of assay and software parameters used on each sample. Family data provide a unique opportunity for estimating sequencing error rates since it allows us to observe a fraction of sequencing errors as Mendelian errors in the family, which we can then use to produce genome-wide error estimates for each sample. Results We introduce a method that uses Mendelian errors in sequencing data to make highly granular per-sample estimates of precision and recall for any set of variant calls, regardless of sequencing platform or calling methodology. We validate the accuracy of our estimates using monozygotic twins, and we use a set of monozygotic quadruplets to show that our predictions closely match the consensus method. We demonstrate our method’s versatility by estimating sequencing error rates for whole genome sequencing, whole exome sequencing, and microarray datasets, and we highlight its sensitivity by quantifying performance increases between different versions of the GATK variant-calling pipeline. We then use our method to demonstrate that: 1) Sequencing error rates between samples in the same dataset can vary by over an order of magnitude. 2) Variant calling performance decreases substantially in low-complexity regions of the genome. 3) Variant calling performance in whole exome sequencing data decreases with distance from the nearest target region. 4) Variant calls from lymphoblastoid cell lines can be as accurate as those from whole blood. 5) Whole-genome sequencing can attain microarray-level precision and recall at disease-associated SNV sites. 
Conclusion Genotype datasets from families are powerful resources that can be used to make fine-grained estimates of sequencing error for any sequencing platform and variant-calling methodology.
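
The raw signal this method builds on, a child genotype that cannot be explained by its parents' genotypes, can be sketched in Python as follows. This is only the detection of Mendelian errors at autosomal sites in a trio; it is not the authors' estimator of precision and recall, and the function names and example genotypes are hypothetical. Alleles are encoded as 0 (reference) and 1 (alternate).

```python
from typing import Iterable, Tuple

Genotype = Tuple[int, int]
Trio = Tuple[Genotype, Genotype, Genotype]  # (child, mother, father)


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if one child allele can come from the mother and the other from the father."""
    first, second = child
    return (first in mother and second in father) or (second in mother and first in father)


def mendelian_error_rate(trios: Iterable[Trio]) -> float:
    """Fraction of sites whose trio genotypes violate Mendelian inheritance."""
    trios = list(trios)
    if not trios:
        raise ValueError("at least one site is required")
    errors = sum(1 for child, mother, father in trios
                 if not is_mendelian_consistent(child, mother, father))
    return errors / len(trios)


# Hypothetical genotype calls at four sites.
sites = [
    ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from mother, 1 from father
    ((1, 1), (0, 1), (0, 1)),  # consistent: 1 from each parent
    ((0, 1), (0, 0), (0, 0)),  # error: neither parent carries allele 1
    ((0, 0), (1, 1), (0, 0)),  # error: mother cannot transmit allele 0
]
error_rate = mendelian_error_rate(sites)
print(error_rate)
```

As the abstract notes, only a fraction of sequencing errors surface as Mendelian errors, so an observed rate like this is an input to the error estimate rather than the estimate itself.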

