Whole genome sequencing of sporadic Burkitt lymphoma in HIV-infected and uninfected patients.

2013 ◽  
Vol 31 (15_suppl) ◽  
pp. 8577-8577
Author(s):  
Deborah Ritter ◽  
Kimberly Walker ◽  
Myoung Kwon ◽  
Premal Lulla ◽  
Catherine M. Bollard ◽  
...  

8577 Background: Burkitt Lymphoma is defined by canonical translocations between MYC and immunoglobulin IgH, IgK or IgL (8:14, 8:2, 8:22, respectively), and is commonly associated with HIV. The identification of HIV from sequenced samples is critical to understanding HIV-associated Burkitt Lymphoma. While recent novel gene mutations (ID3 and TCF3) have been implicated in functional roles, concomitant genomic structural variants and the interaction of HIV with structural variation are less well defined. Methods: We sequenced the whole genomes of 15 patients with 100 bp paired-end reads on the Illumina HiSeq platform, resulting in an average insert size of 278 (+/- 63) bp and coverage of 60X for tumor and 30X for normal samples. We included 7 HIV-negative and 8 HIV-positive subjects. Sequencing reads were mapped to the reference genome using BWA. Large-scale structural variation was detected by the BreakDancer and Crest programs. Functional annotation was used to prioritize structural variants for validation. Single nucleotide variants and small insertions and deletions were detected by CARNAC, a somatic variation discovery pipeline. The subset of WGS reads that failed to align to the human reference genome was tested for the presence of HIV sequences by comparing the unmapped reads to a database of viral DNA sequences, which included the common subtypes of HIV defined by Los Alamos. Reads matching HIV or EBV with an expectation value of <10⁻⁴ were analyzed to determine virus coverage and viral integration sites. Results: Canonical MYC-IgH translocations were identified in 9/15 (60%) tumor samples, with 2 additional subjects harboring either a deletion or an inversion near exon 1 of MYC; 4 had no MYC rearrangement. MYC translocations occurred equally in both groups. TP53 and SMARC4 point mutations were observed recurrently in the HIV-uninfected group but not in the HIV-infected patients. Variable levels of HIV DNA sequence were observed in normal tissue of all HIV-infected patients.
Conclusions: Whole genome sequencing has identified known somatic variants in HIV infected and uninfected patients. Two genes, TP53 and SMARC4, appear to be differentially mutated, but additional samples are needed to achieve statistical significance.

2015 ◽  
Vol 117 (suppl_1) ◽  
Author(s):  
Matthew Wheeler ◽  
Daryl Waggott ◽  
Megan Grove ◽  
Frederick Dewey ◽  
Cuiping Pan ◽  
...  

Background: Technological advances have greatly reduced the cost of whole genome sequencing. For single individuals, clinical application is apparent, while exome sequencing in tens of thousands of people has allowed a more global view of genetic variation that can inform interpretation of specific variants in individuals. We hypothesized that genome sequencing of patients with monogenic cardiomyopathy would facilitate discovery of genetic modifiers of phenotype. Methods and Results: We identified 48 individuals diagnosed with cardiomyopathy and with putative mutations in MYH7, the gene encoding beta myosin heavy chain. We carried out whole genome sequencing and applied a newly developed analytical pipeline optimized for discovery of genes modifying severity of clinical presentation and outcomes. Using a combination of external priors and rare variant burden tests, we scored genes as potential modifiers. There were 96 genes that reached a modifier score of 6 out of 12 or better (9=2, 8=8, 7=17, 6=69). We identified NCKAP1, a gene that regulates actin filament dynamics, and CAMSAP1, a calmodulin-regulated gene that regulates microtubule dynamics, as top-scoring modifiers of hypertrophic cardiomyopathy phenotypes (score=9), while LDB2, RYR2, FBN1 and ATP1A2 had modifier scores of 8. Of the top-scoring genes, 21 out of 96 were identified as candidates a priori. Our candidate prioritization scheme identified the previously described modifiers of cardiomyopathy phenotype, FHOD3 and MYBPC3, as top-scoring genes. We identified structural variants in 21 clinically sequenced cardiomyopathy-associated genes, 13 of which were at less than 10% frequency. Copy number variants in ILK and CSRP3 were nominally associated with ejection fraction (p=0.03), while 8 genes showed copy gains (GLA, FKTN, SGCD, TTN, SOS1, ANKRD1, VCL and NEBL). Structural variants were found in CSRP3, MYL3 and TNNC1, all of which have been implicated as causative for HCM.
Conclusion: Evaluation of the whole genome sequence, even in the case of putatively monogenic disease, leads to important diagnostic and scientific insights not revealed by panel-based sequencing.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Shunichi Kosugi ◽  
Yukihide Momozawa ◽  
Xiaoxi Liu ◽  
Chikashi Terao ◽  
Michiaki Kubo ◽  
...  

The Lancet ◽  
2017 ◽  
Vol 389 ◽  
pp. S34
Author(s):  
Gianmarco Contino ◽  
Maria Secrier ◽  
Paul A W Edward ◽  
Rebecca Fitzgerald

Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 68-68
Author(s):  
Jinghui Zhang ◽  
Li Ding ◽  
Linda Holmfeldt ◽  
Gang Wu ◽  
Susan L. Heatley ◽  
...  

Abstract 68 Early T-cell precursor acute lymphoblastic leukemia (ETP ALL) is characterized by an immature T-lineage immunophenotype (cCD3+, CD1a-, CD8- and CD5dim), aberrant expression of myeloid and stem cell markers, a distinct gene expression profile and very poor outcome. The underlying genetic basis of this form of leukemia is unknown. Here we report results of whole genome sequencing (WGS) of tumor and normal DNA from 12 children with ETP ALL. Genomes were sequenced to 30-fold haploid coverage using the Illumina GAIIx platform, and all putative somatic sequence and structural variants were validated. The frequency of mutations in 43 genes was assessed in a recurrence cohort of 52 ETP and 42 non-ETP T-ALL samples from patients enrolled in St Jude, Children's Oncology Group and AIEOP trials. Transcriptomic resequencing was performed for two WGS cases, and whole exome sequencing for three ETP ALL cases in the recurrence cohort. We identified 44 interchromosomal translocations (mean 4 per patient, range 0–12), 32 intrachromosomal translocations (mean 3, 0–7), 53 deletions (mean 4, 0–10) and 16 insertions (mean 1, 0–5). Three cases exhibited a pattern of complex rearrangements suggestive of a single cellular catastrophe (“chromothripsis”), two of which had mutations targeting mismatch and DNA repair (MLH3 and DCLRE1C). While no single chromosomal alteration was present in all cases, 10 of 12 ETP ALLs harbored chromosomal rearrangements, several of which involved complex multichromosomal translocations and resulted in the expression of chimeric in-frame novel fusion genes disrupting hematopoietic regulators, including ETV6-INO80D, NAP1L1-MLLT10, RUNX1-EVX1 and NUP214-SQSTM1, each occurring in a single case. An additional ETP case with the ETV6-INO80D fusion was identified in the recurrence cohort.
Additionally, 51% of structural variants had breakpoints in genes, including those with roles in hematopoiesis and leukemogenesis, and genes also targeted by mutation in other cases (MLH3, SUZ12, RUNX1). We identified a high frequency of activating mutations in genes regulating cytokine receptor and Ras signalling in ETP ALL (67.2% of ETP compared to 19% of non-ETP T-ALL), including NRAS (17%), FLT3 (14%), JAK3 (9%), SH2B3 (or LNK; 9%), IL7R (8%), JAK1 (8%), KRAS (3%), and BRAF (2%). Seven cases (5 ETP, 2 non-ETP) harbored in-frame insertion mutations in the transmembrane domain of IL7R, which were transforming when expressed in murine cell lines, and resulted in enhanced colony formation when expressed in primary murine hematopoietic cells. The IL7R mutations resulted in constitutive Jak-Stat activation in these cell lines and in primary leukemic cells expressing these mutations. Fifty-eight percent of ETP cases (compared to 17% of non-ETP cases) harbored mutations known or predicted to disrupt hematopoietic and lymphoid development, including ETV6 (33%), RUNX1 (16%), IKZF1 (14%), GATA3 (10%), EP300 (5%) and GATA2 (2%). GATA3 regulates early T cell development, and mutations in this gene were observed exclusively in ETP ALL. The mutations were commonly biallelic, and were clustered at R276, a residue critical for binding of GATA3 to DNA. Strikingly, mutations disrupting chromatin modifying genes were also highly enriched in ETP ALL. Genes encoding the polycomb repressor complex 2 (EZH2, SUZ12 and EED), which mediates histone 3 lysine 27 (H3K27) trimethylation, were deleted or mutated in 42% of ETP ALL compared to 12% of non-ETP T-ALL. In addition, alterations of the H3K36 trimethylase SETD2 were observed in 5 ETP cases, but not in non-ETP ALL. We also identified recurrent mutations in genes that have not previously been implicated in hematopoietic malignancies, including RELN, DNM2, ECT2L, HNRNPA1 and HNRNPR.
Using gene set enrichment analysis, we demonstrate that the gene expression profile of ETP ALL shares features not only with normal human hematopoietic stem cells, but also with leukemic initiating cells (LIC) purified from patients with acute myeloid leukemia (AML). These results indicate that mutations that drive proliferation, impair differentiation and disrupt histone modification cooperate to induce an aggressive leukemia with an aberrant immature phenotype. The similarity of the gene expression pattern with that observed in the LIC of AML raises the possibility that myeloid-directed therapies might improve the outcome of ETP ALL. Disclosures: Evans: St. Jude Children's Research Hospital: Employment, Patents & Royalties; NIH & NCI: Research Funding; Aldagen: Membership on an entity's Board of Directors or advisory committees.


Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 4254-4254
Author(s):  
Zachary Hunter ◽  
Lian Xu ◽  
Guang Yang ◽  
Xia Liu ◽  
Yang Cao ◽  
...  

Abstract Background: Over 90% of patients with Waldenström's Macroglobulinemia (WM), and 50-80% of patients with the precursor condition, IgM MGUS, express MYD88 L265P. These findings suggest that other mutations may support progression of IgM MGUS to WM. Chromosomal aberrations including large losses in 6q are commonly present in WM patients, though the gene loss accounting for WM pathogenesis remains unclear. We therefore sought to delineate copy number alterations (CNA) and structural variants using whole genome sequencing (WGS) in order to more clearly define other important gene alterations in WM. Methods: DNA from CD19+ bone marrow lymphoplasmacytic lymphoma cells (LPC) and CD19-depleted peripheral blood mononuclear cells from 10 WM patients was used for paired tumor/germline analysis by WGS. Coverage in the tumor sample was divided by the coverage in the paired germline sample for each matching position, resulting in coverage ratios for each 100 kb window. Statistically significant windows within each genome were then analyzed across the cohort by randomizing the coverage positions to assess the probability of observing the given frequency of a CNA by random chance. TaqMan quantitative polymerase chain reaction (PCR) copy number assays were used to validate findings. Translocations were validated by Sanger sequencing across the breakpoint, including flanking sequences. Results: Functional annotation for identified CNAs was undertaken using Ingenuity Pathway Analysis, which revealed a significant enrichment for pathways dysregulated in B-cell malignancies (Table 1). Iteratively randomizing the genomic position of CNAs not related to the chromosome 6 deletions revealed a greater than 3-fold increase in the targeting of COSMIC genes than expected by chance (p < 0.001).
Affected genes in the COSMIC census were BTG1 (9/10; 90%), FOXP1 (7/10; 70%), FNBP1 (7/10; 70%), CD74 (7/10; 70%), TOP1 (6/10; 60%), MYB (5/10; 50%), CBLB (5/10; 50%), ETV6 (5/10; 50%), TNFAIP3 (5/10; 50%), FBXW7 (5/10; 50%), PRDM1 (5/10; 50%), TFE3 (4/10; 40%), JAK1 (4/10; 40%), MAML2 (4/10; 40%), FAM46C (4/10; 40%), EBF1 (4/10; 40%), STL (4/10; 40%), and BIRC3 (4/10; 40%). Other affected genes of interest included PRDM2 (8/10; 80%), HIVEP2 (8/10; 80%), ARID1B (7/10; 70%) as well as LYN (7/10; 70%). There was no single region of statistical significance in 6q to denote a minimally deleted region, though neither of the previously suspected target genes for 6q loss, PRDM1 and TNFAIP3, was included in the regions of highest statistical significance. Losses in HIVEP2 (8/10; 80%) as well as ARID1B (7/10; 70%) and BCLAF1 (7/10; 70%) constituted the most common deletions in chromosome 6, and were present in patients with and without the large-scale losses in 6q. While no recurrent translocations were noted in this study, 2 of the 5 (40%) 6q deletions corresponded with translocation events. In one case, this was a result of chromothripsis focused on 6q, while in the other case, a t(6;X) translocation linked to the amplification of Xq was identified. Validation studies confirmed the presence of somatic deletions in BTG1 (4/5; 80%) at 12q21.33, HIVEP2 (4/5; 80%) at 6q24.2, LYN (3/5; 60%) at 8q12.1, PLEKHG1 (3/5; 60%) at 6q25.1, ARID1B (3/5; 60%) at 6q25.1, PRDM2 (2/5; 40%) at 1p36.21, FOXP1 (2/5; 40%) at 3p13, and MKLN1 (2/5; 40%) at 7q32. As some CNAs were subclonal, we validated the correlation between the PCR relative copy number and WGS coverage predictions (rho = 0.926; p = 2.2 × 10⁻¹⁶). Conclusions: Highly recurrent CNAs are present in WM LPCs that include genes with critical regulatory roles in lymphocytic growth and survival signaling. Disclosures: No relevant conflicts of interest to declare.
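
The coverage-ratio step described in the Methods of this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `window_coverage_ratios`, the toy depth vectors and the tiny window size are all hypothetical (the abstract uses 100 kb windows), and the ratio is simplified to summed tumor depth over summed germline depth within each window.

```python
from typing import List, Optional


def window_coverage_ratios(
    tumor_depth: List[int],
    germline_depth: List[int],
    window_size: int,
) -> List[Optional[float]]:
    """Return the tumor/germline coverage ratio for each fixed-size window.

    Hypothetical helper: a window with zero germline coverage yields None
    rather than a ratio, so it can be excluded downstream.
    """
    if len(tumor_depth) != len(germline_depth):
        raise ValueError("depth vectors must cover the same positions")
    if window_size <= 0:
        raise ValueError("window_size must be positive")

    ratios: List[Optional[float]] = []
    for start in range(0, len(tumor_depth), window_size):
        tumor_sum = sum(tumor_depth[start:start + window_size])
        germline_sum = sum(germline_depth[start:start + window_size])
        ratios.append(tumor_sum / germline_sum if germline_sum > 0 else None)
    return ratios


# Toy data (assumed): the second window has half the tumor depth, mimicking a deletion.
tumor = [30, 30, 30, 30, 15, 15, 15, 15]
germline = [30, 30, 30, 30, 30, 30, 30, 30]
ratios = window_coverage_ratios(tumor, germline, window_size=4)
print(ratios)
```

In a real analysis, the per-window ratios would still need normalization for overall sequencing depth and a significance test, such as the randomization approach the abstract mentions; neither is shown here.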


2019 ◽  
Author(s):  
James M. Holt ◽  
Camille L. Birch ◽  
Donna M. Brown ◽  
Manavalan Gajapathy ◽  
Nadiya Sosonkina ◽  
...  

Abstract Purpose: Clinical whole genome sequencing is becoming more common for determining the molecular diagnosis of rare disease. However, standard clinical practice often focuses on small variants such as single nucleotide variants and small insertions/deletions. This leaves a wide range of larger “structural variants” that are not commonly analyzed in patients. Methods: We developed a pipeline for processing structural variants for patients who received whole genome sequencing through the Undiagnosed Diseases Network (UDN). This pipeline called structural variants, stored them in an internal database, and filtered the variants based on internal frequencies and external annotations. The remaining variants were manually inspected and then interesting findings were reported as research variants to clinical sites in the UDN. Results: Of 477 analyzed UDN cases, 286 cases (≈ 60%) received at least one structural variant as a research finding. The variants in 16 cases (≈ 4%) are considered “Certain” or “Highly likely” molecularly diagnosed and another 4 cases are currently in review. Of those 20 cases, at least 13 were identified originally through our pipeline with one finding leading to identification of a new disease. As part of this paper, we have also released the collection of variant calls identified in our cohort along with heterozygous and homozygous call counts. This data is available at https://github.com/HudsonAlpha/UDN_SV_export. Conclusion: Structural variants are key genetic features that should be analyzed during routine clinical genomic analysis. For our UDN patients, structural variants helped solve ≈ 4% of the total number of cases (≈ 13% of all genome sequencing solves), a success rate we expect to improve with better tools and greater understanding of the human genome.
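
The filtering step in this abstract (internal frequency plus external annotation) might look something like the sketch below. Everything here is an assumption made for illustration: the `StructuralVariant` fields, the 1% frequency cutoff and the single boolean annotation are hypothetical stand-ins, and the abstract does not state the actual thresholds or annotation sources used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StructuralVariant:
    sv_id: str
    sv_type: str          # e.g. "DEL", "DUP", "INV"
    internal_count: int   # cohort cases carrying this call (hypothetical field)
    overlaps_gene: bool   # stand-in for an external annotation (hypothetical field)


def filter_candidates(
    variants: List[StructuralVariant],
    cohort_size: int,
    max_internal_freq: float = 0.01,  # assumed cutoff, not taken from the abstract
) -> List[StructuralVariant]:
    """Keep SVs that are rare in the internal cohort and overlap a gene."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    kept = []
    for sv in variants:
        internal_freq = sv.internal_count / cohort_size
        if internal_freq <= max_internal_freq and sv.overlaps_gene:
            kept.append(sv)
    return kept


# Toy calls (assumed), using the cohort size reported in the abstract.
calls = [
    StructuralVariant("sv1", "DEL", internal_count=1, overlaps_gene=True),
    StructuralVariant("sv2", "DUP", internal_count=200, overlaps_gene=True),
    StructuralVariant("sv3", "INV", internal_count=2, overlaps_gene=False),
]
candidates = filter_candidates(calls, cohort_size=477)
print([sv.sv_id for sv in candidates])
```

Variants surviving a filter like this would still go to manual inspection, as the abstract describes; the code only narrows the list.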


2021 ◽  
Author(s):  
Lucía Peña Pérez ◽  
Nicolai Frengen ◽  
Julia Hauenstein ◽  
Charlotte Gran ◽  
Charlotte Gustafsson ◽  
...  

Multiple myeloma (MM) is an incurable and aggressive plasma cell malignancy characterized by a complex karyotype with multiple structural variants (SVs) and copy number variations (CNVs). Linked-read whole-genome sequencing (lrWGS) allows for refined detection and reconstruction of SVs by providing long-range genetic information from standard short-read sequencing. This makes lrWGS an attractive solution for capturing the full genomic complexity of MM. Here we show that high-quality lrWGS data can be generated from low numbers of FACS sorted cells without DNA purification. Using this protocol, we analyzed FACS sorted MM cells from 37 MM patients with lrWGS. We found high concordance between lrWGS and FISH for the detection of recurrent translocations and CNVs. Outside of the regions investigated by FISH, we identified >150 additional SVs and CNVs across the cohort. Analysis of the lrWGS data allowed for resolving the structure of diverse SVs affecting the MYC and t(11;14) loci causing the duplication of genes and gene regulatory elements. In addition, we identified private SVs causing the dysregulation of genes recurrently involved in translocations with the IGH locus and show that these can alter the molecular classification of the MM. Overall, we conclude that lrWGS allows for the detection of aberrations critical for MM prognostics and provides a feasible route for providing comprehensive genetics. Implementing lrWGS could provide more accurate clinical prognostics, facilitate genomic medicine initiatives, and greatly improve the stratification of patients included in clinical trials.


2020 ◽  
Author(s):  
Andrew G. Sharo ◽  
Zhiqiang Hu ◽  
Steven E. Brenner

Abstract: Whole genome sequencing resolves clinical cases where standard diagnostic methods have failed. However, preliminary studies show that at least half of these cases still remain unresolved, even after whole genome sequencing. Structural variants (genomic variants larger than 50 base pairs) of uncertain significance may be the genetic cause of a portion of these unresolved cases. Historically, structural variants (SVs) have been difficult to detect with confidence from short-read sequencing. As both detection algorithms and long-read/linked-read sequencing methods become more accessible, clinical researchers will have access to thousands of reliable SVs of unknown disease relevance. Filtering these SVs by overlap with cataloged SVs is an imperfect solution. Innovative methods to predict the pathogenicity of these SVs will be needed to realize the full diagnostic potential of long-read sequencing. To address this emerging need, we developed StrVCTVRE (Structural Variant Classifier Trained on Variants Rare and Exonic), a classifier that can be used to distinguish pathogenic SVs from benign SVs that overlap exons. We made use of features that capture gene importance, coding region, conservation, expression, and exon structure in a random forest classifier. We found that some features, such as expression and conservation, are important but are absent from SV classification guidelines. Although databases of SVs reflect size biases from sequencing techniques, we leveraged multiple databases to construct a size-matched training set of rare, putatively benign and pathogenic SVs. In independent test sets, we found our method performs accurately across a wide SV size range, which will allow clinical researchers to eliminate nearly 60% of SVs from consideration at an elevated sensitivity of 90%. However, our method and its assessment are still constrained by a small training dataset and acquisition bias in databases of pathogenic variants.
StrVCTVRE fills an empty niche in the clinical evaluation of SVs of unknown significance. We anticipate researchers will use it to prioritize SVs in patients where no variant is immediately compelling, empowering deeper investigation into novel SVs and disease genes to resolve cases.
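
The trade-off quoted in this abstract (eliminating a share of SVs while holding sensitivity at 90%) can be illustrated with a small sketch. It does not use StrVCTVRE itself: the function `threshold_at_sensitivity`, the scores and the labels are all hypothetical, and the code only shows how a score cutoff could be chosen for a target sensitivity on any classifier's output.

```python
import math
from typing import List, Tuple


def threshold_at_sensitivity(
    scores: List[float],
    labels: List[int],          # 1 = pathogenic, 0 = benign (assumed encoding)
    target_sensitivity: float,
) -> Tuple[float, float]:
    """Return (cutoff, fraction of benign SVs scoring below the cutoff).

    The cutoff is the highest score such that at least `target_sensitivity`
    of pathogenic SVs score at or above it.
    """
    pathogenic = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    benign = [s for s, y in zip(scores, labels) if y == 0]
    if not pathogenic or not benign:
        raise ValueError("need at least one pathogenic and one benign example")

    # Small epsilon guards against floating-point noise in the product.
    needed = max(1, math.ceil(target_sensitivity * len(pathogenic) - 1e-9))
    cutoff = pathogenic[needed - 1]
    eliminated = sum(1 for s in benign if s < cutoff) / len(benign)
    return cutoff, eliminated


# Toy scores (assumed): ten pathogenic SVs followed by ten benign SVs.
scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2,
          0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.15, 0.25, 0.35]
labels = [1] * 10 + [0] * 10
cutoff, eliminated = threshold_at_sensitivity(scores, labels, target_sensitivity=0.9)
print(cutoff, eliminated)
```

On this toy data the cutoff retains nine of the ten pathogenic SVs; the fraction of benign SVs removed depends entirely on the invented scores and says nothing about the published method's performance.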


2021 ◽  
Author(s):  
Marsha M. Wheeler ◽  
Adrienne M Stilp ◽  
Shuquan Rao ◽  
Bjarni V Halldorsson ◽  
Doruk V Beyter ◽  
...  

Genome-wide association studies (GWAS) have identified thousands of single nucleotide variants and small indels that contribute to the genetic architecture of hematologic traits. While structural variants (SVs) are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of SVs to quantitative blood cell trait variation is unknown. Here we utilized SVs detected from whole genome sequencing (WGS) in ancestrally diverse participants of the NHLBI TOPMed program (N=50,675). Using single variant tests, we assessed the association of common and rare SVs with red cell-, white cell-, and platelet-related quantitative traits. The results show 33 independent SVs (23 common and 10 rare) reaching genome-wide significance. The majority of significant association signals (N=27) replicated in independent datasets from deCODE genetics and the UK BioBank. Moreover, most trait-associated SVs (N=24) are within 1Mb of previously-reported GWAS loci. SV analyses additionally discovered an association between a complex structural variant on 17p11.2 and white blood cell-related phenotypes. Based on functional annotation, the majority of significant SVs are located in non-coding regions (N=26) and predicted to impact regulatory elements and/or local chromatin domain boundaries in blood cells. We predict that several trait-associated SVs represent the causal variant. This is supported by genome-editing experiments which provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.
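
The proximity check behind the statement that most trait-associated SVs lie within 1 Mb of previously reported GWAS loci could be sketched as below. The function names, chromosome labels and coordinates are hypothetical, and the distance definition (zero when the locus falls inside the SV, otherwise the gap to the nearer SV end) is an assumption, since the abstract does not specify one.

```python
from typing import List, Tuple

Locus = Tuple[str, int]  # (chromosome, position) of a reported GWAS signal


def distance_to_locus(sv_start: int, sv_end: int, locus_pos: int) -> int:
    """Distance in bp from an SV interval to a point locus (0 if inside it)."""
    if sv_start <= locus_pos <= sv_end:
        return 0
    return min(abs(sv_start - locus_pos), abs(sv_end - locus_pos))


def is_near_known_locus(
    chrom: str,
    sv_start: int,
    sv_end: int,
    loci: List[Locus],
    max_distance: int = 1_000_000,  # 1 Mb, as in the abstract
) -> bool:
    """True if any locus on the same chromosome lies within max_distance."""
    return any(
        locus_chrom == chrom
        and distance_to_locus(sv_start, sv_end, pos) <= max_distance
        for locus_chrom, pos in loci
    )


# Invented coordinates for illustration only.
known_loci = [("chr5", 2_000_000), ("chr17", 500_000)]
near = is_near_known_locus("chr5", 2_600_000, 2_650_000, known_loci)
far = is_near_known_locus("chr5", 4_000_000, 4_010_000, known_loci)
print(near, far)
```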

