Whole-genome sequencing with long reads reveals complex structure and origin of structural variation in human genetic variations and somatic mutations in cancer

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Akihiro Fujimoto ◽  
Jing Hao Wong ◽  
Yukiko Yoshii ◽  
Shintaro Akiyama ◽  
Azusa Tanaka ◽  
...  

Abstract. Background: Identification of germline variation and somatic mutations is a major issue in human genetics. However, due to the limitations of DNA sequencing technologies and computational algorithms, our understanding of genetic variation and somatic mutations is far from complete. Methods: In the present study, we performed whole-genome sequencing using long-read sequencing technology (Oxford Nanopore) for 11 Japanese liver cancers and matched normal samples which were previously sequenced for the International Cancer Genome Consortium (ICGC). We constructed an analysis pipeline for the long-read data and identified germline and somatic structural variations (SVs). Results: In polymorphic germline SVs, our analysis identified 8004 insertions, 6389 deletions, 27 inversions, and 32 intra-chromosomal translocations. By comparing to the chimpanzee genome, we correctly inferred events that caused insertions and deletions and found that most insertions were caused by transposons, with Alu the predominant source, while other types of insertions, such as tandem duplications and processed pseudogenes, are rare. We inferred mechanisms of deletion generation and found that most non-allelic homologous recombination (NAHR) events were caused by recombination errors in SINEs. Analysis of somatic mutations in liver cancers showed that long reads could detect larger numbers of SVs than a previous short-read study and that the mechanisms of cancer SV generation were different from those of germline deletions. Conclusions: Our analysis provides a comprehensive catalog of polymorphic and somatic SVs, as well as their possible causes. Our software is available at https://github.com/afujimoto/CAMPHOR and https://github.com/afujimoto/CAMPHORsomatic.

2018 ◽  
Vol 64 (3) ◽  
pp. 191-197 ◽  
Author(s):  
Takeshi Mizuguchi ◽  
Tomoko Toyota ◽  
Hiroaki Adachi ◽  
Noriko Miyake ◽  
Naomichi Matsumoto ◽  
...  

2018 ◽  
Vol 12 (6) ◽  
pp. e0006566 ◽  
Author(s):  
Elizabeth M. Batty ◽  
Suwittra Chaemchuen ◽  
Stuart Blacksell ◽  
Allen L. Richards ◽  
Daniel Paris ◽  
...  

2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Lydia Y. Liu ◽  
Vinayak Bhandari ◽  
Adriana Salcedo ◽  
Shadrielle M. G. Espiritu ◽  
Quaid D. Morris ◽  
...  

Abstract. Whole-genome sequencing can be used to estimate subclonal populations in tumours and this intra-tumoural heterogeneity is linked to clinical outcomes. Many algorithms have been developed for subclonal reconstruction, but their variabilities and consistencies are largely unknown. We evaluate sixteen pipelines for reconstructing the evolutionary histories of 293 localized prostate cancers from single samples, and eighteen pipelines for the reconstruction of 10 tumours with multi-region sampling. We show that predictions of subclonal architecture and timing of somatic mutations vary extensively across pipelines. Pipelines show consistent types of biases, with those incorporating SomaticSniper and Battenberg preferentially predicting homogeneous cancer cell populations and those using MuTect tending to predict multiple populations of cancer cells. Subclonal reconstructions using multi-region sampling confirm that single-sample reconstructions systematically underestimate intra-tumoural heterogeneity, predicting on average fewer than half of the cancer cell populations identified by multi-region sequencing. Overall, these biases suggest caution in interpreting specific architectures and subclonal variants.


2020 ◽  
Vol 8 (11) ◽  
pp. 1775
Author(s):  
Andrey Shelenkov ◽  
Lyudmila Petrova ◽  
Valeria Fomina ◽  
Mikhail Zamyatin ◽  
Yulia Mikhaylova ◽  
...  

Proteus mirabilis is a component of the normal intestinal microflora of humans and animals, but can cause urinary tract infections and even sepsis in hospital settings. In recent years, the number of multidrug-resistant P. mirabilis isolates, including those producing extended-spectrum β-lactamases (ESBLs), has been increasing worldwide. However, the number of investigations dedicated to this species, especially those using whole-genome sequencing, is much lower than for members of the ESKAPE pathogens group. This study presents a detailed analysis of a clinical multidrug-resistant ESBL-producing P. mirabilis isolate using short- and long-read whole-genome sequencing, which allowed us to reveal possible horizontal gene transfer between Klebsiella pneumoniae and P. mirabilis plasmids and to locate the CRISPR-Cas system in the genome together with its probable phage targets, as well as multiple virulence genes. We believe that the data presented will contribute to the understanding of antibiotic resistance acquisition and virulence mechanisms for this important pathogen.


2020 ◽  
Vol 8 (6) ◽  
pp. 855 ◽  
Author(s):  
Alexandra Irrgang ◽  
Natalie Pauly ◽  
Bernd-Alois Tenhagen ◽  
Mirjam Grobbel ◽  
Annemarie Kaesbohrer ◽  
...  

Resistance to carbapenems is a severe threat to human health. These last-resort antimicrobials are indispensable for the treatment of severe human infections with multidrug-resistant Gram-negative bacteria. In accordance with their increasing medical impact, carbapenemase-producing Enterobacteriaceae (CPE) might be disseminated from colonized humans to non-human reservoirs (i.e., environment, animals, food). In Germany, the occurrence of CPE in livestock and food has been systematically monitored since 2016. In the 2019 monitoring, an OXA-48-producing E. coli (19-AB01443) was recovered from a fecal sample of a fattening pig. Phenotypic resistance was confirmed by broth microdilution and further characterized by PFGE, conjugation, and combined short-/long-read whole-genome sequencing. This is the first detection of this resistance determinant in samples from German meat production. Molecular characterization and whole-genome sequencing revealed that the blaOXA-48 gene was located on a common pOXA-48 plasmid-prototype. This plasmid-type seems to be globally distributed among various bacterial species, but it has frequently been associated with clinical Klebsiella spp. isolates. Currently, the route of introduction of this plasmid/isolate combination into German pig production is unknown. We speculate that, given its strong association with human isolates, transmission from humans to livestock has occurred.


Blood ◽  
2009 ◽  
Vol 114 (22) ◽  
pp. 3965-3965
Author(s):  
Lukas D. Wartman ◽  
Li Ding ◽  
David E. Larson ◽  
Michael D. McLellan ◽  
Heather Schmidt ◽  
...  

Abstract 3965, Poster Board III-901. We have recently established that whole genome sequencing is a valid, unbiased approach that can identify novel candidate mutations that may be important for AML pathogenesis (Ley et al Nature 2008, Mardis et al NEJM 2009). Acute promyelocytic leukemia (APL, FAB M3 AML) is a subtype of AML characterized by the t(15;17)(q22;q11.2) translocation that creates an oncogenic fusion gene, PML-RARA. Our laboratory has previously modeled APL in a mouse in an effort to understand the genetic events that lead to the disease. In our knockin mouse model, a human PML-RARA cDNA was targeted to the 5' untranslated region of the mouse cathepsin G gene on chromosome 14 (mCG-PR). The targeting vector was transfected into the RW-4 embryonic stem cell line, derived from a 129/SvJ mouse. The transfected RW-4 cells were injected into C57Bl/6 blastocysts, and chimeric offspring were bred to C57Bl/6 mice. F1 129/SvJ x C57Bl/6 mice were subsequently backcrossed onto the B6/Taconic background for 10 generations before establishing a tumor watch. About 60% of the mCG-PR mice in the Bl/6 background develop a disease that closely resembles APL only after a latent period of 7-18 months, suggesting that additional progression mutations are required for APL development. Array-based genomic techniques (expression array studies and high resolution CGH) have revealed some recurring genetic alterations that may be relevant for progression (i.e. an interstitial deletion of chromosome 2, trisomy 15, etc.), but gene-specific progression mutations have not yet been identified. To begin to identify these mutations in an unbiased fashion, we sequenced a cytogenetically normal, diploid mouse APL genome using massively parallel DNA sequencing via the Illumina platform.
Since the tumor arose in a highly inbred mouse strain, we predicted that 15x coverage of the genome (approximately 40 billion base pairs of sequence) would be necessary to identify >90% of the heterozygous somatic mutations. We generated 2 Illumina paired-end libraries (insert sizes of 300-350 bp and 550-600 bp) and generated 59.64 billion base pairs of sequence with 3 full sequencing runs; the reads that successfully mapped generated 15.6x coverage. The sequence data predicted 87,778 heterozygous Single Nucleotide Variants (SNVs) compared to the mouse C57Bl/6J reference sequence, and 23,439 homozygous SNVs. Of the predicted heterozygous SNVs, 695 were non-synonymous (missense or nonsense, or altering a canonical splice site). Thus far, 80 of these putative non-synonymous SNVs have been further analyzed using Sanger sequencing of the original tumor DNA vs. pooled B6/Taconic spleen DNA and pooled 129/SvJ spleen DNA as controls. 37/80 were shown to be false positive calls, and 37 were inherited SNPs from residual regions of the 129/SvJ genome. 6/80 were present only in the tumor genome, and were candidate somatic mutations. These 6 were screened in 89 additional murine APL tumor samples derived from the same mouse model. Mutations in the Jarid2 (L915I) and Capns2 (N149S) genes occurred only in the proband, and are therefore of uncertain significance. 4/6 mutations were found in additional samples; 3 of these mutations were derived from a common ancestor of the proband and the other affected mice, and were therefore not relevant for pathogenesis. The other recurring mutation was in the pseudokinase domain of JAK1 (V657F), and was identified in one other mouse that was not closely related to the proband. This mutation is orthologous to the known activating mutation V617F in human JAK2, and is identical to a recently described JAK1 pseudokinase domain mutation (V658F) found in human APL and T-ALL samples (EG Jeong et al, Clin Can Res 14: 3716, 2008).
We are currently testing the functional significance of this mutation by expressing it in bone marrow cells derived from young WT vs. mCG-PR mice. In summary, unbiased whole genome sequencing of a mouse APL genome has identified a recurring mutation of JAK1 found in both human and mouse APL samples. This approach may allow us to rapidly identify progression mutations that are common to human and murine AML, and provides an important proof-of-concept that this mouse model of AML is functionally related to its human counterpart. Disclosures: No relevant conflicts of interest to declare.
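
The coverage and validation figures quoted in this abstract can be checked with simple arithmetic. The following is a minimal sketch, not the authors' pipeline: the genome size is not stated in the abstract and is back-calculated here from the quoted "15x coverage ≈ 40 billion base pairs", and all variable names are illustrative.

```python
# Figures quoted in the abstract.
PLANNED_BASES = 40e9       # "approximately 40 billion base pairs of sequence"
PLANNED_COVERAGE = 15.0    # "15x coverage of the genome"
GENERATED_BASES = 59.64e9  # total sequence generated in 3 runs
MAPPED_COVERAGE = 15.6     # coverage from reads that successfully mapped

# ASSUMPTION: the genome size is not given in the abstract; it is inferred
# from the planned figures (bases / coverage).
implied_genome_size = PLANNED_BASES / PLANNED_COVERAGE


def coverage(total_bases, genome_size):
    """Mean depth of coverage = sequenced bases / genome size."""
    return total_bases / genome_size


# Depth if every generated base had mapped, and the fraction of generated
# sequence that the reported 15.6x therefore corresponds to (an inference).
raw_coverage = coverage(GENERATED_BASES, implied_genome_size)
implied_usable_fraction = MAPPED_COVERAGE / raw_coverage

# Sanger validation tallies for the 80 tested non-synonymous SNVs.
validation = {
    "false_positive": 37,
    "inherited_129_SvJ_snp": 37,
    "candidate_somatic": 6,
}
total_validated = sum(validation.values())
somatic_fraction = validation["candidate_somatic"] / total_validated

print(f"implied genome size: {implied_genome_size / 1e9:.2f} Gbp")
print(f"raw coverage: {raw_coverage:.2f}x")
print(f"implied usable fraction: {implied_usable_fraction:.2f}")
print(f"candidate somatic fraction: {somatic_fraction:.3f}")
```

Under these assumptions, the three validation categories account for all 80 tested variants, and the gap between raw and mapped coverage reflects reads that did not map.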


Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 404-404 ◽  
Author(s):  
John S. Welch ◽  
David Larson ◽  
Li Ding ◽  
Michael D. McLellan ◽  
Tamara Lamprecht ◽  
...  

Abstract 404. To characterize the genomic events associated with distinct subtypes of AML, we used whole genome sequencing to compare 24 tumor/normal sample pairs from patients with normal karyotype (NK) M1-AML (12 cases) and t(15;17)-positive M3-AML (12 cases). All single nucleotide variants (SNVs), small insertions and deletions (indels), and cryptic structural variants (SVs) identified by whole genome sequencing (average coverage 28x) were validated using sample-specific custom Nimblegen capture arrays, followed by Illumina sequencing; an average coverage of 972 reads per somatic variant yielded 10,597 validated somatic variants (average 421/genome). Of these somatic mutations, 308 occurred in 286 unique genes; on average, 9.4 somatic mutations per genome had translational consequences. Several important themes emerged: 1) AML genomes contain a diverse range of recurrent mutations. We assessed the 286 mutated genes for recurrency in an additional 34 NK M1-AML cases and 9 M3-AML cases. We identified 51 recurrently mutated genes, including 37 that had not previously been described in AML; on average, each genome had 3 recurrently mutated genes (M1 = 3.2; M3 = 2.8, p = 0.32). 2) Many recurring mutations cluster in mutually exclusive pathways, suggesting pathophysiologic importance. The most commonly mutated genes were: FLT3 (36%), NPM1 (25%), DNMT3A (21%), IDH1 (18%), IDH2 (10%), TET2 (10%), ASXL1 (6%), NRAS (6%), TTN (6%), and WT1 (6%). In total, 3 genes (excluding PML-RARA) were mutated exclusively in M3 cases. 22 genes were found only in M1 cases (suggestive of alternative initiating mutations which occurred in methylation, signal transduction, and cohesin complex genes). 25 genes were mutated in both M1 and M3 genomes (suggestive of common progression mutations relevant for both subtypes).
A single mutation in a cell growth/signaling gene occurred in 38 of 67 cases (FLT3, NRAS, RUNX1, KIT, CACNA1E, CADM2, CSMD1); these mutations were mutually exclusive of one another, and many of them occurred in genomes with PML-RARA, suggesting that they are progression mutations. We also identified a new leukemic pathway: mutations were observed in all four genes that encode members of the cohesin complex (STAG2, SMC1A, SMC3, RAD21), which is involved in mitotic checkpoints and chromatid separation. The cohesin mutations were mutually exclusive of each other, and collectively occur in 10% of non-M3 AML patients. 3) AML genomes also contain hundreds of benign “passenger” mutations. On average 412 somatic mutations per genome were translationally silent or occurred outside of annotated genes. Both M1 and M3 cases had similar total numbers of mutations per genome, similar mutation types (which favored C>T/G>A transitions), and a similar random distribution of variants throughout the genome (which was affected neither by coding regions nor expression levels). This is consistent with our recent observations of random “passenger” mutations in hematopoietic stem cell (HSC) clones derived from normal patients (Ley et al manuscript in preparation), and suggests that most AML-associated mutations are not pathologic, but pre-existed in the HSC at the time of initial transformation. In both studies, the total number of SNVs per genome correlated positively with the age of the patient (R² = 0.48, p = 0.001), providing a possible explanation for the increasing incidence of AML in elderly patients. 4) NK M1 and M3 AML samples are mono- or oligo-clonal. By comparing the frequency of all somatic mutations within each sample, we could identify clusters of mutations with similar frequencies (leukemic clones) and determined that the average number of clones per genome was 1.8 (M1 = 1.5; M3 = 2.2; p = 0.04).
5) t(15;17) is resolved by a non-homologous end-joining repair pathway, since nucleotide resolution of all 12 t(15;17) breakpoints revealed inconsistent micro-homologies (0 – 7 bp). Summary: These data provide a genome-wide overview of NK and t(15;17) AML and provide important new insights into AML pathogenesis. AML genomes typically contain hundreds of random, non-genic mutations, but only a handful of recurring mutated genes that are likely to be pathogenic because they cluster in mutually exclusive pathways; specific combinations of recurring mutations, as well as rare and private mutations, shape the leukemia phenotype in an individual patient, and help to explain the clinical heterogeneity of this disease. Disclosures: Westervelt: Novartis: Speakers Bureau.
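
The idea in theme 4, grouping somatic mutations with similar frequencies into putative clones, can be illustrated with a toy example. This is a minimal sketch only: the abstract does not describe the clustering method actually used, so a simple gap-based grouping is swapped in here, and the function name `cluster_vafs`, the `max_gap` threshold, and the example frequencies are all hypothetical.

```python
def cluster_vafs(vafs, max_gap=0.05):
    """Group variant allele frequencies (VAFs) into clusters.

    Values are sorted, and a new cluster starts whenever the gap to the
    previous value exceeds max_gap. This is an illustrative stand-in, not
    the method used in the abstract.
    """
    clusters = []
    for vaf in sorted(vafs):
        if clusters and vaf - clusters[-1][-1] <= max_gap:
            clusters[-1].append(vaf)
        else:
            clusters.append([vaf])
    return clusters


# HYPOTHETICAL data: frequencies for seven somatic mutations in one sample.
example_vafs = [0.48, 0.45, 0.50, 0.21, 0.19, 0.23, 0.47]

clusters = cluster_vafs(example_vafs)
for i, members in enumerate(clusters, start=1):
    mean_vaf = sum(members) / len(members)
    print(f"cluster {i}: n={len(members)}, mean VAF={mean_vaf:.2f}")
```

With these made-up values, the mutations fall into two groups, one near 0.21 and one near 0.48, which would be read as two clusters of mutations with similar frequencies. Real analyses typically also account for read depth and copy number, which this sketch ignores.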

