SyRI: finding genomic rearrangements and local sequence differences from whole-genome assemblies

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Manish Goel ◽  
Hequan Sun ◽  
Wen-Biao Jiao ◽  
Korbinian Schneeberger

Abstract Genomic differences range from single nucleotide differences to complex structural variations. Current methods typically annotate sequence differences ranging from SNPs to large indels accurately but do not unravel the full complexity of structural rearrangements, including inversions, translocations, and duplications, where highly similar sequence changes in location, orientation, or copy number. Here, we present SyRI, a pairwise whole-genome comparison tool for chromosome-level assemblies. SyRI starts by finding rearranged regions and then searches for differences in the sequences, which are distinguished according to whether they reside in syntenic or rearranged regions. This distinction is important as rearranged regions are inherited differently compared to syntenic regions.
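To make the distinction between syntenic and rearranged regions concrete, here is a toy sketch. It is not SyRI's algorithm: it assumes alignment blocks between one reference and one query chromosome are already available as `(ref_start, qry_start, strand)` tuples, and the function names and labels are hypothetical. Forward-strand blocks on the longest co-linear chain are called syntenic, reverse-strand blocks are called inverted, and the remaining forward-strand blocks are called relocated.

```python
def longest_increasing_chain(values):
    """Return indices of one longest strictly increasing subsequence (O(n^2) DP)."""
    n = len(values)
    if n == 0:
        return []
    best_len = [1] * n   # length of the best chain ending at each index
    prev = [-1] * n      # back-pointer to the previous index in that chain
    for i in range(n):
        for j in range(i):
            if values[j] < values[i] and best_len[j] + 1 > best_len[i]:
                best_len[i] = best_len[j] + 1
                prev[i] = j
    end = max(range(n), key=best_len.__getitem__)
    chain = []
    while end != -1:
        chain.append(end)
        end = prev[end]
    return chain[::-1]


def classify_blocks(blocks):
    """Label each (ref_start, qry_start, strand) block; illustrative only."""
    ordered = sorted(blocks)  # order by reference position
    forward = [b for b in ordered if b[2] == "+"]
    chain = longest_increasing_chain([b[1] for b in forward])
    syntenic = {forward[i] for i in chain}
    labels = {}
    for block in ordered:
        if block in syntenic:
            labels[block] = "syntenic"
        elif block[2] == "-":
            labels[block] = "inverted"
        else:
            labels[block] = "relocated"
    return labels


example = [(0, 0, "+"), (100, 100, "+"), (200, 900, "+"),
           (300, 300, "-"), (400, 400, "+"), (500, 500, "+")]
labels = classify_blocks(example)
```

In this example the block at reference position 200 maps far out of order in the query and is labeled relocated, while the reverse-strand block at 300 is labeled inverted. A real tool must also handle duplications, overlapping alignments, and multiple chromosomes, none of which this sketch attempts.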

Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 103-103
Author(s):  
Yasuhito Nannya ◽  
Kenichi Yoshida ◽  
Lanying Zhao ◽  
June Takeda ◽  
Hiroo Ueno ◽  
...  

Abstract Background Intensive genome sequencing efforts during the past decade identified >100 driver genes recurrently mutated in one or more subtypes of myeloid neoplasms, which collectively account for the pathogenesis of >90% of the cases. However, approximately 10% of the cases have no alterations in known drivers and their pathogenesis is still unclear. A possible explanation might be the presence of alterations in non-coding regions that are not detected by conventional exome/panel sequencing; mutations and complex structural variations (SVs) affecting these regions have been shown to deregulate expression of relevant genes in a variety of solid cancers. However, no large study has analyzed a large cohort of myeloid malignancies using whole genome sequencing (WGS) to identify the full spectrum of non-coding alterations, even though its efficacy has been demonstrated in many solid cancers. In this study, we performed WGS in a large cohort of pan-myeloid cancers, in which both coding and non-coding lesions were comprehensively analyzed. Patients and methods A total of 338 cases of myeloid malignancies, including 212 with MDS, 70 with AML, 17 with MDS/MPN, 23 with t-AML/MDS, and 16 with MPN, were analyzed with WGS, of which 173 were also analyzed by transcriptome sequencing. Tumor samples were obtained from patients' bone marrow (N=269) or peripheral blood (N=69), while normal controls were derived from buccal smear (N=263) or peripheral T cells (N=75). Targeted sequencing of a panel of 86 genes was performed for all samples. Sequencing data were processed using in-house pipelines, which were optimized for detection of complex SVs and abnormalities in non-coding sequences. Results WGS identified a median of 586,612 single nucleotide variants (SNVs) and 124,863 short indels per genome. 
NMF-based decomposition of the variants disclosed three major mutational signatures, which were characterized by age-related C>T transitions at CpG sites (Sig. A), C>T transitions at CpT sites (Sig. B), and T>C transitions in the ApTpN context (Sig. C). Among these, Sig. C showed a prominent strand bias and corresponds to COSMIC signature 16, which has recently been implicated in alcohol drinking. Significant clustering of SNVs and short indels was interrogated across the genome, either divided into windows of different sizes (1 Kbp, 10 Kbp, 100 Kbp) or confined to coding exons and known regulatory regions, such as promoters, enhancers/super-enhancers, and DNase I hypersensitive sites. Recapitulating previous findings, SNVs in the coding exons were significantly enriched in known drivers, including TP53, TET2, ASXL1, DNMT3A, SF3B1, RUNX1, EZH2, and STAG2. We detected significant enrichment of SNVs in CpG islands and promoters/enhancers. We also detected a total of 8,242 SVs with a median of 15 SVs/sample, which is more than expected from conventional karyotype analysis. Focal clusters of complex rearrangements compatible with chromothripsis were found in 8 cases, of which 7 carried biallelic TP53 alterations. NMF-based signature analysis of SVs revealed that large (>1 Mb) deletions, inversions, tandem duplications, and translocations clustered together and were strongly associated with TP53 mutations, while smaller deletions and tandem duplications, but not inversions, constituted another cluster. As expected, FLT3-ITD (N=15) and MLL-PTD (N=12) were among the most frequent SVs. 
Unexpectedly, in addition to known SVs associated with t(8;21) (RUNX1-RUNX1T1) (N=6) and t(3;21) (RUNX1-MECOM) (N=1) as well as non-synonymous SNVs within the coding exons (N=30), we detected frequent non-coding alterations affecting RUNX1, including SVs (N=15) and SNVs around splicing acceptor sites (N=5), suggesting that RUNX1 was affected by multiple mechanisms, with as many as 38% of RUNX1 lesions explained by non-coding alterations. Other recurrent targets of non-coding lesions included ASXL1, NF1, and ETV6. Conclusions WGS was successfully used to reveal a comprehensive registry of genetic alterations in pan-myeloid cancers. Non-coding alterations affecting known driver genes were more common than expected, suggesting the importance of detecting non-coding abnormalities in diagnostic sequencing. Disclosures Nakagawa: Sumitomo Dainippon Pharma Co., Ltd.: Research Funding. Usuki: Mochida Pharmaceutical: Speakers Bureau; Astellas Pharma Inc.: Research Funding; Sanofi K.K.: Research Funding; GlaxoSmithKline K.K.: Research Funding; Otsuka Pharmaceutical Co., Ltd.: Research Funding; Kyowa Hakko Kirin Co., Ltd.: Research Funding; Daiichi Sankyo: Research Funding; Celgene Corporation: Research Funding, Speakers Bureau; SymBio Pharmaceuticals Limited.: Research Funding; Shire Japan: Research Funding; Janssen Pharmaceutical K.K: Research Funding; Boehringer-Ingelheim Japan: Research Funding; Sumitomo Dainippon Pharma: Research Funding, Speakers Bureau; Pfizer Japan: Research Funding, Speakers Bureau; Novartis: Speakers Bureau; Nippon Shinyaku: Speakers Bureau; Chugai Pharmaceutical: Speakers Bureau; Takeda Pharmaceutical: Speakers Bureau; Ono Pharmaceutical: Speakers Bureau; MSD K.K.: Speakers Bureau. Chiba: Bristol Myers Squibb, Astellas Pharma, Kyowa Hakko Kirin: Research Funding. Miyawaki: Otsuka Pharmaceutical Co., Ltd.: Consultancy; Novartis Pharma KK: Consultancy; Astellas Pharma Inc.: Consultancy.
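The three SNV signatures in the Results of this abstract are each described by a dominant substitution and sequence context. As a rough illustration only, the sketch below labels a single SNV by which of those contexts it matches. It assumes the common convention of reporting substitutions relative to the pyrimidine (C or T) strand. The function name `classify_snv` and the labels are hypothetical. An actual signature is a distribution over many contexts inferred by NMF across all variants, not a per-variant rule, so this is not the authors' pipeline.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA string of A/C/G/T."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_snv(trinucleotide, alt_base):
    """Label an SNV by the dominant context of Sig. A, B, or C, else 'other'.

    trinucleotide: reference context with the mutated base in the middle.
    alt_base: the alternate base observed at the middle position.
    """
    context = trinucleotide.upper()
    alt = alt_base.upper()
    # Normalize to the pyrimidine strand so the reference base is C or T.
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = alt.translate(_COMPLEMENT)
    five_prime, ref, three_prime = context
    if ref == "C" and alt == "T":
        if three_prime == "G":
            return "A"  # C>T at CpG
        if three_prime == "T":
            return "B"  # C>T at CpT
    if ref == "T" and alt == "C" and five_prime == "A":
        return "C"      # T>C at ApTpN
    return "other"
```

For example, `classify_snv("CGA", "A")` is first flipped to the pyrimidine strand (context `TCG`, alternate `T`) and then matches the CpG pattern of Sig. A.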


2002 ◽  
Vol 184 (19) ◽  
pp. 5479-5490 ◽  
Author(s):  
R. D. Fleischmann ◽  
D. Alland ◽  
J. A. Eisen ◽  
L. Carpenter ◽  
O. White ◽  
...  

ABSTRACT Virulence and immunity are poorly understood in Mycobacterium tuberculosis. We sequenced the complete genome of the M. tuberculosis clinical strain CDC1551 and performed a whole-genome comparison with the laboratory strain H37Rv in order to identify polymorphic sequences with potential relevance to disease pathogenesis, immunity, and evolution. We found large-sequence and single-nucleotide polymorphisms in numerous genes. Polymorphic loci included a phospholipase C, a membrane lipoprotein, members of an adenylate cyclase gene family, and members of the PE/PPE gene family, some of which have been implicated in virulence or the host immune response. Several gene families, including the PE/PPE gene family, also had significantly higher synonymous and nonsynonymous substitution frequencies compared to the genome as a whole. We tested a large sample of M. tuberculosis clinical isolates for a subset of the large-sequence and single-nucleotide polymorphisms and found widespread genetic variability at many of these loci. We performed phylogenetic and epidemiological analysis to investigate the evolutionary relationships among isolates and the origins of specific polymorphic loci. A number of these polymorphisms appear to have occurred multiple times as independent events, suggesting that these changes may be under selective pressure. Together, these results demonstrate that polymorphisms among M. tuberculosis strains are more extensive than initially anticipated, and genetic variation may have an important role in disease pathogenesis and immunity.
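Comparing synonymous and nonsynonymous substitution frequencies depends on deciding, for each coding SNP, whether it changes the encoded amino acid. A minimal sketch of that decision follows, assuming the standard genetic code and an in-frame codon. The helper name `classify_substitution` is hypothetical and is not taken from the paper.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, alt_base):
    """Classify a single-base change within a codon.

    codon: three-letter reference codon (DNA alphabet).
    position: 0-based index of the changed base within the codon.
    alt_base: the substituted base.
    Changes to or from a stop codon count as nonsynonymous here.
    """
    codon = codon.upper()
    mutated = codon[:position] + alt_base.upper() + codon[position + 1:]
    if mutated == codon:
        raise ValueError("alt_base must differ from the reference base")
    same_residue = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "synonymous" if same_residue else "nonsynonymous"
```

For instance, CTG to CTA leaves leucine unchanged and is synonymous, whereas GAG to GTG replaces glutamate with valine and is nonsynonymous.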


2021 ◽  
Author(s):  
Lihua Zou

Diffuse intrinsic pontine glioma (DIPG) is a deadly disease among young children. The evolutionary path and mutational processes giving rise to DIPG remain elusive. We analyzed 100 whole genome sequences (WGS) from 60 DIPG patients. This revealed that 25% of DIPGs acquired whole-genome duplications (WGD) early during tumor evolution. WGD samples are associated with loss of TP53 and poorer survival. In addition, almost all WGD samples harbor complex structural variations (SVs) and show characteristic short microhomology at SV breakpoints. Mutation analysis revealed that the H3K27M driver mutation is acquired early during tumor clonal evolution. Mutation signature analysis identified a unique mutational process at a late stage of tumor evolution. This study revealed that tumor evolution of DIPG is characterized by chromosomal instability shaped by DNA repair defects and dynamic mutational processes. Our work sheds new light on the pathogenesis of DIPG and provides a rationale for designing novel therapies for this deadly disease.


2019 ◽  
Author(s):  
Can Wang ◽  
Lingbo Zhou ◽  
Xu Gao ◽  
Yanqing Ding ◽  
Bin Cheng ◽  
...  

Abstract Hongyingzi is a special waxy sorghum (Sorghum bicolor L. Moench) cultivar used for brewing Moutai liquor. To gain an overall understanding of the Hongyingzi genome, we performed whole-genome resequencing at 56.10X depth to reveal its variations comprehensively. Compared with the BTx623 reference genome, 2.48% of genome sequences were altered in the Hongyingzi genome. Among these alterations, there were 1885774 single nucleotide polymorphisms (SNPs), 309381 small insertions and deletions (Indels), 31966 structural variations (SVs), and 217273 copy number variations (CNVs). These alterations conferred 29614 gene variations. It was also predicted that 35 gene variations were related to the multidrug and toxic efflux (MATE) transporter, chalcone synthase (CHS), ATPase isoform 10 (AHA10) transporter, dihydroflavonol-4-reductase (DFR), the laccase 15 (LAC15), flavonol 3′-hydroxylase (F3′H), flavanone 3-hydroxylase (F3H), O-methyltransferase (OMT), flavonoid 3′5′ hydroxylase (F3′5′H), UDP-glucose:sterol-glucosyltransferase (SGT), flavonol synthase (FLS), and chalcone isomerase (CHI) involved in tannin synthesis. These results provide theoretical support for the development of molecular markers and gene function studies related to liquor-making traits, and for the genetic improvement of waxy sorghum based on genome editing technology.


2020 ◽  
Author(s):  
Ya-Qi Tan ◽  
Yue-Qiu Tan ◽  
De-Hua Cheng

Abstract BACKGROUND: Apparently balanced chromosome rearrangements (ABCRs) in non-affected individuals are well known to carry high reproductive risks such as infertility, abnormal offspring, and pregnancy loss. However, caution should be exercised in genetic counseling and reproductive intervention because cryptic unbalanced defects and genome structural variations beyond the resolution of routine cytogenetics may not be detected. CASE PRESENTATION: Two familial cases of ABCRs were recruited in this study. In family 1, the couple suffered two aborted pregnancies and underwent labor induction. Single nucleotide polymorphism (SNP) array analysis of the aborted sample from the second pregnancy revealed a 10.8 Mb heterozygous deletion at 10q26.13q26.3 and a 5.5 Mb duplication at 19q13.41-q13.43. The non-affected father was identified as a carrier of a three-way complex chromosomal rearrangement [t(6;10;19)(p22;q26;q13)] by karyotyping. Whole-genome mate-pair sequencing revealed a cryptic breakpoint on the derivative chromosome 19 (der19), indicating that the karyotype was a more complex structural rearrangement comprising four breakpoints. Three genes, FAM24B, CACNG8, and KIAA0556, were disrupted without causing any abnormal phenotype in the carrier. In family 2, the couple suffered a spontaneous miscarriage. This family had an affected child with multiple congenital deformities and an unbalanced karyotype, 46,XY,der(11)t(6;11)(q13;p11.2). The female partner was identified as a balanced translocation carrier with the karyotype 46,XX,t(6;11)(q13;p11.2)dn. Further SNP array and fluorescent in situ hybridization (FISH) analysis indicated a cryptic insertion between chromosome 6 and chromosome 11. 
Finally, whole-genome mate-pair sequencing revealed an extremely complex genomic structural variation, including a cryptic deletion and 12 breakpoints on chromosome 11, and 1 breakpoint on chromosome 6. CONCLUSIONS: Our study investigated two rare cases of ABCRs and demonstrated the efficacy of whole-genome mate-pair sequencing in analyzing complex genomic structural variation. When ABCRs are detected by conventional cytogenetic techniques, whole genome sequencing (WGS) based approaches should be considered for accurate diagnosis, effective genetic counseling, and correct reproductive intervention to avoid recurrence risks.


2020 ◽  
Author(s):  
Yaqi Tan ◽  
Dehua Cheng ◽  
Yue-Qiu Tan

Abstract BACKGROUND: Apparently balanced chromosome rearrangements (ABCRs) in non-affected individuals are well known to carry high reproductive risks such as infertility, abnormal offspring, and pregnancy loss. However, caution should be exercised in genetic counseling and reproductive intervention because cryptic unbalanced defects and genome structural variations beyond the resolution of routine cytogenetics may be missed. CASE PRESENTATION: Two cases of ABCRs were recruited in this study. In family 1, the couple suffered two terminated pregnancies and underwent labor induction. Single nucleotide polymorphism (SNP) array analysis of the aborted sample from the second pregnancy showed a 10.8 Mb heterozygous deletion at 10q26.13q26.3 and a 5.5 Mb duplication at 19q13.41-q13.43. The non-affected father was diagnosed as a carrier of a three-way complex chromosomal rearrangement [t(6;10;19)(p22;q26;q13)] by karyotyping. Whole-genome mate-pair sequencing revealed a cryptic breakpoint on derivative (der) chromosome 19, indicating that the karyotype was a more complex structural rearrangement including four breakpoints. Three genes, FAM24B, CACNG8, and KIAA0556, were disrupted without causing any abnormal phenotype in the carrier. In family 2, the couple suffered a spontaneous miscarriage. They had an affected child with multiple congenital malformations and an unbalanced karyotype, 46,XY,der(11)t(6;11)(q13;p11.2). The female partner was considered a balanced translocation carrier with the karyotype 46,XX,t(6;11)(q13;p11.2)dn. Further SNP array and fluorescent in situ hybridization (FISH) analysis indicated a cryptic insertion between chromosome 6 and chromosome 11. 
Finally, whole-genome mate-pair sequencing revealed an extremely complex genomic structural variation, including a cryptic deletion and 12 breakpoints on derivative chromosome 11 [der(11)], and 1 breakpoint on derivative chromosome 6 [der(6)]. CONCLUSIONS: Our study investigated two rare cases of ABCRs and demonstrated that whole-genome mate-pair sequencing is a powerful approach for analyzing complex genomic structural variation. When ABCRs are detected by conventional cytogenetic techniques, whole genome sequencing (WGS) based approaches should be considered for accurate diagnosis, effective genetic counseling, and correct reproductive intervention to avoid recurrence risks.

