Genomic characterization of a pathogenic isolate of Saccharomyces cerevisiae reveals an extensive and dynamic landscape of structural variation

2021 ◽  
Author(s):  
Lydia R. Heasley ◽  
Juan Lucas Argueso

The budding yeast Saccharomyces cerevisiae has been extensively characterized for many decades and is a critical resource for the study of numerous facets of eukaryotic biology. Recently, the analysis of whole genome sequencing data from over 1000 natural isolates of S. cerevisiae has provided critical insights into the evolutionary landscape of this species by revealing a population structure comprised of numerous genomically diverse lineages. These survey-level analyses have been largely devoid of structural genomic information, mainly because short read sequencing is not suitable for detailed characterization of genomic architecture. Consequently, we still lack a complete perspective of the genomic variation that exists within the species. Single molecule long read sequencing technologies, such as Oxford Nanopore and PacBio, provide sequencing-based approaches with which to rigorously define the structure of a genome, and have empowered yeast geneticists to explore this poorly described realm of eukaryotic genomics. Here, we present the comprehensive genomic structural analysis of a pathogenic isolate of S. cerevisiae, YJM311. We used long read sequence analysis to construct a haplotype-phased, telomere-to-telomere length assembly of the YJM311 diploid genome and characterized the structural variations (SVs) therein. We discovered that the genome of YJM311 contains significant intragenomic structural variation, some of which imparts notable consequences to the genomic stability and developmental biology of the strain. Collectively, we outline a new methodology for creating accurate haplotype-phased genome assemblies and highlight how such genomic analyses can define the structural architectures of S. cerevisiae isolates. It is our hope that through continued structural characterization of S. cerevisiae genomes, such as we have reported here for YJM311, we will comprehensively advance our understanding of eukaryotic genome structure-function relationships, structural diversity, and evolution.

Genetics ◽  
2021 ◽  
Author(s):  
Lydia R Heasley ◽  
Juan Lucas Argueso

The budding yeast Saccharomyces cerevisiae has been extensively characterized for many decades and is a critical resource for the study of numerous facets of eukaryotic biology. Recently, whole genome sequence analysis of over 1000 natural isolates of S. cerevisiae has provided critical insights into the evolutionary landscape of this species by revealing a population structure comprised of numerous genomically diverse lineages. These survey-level analyses have been largely devoid of structural genomic information, mainly because short read sequencing is not suitable for detailed characterization of genomic architecture. Consequently, we still lack a complete perspective of the genomic variation that exists within the species. Single molecule long read sequencing technologies, such as Oxford Nanopore and PacBio, provide sequencing-based approaches with which to rigorously define the structure of a genome, and have empowered yeast geneticists to explore this poorly described realm of eukaryotic genomics. Here, we present the comprehensive genomic structural analysis of a wild diploid isolate of S. cerevisiae, YJM311. We used long read sequence analysis to construct a haplotype-phased, telomere-to-telomere length assembly of the YJM311 genome and characterized the structural variations (SVs) therein. We discovered that the genome of YJM311 contains significant intragenomic structural variation, some of which imparts notable consequences to the genomic stability and developmental biology of the strain. Collectively, we outline a new methodology for creating accurate haplotype-phased genome assemblies and highlight how such genomic analyses can define the structural architectures of S. cerevisiae isolates. It is our hope that continued structural characterization of S. cerevisiae genomes, such as we have reported here for YJM311, will comprehensively advance our understanding of eukaryotic genome structure-function relationships, structural genomic diversity, and evolution.


2020 ◽  
Author(s):  
Anna Tigano ◽  
Arne Jacobs ◽  
Aryn P. Wilder ◽  
Ankita Nand ◽  
Ye Zhan ◽  
...  

The levels and distribution of standing genetic variation in a genome can provide a wealth of insights about the adaptive potential, demographic history, and genome structure of a population or species. As structural variants are increasingly associated with traits important for adaptation and speciation, investigating both sequence and structural variation is essential for wholly tapping this potential. Using a combination of shotgun sequencing, 10X Genomics linked reads and proximity-ligation data (Chicago and Hi-C), we produced and annotated a chromosome-level genome assembly for the Atlantic silverside (Menidia menidia) - an established ecological model for studying the phenotypic effects of natural and artificial selection - and examined patterns of genomic variation across two individuals sampled from different populations with divergent local adaptations. Levels of diversity varied substantially across each chromosome, consistently being highly elevated near the ends (presumably near telomeric regions) and dipping to near zero around putative centromeres. Overall, our estimate of the genome-wide average heterozygosity in the Atlantic silverside is the highest reported for a fish, or any vertebrate, to date (1.32-1.76% depending on inference method and sample). Furthermore, we also found extreme levels of structural variation, affecting ~23% of the total genome sequence, including multiple large inversions (> 1 Mb and up to 12.6 Mb) associated with previously identified haploblocks showing strong differentiation between locally adapted populations. These extreme levels of standing genetic variation are likely associated with large effective population sizes and may help explain the remarkable adaptive divergence among populations of the Atlantic silverside.


2021 ◽  
Author(s):  
Julie M Behr ◽  
Xiaotong Yao ◽  
Kevin Hadi ◽  
Huasong Tian ◽  
Aditya Deshpande ◽  
...  

Recent pan-cancer studies have delineated patterns of structural genomic variation across thousands of tumor whole genome sequences. It is not known to what extent the shortcomings of short read (≤ 150 bp) whole genome sequencing (WGS) used for structural variant analysis have limited our understanding of cancer genome structure. To formally address this, we introduce the concept of "loose ends" - copy number alterations that cannot be mapped to a rearrangement by WGS but can be indirectly detected through the analysis of junction-balanced genome graphs. Analyzing 2,319 pan-cancer WGS cases across 31 tumor types, we found loose ends were enriched in reference repeats and fusions of the mappable genome to repetitive or foreign sequences. Among these we found genomic footprints of neotelomeres, which were surprisingly enriched in cancers with low telomerase expression and alternate lengthening of telomeres phenotype. Our results also provide a rigorous upper bound on the role of non-allelic homologous recombination (NAHR) in large-scale cancer structural variation, while nominating INO80, FANCA, and ARID1A as positive modulators of somatic NAHR. Taken together, we estimate that short read WGS maps >97% of all large-scale (>10 kbp) cancer structural variation; the rest represent loose ends that require long molecule profiling to unambiguously resolve. Our results have broad relevance for future research and clinical applications of short read WGS and delineate precise directions where long molecule studies might provide transformative insight into cancer genome structure.


2021 ◽  
Vol 12 ◽  
Author(s):  
Gwyneth Halstead-Nussloch ◽  
Tsuyoshi Tanaka ◽  
Dario Copetti ◽  
Timothy Paape ◽  
Fuminori Kobayashi ◽  
...  

The seed protein α-gliadin is a major component of wheat flour and causes gluten-related diseases. However, due to the complexity of this multigene family with a genome structure composed of dozens of copies derived from tandem and genome duplications, little was known about the variation between accessions, and thus little effort has been made to explicitly target α-gliadin for bread wheat breeding. Here, we analyzed genomic variation in α-gliadins across 11 recently published chromosome-scale assemblies of hexaploid wheat, with validation using long-read data. We unexpectedly found that the Gli-B2 locus is not a single contiguous locus but is composed of two subloci, suggesting the possibility of recombination between the two during breeding. We confirmed that the number of immunogenic epitopes among 11 accessions varied. The D subgenome of a European spelt line also contained epitopes, in agreement with its hybridization history. Evolutionary analysis identified amino acid sites under diversifying selection, suggesting their functional importance. The analysis opens the way for improved grain quality and safety through wheat breeding.


Plants ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 270 ◽  
Author(s):  
Yun Gyeong Lee ◽  
Sang Chul Choi ◽  
Yuna Kang ◽  
Kyeong Min Kim ◽  
Chon-Sik Kang ◽  
...  

Whole genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages in comparison with other Next-Generation Sequencing (NGS) platforms: it is relatively inexpensive, portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence information is released in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes when compared to animals or microorganisms. As a result, complete genome sequencing is difficult for plant species. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assemblies obtained from short NGS reads, because it minimizes gaps and spans the repetitive sequences that appear in plant genomes. Here, we conducted genome sequencing for S. bicolor cv. BTx623 using the MinION platform and obtained 895,678 reads and 17.9 gigabases (Gb) (ca. 25× coverage of the reference) of long-read sequence data. De novo assembly with two different tools produced a total of 6124 contigs (covering 45.9%) from Canu and a total of 2661 contigs (covering 50%) from Minimap and Miniasm with Racon; the assembled contigs were then mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform, and a clue for determining the total sequencing scale needed for optimal coverage based on various genome sizes.
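
The coverage figure quoted above can be checked with simple arithmetic: total sequenced bases divided by genome size. The sketch below uses only the numbers given in the abstract (17.9 Gb, 895,678 reads, a genome of about 730 Mb) and assumes that "Gb" denotes gigabases. The function names are illustrative and do not come from the paper.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth of coverage: sequenced bases per reference base."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def mean_read_length(total_bases: float, read_count: int) -> float:
    """Mean read length implied by the total yield and the read count."""
    if read_count <= 0:
        raise ValueError("read_count must be positive")
    return total_bases / read_count


# Figures quoted in the abstract (Gb assumed to mean gigabases).
TOTAL_BASES = 17.9e9
READ_COUNT = 895_678
GENOME_SIZE = 730e6  # approximate S. bicolor genome size

coverage = fold_coverage(TOTAL_BASES, GENOME_SIZE)
read_len = mean_read_length(TOTAL_BASES, READ_COUNT)

print(f"coverage: about {coverage:.1f}x")
print(f"implied mean read length: about {read_len:.0f} bp")
```

The same two functions can be turned around to estimate how much sequence is needed for a target depth: multiply the desired coverage by the genome size.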


2015 ◽  
Author(s):  
Xuefang Zhao ◽  
Sarah B. Emery ◽  
Bridget Myers ◽  
Jeffrey M. Kidd ◽  
Ryan E. Mills

Complex chromosomal rearrangements consist of structural genomic alterations involving multiple instances of deletions, duplications, inversions, or translocations that co-occur either on the same chromosome or represent different overlapping events on homologous chromosomes. We present SVelter, an algorithm that first identifies regions of the genome suspected to harbor a complex event and then iteratively rearranges the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. We show that SVelter is able to accurately reconstruct these regions when compared to well-characterized genomes that have been deep sequenced with both short and long read technologies.


2021 ◽  
Author(s):  
Hanna Sigeman ◽  
Bella Sinclair ◽  
Bengt Hansson

Sex chromosomes have evolved numerous times, as revealed by recent genomic studies. However, large gaps in our knowledge of sex chromosome diversity across the tree of life remain. Filling these gaps, through the study of novel species, is crucial for improved understanding of why and how sex chromosomes evolve. Characterization of sex chromosomes in already well-studied organisms is also important to avoid misinterpretations of population genomic patterns caused by undetected sex chromosome variation. Here we present findZX, an automated Snakemake-based computational pipeline for detecting and visualizing sex chromosomes through differences in genome coverage and heterozygosity between males and females. FindZX is user-friendly and scalable to suit different computational platforms and works with any number of male and female samples. An option to perform a genome coordinate lift-over to a reference genome of another species allows users to inspect sex-linked regions over larger contiguous chromosome regions, while also providing important between-species synteny information. To demonstrate its effectiveness, we applied findZX to publicly available genomic data from species belonging to widely different taxonomic groups (mammals, birds, reptiles, fish, and insects), with sex chromosome systems of different ages, sizes, and levels of differentiation. We also demonstrate that the lift-over method is robust over large phylogenetic distances (>80 million years of evolution).
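
The coverage half of this idea can be illustrated with a small sketch. It is not findZX's implementation. It assumes per-window mean coverage values are already available for each sex (the numbers below are made up), ignores heterozygosity and per-sample depth normalization, and uses a hypothetical threshold on the female-to-male log2 coverage ratio.

```python
import math
from statistics import mean


def log2_sex_coverage_ratio(female_cov, male_cov):
    """log2(mean female coverage / mean male coverage) for one window."""
    f = mean(female_cov)
    m = mean(male_cov)
    if f <= 0 or m <= 0:
        raise ValueError("mean coverage must be positive in both sexes")
    return math.log2(f / m)


def flag_windows(windows, threshold=0.5):
    """Return names of windows whose |log2 ratio| exceeds the threshold.

    `windows` maps a window name to (female coverages, male coverages).
    The default threshold is an arbitrary illustrative choice.
    """
    flagged = []
    for name, (female_cov, male_cov) in windows.items():
        if abs(log2_sex_coverage_ratio(female_cov, male_cov)) > threshold:
            flagged.append(name)
    return flagged


# Made-up example: one window with equal depth in both sexes, and one
# where females have about half the male depth, as expected for a
# Z-linked region in a ZW system.
windows = {
    "chr1:0-1Mb": ([30.0, 31.0], [30.5, 29.5]),
    "chrZ:0-1Mb": ([15.0, 16.0], [30.0, 31.0]),
}

flagged = flag_windows(windows)
print(flagged)
```

A halving of depth in one sex corresponds to a log2 ratio near -1 or +1, which is why a threshold between 0 and 1 separates the two made-up windows here.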


2020 ◽  
Vol 117 (45) ◽  
pp. 28221-28231 ◽  
Author(s):  
Nadia M. V. Sampaio ◽  
V. P. Ajith ◽  
Ruth A. Watson ◽  
Lydia R. Heasley ◽  
Parijat Chakraborty ◽  
...  

Conventional models of genome evolution are centered around the principle that mutations form independently of each other and build up slowly over time. We characterized the occurrence of bursts of genome-wide loss-of-heterozygosity (LOH) in Saccharomyces cerevisiae, providing support for an additional nonindependent and faster mode of mutation accumulation. We initially characterized a yeast clone isolated for carrying an LOH event at a specific chromosome site, and surprisingly found that it also carried multiple unselected rearrangements elsewhere in its genome. Whole-genome analysis of over 100 additional clones selected for carrying primary LOH tracts revealed that they too contained unselected structural alterations more often than control clones obtained without any selection. We also measured the rates of coincident LOH at two different chromosomes and found that double LOH formed at rates 14- to 150-fold higher than expected if the two underlying single LOH events occurred independently of each other. These results were consistent across different strain backgrounds and in mutants incapable of entering meiosis. Our results indicate that a subset of mitotic cells within a population can experience discrete episodes of systemic genomic instability, when the entire genome becomes vulnerable and multiple chromosomal alterations can form over a narrow time window. They are reminiscent of early reports from the classic yeast genetics literature, as well as recent studies in humans, both in cancer and genomic disorder contexts. The experimental model we describe provides a system to further dissect the fundamental biological processes responsible for punctuated bursts of structural genomic variation.
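
The "expected if independent" baseline in a comparison like this is the product of the two single-event rates, and the fold change is the observed double-event rate divided by that product. The sketch below shows the arithmetic with placeholder rates; these are not values from the study, and the function names are illustrative.

```python
def expected_double_rate(rate_a: float, rate_b: float) -> float:
    """Rate of both events co-occurring if they arise independently."""
    return rate_a * rate_b


def fold_over_independence(observed_double: float,
                           rate_a: float,
                           rate_b: float) -> float:
    """Observed double-event rate relative to the independence expectation."""
    expected = expected_double_rate(rate_a, rate_b)
    if expected <= 0:
        raise ValueError("single-event rates must be positive")
    return observed_double / expected


# Placeholder rates for illustration only.
RATE_LOH_A = 1e-4       # single LOH at the first chromosome
RATE_LOH_B = 2e-4       # single LOH at the second chromosome
OBSERVED_DOUBLE = 2e-7  # observed coincident LOH at both

fold = fold_over_independence(OBSERVED_DOUBLE, RATE_LOH_A, RATE_LOH_B)
print(f"double LOH is about {fold:.0f}-fold above the independence expectation")
```

A fold value near 1 would be consistent with independent events; values well above 1, such as the 14- to 150-fold range reported above, indicate that the two events co-occur more often than independence predicts.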


2020 ◽  
Vol 11 ◽  
Author(s):  
Mei-Wei Luan ◽  
Xiao-Ming Zhang ◽  
Zi-Bin Zhu ◽  
Ying Chen ◽  
Shang-Qian Xie
