Multiple Wheat Genomes Reveal Novel Gli-2 Sublocus Location and Variation of Celiac Disease Epitopes in Duplicated α-Gliadin Genes

2021 ◽  
Vol 12 ◽  
Author(s):  
Gwyneth Halstead-Nussloch ◽  
Tsuyoshi Tanaka ◽  
Dario Copetti ◽  
Timothy Paape ◽  
Fuminori Kobayashi ◽  
...  

The seed protein α-gliadin is a major component of wheat flour and causes gluten-related diseases. However, due to the complexity of this multigene family with a genome structure composed of dozens of copies derived from tandem and genome duplications, little was known about the variation between accessions, and thus little effort has been made to explicitly target α-gliadin for bread wheat breeding. Here, we analyzed genomic variation in α-gliadins across 11 recently published chromosome-scale assemblies of hexaploid wheat, with validation using long-read data. We unexpectedly found that the Gli-B2 locus is not a single contiguous locus but is composed of two subloci, suggesting the possibility of recombination between the two during breeding. We confirmed that the number of immunogenic epitopes among 11 accessions varied. The D subgenome of a European spelt line also contained epitopes, in agreement with its hybridization history. Evolutionary analysis identified amino acid sites under diversifying selection, suggesting their functional importance. The analysis opens the way for improved grain quality and safety through wheat breeding.

2021 ◽  
Author(s):  
Lydia R. Heasley ◽  
Juan Lucas Argueso

The budding yeast Saccharomyces cerevisiae has been extensively characterized for many decades and is a critical resource for the study of numerous facets of eukaryotic biology. Recently, the analysis of whole genome sequencing data from over 1000 natural isolates of S. cerevisiae has provided critical insights into the evolutionary landscape of this species by revealing a population structure comprised of numerous genomically diverse lineages. These survey-level analyses have been largely devoid of structural genomic information, mainly because short read sequencing is not suitable for detailed characterization of genomic architecture. Consequently, we still lack a complete perspective of the genomic variation that exists within the species. Single molecule long read sequencing technologies, such as Oxford Nanopore and PacBio, provide sequencing-based approaches with which to rigorously define the structure of a genome, and have empowered yeast geneticists to explore this poorly described realm of eukaryotic genomics. Here, we present the comprehensive genomic structural analysis of a pathogenic isolate of S. cerevisiae, YJM311. We used long read sequence analysis to construct a haplotype-phased, telomere-to-telomere length assembly of the YJM311 diploid genome and characterized the structural variations (SVs) therein. We discovered that the genome of YJM311 contains significant intragenomic structural variation, some of which imparts notable consequences to the genomic stability and developmental biology of the strain. Collectively, we outline a new methodology for creating accurate haplotype-phased genome assemblies and highlight how such genomic analyses can define the structural architectures of S. cerevisiae isolates. It is our hope that through continued structural characterization of S. cerevisiae genomes, such as we have reported here for YJM311, we will comprehensively advance our understanding of eukaryotic genome structure-function relationships, structural diversity, and evolution.


Genetics ◽  
2021 ◽  
Author(s):  
Lydia R Heasley ◽  
Juan Lucas Argueso

Abstract The budding yeast Saccharomyces cerevisiae has been extensively characterized for many decades and is a critical resource for the study of numerous facets of eukaryotic biology. Recently, whole genome sequence analysis of over 1000 natural isolates of S. cerevisiae has provided critical insights into the evolutionary landscape of this species by revealing a population structure comprised of numerous genomically diverse lineages. These survey-level analyses have been largely devoid of structural genomic information, mainly because short read sequencing is not suitable for detailed characterization of genomic architecture. Consequently, we still lack a complete perspective of the genomic variation that exists within the species. Single molecule long read sequencing technologies, such as Oxford Nanopore and PacBio, provide sequencing-based approaches with which to rigorously define the structure of a genome, and have empowered yeast geneticists to explore this poorly described realm of eukaryotic genomics. Here, we present the comprehensive genomic structural analysis of a wild diploid isolate of S. cerevisiae, YJM311. We used long read sequence analysis to construct a haplotype-phased, telomere-to-telomere length assembly of the YJM311 genome and characterized the structural variations (SVs) therein. We discovered that the genome of YJM311 contains significant intragenomic structural variation, some of which imparts notable consequences to the genomic stability and developmental biology of the strain. Collectively, we outline a new methodology for creating accurate haplotype-phased genome assemblies and highlight how such genomic analyses can define the structural architectures of S. cerevisiae isolates. It is our hope that continued structural characterization of S. cerevisiae genomes, such as we have reported here for YJM311, will comprehensively advance our understanding of eukaryotic genome structure-function relationships, structural genomic diversity, and evolution.


Plants ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 270 ◽  
Author(s):  
Yun Gyeong Lee ◽  
Sang Chul Choi ◽  
Yuna Kang ◽  
Kyeong Min Kim ◽  
Chon-Sik Kang ◽  
...  

Whole genome sequencing (WGS) has become a crucial tool in understanding genome structure and genetic variation. MinION sequencing from Oxford Nanopore Technologies (ONT) is an excellent approach for performing WGS and has advantages over other Next-Generation Sequencing (NGS) platforms: it is relatively inexpensive and portable, has simple library preparation, can be monitored in real time, and has no theoretical limit on read length. Sorghum bicolor (L.) Moench is diploid (2n = 2x = 20) with a genome size of about 730 Mb, and its genome sequence is available in the Phytozome database. Therefore, sorghum can be used as a good reference. However, plant species have complex and large genomes when compared to animals or microorganisms. As a result, complete genome sequencing is difficult for plant species. MinION sequencing, which produces long reads, can be an excellent tool for overcoming the weak assembly of short reads generated by NGS, because it minimizes gaps and spans the repetitive sequences found in plant genomes. Here, we sequenced the genome of S. bicolor cv. BTx623 on the MinION platform and obtained 895,678 reads and 17.9 gigabytes (Gb) (ca. 25× coverage of reference) of long-read sequence data. De novo assembly with two different tools generated 6124 contigs (covering 45.9%) with Canu and 2661 contigs (covering 50%) with Minimap and Miniasm with Racon; the assembled contigs were then mapped against the sorghum reference genome. Our results provide an optimal series of long-read sequencing analyses for plant species using the MinION platform and a clue for determining the total sequencing scale needed for optimal coverage based on genome size.
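The "ca. 25× coverage" figure above can be checked with simple arithmetic. The sketch below is only an illustration, not the authors' calculation: it assumes the reported 17.9 Gb counts sequenced bases (the abstract itself says "gigabytes") and uses the approximate 730 Mb genome size. The function names are hypothetical.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Mean fold coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def bases_needed(target_coverage: float, genome_size: float) -> float:
    """Sequencing yield (in bases) required to reach a target fold coverage."""
    return target_coverage * genome_size


# Values taken from the abstract; treating "Gb" as gigabases is an assumption.
SORGHUM_GENOME_BP = 730e6   # about 730 Mb
MINION_YIELD_BP = 17.9e9    # 17.9 Gb

coverage = fold_coverage(MINION_YIELD_BP, SORGHUM_GENOME_BP)
print(f"Estimated coverage: {coverage:.1f}x")  # Estimated coverage: 24.5x
```

The same relation, run in reverse with `bases_needed`, gives a rough estimate of how much sequencing a genome of a different size would need for a chosen coverage.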


2020 ◽  
Author(s):  
Anna Tigano ◽  
Arne Jacobs ◽  
Aryn P. Wilder ◽  
Ankita Nand ◽  
Ye Zhan ◽  
...  

Abstract The levels and distribution of standing genetic variation in a genome can provide a wealth of insights about the adaptive potential, demographic history, and genome structure of a population or species. As structural variants are increasingly associated with traits important for adaptation and speciation, investigating both sequence and structural variation is essential for wholly tapping this potential. Using a combination of shotgun sequencing, 10X Genomics linked reads and proximity-ligation data (Chicago and Hi-C), we produced and annotated a chromosome-level genome assembly for the Atlantic silverside (Menidia menidia) - an established ecological model for studying the phenotypic effects of natural and artificial selection - and examined patterns of genomic variation across two individuals sampled from different populations with divergent local adaptations. Levels of diversity varied substantially across each chromosome, consistently being highly elevated near the ends (presumably near telomeric regions) and dipping to near zero around putative centromeres. Overall, our estimate of the genome-wide average heterozygosity in the Atlantic silverside is the highest reported for a fish, or any vertebrate, to date (1.32-1.76% depending on inference method and sample). Furthermore, we also found extreme levels of structural variation, affecting ~23% of the total genome sequence, including multiple large inversions (> 1 Mb and up to 12.6 Mb) associated with previously identified haploblocks showing strong differentiation between locally adapted populations. These extreme levels of standing genetic variation are likely associated with large effective population sizes and may help explain the remarkable adaptive divergence among populations of the Atlantic silverside.


Agronomy ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 816
Author(s):  
Ana B. Huertas-García ◽  
Laura Castellano ◽  
Carlos Guzmán ◽  
Juan B. Alvarez

Wild einkorn (Triticum monococcum L. ssp. aegilopoides (Link) Thell.) is a diploid wheat species from the Near East that has been classified as an ancestor of the first cultivated wheat (einkorn; T. monococcum L. ssp. monococcum). Its genome (Am), although it is not the donor of the A genome in polyploid wheat, shows high similarity to the Au genome. An important characteristic for wheat improvement is grain quality, which is associated with three components of the wheat grain: endosperm storage proteins (gluten properties), starch synthases (starch characteristics) and puroindolines (grain hardness). In the current study, these grain quality traits were studied in one collection of wild einkorn with the objective of evaluating its variability with respect to these three traits. The combined use of protein and DNA analyses allowed the detection of numerous variants for each of the following genes: six for Ax, seven for Ay, eight for Wx, four for Gsp-1, two for Pina and three for Pinb. The high variability present in this species suggests its potential as a source of novel alleles that could be used in modern wheat breeding.


Author(s):  
Alexandrina Bodrug-Schepers ◽  
Nancy Stralis-Pavese ◽  
Hermann Buerstmayr ◽  
Juliane C. Dohm ◽  
Heinz Himmelbauer

Abstract Key message We propose to use the natural variation between individuals of a population for genome assembly scaffolding. In today’s genome projects, multiple accessions get sequenced, leading to variant catalogs. Using such information to improve genome assemblies is attractive both in terms of cost and scientifically, because the value of an assembly increases with its contiguity. We conclude that haplotype information is a valuable resource to group and order contigs toward the generation of pseudomolecules. Abstract Quinoa (Chenopodium quinoa) has been under cultivation in Latin America for more than 7500 years. Recently, quinoa has gained increasing attention due to its stress resistance and its nutritional value. We generated a novel quinoa genome assembly for the Bolivian accession CHEN125 using PacBio long-read sequencing data (assembly size 1.32 Gbp, initial N50 size 608 kbp). Next, we re-sequenced 50 quinoa accessions from Peru and Bolivia. This set of accessions differed at 4.4 million single-nucleotide variant (SNV) positions compared to CHEN125 (1.4 million SNV positions on average per accession). We show how to exploit variation in accessions that are distantly related to establish a genome-wide ordered set of contigs for guided scaffolding of a reference assembly. The method is based on detecting shared haplotypes and their expected continuity throughout the genome (i.e., the effect of linkage disequilibrium), as an extension of what is expected in mapping populations where only a few haplotypes are present. We test the approach using Arabidopsis thaliana data from different populations. After applying the method to our CHEN125 quinoa assembly, we validated the results with mate-pairs, genetic markers, and another quinoa assembly originating from a Chilean cultivar. We show consistency between these information sources and the haplotype-based relations as determined by us and obtain an improved assembly with an N50 size of 1079 kbp and ordered contig groups of up to 39.7 Mbp. We conclude that haplotype information in distantly related individuals of the same species is a valuable resource to group and order contigs according to their adjacency in the genome toward the generation of pseudomolecules.
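The N50 sizes quoted above (608 kbp initially, 1079 kbp after scaffolding) measure assembly contiguity. As a minimal sketch of the standard N50 definition, assuming nothing about the authors' own tooling: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The contig lengths below are invented toy values, not data from the quinoa assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths in kbp (illustration only).
contigs = [100, 80, 60, 40, 20]
# Joining the 60 and 40 kbp contigs into one scaffold leaves the total unchanged.
scaffolds = [100, 100, 80, 20]

print(n50(contigs))    # 80
print(n50(scaffolds))  # 100
```

Joining contigs leaves the total length unchanged but raises N50, which is why ordering and grouping contigs increases the reported contiguity.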


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Kelly B. Klingler ◽  
Joshua P. Jahner ◽  
Thomas L. Parchman ◽  
Chris Ray ◽  
Mary M. Peacock

Abstract Background Distributional responses by alpine taxa to repeated, glacial-interglacial cycles throughout the last two million years have significantly influenced the spatial genetic structure of populations. These effects have been exacerbated for the American pika (Ochotona princeps), a small alpine lagomorph constrained by thermal sensitivity and a limited dispersal capacity. As a species of conservation concern, long-term lack of gene flow has important consequences for landscape genetic structure and levels of diversity within populations. Here, we use reduced representation sequencing (ddRADseq) to provide a genome-wide perspective on patterns of genetic variation across pika populations representing distinct subspecies. To investigate how landscape and environmental features shape genetic variation, we collected genetic samples from distinct geographic regions as well as across finer spatial scales in two geographically proximate mountain ranges of eastern Nevada. Results Our genome-wide analyses corroborate range-wide, mitochondrial subspecific designations and reveal pronounced fine-scale population structure between the Ruby Mountains and East Humboldt Range of eastern Nevada. Populations in Nevada were characterized by low genetic diversity (π = 0.0006–0.0009; θW = 0.0005–0.0007) relative to populations in California (π = 0.0014–0.0019; θW = 0.0011–0.0017) and the Rocky Mountains (π = 0.0025–0.0027; θW = 0.0021–0.0024), indicating substantial genetic drift in these isolated populations. Tajima’s D was positive for all sites (D = 0.240–0.811), consistent with recent contraction in population sizes range-wide. Conclusions Substantial influences of geography, elevation and climate variables on genetic differentiation were also detected and may interact with the regional effects of anthropogenic climate change to force the loss of unique genetic lineages through continued population extirpations in the Great Basin and Sierra Nevada.


Author(s):  
Daniella F Lato ◽  
G Brian Golding

Abstract Increasing evidence supports the notion that different regions of a genome have unique rates of molecular change. This variation is particularly evident in bacterial genomes, where previous studies have reported that gene expression and essentiality tend to decrease, while substitution rates usually increase, with increasing distance from the origin of replication. Genomic reorganizations such as rearrangements occur frequently in bacteria and allow for the introduction and restructuring of genetic content, creating gradients of molecular traits along genomes. Here, we explore the interplay of these phenomena by mapping substitutions to the genomes of Escherichia coli, Bacillus subtilis, Streptomyces, and Sinorhizobium meliloti, quantifying how many substitutions have occurred at each position in the genome. Preceding work indicates that substitution rate significantly increases with distance from the origin. Using a larger sample size and accounting for genome rearrangements through ancestral reconstruction, our analysis demonstrates that the correlation between the number of substitutions and distance from the origin of replication is often significant but small and inconsistent in direction. Some replicons had a significantly decreasing trend (E. coli and the chromosome of S. meliloti), while others showed the opposite significant trend (B. subtilis, Streptomyces, pSymA and pSymB in S. meliloti). dN, dS and ω were examined across all genes and there was no significant correlation between those values and distance from the origin. This study highlights the impact that genomic rearrangements and location have on molecular trends in some bacteria, illustrating the importance of considering spatial trends in molecular evolutionary analysis. Assuming that molecular trends are exclusively in one direction can be problematic.


Insects ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 97
Author(s):  
Nace Kranjc ◽  
Andrea Crisanti ◽  
Tony Nolan ◽  
Federica Bernardini

The increase in molecular tools for the genetic engineering of insect pests and disease vectors, such as Anopheles mosquitoes that transmit malaria, has led to an unprecedented investigation of the genomic landscape of these organisms. The understanding of genome variability in wild mosquito populations is of primary importance for vector control strategies. This is particularly the case for gene drive systems, which look to introduce genetic traits into a population by targeting specific genomic regions. Gene drive targets with functional or structural constraints are highly desirable as they are less likely to tolerate mutations that prevent targeting by the gene drive and consequent failure of the technology. In this study we describe a bioinformatic pipeline that allows the analysis of whole genome data for the identification of highly conserved regions that can point at potential functional or structural constraints. The analysis was conducted across the genomes of 22 insect species separated by more than a hundred million years of evolution and includes the observed genomic variation within field-caught samples of Anopheles gambiae and Anopheles coluzzii, the two most dominant malaria vectors. This study offers insight into the level of conservation at a genome-wide scale as well as at per base-pair resolution. The results of this analysis are gathered in a data storage system that allows for flexible extraction and bioinformatic manipulation. Furthermore, it represents a valuable resource that could provide insight into population structure and dynamics of the species in the complex and benefit the development and implementation of genetic strategies to tackle malaria.


PLoS ONE ◽  
2011 ◽  
Vol 6 (7) ◽  
pp. e22527 ◽  
Author(s):  
Peter Norberg ◽  
Shaun Tyler ◽  
Alberto Severini ◽  
Rich Whitley ◽  
Jan-Åke Liljeqvist ◽  
...  
