Genome assembly, structural variants, and genetic differentiation between Lake Whitefish young species pairs (Coregonus sp.) with long and short reads

2022 ◽  
Author(s):  
Claire Mérot ◽  
Kristina S R Stenløkk ◽  
Clare Venney ◽  
Martin Laporte ◽  
Michel Moser ◽  
...  

The parallel evolution of nascent pairs of ecologically differentiated species offers an opportunity to get a better glimpse at the genetic architecture of speciation. Of particular interest is our recent ability to consider a wider range of genomic variants, not only single-nucleotide polymorphisms (SNPs), thanks to long-read sequencing technology. We can now identify structural variants (SVs) like insertions, deletions, and other structural rearrangements, allowing further insights into the genetic architecture of speciation and how different variants are involved in species differentiation. Here, we investigated genomic patterns of differentiation between sympatric species pairs (Dwarf and Normal) belonging to the Lake Whitefish (Coregonus clupeaformis) species complex. We assembled the first reference genomes for both Dwarf and Normal Lake Whitefish, annotated the transposable elements, and analysed the genome in the light of related coregonid species. Next, we used a combination of long-read and short-read sequencing to characterize SVs and genotype them at population-scale using genome-graph approaches, showing that SVs cover five times more of the genome than SNPs. We then integrated both SNPs and SVs to investigate the genetic architecture of species differentiation in two different lakes and highlighted an excess of shared outliers of differentiation. In particular, a large fraction of SVs differentiating the two species was driven by transposable elements (TEs), suggesting that TE accumulation during a period of allopatry predating secondary contact may have been a key process in the speciation of the Dwarf and Normal Whitefish. Altogether, our results suggest that SVs play an important role in speciation and that by combining second and third generation sequencing we now have the ability to integrate SVs into speciation genomics.

2016 ◽  
Author(s):  
Clément Rougeux ◽  
Louis Bernatchez ◽  
Pierre-Alexandre Gagnaire

Abstract: Parallel divergence patterns across replicated species pairs occurring in similar environmental contrasts may arise through distinct evolutionary scenarios. Whether such parallelism actually reflects repeated parallel divergence driven by divergent selection or a single divergence event with subsequent gene flow needs to be ascertained. Reconstructing historical gene flow is therefore of fundamental interest to understand how demography and selection jointly shaped genomic divergence during speciation. Here, we use an extended modeling framework to explore the multiple facets of speciation-with-gene-flow with demo-genetic divergence models that capture both temporal and genomic variation in effective population size and migration rate. We investigate the divergence history of five sympatric Lake Whitefish limnetic (dwarf) and benthic (normal) species pairs characterized by variable degrees of ecological divergence and reproductive isolation. Genome-wide SNPs were used to document the extent of genetic differentiation in each species pair, and 26 divergence models were fitted and compared to the unfolded joint allele frequency spectrum of each pair. We found evidence that a recent (circa 3000-4000 generations) asymmetrical secondary contact between expanding post-glacial populations has accompanied Whitefish diversification. Our results suggest that heterogeneous genomic differentiation patterns have emerged through the combined effects of linked selection generating variable rates of lineage sorting across the genome during geographical isolation, and heterogeneous introgression eroding divergence at different rates across the genome upon secondary contact. This study thus provides a new retrospective insight into the historical demographic and selective processes that shaped a continuum of divergence associated with ecological speciation.


Evolution ◽  
2013 ◽  
Vol 67 (9) ◽  
pp. 2483-2497 ◽  
Author(s):  
Pierre-Alexandre Gagnaire ◽  
Scott A. Pavey ◽  
Eric Normandeau ◽  
Louis Bernatchez

2015 ◽  
Vol 5 (7) ◽  
pp. 1481-1491 ◽  
Author(s):  
Martin Laporte ◽  
Sean M. Rogers ◽  
Anne-Marie Dion-Côté ◽  
Eric Normandeau ◽  
Pierre-Alexandre Gagnaire ◽  
...  

Author(s):  
Paul Vollrath ◽  
Harmeet S. Chawla ◽  
Sarah V. Schiessl ◽  
Iulian Gabur ◽  
HueyTyng Lee ◽  
...  

Key message: A novel structural variant was discovered in the FLOWERING LOCUS T orthologue BnaFT.A02 by long-read sequencing. Nested association mapping in an elite winter oilseed rape population revealed that this 288 bp deletion associates with early flowering, putatively by modification of binding-sites for important flowering regulation genes.

Abstract: Perfect timing of flowering is crucial for optimal pollination and high seed yield. Extensive previous studies of flowering behavior in Brassica napus (canola, rapeseed) identified mutations in key flowering regulators which differentiate winter, semi-winter and spring ecotypes. However, because these are generally fixed in locally adapted genotypes, they have only limited relevance for fine adjustment of flowering time in elite cultivar gene pools. In crosses between ecotypes, the ecotype-specific major-effect mutations mask minor-effect loci of interest for breeding. Here, we investigated flowering time in a multiparental mapping population derived from seven elite winter oilseed rape cultivars which are fixed for major-effect mutations separating winter-type rapeseed from other ecotypes. Association mapping revealed eight genomic regions on chromosomes A02, C02 and C03 associating with fine modulation of flowering time. Long-read genomic resequencing of the seven parental lines identified seven structural variants coinciding with candidate genes for flowering time within chromosome regions associated with flowering time. Segregation patterns for these variants in the elite multiparental population and a diversity set of winter types using locus-specific assays revealed significant associations with flowering time for three deletions on chromosome A02. One of these was a previously undescribed 288 bp deletion within the second intron of FLOWERING LOCUS T on chromosome A02, emphasizing the advantage of long-read sequencing for detection of structural variants in this size range. Detailed analysis revealed the impact of this specific deletion on flowering-time modulation under extreme environments and varying day lengths in elite, winter-type oilseed rape.


2019 ◽  
Author(s):  
Glenn Hickey ◽  
David Heller ◽  
Jean Monlong ◽  
Jonas A. Sibbesen ◽  
Jouni Sirén ◽  
...  

Abstract: Structural variants (SVs) remain challenging to represent and study relative to point mutations despite their demonstrated importance. We show that variation graphs, as implemented in the vg toolkit, provide an effective means for leveraging SV catalogs for short-read SV genotyping experiments. We benchmarked vg against state-of-the-art SV genotypers using three sequence-resolved SV catalogs generated by recent long-read sequencing studies. In addition, we use assemblies from 12 yeast strains to show that graphs constructed directly from aligned de novo assemblies improve genotyping compared to graphs built from intermediate SV catalogs in the VCF format.


2019 ◽  
Author(s):  
Jie Xu ◽  
Fan Song ◽  
Emily Schleicher ◽  
Christopher Pool ◽  
Darrin Bann ◽  
...  

Abstract: While genomic analysis of tumors has stimulated major advances in cancer diagnosis, prognosis and treatment, current methods fail to identify a large fraction of somatic structural variants in tumors. We have applied a combination of whole genome sequencing and optical genome mapping to a number of adult and pediatric leukemia samples, which revealed in each of these samples a large number of structural variants not recognizable by current tools of genomic analyses. We developed computational methods to determine which of those variants likely arose as somatic mutations. The method identified 97% of the structural variants previously reported by karyotype analysis of these samples and revealed an additional fivefold more such somatic rearrangements. The method identified on average tens of previously unrecognizable inversions and duplications and hundreds of previously unrecognizable insertions and deletions. These structural variants recurrently affected a number of leukemia associated genes as well as cancer driver genes not previously associated with leukemia and genes not previously associated with cancer. A number of variants only affected intergenic regions but caused cis-acting alterations in expression of neighboring genes. Analysis of TCGA data indicates that the status of several of the recurrently mutated genes identified in this study significantly affects survival of AML patients. Our results suggest that current genomic analysis methods fail to identify a majority of structural variants in leukemia samples and that this lacuna may hamper diagnostic and prognostic efforts.


2020 ◽  
Author(s):  
Wesley Delage ◽  
Julien Thevenon ◽  
Claire Lemaitre

Abstract: Since 2009, numerous tools have been developed to detect structural variants (SVs) using short read technologies. Insertions >50 bp are one of the hardest types to discover and are drastically underrepresented in gold standard variant callsets. The advent of long read technologies has completely changed the situation. In 2019, two independent cross-technology studies published the most complete variant callsets with sequence-resolved insertions in human individuals. Among the reported insertions, only 17 to 37% could be discovered with short-read based tools. In this work, we performed an in-depth analysis of these unprecedented insertion callsets in order to investigate the causes of such failures. We have first established a precise classification of insertion variants according to four layers of characterization: the nature and size of the inserted sequence, the genomic context of the insertion site and the breakpoint junction complexity. Because these levels are intertwined, we then used simulations to characterize the impact of each complexity factor on the recall of several SV callers. Simulations showed that the most impacting factor was the insertion type rather than the genomic context, with various difficulties being handled differently among the tested SV callers, and they highlighted the lack of sequence resolution for most insertion calls. Our results explain the low recall by pointing out several difficulty factors among the observed insertion features and provide avenues for improving SV caller algorithms.


2021 ◽  
Vol 288 (1942) ◽  
pp. 20202804
Author(s):  
Richard K. Simpson ◽  
David R. Wilson ◽  
Allison F. Mistakidis ◽  
Daniel J. Mennill ◽  
Stéphanie M. Doucet

Closely related species often exhibit similarities in appearance and behaviour, yet when related species exist in sympatry, signals may diverge to enhance species recognition. Prior comparative studies provided mixed support for this hypothesis, but the relationship between sympatry and signal divergence is likely nonlinear. Constraints on signal diversity may limit signal divergence, especially when large numbers of species are sympatric. We tested the effect of sympatric overlap on plumage colour and song divergence in wood-warblers (Parulidae), a speciose group with diverse visual and vocal signals. We also tested how number of sympatric species influences signal divergence. Allopatric species pairs had overall greater plumage and song divergence compared to sympatric species pairs. However, among sympatric species pairs, plumage divergence positively related to the degree of sympatric overlap in males and females, while male song bandwidth and syllable rate divergence negatively related to sympatric overlap. In addition, as the number of species in sympatry increased, average signal divergence among sympatric species decreased, which is likely due to constraints on warbler perceptual space and signal diversity. Our findings reveal that sympatry influences signal evolution in warblers, though not always as predicted, and that number of sympatric species can limit sympatry's influence on signal evolution.


2015 ◽  
Author(s):  
Ivan Sovic ◽  
Mile Sikic ◽  
Andreas Wilm ◽  
Shannon Nicole Fenlon ◽  
Swaine Chen ◽  
...  

Exploiting the power of nanopore sequencing requires the development of new bioinformatics approaches to deal with its specific error characteristics. We present the first nanopore read mapper (GraphMap) that uses a read-funneling paradigm to robustly handle variable error rates and fast graph traversal to align long reads with speed and very high precision (>95%). Evaluation on MinION sequencing datasets against short and long-read mappers indicates that GraphMap increases mapping sensitivity by at least 15-80%. GraphMap alignments are the first to demonstrate consensus calling with <1 error in 100,000 bases, variant calling on the human genome with 76% improvement in sensitivity over the next best mapper (BWA-MEM), precise detection of structural variants from 100bp to 4kbp in length and species and strain-specific identification of pathogens using MinION reads. GraphMap is available open source under the MIT license at https://github.com/isovic/graphmap.

