Facilitating population genomics of non-model organisms through optimized experimental design for reduced representation sequencing

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Henrik Christiansen ◽  
Franz M. Heindler ◽  
Bart Hellemans ◽  
Quentin Jossart ◽  
Francesca Pasotti ◽  
...  

Abstract

Background: Genome-wide data are invaluable to characterize differentiation and adaptation of natural populations. Reduced representation sequencing (RRS) subsamples a genome repeatedly across many individuals. However, RRS requires careful optimization and fine-tuning to deliver high marker density while being cost-efficient. The number of genomic fragments created through restriction enzyme digestion and the sequencing library setup must match to achieve sufficient sequencing coverage per locus. Here, we present a workflow based on published information and computational and experimental procedures to investigate and streamline the applicability of RRS.

Results: In an iterative process genome size estimates, restriction enzymes and size selection windows were tested and scaled in six classes of Antarctic animals (Ostracoda, Malacostraca, Bivalvia, Asteroidea, Actinopterygii, Aves). Achieving high marker density would be expensive in amphipods, the malacostracan target taxon, due to the large genome size. We propose alternative approaches such as mitogenome or target capture sequencing for this group. Pilot libraries were sequenced for all other target taxa. Ostracods, bivalves, sea stars, and fish showed overall good coverage and marker numbers for downstream population genomic analyses. In contrast, the bird test library produced low coverage and few polymorphic loci, likely due to degraded DNA.

Conclusions: Prior testing and optimization are important to identify which groups are amenable for RRS and where alternative methods may currently offer better cost-benefit ratios. The steps outlined here are easy to follow for other non-model taxa with little genomic resources, thus stimulating efficient resource use for the many pressing research questions in molecular ecology.
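The balance between fragment number and sequencing effort described in the abstract can be illustrated with a rough calculation. The sketch below is not the authors' workflow: it assumes a single-enzyme digest, ignores the exact cut offset within the recognition site, and treats every size-selected fragment as one locus that receives an equal share of reads. The function names and the toy sequence are hypothetical.

```python
def count_fragments_in_window(sequence, site, min_len, max_len):
    """Count in silico digest fragments whose length lies in the size selection window.

    Simplification: a cut is placed at the start of every occurrence of the
    recognition site; only fragments flanked by two sites are counted.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start)
        start = sequence.find(site, start + 1)
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    return sum(min_len <= n <= max_len for n in lengths)


def mean_coverage_per_locus(total_reads, n_individuals, n_fragments):
    """Expected reads per locus per individual, assuming perfectly even sequencing."""
    return total_reads / (n_individuals * n_fragments)


# Toy sequence with three EcoRI sites (GAATTC), giving fragments of 10 and 8 bp.
toy = "AAGAATTCAAAAGAATTCAAGAATTC"
n_selected = count_fragments_in_window(toy, "GAATTC", min_len=9, max_len=12)

# Hypothetical library: 1 million reads, 10 individuals, 5000 selected fragments.
coverage = mean_coverage_per_locus(1_000_000, 10, 5000)
```

In practice coverage is uneven across loci and individuals, so a calculation like this gives an optimistic upper bound rather than a prediction.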

2015 ◽  
Author(s):  
Thomas F Cooke ◽  
Muh-Ching Yee ◽  
Marina Muzzio ◽  
Alexandra Sockell ◽  
Ryan Bell ◽  
...  

Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
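For context, the kind of Hardy-Weinberg equilibrium filter that the abstract uses as a baseline can be sketched as follows. This is a generic chi-square goodness-of-fit test on genotype counts at one biallelic site, not the GBStools method; the function name is hypothetical, and the p-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)).

```python
import math


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of a chi-square test for Hardy-Weinberg proportions (1 degree of freedom)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype calls at this site")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic sites) to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2.0))


p_balanced = hwe_chi_square_p(25, 50, 25)   # genotype counts match expectations exactly
p_no_hets = hwe_chi_square_p(50, 0, 50)     # complete heterozygote deficit
```

A site would typically be discarded when its p-value falls below a chosen threshold. The chi-square approximation is unreliable for small samples or rare alleles, where an exact test is preferable.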


2019 ◽  
Vol 35 (17) ◽  
pp. 3160-3162
Author(s):  
Davoud Torkamaneh ◽  
Jérôme Laroche ◽  
Istvan Rajcan ◽  
François Belzile

Abstract

Motivation: Reduced-representation sequencing is a genome-wide scanning method for simultaneous discovery and genotyping of thousands to millions of single nucleotide polymorphisms that is used across a wide range of species. However, in this method a reproducible but very small fraction of the genome is captured for sequencing, while the resulting reads are typically aligned against the entire reference genome.

Results: Here we present a skinny reference genome approach in which a simplified reference genome is used to decrease computing time for data processing and to increase single nucleotide polymorphism counts and accuracy. A skinny reference genome can be integrated into any reduced-representation sequencing analytical pipeline.

Availability and implementation: https://bitbucket.org/jerlar73/SRG-Extractor.

Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Elverson Soares de Melo ◽  
Gabriel da Luz Wallau

Abstract Transposable elements (TEs) are a set of mobile elements within a genome. Due to their complexity, an in-depth TE characterization is only available for a handful of model organisms. In the present study, we performed a de novo and homology-based characterization of TEs in the genomes of 24 mosquito species and investigated their mode of inheritance. More than 40% of the genome of Aedes aegypti, Aedes albopictus, and Culex quinquefasciatus is composed of TEs, varying substantially among Anopheles species (0.13%–19.55%). Class I TEs are the most abundant among mosquitoes and at least 24 TE superfamilies were found. Interestingly, TEs have been continuously exchanged by horizontal transfer (212 TE families of 18 different superfamilies) among mosquitoes over the last 30 million years, representing around 6% of the genome in Aedes genomes and a small fraction in Anopheles genomes. Most of these horizontally transferred TEs are from the three ubiquitous LTR superfamilies: Gypsy, Bel-Pao and Copia. Searching more than 32,000 genomes, we also uncover transfers between mosquitoes and two different phyla (Cnidaria and Nematoda) and two subphyla (Chelicerata and Crustacea), identifying a vector, the worm Wuchereria bancrofti, that enabled the horizontal spread of a Tc1-mariner element of the irritans subfamily among various Anopheles species. These data also allowed us to reconstruct the horizontal transfer network of this TE involving more than 40 species. In summary, our results suggest that TEs are constantly exchanged by common phenomena of horizontal transfers among mosquitoes, influencing genome variation and contributing to genome size expansion.

Author Summary Most eukaryotes have DNA fragments inside their genome that can multiply by inserting themselves in other regions of the genome, generating variability. These fragments are called Transposable Elements (TEs). Since they are a constituent part of eukaryote genomes, these pieces of DNA are usually inherited vertically by the offspring. To avoid damage to the genome caused by the replication and insertion of TEs, organisms usually control them, leading to their inactivation. However, TEs sometimes get out of control and invade other species through a horizontal transfer mechanism. This dynamic is not known in mosquitoes, a group of organisms that act as vectors of many human diseases. We collected mosquito genomes available in public databases and characterized the whole content of TEs. Using a statistically supported method, we investigate TE relations among mosquitoes and discover that horizontal transfers of transposons are common and occurred in the last 30 million years among these species. Although not as common as transfers among closely related species, transposon transfer to distant species also occurs. We also identify a parasite, a filarial worm, that may have facilitated the transfer of TEs to many mosquitoes. Together, horizontally transferred TEs contribute to increasing mosquito genome size and variation.


PLoS ONE ◽  
2013 ◽  
Vol 8 (6) ◽  
pp. e65066 ◽  
Author(s):  
Aylwyn Scally ◽  
Bryndis Yngvadottir ◽  
Yali Xue ◽  
Qasim Ayub ◽  
Richard Durbin ◽  
...  

2019 ◽  
Author(s):  
Luisa Bresadola ◽  
Vivian Link ◽  
C. Alex Buerkle ◽  
Christian Lexer ◽  
Daniel Wegmann

Abstract In non-model organisms, evolutionary questions are frequently addressed using reduced representation sequencing techniques due to their low cost, ease of use, and because they do not require genomic resources such as a reference genome. However, evidence is accumulating that such techniques may be affected by specific biases, questioning the accuracy of obtained genotypes, and as a consequence, their usefulness in evolutionary studies. Here we introduce three strategies to estimate genotyping error rates from such data: through the comparison to high quality genotypes obtained with a different technique, from individual replicates, or from a population sample when assuming Hardy-Weinberg equilibrium. Applying these strategies to data obtained with Restriction site Associated DNA sequencing (RAD-seq), arguably the most popular reduced representation sequencing technique, revealed per-allele genotyping error rates that were much higher than sequencing error rates, particularly at heterozygous sites that were wrongly inferred as homozygous. As we exemplify through the inference of genome-wide and local ancestry of well characterized hybrids of two Eurasian poplar (Populus) species, such high error rates may lead to wrong biological conclusions. By properly accounting for these error rates in downstream analyses, either by incorporating genotyping errors directly or by recalibrating genotype likelihoods, we were nevertheless able to use the RAD-seq data to support biologically meaningful and robust inferences of ancestry among Populus hybrids. Based on these findings, we strongly recommend carefully assessing genotyping error rates in reduced representation sequencing experiments and properly accounting for them in downstream analyses, for instance using the tools presented here.
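The replicate-based strategy mentioned in the abstract can be approximated, very crudely, by counting how often two replicate libraries of the same individual disagree. The sketch below is only that naive discordance rate, not the estimator the authors introduce. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with None for missing calls; the function name and data are hypothetical.

```python
def replicate_discordance(calls_a, calls_b):
    """Fraction of sites with differing genotype calls between two replicates.

    Sites where either replicate has a missing call (None) are skipped.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("replicates must cover the same sites")
    compared = 0
    mismatched = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue
        compared += 1
        if a != b:
            mismatched += 1
    if compared == 0:
        raise ValueError("no sites called in both replicates")
    return mismatched / compared


# Hypothetical calls at six sites for two libraries of one individual.
rep1 = [0, 1, 2, 1, None, 0]
rep2 = [0, 0, 2, 1, 1, 0]
rate = replicate_discordance(rep1, rep2)
```

Because either replicate can carry the error, the discordance rate is roughly twice the per-call error rate when errors are rare and independent. It also says nothing about which genotype class is most affected, which is the kind of detail a model-based estimate is meant to recover.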


Genetics ◽  
2002 ◽  
Vol 162 (4) ◽  
pp. 1863-1873 ◽  
Author(s):  
J Slate ◽  
P M Visscher ◽  
S MacGregor ◽  
D Stevens ◽  
M L Tate ◽  
...  

Abstract Recent empirical evidence indicates that although fitness and fitness components tend to have low heritability in natural populations, they may nonetheless have relatively large components of additive genetic variance. The molecular basis of additive genetic variation has been investigated in model organisms but never in the wild. In this article we describe an attempt to map quantitative trait loci (QTL) for birth weight (a trait positively associated with overall fitness) in an unmanipulated, wild population of red deer (Cervus elaphus). Two approaches were used: interval mapping by linear regression within half-sib families and a variance components analysis of a six-generation pedigree of >350 animals. Evidence for segregating QTL was found on three linkage groups, one of which was significant at the genome-wide suggestive linkage threshold. To our knowledge this is the first time that a QTL for any trait has been mapped in a wild mammal population. It is hoped that this study will stimulate further investigations of the genetic architecture of fitness traits in the wild.


2008 ◽  
Vol 83 (4) ◽  
pp. 2025-2028 ◽  
Author(s):  
Adam C. Smith ◽  
Kathy L. Poulin ◽  
Robin J. Parks

ABSTRACT Replication-defective adenovirus (Ad) vectors can vary considerably in genome length, but whether this affects virion stability has not been investigated. Helper-dependent Ad vectors with a genome size of ∼30 kb were 100-fold more sensitive to heat inactivation than their parental helper virus (>36 kb), and increasing the genome size of the vector significantly improved heat stability. A similar relationship between genome size and stability existed for Ad with early region 1 deleted. Loss of infectivity was due to release of vertex proteins, followed by disintegration of the capsid. Thus, not only does the viral DNA encode all of the heritable information essential for virus replication, it also plays a critical role in maintaining capsid strength and integrity.


Author(s):  
Brandon T. Sinn ◽  
Sandra J. Simon ◽  
Mathilda V. Santee ◽  
Stephen P. DiFazio ◽  
Nicole M. Fama ◽  
...  

2018 ◽  
Vol 49 (6) ◽  
pp. 579-591 ◽  
Author(s):  
Zhe Zhang ◽  
Qianqian Zhang ◽  
Qian Xiao ◽  
Hao Sun ◽  
Hongding Gao ◽  
...  
