dDocent: a RADseq, variant-calling pipeline designed for population genomics of non-model organisms

Author(s):  
Jonathan Puritz ◽  
Christopher M. Hollenbeck ◽  
John R. Gold

Restriction-site associated DNA sequencing (RADseq) has become a powerful and useful approach for population genomics. Currently, no software exists that utilizes both paired-end reads from RADseq data to efficiently produce population-informative variant calls, especially for organisms with large effective population sizes and high levels of genetic polymorphism but for which no genomic resources exist. dDocent is an analysis pipeline with a user-friendly, command-line interface designed to process individually barcoded RADseq data (with double cut sites) into informative SNPs/Indels for population-level analyses. The pipeline, written in BASH, uses data reduction techniques and other stand-alone software packages to perform quality trimming and adapter removal, de novo assembly of RAD loci, read mapping, SNP and Indel calling, and baseline data filtering. Double-digest RAD data from population pairings of three different marine fishes were used to compare dDocent with Stacks, the first generally available, widely used pipeline for analysis of RADseq data. dDocent consistently identified more SNPs shared across greater numbers of individuals and with higher levels of coverage. This is most likely due to the fact that dDocent quality trims instead of filtering and incorporates both forward and reverse reads in assembly, mapping, and SNP calling, thus enabling use of reads with Indel polymorphisms. The pipeline and a comprehensive user guide can be found at (http://dDocent.wordpress.com).


2019 ◽  
Author(s):  
C. Schmidt ◽  
M. Domaratzki ◽  
R.P. Kinnunen ◽  
J. Bowman ◽  
C.J. Garroway

Urbanization and associated environmental changes are causing global declines in vertebrate populations. In general, population declines of the magnitudes now detected should lead to reduced effective population sizes for animals living in proximity to humans and disturbed lands. This is cause for concern because effective population sizes set the rate of genetic diversity loss due to genetic drift, the rate of increase in inbreeding, and the efficiency with which selection can act on beneficial alleles. We predicted that the effects of urbanization should decrease effective population size and genetic diversity, and increase population-level genetic differentiation. To test for such patterns, we repurposed and reanalyzed publicly archived genetic data sets for North American birds and mammals. After filtering, we had usable raw genotype data from 85 studies and 41,023 individuals, sampled from 1,008 locations spanning 41 mammal and 25 bird species. We used census-based urban-rural designations, human population density, and the Human Footprint Index as measures of urbanization and habitat disturbance. As predicted, mammals sampled in more disturbed environments had lower effective population sizes and genetic diversity, and were more genetically differentiated from those in more natural environments. There were no consistent relationships detectable for birds. This suggests that, in general, mammal populations living near humans may have less capacity to respond adaptively to further environmental changes, and be more likely to suffer from effects of inbreeding.


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0259124
Author(s):  
Damian C. Lettoof ◽  
Vicki A. Thomson ◽  
Jari Cornelis ◽  
Philip W. Bateman ◽  
Fabien Aubret ◽  
...  

Urbanisation alters landscapes, introduces wildlife to novel stressors, and fragments habitats into remnant ‘islands’. Within these islands, isolated wildlife populations can experience genetic drift and subsequently suffer from inbreeding depression and reduced adaptive potential. The Western tiger snake (Notechis scutatus occidentalis) is a predator of wetlands in the Swan Coastal Plain, a unique bioregion that has suffered substantial degradation through the development of the city of Perth, Western Australia. Within the urban matrix, tiger snakes now only persist in a handful of wetlands where they are known to bioaccumulate a suite of contaminants, and have recently been suggested as a relevant bioindicator of ecosystem health. Here, we used genome-wide single nucleotide polymorphism (SNP) data to explore the contemporary population genomics of seven tiger snake populations across the urban matrix. Specifically, we used population genomic structure and diversity, effective population sizes (Ne), and heterozygosity-fitness correlations to assess fitness of each population with respect to urbanisation. We found that population genomic structure was strongest across the northern and southern sides of a major river system, with the northern cluster of populations exhibiting lower heterozygosities than the southern cluster, likely due to a lack of historical gene flow. We also observed an increasing signal of inbreeding and genetic drift with increasing geographic isolation due to urbanisation. Effective population sizes (Ne) at most sites were small (< 100), with Ne appearing to reflect the area of available habitat rather than the degree of adjacent urbanisation. This suggests that ecosystem management and restoration may be the best method to buffer the further loss of genetic diversity in urban wetlands. If tiger snake populations continue to decline in urban areas, our results provide a baseline measure of genomic diversity, as well as highlighting which ‘islands’ of habitat are most in need of management and protection.


2017 ◽  
Author(s):  
Lucas A. Freitas ◽  
Beatriz Mello ◽  
Carlos G. Schrago

With the increase in the availability of genomic data, sequences from different loci are usually concatenated in a supermatrix for phylogenetic inference. However, as an alternative to the supermatrix approach, several implementations of the multispecies coalescent (MSC) have been increasingly used in phylogenomic analyses due to their advantages in accommodating gene tree topological heterogeneity by taking population-level processes into account. Moreover, the development of faster algorithms under the MSC is enabling the analysis of thousands of loci/taxa. Here, we explored the MSC approach for a phylogenomic dataset of Insecta. Even with the challenges posed by insects, due to large effective population sizes coupled with short deep internal branches, our MSC analysis could recover several orders and evolutionary relationships in agreement with current insect systematics. However, some phylogenetic relationships were not recovered by MSC methods. Most noticeably, a remipede crustacean was positioned within the Insecta. Additionally, the interordinal relationships within Polyneoptera and Neuropteroidea contradicted recent works, by suggesting the non-monophyly of Neuroptera. We note, however, that these phylogenetic arrangements were also poorly supported by previous analyses and that they were sensitive to gene sampling.


2021 ◽  
Author(s):  
Michael A. Martin ◽  
Katia Koelle

An early analysis of SARS-CoV-2 deep-sequencing data that combined epidemiological and genetic data to characterize the transmission dynamics of the virus in and beyond Austria concluded that the size of the virus’s transmission bottleneck was large – on the order of 1000 virions. We performed new computational analyses using these deep-sequenced samples from Austria. Our analyses included characterization of transmission bottleneck sizes across a range of variant calling thresholds and examination of patterns of shared low-frequency variants between transmission pairs in cases where de novo genetic variation was present in the recipient. From these analyses, among others, we found that SARS-CoV-2 transmission bottlenecks are instead likely to be very tight, on the order of 1-3 virions. These findings have important consequences for understanding how SARS-CoV-2 evolves between hosts and the processes shaping genetic variation observed at the population level.


2018 ◽  
Author(s):  
Emily S. Melzer ◽  
Caralyn E. Sein ◽  
James J. Chambers ◽  
M. Sloan Siegrist

In many model organisms, diffuse patterning of cell wall peptidoglycan synthesis by the actin homolog MreB enables the bacteria to maintain their characteristic rod shape. In Caulobacter crescentus and Escherichia coli, MreB is also required to sculpt this morphology de novo. Mycobacteria are rod-shaped but expand their cell wall from discrete polar or sub-polar zones. In this genus, the tropomyosin-like protein DivIVA is required for the maintenance of cell morphology. DivIVA has also been proposed to direct peptidoglycan synthesis to the tips of the mycobacterial cell. The precise nature of this regulation is unclear, as is its role in creating rod shape from scratch. We find that DivIVA localizes nascent cell wall and covalently associated mycomembrane but is dispensable for the assembly process itself. Mycobacterium smegmatis rendered spherical by peptidoglycan digestion or by DivIVA depletion are able to regain rod shape at the population level in the presence of DivIVA. At the single cell level, there is a close spatiotemporal correlation between DivIVA foci, rod extrusion and concentrated cell wall synthesis. Thus, although the precise mechanistic details differ from other organisms, M. smegmatis also establish and propagate rod shape by cytoskeleton-controlled patterning of peptidoglycan. Our data further support the emerging notion that morphology is a hardwired trait of bacterial cells.


2018 ◽  
Author(s):  
Alba Rey-Iglesia ◽  
Shyam Gopalakrishan ◽  
Christian Carøe ◽  
David E. Alquezar-Planas ◽  
Anne Ahlmann Nielsen ◽  
...  

In recent years, the availability of reduced representation library (RRL) methods has catalysed an expansion of genome-scale studies to characterize both model and non-model organisms. Most of these methods rely on the use of restriction enzymes to obtain DNA sequences at a genome-wide level. These approaches have been widely used to sequence thousands of markers across individuals for many organisms at a reasonable cost, revolutionizing the field of population genomics. However, there are still some limitations associated with these methods, in particular, the high molecular weight DNA required as starting material, the reduced number of common loci among investigated samples, and the short length of the sequenced site-associated DNA. Here, we present MobiSeq, an RRL protocol exploiting simple laboratory techniques, that generates genomic data based on PCR targeted-enrichment of transposable elements and the sequencing of the associated flanking region. We validate its performance across 103 DNA extracts derived from three mammalian species: grey wolf (Canis lupus), red deer complex (Cervus sp.), and brown rat (Rattus norvegicus). MobiSeq enables the sequencing of hundreds of thousands of loci across the genome, and performs SNP discovery with relatively low rates of clonality. Given the ease and flexibility of the MobiSeq protocol, the method has the potential to be implemented for marker discovery and population genomics across a wide range of organisms, enabling the exploration of diverse evolutionary and conservation questions.


PeerJ ◽  
2014 ◽  
Vol 2 ◽  
pp. e431 ◽  
Author(s):  
Jonathan B. Puritz ◽  
Christopher M. Hollenbeck ◽  
John R. Gold
