SambaR: An R package for fast, easy and reproducible population‐genetic analyses of biallelic SNP data sets

2020 ◽  
Author(s):  
Menno J. de Jong ◽  
Joost F. de Jong ◽  
A. Rus Hoelzel ◽  
Axel Janke

Abstract

Background: SNP datasets can be used to infer a wealth of information about natural populations, including their structure, genetic diversity, and the presence of loci under selection. However, SNP data analysis can be a time-consuming and challenging process, not least because many different software packages are currently needed to execute and depict the wide variety of mainstream population-genetic analyses. Here we present SambaR, an integrative and user-friendly R package which automates and simplifies quality control and population-genetic analyses of biallelic SNP datasets. SambaR allows users to perform mainstream population-genetic analyses and to generate a wide variety of ready-to-publish graphs with a minimal number of commands (fewer than ten). These wrapper commands call functions of existing packages (including adegenet, ape, LEA, poppr, pcadapt and StAMPP) as well as new tools uniquely implemented in SambaR.

Results: We tested SambaR on publicly available SNP datasets and found that it can process datasets of millions of SNPs and hundreds of individuals within hours, given sufficient computing power. Newly developed tools implemented in SambaR facilitate optimization of filter settings and objective interpretation of ordination analyses, enhance comparability of diversity estimates from reduced representation library SNP datasets, and generate reduced SNP panels and structure-like plots with Bayesian population assignment probabilities.

Conclusion: SambaR facilitates rapid population-genetic analyses of biallelic SNP datasets by removing three major time sinks: file handling, software learning, and data plotting. In addition, SambaR provides a convenient platform for SNP data storage and management, as well as several new utilities, including guidance in setting appropriate data filters.

Availability and implementation: The SambaR source script, manual and example datasets are distributed through GitHub: https://github.com/mennodejong1986/SambaR
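
To make the idea of SNP data filtering concrete, the following is a minimal, generic sketch of per-SNP quality control on a biallelic genotype matrix. It is not SambaR's implementation (SambaR is an R package, and its own filtering routines are described in its manual). The function names, the genotype encoding (0, 1 or 2 copies of the alternate allele, `None` for a missing call) and the thresholds are all hypothetical choices made for illustration.

```python
from typing import List, Optional

# Hypothetical encoding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def snp_missingness(column: List[Genotype]) -> float:
    """Fraction of individuals with a missing call at one SNP."""
    return sum(g is None for g in column) / len(column)


def minor_allele_frequency(column: List[Genotype]) -> float:
    """Minor allele frequency at one biallelic SNP, ignoring missing calls."""
    called = [g for g in column if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # each diploid individual carries 2 alleles
    return min(alt_freq, 1.0 - alt_freq)


def filter_snps(
    genotypes: List[List[Genotype]],
    max_missing: float = 0.1,  # illustrative threshold, not a SambaR default
    min_maf: float = 0.05,     # illustrative threshold, not a SambaR default
) -> List[int]:
    """Return the column indices of SNPs that pass both thresholds."""
    n_snps = len(genotypes[0])
    kept = []
    for j in range(n_snps):
        column = [row[j] for row in genotypes]
        if snp_missingness(column) <= max_missing and minor_allele_frequency(column) >= min_maf:
            kept.append(j)
    return kept


# Toy data: 4 individuals (rows) x 4 SNPs (columns).
toy = [
    [0, 1, None, 0],
    [1, 1, None, 0],
    [2, 0, 1, 0],
    [1, 2, None, 0],
]
kept = filter_snps(toy, max_missing=0.25, min_maf=0.05)
print(kept)  # SNP 2 fails on missingness, SNP 3 is monomorphic
```

In this toy example the third SNP is dropped because three of four calls are missing, and the fourth is dropped because it is monomorphic. Appropriate thresholds depend on the dataset, which is the kind of decision the abstract says SambaR provides guidance on.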


2006 ◽  
Vol 88 (1) ◽  
pp. 13-26 ◽  
Author(s):  
DORIS BACHTROG

The Drosophila nasuta subgroup of the immigrans species group is widely distributed throughout the South-East Asian region, consisting of morphologically similar species with varying degrees of reproductive isolation. Here, I report nucleotide variability data for five X-linked and two mtDNA loci in eight taxa from the nasuta subgroup, with deeper sampling from D. albomicans and its sister species D. nasuta. Phylogenetic relationships among these species vary among different genomic regions, and levels of genetic differentiation suggest that this species group diversified only about one million years ago. D. albomicans and D. nasuta share nucleotide polymorphisms and are distinguished by relatively few fixed differences. Patterns of genetic differentiation between this species pair are compatible with a simple isolation model with no gene flow. Nucleotide variability levels of species in the nasuta group are comparable to those in members of the melanogaster and pseudoobscura species groups, indicating effective population sizes on the order of several million. Population genetic analyses reveal that summaries of the frequency distribution of neutral polymorphisms in both D. albomicans and D. nasuta generally fit the assumptions of the standard neutral model. D. albomicans is of particular interest for evolutionary studies because of its recently formed neo-sex chromosomes, and our phylogenetic and population genetic analyses suggest that it might be an ideal model to study the very early stages of Y chromosome evolution.

