ISSRseq: an extensible method for reduced representation sequencing

Author(s):  
Brandon T. Sinn ◽  
Sandra J. Simon ◽  
Mathilda V. Santee ◽  
Stephen P. DiFazio ◽  
Nicole M. Fama ◽  
...  
2018 ◽  
Vol 49 (6) ◽  
pp. 579-591 ◽  
Author(s):  
Zhe Zhang ◽  
Qianqian Zhang ◽  
Qian Xiao ◽  
Hao Sun ◽  
Hongding Gao ◽  
...  

2015 ◽  
Author(s):  
Thomas F Cooke ◽  
Muh-Ching Yee ◽  
Marina Muzzio ◽  
Alexandra Sockell ◽  
Ryan Bell ◽  
...  

Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
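For context, the Hardy-Weinberg equilibrium filter used as a baseline above can be sketched as a one-degree-of-freedom chi-square test on genotype counts. This is a minimal illustration of that generic filter, not of the GBStools model; the function name `hwe_chi_square` and its arguments are hypothetical. Allelic dropout from restriction site polymorphisms makes heterozygotes look like homozygotes, which appears here as a heterozygote deficit.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic site.

    Hypothetical helper for illustration; returns (chi2, p_value) with 1 d.f.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# A site with no heterozygotes at all deviates strongly from HWE.
chi2, p_value = hwe_chi_square(50, 0, 50)
print(chi2, p_value)
```

A filter of this kind discards sites whose p-value falls below a chosen threshold; the threshold itself is a study-specific choice.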


2019 ◽  
Vol 36 (1) ◽  
pp. 26-32
Author(s):  
Davoud Torkamaneh ◽  
Jérôme Laroche ◽  
Brian Boyle ◽  
François Belzile

Abstract Motivation: Identification of DNA sequence variations such as single nucleotide polymorphisms (SNPs) is a fundamental step toward genetic studies. Reduced-representation sequencing methods have been developed as alternatives to whole genome sequencing to reduce costs and enable the analysis of many more individuals. Amongst these methods, restriction site associated sequencing (RSAS) methodologies have been widely used for rapid and cost-effective discovery of SNPs and for high-throughput genotyping in a wide range of species. Despite the extensive improvements of the RSAS methods in the last decade, the estimation of the number of reads (i.e. read depth) required per sample for efficient and effective genotyping remains mostly based on trial and error. Results: Herein we describe a bioinformatics tool, DepthFinder, designed to estimate the required read counts for RSAS methods. To illustrate its performance, we estimated required read counts in six different species (human, cattle, spruce budworm, salmon, barley and soybean) that cover a range of different biological (genome size, level of genome complexity, level of DNA methylation and ploidy) and technical (library preparation protocol and sequencing platform) factors. To assess the prediction accuracy of DepthFinder, we compared DepthFinder-derived results with independent datasets obtained from an RSAS experiment. This analysis yielded estimated accuracies of nearly 94%. Moreover, we present DepthFinder as a powerful tool to predict the most effective size selection interval in RSAS work. We conclude that DepthFinder constitutes an efficient, reliable and useful tool for a broad array of users in different research communities. Availability and implementation: https://bitbucket.org/jerlar73/DepthFinder Supplementary information: Supplementary data are available at Bioinformatics online.
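The general idea of estimating read requirements from a restriction digest can be sketched as follows. This is a simplified illustration under stated assumptions, not the DepthFinder algorithm: the function names are hypothetical, a single enzyme is assumed, only fragments flanked by two cut sites are counted, and each selected fragment is assumed to need `target_depth` reads. Real estimates would also need to account for factors the abstract lists, such as methylation, ploidy and library protocol.

```python
import re


def internal_fragment_lengths(sequence, site, cut_offset):
    """Lengths of fragments flanked by two cut sites after an in silico digest.

    `site` is the recognition sequence and `cut_offset` the cut position
    within it (hypothetical parameters; EcoRI would be "GAATTC" with offset 1).
    """
    seq = sequence.upper()
    # A lookahead pattern finds every occurrence, including overlapping ones.
    pattern = "(?=" + re.escape(site.upper()) + ")"
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, seq)]
    return [b - a for a, b in zip(cuts, cuts[1:])]


def required_reads(fragment_lengths, min_size, max_size, target_depth):
    """Reads per sample, assuming equal coverage of all size-selected fragments."""
    selected = [n for n in fragment_lengths if min_size <= n <= max_size]
    return len(selected) * target_depth


# Toy sequence with three EcoRI sites.
toy = "AA" + "GAATTC" + "A" * 10 + "GAATTC" + "A" * 4 + "GAATTC" + "AA"
lengths = internal_fragment_lengths(toy, "GAATTC", 1)
print(lengths, required_reads(lengths, 12, 20, 10))
```

Varying `min_size` and `max_size` in such a sketch shows how the size selection interval changes the number of fragments, and therefore the reads needed per sample.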


2019 ◽  
Vol 35 (17) ◽  
pp. 3160-3162
Author(s):  
Davoud Torkamaneh ◽  
Jérôme Laroche ◽  
Istvan Rajcan ◽  
François Belzile

Abstract Motivation: Reduced-representation sequencing is a genome-wide scanning method for simultaneous discovery and genotyping of thousands to millions of single nucleotide polymorphisms that is used across a wide range of species. However, in this method a reproducible but very small fraction of the genome is captured for sequencing, while the resulting reads are typically aligned against the entire reference genome. Results: Here we present a skinny reference genome approach in which a simplified reference genome is used to decrease computing time for data processing and to increase single nucleotide polymorphism counts and accuracy. A skinny reference genome can be integrated into any reduced-representation sequencing analytical pipeline. Availability and implementation: https://bitbucket.org/jerlar73/SRG-Extractor. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Cheng Jin ◽  
Huixia Kao ◽  
Shubin Dong

Abstract Background: Studying the population genetic structure and gene flow of plant populations, and the factors that influence them, is crucial in conservation biology, especially for rare and endangered plants. Tetraena mongolica Maxim (TM), a member of the family Zygophyllaceae, is a rare and endangered plant with a narrow distribution. Excessive logging, urban expansion, industrial development and scenic-area development in recent decades have caused habitat fragmentation and decline. Results: In this study, the genetic diversity, population genetic structure and gene flow of TM populations were evaluated using reduced representation sequencing, which generated more than 133.45 GB of high-quality clean reads and 38,097 high-quality SNPs. Based on multiple analytical methods, we found that existing TM populations have moderate levels of genetic diversity, very low genetic differentiation and high levels of gene flow between populations. Population structure and principal coordinates analyses showed that the 8 TM populations can be divided into two groups. A Mantel test detected no significant correlation between geographical distance and genetic distance across the whole sample. The migration model indicated that historical gene flow followed a predominantly north-to-south pattern. Conclusions: Our study demonstrates that the present genetic structure is mainly due to habitat fragmentation caused by urban sprawl, industrial development and coal mining. For conservation management, we recommend that all 8 populations be protected as a single population, rather than only those in the core area of the TM nature reserve; populations near the edge of the TM distribution, in cities and industrial areas, deserve special protection.
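The Mantel test mentioned in this abstract compares two distance matrices by permuting the labels of one of them. The sketch below is a generic one-sided permutation version written for illustration; it is not the software used in the study, and the function names, the default of 999 permutations and the toy distances are all assumptions.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def upper_triangle(matrix, order):
    """Upper-triangle entries of `matrix` after relabelling rows/columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test; returns (observed correlation, permutation p-value)."""
    identity = list(range(len(dist_a)))
    b_values = upper_triangle(dist_b, identity)
    observed = pearson(upper_triangle(dist_a, identity), b_values)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of the first matrix
        if pearson(upper_triangle(dist_a, order), b_values) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example: genetic distance exactly proportional to geographic distance.
coords = [0, 1, 3, 6, 10]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[2 * d for d in row] for row in geo]
print(mantel_test(geo, gen))
```

A non-significant result, as reported above, corresponds to a p-value above the chosen threshold, meaning the observed correlation is not unusual among the permuted matrices.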


2012 ◽  
Vol 21 (S1) ◽  
pp. 119-127 ◽  
Author(s):  
Günter Kahl ◽  
Carlos Molina ◽  
Björn Rotter ◽  
Ruth Jüngling ◽  
Anja Frank ◽  
...  
