snpGBS: A Simple and Flexible Bioinformatics Workflow to Identify SNPs from Genotyping-by-Sequencing Data

Author(s):  
Jie Kang ◽  
Ken Dodds ◽  
Stephen Byrne ◽  
Marty Faville ◽  
Michael Black ◽  
...  
Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 1042

Author(s):  
Zhuoying Weng ◽  
Yang Yang ◽  
Xi Wang ◽  
Lina Wu ◽  
Sijie Hua ◽  
...  

Pedigree information is necessary for maintaining diversity in wild and captive populations. An accurate pedigree is determined by molecular marker-based parentage analysis, which may be influenced by the polymorphism and number of markers, integrity of samples, relatedness of parents, or choice of analysis program. Here, we describe the first development of 208 single nucleotide polymorphisms (SNPs) and 11 microsatellites for giant grouper (Epinephelus lanceolatus) using genotyping-by-sequencing (GBS), and compare the power of SNPs and microsatellites for parentage and relatedness analysis, based on a mixed family composed of 4 candidate females, 4 candidate males and 289 offspring. CERVUS, PAPA and COLONY were used for mutual verification. We found that SNPs had better potential for relatedness estimation, exclusion of non-parentage and individual identification than microsatellites, and that > 98% accuracy of parentage assignment could be achieved with 100 polymorphic SNPs (MAF cut-off < 0.4) or 10 polymorphic microsatellites (mean Ho = 0.821, mean PIC = 0.651). This study provides a reference for developing molecular markers for parentage analysis using next-generation sequencing, and contributes to molecular breeding, fishery management and population conservation.
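The abstract above reports marker quality as minor allele frequency (MAF), observed heterozygosity (Ho) and polymorphic information content (PIC). The abstract does not say how these were computed, so the following is only a minimal sketch under the standard textbook definitions, with PIC taken as 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j². The function names and the tuple-based genotype encoding are illustrative assumptions, not part of the study.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(genotypes):
    """Allele frequencies at one locus.

    genotypes: list of (allele, allele) tuples for diploid individuals;
    None marks a missing call (hypothetical encoding for this sketch).
    """
    alleles = [a for g in genotypes if g is not None for a in g]
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele_frequency(genotypes):
    """Frequency of the second most common allele (0.0 if monomorphic)."""
    freqs = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0


def observed_heterozygosity(genotypes):
    """Fraction of called individuals that carry two different alleles."""
    called = [g for g in genotypes if g is not None]
    return sum(1 for a, b in called if a != b) / len(called)


def polymorphic_information_content(genotypes):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    p = list(allele_frequencies(genotypes).values())
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * x * x * y * y for x, y in combinations(p, 2))
    return 1 - homozygosity - cross_term


# Toy biallelic SNP with one missing call.
snp = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G"), None]
maf = minor_allele_frequency(snp)
ho = observed_heterozygosity(snp)
pic = polymorphic_information_content(snp)
print(maf, ho, pic)
```

For a biallelic SNP the PIC cannot exceed 0.375, which is one reason a panel of SNPs needs many more loci than a panel of multi-allelic microsatellites to reach similar assignment power.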


2020 ◽  
Author(s):  
Kyle Fletcher ◽  
Lin Zhang ◽  
Juliana Gil ◽  
Rongkui Han ◽  
Keri Cavanaugh ◽  
...  

Background: Genetic maps are an important resource for validation of genome assemblies, trait discovery, and breeding. Next generation sequencing has enabled production of high-density genetic maps constructed with 10,000s of markers. Most current approaches require a genome assembly to identify markers. Our Assembly Free Linkage Analysis Pipeline (AFLAP) removes this requirement by using uniquely segregating k-mers as markers to rapidly construct a genotype table and perform subsequent linkage analysis. This avoids potential biases including preferential read alignment and variant calling.
Results: The performance of AFLAP was determined in simulations and contrasted to a conventional workflow. We tested AFLAP using 100 F2 individuals of Arabidopsis thaliana, sequenced to low coverage. Genetic maps generated using k-mers contained over 130,000 markers that were concordant with the genomic assembly. The utility of AFLAP was then demonstrated by generating an accurate genetic map using genotyping-by-sequencing data of 235 recombinant inbred lines of Lactuca spp. AFLAP was then applied to 83 F1 individuals of the oomycete Bremia lactucae, sequenced to >5x coverage. The genetic map contained over 90,000 markers ordered in 19 large linkage groups. This genetic map was used to fragment, order, orient, and scaffold the genome, resulting in a much-improved reference assembly.
Conclusions: AFLAP can be used to generate high density linkage maps and improve genome assemblies of any organism when a mapping population is available using whole genome sequencing or genotyping-by-sequencing data. Genetic maps produced for B. lactucae were accurately aligned to the genome and guided significant improvements of the reference assembly.
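The idea of using segregating k-mers as assembly-free markers can be illustrated with a toy sketch. This is not the AFLAP implementation: it assumes error-free reads held in memory, ignores reverse complements and coverage thresholds, and all function names are hypothetical. It keeps k-mers found in one parent but not the other, then scores each progeny for presence (1) or absence (0) of every such k-mer.

```python
def kmers(seq, k):
    """Set of all substrings of length k in one read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_set(reads, k):
    """Union of k-mers over a collection of reads."""
    found = set()
    for read in reads:
        found |= kmers(read, k)
    return found


def parent_specific_kmers(parent_a_reads, parent_b_reads, k):
    """k-mers present in parent A and absent from parent B (candidate markers)."""
    return sorted(kmer_set(parent_a_reads, k) - kmer_set(parent_b_reads, k))


def genotype_table(markers, progeny_reads, k):
    """Presence/absence call of each marker k-mer in each progeny."""
    table = {}
    for name, reads in progeny_reads.items():
        present = kmer_set(reads, k)
        table[name] = [1 if m in present else 0 for m in markers]
    return table


# Toy data: the parents differ at a single base.
parent_a = ["ACGTAC"]
parent_b = ["ACGTTC"]
progeny = {"p1": ["ACGTAC"], "p2": ["ACGTTC"]}

markers = parent_specific_kmers(parent_a, parent_b, k=4)
table = genotype_table(markers, progeny, k=4)
print(markers)
print(table)
```

On real data, a presence/absence table of this kind would still need filtering for sequencing errors and segregation ratio before linkage analysis.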


2021 ◽  
Author(s):  
Scott T O’Donnell ◽  
Sorel T Fitz-Gibbon ◽  
Victoria L Sork

Ancient introgression can be an important source of genetic variation that shapes the evolution and diversification of many taxa. Here, we estimate the timing, direction and extent of gene flow between two distantly related oak species in the same section (Quercus sect. Quercus). We estimated these demographic events using genotyping-by-sequencing (GBS) data, which generated 25,702 single nucleotide polymorphisms (SNPs) for 24 individuals of California scrub oak (Quercus berberidifolia) and 23 individuals of Engelmann oak (Q. engelmannii). We tested several scenarios involving gene flow between these species using the diffusion approximation-based population genetic inference framework and model-testing approach of the Python package DaDi. We found that the most likely demographic scenario includes a bottleneck in Q. engelmannii that coincides with asymmetric gene flow from Q. berberidifolia into Q. engelmannii. Given that the timing of this gene flow coincides with the advent of a Mediterranean-type climate in the California Floristic Province, we propose that changing precipitation patterns and seasonality may have favored the introgression of climate-associated genes from the endemic into the non-endemic California oak.


2020 ◽  
Vol 47 (4) ◽  
pp. 993-1005 ◽  
Author(s):  
Irene Villa‐Machío ◽  
Alejandro G. Fernández de Castro ◽  
Javier Fuertes‐Aguilar ◽  
Gonzalo Nieto Feliner

2018 ◽  
Vol 18 (2) ◽  
pp. 179-190 ◽  
Author(s):  
William R. Stovall ◽  
Helen R. Taylor ◽  
Michael Black ◽  
Stefanie Grosser ◽  
Kim Rutherford ◽  
...  

Data in Brief ◽  
2019 ◽  
Vol 25 ◽  
pp. 104273
Author(s):  
Berenice Talamantes-Becerra ◽  
Jason Carling ◽  
Karina Kennedy ◽  
Michelle E. Gahan ◽  
Arthur Georges

Genome ◽  
2020 ◽  
Vol 63 (11) ◽  
pp. 577-581
Author(s):  
Davoud Torkamaneh ◽  
Jérôme Laroche ◽  
François Belzile

Genotyping-by-sequencing (GBS) is a rapid, flexible, low-cost, and robust genotyping method that simultaneously discovers variants and calls genotypes within a broad range of samples. These characteristics make GBS an excellent tool for many applications and research questions from conservation biology to functional genomics in both model and non-model species. Continued improvement of GBS relies on a more comprehensive understanding of data analysis, development of fast and efficient bioinformatics pipelines, accurate missing data imputation, and active post-release support. Here, we present the second generation of Fast-GBS (v2.0) that offers several new options (e.g., processing paired-end reads and imputation of missing data) and features (e.g., summary statistics of genotypes) to improve the GBS data analysis process. The performance assessment analysis showed that Fast-GBS v2.0 outperformed other available analytical pipelines, such as GBS-SNP-CROP and Gb-eaSy. Fast-GBS v2.0 provides an analysis platform that can be run with different types of sequencing data and modest computational resources, and that allows for missing-data imputation for various species in different contexts.


Genetics ◽  
2014 ◽  
Vol 197 (1) ◽  
pp. 401-404 ◽  
Author(s):  
B. Emma Huang ◽  
Chitra Raghavan ◽  
Ramil Mauleon ◽  
Karl W. Broman ◽  
Hei Leung
