Agro-morphological, yield, and genotyping-by-sequencing data of selected wheat germplasm

2020
Author(s): Madiha Islam, Abdullah, Bibi Zubaida, Nosheen Shafqat, Rabia Masood, ...

Abstract Wheat (Triticum aestivum) is the most important staple food in Pakistan. Knowledge of its genetic diversity is critical for designing effective crop breeding programs. Here we report agro-morphological and yield data for 112 genotypes (including 7 duplicates) of wheat (Triticum aestivum) cultivars, advanced lines, landraces and wild relatives, collected from several research institutes and breeders across Pakistan. We also report genotyping-by-sequencing (GBS) data for a selected subset of 52 genotypes. Sequencing was performed on the Illumina HiSeq 2500 platform using the PE150 run. Data generated per sample ranged from 1.01 to 2.5 Gb; 90% of the short reads exhibited quality scores above 99.9%. The TGACv1 wheat genome was used as a reference to map short reads from individual genotypes and to filter single nucleotide polymorphic loci (SNPs). On average, 364,074 ± 54,479 SNPs per genotype were recorded. The sequencing data have been submitted to the SRA database of NCBI (accession number SRP179096). The agro-morphological and yield data, along with the sequence data and SNPs, will be invaluable resources for future wheat breeding programs.

2019, Vol 15, pp. 117693431988994
Author(s): Shulin Zhang, Yaling Cai, Jinggong Guo, Kun Li, Renhai Peng, ...

Determining the genetic rearrangement and domestication footprints in Gossypium hirsutum cultivars and primitive race genotypes is essential for effective gene conservation efforts and for the development of advanced molecular markers for marker-assisted breeding. In this study, 94 accessions representing the 7 primitive races of G hirsutum, along with 9 G hirsutum and 12 Gossypium barbadense cultivated accessions, were evaluated. The genotyping-by-sequencing (GBS) approach was employed and 146 558 single nucleotide polymorphisms (SNPs) were generated. Distinct SNP signatures were identified through the combination of selection scans and association analyses. Phylogenetic analyses were also conducted, and we concluded that the Latifolium, Richmondi, and Marie-Galante race accessions were more genetically related to the G hirsutum cultivars and tended to cluster together. Fifty-four outlier SNP loci were identified by selection-scan analysis, and 3 SNPs were located in genes related to plant responses to stress conditions and were confirmed through further genome-wide marker-phenotype association analysis, which indicates a clear selection signature for such traits. These results identified useful candidate gene loci for cotton breeding programs.


2021
Author(s): Scott T O’Donnell, Sorel T Fitz-Gibbon, Victoria L Sork

Abstract Ancient introgression can be an important source of genetic variation that shapes the evolution and diversification of many taxa. Here, we estimate the timing, direction and extent of gene flow between two distantly related oak species in the same section (Quercus sect. Quercus). We estimated these demographic events using genotyping-by-sequencing (GBS) data, which yielded 25,702 single nucleotide polymorphisms (SNPs) for 24 individuals of California scrub oak (Quercus berberidifolia) and 23 individuals of Engelmann oak (Q. engelmannii). We tested several scenarios involving gene flow between these species using the diffusion approximation-based population genetic inference framework and model-testing approach of the Python package DaDi. We found that the most likely demographic scenario includes a bottleneck in Q. engelmannii that coincides with asymmetric gene flow from Q. berberidifolia into Q. engelmannii. Given that the timing of this gene flow coincides with the advent of a Mediterranean-type climate in the California Floristic Province, we propose that changing precipitation patterns and seasonality may have favored the introgression of climate-associated genes from the endemic into the non-endemic California oak.


2017
Author(s): Kelly J Vining, Natalia Salinas, Jacob A Tennessen, Jason D Zurn, Daniel James Sargent, ...

With the goal of evaluating genotyping-by-sequencing (GBS) in a species with a complex octoploid genome, GBS was used to survey genome-wide single-nucleotide polymorphisms (SNPs) in three biparental strawberry (Fragaria ×ananassa) populations. GBS sequence data were aligned to the F. vesca ‘Fvb’ reference genome in order to call SNPs. Numbers of polymorphic SNPs per population ranged from 1,163 to 3,190. Linkage maps consisting of 30-65 linkage groups were produced from the SNP sets derived from each parent. The linkage groups covered 99% of the Fvb reference genome, with three to seven linkage groups from a given parent aligned to any particular chromosome. A phylogenetic analysis performed using the POLiMAPS pipeline revealed linkage groups that were most similar to ancestral species F. vesca for each chromosome. Linkage groups that were most similar to a second ancestral species, F. iinumae, were only resolved for Fvb 4. The quantity of missing data and heterogeneity in genome coverage inherent in GBS complicated the analysis, but POLiMAPS resolved F. ×ananassa chromosomal regions derived from diploid ancestor F. vesca.


PeerJ, 2017, Vol 5, pp. e3731
Author(s): Kelly J. Vining, Natalia Salinas, Jacob A. Tennessen, Jason D. Zurn, Daniel James Sargent, ...

Genotyping-by-sequencing (GBS) was used to survey genome-wide single-nucleotide polymorphisms (SNPs) in three biparental strawberry (Fragaria × ananassa) populations with the goal of evaluating this technique in a species with a complex octoploid genome. GBS sequence data were aligned to the F. vesca ‘Fvb’ reference genome in order to call SNPs. Numbers of polymorphic SNPs per population ranged from 1,163 to 3,190. Linkage maps consisting of 30–65 linkage groups were produced from the SNP sets derived from each parent. The linkage groups covered 99% of the Fvb reference genome, with three to seven linkage groups from a given parent aligned to any particular chromosome. A phylogenetic analysis performed using the POLiMAPS pipeline revealed linkage groups that were most similar to ancestral species F. vesca for each chromosome. Linkage groups that were most similar to a second ancestral species, F. iinumae, were only resolved for Fvb 4. The quantity of missing data and heterogeneity in genome coverage inherent in GBS complicated the analysis, but POLiMAPS resolved F. × ananassa chromosomal regions derived from diploid ancestor F. vesca.


Agriculture, 2019, Vol 9 (5), pp. 97
Author(s): Govintharaj Ponnaiah, Shashi Kumar Gupta, Michael Blümmel, Maheswaran Marappa, Sumathi Pichaikannu, ...

Genetic diversity of 130 forage-type hybrid parents of pearl millet was investigated based on multiple-season data for morphological traits and two types of markers: simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) identified by genotyping by sequencing (GBS). Most of the seed and pollinator parents clustered into two clearly separate groups based on marker-based genetic distance. Significant variation was found for forage-related morphological traits at different cutting intervals (first and second cut) in hybrid parents. Across two cuts, crude protein (CP) varied from 11% to 15%, while in vitro organic matter digestibility (IVOMD) varied from 51% to 56%. Eighty hybrids evaluated in a multi-location trial along with their parents showed that significant heterosis can be realized for forage traits. A low but significant positive correlation was found between SSR-based genetic distance (GD between the parents of a hybrid) and heterosis for most of the forage traits, indicating that SSR-based GD can be used for predicting heterosis for GFY, DFY and CP in pearl millet. An attempt was made to associate marker-based clusters with forage quality traits, to enable breeders to select parents for crossing in forage breeding programs.


2020, Vol 20 (1)
Author(s): Kamal Khadka, Davoud Torkamaneh, Mina Kaviani, Francois Belzile, Manish N. Raizada, ...

Abstract Background: Appropriate information about genetic diversity and population structure of germplasm improves the efficiency of plant breeding. The low productivity of Nepali bread wheat (Triticum aestivum L.) is a major concern, particularly since Nepal is ranked the 4th most vulnerable nation globally to climate change. The genetic diversity and population structure of Nepali spring wheat have not been reported. This study aims to improve the exploitation of more diverse and under-utilized genetic resources to contribute to current and future breeding efforts for global food security. Results: We used genotyping-by-sequencing (GBS) to characterize a panel of 318 spring wheat accessions from Nepal including 166 landraces, 115 CIMMYT advanced lines, and 34 Nepali released varieties. We identified 95 K high-quality SNPs. The greatest genetic diversity was observed among the landraces, followed by CIMMYT lines, and released varieties. Though we expected only 3 groupings corresponding to these 3 seed origins, the population structure revealed two large, distinct subpopulations along with two smaller and scattered subpopulations in between, with significant admixture. This result was confirmed by principal component analysis (PCA) and UPGMA distance-based clustering. The pattern of LD decay differed between subpopulations, ranging from 60 to 150 Kb. We discuss the possibility that germplasm explorations during the 1970s–1990s may have mistakenly collected exotic germplasm instead of local landraces and/or collected materials that had already cross-hybridized since exotic germplasm was introduced starting in the 1950s. Conclusion: We suggest that only a subset of wheat “landraces” in Nepal are authentic, which this study has identified. Targeting these authentic landraces may accelerate local breeding programs to improve the food security of this climate-vulnerable nation. Overall, this study provides a novel understanding of the genetic diversity of wheat in Nepal and this may contribute to global wheat breeding initiatives.


Author(s): Bernd Degen, Celine Blanc-Jolivet, Svetlana Bakhtina, Ruslan Ianbaev, Yulai Yanbaev, ...

Abstract We used double digest restriction-site associated DNA sequencing (ddRAD) and MiSeq to develop new geographically informative nuclear and plastid SNP and indel loci in Quercus robur and Q. petraea. Genotypes derived from sequence data of 95 individuals and two pools of 20 individuals each of Q. robur and Q. mongolica, covering the distribution range of the species, were analysed to select geographically informative and polymorphic loci within Germany and Russia. We successfully screened a selected set of 431 nuclear single nucleotide polymorphism (nSNP), six nuclear indel, six mitochondrial single nucleotide polymorphism (mtSNP) and ten chloroplast single nucleotide polymorphism (cpSNP) loci with a SeqSNP genotyping platform on 100 individuals of Quercus petraea from ten locations in Germany, 100 individuals of Quercus robur from ten locations in Germany and 100 individuals of Quercus robur from ten locations in Russia. The newly developed loci are useful for species identification and for studies of the genetic diversity and genetic differentiation of Quercus robur and Quercus petraea in Europe.


2020, Vol 10 (1)
Author(s): D. N. U. Naranpanawa, C. H. W. M. R. B. Chandrasekara, P. C. G. Bandaranayake, A. U. Bandaranayake

Abstract Recent advances in next-generation sequencing technologies have made it possible to generate considerable amounts of sequencing data at relatively low cost. This has revolutionized genomics and transcriptomics studies. However, handling such data with the available bioinformatics platforms now poses challenges, both in assembly and in the downstream analysis performed to infer correct biological meaning. Though there are a handful of commercial software tools for some of the procedures, their cost has made them prohibitive for most research laboratories. While individual open-source or free software tools are available for most bioinformatics applications, those components usually operate standalone and are not combined into a user-friendly workflow. Therefore, beginners in bioinformatics might find analysis procedures starting from raw sequence data too complicated and time-consuming given the associated learning curve. Here, we outline a procedure for de novo transcriptome assembly and Simple Sequence Repeats (SSR) primer design based solely on tools that are available online for free use. For validation of the developed workflow, we used Illumina HiSeq reads of different tissue samples of Santalum album (sandalwood), generated from a previous transcriptomics project. A portion of the designed primers were tested in the lab with relevant samples and all of them successfully amplified the targeted regions. The presented bioinformatics workflow can accurately assemble quality transcriptomes and develop gene-specific SSRs. Beginner biologists and researchers in bioinformatics can easily utilize this workflow for research purposes.
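To make the SSR-detection step concrete, the following is a minimal Python sketch of how perfect simple sequence repeats can be located in an assembled transcript with a regular expression. It is not the workflow or any tool used by the authors: the function name `find_ssrs` and the minimum repeat thresholds are illustrative assumptions, and primer design is left out entirely.

```python
import re

# Minimum number of motif copies per motif length.
# These thresholds are illustrative assumptions, not values from the paper.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} requires n or more further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical transcript fragment with an (AT)7 and a (GAA)5 repeat.
    transcript = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TT"
    for start, motif, copies in find_ssrs(transcript):
        print(start, motif, copies)
```

Start positions are 0-based. In a real workflow, the flanking sequence around each hit would then be passed to a primer design tool.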


2021
Author(s): Tim H. Heupink, Lennert Verboven, Robin M. Warren, Annelies Van Rie

Abstract Improved understanding of the genomic variants that allow Mycobacterium tuberculosis (Mtb) to acquire drug resistance, or tolerance, and increase its virulence is an important factor in controlling the current tuberculosis epidemic. Current approaches to Mtb sequencing, however, cannot reveal Mtb’s full genomic diversity due to the strict requirements of low contamination levels, high Mtb sequence coverage, and elimination of complex regions. We developed the XBS (compleX Bacterial Samples) bioinformatics pipeline, which implements joint calling and machine-learning-based variant filtering tools to specifically improve variant detection in the important Mtb samples that do not meet these criteria, such as those from unbiased sputum samples. Using novel simulated datasets that permit exact accuracy verification, XBS was compared to the UVP and MTBseq pipelines. Accuracy statistics showed that all three pipelines performed equally well for sequence data that resemble those obtained from high depth coverage and low-level contamination culture isolates. In the complex genomic regions, however, XBS accurately identified 9.0% more single nucleotide polymorphisms and 8.1% more single nucleotide insertions and deletions than the WHO-endorsed unified analysis variant pipeline. XBS also had superior accuracy for sequence data that resemble those obtained directly from sputum samples, where depth of coverage is typically very low and contamination levels are high. XBS was the only pipeline not affected by low depth of coverage (5-10×), type of contamination and excessive contamination levels (>50%). Simulation results were confirmed using WGS data from clinical samples, which showed the superior performance of XBS with a higher sensitivity (98.8%) when analysing culture isolates and identification of 13.9% more variable sites in WGS data from sputum samples as compared to MTBseq, without evidence for false positive variants when ribosomal RNA regions were excluded. The XBS pipeline facilitates sequencing of less-than-perfect Mtb samples. These advances will benefit future clinical applications of Mtb sequencing, especially whole genome sequencing directly from clinical specimens, thereby avoiding in vitro biases and making many more samples available for drug resistance and other genomic analyses. The additional genetic resolution and increased sample success rate will improve genome-wide association studies and sequence-based transmission studies.
Impact statement: Mycobacterium tuberculosis (Mtb) DNA is usually extracted from culture isolates to obtain high quantities of non-contaminated DNA, but this process can change the make-up of the bacterial population and is time-consuming. Furthermore, current analytic approaches exclude complex genomic regions where DNA sequences are repeated to avoid inference of false positive genetic variants, which may result in the loss of important genetic information. We designed the compleX Bacterial Sample (XBS) variant caller to overcome these limitations. XBS employs joint variant calling and machine-learning-based variant filtering to ensure that high quality variants can be inferred from low coverage and highly contaminated genomic sequence data obtained directly from sputum samples. Simulation and clinical data analyses showed that XBS performs better than other pipelines as it can identify more genetic variants and can handle complex (low depth, highly contaminated) Mtb samples. The XBS pipeline was designed to analyse Mtb samples but can easily be adapted to analyse other complex bacterial samples.
Data summary: Simulated sequencing data have been deposited in SRA BioProject PRJNA706121. All detailed findings are available in the Supplementary Material. Scripts for running the XBS variant calling core are available on https://github.com/TimHHH/XBS. The authors confirm all supporting data, code and protocols have been provided within the article or through supplementary data files.


Genome, 2021
Author(s): Guoliang Li, Lixin Yue, Xu Cai, Fei Li, Hui Zhang, ...

This study evaluated a genotyping-by-sequencing (GBS) protocol for fingerprinting Brassica rapa, and the data derived were more reliable than the re-sequencing data of B. rapa. Of the 10 enzyme solutions used to analyze the numbers of genotypes and single nucleotide polymorphisms (SNPs) in B. rapa, five solutions showed better results, namely: A (HaeIII, 450–500 bp), E (RsaI+HaeIII, 500–550 bp), F (RsaI+HaeIII, 500–600 bp), G (RsaI+HaeIII, ‘All’ fragment), and J (RsaI+EcoRV-HF®, ‘All’ fragment). The five enzyme solutions showed less than 40% similarity between different individuals from various samples, and 90% similarity between two individuals from one sample. The E enzyme solution was the most suitable for fingerprinting B. rapa, revealing SNPs well distributed across the whole genome. Of the 82 highly inbred lines and 18 F1 lines of B. rapa sequenced by GBS with the E enzyme solution, the known parents of 10 F1 lines were verified and male parents were discovered for 8 F1 lines that had only known female parents. By efficiently evaluating GBS with varied library construction strategies, this study provides a valuable method for screening parents of F1 lines in B. rapa for applied breeding.

