Accounting for Errors in Low Coverage High-Throughput Sequencing Data when Constructing Genetic Maps using Biparental Outcrossed Populations

2018 ◽  
Author(s):  
Timothy P. Bilton ◽  
Matthew R. Schofield ◽  
Michael A. Black ◽  
David Chagné ◽  
Phillip L. Wilcox ◽  
...  

ABSTRACT Next-generation sequencing is an efficient method that yields substantially more markers than previous technologies, providing opportunities for building high-density genetic linkage maps, which facilitate the development of non-model species’ genomic assemblies and the investigation of their genes. However, constructing genetic maps using data generated via high-throughput sequencing technology (e.g., genotyping-by-sequencing) is complicated by the presence of sequencing errors and genotyping errors resulting from missing parental alleles due to low sequencing depth. If unaccounted for, these errors lead to inflated genetic maps. In addition, map construction in many species is performed using full-sib family populations derived from the outcrossing of two individuals, where unknown parental phase and varying segregation types further complicate construction. We present a new methodology for modeling low coverage sequencing data in the construction of genetic linkage maps using full-sib populations of diploid species, implemented in a package called GUSMap. Our model is based on an extension of the Lander-Green hidden Markov model that accounts for errors present in sequencing data. Results show that GUSMap was able to give accurate estimates of the recombination fractions and overall map distance, while most existing mapping packages produced inflated genetic maps in the presence of errors. Our results demonstrate the feasibility of using low coverage sequencing data to produce genetic maps without requiring extensive filtering of potentially erroneous genotypes, provided that the associated errors are correctly accounted for in the model.
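As a rough illustration of why low sequencing depth produces the genotyping errors described in this abstract, the sketch below uses a simple binomial read model. This is an assumption made for illustration only and is not taken from GUSMap: each read from a heterozygote is assumed to sample either allele with probability 1/2, and `eps` is a hypothetical per-read sequencing error rate. The function names are likewise hypothetical.

```python
from math import comb


def read_likelihood(ref_reads: int, depth: int, genotype: str, eps: float = 0.01) -> float:
    """Binomial probability of seeing `ref_reads` reference reads out of `depth`.

    Assumed model (illustrative only): a read shows the reference allele with
    probability 1 - eps for genotype AA, 0.5 for AB, and eps for BB.
    """
    p_ref = {"AA": 1.0 - eps, "AB": 0.5, "BB": eps}[genotype]
    return comb(depth, ref_reads) * p_ref**ref_reads * (1.0 - p_ref) ** (depth - ref_reads)


def prob_het_seen_as_hom(depth: int) -> float:
    """Probability that a true heterozygote yields reads for only one allele."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # Either all reads are the reference allele or none are.
    return read_likelihood(depth, depth, "AB") + read_likelihood(0, depth, "AB")


if __name__ == "__main__":
    for d in (1, 2, 4, 8):
        print(d, prob_het_seen_as_hom(d))
```

Under this simplified model, the chance that one parental allele is missed halves with each additional read, which is why a model that ignores depth can treat many heterozygotes as homozygotes at low coverage.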

Genetics ◽  
2018 ◽  
Vol 209 (1) ◽  
pp. 65-76 ◽  
Author(s):  
Timothy P. Bilton ◽  
Matthew R. Schofield ◽  
Michael A. Black ◽  
David Chagné ◽  
Phillip L. Wilcox ◽  
...  

AoB Plants ◽  
2020 ◽  
Vol 12 (6) ◽  
Author(s):  
Morad M Mokhtar ◽  
Ebtissam H A Hussein ◽  
Salah El-Din S El-Assal ◽  
Mohamed A M Atia

Abstract Faba bean (Vicia faba) is an essential food and fodder legume crop worldwide due to its high content of proteins and fibres. Molecular marker tools are invaluable to faba bean breeders for rapid crop improvement. Although there have historically been few V. faba genome resources available, several transcriptomes and mitochondrial genome sequence data have been released. These data, in addition to previously developed genetic linkage maps, represent a great resource for developing functional markers and maps that can accelerate faba bean breeding programmes. Here, we present the Vicia faba Omics database (VfODB) as a comprehensive database integrating germplasm information, expressed sequence tags (ESTs), expressed sequence tag-simple sequence repeats (EST-SSRs), mitochondrial simple sequence repeats (mtSSRs), microRNA-target markers and genetic maps in faba bean. In addition, KEGG pathway-based markers and functional maps are integrated as a novel class of annotation-based markers/maps. Collectively, we developed 31 536 EST markers, 9071 EST-SSR markers and 3023 microRNA-target markers based on V. faba RefTrans V2 mining. By mapping 7940 EST and 2282 EST-SSR markers against the KEGG pathways database, we successfully developed 107 functional maps. Also, 40 mtSSR markers were developed based on mitochondrial genome mining. On the data curation level, we retrieved 3461 markers representing 12 types of markers (CAPS, EST, EST-SSR, Gene marker, INDEL, Isozyme, ISSR, RAPD, SCAR, RGA, SNP and SSR), which mapped across 18 V. faba genetic linkage maps. VfODB provides two user-friendly tools to identify and classify SSR motifs and to amplify their targets in silico. VfODB can serve as a powerful database and helpful platform for the faba bean research community as well as breeders interested in Genomics-Assisted Breeding.


2013 ◽  
Vol 40 (2) ◽  
pp. 95-106 ◽  
Author(s):  
Baozhu Guo ◽  
Manish K. Pandey ◽  
Guohao He ◽  
Xinyou Zhang ◽  
Boshou Liao ◽  
...  

ABSTRACT The competitiveness of peanuts in domestic and global markets has been threatened by losses in productivity and quality that are attributed to diseases, pests, environmental stresses and allergy or food safety issues. Narrow genetic diversity and a deficiency of polymorphic DNA markers severely hindered construction of dense genetic maps and quantitative trait loci (QTL) mapping in order to deploy linked markers in marker-assisted peanut improvement. The U.S. Peanut Genome Initiative (PGI) was launched in 2004 and expanded to a global effort in 2006 to address these issues through coordination of international efforts in genome research, beginning with molecular marker development and improvement of map resolution and coverage. Ultimately, a peanut genome sequencing project was launched in 2012 by the Peanut Genome Consortium (PGC). We reviewed the progress in accelerated development of peanut genomic resources, such as generation of expressed sequence tags (ESTs) (252,832 ESTs as of December 2012 in the public NCBI EST database), development of molecular markers (over 15,518 SSRs), and construction of peanut genetic linkage maps, in particular for cultivated peanut. Several consensus genetic maps have been constructed, and there are examples of recent international efforts to develop high-density maps. An international reference consensus genetic map was developed recently with 897 marker loci based on 11 published mapping populations. Furthermore, a high-density integrated consensus map of cultivated peanut and wild diploid relatives has also been developed, which was enriched further with 3693 marker loci on a single map by adding information from five new genetic mapping populations to the published reference consensus map.


2018 ◽  
Author(s):  
Quinn K. Langdon ◽  
David Peris ◽  
Brian Kyle ◽  
Chris Todd Hittinger

Abstract The genomics era has expanded our knowledge about the diversity of the living world, yet harnessing high-throughput sequencing data to investigate alternative evolutionary trajectories, such as hybridization, is still challenging. Here we present sppIDer, a pipeline for the characterization of interspecies hybrids and pure species that illuminates the complete composition of genomes. sppIDer maps short-read sequencing data to a combination genome built from reference genomes of several species of interest and assesses the genomic contribution and relative ploidy of each parental species, producing a series of colorful graphical outputs ready for publication. As a proof-of-concept, we use the genus Saccharomyces to detect and visualize both interspecies hybrids and pure strains, even with missing parental reference genomes. Through simulation, we show that sppIDer is robust to variable reference genome qualities and performs well with low-coverage data. We further demonstrate the power of this approach in plants, animals, and other fungi. sppIDer is robust to many different inputs and provides visually intuitive insight into genome composition that enables the rapid identification of species and their interspecies hybrids. sppIDer exists as a Docker image, which is a reusable, reproducible, transparent, and simple-to-run package that automates the pipeline and installation of the required dependencies (https://github.com/GLBRC/sppIDer).


2021 ◽  
Author(s):  
Yun-Joo Kang ◽  
Bo-Mi Lee ◽  
Jangmi Kim ◽  
Moon Nam ◽  
Myoung-Hee Lee ◽  
...  

Abstract High-quality molecular markers are essential for marker-assisted selection to accelerate breeding progress. Compared with diploid species, recently diverged polyploid crop species tend to have highly similar homeologous subgenomes, which is expected to limit the development of broadly applicable locus-specific single-nucleotide polymorphism (SNP) assays. Furthermore, it is particularly challenging to make genome-wide marker sets for species that lack a reference genome. Here, we report the development of a genome-wide set of kompetitive allele specific PCR (KASP) markers for marker-assisted recurrent selection (MARS) in the tetraploid minor crop perilla. To find locus-specific SNP markers across the perilla genome, we used genotyping-by-sequencing (GBS) to construct linkage maps of two F2 populations. The two resulting high-resolution linkage maps comprised 2,326 and 2,454 SNP markers that spanned a total genetic distance of 2,133 cM across 16 linkage groups and 2,169 cM across 21 linkage groups, respectively. We then obtained a final genetic map consisting of 22 linkage groups with 1,123 common markers from the two genetic maps. We selected 96 genome-wide markers for MARS and confirmed the accuracy of markers in the two F2 populations using a high-throughput Fluidigm system. We confirmed that 91.8% of the SNP genotyping results from the Fluidigm assay were the same as the results obtained through GBS. These results provide a foundation for marker-assisted backcrossing and the development of new varieties of perilla.


2022 ◽  
Vol 12 ◽  
Author(s):  
Xinxiu Yu ◽  
Rajesh Joshi ◽  
Hans Magnus Gjøen ◽  
Zhenming Lv ◽  
Matthew Kent

Consensus and sex-specific genetic linkage maps for large yellow croaker (Larimichthys crocea) were constructed using samples from an F1 family produced by crossing a Daiqu female and a Mindong male. A total of 20,147 single nucleotide polymorphisms (SNPs) obtained by restriction site associated DNA sequencing were assigned to 24 linkage groups (LGs). The total length of the consensus map was 1757.4 centimorgan (cM) with an average marker interval of 0.09 cM. The total lengths of the female and male linkage maps were 1533.1 cM and 1279.2 cM, respectively. The average female-to-male map length ratio was 1.2 ± 0.23. Collapsed markers in the genetic maps were re-ordered according to their relative positions in the ASM435267v1 genome assembly to produce integrated genetic linkage maps with 9885 SNPs distributed across the 24 LGs. Most LGs showed a sigmoidal recombination pattern, with higher recombination in the middle and suppressed recombination at both ends, which corresponds with the presence of sub-telocentric and acrocentric chromosomes in the species. The average recombination rates in the integrated female and male maps were 3.55 cM/Mb and 3.05 cM/Mb, respectively. In most LGs, higher recombination rates were found in the integrated female map than in the male map, except in LG12, LG16, LG21, LG22, and LG24. Recombination rate profiles within each LG differed between the male and the female, with distinct regions indicating potential recombination hotspots. Separate quantitative trait loci (QTL) and association analyses for growth-related traits in 6-month-old fish were performed; however, no significant QTL was detected. The study indicates that there may be genetic differences between the two strains, which may have implications for the application of DNA information in further breeding schemes.
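The summary figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only and rests on two assumptions: the mean marker interval is taken as total map length divided by the number of markers, and the ratio is computed from the total female and male map lengths. The abstract's reported ratio is an average across linkage groups, so it need not equal the ratio of totals exactly. The function names are hypothetical.

```python
def map_length_ratio(female_cm: float, male_cm: float) -> float:
    """Ratio of total female to total male map length (both in cM)."""
    return female_cm / male_cm


def mean_marker_interval(total_cm: float, n_markers: int) -> float:
    """Approximate mean spacing between markers, in cM per marker."""
    return total_cm / n_markers


# Values quoted in the abstract above.
ratio = map_length_ratio(1533.1, 1279.2)
interval = mean_marker_interval(1757.4, 20147)

print(f"female-to-male ratio of totals: {ratio:.2f}")
print(f"mean marker interval (cM): {interval:.2f}")
```

Both values round to the figures reported in the abstract (a ratio of about 1.2 and an interval of about 0.09 cM), so the quoted totals are internally consistent under these assumptions.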


Plants ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1048
Author(s):  
Daniela Torello Marinoni ◽  
Sogo Nishio ◽  
Nadia Valentini ◽  
Kenta Shirasawa ◽  
Alberto Acquadro ◽  
...  

Castanea sativa is an important multipurpose species in Europe for nut and timber production as well as for its role in the landscape and in the forest ecosystem. This species has low tolerance to chestnut gall wasp (Dryocosmus kuriphilus Yasumatsu), which is a pest that was accidentally introduced into Europe in early 2000 and devastated forest and orchard trees. Resistance to the gall wasp was found in the hybrid cultivar ‘Bouche de Bétizac’ (C. sativa × C. crenata) and studied by developing genetic linkage maps using a population derived from a cross between ‘Bouche de Bétizac’ and the susceptible cultivar ‘Madonna’ (C. sativa). The high-density genetic maps were constructed using double-digest restriction site-associated DNA-seq and simple sequence repeat markers. The map of ‘Bouche de Bétizac’ consisted of 1459 loci and spanned 809.6 cM; the map of ‘Madonna’ consisted of 1089 loci and spanned 753.3 cM. In both maps, 12 linkage groups were identified. A single major quantitative trait locus (QTL) was recognized on the ‘Bouche de Bétizac’ map, explaining up to 67–69% of the phenotypic variance of the resistance trait (Rdk1). The Rdk1 QTL region included 11 scaffolds, and two candidate genes putatively involved in the resistance response were identified. This study will contribute to C. sativa breeding programs and to the study of Rdk1 genes.


2019 ◽  
Author(s):  
Rasyidah Razar ◽  
Katrien Devos ◽  
Ali Missaoui

Abstract Background: Switchgrass is an emerging bioenergy crop due to its perennial nature, high biomass yield, and ability to grow in marginal land. The high genetic diversity in switchgrass germplasm can be exploited to capture favorable traits that increase the range of adaptation and biomass yield. Genetic diversity can be explored using single nucleotide polymorphisms (SNPs), which next-generation sequencing has made available for high-throughput genotyping. We used genotyping-by-sequencing (GBS) of genomic fragments resulting from two methylation sensitive restriction enzymes: PstI and MspI. Two bi-parental F1 populations were developed from crosses between lowland B6 and lowland AP13 (AB population), and lowland B6 with upland VS16 genotypes (BV population), with a target number of 298 progenies in each population. A pseudo-testcross strategy was adopted to perform linkage analysis in these populations, which are segregating for winter dormancy, using single dose markers (SDA): heterozygous in one parent and homozygous in the other parent. We compared the amount of polymorphism between the two crosses and examined the pattern of segregation distortion based on the SNP data generated. Results: Two genetic maps were generated for each population, with 2772 markers in AB and 3766 markers in BV. The higher number of markers in the BV population was expected since the parents originated from different ecotypes and were verified to have the highest genetic distance. More segregation distortion was observed in markers located in the telomeric regions where more genes reside. More markers from the AB population exhibited segregation distortion compared to the BV, and the proportion of heterozygous alleles was significantly higher than that of homozygous alleles in the AB population. The linkage maps showed strong collinearity with the P. virgatum V5.1 reference genome, with a very minimal number of markers originating from different chromosomes.
Conclusion: Understanding the extent of segregation distortion in switchgrass crosses is important for the correct inclusion of markers based on their segregation ratio when constructing a linkage map. Switchgrass linkage maps should be a useful resource to dissect beneficial biomass traits linked to SNP markers.


Genetics ◽  
2018 ◽  
Vol 209 (2) ◽  
pp. 389-400 ◽  
Author(s):  
Timothy P. Bilton ◽  
John C. McEwan ◽  
Shannon M. Clarke ◽  
Rudiger Brauning ◽  
Tracey C. van Stijn ◽  
...  

2013 ◽  
Vol 6 (1) ◽  
pp. 307 ◽  
Author(s):  
Chun Au ◽  
Man Cheung ◽  
Man Wong ◽  
Astley Kin Kan Chu ◽  
Patrick Tik Wan Law ◽  
...  
