De Novo SNP Discovery and Genotyping of Iranian Pimpinella Species Using ddRAD Sequencing

Agronomy ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1342
Author(s):  
Shaghayegh Mehravi ◽  
Gholam Ali Ranjbar ◽  
Ghader Mirzaghaderi ◽  
Anita Alice Severn-Ellis ◽  
Armin Scheben ◽  
...  

The species of Pimpinella, one of the largest genera of the family Apiaceae, are traditionally cultivated for medicinal purposes. In this study, high-throughput double digest restriction-site associated DNA sequencing technology (ddRAD-seq) was used to identify single nucleotide polymorphisms (SNPs) in eight Pimpinella species from Iran. After double-digestion with the enzymes HpyCH4IV and HinfI, a total of 334,702,966 paired-end reads were de novo assembled into 1,270,791 loci with an average of 28.8 reads per locus. After stringent filtering, 2440 high-quality SNPs were identified for downstream analysis. Analysis of genetic relationships and population structure, based on these retained SNPs, indicated the presence of three major groups. Gene ontology and pathway analyses were performed by comparing SNP-associated flanking sequences against a public non-redundant database. Given the lack of genomic resources in this genus, our present study is the first report to provide high-quality SNPs in Pimpinella based on a de novo analysis pipeline using ddRAD-seq. These data will enhance the molecular knowledge of the genus Pimpinella and will provide an important source of information for breeders and the research community to enhance breeding programs and support the management of Pimpinella genomic resources.

Author(s):  
Valentina Peona ◽  
Mozes P.K. Blom ◽  
Luohao Xu ◽  
Reto Burri ◽  
Shawn Sullivan ◽  
...  

Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies have opened up a whole new world of genomic biodiversity. Although these technologies generate high-quality genome assemblies, there are still genomic regions difficult to assemble, like repetitive elements and GC-rich regions (genomic “dark matter”). In this study, we compare the efficiency of currently used sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter starting from the same sample. By adopting different de-novo assembly strategies, we were able to compare each individual draft assembly to a curated multiplatform one and identify the nature of the previously missing dark matter with a particular focus on transposable elements, multi-copy MHC genes, and GC-rich regions. Thanks to this multiplatform approach, we demonstrate the feasibility of producing a high-quality chromosome-level assembly for a non-model organism (paradise crow) for which only suboptimal samples are available. Our approach was able to reconstruct complex chromosomes like the repeat-rich W sex chromosome and several GC-rich microchromosomes. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects around the completeness of both the coding and non-coding parts of the genomes.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 318
Author(s):  
Md. Bazlur Rahman Mollah ◽  
Md. Shamsul Alam Bhuiyan ◽  
M.A.M. Yahia Khandoker ◽  
Md. Abdul Jalil ◽  
Gautam Kumar Deb ◽  
...  

The Black Bengal goat (BBG) is a dwarf-sized heritage goat (Capra hircus) breed from Bangladesh, and is well known for its high fertility and excellent meat and skin quality. Here we present the first whole genome sequence and genome-wide distributed single nucleotide polymorphisms (SNPs) of the BBG. A total of 833,469,900 raw reads consisting of 125,020,485,000 bases were obtained by sequencing one male BBG sample. The reads were aligned to the San Clemente and the Yunnan black goat genomes, which resulted in 98.65% (properly paired, 94.81%) and 98.50% (properly paired, 97.10%) of the reads aligning, respectively. Notably, the estimated sequencing coverages were 48.22X and 44.28X compared to the published San Clemente and Yunnan black goat genomes, respectively. In addition, a total of 9,497,875 high quality SNPs (Q ≥ 20) along with 1,023,359 indels, and 8,746,849 high quality SNPs along with 842,706 indels, were identified in BBG against the San Clemente and Yunnan black goat genomes, respectively. The dataset is publicly available from NCBI BioSample (SAMN10391846), Sequence Read Archive (SRR8182317, SRR8549413 and SRR8549904), with BioProject ID PRJNA504436. These data might be useful genomic resources in conducting genome wide association studies, identification of quantitative trait loci (QTLs) and functional genomic analysis of the Black Bengal goat.
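The read, base, and coverage totals reported above are linked by simple arithmetic, which the following minimal Python sketch illustrates. The function names are hypothetical, and the 150-base read length is an inference from the reported totals (833,469,900 × 150 = 125,020,485,000), not a value stated in the abstract. The abstract also does not say whether coverage was computed from raw or aligned bases, so the back-calculated reference sizes are rough approximations only.

```python
# Sketch of sequencing-depth arithmetic; names and the 150-base read
# length are assumptions for illustration, not taken from the abstract.

def total_bases(read_count: int, read_length: int) -> int:
    """Total sequenced bases for fixed-length reads."""
    return read_count * read_length


def coverage(bases: int, genome_size: int) -> float:
    """Average depth: sequenced bases divided by reference length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return bases / genome_size


def implied_genome_size(bases: int, reported_coverage: float) -> float:
    """Reference length implied by a reported coverage value."""
    if reported_coverage <= 0:
        raise ValueError("reported_coverage must be positive")
    return bases / reported_coverage


READS = 833_469_900        # raw reads reported in the abstract
BASES = 125_020_485_000    # bases reported in the abstract

if __name__ == "__main__":
    print(total_bases(READS, 150) == BASES)
    for label, depth in (("San Clemente", 48.22), ("Yunnan black", 44.28)):
        size_gb = implied_genome_size(BASES, depth) / 1e9
        print(f"{label}: implied reference length of about {size_gb:.2f} Gb")
```

Dividing by the implied size rather than a published assembly length keeps the sketch independent of any value not given in the abstract.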


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257974
Author(s):  
Mingliang Zhou ◽  
Gaofu Wang ◽  
Minghua Chen ◽  
Qian Pang ◽  
Shihai Jiang ◽  
...  

Sichuan, China, has abundant genetic resources of sheep (Ovis aries). However, their genetic diversity and population structure have been less studied, especially at the genome-wide level. In the present study, we employed the specific-locus amplified fragment sequencing for identifying genome-wide single nucleotide polymorphisms (SNPs) among five breeds of sheep distributed in Sichuan, including three local pure breeds, one composite breed, and one exotic breed of White Suffolk. From 494 million clean paired-end reads, we obtained a total of 327,845 high-quality SNPs that were evenly distributed among all 27 chromosomes, with a transition/transversion ratio of 2.56. Based on this SNP panel, we found that the overall nucleotide diversity was 0.2284 for all five breeds, with the highest and lowest diversity observed in Mage sheep (0.2125) and Butuo Black (0.1963) sheep, respectively. Both Wright’s fixation index and Identity-by-State distance revealed that all individuals of Liangshan Semifine-wool, White Suffolk, and Butuo Black sheep were respectively clustered together, and the breeds could be separated from each other, whereas Jialuo and Mage sheep had the closest genetic relationship and could not be distinguished from each other. In conclusion, we provide a reference panel of genome-wide and high-quality SNPs in five sheep breeds in Sichuan, by which their genetic diversity and population structures were investigated.


2019 ◽  
Author(s):  
Reddaiah Bodanapu ◽  
Sreehari V Vasudevan ◽  
Sivarama Prasad Lekkala ◽  
Navajeet Chakravartty ◽  
Krishna Lalam ◽  
...  

Tomato (Solanum lycopersicum) is grown and consumed worldwide and is used as a model system for new cultivar and fruit development. Genetic and genomic research on Indian tomato cultivars will provide insight for developing new breeding strategies and crop improvement. The present study aimed to identify high-quality common and unique single nucleotide polymorphisms (SNPs) present in 9 different Indian tomato cultivars using double-digest restriction-associated DNA sequencing (ddRAD-seq). A total of 36,847,092 raw reads (3.68 GB) were generated for all samples, and 3,329,625 high-quality reads were aligned uniquely to the reference tomato genome. Using stringent filtering, a total of 1,165 SNPs and 69 INDELs were found in genic regions, and variants unique to each cultivar were also observed. Similarly, 7 and 33 variants were identified in the chloroplast and mitochondrial genomes of tomato, respectively. In addition, the population structure and genetic relationships among these cultivars suggested 4 well-differentiated sub-populations. Functional annotation of SNPs/INDELs and their associated flanking sequences, along with gene ontology and pathway analysis, was performed. The identified SNPs/INDELs could be useful as markers for variety identification and genetic purity analysis. Findings from this work will be useful to plant breeders and the research community to deepen their understanding and enhance tomato breeding programs.


2019 ◽  
Author(s):  
Yinghui Dong ◽  
Qifan Zeng ◽  
Jianfeng Ren ◽  
Hanhan Yao ◽  
Wenbin Ruan ◽  
...  

Background: The Chinese razor clam, Sinonovacula constricta, is one of the commercially important marine bivalves, with a deep-burrowing lifestyle and remarkable adaptability to a broad range of salinity. Despite its economic importance and its status as a representative of the less-understood deep-burrowing bivalve lifestyle, there are few genomic resources for exploring its unique biology and adaptive evolution. Herein, we report a high-quality chromosomal-level reference genome of S. constricta, the first genome of the family Solenidae, along with a large amount of short-read/full-length transcriptomic data of whole-ontogeny developmental stages, all major adult tissues, and gill tissues under salinity challenge. Findings: A total of 101.79 Gb and 129.73 Gb sequencing data were obtained with the PacBio and Illumina platforms, which represented approximately 186.63X genome coverage. In addition, a total of 160.90 Gb and 24.55 Gb clean data were also obtained with the Illumina and PacBio platforms for transcriptomic investigation. A de novo genome assembly of 1,340.13 Mb was generated, with a contig N50 of 689.18 kb. Hi-C scaffolding resulted in 19 chromosomes with a scaffold N50 of 57.99 Mb. The repeat sequences account for 50.71% of the assembled genome. A total of 26,273 protein-coding genes were predicted and 99.5% of them were annotated. Phylogenetic analysis revealed that S. constricta diverged from the lineage of Pteriomorphia approximately 494 million years ago. Notably, the cytoskeletal protein tubulin and motor protein dynein gene families are rapidly expanded in the S. constricta genome and are highly expressed in the mantle and gill, implicating potential genomic bases for the well-developed ciliary system in S. constricta. Conclusions: The high-quality genome assembly and comprehensive transcriptomes generated in this work not only provide highly valuable genomic resources for future studies of S. constricta, but also lay a solid foundation for further investigation into the adaptive mechanisms of benthic burrowing mollusks.
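The contig and scaffold N50 values quoted above are summary statistics over sequence lengths: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The sketch below shows that standard definition in Python. The function name and the toy lengths are illustrative assumptions, and this is not the software the authors used.

```python
# Sketch of the standard N50 definition; the function name and example
# lengths are hypothetical, not taken from the assembly described above.
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running total to half of
        # the assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


if __name__ == "__main__":
    toy_contigs = [2, 3, 4, 5, 6]  # total 20, half 10; 6 + 5 = 11 >= 10
    print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, scaffolding (here, Hi-C) can raise it sharply without adding any new sequence, which is why contig and scaffold N50 are reported separately.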


2021 ◽  
pp. 1-13
Author(s):  
Chuyan Wang ◽  
Jie Yu ◽  
Jun Wang ◽  
Jigang Zhang ◽  
Liuqing Yang ◽  
...  

BACKGROUND: Blueberry is among the fastest-growing fruit crops in the world; it is beneficial to human health and attracts extensive interest. In contrast to its rapid development and utilization, molecular and genetic resources for blueberries are still scarce. OBJECTIVE: In the present report, transcriptomic profiles of four widely cultivated varieties of Rabbiteye and Southern Highbush blueberries were characterized to assist breeding programs. METHODS: Both de novo and reference-based assemblies were conducted to generate genetic resources that can be used in further functional and breeding studies. RESULTS: De novo and reference-based assembly yielded an average of 136,350 and 158,123 non-redundant transcripts, respectively. An average of 57,668 de novo assembled transcripts could be functionally annotated by homology search against different databases. We further detected 6,268 polymorphic simple sequence repeats, 566,913 high-quality single nucleotide polymorphisms and 88,662 insertions and deletions among the four varieties by comparison with a recently released reference genome of blueberry. Differential expression analysis showed smaller differences between varieties of the same species and larger differences between species. CONCLUSIONS: These comprehensive and high-quality genetic resources will contribute to a wide range of genetics and molecular breeding studies in blueberries.


Genome ◽  
2020 ◽  
Vol 63 (12) ◽  
pp. 607-613
Author(s):  
Joanne A. Labate ◽  
Jeffrey C. Glaubitz ◽  
Michael J. Havey

Onion (Allium cepa) is not highly tractable for development of molecular markers due to its large (16 gigabases per 1C) nuclear genome. Single nucleotide polymorphisms (SNPs) are useful for genetic characterization and marker-aided selection of onion because of codominance and common occurrence in elite germplasm. We completed genotyping by sequencing (GBS) to identify SNPs in onion using 46 F2 plants, parents of the F2 plants (Ailsa Craig 43 and Brigham Yellow Globe 15-23), two doubled haploid (DH) lines (DH2107 and DH2110), and plants from 94 accessions in the USDA National Plant Germplasm System (NPGS). SNPs were called using the TASSEL 3.0 Universal Network Enabled Analysis (UNEAK) bioinformatics pipeline. Sequences from the F2 and DH plants were used to construct a pseudo-reference genome against which genotypes from all accessions were scored. Quality filters were used to identify a set of 284 high quality SNPs, which were placed onto an existing genetic map for the F2 family. Accessions showed a moderate level of diversity (mean He = 0.341) and evidence of inbreeding (mean F = 0.592). GBS is promising for SNP discovery in onion, although lack of a reference genome required extensive custom scripts for bioinformatics analyses to identify high quality markers.
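The diversity (He) and inbreeding (F) statistics reported above can be illustrated for a single biallelic SNP with the textbook formulas He = 2pq and F = 1 − Ho/He. The following Python sketch assumes genotype counts as input. The function name and example counts are hypothetical, and this is not the TASSEL/UNEAK implementation or the authors' custom scripts. Reported means are averaged over many loci, so they cannot be reproduced from these per-locus formulas alone.

```python
# Per-locus sketch of expected heterozygosity and the inbreeding
# coefficient; names and counts are hypothetical illustrations.
from typing import Tuple


def locus_diversity(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Return (He, Ho, F) for one biallelic SNP from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    he = 2 * p * (1 - p)              # expected heterozygosity, 2pq
    ho = n_ab / n                     # observed heterozygosity
    if he == 0:
        raise ValueError("monomorphic locus: F is undefined")
    f = 1 - ho / he                   # heterozygote deficit relative to 2pq
    return he, ho, f


if __name__ == "__main__":
    he, ho, f = locus_diversity(30, 40, 30)
    print(f"He={he:.3f} Ho={ho:.3f} F={f:.3f}")
```

A positive F indicates fewer heterozygotes than expected under random mating, which is the sense in which the abstract reads its mean F as evidence of inbreeding.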


2014 ◽  
Vol 2014 ◽  
pp. 1-4 ◽  
Author(s):  
Mingming Liu ◽  
Zach N. Adelman ◽  
Kevin M. Myles ◽  
Liqing Zhang

With the rapid development of high throughput sequencing technologies, new transcriptomes can be sequenced for little cost with high coverage. Sequence assembly approaches have been modified to meet the requirements for de novo transcriptomes, which have complications not found in traditional genome assemblies such as variation in coverage for each candidate mRNA and alternative splicing. As a consequence, de novo assembly strategies tend to generate a large number of redundant contigs due to sequence variations, which adversely affects downstream analysis and experiments. In this work we proposed TransPS, a transcriptome post-scaffolding method, to generate high quality, nonredundant de novo transcriptomes. TransPS shows promising results on the test transcriptome datasets, where redundancy is greatly reduced by more than 50% and, at the same time, coverage is improved considerably. The web server and source code are available.


2021 ◽  
Author(s):  
José Luis Spinoso-Castillo ◽  
Tarsicio Corona-Torres ◽  
Esteban Escamilla-Prado ◽  
Victorino Morales-Ramos ◽  
Víctor Heber Aguilar-Rincón ◽  
...  

Coffea arabica L. produces a high-quality beverage, with pleasant aroma and flavor, but diseases, pests and abiotic stresses often affect its yield. Therefore, improving important agronomic traits of this commercial species remains a target for most coffee improvement programs. With advances in genomic and sequencing technology, it is feasible to understand the coffee genome and the molecular inheritance underlying coffee traits, thereby helping improve the efficiency of breeding programs. Thanks to the rapid development of genomic resources and the publication of the C. canephora reference genome, third-generation markers based on single-nucleotide polymorphisms (SNPs) have gradually been identified and assayed in Coffea, particularly in C. arabica. However, high-throughput genotyping assays are still needed in order to rapidly characterize coffee genetic diversity and to evaluate the introgression of different cultivars in a cost-effective way. The DArTseq™ platform, developed by Diversity Arrays Technology, is one such approach and has attracted increasing interest worldwide since it is able to generate thousands of high quality SNPs in a timely and cost-effective manner. These validated SNP markers will be useful to molecular genetics and for innovative approaches in coffee breeding.


2020 ◽  
Author(s):  
Yanyan Wu ◽  
Qinglan Tian ◽  
Weihua Huang ◽  
Jieyun Liu ◽  
Xiuzhong Xia ◽  
...  

Information on the Passiflora genome is still very limited. This study aimed to understand the evolutionary relationships between different species of Passiflora and to develop a large number of SSR markers to provide a basis for the genetic improvement of Passiflora. Applying restriction site associated DNA sequencing (RAD-Seq) technology, we studied the phylogeny, simple sequence repeats (SSRs) and marker transferability of 10 accessions of 6 species of Passiflora. Taking the partial assembly sequence of accession P4 as the reference genome, we constructed a phylogenetic tree using the 46,451 high-quality single nucleotide polymorphisms (SNPs) detected, showing that P6, P7, P8 and P9 each formed a single branch while P5 and P10 were clustered together, and P1, P2, P3 and P4 were closer in genetic relationship. Using P8 as the reference genome, a total of 12,452 high-quality SNPs were used to construct a phylogenetic tree. P3, P4, P7, P8, P9 and P10 each formed a single branch while P1 and P2 were clustered together, and P5 and P6 were clustered into one branch. A principal component analysis (PCA) revealed a similar population structure, with four cultivated passion fruits forming a tight cluster. A total of 2,614 SSRs were identified in the genomes of the 10 Passiflora accessions. The core motifs (AT, GA, AAG, etc.) were 2-6 bases long with 4-16 repeats, and 2,515 pairs of SSR primers were successfully developed. SSR transferability was best among the cultivated passion fruits. These results will contribute to the study of genomics and molecular genetics in passion fruit.

