High-quality assembly of sweet basil genome

2018 ◽  
Author(s):  
Nativ Dudai ◽  
Marie-Jeanne Carp ◽  
Renana Milavski ◽  
David Chaimovitsh ◽  
Alona Shachter ◽  
...  

Abstract Sweet basil, sometimes called the King of Herbs, is well known for its culinary uses, especially in the Italian sauce ‘Pesto’. It is also used in traditional medicine, as a source of essential oils and as an ornamental plant. So far, basil has been bred by classical methods because no reference genome was available to support the application of up-to-date sequencing techniques. Here, we report the first complete genome of sweet basil, from the cultivar ‘Perrie’, a fresh-cut Genovese-type basil, obtained using several next-generation sequencing platforms followed by genome assembly with NRGENE’s DeNovoMAGIC assembly tool. We determined that the genome size of sweet basil is 2.13 Gbp and assembled it into 12,212 scaffolds. The high quality of the assembly is reflected in the fact that more than 90% of the assembly size is contained in only 107 scaffolds. An independent analysis of single-copy orthologous genes showed 93% completeness and also revealed that 74% of them were duplicated, indicating that sweet basil is a tetraploid organism. A reference genome of sweet basil will enable the development of precise molecular markers for agriculturally important traits such as disease resistance and tolerance to various environmental conditions. It will also provide a better understanding of the mechanisms underlying metabolic processes such as aroma production and pigment accumulation. Finally, it will save time and money for basil breeders and scientists and ensure higher throughput and robustness in future studies.
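The statement that more than 90% of the assembly sits in 107 scaffolds is a contiguity statistic of the Nx/Lx family (Lx is the number of longest scaffolds needed to reach a fraction x of the total length, Nx the length of the shortest scaffold in that set). A minimal sketch of that calculation follows; the function name `nx_lx` and the toy scaffold lengths are illustrative assumptions, not data or code from the study.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], fraction: float) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of scaffold lengths.

    Lx is the smallest number of longest scaffolds whose combined length
    reaches `fraction` of the total assembly length; Nx is the length of
    the shortest scaffold in that set.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # Only reachable through floating-point rounding at fraction == 1.
    return ordered[-1], len(ordered)


# Hypothetical toy assembly (lengths in arbitrary units), not the basil data.
toy_lengths = [50, 30, 12, 5, 2, 1]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
```

Applied to a real list of scaffold lengths, `nx_lx(lengths, 0.9)` would return the L90 count that the abstract reports as 107.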

Author(s):  
Yuanchao Liu ◽  
Longhua Huang ◽  
Huiping Hu ◽  
Manjun Cai ◽  
Xiaowei Liang ◽  
...  

Abstract Ganoderma leucocontextum, a newly discovered species of Ganodermataceae in China, has diverse pharmacological activities. G. leucocontextum is widely cultivated in southwest China, but systematic genetic study has been impeded by the lack of a reference genome. Herein, we present the first whole-genome assembly of G. leucocontextum, based on the Illumina and Nanopore platforms and using high-quality DNA extracted from a monokaryon strain (DH-8). The generated genome was 50.05 Mb in size with an N50 scaffold size of 3.06 Mb, 78,206 coding sequences and 13,390 putative genes. Genome completeness was assessed using the Benchmarking Universal Single-Copy Orthologs (BUSCO) tool, which identified 96.55% of the 280 Fungi BUSCO genes. Furthermore, differences in functional genes of secondary metabolites (terpenoids) were analyzed between G. leucocontextum and G. lucidum. G. leucocontextum has more genes related to terpenoid synthesis than G. lucidum, which may be one of the reasons why the two species exhibit different biological activities. This is the first genome assembly and annotation for G. leucocontextum, and it enriches the toolbox for biological and genetic studies of this species.


2021 ◽  
Author(s):  
Vipin K. Menon ◽  
Pablo C. Okhuysen ◽  
Cynthia Chappell ◽  
Medhat Mahmoud ◽  
Qingchang Meng ◽  
...  

Background: Cryptosporidium parvum is an apicomplexan parasite commonly found across many species, with a global infection prevalence of 7.6%. It is therefore important to understand the diversity and genomic makeup of this prevalent parasite in order to prevent further spread and to fight infection. The basis of every genomic study is a high-quality reference genome that is contiguous and complete and thus enables comprehensive comparative studies. Findings: Here we provide a highly accurate and complete reference genome of Cryptosporidium spp. The assembly is based on Oxford Nanopore reads and was improved using Illumina reads for error correction. The assembly encompasses 8 chromosomes and includes 13 resolved telomeres. Overall, the assembly shows a high completion rate, with 98.4% single-copy BUSCO genes, which is consistent with the identification of 13 telomeric regions across the 8 chromosomes. The consensus accuracy of the established reference genome was further validated by sequence alignment of established genetic markers for C. parvum. Conclusions: This high-quality reference genome provides the basis for subsequent studies and comparative genomic studies across the Cryptosporidium clade.


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9114 ◽  
Author(s):  
Jiawei Wang ◽  
Weizhen Liu ◽  
Dongzi Zhu ◽  
Xiang Zhou ◽  
Po Hong ◽  
...  

The sweet cherry (Prunus avium) is one of the most economically important fruit species in the world. However, there is a limited amount of genetic information available for this species, which hinders breeding efforts at a molecular level. We were able to describe a high-quality reference genome assembly and annotation of the diploid sweet cherry (2n = 2x = 16) cv. Tieton using linked-read sequencing technology. We generated over 750 million clean reads, representing 112.63 GB of raw sequencing data. The Supernova assembler produced a more highly-ordered and continuous genome sequence than the current P. avium draft genome, with a contig N50 of 63.65 KB and a scaffold N50 of 2.48 MB. The final scaffold assembly was 280.33 MB in length, representing 82.12% of the estimated Tieton genome. Eight chromosome-scale pseudomolecules were constructed, completing a 214 MB sequence of the final scaffold assembly. De novo, homology-based, and RNA-seq methods were used together to predict 30,975 protein-coding loci. 98.39% of core eukaryotic genes and 97.43% of single copy orthologues were identified in the embryo plant, indicating the completeness of the assembly. Linked-read sequencing technology was effective in constructing a high-quality reference genome of the sweet cherry, which will benefit the molecular breeding and cultivar identification in this species.
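The percentages in this abstract imply two figures it does not state directly: the estimated genome size behind the 82.12% value, and the share of the scaffold assembly placed in pseudomolecules. A short sketch of that arithmetic follows, using only the numbers quoted above; the function names are illustrative assumptions. It gives an implied genome estimate of about 341 MB and an anchored share of about 76%.

```python
def implied_genome_size(assembly_mb: float, fraction_covered: float) -> float:
    """Genome size implied by an assembly length and the fraction it represents."""
    if not 0 < fraction_covered <= 1:
        raise ValueError("fraction_covered must be in (0, 1]")
    return assembly_mb / fraction_covered


def anchored_fraction(anchored_mb: float, assembly_mb: float) -> float:
    """Share of the scaffold assembly placed in chromosome-scale pseudomolecules."""
    return anchored_mb / assembly_mb


# Values quoted in the abstract: 280.33 MB assembly, 82.12% of the estimated
# genome, 214 MB in eight pseudomolecules.
estimated_mb = implied_genome_size(280.33, 0.8212)
anchored = anchored_fraction(214.0, 280.33)
print(f"implied genome size: {estimated_mb:.1f} MB")
print(f"anchored fraction:   {anchored:.1%}")
```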


2021 ◽  
Author(s):  
Anurag Priyam ◽  
Alicja Witwicka ◽  
Anindita Brahma ◽  
Eckart Stolle ◽  
Yannick Wurm

Long-molecule sequencing is now routinely applied to generate high-quality reference genome assemblies. However, datasets differ in repeat composition, heterozygosity, read lengths and error profiles. The assembly parameters that provide the best results could thus differ across datasets. By integrating four complementary and biologically meaningful metrics, we show that simple fine-tuning of assembly parameters can substantially improve the quality of long-read genome assemblies. In particular, modifying estimates of sequencing error rates improves some metrics more than two-fold. We provide a flexible software tool, CompareGenomeQualities, that automates comparisons of assembly quality for researchers who want a straightforward mechanism for choosing among multiple assemblies.


2020 ◽  
Author(s):  
C. Molitor ◽  
T.J. Kurowski ◽  
P.M. Fidalgo de Almeida ◽  
P. Eerolla ◽  
D.J. Spindlow ◽  
...  

Abstract Solanum sitiens is a self-incompatible wild relative of tomato, characterised by salt and drought resistance traits, with the potential to contribute to crop improvement in cultivated tomato. This species has a distinct morphology, classification and ecotype compared to other stress-resistant wild tomato relatives such as S. pennellii and S. chilense. Therefore, the availability of a high-quality reference genome for S. sitiens will facilitate the genetic and molecular understanding of salt and drought resistance. Here, we present a de novo genome and transcriptome assembly for S. sitiens (Accession LA1974). A hybrid assembly strategy was followed using Illumina short reads (∼159X coverage) and PacBio long reads (∼44X coverage), generating a total of ∼262 Gbp of DNA sequence; in addition, ∼2,670 Gbp of BioNano data was obtained. A reference genome of 1,245 Mbp, arranged in 1,481 scaffolds with an N50 of 1,826 Mbp was generated. Genome completeness was estimated at 95% using the Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Tool (KAT); this is within the range of current high-quality reference genomes for other tomato wild relatives. Additionally, we identified three large inversions compared to S. lycopersicum, containing several drought-resistance-related genes, such as beta-amylase 1 and YUCCA7. In addition, ∼63 Gbp of RNA-Seq data were generated to support the prediction of 31,164 genes from the assembly and to perform a de novo transcriptome assembly. Some of the protein clusters unique to S. sitiens were associated with genes involved in drought and salt resistance, including GLO1 and FQR1. This first reference genome for S. sitiens will provide a valuable resource to progress QTL studies to the gene level, and will assist molecular breeding to improve crop production in water-limited environments.


2020 ◽  
Author(s):  
Taisei Kikuchi ◽  
Mehmet Dayi ◽  
Vicky L. Hunt ◽  
Atsushi Toyoda ◽  
Yasunobu Maeda ◽  
...  

Abstract Background: The cryptic parasite Sparganum proliferum proliferates in humans and invades tissues and organs. Only scattered cases have been reported, but S. proliferum infection is always fatal. However, the S. proliferum phylogeny and lifecycle are still an enigma. Results: To investigate the phylogenetic relationships between S. proliferum and other cestode species, and to examine the underlying mechanisms of pathogenicity, we sequenced the entire S. proliferum genome. Additionally, S. proliferum plerocercoid larvae transcriptome analyses were performed to identify genes involved in asexual reproduction in the host. The genome sequences confirmed that the S. proliferum genetic sequence is distinct from that of the closely related Spirometra erinaceieuropaei. Moreover, nonordinal extracellular matrix coordination allows for asexual reproduction in the host, and loss of sexual maturity in S. proliferum is related to its fatal pathogenicity in humans. Conclusions: The high-quality reference genome sequences generated should prove valuable for future studies of pseudophyllidean tapeworm biology and parasitism.


Author(s):  
Maximilian Driller ◽  
Sibelle Torres Vilaça ◽  
Larissa Souza Arantes ◽  
Tomás Carrasco-Valenzuela ◽  
Felix Heeger ◽  
...  

Abstract Reduced representation libraries present an opportunity to perform large-scale studies on non-model species without the need for a reference genome. Methods that use restriction enzymes and fragment size selection to help obtain the desired number of loci, such as ddRAD, are highly flexible and therefore suitable for different types of studies. However, a number of technical issues cannot be addressed without a reference genome, such as size selection reproducibility across samples and coverage across fragment lengths. Moreover, identity thresholds are usually chosen arbitrarily in order to maximize the number of SNPs. We have developed a strategy to identify de novo a set of reduced-representation single-copy orthologs (R2SCOs). Our approach is based on overlapping reads that recreate the original fragments and add information about coverage per fragment size. A further in silico digestion step limits the data to well-covered fragment sizes, increasing the chance of covering the majority of loci across different individuals. By using full sequences as putative alleles, we estimate optimal identity thresholds from pairwise comparisons. We have demonstrated our full workflow with data from five sea turtle species. Locus numbers were similar across all species, even at increasing phylogenetic distances. Our results indicated that sea turtles have, in general, very low levels of heterozygosity. Our approach produced a high-quality set of reference loci, eliminating a series of biological and experimental biases that can strongly affect downstream analysis, and allowed us to explore the genetic variability within and across sea turtle species.
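The in silico digestion and size-selection step can be pictured with a generic double-digest sketch: cut a sequence at two recognition sites, then keep only fragments flanked by one site of each enzyme and falling within a size window. This is not the authors' workflow. The function name, the choice of the EcoRI (GAATTC) and MspI (CCGG) recognition sequences, the toy sequence and the simplification of cutting at the first base of each site are all assumptions made for illustration.

```python
import re
from typing import List, Tuple


def double_digest(
    sequence: str, site_a: str, site_b: str, min_len: int, max_len: int
) -> List[Tuple[int, int]]:
    """Return (start, end) coordinates of size-selected double-digest fragments.

    Simplification: each enzyme is assumed to cut at the first base of its
    recognition site; real enzymes cut at a fixed offset inside the site.
    """
    cuts = []
    for label, site in (("A", site_a), ("B", site_b)):
        # The lookahead finds every occurrence, including overlapping ones.
        for match in re.finditer(f"(?={re.escape(site)})", sequence):
            cuts.append((match.start(), label))
    cuts.sort()

    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        # Keep fragments with different enzymes at each end and a length
        # inside the size-selection window.
        if left != right and min_len <= end - start <= max_len:
            fragments.append((start, end))
    return fragments


# Hypothetical toy sequence containing two sites for each enzyme.
toy = "AAAA" + "GAATTC" + "T" * 10 + "CCGG" + "TTT" + "CCGG" + "A" * 20 + "GAATTC" + "AA"
selected = double_digest(toy, "GAATTC", "CCGG", min_len=10, max_len=20)
```

Narrowing or widening `min_len` and `max_len` changes how many loci survive, which is the trade-off the size-selection step controls.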


Author(s):  
Timothy Camenzind ◽  
Asser Elsayed ◽  
Fahd Mohiyaddin ◽  
Ruoyu Li ◽  
Stefan Kubicek ◽  
...  

Abstract The quality of the semiconductor-barrier interface plays a pivotal role in the demonstration of high-quality, reproducible quantum dots for quantum information processing. In this work, we have measured SiMOSFET Hall bars on undoped Si substrates in order to investigate the device quality. For devices fabricated in a full CMOS process with a very thin oxide, below 10 nm in thickness, we report a record mobility of 17.5 × 10³ cm²/Vs, indicating a high-quality interface suitable for future qubit applications. We also study the influence of gate materials on the mobilities and discuss the underlying mechanisms, giving insight into further material optimization for large-scale quantum processors.


2021 ◽  
Author(s):  
Yikun Zhao ◽  
Yuancong Wang ◽  
De Ma ◽  
Guang Feng ◽  
Yongxue Huo ◽  
...  

Abstract The maize cultivar Dan340 is an excellent backbone inbred line of the Luda Red Cob Group with several desirable traits, such as disease resistance, lodging resistance, high combining ability and wide adaptability. In this study, we constructed a high-quality chromosome-level reference genome for Dan340 by combining PacBio HiFi long reads, Illumina short reads and chromosome conformation capture (Hi-C) sequencing reads. The final assembly of the Dan340 genome was 2,348.72 Mb, including 2,738 contigs and 2,315 scaffolds with N50 values of 41.49 Mb and 215.35 Mb, respectively. Up to 97.48% of high-quality Illumina reads mapped to the reference genome. The assembly of this genome will not only facilitate our understanding of intraspecific genome diversity in maize, but also provide a novel resource for maize breeding. Key findings: The results showed that 96.84% of the plant single-copy orthologues were complete. Complete single-copy and multi-copy genes accounted for 87.36% and 9.48% of the genes, respectively. Taken together, these results indicated that our Dan340 genome assembly presented high quality and coverage.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Paul Saary ◽  
Alex L. Mitchell ◽  
Robert D. Finn

Abstract Microbial eukaryotes constitute a significant fraction of biodiversity and have recently gained more attention, but the recovery of high-quality metagenomic assembled eukaryotic genomes is limited by the current availability of tools. To help address this, we have developed EukCC, a tool for estimating the quality of eukaryotic genomes based on the automated dynamic selection of single copy marker gene sets. We demonstrate that our method outperforms current genome quality estimators, particularly for estimating contamination, and have applied EukCC to datasets derived from two different environments to enable the identification of novel eukaryote genomes, including one from the human skin.
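The general idea behind marker-based quality estimates can be sketched in a few lines: count how many expected single-copy marker genes are found at least once (completeness) and how many are found more than once (a rough proxy for contamination or, in polyploids, for duplication). This is a simplified illustration, not EukCC's algorithm, which additionally selects the marker set dynamically. The function name and marker identifiers below are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def marker_summary(
    expected_markers: Iterable[str], observed_hits: Iterable[str]
) -> Dict[str, float]:
    """Summarise single-copy marker hits as percentages of the expected set.

    completeness: share of expected markers seen at least once.
    duplicated:   share of expected markers seen more than once.
    """
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    # Hits to markers outside the expected set are ignored.
    counts = Counter(hit for hit in observed_hits if hit in expected)
    present = sum(1 for marker in expected if counts[marker] >= 1)
    duplicated = sum(1 for marker in expected if counts[marker] > 1)
    total = len(expected)
    return {
        "completeness": 100 * present / total,
        "duplicated": 100 * duplicated / total,
    }


# Hypothetical marker set and hit list, for illustration only.
markers = [f"m{i}" for i in range(1, 11)]
hits = [f"m{i}" for i in range(1, 10)] + ["m1", "m2"]
summary = marker_summary(markers, hits)
```

A high duplicated share is ambiguous on its own: in a metagenomic bin it suggests contamination, whereas in a single-organism assembly it can reflect genome duplication.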

