A high-quality fungal genome assembly resolved from a sample accidentally contaminated by multiple taxa

BioTechniques ◽  
2021 ◽  
Author(s):  
Janneke Aylward ◽  
Michael J Wingfield ◽  
Francois Roets ◽  
Brenda D Wingfield

Contamination in sequenced genomes is a relatively common problem and several methods to remove non-target sequences have been devised. Typically, the target and contaminating organisms reside in different kingdoms, simplifying their separation. The authors present the case of a genome for the ascomycete fungus Teratosphaeria eucalypti, contaminated by another ascomycete fungus and a bacterium. Approaching the problem as a low-complexity metagenomics project, the authors used two available software programs, BlobToolKit and anvi'o, to filter the contaminated genome. Both the de novo and reference-assisted approaches yielded a high-quality draft genome assembly for the target fungus. Incorporating reference sequences increased assembly completeness and visualization elucidated previously unknown genome features. The authors suggest that visualization should be routine in any sequencing project, regardless of suspected contamination.

Author(s):  
Luis J Chueca ◽  
Tilman Schell ◽  
Markus Pfenninger

Abstract Among all molluscs, land snails are a scientifically and economically interesting group comprising edible species, alien species and agricultural pests. Yet, despite their high diversity, the number of genome drafts publicly available is still scarce. Here, we present the draft genome assembly of the land snail Candidula unifasciata, a species widely distributed across central Europe and belonging to the Geomitridae family, a highly diversified taxon in the Western-Palearctic region. We performed whole genome sequencing, assembly and annotation of an adult specimen based on PacBio and Oxford Nanopore long read sequences as well as Illumina data. A genome draft of about 1.29 Gb was generated with an N50 length of 246 kb. More than 60% of the assembled genome was identified as repetitive elements. 22,464 protein-coding genes were identified in the genome, of which 62.27% were functionally annotated. This is the first assembled and annotated genome for a geomitrid snail and will serve as reference for further evolutionary, genomic and population genetic studies of this important and interesting group.


2021 ◽  
Author(s):  
Luis J. Chueca ◽  
Tilman Schell ◽  
Markus Pfenninger

Abstract Among all molluscs, land snails are an economically and scientifically interesting group comprising edible species, alien species and agricultural pests. Yet, despite their high diversity, the number of whole genomes publicly available is still scarce. Here, we present the draft genome assembly of the land snail Candidula unifasciata, a species widely distributed across central Europe, which belongs to the Geomitridae family, a group highly diversified in the Western-Palearctic region. We performed whole genome sequencing, assembly and annotation of an adult specimen based on PacBio and Oxford Nanopore long read sequences as well as Illumina data. A genome of about 1.29 Gb was generated with an N50 length of 246 kb. More than 60% of the assembled genome was identified as repetitive elements, and 22,464 protein-coding genes were identified in the genome, of which 62.27% were functionally annotated. This is the first assembled and annotated genome for a geomitrid snail and will serve as reference for further evolutionary, genomic and population genetic studies of this important and interesting group.


2021 ◽  
Author(s):  
Andre Machado ◽  
Andre Gomes-dos-Santos ◽  
Miguel Fonseca ◽  
Rute da Fonseca ◽  
Ana Verissimo ◽  
...  

The Atlantic chub mackerel, Scomber colias Gmelin, 1789, is a medium-size pelagic fish with substantial importance in the fisheries of the Atlantic Ocean and the Mediterranean Sea. Over the past decade, this species has gained special relevance as one of the main targets of pelagic fisheries in the NE Atlantic. Here, we sequenced and annotated the first high-quality draft genome assembly of S. colias, produced with PacBio HiFi long reads and Illumina paired-end short reads. The estimated genome size is 814 Mb distributed into 2,028 scaffolds and 2,093 contigs with an N50 length of 4.19 and 3.34 Mb, respectively. We annotated 27,675 protein-coding genes and the BUSCO analyses indicated high completeness, with 97.3% of the single-copy orthologs in the Actinopterygii library profile. The present genome assembly represents a valuable resource to address the biology and management of this relevant fishery. Finally, this is the fourth high-quality genome assembly within the Order Scombriformes and the first in the genus Scomber.


2021 ◽  
Author(s):  
Xinxin Yi ◽  
Jing Liu ◽  
Shengcai Chen ◽  
Hao Wu ◽  
Min Liu ◽  
...  

Cultivated soybean (Glycine max) is an important source of protein and oil. Many elite cultivars with different traits have been developed for different conditions. Each soybean strain has its own genetic diversity, and the availability of more high-quality soybean genomes can enhance comparative genomic analysis for identifying the genetic underpinnings of its unique traits. In this study, we constructed a high-quality de novo assembly of an elite soybean cultivar, Jidou 17 (JD17), with chromosome contiguity and high accuracy. We annotated 52,840 gene models and reconstructed 74,054 high-quality full-length transcripts. We performed a genome-wide comparative analysis based on the reference genome of JD17 with three published soybeans (WM82, ZH13 and W05), which identified five large inversions and two large translocations specific to JD17, 20,984 - 46,912 PAVs spanning 13.1 - 46.9 Mb in size, and 5 - 53 large PAV clusters larger than 500 kb. Between JD17 and these three genomes, 1,695,741 - 3,664,629 SNPs and 446,689 - 800,489 indels were identified and annotated. Symbiotic nitrogen fixation (SNF) genes were identified and the effects of these variants were further evaluated. It was found that the coding sequences of 9 nitrogen fixation-related genes were greatly affected. The high-quality genome assembly of JD17 can serve as a valuable reference for soybean functional genomics research.


2020 ◽  
Vol 12 (2) ◽  
pp. 3917-3925
Author(s):  
Greer A Dolby ◽  
Matheo Morales ◽  
Timothy H Webster ◽  
Dale F DeNardo ◽  
Melissa A Wilson ◽  
...  

Abstract Toll-like receptors (TLRs) are a complex family of innate immune genes that are well characterized in mammals and birds but less well understood in nonavian sauropsids (reptiles). The advent of highly contiguous draft genomes of nonmodel organisms enables study of such gene families through analysis of synteny and sequence identity. Here, we analyze TLR genes from the genomes of 22 tetrapod species. Findings reveal a TLR8 gene expansion in crocodilians and turtles (TLR8B), and a second duplication (TLR8C) specifically within turtles, followed by pseudogenization of that gene in the nonfreshwater species (desert tortoise and green sea turtle). Additionally, the Mojave desert tortoise (Gopherus agassizii) has a stop codon in TLR8B (TLR8-1) that is polymorphic among conspecifics. Revised orthology further reveals a new TLR homolog, TLR21-like, which is exclusive to lizards, snakes, turtles, and crocodilians. These analyses were made possible by a new draft genome assembly of the desert tortoise (gopAga2.0), which used chromatin-based assembly to yield draft chromosomal scaffolds (L50 = 26 scaffolds, N50 = 28.36 Mb, longest scaffold = 107 Mb) and an enhanced de novo genome annotation with 25,469 genes. Our three-step approach to orthology curation and comparative analysis of TLR genes shows what new insights are possible using genome assemblies with chromosome-scale scaffolds that permit integration of synteny conservation data.
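Several abstracts in this listing report N50 and L50 as contiguity statistics. As a minimal sketch of the standard definitions (N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly; L50 is the number of sequences in that set), the calculation can be illustrated as follows. The function name `n50_l50` and the example lengths are hypothetical and are not taken from any of the cited studies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    # Walk from the longest sequence downward, accumulating length
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running sum to half
        # of the assembly defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Hypothetical toy assembly: nine sequences totalling 54 units
n50, l50 = n50_l50([2, 3, 4, 5, 6, 7, 8, 9, 10])
print(n50, l50)
```

Read this way, the reported "L50 = 26 scaffolds, N50 = 28.36 Mb" means the 26 longest scaffolds account for at least half of the assembly, and the shortest of those 26 is 28.36 Mb long.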


2016 ◽  
Vol 4 (6) ◽  
Author(s):  
Mohammad H. A. Ibrahim ◽  
Brady F. Cress ◽  
Robert J. Linhardt ◽  
Mattheos A. G. Koffas ◽  
Richard A. Gross

We report here the 4.092-Mb high-quality draft genome assembly of a newly isolated poly-γ-glutamic acid–producing strain, Bacillus subtilis Ia1a. The genome sequence is considered a critical tool to facilitate the engineering of improved production strains. Exopolysaccharides and many industrially important enzymes can be produced by this new strain utilizing different carbon sources.


2020 ◽  
Vol 12 (7) ◽  
pp. 1074-1079 ◽  
Author(s):  
Ruihao Shu ◽  
Jihong Zhang ◽  
Qian Meng ◽  
Huan Zhang ◽  
Guiling Zhou ◽  
...  

Abstract Ophiocordyceps sinensis (Berk.) is an entomopathogenic fungus endemic to the Qinghai-Tibet Plateau. It parasitizes and mummifies the underground ghost moth larvae, then produces a fruiting body. The fungus-insect complex, called Chinese cordyceps or “DongChongXiaCao,” is not only a valuable traditional Chinese medicine, but also a major source of income for numerous Himalayan residents. Here, taking advantage of rapid advances in single-molecule sequencing, we generated a highly contiguous genome assembly of O. sinensis. The assembly of 23 contigs was ∼110.8 Mb with an N50 length of 18.2 Mb. We used RNA-seq and homologous protein sequences to identify 8,916 protein-coding genes in the IOZ07 assembly. Moreover, 63 secondary metabolite gene clusters were identified in the improved assembly. The improved assembly and genome features described in this study will further inform the evolutionary study and resource utilization of Chinese cordyceps.


Author(s):  
Xinhai Ye ◽  
Yi Yang ◽  
Zhaoyang Tian ◽  
Le Xu ◽  
Kaili Yu ◽  
...  

Abstract Sequencing and assembling a genome from a single individual has several advantages, such as lower heterozygosity and easier sample preparation. However, the amount of genomic DNA from some small-sized organisms might not meet the standard DNA input requirement of current sequencing pipelines. Although a few studies have sequenced a single small insect with about 100 ng DNA as input, it may still be challenging to obtain such an amount of DNA from a single individual of many small organisms. Here, we use 20 ng DNA as input, and present a high-quality genome assembly for a single haploid male parasitoid wasp (Habrobracon hebetor) using Nanopore and Illumina. Because of the low input DNA, a whole genome amplification (WGA) method is used before sequencing. The assembled genome size is 131.6 Mb with a contig N50 of 1.63 Mb. A total of 99% of Benchmarking Universal Single-Copy Orthologs are detected, suggesting a high level of completeness of the genome assembly. Genome comparison between H. hebetor and its relative Bracon brevicornis shows high-level genome synteny, indicating that the genome of H. hebetor is highly accurate and contiguous. Our study provides an example of de novo assembly of a genome from ultra-low input DNA, and will be useful for sequencing projects of small-sized species and rare samples, haploid genomics as well as population genetics of small-sized species.


2019 ◽  
Author(s):  
Haley Wight ◽  
Junhui Zhou ◽  
Muzi Li ◽  
Sridhar Hannenhalli ◽  
Stephen M. Mount ◽  
...  

Abstract The red raspberry, Rubus idaeus, is widely distributed in all temperate regions of Europe, Asia, and North America and is a major commercial fruit valued for its taste and its high antioxidant and vitamin content. However, Rubus breeding is a long and slow process hampered by limited genomic and molecular resources. Genomic resources such as a complete genome sequence and transcriptome will be of exceptional value to improve research and breeding of this high-value crop. Using a hybrid sequence assembly approach including data from both long and short sequence reads, we present the first assembly of the Rubus idaeus genome (Joan J. variety). The de novo assembled genome consists of 2,145 scaffolds with a genome completeness of 95.3% and an N50 score of 638 kb. Leveraging a linkage map, we anchored 80.1% of the genome onto seven chromosomes. Using over 1 billion paired-end RNAseq reads, we annotated 35,566 protein-coding genes with a transcriptome completeness score of 97.2%. The Rubus idaeus genome provides an important new resource for researchers and breeders.


Gigabyte ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Julia Voelker ◽  
Mervyn Shepherd ◽  
Ramil Mauleon

The economically important Melaleuca alternifolia (tea tree) is the source of a terpene-rich essential oil with therapeutic and cosmetic uses around the world. Tea tree has been cultivated and bred in Australia since the 1990s. It has been extensively studied for the genetics and biochemistry of terpene biosynthesis. Here, we report a high-quality de novo genome assembly using Pacific Biosciences and Illumina sequencing. The genome was assembled into 3128 scaffolds with a total length of 362 Mb (N50 = 1.9 Mb), with significantly higher contiguity than a previous assembly (N50 = 8.7 kb). Using a homology-based, RNA-seq evidence-based and ab initio prediction approach, 37,226 protein-coding genes were predicted. Genome assembly and annotation exhibited high completeness scores of 98.1% and 89.4%, respectively. Sequence contiguity was sufficient to reveal extensive gene order conservation and chromosomal rearrangements in alignments with Eucalyptus grandis and Corymbia citriodora genomes. This new genome advances currently available resources to investigate the genome structure and gene family evolution of M. alternifolia. It will enable further comparative genomic studies in Myrtaceae to elucidate the genetic foundations of economically valuable traits in this crop.

