A high-quality genome assembly from short and long reads for the non-biting midge Chironomus riparius (Diptera)

2019 ◽  
Author(s):  
Hanno Schmidt ◽  
Ann-Marie Waldvogel ◽  
Sören Lukas Hellmann ◽  
Barbara Feldmeyer ◽  
Thomas Hankeln ◽  
...  

Background: Chironomus riparius is of great importance as a study species in various fields like ecotoxicology, molecular genetics, developmental biology and ecology. However, only a fragmented draft genome exists to date, hindering the recent rush of population genomic studies in this species. Findings: Making use of 50 NGS datasets, we present a hybrid genome assembly from short and long sequence reads that makes C. riparius’ genome one of the most contiguous Dipteran genomes published, the first complete mitochondrial genome of the species, and the respective recombination rate as one of the first recombination rates for any insect. Conclusions: The genome and associated resources will be highly valuable to the broad community working with dipterans in general and chironomids in particular. The estimated recombination rate will help evolutionary biologists gain a better understanding of commonalities and differences of genomic patterns in insects.

2020 ◽  
Vol 10 (4) ◽  
pp. 1151-1157 ◽  
Author(s):  
Hanno Schmidt ◽  
Sören Lukas Hellmann ◽  
Ann-Marie Waldvogel ◽  
Barbara Feldmeyer ◽  
Thomas Hankeln ◽  
...  

Chironomus riparius is of great importance as a study species in various fields like ecotoxicology, molecular genetics, developmental biology and ecology. However, only a fragmented draft genome exists to date, hindering the recent rush of population genomic studies in this species. Making use of 50 NGS datasets, we present a hybrid genome assembly from short and long sequence reads that makes C. riparius’ genome one of the most contiguous Dipteran genomes published, the first complete mitochondrial genome of the species, and the respective recombination rate, which is among the first recombination rates for any insect. The genome assembly and associated resources will be highly valuable to the broad community working with dipterans in general and chironomids in particular. The estimated recombination rate will help evolutionary biologists gain a better understanding of commonalities and differences of genomic patterns in insects.


2021 ◽  
Author(s):  
Chi Yang ◽  
Lu Ma ◽  
Donglai Xiao ◽  
Xiaoyu Liu ◽  
Xiaoling Jiang ◽  
...  

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete, low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. The limitations of that genome have constrained genetic and genomic studies of this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assessment further showed 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.
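Several of the abstracts on this page report contiguity as an N50 value. As a minimal sketch of how that statistic is usually defined (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly), the following Python function may help. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the papers listed here.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are taken from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (not data from any assembly above).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

Because the statistic depends only on the sorted lengths, a few very long scaffolds can raise it sharply, which is why scaffold N50 and contig N50 are typically reported separately.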


2021 ◽  
Author(s):  
Suresh Panthee ◽  
Hiroshi Hamamoto ◽  
Atmika Paudel ◽  
Chikara Kaito ◽  
Yutaka Suzuki ◽  
...  

Staphylococcus aureus RN4220 has been extensively used by staphylococcal researchers as an intermediate strain for genetic manipulation due to its ability to accept foreign DNA. Despite its wide use in laboratories, its complete genome is not available. In this study, we used a hybrid genome assembly approach combining MinION long reads and Illumina short reads to sequence the complete genome of S. aureus RN4220. The comparative analysis of the annotated complete genome showed the presence of 39 genes fragmented in the previous assembly, many of which were located near the repeat regions. Using RNA-Seq reads, we showed that a higher number of reads could be mapped to the complete genome than to the draft genome, and that the gene expression profile obtained using the complete genome also differs from that obtained from the draft genome. Furthermore, by comparative transcriptomic analysis, we showed the correlation between expression levels of staphyloxanthin biosynthetic genes and the production of yellow pigment. This study highlights the importance of long reads in completing microbial genomes, especially those possessing repetitive elements.


2015 ◽  
Author(s):  
Neeraja M Krishnan ◽  
Prachi Jain ◽  
Saurabh Gupta ◽  
Arun K Hariharan ◽  
Binay Panda

Neem (Azadirachta indica A. Juss.), an evergreen tree of the Meliaceae family, is known for its medicinal, cosmetic, pesticidal and insecticidal properties. We had previously sequenced and published the draft genome of the plant, using mainly short read sequencing data. In this report, we present an improved genome assembly generated using additional short reads from Illumina and long reads from the Pacific Biosciences SMRT sequencer. We assembled short reads and error-corrected long reads using Platanus, an assembler designed to perform well for heterozygous genomes. Compared with the earlier assembly (v1.0), the updated genome assembly (v2.0) yielded 3- and 3.5-fold increases in N50 and N75, respectively; a 2.6-fold decrease in the total number of scaffolds; a 1.25-fold increase in the number of valid transcriptome alignments; 13.4-fold fewer misassemblies; and a 1.85-fold increase in the percentage of repeats. The current assembly also maps better to the genes known to be involved in the terpenoid biosynthesis pathway. Together, the data represent an improved assembly of the A. indica genome. The raw data described in this manuscript are submitted to the NCBI Short Read Archive under the accession numbers SRX1074131, SRX1074132, SRX1074133, and SRX1074134 (SRP013453).


2021 ◽  
Author(s):  
Lauren Coombe ◽  
Janet X Li ◽  
Theodora Lo ◽  
Johnathan Wong ◽  
Vladimir Nikolic ◽  
...  

Background: Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads. Results: LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which include initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 2.0-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to the state-of-the-art long-read scaffolder LRScaf in most tests, and consistently runs in under five hours using less than 23 GB of RAM. Conclusions: Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch.
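The abstract above mentions minimizer mappings without defining them. As a rough illustration of the general idea only, the sketch below picks the smallest k-mer from every window of `w` consecutive k-mers, so that two sequences sharing a long enough stretch will tend to share some of the selected k-mers. This is not ntLink's implementation: the function names `minimizers` and `shared_minimizers` are hypothetical, and plain lexicographic ordering is used here as a simplifying assumption, whereas real tools typically order k-mers by a hash value.

```python
def minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs chosen as window minimizers.

    Each window covers w consecutive k-mers; the lexicographically
    smallest k-mer in the window is kept (leftmost one on ties).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: window[j])
        picked.add((start + offset, window[offset]))
    return sorted(picked)


def shared_minimizers(seq_a, seq_b, k, w):
    """Return the set of minimizer k-mers that two sequences have in common."""
    kmers_a = {kmer for _, kmer in minimizers(seq_a, k, w)}
    kmers_b = {kmer for _, kmer in minimizers(seq_b, k, w)}
    return kmers_a & kmers_b


# Toy sequences (made up for illustration).
print(minimizers("ACGTACGT", k=3, w=2))
print(shared_minimizers("ACGTACGT", "TTACGTAC", k=3, w=2))
```

Storing only the minimizers rather than every k-mer is what keeps such mappings lightweight: a long read and a contig can be compared through a small ordered list of selected k-mers instead of a full alignment.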


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 401
Author(s):  
Jon Bråte ◽  
Janina Fuss ◽  
Kjetill S. Jakobsen ◽  
Dag Klaveness

Hydrurus foetidus is a freshwater alga belonging to the phylum Heterokonta. It thrives in cold rivers in polar and high alpine regions. It has several morphological traits reminiscent of single-celled eukaryotes, but can also form macroscopic thalli. Despite its ability to produce polyunsaturated fatty acids, its life under cold conditions and its variable morphology, very little is known about its genome and transcriptome. Here, we present an extensive set of next-generation sequencing data, including genomic short reads from Illumina sequencing and long reads from Nanopore sequencing, as well as full-length cDNAs from PacBio IsoSeq sequencing and a small RNA dataset (smaller than 200 bp) sequenced with Illumina. We combined these data with, to our knowledge, the first draft genome assembly of a chrysophyte alga. The assembly consists of 5069 contigs with a total assembly size of 171 Mb and a BUSCO completeness of 77%. The new data generated here may contribute to a better understanding of the evolution and ecological roles of chrysophyte algae, and may help resolve the branching patterns within the Heterokonta.


2019 ◽  
Vol 11 (8) ◽  
pp. 2306-2311
Author(s):  
Juliane Hartke ◽  
Tilman Schell ◽  
Evelien Jongepier ◽  
Hanno Schmidt ◽  
Philipp P Sprenger ◽  
...  

The success of social insects is largely intertwined with their highly advanced chemical communication system that facilitates recognition and discrimination of species and nest-mates, recruitment, and division of labor. Hydrocarbons, which cover the cuticle of insects, not only serve as waterproofing agents but also constitute a major component of this communication system. Two cryptic Crematogaster species, which share their nest with Camponotus ants, show striking diversity in their cuticular hydrocarbon (CHC) profile. This mutualistic system therefore offers a great opportunity to study the genetic basis of CHC divergence between sister species. As a basis for further genome-wide studies, high-quality genomes are needed. Here, we present the annotated draft genome for Crematogaster levior A. By combining the three most commonly used sequencing techniques (Illumina, PacBio, and Oxford Nanopore), we constructed a high-quality de novo ant genome. We show that even low coverage of long reads can add significantly to overall genome contiguity. Annotation of desaturase and elongase genes, which play a role in CHC biosynthesis, revealed one of the largest repertoires in ants and a higher number of desaturases in general than in other Hymenoptera. This may provide a mechanistic explanation for the high diversity observed in C. levior CHC profiles.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Thomas Gatter ◽  
Sarah von Löhneysen ◽  
Jörg Fallmann ◽  
Polina Drozdova ◽  
Tom Hartmann ◽  
...  

Background: Advances in genome sequencing over the last years have led to a fundamental paradigm shift in the field. With steadily decreasing sequencing costs, genome projects are no longer limited by the cost of raw sequencing data, but rather by computational problems associated with genome assembly. There is an urgent demand for more efficient and more accurate methods, in particular with regard to the highly complex and often very large genomes of animals and plants. Most recently, “hybrid” methods that integrate short and long read data have been devised to address this need. Results: LazyB is such a hybrid genome assembler. It has been designed specifically with an emphasis on utilizing low-coverage short and long reads. LazyB starts from a bipartite overlap graph between long reads and restrictively filtered short-read unitigs. This graph is translated into a long-read overlap graph G. Instead of the more conventional approach of removing tips, bubbles, and other local features, LazyB stepwise extracts subgraphs whose global properties approach a disjoint union of paths. First, a consistently oriented subgraph is extracted, which in a second step is reduced to a directed acyclic graph. In the next step, properties of proper interval graphs are used to extract contigs as maximum weight paths. These paths are translated into genomic sequences only in the final step. A prototype implementation of LazyB, entirely written in Python, not only yields significantly more accurate assemblies of the yeast and fruit fly genomes compared to state-of-the-art pipelines but also requires much less computational effort. Conclusions: LazyB is a new low-cost genome assembler that copes well with large genomes and low coverage. It is based on a novel approach for reducing the overlap graph to a collection of paths, thus opening new avenues for future improvements. Availability: The prototype is available at https://github.com/TGatter/LazyB.
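The abstract above describes extracting contigs as maximum weight paths once the overlap graph has been reduced to a directed acyclic graph. The sketch below shows only the textbook dynamic-programming version of that last idea: process nodes in topological order and remember the best-scoring predecessor of each node. It is not the authors' implementation and ignores the interval-graph properties they rely on. The function name `max_weight_path`, the read labels, and the edge weights are hypothetical, and positive edge weights (for example overlap scores) are assumed.

```python
from collections import defaultdict, deque


def max_weight_path(edges):
    """Return (weight, path) for a maximum weight path in a DAG.

    `edges` maps (source, target) pairs to positive weights.
    Raises ValueError if the graph contains a cycle.
    """
    if not edges:
        return 0, []
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (u, v), weight in edges.items():
        successors[u].append((v, weight))
        indegree[v] += 1
        nodes.update((u, v))

    best = {node: 0 for node in nodes}         # best path weight ending at node
    previous = {node: None for node in nodes}  # predecessor on that best path
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    processed = 0
    while queue:  # Kahn's algorithm yields a topological order
        u = queue.popleft()
        processed += 1
        for v, weight in successors[u]:
            if best[u] + weight > best[v]:
                best[v] = best[u] + weight
                previous[v] = u
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if processed != len(nodes):
        raise ValueError("graph contains a cycle")

    end = max(sorted(nodes), key=lambda n: best[n])
    path = []
    node = end
    while node is not None:  # walk predecessors back to the path start
        path.append(node)
        node = previous[node]
    return best[end], path[::-1]


# Toy overlap graph between four hypothetical reads.
toy_edges = {("r1", "r2"): 5, ("r2", "r3"): 4, ("r1", "r3"): 7, ("r3", "r4"): 2}
print(max_weight_path(toy_edges))
```

The acyclicity requirement is what makes this tractable: in a general graph the longest-path problem is NP-hard, whereas in a DAG a single pass in topological order suffices.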


2020 ◽  
Vol 33 (2) ◽  
pp. 145-148
Author(s):  
Lucia Landi ◽  
Stefania Pollastro ◽  
Caterina Rotolo ◽  
Gianfranco Romanazzi ◽  
Francesco Faretra ◽  
...  

Monilinia laxa is the causal agent of brown rot on stone fruit, and it can cause heavy yield losses during field production and postharvest storage. This article reports the draft genome assembly of the M. laxa Mlax316 strain, obtained using a hybrid genome assembly with both Illumina short-read and PacBio long-read sequencing technologies. The complete draft genome consists of 49 scaffolds with a total size of 42.81 Mb and a scaffold N50 of 2,449.4 kb. Annotation of the M. laxa assembly identified 11,163 genes and 12,424 proteins, which were functionally annotated. This new genome draft improves current genomic resources available for M. laxa and represents a useful tool for further research into its interactions with host plants and into evolution in the Monilinia genus.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Vindhya Mohindra ◽  
Tanushree Dangi ◽  
Ratnesh K. Tripathi ◽  
Rajesh Kumar ◽  
Rajeev K. Singh ◽  
...  

This study provides the first high-quality draft genome assembly (762.5 Mb) of Tenualosa ilisha that is highly contiguous and nearly complete. We observed a total of 2,864 contigs, with 96.4% completeness, an N50 of 2.65 Mbp and a largest contig length of 17.4 Mbp, along with a complete mitochondrial genome of 16,745 bases. A total of 33,042 protein-coding genes were predicted; among these, 512 genes were classified under 61 Gene Ontology (GO) terms associated with various homeostasis processes. The highest number of genes belonged to cellular calcium ion homeostasis, followed by tissue homeostasis. A total of 97 genes were identified, with 16 GO terms related to water homeostasis. Claudin, aquaporin, connexin/gap junction, adenylate cyclase, solute carrier and voltage-gated potassium channel genes were more numerous in T. ilisha than in other teleost species. Seven novel gene variants, in addition to the claudin gene (CLDZ), were found in T. ilisha. The present study also identified two putative novel genes, NKAIN3 and L4AM1, for the first time in fish; further studies are required to pinpoint their functions in fish. In addition, 1.6 million simple sequence repeats were mined from the draft genome assembly. The study provides a valuable genomic resource for the anadromous Hilsa. It will form a basis for future studies of its adaptation mechanisms to different salinity levels during migration, which in turn would facilitate its domestication.

