Chromosome-scale assembly of the yellow mealworm genome

2021 ◽  
Vol 1 ◽  
pp. 94
Author(s):  
Evangelia Eleftheriou ◽  
Jean-Marc Aury ◽  
Benoît Vacherie ◽  
Benjamin Istace ◽  
Caroline Belser ◽  
...  

Background: The yellow mealworm beetle, Tenebrio molitor, is a promising alternative protein source for animal and human nutrition, and its farming involves relatively low environmental costs. For these reasons, its industrial-scale production started this century. However, to optimize and breed sustainable new T. molitor lines, access to its genome remains essential. Methods: By combining Oxford Nanopore and Illumina Hi-C data, we constructed a high-quality chromosome-scale assembly of T. molitor. Then, we combined RNA-seq data and available Coleoptera proteomes for gene prediction with GMOVE. Results: We produced a high-quality genome with an N50 of 21.9 Mb and a completeness of 99.5%, and predicted 21,435 genes with a median size of 1,780 bp. Gene orthology between T. molitor and Tribolium castaneum showed highly conserved synteny between the two coleopterans. Conclusions: The present genome will greatly help fundamental and applied research such as genetic breeding and will contribute to the sustainable production of the yellow mealworm.

2021 ◽  
Vol 7 (12) ◽  
Author(s):  
Sebastian Cristian Treitli ◽  
Priscila Peña-Diaz ◽  
Paweł Hałakuc ◽  
Anna Karnkowska ◽  
Vladimír Hampl

Monocercomonoides exilis is considered the first known eukaryote to completely lack mitochondria. This conclusion is based primarily on a genomic and transcriptomic study which failed to identify any mitochondrial hallmark proteins. However, the available genome assembly has limited contiguity and around 1.5% of the genome sequence is represented by unknown bases. To improve the contiguity, we re-sequenced the genome and transcriptome of M. exilis using Oxford Nanopore Technology (ONT). The resulting draft genome is assembled in 101 contigs with an N50 value of 1.38 Mbp, almost 20 times higher than that of the previously published assembly. Using a newly generated ONT transcriptome, we further improve the gene prediction and add high-quality untranslated region (UTR) annotations, in which we identify two putative polyadenylation signals present in the 3′UTR regions and characterise the Kozak sequence in the 5′UTR regions. All these improvements are reflected by higher BUSCO genome completeness values. Despite an overall more complete genome assembly without missing bases and a better gene prediction, we still failed to identify any mitochondrial hallmark genes, thus further supporting the hypothesis that the mitochondrion is absent.


2020 ◽  
Author(s):  
Bernard Y Kim ◽  
Jeremy Wang ◽  
Danny E. Miller ◽  
Olga Barmina ◽  
Emily K. Delaney ◽  
...  

Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of high-quality assemblies for 101 lines of 95 drosophilid species encompassing 14 species groups and 35 sub-groups, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. These assemblies, along with a detailed wet lab protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution within this key group.


Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 52
Author(s):  
Ashley G. Yow ◽  
Hamed Bostan ◽  
Raúl Castanera ◽  
Valentino Ruggieri ◽  
Molla F. Mengist ◽  
...  

Pineapple (Ananas comosus (L.) Merr.) is the second most important tropical fruit crop globally, and ‘MD2’ is the most important cultivated variety. A high-quality genome is important for molecular-based breeding, but available pineapple genomes still have some quality limitations. Here, PacBio and Hi-C data were used to develop a new high-quality MD2 assembly and gene prediction. Compared to the previous MD2 assembly, major improvements included a 26.6-fold increase in contig N50 length, phased chromosomes, and >6000 new genes. The new MD2 assembly also included 161.6 Mb of additional sequence and >3000 extra genes compared to the F153 genome. Over 48% of the predicted genes harbored potentially deleterious mutations, indicating that the high level of heterozygosity in this species contributes to maintaining functional alleles. The genome was used to characterize the FAR1-RELATED SEQUENCE (FRS) genes that were expanded in pineapple and rice. Transposed and dispersed duplications contributed to expanding the numbers of these genes in the pineapple lineage. Several AcFRS genes were differentially expressed among tissue types and stages of flower development, suggesting that their expansion contributed to evolving specialized functions in reproductive tissues. The new MD2 assembly will serve as a new reference for genetic and genomic studies in pineapple.


Author(s):  
Ying-Feng Niu ◽  
Guo-Hua Li ◽  
Shu-Bang Ni ◽  
Xi-Yong He ◽  
Cheng Zheng ◽  
...  

Abstract Macadamia is a genus of evergreen nut trees belonging to the Proteaceae family. The two commercial macadamia species, Macadamia integrifolia and M. tetraphylla, are highly prized for their edible kernels. Catherine et al. reported the M. integrifolia genome using NGS technology. However, the lack of a high-quality assembly for M. tetraphylla hinders progress in biological research and breeding programs. In this study, we report a high-quality genome sequence of M. tetraphylla generated using Oxford Nanopore Technologies (ONT) sequencing. We generated an assembly of 750.54 Mb with a contig N50 length of 1.18 Mb, which is close to the size estimated by flow cytometry and k-mer analysis. Repetitive sequences represent 58.57% of the genome sequence, which is strikingly higher than in M. integrifolia. A total of 31,571 protein-coding genes were annotated with an average length of 6,055 bp, of which 92.59% were functionally annotated. The genome sequence of M. tetraphylla will provide new insights into the breeding of novel strains and the genetic improvement of agronomic traits.


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Bernard Y Kim ◽  
Jeremy Wang ◽  
Danny E Miller ◽  
Olga Barmina ◽  
Emily Kay Delaney ◽  
...  

Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.


Plant Disease ◽  
2021 ◽  
Author(s):  
Haoming Wang ◽  
Yongrong Dong ◽  
Weixue Liao ◽  
Xin Zhang ◽  
Qinhu Wang ◽  
...  

Clonostachys rosea is a necrotrophic mycoparasitic fungus with excellent biological control ability against numerous fungal plant pathogens. Here, we performed genomic sequencing of C. rosea strain CanS41 using Oxford Nanopore sequencing technology. We generated a high-quality genome assembly (>99.99% accuracy), which comprised 26 contigs containing 60.68 Mb of sequence with a GC content of 48.55% and a repeat content of 8.38%. The N50 contig length is 3.02 Mb. In total, 20,818 protein-coding genes were identified and functionally annotated. Genes encoding secreted proteins and carbohydrate-active enzymes as well as secondary metabolic gene clusters were also identified and analyzed. In summary, the high-quality genome assembly and gene annotation provided here will allow further exploration of the biological functions of C. rosea and enhancement of its biological control ability.


Author(s):  
Yixue Bao ◽  
Kaiyuan Pan ◽  
Khan Muhammad Tahir ◽  
Baoshan Chen ◽  
Muqing Zhang

Sugarcane pokkah boeng disease (PBD) is emerging as a prevalent foliar disease in China. This airborne disease is caused by the Fusarium species complex. To investigate the diversity and evolution of Fusarium species, we performed whole-genome sequencing of Fusarium andiyazi YN28 using a combination of Oxford Nanopore and Illumina technologies. The F. andiyazi YN28 genome was sequenced, assembled, and annotated. A high-quality genome was assembled into 24 contigs with an N50 of 2.80 Mb. The assembly has a total size of 44.1 Mb with a GC content of 47.64%. A total of 15,508 genes were predicted, including 794 genes related to carbohydrate-active enzymes, 397 ncRNAs, 155 genes associated with transporter classification, 4,550 genes linked to pathogen-host interactions, and 269 genes associated with effector proteins. Collectively, our results will provide insight into the host-pathogen interaction and will facilitate the breeding of new varieties of sugarcane resistant to PBD.


2018 ◽  
Author(s):  
Danny E. Miller ◽  
Cynthia Staber ◽  
Julia Zeitlinger ◽  
R. Scott Hawley

ABSTRACT The Drosophila genus is a unique group containing a wide range of species that occupy diverse ecosystems. In addition to the most widely studied species, Drosophila melanogaster, many other members in this genus also possess a well-developed set of genetic tools. Indeed, high-quality genomes exist for several species within the genus, facilitating studies of the function and evolution of cis-regulatory regions and proteins by allowing comparisons across at least 50 million years of evolution. Yet, the available genomes still fail to capture much of the substantial genetic diversity within the Drosophila genus. We have therefore tested protocols to rapidly and inexpensively sequence and assemble the genome from any Drosophila species using single-molecule sequencing technology from Oxford Nanopore. Here, we use this technology to present high-quality genome assemblies of 15 Drosophila species: 10 of the 12 originally sequenced Drosophila species (ananassae, erecta, mojavensis, persimilis, pseudoobscura, sechellia, simulans, virilis, willistoni, and yakuba), four additional species that had previously reported assemblies (biarmipes, bipectinata, eugracilis, and mauritiana), and one novel assembly (triauraria). Genomes were generated from an average of 29x depth-of-coverage data that after assembly resulted in an average contig N50 of 4.4 Mb. Subsequent alignment of contigs from the published reference genomes demonstrates that our assemblies could be used to close over 60% of the gaps present in the currently published reference genomes. Importantly, the materials and reagents cost for each genome was approximately $1,000 (USD). This study demonstrates the power and cost-effectiveness of long-read sequencing for genome assembly in Drosophila and provides a framework for the affordable sequencing and assembly of additional Drosophila genomes.


2021 ◽  
Vol 11 (2) ◽  
Author(s):  
James G Baldwin-Brown ◽  
Scott M Villa ◽  
Anna I Vickrey ◽  
Kevin P Johnson ◽  
Sarah E Bush ◽  
...  

Abstract The pigeon louse Columbicola columbae is a longstanding and important model for studies of ectoparasitism and host-parasite coevolution. However, a deeper understanding of its evolution and capacity for rapid adaptation is limited by a lack of genomic resources. Here, we present a high-quality draft assembly of the C. columbae genome, produced using a combination of Oxford Nanopore, Illumina, and Hi-C technologies. The final assembly is 208 Mb in length, with 12 chromosome-size scaffolds representing 98.1% of the assembly. For gene model prediction, we used a novel clustering method (wavy_choose) for Oxford Nanopore RNA-seq reads to feed into the MAKER annotation pipeline. High recovery of conserved single-copy orthologs (BUSCOs) suggests that our assembly and annotation are both highly complete and highly accurate. Consistent with the results of the only other assembled louse genome, Pediculus humanus, we find that C. columbae has a relatively low density of repetitive elements, the majority of which are DNA transposons. Also similar to P. humanus, we find a reduced number of genes encoding opsins, G protein-coupled receptors, odorant receptors, insulin signaling pathway components, and detoxification proteins in the C. columbae genome, relative to other insects. We propose that such losses might characterize the genomes of obligate, permanent ectoparasites with predictable habitats, limited foraging complexity, and simple dietary regimes. The sequencing and analysis for this genome were relatively low cost, and took advantage of a new clustering technique for Oxford Nanopore RNAseq reads that will be useful to future genome projects.

