Near-chromosome-level genome assembly of the dark septate endophyte Laburnicola rhizohalophila: a model for investigating root-fungus symbiosis

Genome Biology and Evolution ◽

10.1093/gbe/evab026 ◽

2021 ◽

Author(s):

Xinghua He ◽

Zhilin Yuan

Keyword(s):

Long Terminal Repeat Retrotransposons ◽

The Novel ◽

Carbohydrate Active Enzymes ◽

Dark Septate Endophyte ◽

Symbiotic Interactions ◽

Illumina Data ◽

Genome Features ◽

Long Read ◽

Root Fungi ◽

Chromosome Level

Abstract The novel dark septate endophyte (DSE) Laburnicola rhizohalophila (Pleosporales, Ascomycota) is frequently found in the halophytic seepweed (Suaeda salsa). In this paper, we report a near-chromosome-level hybrid assembly of this fungus, in which short-read Illumina data were used to polish assemblies generated from long-read Nanopore data. The reference genome for L. rhizohalophila was assembled into 26 scaffolds with a total length of 64.0 Mb and an N50 length of 3.15 Mb. Of them, 17 scaffolds approached the length of intact chromosomes, and 5 had telomeres at one end only. A total of 10,891 gene models were predicted. Intriguingly, 27.5 Mb of repeat sequences, accounting for 42.97% of the genome, were identified, and long terminal repeat retrotransposons were the most frequent known transposable elements (TEs), indicating that TE proliferation contributes to its increased genome size. BUSCO analyses using the Fungi_odb10 dataset showed that 95.0% of BUSCO genes were complete. In addition, 292 carbohydrate active enzymes, 33 secondary metabolite clusters, and 84 putative effectors were identified in silico. The resulting high-quality assembly and genome features are not only an important resource for further research on the mechanism of root-fungi symbiotic interactions, but will also contribute to comparative analyses of genome biology and evolution within Pleosporalean species.
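Several of the abstracts listed here report an N50 length as a contiguity measure. As a minimal sketch of how that statistic is usually defined (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), the Python below computes it from a list of lengths. The function name `n50` and the toy lengths are illustrative assumptions, not taken from any of the listed papers.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to shortest,
    accumulate them, and report the length at which the running total
    first reaches at least half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [5, 4, 3, 2, 1]
print(n50(toy_scaffolds))
```

With real data, the lengths would typically be read from the assembly FASTA or its index; the definition above is the widely used one, but individual papers may apply slight variants (for example, excluding sequences below a minimum length).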

Chromosome-Scale Genome Assembly of Talaromyces rugulosus W13939, a Mycoparasitic Fungus and Promising Biocontrol Agent

Molecular Plant-Microbe Interactions ◽

10.1094/mpmi-06-20-0163-a ◽

2020 ◽

Vol 33 (12) ◽

pp. 1446-1450

Author(s):

Bo Wang ◽

Li Guo ◽

Kai Ye ◽

Long Wang

Keyword(s):

Genome Assembly ◽

Biocontrol Agent ◽

Biotechnological Applications ◽

Cell Wall Degrading Enzymes ◽

Carbohydrate Active Enzymes ◽

Fungal Cell ◽

Genes Encoding ◽

Long Read ◽

Mycoparasitic Fungus ◽

Chromosome Level

Here, we report a chromosome-level genome assembly of Talaromyces rugulosus (syn. Penicillium rugulosum) W13939 (six chromosomes; contig N50: 5.90 Mb), generated using a combination of PacBio long-read and Illumina paired-end data. T. rugulosus is not only a potent enzyme producer but also a mycoparasite of Aspergillus flavus, a notorious plant pathogen and mycotoxin producer, which makes T. rugulosus a promising biocontrol agent. The T. rugulosus genome is rich in genes encoding proteases, carbohydrate-active enzymes, fungal cell wall–degrading enzymes, lectin, and secondary metabolite biosynthetic enzymes, reflecting its mycoparasitic lifestyle and mycotoxigenic capability. This high-quality assembly of the T. rugulosus genome will be a valuable resource for understanding the molecular basis of mycoparasitism and will facilitate the agricultural and biotechnological applications of Talaromyces spp.

Chromosome-level assembly of Drosophila bifasciata reveals important karyotypic transition of the X chromosome

10.1101/847558 ◽

2019 ◽

Author(s):

Ryan Bracewell ◽

Anita Tran ◽

Kamalakar Chatla ◽

Doris Bachtrog

Keyword(s):

X Chromosome ◽

Genome Assembly ◽

De Novo ◽

Pericentromeric Region ◽

Species Group ◽

Chromosome 15 ◽

Protein Coding ◽

Protein Coding Genes ◽

Long Read ◽

Chromosome Level

ABSTRACT The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species that represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes, and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata, and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests that a metacentric ancestral X fused to a telocentric Muller D, creating the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.

Chromosome-scale assembly of the Sparassis latifolia genome obtained using long-read and Hi-C sequencing

10.1101/2021.01.08.426014 ◽

2021 ◽

Author(s):

Chi Yang ◽

Lu Ma ◽

Donglai Xiao ◽

Xiaoyu Liu ◽

Xiaoling Jiang ◽

...

Keyword(s):

Repetitive Sequences ◽

Draft Genome ◽

Edible Mushroom ◽

Illumina Hiseq ◽

Protein Coding ◽

Long Reads ◽

Oxford Nanopore ◽

Genome Features ◽

Long Read ◽

Genomic Studies

Sparassis latifolia is a valuable edible mushroom cultivated in China. In 2018, our research group reported an incomplete, low-quality genome of S. latifolia obtained by Illumina HiSeq 2500 sequencing. The limitations of that genome have constrained genetic and genomic studies of this mushroom resource. Herein, an updated draft genome sequence of S. latifolia was generated by Oxford Nanopore sequencing and the Hi-C technique. A total of 8.24 Gb of Oxford Nanopore long reads representing ~198.08X coverage of the S. latifolia genome were generated. Subsequently, a high-quality genome of 41.41 Mb, with scaffold and contig N50 sizes of 3.31 Mb and 1.51 Mb, respectively, was assembled. Hi-C scaffolding of the genome resulted in 12 pseudochromosomes containing 93.56% of the bases in the assembled genome. Genome annotation further revealed that 17.47% of the genome was composed of repetitive sequences. In addition, 13,103 protein-coding genes were predicted, among which 98.72% were functionally annotated. BUSCO assessment further showed 92.07% complete BUSCOs. The improved chromosome-scale assembly and genome features described here will aid further molecular elucidation of various traits, breeding of S. latifolia, and evolutionary studies with related taxa.

Chromosome-Level Assembly of Drosophila bifasciata Reveals Important Karyotypic Transition of the X Chromosome

G3 Genes|Genome|Genetics ◽

10.1534/g3.119.400922 ◽

2020 ◽

Vol 10 (3) ◽

pp. 891-897 ◽

Cited By ~ 3

Author(s):

Ryan Bracewell ◽

Anita Tran ◽

Kamalakar Chatla ◽

Doris Bachtrog

Keyword(s):

X Chromosome ◽

Genome Assembly ◽

De Novo ◽

Pericentromeric Region ◽

Species Group ◽

Chromosome 15 ◽

Protein Coding ◽

Protein Coding Genes ◽

Long Read ◽

Chromosome Level

The Drosophila obscura species group is one of the most studied clades of Drosophila and harbors multiple distinct karyotypes. Here we present a de novo genome assembly and annotation of D. bifasciata, a species that represents an important subgroup for which no high-quality chromosome-level genome assembly currently exists. We combined long-read sequencing (Nanopore) and Hi-C scaffolding to achieve a highly contiguous genome assembly approximately 193 Mb in size, with repetitive elements constituting 30.1% of the total length. Drosophila bifasciata harbors four large metacentric chromosomes and the small dot, and our assembly contains each chromosome in a single scaffold, including the highly repetitive pericentromeres, which were largely composed of Jockey and Gypsy transposable elements. We annotated a total of 12,821 protein-coding genes, and comparisons of synteny with D. athabasca orthologs show that the large metacentric pericentromeric regions of multiple chromosomes are conserved between these species. Importantly, Muller A (X chromosome) was found to be metacentric in D. bifasciata, and the pericentromeric region appears homologous to the pericentromeric region of the fused Muller A-AD (XL and XR) of pseudoobscura/affinis subgroup species. Our finding suggests that a metacentric ancestral X fused to a telocentric Muller D, creating the large neo-X (Muller A-AD) chromosome ∼15 MYA. We also confirm the fusion of Muller C and D in D. bifasciata and show that it likely involved a centromere-centromere fusion.

Improving the Chromosome-Level Genome Assembly of the Siamese Fighting Fish (Betta splendens) in a University Master’s Course

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401205 ◽

2020 ◽

Vol 10 (7) ◽

pp. 2179-2183 ◽

Cited By ~ 1

Author(s):

Stefan Prost ◽

Malte Petersen ◽

Martin Grethlein ◽

Sarah Joy Hahn ◽

Nina Kuschik-Maczollek ◽

...

Keyword(s):

Genome Assembly ◽

High Throughput Sequencing ◽

Siamese Fighting Fish ◽

Betta Splendens ◽

High Quality ◽

Sequencing Platform ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

Long Read ◽

Chromosome Level

Ever-decreasing costs along with advances in sequencing and library preparation technologies enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university master’s course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behavior. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome level using previously published Hi-C data. The use of ∼35× nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp and a total length of 441 Mbp. Scaffolding using the Hi-C data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 96.1% complete BUSCO genes from the Actinopterygii dataset, indicating a high-quality assembly. We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university master’s course. The use of ∼35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing students to gain practical skills in modern genomics and generate high-quality results that benefit downstream research projects.

Whole-Genome Sequencing of a Human Clinical Isolate of the Novel Species Klebsiella quasivariicola sp. nov

Genome Announcements ◽

10.1128/genomea.01057-17 ◽

2017 ◽

Vol 5 (42) ◽

Cited By ~ 15

Author(s):

S. Wesley Long ◽

Sarah E. Linson ◽

Matthew Ojeda Saavedra ◽

Concepcion Cantu ◽

James J. Davis ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Clinical Isolate ◽

Genome Sequencing ◽

Novel Species ◽

Whole Genome ◽

The Novel ◽

Short Read ◽

Oxford Nanopore ◽

Long Read

ABSTRACT In a study of 1,777 Klebsiella strains, we discovered KPN1705, which was distinct from all recognized Klebsiella spp. We closed the genome of strain KPN1705 using a hybrid of Illumina short-read and Oxford Nanopore long-read technologies. For this novel species, we propose the name Klebsiella quasivariicola sp. nov.

Improved chromosome-level genome assembly and annotation of the seagrass, Zostera marina (eelgrass)

F1000Research ◽

10.12688/f1000research.38156.1 ◽

2021 ◽

Vol 10 ◽

pp. 289

Author(s):

Xiao Ma ◽

Jeanine L. Olsen ◽

Thorsten B.H. Reusch ◽

Gabriele Procaccini ◽

Dave Kudrna ◽

...

Keyword(s):

Genome Assembly ◽

Zostera Marina ◽

Draft Genome ◽

High Molecular Weight Dna ◽

Protein Coding ◽

New Findings ◽

Long Read ◽

Sanger Sequence ◽

Assembly Pipeline ◽

Chromosome Level

Background: Seagrasses (Alismatales) are the only fully marine angiosperms. Zostera marina (eelgrass) plays a crucial role in the functioning of coastal marine ecosystems and global carbon sequestration. It is the most widely studied seagrass and has become a marine model system for exploring adaptation under rapid climate change. The original draft genome (v.1.0) of the seagrass Z. marina (L.) was based on a combination of Illumina mate-pair libraries and fosmid-ends. A total of 25.55 Gb of Illumina and 0.14 Gb of Sanger sequence was obtained, representing 47.7× genomic coverage. The assembly resulted in ~2000 unordered scaffolds (L50 of 486 kb), a final genome assembly size of 203 Mb, 20,450 protein-coding genes and 63% TE content. Here, we present an upgraded chromosome-scale genome assembly and compare v.1.0 and the new v.3.1, reconfirming previous results from Olsen et al. (2016), as well as pointing out new findings. Methods: The same high molecular weight DNA used in the original sequencing of the Finnish clone was used. A high-quality reference genome was assembled with the MECAT assembly pipeline combining PacBio long-read sequencing and Hi-C scaffolding. Results: In total, 75.97 Gb of PacBio data was produced. The final assembly comprises six pseudo-chromosomes and 304 unanchored scaffolds with a total length of 260.5 Mb and an N50 of 34.6 Mb, showing high contiguity and few gaps (~0.5%). A total of 21,483 protein-encoding genes are annotated in this assembly, of which 20,665 (96.2%) obtained at least one functional assignment based on similarity to known proteins. Conclusions: As an important marine angiosperm, the improved Z. marina genome assembly will further assist evolutionary, ecological, and comparative genomics at the chromosome level. The new genome assembly will further our understanding of the structural and physiological adaptations from land to marine life.

Long live the king: chromosome-level assembly of the lion (Panthera leo) using linked-read, Hi-C, and long read data

10.1101/705483 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ellie E. Armstrong ◽

Ryan W. Taylor ◽

Danny E. Miller ◽

Christopher Kaelin ◽

Gregory Barsh ◽

...

Keyword(s):

Small Population ◽

Domestic Cat ◽

Panthera Leo ◽

Rapid Decline ◽

Social Species ◽

Asiatic Lion ◽

Oxford Nanopore ◽

Long Read ◽

Population Sizes ◽

Chromosome Level

Abstract The lion (Panthera leo) is one of the most popular and iconic feline species on the planet, yet in spite of its popularity, the last century has seen massive declines in lion populations worldwide. Genomic resources for endangered species represent an important way forward for the field of conservation, enabling high-resolution studies of demography, disease, and population dynamics. Here, we present a chromosome-level assembly for a captive African lion from the Exotic Feline Rescue Center as a resource for current and subsequent genetic work on the sole social species of the Panthera clade. Our assembly is composed of 10x Genomics Chromium data, Dovetail Hi-C, and Oxford Nanopore long-read data. Synteny is highly conserved between the lion, other Panthera genomes, and the domestic cat. We find variability in the length and levels of homozygosity across the genomes of the lion sequenced here and other previously published resequencing data, indicating contrasting histories of recent and ancient small population sizes and/or inbreeding. Demographic analyses reveal similar histories across all individuals except the Asiatic lion, which shows a more rapid decline in population size. This high-quality genome will greatly aid in the continuing research and conservation efforts for the lion.

LeafGo: Leaf to Genome, a quick workflow to produce high-quality de novo plant genomes using long-read sequencing technology

Genome Biology ◽

10.1186/s13059-021-02475-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Patrick Driguez ◽

Salim Bougouffa ◽

Karen Carty ◽

Alexander Putra ◽

Kamel Jabbari ◽

...

Keyword(s):

De Novo ◽

Plant Genome ◽

Time Cost ◽

Sequencing Technology ◽

High Quality ◽

Plant Genomes ◽

Long Read ◽

Generate Plant ◽

Sequencing Platforms ◽

Chromosome Level

Abstract Currently, different sequencing platforms are used to generate plant genomes and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow, that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days and using one long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies. Finally, we report the diploid genomes of Eucalyptus rudis and E. camaldulensis and the allotetraploid genome of Arachis hypogaea.

Improved Apis mellifera reference genome based on the alternative long-read-based assemblies

10.1101/2021.04.30.442202 ◽

2021 ◽

Author(s):

Milyausha Kaskinova ◽

Bayazit Yunusbayev ◽

Radick Altinbaev ◽

Rika Raffiudin ◽

Madeline H. Carpenter ◽

...

Keyword(s):

Apis Mellifera ◽

Honey Bee ◽

Reference Genome ◽

De Novo ◽

Gene Annotation ◽

Model Organism ◽

Functional Genomic ◽

Long Read ◽

Chromosome Level

ABSTRACT Apis mellifera L., the western honey bee, is a major crop pollinator that plays a key role in beekeeping and serves as an important model organism in social behavior studies. Recent efforts have improved the quality of the honey bee reference genome and developed a chromosome-level assembly of sixteen chromosomes, two of which are gapless. However, the rest suffer from 51 gaps, 160 unplaced/unlocalized scaffolds, and the lack of 2 distal telomeres. The gaps are located in hard-to-assemble, extended, highly repetitive chromosomal regions that may contain functional genomic elements. Here, we use de novo reassemblies from the raw reads of the most recent reference genome, Amel_HAv_3.1, and other long-read-based assemblies (INRA_AMelMel_1.0, ASM1384120v1, and ASM1384124v1) of the honey bee genome to resolve 13 gaps, five unplaced/unlocalized scaffolds, and the missing telomeres of Amel_HAv_3.1. The total length of the resolved gaps is 848,747 bp. The accuracy of the corrected assembly was validated by mapping PacBio reads and performing gene annotation assessment. Comparative analysis suggests that the PacBio-reads-based assemblies of the honey bee genomes failed in the same highly repetitive extended regions of the chromosomes, especially on chromosome 10. To fully resolve these extended repetitive regions, further work using ultra-long Nanopore sequencing would be needed. Our updated assembly facilitates more accurate reference-guided scaffolding and marker/sequence mapping in honey bee genomics studies.