Novel canine high-quality metagenome-assembled genomes, prophages, and host-associated plasmids by long-read metagenomics together with Hi-C proximity ligation

Long-read metagenomics facilitates the assembly of high-quality metagenome-assembled genomes (HQ MAGs) out of complex microbiomes. It provides highly contiguous assemblies by spanning repetitive regions, complete ribosomal genes, and mobile genetic elements. Hi-C proximity ligation data bins the long contigs and their associated extra-chromosomal elements to their bacterial host. Here, we characterized a canine fecal sample by combining a long-read metagenomics assembly with Hi-C data and further correcting frameshift errors. We retrieved 27 HQ MAGs and seven medium-quality (MQ) MAGs according to MIMAG criteria. All the long-read canine MAGs improved on previous short-read MAGs from public datasets regarding contiguity of the assembly, presence and completeness of the ribosomal operons, and presence of canonical tRNAs. This trend was also observed when comparing to representative genomes from a pure culture (short-read assemblies). Moreover, Hi-C data linked six potential plasmids to their bacterial hosts. Finally, we identified 51 bacteriophages integrated into their bacterial host, providing novel host information for eight viral clusters that included Gut Phage Database viral genomes. Even though three viral clusters were species-specific, most of them presented a broader host range. In conclusion, long-read metagenomics retrieved long contigs harboring completely assembled ribosomal operons, prophages, and other mobile genetic elements. Hi-C binned the long contigs together into HQ and MQ MAGs, some of them representing closely related species. Long-read metagenomics and Hi-C proximity ligation are likely to become a comprehensive approach to HQ MAG discovery and assignment of extra-chromosomal elements to their bacterial host.
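The HQ/MQ split mentioned in this abstract can be sketched in a few lines. The thresholds below follow the published MIMAG standard as commonly summarized (high quality: >90% completeness, <5% contamination, 5S, 16S and 23S rRNA genes present, at least 18 tRNAs; medium quality: at least 50% completeness, <10% contamination). The `MagStats` record and the `mimag_quality` function are hypothetical names introduced for illustration only; they are not taken from the paper, and the completeness and contamination values are assumed to come from an external estimator.

```python
from dataclasses import dataclass


@dataclass
class MagStats:
    """Summary statistics for one MAG (hypothetical record for illustration)."""
    completeness: float   # percent, estimated by an external tool
    contamination: float  # percent, estimated by an external tool
    has_5s: bool          # 5S rRNA gene detected
    has_16s: bool         # 16S rRNA gene detected
    has_23s: bool         # 23S rRNA gene detected
    trna_count: int       # number of distinct tRNAs detected


def mimag_quality(m: MagStats) -> str:
    """Classify a MAG as 'high', 'medium', 'low' or 'unclassified'."""
    rrna_complete = m.has_5s and m.has_16s and m.has_23s
    if (m.completeness > 90 and m.contamination < 5
            and rrna_complete and m.trna_count >= 18):
        return "high"
    if m.completeness >= 50 and m.contamination < 10:
        return "medium"
    if m.contamination < 10:
        return "low"
    # Contamination of 10% or more falls outside the MIMAG draft tiers.
    return "unclassified"


if __name__ == "__main__":
    example = MagStats(96.5, 1.2, True, True, True, 20)
    print(mimag_quality(example))
```

Note that a bin with excellent completeness but a missing rRNA gene drops to medium quality under these rules, which is one reason short-read MAGs with fragmented ribosomal operons rarely reach the HQ tier.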


Outcome of Different Sequencing and Assembly Approaches on the Detection of Plasmids and Localization of Antimicrobial Resistance Genes in Commensal Escherichia coli

Microorganisms ◽

10.3390/microorganisms9030598 ◽

2021 ◽

Vol 9 (3) ◽

pp. 598

Author(s):

Katharina Juraschek ◽

Maria Borowiak ◽

Simon H. Tausch ◽

Burkhard Malorny ◽

Annemarie Käsbohrer ◽

...

Keyword(s):

Escherichia Coli ◽

Antimicrobial Resistance ◽

Mobile Genetic Elements ◽

Hybrid Assembly ◽

Small Plasmid ◽

Short Read ◽

Short Read Sequencing ◽

Genetic Elements ◽

Antimicrobial Resistance Genes ◽

Long Read

Antimicrobial resistance (AMR) is a major threat to public health worldwide. Currently, AMR typing is shifting from phenotypic testing to whole-genome sequence (WGS)-based detection of resistance determinants for a better understanding of the isolate diversity and elements involved in gene transmission (e.g., plasmids, bacteriophages, transposons). However, the use of WGS data for monitoring purposes requires suitable techniques, standardized parameters and approved guidelines for reliable AMR gene detection and prediction of their association with mobile genetic elements (plasmids). In this study, different sequencing and assembly strategies were tested for their suitability in AMR monitoring in Escherichia coli in the routines of the German National Reference Laboratory for Antimicrobial Resistances. To assess the outcomes of the different approaches, results from in silico predictions were compared with conventional phenotypic- and genotypic-typing data. With the focus on (fluoro)quinolone-resistant E. coli, five qnrS-positive isolates with multiple extrachromosomal elements were subjected to WGS with NextSeq (Illumina), PacBio (Pacific Biosciences) and ONT (Oxford Nanopore) for in-depth characterization of the qnrS1-carrying plasmids. Raw reads from short- and long-read sequencing were assembled individually by Unicycler or Flye or a combination of both (hybrid assembly). The generated contigs were subjected to bioinformatics analysis. Based on the generated data, assembly of long-read sequences is error prone and can result in a loss of small plasmid genomes. In contrast, short-read sequencing was shown to be insufficient for the prediction of a linkage of AMR genes (e.g., qnrS1) to specific plasmid sequences. Furthermore, short-read sequencing failed to detect certain duplications and was unsuitable for genome finishing. Overall, the hybrid assembly led to the most comprehensive typing results, especially in predicting associations of AMR genes and mobile genetic elements. Thus, the use of different sequencing technologies and hybrid assemblies currently represents the best approach for reliable AMR typing and risk assessment.


Evaluating the accuracy of Listeria monocytogenes assemblies from quasimetagenomic samples using long and short reads

BMC Genomics ◽

10.1186/s12864-021-07702-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Seth Commichaux ◽

Kiran Javkar ◽

Padmini Ramachandran ◽

Niranjan Nagarajan ◽

Denis Bertrand ◽

...

Keyword(s):

Public Health ◽

Public Health Response ◽

High Quality ◽

Short Read ◽

Short Reads ◽

The Core ◽

Long Reads ◽

Health Response ◽

Long Read ◽

Core Genes

Abstract Background: Whole genome sequencing of cultured pathogens is the state-of-the-art public health response for the bioinformatic source tracking of illness outbreaks. Quasimetagenomics can substantially reduce the amount of culturing needed before a high quality genome can be recovered. Highly accurate short read data is analyzed for single nucleotide polymorphisms and multi-locus sequence types to differentiate strains but cannot span many genomic repeats, resulting in highly fragmented assemblies. Long reads can span repeats, resulting in much more contiguous assemblies, but have lower accuracy than short reads. Results: We evaluated the accuracy of Listeria monocytogenes assemblies from enrichments (quasimetagenomes) of naturally-contaminated ice cream using long read (Oxford Nanopore) and short read (Illumina) sequencing data. Accuracy of ten assembly approaches, over a range of sequencing depths, was evaluated by comparing sequence similarity of genes in assemblies to a complete reference genome. Long read assemblies reconstructed a circularized genome as well as a 71 kbp plasmid after 24 h of enrichment; however, high error rates prevented high fidelity gene assembly, even at 150X depth of coverage. Short read assemblies accurately reconstructed the core genes after 28 h of enrichment but produced highly fragmented genomes. Hybrid approaches demonstrated promising results but had biases based upon the initial assembly strategy. Short read assemblies scaffolded with long reads accurately assembled the core genes after just 24 h of enrichment, but were highly fragmented. Long read assemblies polished with short reads reconstructed a circularized genome and plasmid and assembled all the genes after 24 h enrichment but with less fidelity for the core genes than the short read assemblies. Conclusion: The integration of long and short read sequencing of quasimetagenomes expedited the reconstruction of a high quality pathogen genome compared to either platform alone. A new and more complete level of information about genome structure, gene order and mobile elements can be added to the public health response by incorporating long read analyses with the standard short read WGS outbreak response.
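The "150X depth of coverage" figure in this abstract is a ratio of sequenced bases to genome length. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the genome size of 2.9 Mbp used in the usage example is an assumed approximate value for Listeria monocytogenes, not a number stated in the abstract.

```python
def depth_of_coverage(read_lengths, genome_size):
    """Mean depth = total sequenced bases / genome length (both in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def bases_needed(target_depth, genome_size):
    """Total bases required to reach a target mean depth."""
    return target_depth * genome_size


if __name__ == "__main__":
    assumed_genome_size = 2_900_000  # bp; assumption for illustration only
    # Bases needed for 150X mean depth on a genome of the assumed size.
    print(bases_needed(150, assumed_genome_size))
```

In an enrichment (quasimetagenome) sample only a fraction of reads belong to the target organism, so the relevant depth is computed from reads that map to the target, not from the total sequencing yield.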


A mobile genetic element increases bacterial host fitness by manipulating development

eLife ◽

10.7554/elife.65924 ◽

2021 ◽

Vol 10 ◽

Author(s):

Joshua M Jones ◽

Ilana Grinberg ◽

Avigdor Eldar ◽

Alan D Grossman

Keyword(s):

Gene Transfer ◽

Horizontal Gene Transfer ◽

Mobile Genetic Elements ◽

Selective Advantage ◽

Host Cells ◽

Bacterial Host ◽

Host Fitness ◽

Genetic Elements ◽

Necessary And Sufficient ◽

Integrative And Conjugative Element

Horizontal gene transfer is a major force in bacterial evolution. Mobile genetic elements are responsible for much of horizontal gene transfer and also carry beneficial cargo genes. Uncovering strategies used by mobile genetic elements to benefit host cells is crucial for understanding their stability and spread in populations. We describe a benefit that ICEBs1, an integrative and conjugative element of Bacillus subtilis, provides to its host cells. Activation of ICEBs1 conferred a frequency-dependent selective advantage to host cells during two different developmental processes: biofilm formation and sporulation. These benefits were due to inhibition of biofilm-associated gene expression and delayed sporulation by ICEBs1-containing cells, enabling them to exploit their neighbors and grow more prior to development. A single ICEBs1 gene, devI (formerly ydcO), was both necessary and sufficient for inhibition of development. Manipulation of host developmental programs allows ICEBs1 to increase host fitness, thereby increasing propagation of the element.


mobileOG-db: a manually curated database of protein families mediating the life cycle of bacterial mobile genetic elements

10.1101/2021.08.27.457951 ◽

2021 ◽

Author(s):

Connor L. Brown ◽

James Mullet ◽

Fadi Hindi ◽

James E. Stoll ◽

Suraj Gupta ◽

...

Keyword(s):

Life Cycle ◽

Mobile Genetic Elements ◽

Experimental Information ◽

Functional Modules ◽

Protein Families ◽

Class Label ◽

Annotation Scheme ◽

High Quality ◽

Genetic Elements ◽

Recombination Repair

ABSTRACT Currently available databases of bacterial mobile genetic elements (MGEs) contain both “core” and accessory MGE functional modules, the latter of which are often only transiently associated with the element. The presence of these accessory genes, which are often close homologs to primarily immobile genes, limits the usability of these databases for MGE annotation. To overcome this limitation, we analysed 10,776,212 protein sequences derived from seven MGE databases to compile a comprehensive database of 6,140 manually curated protein families that are linked to the “life cycle” (integration, excision, replication/recombination/repair, transfer, and stability/defense) of all major classes of bacterial MGEs. We overlay experimental information where available to create a tiered annotation scheme of high-quality annotations and annotations inferred exclusively through bioinformatic evidence. We additionally provide an MGE-class label for each entry (e.g., plasmid, integrative element) derived from the source database, and assign a list of keywords to each entry to delineate different MGE functional modules and to facilitate annotation. The resulting database, mobileOG-db (for mobile orthologous groups), provides a simple and readily interpretable foundation for an array of MGE-centred analyses. mobileOG-db can be accessed at mobileogdb.flsi.cloud.vt.edu/, where users can browse and design, refine, and analyse custom subsets of the dynamic mobilome.


Evidence of an epidemic spread of KPC-producing Enterobacterales in Czech hospitals

Scientific Reports ◽

10.1038/s41598-021-95285-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Lucie Kraftova ◽

Marc Finianos ◽

Vendula Studentova ◽

Katerina Chudejova ◽

Vladislav Jakubu ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Infection Control ◽

Mobile Genetic Elements ◽

Epidemic Spread ◽

Sequencing Data ◽

Genetic Elements ◽

Long Read ◽

Genetic Rearrangements

Abstract The aim of the present study is to describe the ongoing spread of the KPC-producing strains, which is evolving to an epidemic in Czech hospitals. During the period of 2018–2019, a total of 108 KPC-producing Enterobacterales were recovered from 20 hospitals. Analysis of long-read sequencing data revealed the presence of several types of blaKPC-carrying plasmids; 19 out of 25 blaKPC-carrying plasmids could be assigned to R (n = 12), N (n = 5), C (n = 1) and P6 (n = 1) incompatibility (Inc) groups. Five of the remaining blaKPC-carrying plasmids were multireplicon, while one plasmid could not be typed. Additionally, phylogenetic analysis confirmed the spread of blaKPC-carrying plasmids among different clones of diverse Enterobacterales species. Our findings demonstrated that the increased prevalence of KPC-producing isolates was due to plasmids spreading among different species. In some districts, the local dissemination of IncR and IncN plasmids was observed. Additionally, the ongoing evolution of blaKPC-carrying plasmids, through genetic rearrangements, favours the preservation and further dissemination of these mobile genetic elements. Therefore, the situation should be monitored, and immediate infection control should be implemented in hospitals reporting KPC-producing strains.


Resolving the Complexity of Human Skin Metagenomes Using Single-Molecule Sequencing

mBio ◽

10.1128/mbio.01948-15 ◽

2016 ◽

Vol 7 (1) ◽

Cited By ~ 37

Author(s):

Yu-Chih Tsai ◽

Sean Conlan ◽

Clayton Deming ◽

Julia A. Segre ◽

Heidi H. Kong ◽

...

Keyword(s):

Microbial Community ◽

Human Skin ◽

Single Molecule ◽

Smrt Sequencing ◽

High Quality ◽

Single Nucleotide ◽

Single Molecule Sequencing ◽

Short Read ◽

Hybrid Approaches ◽

Long Read

ABSTRACT Deep metagenomic shotgun sequencing has emerged as a powerful tool to interrogate composition and function of complex microbial communities. Computational approaches to assemble genome fragments have been demonstrated to be an effective tool for de novo reconstruction of genomes from these communities. However, the resultant “genomes” are typically fragmented and incomplete due to the limited ability of short-read sequence data to assemble complex or low-coverage regions. Here, we use single-molecule, real-time (SMRT) sequencing to reconstruct a high-quality, closed genome of a previously uncharacterized Corynebacterium simulans and its companion bacteriophage from a skin metagenomic sample. Considerable improvement in assembly quality occurs in hybrid approaches incorporating short-read data, with even relatively small amounts of long-read data being sufficient to improve metagenome reconstruction. Using short-read data to evaluate strain variation of this C. simulans in its skin community at single-nucleotide resolution, we observed a dominant C. simulans strain with moderate allelic heterozygosity throughout the population. We demonstrate the utility of SMRT sequencing and hybrid approaches in metagenome quantitation, reconstruction, and annotation. IMPORTANCE The species comprising a microbial community are often difficult to deconvolute due to technical limitations inherent to most short-read sequencing technologies. Here, we leverage new advances in sequencing technology, single-molecule sequencing, to significantly improve reconstruction of a complex human skin microbial community. With this long-read technology, we were able to reconstruct and annotate a closed, high-quality genome of a previously uncharacterized skin species. We demonstrate that hybrid approaches with short-read technology are sufficiently powerful to reconstruct even single-nucleotide polymorphism level variation of species in this community.


The ESKAPE mobilome contributes to the spread of antimicrobial resistance and CRISPR-mediated conflict between mobile genetic elements

10.1101/2022.01.03.474784 ◽

2022 ◽

Author(s):

João Botelho ◽

Adrian Cazares ◽

Hinrich Schulenburg

Keyword(s):

Antibiotic Resistance Genes ◽

Distribution Patterns ◽

Mobile Genetic Elements ◽

Human Pathogens ◽

Systematic Analysis ◽

Genetic Elements ◽

General Importance ◽

Species Specific ◽

Eskape Pathogens ◽

Asymmetrically Distributed

Mobile genetic elements (MGEs) mediate the shuffling of genes among organisms. They contribute to the spread of virulence and antibiotic resistance genes in human pathogens, including the particularly problematic group of ESKAPE pathogens, such as Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter sp. Here, we performed the first systematic analysis of MGEs, including plasmids, prophages, and integrative and conjugative/mobilizable elements (ICEs/IMEs), in the ESKAPE pathogens. We characterized over 1700 complete ESKAPE genomes and found that different MGE types are asymmetrically distributed across these pathogens. While some MGEs are capable of exchanging DNA beyond the genus (and phylum) barrier, horizontal gene transfer (HGT) is mainly restricted by phylum or genus. We further observed that most genes on MGEs have unknown functions and show intricate distribution patterns. Moreover, AMR genes and anti-CRISPRs are overrepresented in the ESKAPE mobilome. Our results also underscored species-specific trends shaping the number of MGEs, AMR, and virulence genes across pairs of conspecific ESKAPE genomes with and without CRISPR-Cas systems. Finally, we found that CRISPR targets vary according to MGE type: while plasmid CRISPRs almost exclusively target other plasmids, ICEs/IME CRISPRs preferentially target ICEs/IMEs and prophages. Overall, our study highlights the general importance of the ESKAPE mobilome in contributing to the spread of AMR and mediating conflict among MGEs.


Perspectives and benefits of high-throughput long-read sequencing in microbial ecology

Applied and Environmental Microbiology ◽

10.1128/aem.00626-21 ◽

2021 ◽

Author(s):

Leho Tedersoo ◽

Mads Albertsen ◽

Sten Anslan ◽

Benjamin Callahan

Keyword(s):

Microbial Ecology ◽

High Throughput ◽

Single Molecule ◽

High Throughput Sequencing ◽

Environmental Dna ◽

Nanopore Sequencing ◽

High Quality ◽

Short Read ◽

Sequencing Technologies ◽

Long Read

Short-read, high-throughput sequencing (HTS) methods have yielded numerous important insights into microbial ecology and function. Yet, in many instances short-read HTS techniques are suboptimal, for example by providing insufficient phylogenetic resolution or low integrity of assembled genomes. Single-molecule and synthetic long-read (SLR) HTS methods have successfully ameliorated these limitations. In addition, nanopore sequencing has generated a number of unique analysis opportunities such as rapid molecular diagnostics and direct RNA sequencing, and both PacBio and nanopore sequencing support detection of epigenetic modifications. Although initially suffering from relatively low sequence quality, recent advances have greatly improved the accuracy of long read sequencing technologies. In spite of great technological progress in recent years, the long-read HTS methods (PacBio and nanopore sequencing) are still relatively costly, require large amounts of high-quality starting material, and commonly need specific solutions in various analysis steps. Despite these challenges, long-read sequencing technologies offer high-quality, cutting-edge alternatives for testing hypotheses about microbiome structure and functioning as well as assembly of eukaryote genomes from complex environmental DNA samples.


Characterization of Mobile Genetic Elements Using Long-Read Sequencing for Tracking Listeria monocytogenes from Food Processing Environments

Pathogens ◽

10.3390/pathogens9100822 ◽

2020 ◽

Vol 9 (10) ◽

pp. 822

Author(s):

Hee Jin Kwon ◽

Zhao Chen ◽

Peter Evans ◽

Jianghong Meng ◽

Yi Chen

Keyword(s):

Listeria Monocytogenes ◽

Food Processing ◽

Genetic Relatedness ◽

Mobile Genetic Elements ◽

Recent Common Ancestor ◽

Nucleotide Substitution Rate ◽

Genetic Elements ◽

Most Recent Common Ancestor ◽

Long Read ◽

The U.S

Recently developed nanopore sequencing technologies offer a unique opportunity to rapidly close the genome and to identify complete sequences of mobile genetic elements (MGEs). In this study, 17 isolates of Listeria monocytogenes (Lm) epidemic clone II (ECII) from seven ready-to-eat meat or poultry processing facilities, not known to be associated with outbreaks, were shotgun sequenced, and among them, five isolates were further subjected to long-read sequencing. Additionally, 26 genomes of Lm ECII isolates associated with three listeriosis outbreaks in the U.S. and South Africa were obtained from the National Center for Biotechnology Information (NCBI) database and analyzed to evaluate if MGEs may be used as a high-resolution genetic marker for identifying and sourcing the origin of Lm. The analyses identified four comK prophages in 11 non-outbreak isolates from four facilities and three comK prophages in 20 isolates associated with two outbreaks that occurred in the U.S. In addition, three different plasmids were identified among 10 non-outbreak isolates and 14 outbreak isolates. Each comK prophage and plasmid was conserved among the isolates sharing it. Different prophages from different facilities or outbreaks had significant genetic variations, possibly due to horizontal gene transfer. Phylogenetic analysis showed that isolates from the same facility or the same outbreak always closely clustered. The time of the most recent common ancestor of the Lm ECII isolates was estimated to be in March 1816 with the average nucleotide substitution rate of 3.1 × 10⁻⁷ substitutions per site per year. This study showed that complete MGE sequences provide a good signal to determine the genetic relatedness of Lm isolates, to identify persistence or repeated contamination that occurred within food processing environments, and to study the evolutionary history among closely related isolates.
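To give the reported substitution rate a sense of scale, the rate per site per year can be multiplied by the number of sites and the elapsed time. The sketch below does only that arithmetic. The function names are hypothetical, and the 2.9 Mbp genome length is an assumed approximate size for an Lm genome, not a value given in the abstract; the study's own estimate would depend on the alignment it actually used.

```python
def expected_substitutions(rate_per_site_per_year, n_sites, years=1.0):
    """Expected substitutions along one lineage over the given time span."""
    return rate_per_site_per_year * n_sites * years


def expected_pairwise_differences(rate_per_site_per_year, n_sites, years_since_split):
    """Two lineages diverge independently, so the expected count doubles."""
    return 2.0 * expected_substitutions(rate_per_site_per_year, n_sites, years_since_split)


if __name__ == "__main__":
    rate = 3.1e-7              # substitutions per site per year (from the abstract)
    assumed_sites = 2_900_000  # bp; assumption for illustration only
    # Under these assumptions this is a little under one substitution
    # per genome per year along a single lineage.
    print(expected_substitutions(rate, assumed_sites))
    # Expected differences between two isolates that split 10 years ago.
    print(expected_pairwise_differences(rate, assumed_sites, 10))
```

This simple product ignores recombination, selection and rate variation among sites, all of which a dated phylogeny would model explicitly.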


Complete and high-quality genomes of novel microbial species from a meromictic lake using a workflow combining long- and short-read sequencing platforms

10.1101/2021.05.07.443067 ◽

2021 ◽

Author(s):

Yu-Hsiang Chen ◽

Pei-Wen Chiang ◽

Denis Yu Rogozin ◽

Andrey Georgievich Degermendzhy ◽

Hsiu-Hui Chiu ◽

...

Keyword(s):

Novel Species ◽

Bacterial Genome ◽

Meromictic Lake ◽

High Quality ◽

Good Opportunity ◽

Short Read ◽

Sequencing Technologies ◽

Complete Genomes ◽

Long Read ◽

Sequencing Platforms

Background: Most of Earth's bacteria have yet to be cultivated. The metabolic and functional potentials of these uncultivated microorganisms thus remain mysterious, and the metagenome-assembled genome (MAG) approach is the most robust method for uncovering these potentials. However, MAGs discovered by conventional metagenomic assembly and binning methods are usually highly fragmented genomes with heterogeneous sequence contamination, and this affects the accuracy and sensitivity of genomic analyses. Though the maturation of long-read sequencing technologies provides a good opportunity to fix the problem of highly fragmented MAGs, the method's error-prone nature causes severe problems for long-read-only metagenomics. Hence, methods are urgently needed to retrieve MAGs by a combination of both long- and short-read technologies to advance genome-centric metagenomics. Results: In this study, we combined Illumina and Nanopore data to develop a new workflow to reconstruct 233 MAGs (six novel bacterial orders, 20 families, 66 genera, and 154 species) from Lake Shunet, a secluded meromictic lake in Siberia. Those new MAGs were underrepresented or undetectable in other MAG studies using metagenomes from human or other common organisms or habitats. Using this newly developed workflow and strategy, the average N50 of reconstructed MAGs increased 10- to 40-fold compared to when the conventional Illumina assembly and binning method was used. More importantly, six complete MAGs were recovered from our datasets, five of which belong to novel species. We used these as examples to demonstrate many novel and intriguing genomic characteristics discovered in these newly complete genomes and proved the importance of high-quality complete MAGs in microbial genomics and metagenomics studies. Conclusions: The results show that it is feasible to apply our workflow with a few additional long reads to recover numerous complete and high-quality MAGs from short-read metagenomes of high microbial diversity environment samples. The unique features we identified from five complete genomes highlight the robustness of this method in genome-centric metagenomic research. The recovery of 154 novel species MAGs from a rarely explored lake greatly expands the current bacterial genome encyclopedia and broadens our knowledge by adding new genomic characteristics of bacteria. It demonstrates a strong need to recover MAGs from diverse unexplored habitats in the search for microbial dark matter.
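The N50 statistic used in this abstract to compare assemblies is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. A minimal sketch of the standard definition follows; the function names and the toy contig lengths are illustrative and do not come from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def fold_improvement(new_lengths, old_lengths):
    """Ratio of N50 values between two assemblies of the same genome."""
    return n50(new_lengths) / n50(old_lengths)


if __name__ == "__main__":
    fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy short-read-like bin
    contiguous = [40, 14]                      # toy hybrid-like bin
    print(n50(fragmented), n50(contiguous))
    print(fold_improvement(contiguous, fragmented))
```

Because N50 depends on the total length, comparing it across bins of very different sizes can mislead; the fold change is most meaningful for two assemblies of the same genome.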
