SCAPP: an algorithm for improved plasmid assembly in metagenomes

Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
David Pellow ◽  
Alvah Zorea ◽  
Maraike Probst ◽  
Ori Furman ◽  
Arik Segal ◽  
...  

Abstract Background: Metagenomic sequencing has led to the identification and assembly of many new bacterial genome sequences. These bacteria often contain plasmids: usually small, circular double-stranded DNA molecules that may transfer across bacterial species and confer antibiotic resistance. These plasmids are generally less studied and understood than their bacterial hosts, in part because of a lack of computational tools for analyzing plasmids in metagenomic samples. Results: We developed SCAPP (Sequence Contents-Aware Plasmid Peeler), an algorithm and tool to assemble plasmid sequences from metagenomic sequencing data. SCAPP builds on some key ideas from the Recycler algorithm while improving plasmid assemblies by integrating biological knowledge about plasmids. We compared the performance of SCAPP to Recycler and metaplasmidSPAdes on simulated metagenomes, real human gut microbiome samples, and a human gut plasmidome dataset that we generated. We also created plasmidome and metagenome data from the same cow rumen sample and used the parallel sequencing data to create a novel assessment procedure. Overall, SCAPP outperformed Recycler and metaplasmidSPAdes across this wide range of datasets. Conclusions: SCAPP is an easy-to-use Python package that enables the assembly of full plasmid sequences from metagenomic samples. It outperformed existing metagenomic plasmid assemblers in most cases and assembled novel and clinically relevant plasmids in samples we generated, such as a human gut plasmidome. SCAPP is open-source software available from: https://github.com/Shamir-Lab/SCAPP.



2020 ◽  
Author(s):  
Sandeep Chakraborty

The metagenome of patients infected with SARS-CoV-2 [1] has shown Prevotella to be a key player in immune response [2] in one Chinese study [3], just starting in another [4], and a host of other opportunistic pathogens in a study from San Diego County [5]. The metagenome can also be queried to find host response genes [5], as was done in monkey cells infected with SARS-CoV-2 [6].

Nanopore sequencing data from a familial cluster in Shenzhen

The patients were tested for four bacterial species: Bordetella pertussis, Bordetella parapertussis, Chlamydophila pneumoniae, and Mycoplasma pneumoniae. The sequencing data (Accid:SRR10948474, Nanopore) from five patients in a family cluster from Shenzhen who presented with unexplained pneumonia after returning from Wuhan (Table 1) show a wide range of bacterial species, with Lautropia, Cutibacterium, and Haemophilus being most abundant. The presence of Campylobacter explains the diarrhea seen in the patient [7,8]. Also, their tests should have detected Mycoplasma, since it is present in the data.

Significant bacterial load with some bacterial species predominating

The bacterial reads are about 20% of the total (95K out of 500K reads). The viral load is also significant here (70K reads) [2]. The reads are in SI.familial/allsequences.fa. The number of bacterial species (with at least two reads) is 876 (SI.familial/list.allbacteria.txt). Thus, it is important to consider secondary infection, a possible reason why azithromycin (in addition to hydroxychloroquine) has given good initial results in a clinical trial [9].


mSphere ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Michelle Spoto ◽  
Changhui Guan ◽  
Elizabeth Fleming ◽  
Julia Oh

ABSTRACT The CRISPR/Cas system has significant potential to facilitate gene editing in a variety of bacterial species. CRISPR interference (CRISPRi) and CRISPR activation (CRISPRa) represent modifications of the CRISPR/Cas9 system utilizing a catalytically inactive Cas9 protein for transcription repression and activation, respectively. While CRISPRi and CRISPRa have tremendous potential to systematically investigate gene function in bacteria, few programs are specifically tailored to identify guides genome-wide in draft bacterial genomes. Furthermore, few programs offer open-source code with flexible design parameters for bacterial targeting. To address these limitations, we created GuideFinder, a customizable, user-friendly program that can design guides for any annotated bacterial genome. GuideFinder designs guides from NGG protospacer-adjacent motif (PAM) sites for any number of genes, using an annotated genome and a FASTA file supplied by the user. Guides are filtered according to user-defined design parameters and removed if they contain any off-target matches. Iteration with lowered parameter thresholds allows the program to design guides for genes that did not produce guides with the more stringent parameters, one of several features unique to GuideFinder. GuideFinder can also identify paired guides for targeting multiplicity, whose validity we tested experimentally. GuideFinder has been tested on a variety of diverse bacterial genomes, finding guides for 95% of genes on average. Moreover, guides designed by the program are functionally useful, with CRISPRi as the focal application, as demonstrated by essential gene knockdown in two staphylococcal species. Through the large-scale generation of guides, this open-access software will improve accessibility to CRISPR/Cas studies of a variety of bacterial species.
IMPORTANCE With the explosion in our understanding of human and environmental microbial diversity, corresponding efforts to understand gene function in these organisms are strongly needed. CRISPR/Cas9 technology has revolutionized interrogation of gene function in a wide variety of model organisms. Efficient CRISPR guide design is required for systematic gene targeting. However, existing tools are not adapted for the broad needs of microbial targeting, which include extraordinary species and subspecies genetic diversity, the overwhelming majority of which is characterized by draft genomes. In addition, flexibility in guide design parameters is important to consider the wide range of factors that can affect guide efficacy, many of which can be species and strain specific. We designed GuideFinder, a customizable, user-friendly program that addresses the limitations of existing software and that can design guides for any annotated bacterial genome with numerous features that facilitate guide design in a wide variety of microorganisms.
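The guide-selection step described in the abstract (scan for NGG PAM sites, take the adjacent protospacer, filter by design parameters) can be illustrated with a minimal sketch. This is not GuideFinder's code: the function name `find_ngg_guides`, the 20-nt guide length, and the GC-content bounds are assumptions chosen for illustration, and only the forward strand is scanned, with no off-target check.

```python
import re


def find_ngg_guides(seq, guide_len=20, min_gc=0.0, max_gc=1.0):
    """Return (start, guide, pam) tuples for NGG PAM sites on the forward strand.

    Hypothetical helper: guide_len, min_gc and max_gc are illustrative
    design parameters, not GuideFinder's actual options.
    """
    seq = seq.upper()
    guides = []
    # A lookahead is used so that overlapping PAM sites are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough upstream sequence for a full guide
        guide = seq[pam_start - guide_len:pam_start]
        gc = (guide.count("G") + guide.count("C")) / guide_len
        if min_gc <= gc <= max_gc:
            guides.append((pam_start - guide_len, guide, match.group(1)))
    return guides
```

Calling `find_ngg_guides(gene_sequence, min_gc=0.3, max_gc=0.8)` and then retrying with looser bounds for genes that return nothing mirrors, in spirit, the iterative threshold lowering the abstract describes.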


Viruses ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1143 ◽  
Author(s):  
Rasmus Riemer Jakobsen ◽  
Thor Haahr ◽  
Peter Humaidan ◽  
Jørgen Skov Jensen ◽  
Witold Piotr Kot ◽  
...  

Bacterial vaginosis (BV) is characterized by a reduction in Lactobacillus (L.) spp. abundance and increased abundance of facultative anaerobes, such as Gardnerella spp. BV aetiology is not fully understood; however, bacteriophages could play a pivotal role in the perturbation of the vaginal bacterial community. We investigated the vaginal viral community, including bacteriophages, and its association with the bacterial community and BV-status. Vaginal samples from 48 patients undergoing IVF treatment for non-female factor infertility were subjected to metagenomic sequencing of purified virus-like particles. The vaginal viral community was characterized and correlated with the BV-status by Nugent score, bacterial community structure, and the presence of key vaginal bacterial species. The majority of identified vaginal viruses belonged to the class of double-stranded DNA bacteriophages, with eukaryotic viruses constituting 4% of the total reads. Clear links between the viral community composition and BV (q = 0.006, R = 0.26) as well as the presence of L. crispatus (q = 0.001, R = 0.43), L. iners, Gardnerella spp., and Atopobium vaginae were found (q < 0.002, R > 0.15). The eukaryotic viral community also correlated with BV-status (q = 0.018, R = 0.20). In conclusion, the vaginal virome was clearly linked with bacterial community structure and BV-status.


2021 ◽  
Author(s):  
Fred J. Heller ◽  
Hasan Al Banna ◽  
M. Hasanul Kaisar ◽  
Denise Chac ◽  
Fahima Chowdhury ◽  
...  

Background: Oral cholera vaccines (OCVs) are an important tool for reduction of the worldwide cholera burden, but some individuals who receive an OCV do not develop protective immune responses. The gut microbiota is a potential explanation for these differences. Components of the gut microbiota associated with differences in OCV response have not been identified. Results: We used metagenomic sequencing to identify predicted protein-coding genes in the gut microbiota at the time of OCV administration, and then measured immune responses to vaccination. Vaccine recipients were classified as OCV 'responders' if they developed a post-vaccination increase in memory B cell populations that produce IgA or IgG specific for cholera toxin and the V. cholerae O-specific polysaccharide. We next analyzed microbial genes seen at similar abundances across individual samples and classified these into co-abundant gene groupings (CAGs), and correlated CAGs with OCV responses. Next, to identify specific bacterial strains associated with OCV responses, we mapped CAGs to bacterial genomes and generated a 'priority score' for each strain detected in the study population. This score reflects both the number of CAGs aligning to a specific bacterial genome and the strength of the association between the CAGs and the vaccine response. This strain-level analysis revealed relationships between the gut microbiota and immune response to OCV that were not detected at the genus or species level. Bacterial strains which produce short-chain fatty acids and those with sphingolipid-containing cell membranes were correlated with more robust immune responses to vaccination. Conclusion: Our study demonstrates a method for translating metagenomic sequencing data into strain-specific results associated with a biological outcome. 
Using this approach, we identified strains for the study of bacterial-derived molecules or metabolites associated with immune responses; such agents might have potential utility as vaccine adjuvants.
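The abstract does not give the formula for the 'priority score', only that it reflects both the number of CAGs aligning to a genome and the strength of each CAG's association with vaccine response. One simple scoring rule with those two properties is the sum of absolute association strengths over the CAGs mapped to a strain. The sketch below uses that rule purely as an assumption; the function name, the input dictionaries, and the choice of summing absolute values are all hypothetical and should not be read as the authors' method.

```python
from collections import defaultdict


def strain_priority_scores(cag_to_strains, cag_association):
    """Score each strain by summing |association| over the CAGs mapped to it.

    Assumed scoring rule for illustration only: more aligned CAGs and
    stronger associations both raise the score.
    """
    scores = defaultdict(float)
    for cag, strains in cag_to_strains.items():
        # CAGs without a measured association contribute nothing.
        weight = abs(cag_association.get(cag, 0.0))
        for strain in strains:
            scores[strain] += weight
    return dict(scores)
```

For example, with `{"CAG1": ["A", "B"], "CAG2": ["A"]}` and associations `{"CAG1": 0.5, "CAG2": -0.25}`, strain A outranks strain B because two associated CAGs align to it.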


2021 ◽  
Vol 12 ◽  
Author(s):  
Li Song ◽  
Lu Zhang ◽  
Xiaodong Fang

The diversity and high genomic mutation rates of viral species hinder our understanding of viruses and their contributions to human health. The characteristics of viral enterotypes, as a description of the gut virome, have not been thoroughly studied. Here we investigated the human gut virome composition using previously published sequencing data of 2,690 metagenomes from seven countries with various phenotypes. We found that the virome was dominated by double-stranded DNA viruses in our data, and young children and adults showed different stages in their fecal enterovirus composition. Beta diversity analysis showed that viromes were significantly less homogeneous in individuals with severe disorders of bile acid secretion, such as cirrhosis. In contrast, there were no significant differences in distances to centroids or viral components between patients with phenotypes unrelated to bile acid, such as hypertension. Enterotypes determined independently from various projects showed similar specific viruses and enrichment direction. Confounding factors, such as different sequencing platforms and library construction, did not confound enterotyping. The gut virome composition pattern could be described by two viral enterotypes, which supported a discrete, rather than a gradient, distribution. Three main components (enterotype 1-specific viruses, enterotype 2-specific viruses, and all other viruses) account for the total viral variation in these sets. Compared with enterotype 2, enterotype 1 had a higher viral count, Shannon index, and similarity between samples. The relative abundance of enterotype-specific viruses is a crucial determinant of enterotype assignment. Samples not matching any of the defined enterotypes in the database did not necessarily correlate with sickness. Therefore, the background context must be carefully considered when using a viral enterotype as a feature for disease prediction. Our results provide important insights into human gut virome composition by exploring two main viral enterotypes in the population and providing an alternate covariate for early disease screening.
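The Shannon index used above to compare enterotypes is the standard diversity measure H = -Σ p_i ln p_i, where p_i is the relative abundance of the i-th virus in a sample. A minimal sketch follows, assuming the input is a list of per-virus read counts for one sample; the function name and the use of the natural logarithm are illustrative choices, not details taken from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # an empty sample has no diversity to measure
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in proportions)
```

A sample dominated by a single virus scores near zero, while one with evenly distributed abundances over k viruses scores ln k, which is the sense in which a higher index indicates a more diverse virome.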


2018 ◽  
Author(s):  
Janko Tackmann ◽  
João Frederico Matias Rodrigues ◽  
Christian von Mering

Abstract The recent explosion of metagenomic sequencing data opens the door towards the modeling of microbial ecosystems in unprecedented detail. In particular, co-occurrence based prediction of ecological interactions could strongly benefit from this development. However, current methods fall short on several fronts: univariate tools do not distinguish between direct and indirect interactions, resulting in excessive false positives, while approaches with better resolution are so far computationally highly limited. Furthermore, confounding variables typical for cross-study data sets are rarely addressed. We present FlashWeave, a new approach based on a flexible Probabilistic Graphical Models framework to infer highly resolved direct microbial interactions from massive heterogeneous microbial abundance data sets with seamless integration of metadata. On a variety of benchmarks, FlashWeave outperforms state-of-the-art methods by several orders of magnitude in terms of speed while generally providing increased accuracy. We apply FlashWeave to a cross-study data set of 69 818 publicly available human gut samples, resulting in one of the largest and most diverse models of microbial interactions in the human gut to date.


2016 ◽  
Vol 113 (9) ◽  
pp. 2502-2507 ◽  
Author(s):  
Ákos Nyerges ◽  
Bálint Csörgő ◽  
István Nagy ◽  
Balázs Bálint ◽  
Péter Bihari ◽  
...  

Currently available tools for multiplex bacterial genome engineering are optimized for a few laboratory model strains, demand extensive prior modification of the host strain, and lead to the accumulation of numerous off-target modifications. Building on prior development of multiplex automated genome engineering (MAGE), our work addresses these problems in a single framework. Using a dominant-negative mutant protein of the methyl-directed mismatch repair (MMR) system, we achieved a transient suppression of DNA repair in Escherichia coli, which is necessary for efficient oligonucleotide integration. By integrating all necessary components into a broad-host vector, we developed a new workflow we term pORTMAGE. It allows efficient modification of multiple loci, without any observable off-target mutagenesis or prior modification of the host genome. Because of the conserved nature of the bacterial MMR system, pORTMAGE simultaneously allows genome editing and mutant library generation in other biotechnologically and clinically relevant bacterial species. Finally, we applied pORTMAGE to study a set of antibiotic resistance-conferring mutations in Salmonella enterica and E. coli. Despite over 100 million years of divergence between the two species, mutational effects remained generally conserved. In sum, a single transformation of a pORTMAGE plasmid allows bacterial species of interest to become an efficient host for genome engineering. These advances pave the way toward biotechnological and therapeutic applications. Finally, pORTMAGE allows systematic comparison of mutational effects and epistasis across a wide range of bacterial species.


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 331
Author(s):  
Nachon Raethong ◽  
Massalin Nakphaichit ◽  
Narissara Suratannon ◽  
Witida Sathitkowitchai ◽  
Wanlapa Weerapakorn ◽  
...  

The gut microbiome plays a major role in the maintenance of human health. Characterizing the taxonomy and metabolic functions of the human gut microbiome is necessary for enhancing health. Here, we carried out metagenomic sequencing, assembly and construction of a meta-gene catalogue of the human gut microbiome, with the overall aim of investigating the taxonomy and metabolic functions of the gut microbiome in Thai adults. Integrative analysis of 16S rRNA gene and whole metagenome shotgun (WMGS) sequencing data revealed that the dominant gut bacterial families were Lachnospiraceae and Ruminococcaceae of the Firmicutes phylum. Consistently, across 3.8 million (M) genes annotated from 163.5 gigabases (Gb) of WMGS sequencing data, a significant number of genes associated with carbohydrate metabolism of the dominant bacterial families were identified. Further identification of bacterial community-wide metabolic functions highlighted the importance of Roseburia and Faecalibacterium in central carbon metabolism, sugar utilization and metabolism towards butyrate biosynthesis. This work presents an initial study of shotgun metagenomics in a Thai population-based cohort in a developing Southeast Asian country.


2019 ◽  
Author(s):  
Samantha C. Waterworth ◽  
Eric W. Isemonger ◽  
Evan R. Rees ◽  
Rosemary A. Dorrington ◽  
Jason C. Kwan

SUMMARY Stromatolites are complex microbial mats that form lithified layers, and ancient forms are the oldest evidence of life on earth, dating back over 3.4 billion years. Modern stromatolites are relatively rare but may provide clues about the function and evolution of their ancient counterparts. In this study, we focus on peritidal stromatolites occurring at Cape Recife and Schoenmakerskop on the southeastern South African coastline. Using assembled shotgun metagenomic data we obtained 183 genomic bins, of which the most dominant taxa were from the Cyanobacteriia class (Cyanobacteria phylum), with lower but notable abundances of bacteria classified as Alphaproteobacteria, Gammaproteobacteria and Bacteroidia. We identified functional gene sets in bacterial species conserved across two geographically distinct stromatolite formations, which may promote carbonate precipitation through the reduction of nitrogenous compounds and possible production of calcium ions. We propose that an abundance of extracellular alkaline phosphatases may lead to the formation of phosphatic deposits within these stromatolites. We conclude that the cumulative effect of several conserved bacterial species drives accretion in these two stromatolite formations.
ORIGINALITY-SIGNIFICANCE Peritidal stromatolites are unique among stromatolite formations as they grow at the dynamic interface of calcium carbonate-rich groundwater and coastal marine waters. The peritidal space forms a relatively unstable environment, and the factors that influence the growth of these peritidal structures are not well understood. To our knowledge, this is the first comparative study that assesses species conservation within the microbial communities of two geographically distinct peritidal stromatolite formations. We assessed the potential functional roles of these communities using genomic bins clustered from metagenomic sequencing data. We identified several conserved bacterial species across the two sites and hypothesize that their genetic functional potential may be important in the formation of peritidal stromatolites. We contrasted these findings against a well-studied site in Shark Bay, Australia and show that, unlike these hypersaline formations, archaea do not play a major role in peritidal stromatolite formation. Furthermore, bacterial nitrogen and phosphate metabolisms of conserved species may be driving factors behind lithification in peritidal stromatolites.

