Improving contig binning of metagenomic data using $d_2^S$ oligonucleotide frequency dissimilarity

2017 ◽  
Vol 18 (1) ◽  
Author(s):  
Ying Wang ◽  
Kun Wang ◽  
Yang Young Lu ◽  
Fengzhu Sun
2020 ◽  
Vol 94 (11) ◽  
Author(s):  
Shengzhong Xu ◽  
Liang Zhou ◽  
Xiaosha Liang ◽  
Yifan Zhou ◽  
Hao Chen ◽  
...  

ABSTRACT Virophages are small parasitic double-stranded DNA (dsDNA) viruses of giant dsDNA viruses infecting unicellular eukaryotes. Except for a few isolated virophages characterized by parasitization mechanisms, features of virophages discovered in metagenomic data sets remain largely unknown. Here, the complete genomes of seven virophages (26.6 to 31.5 kbp) and four large DNA viruses (190.4 to 392.5 kbp) that coexist in the freshwater lake Dishui Lake, Shanghai, China, have been identified based on environmental metagenomic investigation. Both genomic and phylogenetic analyses indicate that Dishui Lake virophages (DSLVs) are closely related to each other and to other lake virophages, and Dishui Lake large DNA viruses are affiliated with the micro-green alga-infecting Prasinovirus of the Phycodnaviridae (named Dishui Lake phycodnaviruses [DSLPVs]) and protist (protozoan and alga)-infecting Mimiviridae (named Dishui Lake large alga virus [DSLLAV]). The DSLVs possess more genes with closer homology to those of large alga viruses than to those of giant protozoan viruses. Furthermore, the DSLVs are strongly associated with large green alga viruses, including DSLPV4 and DSLLAV1, based on codon usage as well as oligonucleotide frequency and correlation analyses. Surprisingly, a nonhomologous CRISPR-Cas-like system is found in DSLLAV1, which appears to protect DSLLAV1 from the parasitization of DSLV5 and DSLV8. These results suggest that novel cell-virus-virophage (CVv) tripartite infection systems of green algae, large green alga virus (Phycodnaviridae- and Mimiviridae-related), and virophage exist in Dishui Lake, which will contribute to further deep investigations of the evolutionary interaction of virophages and large alga viruses as well as of the essential roles that the CVv plays in the ecology of algae.
IMPORTANCE Virophages are small parasitizing viruses of large/giant viruses. To our knowledge, the few isolated virophages all parasitize giant protozoan viruses (Mimiviridae) for propagation and form a tripartite infection system with hosts, here named the cell-virus-virophage (CVv) system. However, the CVv system remains largely unknown in environmental metagenomic data sets. In this study, we systematically investigated the metagenomic data set from the freshwater lake Dishui Lake, Shanghai, China. Consequently, four novel large alga viruses and seven virophages were discovered to coexist in Dishui Lake. Surprisingly, a novel CVv tripartite infection system comprising green algae, large green alga viruses (Phycodnaviridae- and Mimiviridae-related), and virophages was identified based on genetic link, genomic signature, and CRISPR system analyses. Meanwhile, a nonhomologous CRISPR-like system was found in Dishui Lake large alga viruses, which appears to protect the virus host from the infection of Dishui Lake virophages (DSLVs). These findings are critical to give insight into the potential significance of CVv in global evolution and ecology.
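To illustrate the kind of genomic-signature comparison the abstract refers to (oligonucleotide frequency and correlation analyses), here is a minimal Python sketch. It is not the authors' method: the choice of tetranucleotides (k = 4), the use of Pearson correlation, and the function names `kmer_frequencies` and `pearson` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k A/C/G/T k-mers in seq.

    k-mers containing other symbols (e.g. N) are ignored.
    The output order follows itertools.product("ACGT", repeat=k).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[w] / total for w in kmers]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy sequences (made up); real genomes would be read from FASTA files.
    virophage = "ATGCGATATATTTAGCGATATATGCGCTTAATATAT" * 20
    large_virus = "ATATATGCGATTTAGCATATATATGCGATAATTTAT" * 20
    r = pearson(kmer_frequencies(virophage), kmer_frequencies(large_virus))
    print(f"tetranucleotide frequency correlation: {r:.3f}")
```

A higher correlation between two frequency profiles indicates more similar oligonucleotide usage; how the study thresholded or tested such values is not stated in the abstract.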


2020 ◽  
Vol 15 ◽  
Author(s):  
Akshatha Prasanna ◽  
Vidya Niranjan

Background: Since bacteria are the earliest known organisms, there has been significant interest in their variety and biology, particularly concerning human health. Recent advances in metagenomic sequencing (mNGS), a culture-independent sequencing technology, have accelerated the development of clinical microbiology and our understanding of pathogens. Objective: For the implementation of mNGS in routine clinical practice to become feasible, a practical and scalable strategy for the study of mNGS data is essential. This study presents a robust automated pipeline to analyze clinical metagenomic data for pathogen identification and classification. Method: The proposed Clin-mNGS pipeline is an integrated, open-source, scalable, reproducible, and user-friendly framework scripted using the Snakemake workflow management software. The implementation avoids the hassle of manual installation and configuration of the multiple command-line tools and dependencies. The approach directly screens pathogens from clinical raw reads and generates consolidated reports for each sample. Results: The pipeline is demonstrated using publicly available data and is tested on a desktop Linux system and a high-performance cluster. The study compares variability in results from different tools and versions. The versions of the tools are user-modifiable. The pipeline outputs quality checks, filtered reads, host subtraction, assembled contigs, assembly metrics, relative abundances of bacterial species, antimicrobial resistance genes, plasmid findings, and virulence factor identification. The results obtained from the pipeline are evaluated based on sensitivity and positive predictive value. Conclusion: Clin-mNGS is an automated Snakemake pipeline validated for the analysis of microbial clinical metagenomics reads to perform taxonomic classification and antimicrobial resistance prediction.
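The two evaluation metrics named above have standard definitions: sensitivity = TP / (TP + FN) and positive predictive value = TP / (TP + FP). The following Python sketch shows how they could be computed for one sample by comparing a set of expected taxa with a set of reported taxa. The function name `sensitivity_ppv` and the placeholder taxon labels are assumptions for illustration and are not taken from the Clin-mNGS pipeline.

```python
def sensitivity_ppv(expected, detected):
    """Compute sensitivity and positive predictive value (PPV).

    expected: taxa known to be present in the sample (ground truth)
    detected: taxa reported by the classifier
    Returns (sensitivity, ppv); a value is NaN when its denominator is zero.
    """
    expected = set(expected)
    detected = set(detected)
    tp = len(expected & detected)   # true positives: present and reported
    fn = len(expected - detected)   # false negatives: present but missed
    fp = len(detected - expected)   # false positives: reported but absent
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


if __name__ == "__main__":
    # Placeholder labels standing in for species names.
    truth = {"species_a", "species_b", "species_c", "species_d"}
    calls = {"species_a", "species_b", "species_c", "species_e", "species_f"}
    sens, ppv = sensitivity_ppv(truth, calls)
    print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

Set-based comparison treats each taxon as simply present or absent; an evaluation that also weighs relative abundance would need a different formulation.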


Pathogens ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 86
Author(s):  
Erin M. Garcia ◽  
Myrna G. Serrano ◽  
Laahirie Edupuganti ◽  
David J. Edwards ◽  
Gregory A. Buck ◽  
...  

Gardnerella vaginalis has recently been split into 13 distinct species. In this study, we tested the hypotheses that species-specific variations in the vaginolysin (VLY) amino acid sequence could influence the interaction between the toxin and vaginal epithelial cells and that VLY variation may be one factor that distinguishes less virulent or commensal strains from more virulent strains. This was assessed by bioinformatic analyses of publicly available Gardnerella spp. sequences and quantification of cytotoxicity and cytokine production from purified, recombinantly produced versions of VLY. After identifying conserved differences that could distinguish distinct VLY types, we analyzed metagenomic data from a cohort of female subjects from the Vaginal Human Microbiome Project to investigate whether these different VLY types exhibited any significant associations with symptoms or Gardnerella spp.-relative abundance in vaginal swab samples. While Type 1 VLY was most prevalent among the subjects and may be associated with increased reports of symptoms, subjects with Type 2 VLY dominant profiles exhibited increased relative Gardnerella spp. abundance. Our findings suggest that amino acid differences alter the interaction of VLY with vaginal keratinocytes, which may potentiate differences in bacterial vaginosis (BV) immunopathology in vivo.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kazutoshi Yoshitake ◽  
Gaku Kimura ◽  
Tomoko Sakami ◽  
Tsuyoshi Watanabe ◽  
Yukiko Taniuchi ◽  
...  

Abstract: Although numerous metagenome, amplicon sequencing-based studies have been conducted to date to characterize marine microbial communities, relatively few have employed full metagenome shotgun sequencing to obtain a broader picture of the functional features of these marine microbial communities. Moreover, most of these studies only performed sporadic sampling, which is insufficient to understand an ecosystem comprehensively. In this study, we regularly conducted seawater sampling along the northeastern Pacific coast of Japan between March 2012 and May 2016. We collected 213 seawater samples and prepared size-based fractions to generate 454 subsets of samples for shotgun metagenome sequencing and analysis. We also determined the sequences of 16S rRNA (n = 111) and 18S rRNA (n = 47) gene amplicons from smaller sample subsets. We thereafter developed the Ocean Monitoring Database for time-series metagenomic data (http://marine-meta.healthscience.sci.waseda.ac.jp/omd/), which provides a three-dimensional bird’s-eye view of the data. This database includes results of digital DNA chip analysis, a novel method for estimating ocean characteristics such as water temperature from metagenomic data. Furthermore, we developed a novel classification method that includes more information about viruses than that acquired using BLAST. We further report the discovery of a large number of previously overlooked (TAG)n repeat sequences in the genomes of marine microbes. We predict that the availability of this time-series database will lead to major discoveries in marine microbiome research.


Marine Drugs ◽  
2021 ◽  
Vol 19 (8) ◽  
pp. 424
Author(s):  
Osama G. Mohamed ◽  
Sadaf Dorandish ◽  
Rebecca Lindow ◽  
Megan Steltz ◽  
Ifrah Shoukat ◽  
...  

Infections associated with antibiotic-resistant bacteria are a major global healthcare threat. New classes of antimicrobial compounds are urgently needed as the frequency of infections caused by multidrug-resistant microbes continues to rise. Recent metagenomic data have demonstrated that there is still biosynthetic potential that is encoded, but transcriptionally silent, in cultivatable bacterial genomes. However, the culture conditions required to identify and express silent biosynthetic gene clusters that yield natural products with antimicrobial activity are largely unknown. Here, we describe a new antibiotic discovery scheme, dubbed the modified crowded plate technique (mCPT), that utilizes complex microbial interactions to elicit antimicrobial production from otherwise silent biosynthetic gene clusters. Using the mCPT as part of the antibiotic crowdsourcing educational program Tiny Earth®, we isolated over 1400 antibiotic-producing microbes, including 62 showing activity against multidrug-resistant pathogens. The natural product extracts generated from six microbial isolates showed potent activity against vancomycin-intermediate resistant Staphylococcus aureus. We utilized a targeted approach that coupled mass spectrometry data with bioactivity, yielding a new macrolactone class of metabolite, desertomycin H. In this study, we successfully demonstrate a concept that significantly increased our ability to quickly and efficiently identify microbes capable of silent antibiotic production.


Pathobiology ◽  
2021 ◽  
Vol 88 (2) ◽  
pp. 156-169
Author(s):  
Williams Fernandes Barra ◽  
Dionison Pereira Sarquis ◽  
André Salim Khayat ◽  
Bruna Cláudia Meireles Khayat ◽  
Samia Demachki ◽  
...  

Identifying a microbiome pattern in gastric cancer (GC) is hugely debatable due to the variation resulting from the diversity of the studied populations, clinical scenarios, and metagenomic approach. H. pylori remains the main microorganism impacting gastric carcinogenesis and seems necessary for the initial steps of the process. Nevertheless, an additional non-H. pylori microbiome pattern is also described, mainly at the final steps of the carcinogenesis. Unfortunately, most of the presented results are not reproducible, and there are no consensual candidates to share the H. pylori protagonists. Limitations to reach a consistent interpretation of metagenomic data include contamination along every step of the process, which might cause relevant misinterpretations. In addition, the functional consequences of an altered microbiome might be addressed. Aiming to minimize methodological bias and limitations due to small sample size and the lack of standardization of bioinformatics assessment and interpretation, we carried out a comprehensive analysis of the publicly available metagenomic data from various conditions relevant to gastric carcinogenesis. Mainly, instead of just analyzing the results of each available publication, a new approach was launched, allowing the comprehensive analysis of the total sample amount, aiming to produce a reliable interpretation due to using a significant number of samples, from different origins, in a standard protocol. Among the main results, Helicobacter and Prevotella figured in the “top 6” genera of every group. Helicobacter was the first one in chronic gastritis (CG), gastric cancer (GC), and adjacent (ADJ) groups, while Prevotella was the leader among healthy control (HC) samples. Groups of bacteria are differently abundant in each clinical situation, and bacterial metabolic pathways also diverge along the carcinogenesis cascade. This information may support future microbiome interventions aiming to face the carcinogenesis process and/or reduce GC risk.


GigaScience ◽  
2021 ◽  
Vol 10 (2) ◽  
Author(s):  
Guilhem Sempéré ◽  
Adrien Pétel ◽  
Magsen Abbé ◽  
Pierre Lefeuvre ◽  
Philippe Roumagnac ◽  
...  

Background: Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research. Results: metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data. Conclusion: metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.


2021 ◽  
Author(s):  
Jinglie Zhou ◽  
Susanna M. Theroux ◽  
Clifton P. Bueno de Mesquita ◽  
Wyatt H. Hartman ◽  
Ye Tian ◽  
...  

Abstract: Wetlands are important carbon (C) sinks, yet many have been destroyed and converted to other uses over the past few centuries, including industrial salt making. A renewed focus on wetland ecosystem services (e.g., flood control and habitat) has resulted in numerous restoration efforts whose effect on microbial communities is largely unexplored. We investigated the impact of restoration on microbial community composition, metabolic functional potential, and methane flux by analyzing sediment cores from two unrestored former industrial salt ponds, a restored former industrial salt pond, and a reference wetland. We observed elevated methane emissions from unrestored salt ponds compared to the restored and reference wetlands, which was positively correlated with salinity and sulfate across all samples. 16S rRNA gene amplicon and shotgun metagenomic data revealed that the restored salt pond harbored communities more phylogenetically and functionally similar to the reference wetland than to unrestored ponds. Archaeal methanogenesis genes were positively correlated with methane flux, as were genes encoding enzymes for bacterial methylphosphonate degradation, suggesting methane is generated both from bacterial methylphosphonate degradation and archaeal methanogenesis in these sites. These observations demonstrate that restoration effectively converted industrial salt pond microbial communities back to compositions more similar to reference wetlands and lowered salinities, sulfate concentrations, and methane emissions.


Data in Brief ◽  
2017 ◽  
Vol 13 ◽  
pp. 738-741 ◽  
Author(s):  
Lukhanyo Mekuto ◽  
Seteno K.O. Ntwampe ◽  
John B.N. Mudumbi ◽  
Enoch A. Akinpelu ◽  
Maxwell Mewa-Ngongang
