taxonomic annotation
Recently Published Documents


TOTAL DOCUMENTS

38
(FIVE YEARS 18)

H-INDEX

7
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Carole Belliardo ◽  
Georgios Koutsovoulos ◽  
Corinne Rancurel ◽  
Mathilde Clement ◽  
Justine Lipuma ◽  
...  

Background | Over recent decades, shotgun metagenomics and metabarcoding have highlighted the diversity of microorganisms in environmental and host-associated samples. Most public repositories of assembled metagenomes use annotation pipelines tailored for prokaryotes, regardless of the taxonomic origin of contigs and metagenome-assembled genomes (MAGs). Consequently, eukaryotic contigs and MAGs, which have intrinsically different gene features, are not optimally annotated, and the eukaryotic component of biodiversity is misrepresented despite its biological relevance. Results | Using an automated analysis pipeline, we filtered eukaryotic contigs from 6,873 soil metagenomes from the IMG/M database of the Joint Genome Institute. We re-annotated genes using eukaryote-tailored methods, yielding 5.6 million eukaryotic proteins. Our pipeline improves the completeness, contiguity and quality of eukaryotic proteins. Moreover, the better quality of the eukaryotic proteins, combined with a more comprehensive assignment method, also improves the taxonomic annotation. Conclusions | Using public soil metagenomic data, we provide a dataset of eukaryotic soil proteins with improved completeness and quality as well as a more reliable taxonomic annotation. This unique resource is of interest to any scientist studying the composition, biological functions and gene flux of soil communities involving eukaryotes.


2021 ◽  
Vol 9 (6) ◽  
pp. 1297
Author(s):  
Valentín Pérez-Hernández ◽  
Mario Hernández-Guzmán ◽  
Marco Luna-Guido ◽  
Yendi E. Navarro-Noya ◽  
Elda M. Romero-Tepal ◽  
...  

We studied three soils of the former Lake Texcoco with different electrolytic conductivity (1.9 dS m−1, 17.3 dS m−1, and 33.4 dS m−1) and pH (9.3, 10.4, and 10.3). The soils were amended with young maize plants or with their neutral detergent fibre (NDF) fraction and aerobically incubated in the laboratory for 14 days, while the soil bacterial community structure was monitored by 454 pyrosequencing of the 16S rRNA marker gene. We identified specific bacterial groups that showed adaptability to soil salinity, i.e., Prauseria in soil amended with young maize plants and Marinobacter in soil amended with NDF. The soils with higher salinity (17.3 dS m−1, 33.4 dS m−1) showed more enriched bacterial genera than the soil with low salinity (1.9 dS m−1). Functional prediction showed that members of Alpha-, Gamma-, and Deltaproteobacteria, which are known to adapt to extreme conditions such as salinity and low soil nutrient content, were involved in lignocellulose degradation, e.g., Marinimicrobium and Pseudomonas as cellulose degraders, and Halomonas and Methylobacterium as lignin degraders. This research showed that taxonomic annotation and functional prediction both highlighted keystone bacterial groups able to degrade complex C-compounds, such as lignin and (hemi)cellulose, in the extreme saline-alkaline soil of the former Lake Texcoco.


2021 ◽  
Vol 5 (1) ◽  
pp. 006-012
Author(s):  
Kim Gihyeon ◽  
Yoon Kyoung Wan ◽  
Park Changho ◽  
Kang Kyu Hyuck ◽  
Kim Sujeong ◽  
...  

Advances in metagenomics have facilitated population studies of associations between microbial compositions and host properties, but strategies to minimize biases in these population analyses are needed. In particular, the effects of storage conditions, including freezing and preservation buffer, on microbial populations in fecal samples have not been studied sufficiently. In this study, we investigated metagenomic differences between fecal samples stored under different conditions. We collected 46 fecal samples from patients with lung cancer. DNA quality and microbial composition were compared between storage methods through 16S rRNA sequencing and subsequent analysis. DNA quality and sequencing results for the two storage conditions (freezing and preservation in buffer) did not differ significantly, whereas microbial information was better preserved in buffer than by freezing. In a metagenomic analysis, we observed that the microbial compositional distance was small within the same storage condition. Taxonomic annotation revealed that many microbes differed in abundance between frozen and buffer-preserved feces. In particular, the abundances of Firmicutes and Bacteroidetes varied depending on storage conditions. Microbes belonging to these phyla differed, resulting in biases in population metagenomic analysis. We suggest that a unified storage method is required for accurate population metagenomic studies.


2021 ◽  
Author(s):  
Julian Liber ◽  
Gregory Bonito ◽  
Gian Maria Niccolò Benucci

Summary | CONSTAX - the CONSensus TAXonomy classifier - was developed for accurate and reproducible taxonomic annotation of fungal rDNA amplicons and is based upon a consensus approach of RDP, SINTAX and UTAX algorithms. CONSTAX2 can be used to classify prokaryotes and incorporates BLAST-based classifiers to reduce classification errors. Additionally, CONSTAX2 implements a conda-installable, command line tool with improved classification metrics, faster training, multithreading support, capacity to incorporate external taxonomic databases, new isolate matching and high-level taxonomy tools, replete with documentation and example tutorials. Availability and Implementation | CONSTAX2 is available at https://github.com/liberjul/CONSTAXv2, and is packaged for Linux and macOS from Bioconda. A tutorial and documentation are available at https://constax.readthedocs.io/en/latest/.
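The consensus idea described above can be illustrated with a minimal sketch. It is not the CONSTAX2 implementation: it assumes that each classifier returns one label and one confidence score per rank, discards labels below a hypothetical cutoff, and keeps a label only when a strict majority of classifiers agree on it. The function name, the cutoff value and the example labels are illustrative assumptions.

```python
from collections import Counter

# Ranks are processed from the most general to the most specific.
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def consensus_taxonomy(assignments, cutoff=0.8):
    """Combine per-rank labels from several classifiers by majority vote.

    assignments: list of dicts (one per classifier) mapping
                 rank -> (label, confidence).
    cutoff:      hypothetical minimum confidence for a label to count.
    Returns a dict rank -> label that stops at the first rank
    where no label is backed by a strict majority of classifiers.
    """
    consensus = {}
    for rank in RANKS:
        votes = Counter()
        for assignment in assignments:
            label, confidence = assignment.get(rank, (None, 0.0))
            if label is not None and confidence >= cutoff:
                votes[label] += 1
        if not votes:
            break  # nobody made a confident call at this rank
        label, count = votes.most_common(1)[0]
        if count * 2 <= len(assignments):
            break  # no strict majority, so stop descending
        consensus[rank] = label
    return consensus


# Hypothetical output of three classifiers for one amplicon.
rdp = {"kingdom": ("Fungi", 1.00), "phylum": ("Ascomycota", 0.95),
       "class": ("Sordariomycetes", 0.90)}
sintax = {"kingdom": ("Fungi", 1.00), "phylum": ("Ascomycota", 0.90),
          "class": ("Sordariomycetes", 0.60)}
utax = {"kingdom": ("Fungi", 0.99), "phylum": ("Ascomycota", 0.85),
        "class": ("Dothideomycetes", 0.85)}

result = consensus_taxonomy([rdp, sintax, utax])
print(result)
```

In this example the three classifiers agree down to phylum, but at class level only one confident vote supports each candidate, so the sketch stops at phylum rather than guessing.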


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Weibing Xun ◽  
Yunpeng Liu ◽  
Wei Li ◽  
Yi Ren ◽  
Wu Xiong ◽  
...  

Background | The relationship between biodiversity and soil microbiome stability remains poorly understood. Here, we investigated the impacts of bacterial phylogenetic diversity on the functional traits and the stability of the soil microbiome. Communities differing in phylogenetic diversity were generated by inoculating serially diluted soil suspensions into sterilized soil, and the stability of the microbiome was assessed by detecting community variations under various pH levels. The taxonomic features and potential functional traits were detected by DNA sequencing. Results | We found that bacterial communities with higher phylogenetic diversity tended to be more stable, implying that microbiomes with higher biodiversity are more resistant to perturbation. Functional gene co-occurrence network and machine learning classification analyses identified specialized metabolic functions, especially “nitrogen metabolism” and “phosphonate and phosphinate metabolism,” as keystone functions. Further taxonomic annotation found that keystone functions are carried out by specific bacterial taxa, including Nitrospira and Gemmatimonas, among others. Conclusions | This study provides new insights into our understanding of the relationships between soil microbiome biodiversity and ecosystem stability and highlights specialized metabolic functions embedded in keystone taxa that may be essential for soil microbiome stability.


2021 ◽  
Vol 6 (57) ◽  
pp. 2817
Author(s):  
Arianna Krinos ◽  
Sarah Hu ◽  
Natalie Cohen ◽  
Harriet Alexander


2020 ◽  
Author(s):  
M. Mirdita ◽  
M. Steinegger ◽  
F. Breitwieser ◽  
J. Söding ◽  
E. Levy Karin

Summary | MMseqs2 taxonomy is a new tool to assign taxonomic labels to metagenomic contigs. It extracts all possible protein fragments from each contig, quickly retains those that can contribute to taxonomic annotation, assigns them with robust labels and determines the contig’s taxonomic identity by weighted voting. Its fragment extraction step is suitable for the analysis of all domains of life. MMseqs2 taxonomy is 2-18x faster than state-of-the-art tools and also contains new modules for creating and manipulating taxonomic reference databases as well as reporting and visualizing taxonomic assignments. Availability | MMseqs2 taxonomy is part of the MMseqs2 free open-source software package available for Linux, macOS and Windows at https://mmseqs.com.
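The final voting step described above can be pictured with a small sketch. It is a simplified illustration, not the MMseqs2 implementation: it assumes that each retained fragment carries a root-to-leaf lineage and a non-negative weight, and it assigns the contig to the most specific taxon supported by more than a chosen fraction of the total weight. The function name, the `majority` parameter and the example weights are hypothetical.

```python
from collections import defaultdict


def assign_contig(fragment_votes, majority=0.5):
    """Pick the deepest taxon backed by more than `majority` of the weight.

    fragment_votes: list of (lineage, weight) pairs, where lineage is a
                    tuple of taxon names ordered from root to leaf.
    majority:       required weight fraction; restricted to [0.5, 1) so
                    that at most one taxon can qualify at each depth.
    Returns the winning lineage prefix as a tuple, or () if none qualifies.
    """
    if not 0.5 <= majority < 1.0:
        raise ValueError("majority must be in [0.5, 1)")
    total = sum(weight for _, weight in fragment_votes)
    if total <= 0:
        return ()

    # Each fragment supports its own label and every ancestor of it.
    support = defaultdict(float)
    for lineage, weight in fragment_votes:
        for depth in range(1, len(lineage) + 1):
            support[lineage[:depth]] += weight

    best = ()
    for prefix, weight in support.items():
        if weight / total > majority and len(prefix) > len(best):
            best = prefix
    return best


# Hypothetical fragment labels and weights for one contig.
votes = [
    (("Eukaryota", "Fungi", "Ascomycota"), 3.0),
    (("Eukaryota", "Fungi", "Basidiomycota"), 2.0),
    (("Bacteria", "Proteobacteria"), 1.0),
]
contig_label = assign_contig(votes)
print(contig_label)
```

Here the fragments disagree at phylum level, so the contig is labelled only as far as Fungi, the deepest taxon that still holds more than half of the total weight.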


2020 ◽  
Author(s):  
Alicia Clum ◽  
Marcel Huntemann ◽  
Brian Bushnell ◽  
Brian Foster ◽  
Bryce Foster ◽  
...  

ABSTRACT | The DOE JGI Metagenome Workflow performs metagenome data processing, including assembly, structural, functional, and taxonomic annotation, and binning of metagenomic datasets that are subsequently included in the Integrated Microbial Genomes and Microbiomes (IMG/M) comparative analysis system (I. Chen, K. Chu, K. Palaniappan, M. Pillay, A. Ratner, J. Huang, M. Huntemann, N. Varghese, J. White, R. Seshadri, et al., Nucleic Acids Research, 2019) and provided for download via the Joint Genome Institute (JGI) Data Portal (https://genome.jgi.doe.gov/portal/). This workflow scales to run on thousands of metagenome samples per year, which vary in microbial community complexity and sequencing depth. Here we describe the different tools, databases, and parameters used at different steps of the workflow, to help with interpretation of metagenome data available in IMG and to enable researchers to apply this workflow to their own data. We use 20 publicly available sediment metagenomes to illustrate the computing requirements for the different steps and highlight the typical results of data processing. The workflow modules for read filtering and metagenome assembly are available as a Workflow Description Language (WDL) file (https://code.jgi.doe.gov/BFoster/jgi_meta_wdl.git). The workflow modules for annotation and binning are provided as a service to the user community at https://img.jgi.doe.gov/submit and require filling out the project and associated metadata descriptions in the Genomes OnLine Database (GOLD) (S. Mukherjee, D. Stamatis, J. Bertsch, G. Ovchinnikova, H. Katta, A. Mojica, I. Chen, N. Kyrpides, and T. Reddy, Nucleic Acids Research, 2018). IMPORTANCE | The DOE JGI Metagenome Workflow is designed for processing metagenomic datasets starting from Illumina fastq files. It performs data pre-processing, error correction, assembly, structural and functional annotation, and binning. The results of processing are provided in several standard formats, such as fasta and gff, and can be used for subsequent integration into the Integrated Microbial Genome (IMG) system, where they can be compared to a comprehensive set of publicly available metagenomes. As of 7/30/2020, 7,155 JGI metagenomes have been processed by the JGI Metagenome Workflow.


2020 ◽  
Vol 7 (Supplement_1) ◽  
pp. S626-S627
Author(s):  
Rohita Sinha ◽  
Steve Kleiboeker ◽  
Michelle Altrich ◽  
Ellis Bixler

Background | Cell-free DNA (cfDNA) has emerged as an important clinical specimen to probe for pathogenic microbes, especially in organ transplant patients, where the same data can be used to predict allograft rejection. Recent reports have described viral, bacterial or the complete microbial diversity in plasma following cfDNA sequencing. The prevalence of certain viral families (Anelloviridae) is associated with immunosuppressant dosage and the risk of antibody-mediated rejection. While informative, cfDNA reads are inherently short (~160bp or 2x75bp) and dominated by host DNA (~97-99%), which makes their taxonomic annotation challenging and lowers specificity. Here we present a computational protocol that minimizes these challenges by merging the concept of “Reference-assisted Assembly” with K-mer profiles of NGS data, for highly sensitive and specific microbial detection. Methods | We developed a pipeline in which non-host NGS data (reads not mapped to the human genome) undergo a reference-assisted assembly operation and then taxonomic annotation using KrakenUniq (a K-mer based classifier). We trained KrakenUniq on a curated in-house database of ~12,000 viral genomes. We used three different K-mer values (16, 21, 31) to train KrakenUniq, and final predictions are made by applying a majority-wins rule. Currently the default KrakenUniq database is used for bacterial and fungal metagenome analysis. We tested our method on 30 simulated and 124 clinical samples obtained from a biorepository. Results | Our protocol currently screens for a targeted list of pathogens (15 viral species, 16 bacterial and 10 fungal genera). On a simulated set of viral sample mixes, our protocol had 100% accuracy. For the 124 clinical samples, predictions were evaluated for specificity and sensitivity using qPCR assays for the following viral species: EBV, BKV, JCV, HSV1/2, HHV7, and CMV. In total, 33/38 computational predictions (87%) were confirmed by qPCR. The prediction sensitivity ranged from 6 - 106 copies/mL. Conclusion | Our efforts to perform “Reference-assisted assembly” followed by K-mer based taxonomic annotation of cfDNA data led to the development of a novel and accurate pathogen detection protocol. Disclosures | Rohita Sinha, PhD, Viracor-Eurofins (Employee); Steve Kleiboeker, DVM, PhD, Viracor-Eurofins (Employee); Michelle Altrich, PhD, Viracor-Eurofins (Employee); Ellis Bixler, MS, Viracor-Eurofins (Employee)
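The majority-wins rule over the three K-mer settings in the abstract above can be sketched as follows. This is an illustration only: the abstract does not say how ties or missing calls are handled, so the sketch assumes one call per K-mer value (or None when a run makes no call) and reports a taxon only when more than half of the runs agree. The function name and the example calls are hypothetical. The last lines recompute the reported qPCR confirmation rate from the counts given in the abstract.

```python
from collections import Counter


def majority_call(calls_by_k):
    """Return the taxon called by more than half of the runs, else None.

    calls_by_k: dict mapping a K-mer size to the taxon called by the
                classifier trained with that K, or None for no call.
    """
    calls = [call for call in calls_by_k.values() if call is not None]
    if not calls:
        return None
    taxon, count = Counter(calls).most_common(1)[0]
    # Runs without a call still count in the denominator.
    return taxon if count * 2 > len(calls_by_k) else None


# Hypothetical calls for two samples at K = 16, 21 and 31.
agreeing = {16: "CMV", 21: "CMV", 31: "EBV"}
split = {16: "CMV", 21: "EBV", 31: None}

print(majority_call(agreeing))  # two of three runs agree
print(majority_call(split))     # no taxon reaches a majority

# Confirmation rate from the counts reported in the abstract.
confirmed, predicted = 33, 38
confirmation_rate = confirmed / predicted
print(f"{confirmation_rate:.0%}")
```

Counting runs without a call in the denominator is a conservative choice: a single confident run cannot produce a detection on its own.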

