Functional Characterization Reveals Novel Putative Coding Sequences in Prevotella ruminicola Genome Extracted from Rumen Metagenomic Studies

2015 ◽  
Vol 25 (4) ◽  
pp. 292-299 ◽  
Author(s):  
Neelam M. Nathani ◽  
Ramesh K. Kothari ◽  
Amrutlal K. Patel ◽  
Chaitanya G. Joshi

<b><i>Aim:</i></b> To reassemble the <i>Prevotella ruminicola</i> genome from rumen metagenomic data of cattle and buffalo and compare it with the published reference genome. <b><i>Method:</i></b> Rumen microbial communities from Mehsani buffaloes (n = 8) and Kankrej cattle (n = 8), each adapted to different proportions of a dry or green roughage diet, were subjected to metagenomic sequencing by Ion Torrent PGM, and the resulting reads were analyzed by MG-RAST. Using reference-guided assembly of the sequences against the published <i>P. ruminicola</i> strain 23, draft genomes of 2.56 and 2.46 Mb were reconstructed from Mehsani buffalo and Kankrej cows, respectively. The genomes were annotated using the RAST Server and carbohydrate active enzyme (CAZyme) analysis. <b><i>Results:</i></b> Taxonomic analysis by MG-RAST revealed <i>P. ruminicola</i> to be the most abundant species present among the rumen microflora. Functional annotation of the reconstructed genomes using the RAST Server showed that the largest share of coding sequences was assigned to the subsystems amino acids and derivatives and carbohydrate metabolism. CAZyme profiling revealed the glycoside hydrolase (GH) family to be the most abundant. GH family subclassification revealed that the extracted genomes had more sequence hits for GH2, GH3, GH92 and GH97 than the reference. <b><i>Conclusion:</i></b> The results reflect the metabolic significance of rumen-adapted <i>P. ruminicola</i> in utilizing a coarse diet for animals based on acquisition of novel genetic elements.

Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Laura-Jayne Gardiner ◽  
Niina Haiminen ◽  
Filippo Utro ◽  
Laxmi Parida ◽  
Ed Seabolt ◽  
...  

Abstract Background: Widespread bioinformatic resource development generates a constantly evolving and abundant landscape of workflows and software. For analysis of the microbiome, workflows typically begin with taxonomic classification of the microorganisms that are present in a given environment. Additional investigation is then required to uncover the functionality of the microbial community, in order to characterize its currently or potentially active biological processes. Such functional analysis of metagenomic data can be computationally demanding for high-throughput sequencing experiments. Instead, we can directly compare sequencing reads to a functionally annotated database. However, since reads frequently match multiple sequences equally well, analyses benefit from a hierarchical annotation tree, e.g. for taxonomic classification where reads are assigned to the lowest taxonomic unit. Results: To facilitate functional microbiome analysis, we re-purpose well-known taxonomic classification tools to allow us to perform direct functional sequencing read classification with the added benefit of a functional hierarchy. To enable this, we develop and present a tree-shaped functional hierarchy representing the molecular function subset of the Gene Ontology annotation structure. We use this functional hierarchy to replace the standard phylogenetic taxonomy used by the classification tools and assign query sequences accurately to the lowest possible molecular function in the tree. We demonstrate this with simulated and experimental datasets, where we reveal new biological insights. Conclusions: We demonstrate that improved functional classification of metagenomic sequencing reads is possible by re-purposing a range of taxonomic classification tools that are already well-established, in conjunction with either protein or nucleotide reference databases. We leverage the advances in speed, accuracy and efficiency that have been made for taxonomic classification and translate these benefits for the rapid functional classification of microbiomes. While we focus on a specific set of commonly used methods, the functional annotation approach has broad applicability across other sequence classification tools. We hope that re-purposing becomes a routine consideration during bioinformatic resource development.
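The idea of assigning a read that matches several annotated sequences equally well to a node in a hierarchy can be illustrated with a lowest-common-ancestor lookup. The sketch below is only an illustration under stated assumptions: the abstract does not say which resolution rule the re-purposed tools apply, the small tree and its term names are invented for the example, and `PARENT`, `path_to_root` and `lowest_common_ancestor` are hypothetical names, not part of any tool mentioned above.

```python
# Toy tree-shaped functional hierarchy, stored as child -> parent.
# Term names are illustrative only and are not taken from the paper.
PARENT = {
    "molecular_function": None,
    "catalytic activity": "molecular_function",
    "binding": "molecular_function",
    "hydrolase activity": "catalytic activity",
    "transferase activity": "catalytic activity",
    "glycosidase activity": "hydrolase activity",
    "peptidase activity": "hydrolase activity",
}


def path_to_root(term):
    """Return the list of nodes from `term` up to the root, inclusive."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def lowest_common_ancestor(terms):
    """Return the deepest node shared by the root paths of all `terms`."""
    terms = list(terms)
    if not terms:
        raise ValueError("at least one term is required")
    shared = set(path_to_root(terms[0]))
    for term in terms[1:]:
        shared &= set(path_to_root(term))
    # Walking up from the first term, the first shared node is the deepest one.
    for node in path_to_root(terms[0]):
        if node in shared:
            return node


# A read hitting two sibling functions equally well is reported at their parent.
print(lowest_common_ancestor(["glycosidase activity", "peptidase activity"]))
```

A read with a single unambiguous hit stays at that leaf, while a read whose hits span unrelated branches falls back to the root, which mirrors how ambiguous reads are pushed up a taxonomy.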


Animals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 1247
Author(s):  
Xin Wu ◽  
Shuai Huang ◽  
Jinfeng Huang ◽  
Peng Peng ◽  
Yanan Liu ◽  
...  

The rumen contains abundant microorganisms that aid in the digestion of lignocellulosic feed and are associated with host phenotype traits. Cows with extremely high milk protein and fat percentages (HPF; n = 3) and low milk protein and fat percentages (LPF; n = 3) were selected from 4000 lactating Holstein cows under the same nutritional and management conditions. We found that the total concentration of volatile fatty acids, acetate, butyrate, and propionate in the rumen fluid was significantly higher in the HPF group than in the LPF group. Moreover, we identified 38 most abundant species displaying differential richness between the two groups, in which Prevotella accounted for 68.8% of the species, with the highest abundance in the HPF group. Functional annotation based on the Kyoto Encyclopedia of Genes and Genomes (KEGG), evolutionary genealogy of genes: Non-supervised Orthologous Groups (eggNOG), and Carbohydrate-Active enzymes (CAZy) databases showed that the significantly more abundant species in the HPF group are enriched in carbohydrate, amino acid, pyruvate, insulin, and lipid metabolism and transportation. Furthermore, Spearman’s rank correlation analysis revealed that specific microbial taxa (mainly the Prevotella species and Neocallimastix californiae) are positively correlated with total volatile fatty acids (VFA). Collectively, we found that the HPF group was enriched with several Prevotella species related to the total VFA, acetate, and amino acid synthesis. These species thereby met the host’s needs for energy, fat, and rumen microbial protein, which can be used for increased biosynthesis of milk fat and milk protein. Our findings provide novel information for elucidation of the regulatory mechanism of the rumen in the formation of milk composition.


2018 ◽  
Vol 57 (2) ◽  
Author(s):  
Qun Yan ◽  
Yu Mi Wi ◽  
Matthew J. Thoendel ◽  
Yash S. Raval ◽  
Kerryl E. Greenwood-Quaintance ◽  
...  

ABSTRACT We previously demonstrated that shotgun metagenomic sequencing can detect bacteria in sonicate fluid, providing a diagnosis of prosthetic joint infection (PJI). A limitation of the approach that we used is that data analysis was time-consuming and specialized bioinformatics expertise was required, both of which are barriers to routine clinical use. Fortunately, automated commercial analytic platforms that can interpret shotgun metagenomic data are emerging. In this study, we evaluated the CosmosID bioinformatics platform using shotgun metagenomic sequencing data derived from 408 sonicate fluid samples from our prior study with the goal of evaluating the platform vis-à-vis bacterial detection and antibiotic resistance gene detection for predicting staphylococcal antibacterial susceptibility. Samples were divided into a derivation set and a validation set, each consisting of 204 samples; results from the derivation set were used to establish cutoffs, which were then tested in the validation set for identifying pathogens and predicting staphylococcal antibacterial resistance. Metagenomic analysis detected bacteria in 94.8% (109/115) of sonicate fluid culture-positive PJIs and 37.8% (37/98) of sonicate fluid culture-negative PJIs. Metagenomic analysis showed sensitivities ranging from 65.7 to 85.0% for predicting staphylococcal antibacterial resistance. In conclusion, the CosmosID platform has the potential to provide fast, reliable bacterial detection and identification from metagenomic shotgun sequencing data derived from sonicate fluid for the diagnosis of PJI. Strategies for metagenomic detection of antibiotic resistance genes for predicting staphylococcal antibacterial resistance need further development.
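The detection rates quoted above follow directly from the reported counts, and the arithmetic can be checked in a few lines. This is only a check of the figures given in the abstract; the helper name `percent` is introduced here for illustration.

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Counts as reported in the abstract.
culture_positive_rate = percent(109, 115)  # culture-positive PJIs with bacteria detected
culture_negative_rate = percent(37, 98)    # culture-negative PJIs with bacteria detected

# The 408 samples were split evenly into derivation and validation sets.
samples_per_set = 408 // 2

print(culture_positive_rate, culture_negative_rate, samples_per_set)
```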


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 726
Author(s):  
Mike W.C. Thang ◽  
Xin-Yi Chua ◽  
Gareth Price ◽  
Dominique Gorse ◽  
Matt A. Field

Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences. While software for detailing the composition of microbial communities using 16S rRNA marker genes is relatively mature, increasingly researchers are interested in identifying changes exhibited within microbial communities under differing environmental conditions. In order to gain maximum value from metagenomic sequence data we must improve the existing analysis environment by providing accessible and scalable computational workflows able to generate reproducible results. Here we describe a complete end-to-end open-source metagenomics workflow running within Galaxy for 16S differential abundance analysis. The workflow accepts 454 or Illumina sequence data (either overlapping or non-overlapping paired end reads) and outputs lists of the operational taxonomic units (OTUs) exhibiting the greatest change under differing conditions. A range of analysis steps and graphing options are available, giving users a high level of control over their data and analyses. Additionally, users are able to input complex sample-specific metadata information which can be incorporated into differential analysis and used for grouping / colouring within graphs. Detailed tutorials containing sample data and existing workflows are available for three different input types: overlapping and non-overlapping read pairs as well as pre-generated Biological Observation Matrix (BIOM) files. Using the Galaxy platform we developed MetaDEGalaxy, a complete metagenomics differential abundance analysis workflow. MetaDEGalaxy is designed for bench scientists working with 16S data who are interested in comparative metagenomics. MetaDEGalaxy builds on momentum within the wider Galaxy metagenomics community with the hope that more tools will be added as existing methods mature.


2020 ◽  
Author(s):  
Maxence Queyrel ◽  
Edi Prifti ◽  
Jean-Daniel Zucker

Abstract Analysis of the human microbiome using metagenomic sequencing data has demonstrated a strong ability to discriminate various human diseases. Raw metagenomic sequencing data require multiple complex and computationally heavy bioinformatics steps prior to data analysis. Such data contain millions of short sequences read from the fragmented DNA sequences and are stored as fastq files. Conventional processing pipelines consist of multiple steps including quality control, filtering, and alignment of sequences against genomic catalogs (genes, species, taxonomic levels, functional pathways, etc.). These pipelines are complex to use, time consuming and rely on a large number of parameters that often introduce variability and impact the estimation of the microbiome elements. Recent studies have demonstrated that training Deep Neural Networks directly from raw sequencing data is a promising approach to bypass some of the challenges associated with mainstream bioinformatics pipelines. Most of these methods use the concept of word and sentence embeddings that create a meaningful and numerical representation of DNA sequences, while extracting features and reducing the dimensionality of the data. In this paper we present an end-to-end approach that classifies patients into disease groups directly from raw metagenomic reads: metagenome2vec. This approach is composed of four steps: (i) generating a vocabulary of k-mers and learning their numerical embeddings; (ii) learning DNA sequence (read) embeddings; (iii) identifying the genome from which the sequence is most likely to come; and (iv) training a multiple instance learning classifier which predicts the phenotype based on the vector representation of the raw data. An attention mechanism is applied in the network so that the model can be interpreted, assigning to each genome a weight that reflects its influence on the prediction. Using two public real-life datasets as well as a simulated one, we demonstrated that this original approach reached very high performance, comparable with state-of-the-art methods applied to data processed through mainstream bioinformatics workflows. These results are encouraging for this proof of concept work. We believe that with further dedication, the DNN models have the potential to surpass mainstream bioinformatics workflows in disease classification tasks.
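The first part of step (i), generating a vocabulary of k-mers, can be sketched with a sliding window over each read. This is a minimal illustration and not the metagenome2vec implementation: the function names `kmers` and `build_vocabulary`, the choice of k = 3 and the two toy reads are all assumptions made for the example, and learning the embeddings themselves is not shown.

```python
from collections import Counter


def kmers(read, k):
    """Return all overlapping substrings of length k in `read`, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def build_vocabulary(reads, k):
    """Map each k-mer seen in `reads` to an integer index.

    Indices are assigned by decreasing count, with ties broken alphabetically,
    so the mapping is deterministic for a given input.
    """
    counts = Counter(kmer for read in reads for kmer in kmers(read, k))
    ordered = sorted(counts, key=lambda kmer: (-counts[kmer], kmer))
    return {kmer: index for index, kmer in enumerate(ordered)}


# Toy reads, invented for the example.
reads = ["ACGT", "CGTA"]
vocabulary = build_vocabulary(reads, k=3)

# Each read becomes a sequence of vocabulary indices, the usual input
# format for an embedding model.
encoded = [[vocabulary[kmer] for kmer in kmers(read, 3)] for read in reads]
print(vocabulary, encoded)
```

Reads shorter than k produce no k-mers and so contribute nothing to the vocabulary, which a real pipeline would need to handle explicitly.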


2016 ◽  
Vol 82 (9) ◽  
pp. 2854-2861 ◽  
Author(s):  
Omri M. Finkel ◽  
Tom O. Delmont ◽  
Anton F. Post ◽  
Shimshon Belkin

ABSTRACT The leaves of Tamarix aphylla, a globally distributed, salt-secreting desert tree, are dotted with alkaline droplets of high salinity. To successfully inhabit these organic carbon-rich droplets, bacteria need to be adapted to multiple stress factors, including high salinity, high alkalinity, high UV radiation, and periodic desiccation. To identify genes that are important for survival in this harsh habitat, microbial community DNA was extracted from the leaf surfaces of 10 Tamarix aphylla trees along a 350-km longitudinal gradient. Shotgun metagenomic sequencing, contig assembly, and binning yielded 17 genome bins, six of which were >80% complete. These genomic bins, representing three phyla (Proteobacteria, Bacteroidetes, and Firmicutes), were closely related to halophilic and alkaliphilic taxa isolated from aquatic and soil environments. Comparison of these genomic bins to the genomes of their closest relatives revealed functional traits characteristic of bacterial populations inhabiting the Tamarix phyllosphere, independent of their taxonomic affiliation. These functions, most notably light-sensing genes, are postulated to represent important adaptations toward colonization of this habitat.
IMPORTANCE Plant leaves are an extensive and diverse microbial habitat, forming the main interface between solar energy and the terrestrial biosphere. There are hundreds of thousands of plant species in the world, exhibiting a wide range of morphologies, leaf surface chemistries, and ecological ranges. In order to understand the core adaptations of microorganisms to this habitat, it is important to diversify the type of leaves that are studied. This study provides an analysis of the genomic content of the most abundant bacterial inhabitants of the globally distributed, salt-secreting desert tree Tamarix aphylla. Draft genomes of these bacteria were assembled, using the culture-independent technique of assembly and binning of metagenomic data. Analysis of the genomes reveals traits that are important for survival in this habitat, most notably, light-sensing and light utilization genes.


2019 ◽  
Vol 20 (19) ◽  
pp. 4902 ◽  
Author(s):  
Christian Roth ◽  
Olga V. Moroz ◽  
Johan P. Turkenburg ◽  
Elena Blagova ◽  
Jitka Waterman ◽  
...  

Amylases are probably the best studied glycoside hydrolases and have a huge biotechnological value for industrial processes on starch. Multiple amylases from fungi and microbes are currently in use. Whereas bacterial amylases are well suited for many industrial processes due to their high stability, fungal amylases are recognized as safe and are preferred in the food industry, although they lack the pH tolerance and stability of their bacterial counterparts. Here, we describe three amylases, two of which have a broad pH spectrum extending to pH 8 and higher stability well suited for a broad set of industrial applications. These enzymes have the characteristic GH13 α-amylase fold with a central (β/α)8-domain, an insertion domain with the canonical calcium binding site and a C-terminal β-sandwich domain. The active site was identified based on the binding of the inhibitor acarbose in the form of a transglycosylation product, in the amylases from Thamnidium elegans and Cordyceps farinosa. The three amylases have shortened loops flanking the nonreducing end of the substrate binding cleft, creating a more open crevice. Moreover, a potential novel binding site in the C-terminal domain of the Cordyceps enzyme was identified, which might be part of a starch interaction site. In addition, Cordyceps farinosa amylase provided a successful example of using the microseed matrix screening technique to significantly speed up crystallization.


2012 ◽  
Vol 12 (1) ◽  
pp. 38 ◽  
Author(s):  
Michael J Dougherty ◽  
Patrik D’haeseleer ◽  
Terry C Hazen ◽  
Blake A Simmons ◽  
Paul D Adams ◽  
...  

mSystems ◽  
2019 ◽  
Vol 4 (1) ◽  
Author(s):  
Robert H. Mills ◽  
Yoshiki Vázquez-Baeza ◽  
Qiyun Zhu ◽  
Lingjing Jiang ◽  
James Gaffney ◽  
...  

ABSTRACT Although genetic approaches are the standard in microbiome analysis, proteome-level information is largely absent. This discrepancy warrants a better understanding of the relationship between gene copy number and protein abundance, as this is crucial information for inferring protein-level changes from metagenomic data. As it remains unknown how metaproteomic systems evolve during dynamic disease states, we leveraged a 4.5-year fecal time series using samples from a single patient with colonic Crohn’s disease. Utilizing multiplexed quantitative proteomics and shotgun metagenomic sequencing of eight time points in technical triplicate, we quantified over 29,000 protein groups and 110,000 genes and compared them to five protein biomarkers of disease activity. Broad-scale observations were consistent between data types, including overall clustering by principal-coordinate analysis and fluctuations in Gene Ontology terms related to Crohn’s disease. Through linear regression, we determined genes and proteins fluctuating in conjunction with inflammatory metrics. We discovered conserved taxonomic differences relevant to Crohn’s disease, including a negative association of Faecalibacterium and a positive association of Escherichia with calprotectin. Despite concordant associations of genera, the specific genes correlated with these metrics were drastically different between metagenomic and metaproteomic data sets. This resulted in the generation of unique functional interpretations dependent on the data type, with metaproteome evidence for previously investigated mechanisms of dysbiosis. An example of one such mechanism was a connection between urease enzymes, amino acid metabolism, and the local inflammation state within the patient. This proof-of-concept approach prompts further investigation of the metaproteome and its relationship with the metagenome in biologically complex systems such as the microbiome. 
IMPORTANCE A majority of current microbiome research relies heavily on DNA analysis. However, as the field moves toward understanding the microbial functions related to healthy and disease states, it is critical to evaluate how changes in DNA relate to changes in proteins, which are functional units of the genome. This study tracked the abundance of genes and proteins as they fluctuated during various inflammatory states in a 4.5-year study of a patient with colonic Crohn’s disease. Our results indicate that despite a low level of correlation, taxonomic associations were consistent in the two data types. While there was overlap of the data types, several associations were uniquely discovered by analyzing the metaproteome component. This case study provides unique and important insights into the fundamental relationship between the genes and proteins of a single individual’s fecal microbiome associated with clinical consequences.


2017 ◽  
Vol 5 (48) ◽  
Author(s):  
Cynthia Maria Chibani ◽  
Anja Poehlein ◽  
Olivia Roth ◽  
Heiko Liesegang ◽  
Carolin Charlotte Wendling

ABSTRACT Here, we present the draft genome sequence of Vibrio splendidus type strain DSM 19640. V. splendidus is an abundant species among coastal vibrioplankton. The assembly resulted in a 5,729,362-bp draft genome with 5,032 protein-coding sequences, 6 rRNAs, and 117 tRNAs.

