animalcules: Interactive Microbiome Analytics and Visualization in R

2020 ◽  
Author(s):  
Yue Zhao ◽  
Anthony Federico ◽  
Solaiappan Manimaran ◽  
Daniel Segrè ◽  
Stefano Monti ◽  
...  

Abstract Background: Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data. Results: To address some of these challenges, we have developed animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, the toolkit features traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis, combined with new methods for biomarker identification. In addition, animalcules provides interactive and dynamic figures that enable users to understand their data and discover new insights. animalcules can be used as a standalone command-line R package, or users can explore their data with the accompanying interactive R Shiny interface. Conclusions: We present animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/animalcules.html.

2020 ◽  
Author(s):  
Yue Zhao ◽  
Anthony Federico ◽  
Tyler Faits ◽  
Solaiappan Manimaran ◽  
Stefano Monti ◽  
...  

Abstract Background: Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data. Results: To address some of these challenges, we have developed Animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis are enhanced in the toolkit, while new methods such as biomarker identification are introduced. Powerful interactive and dynamic figures generated by Animalcules enable users to understand their data and discover new insights. Animalcules can be used as a standalone command-line R package, or users can explore their data with the accompanying interactive R Shiny interface. Conclusions: We present Animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. Animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/animalcules.html).


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yue Zhao ◽  
Anthony Federico ◽  
Tyler Faits ◽  
Solaiappan Manimaran ◽  
Daniel Segrè ◽  
...  

Abstract Background Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data. Results To address some of these challenges, we have developed animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, the toolkit features traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis, combined with new methods for biomarker identification. In addition, animalcules provides interactive and dynamic figures that enable users to understand their data and discover new insights. animalcules can be used as a standalone command-line R package, or users can explore their data with the accompanying interactive R Shiny interface. Conclusions We present animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/animalcules.html.
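The alpha and beta diversity analyses named in this abstract can be illustrated with two standard measures. The sketch below is not animalcules code and does not use its API: it is a minimal Python illustration that assumes the Shannon index for alpha diversity and Bray-Curtis dissimilarity for beta diversity, with hypothetical function names and made-up counts.

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity: H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(counts_a, counts_b):
    """Bray-Curtis dissimilarity between two samples listed over the same ordered taxa."""
    if len(counts_a) != len(counts_b):
        raise ValueError("samples must cover the same taxa")
    total = sum(counts_a) + sum(counts_b)
    if total == 0:
        return 0.0
    shared = sum(min(a, b) for a, b in zip(counts_a, counts_b))
    return 1.0 - 2.0 * shared / total


# Hypothetical taxon counts for two samples (illustrative data only).
sample_a = [10, 10, 10, 10]  # four taxa, perfectly even
sample_b = [40, 0, 0, 0]     # one taxon dominates

h_a = shannon_index(sample_a)            # within-sample (alpha) diversity
h_b = shannon_index(sample_b)
bc_ab = bray_curtis(sample_a, sample_b)  # between-sample (beta) dissimilarity
```

An even community maximizes the Shannon index for a given number of taxa, while a sample dominated by a single taxon scores zero. Bray-Curtis ranges from 0 (identical composition) to 1 (no shared taxa).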


2020 ◽  
Author(s):  
Yue Zhao ◽  
Anthony Federico ◽  
Tyler Faits ◽  
Solaiappan Manimaran ◽  
Stefano Monti ◽  
...  

Abstract Background Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data. Results To address some of these challenges, we have developed animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis are enhanced in the toolkit, while new methods such as biomarker identification are introduced. Powerful interactive and dynamic figures generated by animalcules enable users to understand their data and discover new insights. animalcules can be used as a standalone command-line R package, or users can explore their data with the accompanying interactive R Shiny interface. Conclusions We present animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/animalcules.html.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Charlie M. Carpenter ◽  
Daniel N. Frank ◽  
Kayla Williamson ◽  
Jaron Arbet ◽  
Brandie D. Wagner ◽  
...  

Abstract Background The drive to understand how microbial communities interact with their environments has inspired innovations across many fields. The data generated from sequence-based analyses of microbial communities are typically of high dimensionality and can involve multiple data tables consisting of taxonomic or functional gene/pathway counts. Merging multiple high-dimensional tables with study-related metadata can be challenging. Existing microbiome pipelines available in R have created their own data structures to manage this problem. However, these data structures may be unfamiliar to analysts new to microbiome data or R and do not allow for deviations from internal workflows. Existing analysis tools also focus primarily on community-level analyses and exploratory visualizations, as opposed to analyses of individual taxa. Results We developed the R package “tidyMicro” to serve as a more complete microbiome analysis pipeline. This open source software provides all of the essential tools available in other popular packages (e.g., management of sequence count tables, standard exploratory visualizations, and diversity inference tools) supplemented with multiple options for regression modelling (e.g., negative binomial, beta binomial, and/or rank-based testing) and novel visualizations to improve interpretability (e.g., Rocky Mountain plots, longitudinal ordination plots). This comprehensive pipeline for microbiome analysis also maintains data structures familiar to R users to improve analysts’ control over workflow. A complete vignette is provided to aid new users in analysis workflow. Conclusions tidyMicro provides a reliable alternative to popular microbiome analysis packages in R. We provide standard tools as well as novel extensions of standard analyses to improve the interpretability of results while maintaining object malleability to encourage open source collaboration.
The simple examples and full workflow from the package are reproducible and applicable to external data sets.
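The merging problem described in this abstract, joining taxa count tables with study metadata into a structure analysts already know, can be sketched as follows. This is not tidyMicro's API or its data structure: it is a minimal Python illustration in which the function name `to_long_format`, the sample IDs, the taxa, and the `group` column are all hypothetical.

```python
def to_long_format(count_table, metadata):
    """Join per-sample taxon counts with per-sample metadata.

    Returns one row (a plain dict) per (sample, taxon) pair, carrying the raw
    count, the within-sample relative abundance, and the sample's metadata.
    """
    rows = []
    for sample_id, taxa_counts in count_table.items():
        total = sum(taxa_counts.values())
        for taxon, count in taxa_counts.items():
            row = {
                "sample": sample_id,
                "taxon": taxon,
                "count": count,
                "rel_abundance": count / total if total else 0.0,
            }
            # Attach metadata columns; samples without metadata get none.
            row.update(metadata.get(sample_id, {}))
            rows.append(row)
    return rows


# Hypothetical inputs (illustrative data only).
counts = {
    "S1": {"Bacteroides": 60, "Prevotella": 40},
    "S2": {"Bacteroides": 10, "Prevotella": 90},
}
metadata = {"S1": {"group": "control"}, "S2": {"group": "treated"}}

long_rows = to_long_format(counts, metadata)
```

A long table of this kind keeps counts and covariates side by side, so per-taxon models and plots can be built with ordinary filtering and grouping.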


2021 ◽  
Vol 12 ◽  
Author(s):  
Marc Crampon ◽  
Coralie Soulier ◽  
Pauline Sidoli ◽  
Jennifer Hellal ◽  
Catherine Joulian ◽  
...  

The demand for energy and chemicals is constantly growing, leading to an increase in the amounts of contaminants discharged to the environment. Among these, pharmaceutical molecules are frequently found in treated wastewater that is discharged into surface waters. Indeed, wastewater treatment plants (WWTPs) are designed to remove organic pollution from urban effluents but are not specific, especially toward contaminants of emerging concern (CECs), which ultimately reach the natural environment. In this context, it is important to study the fate of micropollutants, especially in a soil aquifer treatment (SAT) context for water from WWTPs, and for the most persistent molecules such as benzodiazepines. In the present study, soils sampled in a reed bed frequently flooded by water from a WWTP were spiked with diazepam and oxazepam in microcosms, and their concentrations were monitored for 97 days. The two molecules were completely degraded after 15 days of incubation. Samples were collected during the experiment in order to follow the dynamics of the microbial communities, based on 16S rRNA gene sequencing for Archaea and Bacteria, and ITS2 gene for Fungi. The evolution of diversity and of specific operational taxonomic units (OTUs) highlighted an impact of the addition of benzodiazepines, a rapid resilience of the fungal community, and an evolution of the bacterial community. OTUs from the Brevibacillus genus were more abundant at the beginning of the biodegradation process, for both the diazepam and oxazepam conditions. Additionally, the Tax4Fun tool was applied to 16S rRNA gene sequencing data to infer the evolution of specific metabolic functions during biodegradation. Overall, the microbial community in soils frequently exposed to water from WWTPs, potentially containing CECs such as diazepam and oxazepam, may be adapted to the degradation of persistent contaminants.


2018 ◽  
Author(s):  
Arghavan Bahadorinejad ◽  
Ivan Ivanov ◽  
Johanna W Lampe ◽  
Meredith AJ Hullar ◽  
Robert S Chapkin ◽  
...  

Abstract We propose a Bayesian method for the classification of 16S rRNA metagenomic profiles of bacterial abundance, by introducing a Poisson-Dirichlet-Multinomial hierarchical model for the sequencing data, constructing a prior distribution from sample data, calculating the posterior distribution in closed form, and deriving an Optimal Bayesian Classifier (OBC). The proposed algorithm is compared to state-of-the-art classification methods for 16S rRNA metagenomic data, including Random Forests and the phylogeny-based Metaphyl algorithm, for varying sample size, classification difficulty, and dimensionality (number of OTUs), using both synthetic and real metagenomic data sets. The results demonstrate that the proposed OBC method, with either noninformative or constructed priors, is competitive with or superior to the other methods. In particular, in the case where the ratio of sample size to dimensionality is small, it was observed that the proposed method can vastly outperform the others. Author summary Recent studies have highlighted the interplay between host genetics, gut microbes, and colorectal tumor initiation/progression. The characterization of microbial communities using metagenomic profiling has therefore received renewed interest. In this paper, we propose a method for classification, i.e., prediction of different outcomes, based on 16S rRNA metagenomic data. The proposed method employs a Bayesian approach, which is suitable for data sets with a small ratio of the number of available instances to the dimensionality. Results using both synthetic and real metagenomic data show that the proposed method can outperform other state-of-the-art metagenomic classification algorithms.
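The idea of a closed-form posterior followed by a Bayesian classification rule can be illustrated with a simpler model than the one in this abstract. The sketch below is not the authors' Poisson-Dirichlet-Multinomial OBC: it assumes a plain Dirichlet-multinomial model per class, a symmetric prior, and equal class priors, and all function names and counts are hypothetical.

```python
import math


def log_dm_likelihood(counts, alpha):
    """Log Dirichlet-multinomial marginal likelihood of `counts` under Dirichlet(alpha).

    The multinomial coefficient is omitted because it is identical for every
    class and therefore does not affect the classification.
    """
    alpha_sum = sum(alpha)
    n = sum(counts)
    value = math.lgamma(alpha_sum) - math.lgamma(alpha_sum + n)
    for x, a in zip(counts, alpha):
        value += math.lgamma(a + x) - math.lgamma(a)
    return value


def fit_class_posteriors(samples_by_class, prior_strength=1.0):
    """Closed-form conjugate update: posterior alpha = prior alpha + summed training counts."""
    posteriors = {}
    for label, samples in samples_by_class.items():
        alpha = [prior_strength] * len(samples[0])  # symmetric prior (an assumption)
        for sample in samples:
            alpha = [a + x for a, x in zip(alpha, sample)]
        posteriors[label] = alpha
    return posteriors


def classify(counts, posteriors):
    """Return the class with the highest posterior predictive log-likelihood."""
    return max(posteriors, key=lambda label: log_dm_likelihood(counts, posteriors[label]))


# Hypothetical OTU counts over three taxa (illustrative data only).
training = {
    "case": [[30, 5, 5], [28, 6, 6]],
    "control": [[5, 30, 5], [6, 27, 7]],
}
posteriors = fit_class_posteriors(training)
predicted = classify([25, 8, 7], posteriors)
```

Because the Dirichlet prior is conjugate to the multinomial, training reduces to adding counts, which is one reason Bayesian approaches of this family remain usable when there are few samples relative to the number of OTUs.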


2019 ◽  
Author(s):  
Kuan-Hao Chao ◽  
Yi-Wen Hsiao ◽  
Yi-Fang Lee ◽  
Chien-Yueh Lee ◽  
Liang-Chuan Lai ◽  
...  

RNA-Seq analysis has revolutionized researchers' understanding of the transcriptome in biological research. Assessing the differences in transcriptomic profiles between tissue samples or patient groups enables researchers to explore the underlying biological impact of transcription. RNA-Seq analysis requires multiple processing steps and substantial computational resources. There are many well-developed R packages for individual steps; however, there are few R/Bioconductor packages that integrate existing software tools into a comprehensive RNA-Seq analysis and provide fundamental end-to-end results in a pure R environment so that researchers can quickly and easily extract fundamental information from large sequencing data sets. To address this need, we have developed the open source R/Bioconductor package RNASeqR. It allows users to run an automated RNA-Seq analysis with only six steps, producing essential tabular and graphical results for further biological interpretation. The features of RNASeqR include: six-step analysis, comprehensive visualization, background execution version, and the integration of both R and command-line software. RNASeqR provides a fast, lightweight, and easy-to-run RNA-Seq analysis pipeline in a pure R environment. It allows users to efficiently utilize popular software tools, including both R/Bioconductor and command-line tools, without predefining the resources or environments. RNASeqR is freely available for Linux and macOS operating systems from Bioconductor (https://bioconductor.org/packages/release/bioc/html/RNASeqR.html).


2021 ◽  
Author(s):  
Pengfan Zhang ◽  
Stijn Spaepen ◽  
Yang Bai ◽  
Stephane Hacquard ◽  
Ruben Garrido-Oter

Abstract Motivation Synthetic microbial communities (SynComs) constitute an emergent and powerful tool in biological, biomedical, and biotechnological research. Despite recent advances in algorithms for analysis of culture-independent amplicon sequencing data from microbial communities, there is a lack of tools specifically designed for analysing SynCom data, where reference sequences for each strain are available. Results Here we present Rbec, a tool designed for analysing SynCom data that outperforms current methods by accurately correcting errors in amplicon sequences and identifying intra-strain polymorphic variation. Extensive evaluation using mock bacterial and fungal communities shows that our tool performs robustly for samples of varying complexity, diversity, and sequencing depth. Further, Rbec also allows accurate detection of contamination in SynCom experiments. Availability Rbec is freely available as an open-source R package and can be downloaded at: https://github.com/PengfanZhang/Microbiome.


2018 ◽  
Author(s):  
Keith Mitchell ◽  
Christopher Dao ◽  
Amanda Freise ◽  
Serghei Mangul ◽  
Jordan Moberg Parker

Abstract Microbial community profiling and functional inference via 16S rRNA analysis is quickly expanding across various areas of microbiology due to improvements in technology. There are numerous platforms for producing 16S rRNA taxonomic data, which often vary in file and sequence formatting, creating a common barrier in microbiome studies. Additionally, many of the methods for analyzing and visualizing these sequencing data each require their own specific formatting. As a result, efficient and reproducible comparative analysis of taxonomic data and corresponding metadata in multiple programs remains a challenge in the investigation of microbial communities. PUMA, the Program for Unifying Microbiome Analysis, alleviates this problem in microbiome studies by allowing users to take advantage of numerous 16S rRNA taxonomic identification platforms and analysis tools in an efficient manner. PUMA accepts sequencing results from several taxonomic identification platforms and then automates configuration of data and file types for analysis and visualization via many popular tools. The protocol accomplishes this by producing a variety of properly configured, annotated, and altered files for both analysis and visualization of taxonomic community profiles and inferred functional profiles. PUMA provides an easy and flexible interface that allows a variety of users to produce all files needed for all-inclusive analysis of targeted amplicon sequencing studies. PUMA is an unprecedented open-source solution for unifying multiple microbiome analysis software tools and uses an adaptable implementation with the potential to improve and consolidate the state of microbiome research.


2020 ◽  
Author(s):  
Megan Sarah Beaudry ◽  
Jincheng Wang ◽  
Troy Kieran ◽  
Jesse Thomas ◽  
Natalia Juliana Bayona-Vasquez ◽  
...  

Environmental microbial diversity is often investigated from a molecular perspective using 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics. While amplicon methods are fast, low-cost, and have curated reference databases, they can suffer from amplification bias and are limited in genomic scope. In contrast, shotgun metagenomic methods sample more genomic regions with fewer sequence acquisition biases. However, shotgun metagenomic sequencing is much more expensive (even with moderate sequencing depth) and computationally challenging. Here, we develop a set of 16S rRNA sequence capture baits that offer a potential middle ground with the advantages from both approaches for investigating microbial communities. These baits cover the diversity of all 16S rRNA sequences available in the Greengenes (v. 13.5) database, with no sequence having < 80% sequence similarity to at least one bait for all segments of 16S. The use of our baits provides comparable results to 16S amplicon libraries and shotgun metagenomic libraries when assigning taxonomic units from 16S sequences within the metagenomic reads. We demonstrate that 16S rRNA capture baits can be used on a range of microbial samples (i.e., mock communities and rodent fecal samples) to increase the proportion of 16S rRNA sequences (average >400-fold) and decrease analysis time to obtain consistent community assessments. Furthermore, our study reveals that bioinformatic methods used to analyze sequencing data may have a greater influence on estimates of community composition than the library preparation method used, likely due in part to the extent and curation of the reference databases considered.

