PUMA: A tool for processing 16S rRNA taxonomy data for analysis and visualization

2018
Author(s): Keith Mitchell, Christopher Dao, Amanda Freise, Serghei Mangul, Jordan Moberg Parker

Abstract
Microbial community profiling and functional inference via 16S rRNA analysis is quickly expanding across various areas of microbiology due to improvements in technology. There are numerous platforms for producing 16S rRNA taxonomic data, which often vary in file and sequence formatting, creating a common barrier in microbiome studies. Additionally, many of the methods for analyzing and visualizing these sequencing data require their own specific formatting. As a result, efficient and reproducible comparative analysis of taxonomic data and corresponding metadata in multiple programs remains a challenge in the investigation of microbial communities. PUMA, the Program for Unifying Microbiome Analysis, alleviates this problem by allowing users to take advantage of numerous 16S rRNA taxonomic identification platforms and analysis tools in an efficient manner. PUMA accepts sequencing results from several taxonomic identification platforms and then automates configuration of data and file types for analysis and visualization via many popular tools. The protocol accomplishes this by producing a variety of properly configured, annotated, and altered files for both analysis and visualization of taxonomic community profiles and inferred functional profiles. PUMA provides an easy and flexible interface that accommodates a variety of users and produces all files needed for all-inclusive analysis of targeted amplicon sequencing studies. PUMA is an unprecedented open-source solution for unifying multiple microbiome analysis software tools and uses an adaptable implementation with the potential to improve and consolidate the state of microbiome research.


2021, Vol 12 (1)
Author(s): Caitlin M. Singleton, Francesca Petriglieri, Jannie M. Kristensen, Rasmus H. Kirkegaard, Thomas Y. Michaelsen, ...

Abstract
Microorganisms play crucial roles in water recycling, pollution removal and resource recovery in the wastewater industry. The structure of these microbial communities is increasingly understood based on 16S rRNA amplicon sequencing data. However, such data cannot be linked to functional potential in the absence of high-quality metagenome-assembled genomes (MAGs) for nearly all species. Here, we use long-read and short-read sequencing to recover 1083 high-quality MAGs, including 57 closed circular genomes, from 23 Danish full-scale wastewater treatment plants. The MAGs account for ~30% of the community based on relative abundance, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We use the information provided by these MAGs in combination with >13 years of 16S rRNA amplicon sequencing data, as well as Raman microspectroscopy and fluorescence in situ hybridisation, to uncover abundant undescribed lineages belonging to important functional groups.



2017
Author(s): Jon G Sanders, Piotr Lukasik, Megan E Frederickson, Jacob A Russell, Ryuichi Koga, ...

Abstract
Abundance is a key parameter in microbial ecology, and important to estimates of potential metabolite flux, impacts of dispersal, and sensitivity of samples to technical biases such as laboratory contamination. However, modern amplicon-based sequencing techniques by themselves typically provide no information about the absolute abundance of microbes. Here, we use fluorescence microscopy and quantitative PCR as independent estimates of microbial abundance to test the hypothesis that microbial symbionts have enabled ants to dominate tropical rainforest canopies by facilitating herbivorous diets, and compare these methods to microbial diversity profiles from 16S rRNA amplicon sequencing. Through a systematic survey of ants from a lowland tropical forest, we show that the density of gut microbiota varies across several orders of magnitude among ant lineages, with median individuals from many genera only marginally above detection limits. Supporting the hypothesis that microbial symbiosis is important to dominance in the canopy, we find that the abundance of gut bacteria is positively correlated with stable isotope proxies of herbivory among canopy-dwelling ants, but not among ground-dwelling ants. Notably, these broad findings are much more evident in the quantitative data than in the 16S rRNA sequencing data. Our results help to resolve a longstanding question in tropical rainforest ecology, and have broad implications for the interpretation of sequence-based surveys of microbial diversity.



2020, Vol 3 (1)
Author(s): Stephanie D. Jurburg, Maximilian Konzack, Nico Eisenhauer, Anna Heintz-Buschart

Abstract
As DNA sequencing has become more popular, the public genetic repositories where sequences are archived have experienced explosive growth. These repositories now hold invaluable collections of sequences, e.g., for microbial ecology, but whether these data are reusable has not been evaluated. We assessed the availability and state of 16S rRNA gene amplicon sequences archived in public genetic repositories (SRA, EBI, and DDBJ). We screened 26,927 publications in 17 microbiology journals, identifying 2,015 16S rRNA gene sequencing studies. Of these, 7.2% had not made their data public at the time of analysis. Among a subset of 635 studies sequencing the same gene region, 40.3% contained data which was not available or not reusable, and an additional 25.5% contained faults in data formatting or data labeling, creating obstacles for data reuse. Our study reveals gaps in data availability, identifies major contributors to data loss, and offers suggestions for improving data archiving practices.



Microbiome, 2021, Vol 9 (1)
Author(s): Yue Zhao, Anthony Federico, Tyler Faits, Solaiappan Manimaran, Daniel Segrè, ...

Abstract
Background: Microbial communities that live in and on the human body play a vital role in health and disease. Recent advances in sequencing technologies have enabled the study of microbial communities at unprecedented resolution. However, these advances in data generation have presented novel challenges to researchers attempting to analyze and visualize these data.
Results: To address some of these challenges, we have developed animalcules, an easy-to-use interactive microbiome analysis toolkit for 16S rRNA sequencing data, shotgun DNA metagenomics data, and RNA-based metatranscriptomics profiling data. This toolkit combines novel and existing analytics, visualization methods, and machine learning models. For example, the toolkit features traditional microbiome analyses such as alpha/beta diversity and differential abundance analysis, combined with new methods for biomarker identification. In addition, animalcules provides interactive and dynamic figures that enable users to understand their data and discover new insights. animalcules can be used as a standalone command-line R package or users can explore their data with the accompanying interactive R Shiny interface.
Conclusions: We present animalcules, an R package for interactive microbiome analysis through either an interactive interface facilitated by R Shiny or various command-line functions. It is the first microbiome analysis toolkit that supports the analysis of all 16S rRNA, DNA-based shotgun metagenomics, and RNA-sequencing based metatranscriptomics datasets. animalcules can be freely downloaded from GitHub at https://github.com/compbiomed/animalcules or installed through Bioconductor at https://www.bioconductor.org/packages/release/bioc/html/animalcules.html.
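The abstract above names alpha and beta diversity among the traditional analyses. As a rough illustration of what those two quantities measure, here is a minimal Python sketch, not animalcules code (animalcules is an R package and its implementation is not shown here). It assumes the Shannon index as the alpha-diversity measure and Bray-Curtis dissimilarity as the beta-diversity measure. The function names and the toy count vectors are hypothetical.

```python
import math


def shannon_index(counts):
    """Alpha diversity: Shannon index (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Beta diversity: Bray-Curtis dissimilarity between two samples' taxon counts."""
    if len(a) != len(b):
        raise ValueError("samples must list the same taxa in the same order")
    denom = sum(a) + sum(b)
    if denom == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denom


# Hypothetical counts for four taxa in two samples (illustrative only).
sample_a = [10, 10, 10, 10]  # perfectly even community
sample_b = [40, 0, 0, 0]     # single dominant taxon

print(shannon_index(sample_a))          # equals ln(4) for four equally abundant taxa
print(shannon_index(sample_b))          # zero diversity: only one taxon present
print(bray_curtis(sample_a, sample_b))  # 0.75
```

The even sample reaches the maximum Shannon value for four taxa, the single-taxon sample has none, and the two samples differ by 0.75 on the 0 to 1 Bray-Curtis scale.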



2020, Vol 9 (42)
Author(s): Saidu Abdullahi, Hazzeman Haris, Kamarul Z. Zarkasi, Hamzah G. Amir

Abstract
The 16S rRNA gene amplicon sequence data from tailing and nontailing rhizosphere soils of Mimosa pudica from a heavy metal-contaminated area are reported here. Diverse bacterial taxa were represented in the results, and the most dominant phyla were Proteobacteria (41.2%), Acidobacteria (17.1%), and Actinobacteria (14.4%).



2014, Vol 80 (24), pp. 7583-7591
Author(s): Stephen J. Salipante, Toana Kawashima, Christopher Rosenthal, Daniel R. Hoogestraat, Lisa A. Cummings, ...

Abstract
High-throughput sequencing of the taxonomically informative 16S rRNA gene provides a powerful approach for exploring microbial diversity. Here we compare the performances of two common “benchtop” sequencing platforms, Illumina MiSeq and Ion Torrent Personal Genome Machine (PGM), for bacterial community profiling by 16S rRNA (V1-V2) amplicon sequencing. We benchmarked performance by using a 20-organism mock bacterial community and a collection of primary human specimens. We observed comparatively higher error rates with the Ion Torrent platform and report a pattern of premature sequence truncation specific to semiconductor sequencing. Read truncation was dependent on both the directionality of sequencing and the target species, resulting in organism-specific biases in community profiles. We found that these sequencing artifacts could be minimized by using bidirectional amplicon sequencing and an optimized flow order on the Ion Torrent platform. Results of bacterial community profiling performed on the mock community and a collection of 18 human-derived microbiological specimens were generally in good agreement for both platforms; however, in some cases, results differed significantly. Disparities could be attributed to the failure to generate full-length reads for particular organisms on the Ion Torrent platform, organism-dependent differences in sequence error rates affecting classification of certain species, or some combination of these factors. This study demonstrates the potential for differential bias in bacterial community profiles resulting from the choice of sequencing platform alone.





2019
Author(s): Nicolas Tromas, Zofia E. Taranu, Mathieu Castelli, Juliana S. M. Pimentel, Daniel A. Pereira, ...

Summary
Understanding how ecological traits have changed over evolutionary time is a fundamental question in biology. Specifically, the extent to which more closely-related organisms share similar ecological preferences due to phylogenetic conservation, or if they are forced apart by competition, is still debated. Here we explored the co-occurrence patterns of freshwater cyanobacteria at the sub-genus level to investigate whether more closely-related taxa share more similar niches, and to what extent these niches were defined by abiotic or biotic variables. We used deep 16S rRNA gene amplicon sequencing and measured several abiotic environmental parameters (nutrients, temperature, etc.) in water samples collected over time and space in Furnas Reservoir, Brazil. We found that relatively more closely-related Synechococcus (in the continuous range of 93-100% nucleotide identity in 16S) had an increased tendency to co-occur with one another (i.e. had similar realized niches). This tendency could not be easily explained by shared preferences for measured abiotic niche dimensions. Thus, commonly measured abiotic parameters might not be sufficient to characterize, nor to predict community assembly or dynamics. Rather, co-occurrence between Synechococcus and the surrounding community (whether or not they represent true biological interactions) may be a more sensitive measure of realized niches. Overall, our results suggest that realized niches are phylogenetically conserved, at least at the sub-genus level and at the resolution of the 16S marker. Determining how these results generalize to other genera and at finer genetic resolution merits further investigation.

Originality-Significance Statement
We address a fundamental question in ecology and evolution: how do niche preferences change over evolutionary time? Using time-series analysis of 16S rRNA gene amplicon sequencing data, we develop an approach to highlight the importance of biotic factors in defining realized niches, and show how niche preferences change proportionally with the 16S gene molecular clock within the genus Synechococcus. Ours is also one of few studies on the ecology of freshwater Synechococcus, adding significantly to our knowledge about this abundant and widespread lineage of Cyanobacteria.



2020
Author(s): Stephanie D. Jurburg, Maximilian Konzack, Nico Eisenhauer, Anna Heintz-Buschart

Abstract
The sequencing revolution has resulted in the explosive growth of public genetic repositories. These repositories now hold invaluable collections of 16S rRNA gene amplicon sequences, but the extent to which the currently archived data is findable, accessible, and reusable has not been evaluated. We conducted a field-wide assessment of the availability and state of publicly archived 16S rRNA gene amplicon sequencing data. Using custom-built pattern-based text extraction algorithms, we searched 26,927 publications in 17 microbiology or microbial ecology journals, and identified 2,015 studies which performed 16S rRNA gene amplicon sequencing. We found, for example, that 7.2% of these had not been made public at the time of analysis, a trend which increased over time. Of the 635 studies targeting the V3-V4 region of the 16S rRNA gene, 40.3% contained data which was not available or not reusable, and for 25.5% of the studies, faults in data formatting or data labelling were likely to create obstacles in data reuse. Taken together, only 34% of these datasets had potentially reusable data. Our study reveals significant gaps in the availability of currently deposited community sequencing data, identifies major contributors to data loss, and offers suggestions for improving data archiving practices in the future.
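The 34% figure in the abstract above appears to be what remains after subtracting the two problem categories from the 635 V3-V4 studies. A minimal Python check of that arithmetic follows. It assumes the two categories do not overlap, which the abstract implies but does not state outright. The variable names are illustrative.

```python
# Percentages quoted in the abstract for the 635 V3-V4 studies.
unavailable_or_not_reusable_pct = 40.3
formatting_or_labelling_faults_pct = 25.5

# Assumption: the two categories are mutually exclusive, so the
# remainder is the share of studies with potentially reusable data.
potentially_reusable_pct = (
    100.0 - unavailable_or_not_reusable_pct - formatting_or_labelling_faults_pct
)

print(f"{potentially_reusable_pct:.1f}% potentially reusable")
```

Under that assumption the remainder is about 34.2%, which rounds to the reported 34%.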


