Differential abundance analysis for microbial marker-gene surveys

2013 ◽  
Vol 10 (12) ◽  
pp. 1200-1202 ◽  
Author(s):  
Joseph N Paulson ◽  
O Colin Stine ◽  
Héctor Corrada Bravo ◽  
Mihai Pop
2020 ◽  
Vol 36 (13) ◽  
pp. 3959-3965
Author(s):  
Yuanjing Ma ◽  
Yuan Luo ◽  
Hongmei Jiang

Abstract. Motivation: Microbial communities have been shown to be closely related to many diseases. The identification of differentially abundant microbial species is clinically meaningful for finding disease-related pathogenic or probiotic bacteria. However, certain characteristics of microbiome data have hindered the accuracy and effectiveness of differential abundance analysis. The abundances or counts of microbiome species are usually on different scales and exhibit zero-inflation and over-dispersion. Normalization is a crucial step before the differential abundance test. However, existing normalization methods typically try to adjust counts on different scales to a common scale by constructing size factors with the assumption that count distributions across samples are equivalent up to a certain percentile. These methods often yield undesirable results when differentially abundant species are of low to medium abundance level. For differential abundance analysis, existing methods often use a single distribution to model the dispersion of species, which lacks the flexibility to capture the distinctiveness of an individual species. These methods tend to detect many false positives and often lack power when the effect size is small. Results: We develop a novel framework for differential abundance analysis on sparse high-dimensional marker gene microbiome data. Our methodology relies on a novel network-based normalization technique and a two-stage zero-inflated mixture count regression model (RioNorm2). Our normalization method aims to find a group of relatively invariant microbiome species across samples and conditions in order to construct the size factor. Another contribution of the paper is that our testing approach can take under-sampling and over-dispersion into consideration by separating microbiome species into two groups and modeling them separately.
In comprehensive simulation studies, our method is consistently powerful and robust across settings with different sample sizes, library sizes and effect sizes. We also demonstrate the effectiveness of our novel framework using a published dataset of metastatic melanoma and find biological insights from the results. Availability and implementation: The R package ‘RioNorm2’ can be installed from GitHub at https://github.com/yuanjing-ma/RioNorm2. Supplementary information: Supplementary data are available at Bioinformatics online.
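
The abstract above describes existing normalization methods as building size factors under the assumption that count distributions agree across samples up to a certain percentile. The following is a minimal sketch of that general idea only, not of RioNorm2's network-based method. The function names, the fixed percentile `q`, the nearest-rank quantile rule and the toy counts are all assumptions made for illustration.

```python
import math


def nonzero_quantile(counts, q):
    """Nearest-rank q-th quantile of the nonzero counts in one sample."""
    nonzero = sorted(c for c in counts if c > 0)
    if not nonzero:
        raise ValueError("sample has no nonzero counts")
    rank = max(1, math.ceil(q * len(nonzero)))
    return nonzero[rank - 1]


def size_factor(counts, q=0.5):
    """Sum of the counts at or below the sample's q-th nonzero quantile."""
    cutoff = nonzero_quantile(counts, q)
    return sum(c for c in counts if c <= cutoff)


def normalize(samples, q=0.5):
    """Divide every count in a sample by that sample's size factor."""
    return {
        name: [c / size_factor(counts, q) for c in counts]
        for name, counts in samples.items()
    }


# Hypothetical counts: sample_2 is sequenced twice as deeply for the first
# four taxa, while the last taxon differs between the two samples.
samples = {
    "sample_1": [0, 2, 4, 6, 100],
    "sample_2": [0, 4, 8, 12, 50],
}
normalized = normalize(samples)
for name, values in normalized.items():
    print(name, [round(v, 3) for v in values])
```

In this toy case the low-abundance taxa line up after scaling, so only the last taxon still differs. The sketch fixes the percentile by hand; how a real method chooses it, and how it behaves when the differentially abundant taxa fall below the cutoff, is exactly the weakness the abstract points to.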


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 726
Author(s):  
Mike W.C. Thang ◽  
Xin-Yi Chua ◽  
Gareth Price ◽  
Dominique Gorse ◽  
Matt A. Field

Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences. While software for detailing the composition of microbial communities using 16S rRNA marker genes is relatively mature, researchers are increasingly interested in identifying changes exhibited within microbial communities under differing environmental conditions. In order to gain maximum value from metagenomic sequence data we must improve the existing analysis environment by providing accessible and scalable computational workflows able to generate reproducible results. Here we describe a complete end-to-end open-source metagenomics workflow running within Galaxy for 16S differential abundance analysis. The workflow accepts 454 or Illumina sequence data (either overlapping or non-overlapping paired-end reads) and outputs lists of the operational taxonomic units (OTUs) exhibiting the greatest change under differing conditions. A range of analysis steps and graphing options are available, giving users a high level of control over their data and analyses. Additionally, users are able to input complex sample-specific metadata which can be incorporated into differential analysis and used for grouping / colouring within graphs. Detailed tutorials containing sample data and existing workflows are available for three different input types: overlapping and non-overlapping read pairs as well as pre-generated Biological Observation Matrix (BIOM) files. Using the Galaxy platform we developed MetaDEGalaxy, a complete metagenomics differential abundance analysis workflow. MetaDEGalaxy is designed for bench scientists working with 16S data who are interested in comparative metagenomics. MetaDEGalaxy builds on momentum within the wider Galaxy metagenomics community with the hope that more tools will be added as existing methods mature.


2017 ◽  
Vol 95 (9) ◽  
pp. 855-857
Author(s):  
Henrique Reggiani ◽  
Jorge Meléndez

The differential abundance analysis method can improve the precision of stellar chemical abundances. The method compares the equivalent width of a given line in a star with that of the same line in a star considered to be a standard representative of its class, using high-resolution, high signal-to-noise ratio spectra. The method has achieved great results by reducing the measurement errors to unprecedentedly low levels. However, to date, there has not been a consistent analysis of the actual improvements of this method when compared to a classical analysis in metal-poor stars. Here we present a comparison between the errors of a classical stellar analysis and a differential analysis among low-metallicity stars.


2018 ◽  
Author(s):  
LM Simon ◽  
G Tsitsiridis ◽  
P Angerer ◽  
FJ Theis

Abstract. Motivation: The MetaMap resource contains metatranscriptomic expression data from screening >17,000 RNA-seq samples from >400 archived human disease-related studies for viral and microbial reads, so-called “metafeatures”. However, navigating this set of large and heterogeneous data is challenging, especially for researchers without bioinformatic expertise. Therefore, a user-friendly interface is needed that allows users to visualize and statistically analyse the data. Results: We developed an interactive frontend to facilitate the exploration of the MetaMap resource. The web tool allows users to query the resource by searching study abstracts for keywords or browsing expression patterns for specific metafeatures. Moreover, users can manually define sample groupings or use the existing annotation for downstream analysis. The web tool provides a large variety of analyses and visualizations including dimension reduction, differential abundance analysis and Krona visualizations. The MetaMap web tool represents a valuable resource for hypothesis generation regarding the impact of the microbiome in human disease. Availability: The presented web tool can be accessed at https://github.com/theislab/MetaMap


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Huang Lin ◽  
Shyamal Das Peddada

Abstract. Increasingly, researchers are discovering associations between the microbiome and a wide range of human diseases such as obesity, inflammatory bowel diseases, HIV, and so on. The first step towards microbiome-wide association studies is the characterization of the composition of the human microbiome under different conditions. Determination of differentially abundant microbes between two or more environments, known as differential abundance (DA) analysis, is a challenging and important problem that has received considerable interest during the past decade. It is well documented in the literature that the observed microbiome data (OTU/SV table) are relative abundances with an excess of zeros. Since relative abundances sum to a constant, these data are necessarily compositional. In this article we review some recent methods for DA analysis and describe their strengths and weaknesses.
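
The point that relative abundances sum to a constant can be made concrete with a small worked example. This is an illustrative sketch only: the taxon names and counts are hypothetical, and it does not reproduce any method reviewed in the article.

```python
def relative_abundance(counts):
    """Convert absolute counts to proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: c / total for taxon, c in counts.items()}


# Hypothetical absolute abundances: only taxon_A changes between conditions.
control = {"taxon_A": 100, "taxon_B": 100, "taxon_C": 200}
treated = {"taxon_A": 400, "taxon_B": 100, "taxon_C": 200}

rel_control = relative_abundance(control)
rel_treated = relative_abundance(treated)

for taxon in control:
    print(taxon, round(rel_control[taxon], 3), round(rel_treated[taxon], 3))

# The ratio between two unchanged taxa survives the conversion to proportions.
ratio_control = rel_control["taxon_B"] / rel_control["taxon_C"]
ratio_treated = rel_treated["taxon_B"] / rel_treated["taxon_C"]
print(round(ratio_control, 3), round(ratio_treated, 3))
```

Although only taxon_A changes in absolute terms, the proportions of taxon_B and taxon_C both drop, because the proportions are forced to sum to 1. A test applied naively to the proportions could therefore flag taxa that did not change, whereas the ratio between taxon_B and taxon_C is unaffected.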


2016 ◽  
Vol 586 ◽  
pp. A67 ◽  
Author(s):  
Henrique Reggiani ◽  
Jorge Meléndez ◽  
David Yong ◽  
Ivan Ramírez ◽  
Martin Asplund

2003 ◽  
Vol 209 ◽  
pp. 562-562
Author(s):  
J. R. Walsh ◽  
A. A. Zijlstra ◽  
D. Péquignot

Two planetary nebulae (PNe) previously classified as Galactic were discovered, on the basis of their radial velocities, to be located in the Sagittarius Dwarf galaxy (Zijlstra & Walsh, 1996, A&A, 312, L21). At the distance of Sagittarius of ~25 kpc, they are the closest extragalactic PNe. A detailed analysis based on ground-based spectra and radio continuum data was published (Dudziak et al. 2000, A&A, 363, 717); it was shown that the two nebulae are on the same evolutionary track with an initial mass of 1.2 M⊙ and almost identical light element abundances. One of the nebulae, Wray 16-423, underwent PN ejection about 1500 yr before its twin He 2-436. On the basis of their similarity, a differential abundance analysis could be conducted. Third dredge-up carbon was found to be more abundant in He 2-436 and the first conclusive evidence for third dredge-up oxygen enrichment was revealed (Péquignot et al. 2000, A&A, 361, L1).


2017 ◽  
Vol 838 (2) ◽  
pp. 90 ◽  
Author(s):  
Erin M. O’Malley ◽  
Andrew McWilliam ◽  
Brian Chaboyer ◽  
Ian Thompson

Author(s):  
Matthew L Davis ◽  
Yuan Huang ◽  
Kai Wang

Abstract. A major task in the analysis of microbiome data is to identify microbes associated with differing biological conditions. Before conducting analysis, raw data must first be adjusted so that counts from different samples are comparable. A typical approach is to estimate normalization factors by which all counts in a sample are multiplied or divided. However, the inherent variation associated with the estimation of normalization factors is often not accounted for in subsequent analysis, leading to a loss of precision. Rank normalization is a nonparametric alternative to the estimation of normalization factors in which each count for a microbial feature is replaced by its intrasample rank. Although rank normalization has been successfully applied to microarray analysis in the past, it has yet to be explored for microbiome data, which are characterized by high frequencies of 0s, strongly correlated features and compositionality. We propose to use rank normalization as an alternative to the estimation of normalization factors and examine its performance when paired with a two-sample t-test. On a rigorous third-party benchmarking simulation, it is shown to offer strong control over the false discovery rate and, at sample sizes greater than 50 per treatment group, to offer an improvement in performance over commonly used normalization factors paired with t-tests, Wilcoxon rank-sum tests and methodologies implemented by R packages. On two real datasets, it yielded valid and reproducible results that were strongly in agreement with the original findings and the existing literature, further demonstrating its robustness and future potential. Availability: The data underlying this article are available online along with R code and supplementary materials at https://github.com/matthewlouisdavisBioStat/Rank-Normalization-Empowers-a-T-Test.
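
The procedure described above, replacing each count by its intrasample rank and then comparing groups with a two-sample t-test, can be sketched as follows. This is not the authors' R code. The average-rank treatment of ties, the choice of Welch's unequal-variance statistic, the function names and the toy counts are all assumptions made for illustration.

```python
import math


def average_ranks(values):
    """Rank values within one sample (1 = smallest); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def welch_t(x, y):
    """Welch's two-sample t statistic; NaN if both groups have zero variance."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((v - mean_x) ** 2 for v in x) / (len(x) - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (len(y) - 1)
    se = math.sqrt(var_x / len(x) + var_y / len(y))
    return (mean_x - mean_y) / se if se > 0 else float("nan")


def rank_t_statistics(group_1, group_2):
    """Rank-normalize each sample, then compute one t statistic per feature."""
    ranked_1 = [average_ranks(sample) for sample in group_1]
    ranked_2 = [average_ranks(sample) for sample in group_2]
    n_features = len(group_1[0])
    return [
        welch_t([s[f] for s in ranked_1], [s[f] for s in ranked_2])
        for f in range(n_features)
    ]


# Hypothetical counts: rows are samples, columns are three microbial features.
group_1 = [[0, 5, 20], [0, 8, 3], [1, 4, 25]]
group_2 = [[10, 5, 2], [12, 3, 0], [2, 6, 1]]

t_stats = rank_t_statistics(group_1, group_2)
print([round(t, 3) for t in t_stats])
```

Because ranks are computed within each sample, no normalization factor has to be estimated. Turning the statistics into p-values and controlling the false discovery rate are left out of this sketch.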

