Reduced Metagenome Sequencing for strain-resolution taxonomic profiles

2020 ◽  
Author(s):  
Lars Snipen ◽  
Inga-Leena Angell ◽  
Torbjørn Rognes ◽  
Knut Rudi

Abstract Background: Studies of shifts in microbial community composition have many applications. For studies at the species or subspecies level, 16S amplicon sequencing lacks resolution and is often replaced by full shotgun sequencing. Due to higher costs, this restricts the number of samples sequenced. As an alternative to full shotgun sequencing, we have investigated the use of Reduced Metagenome Sequencing (RMS) to estimate the composition of a microbial community. This involves double-digested restriction-associated DNA sequencing, which means only a smaller fraction of the genomes is sequenced. The read sets obtained by this approach have properties different from both amplicon and shotgun data, and analysis pipelines for those data types either cannot be used at all or do not exploit the full potential of RMS data. Results: We suggest a procedure for analyzing such data, based on fragment clustering and a constrained ordinary least squares deconvolution for estimating the relative abundance of all community members. Mock-community data sets show the potential to clearly separate strains even when the 16S is 100% identical and genome-wide differences are < 0.02, indicating that RMS has very high resolution. In a simulation study we compare RMS to shotgun sequencing and show that we get improved abundance estimates when the community has many very closely related genomes. From a real data set of infant guts we show that RMS is capable of detecting a strain-diversity gradient for Escherichia coli across time. Conclusion: We find that RMS is a good alternative to either metabarcoding or shotgun sequencing when it comes to resolving microbial communities at the strain level. Like shotgun metagenomics, it requires a good database of reference genomes, and it is well suited for studies of the human gut or other communities where many reference genomes exist. A data analysis pipeline is offered as an R package at https://github.com/larssnip/microRMS.

2021 ◽  
Author(s):  
Lars Snipen ◽  
Inga-Leena Angell ◽  
Torbjørn Rognes ◽  
Knut Rudi

Abstract Background: Studies of shifts in microbial community composition have many applications. For studies at the species or subspecies level, 16S amplicon sequencing lacks resolution and is often replaced by full shotgun sequencing. Due to higher costs, this restricts the number of samples sequenced. As an alternative to full shotgun sequencing, we have investigated the use of Reduced Metagenome Sequencing (RMS) to estimate the composition of a microbial community. This involves double-digested restriction-associated DNA sequencing, which means only a smaller fraction of the genomes is sequenced. The read sets obtained by this approach have properties different from both amplicon and shotgun data, and analysis pipelines for those data types either cannot be used at all or do not exploit the full potential of RMS data. Results: We suggest a procedure for analyzing such data, based on fragment clustering and a constrained ordinary least squares deconvolution for estimating the relative abundance of all community members. Mock-community data sets show the potential to clearly separate strains even when the 16S is 100% identical and genome-wide differences are < 0.02, indicating that RMS has very high resolution. In a simulation study we compare RMS to shotgun sequencing and show that we get improved abundance estimates when the community has many very closely related genomes. From a real data set of infant guts we show that RMS is capable of detecting a strain-diversity gradient for Escherichia coli across time. Conclusion: We find that RMS is a good alternative to either metabarcoding or shotgun sequencing when it comes to resolving microbial communities at the strain level. Like shotgun metagenomics, it requires a good database of reference genomes, and it is well suited for studies of the human gut or other communities where many reference genomes exist. A data analysis pipeline is offered as an R package at https://github.com/larssnip/microRMS.


2020 ◽  
Author(s):  
Lars Snipen ◽  
Inga-Leena Angell ◽  
Torbjørn Rognes ◽  
Knut Rudi

Abstract Background: Studies of shifts in microbial community composition have many applications. For studies at the species or subspecies level, 16S amplicon sequencing lacks resolution and is often replaced by full shotgun sequencing. Due to higher costs, this restricts the number of samples sequenced. As an alternative to full shotgun sequencing, we have investigated the use of Reduced Metagenome Sequencing (RMS) to estimate the composition of a microbial community. This involves double-digested restriction-associated DNA sequencing, which means only a smaller fraction of the genomes is sequenced. The read sets obtained by this approach have properties different from both amplicon and shotgun data, and analysis pipelines for those data types either cannot be used at all or do not exploit the full potential of RMS data. Results: We suggest a procedure for analyzing such data, based on fragment clustering and a constrained ordinary least squares deconvolution for estimating the relative abundance of all community members. Mock-community data sets show the potential to clearly separate strains even when the 16S is 100% identical and genome-wide differences are < 0.02, indicating that RMS has very high resolution. In a simulation study we compare RMS to shotgun sequencing and show that we get improved abundance estimates when the community has many very closely related genomes. From a real data set of infant guts we show that RMS is capable of detecting a strain-diversity gradient for Escherichia coli across time. Conclusion: We find that RMS is a good alternative to either metabarcoding or shotgun sequencing when it comes to resolving microbial communities at the strain level. Like shotgun metagenomics, it requires a good database of reference genomes, and it is well suited for studies of the human gut or other communities where many reference genomes exist. A data analysis pipeline is offered as an R package at https://github.com/larssnip/microRMS.
Keywords: metagenome; strains; ddRADseq


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Lars Snipen ◽  
Inga-Leena Angell ◽  
Torbjørn Rognes ◽  
Knut Rudi

Abstract Background Studies of shifts in microbial community composition have many applications. For studies at the species or subspecies level, 16S amplicon sequencing lacks resolution and is often replaced by full shotgun sequencing. Due to higher costs, this restricts the number of samples sequenced. As an alternative to full shotgun sequencing, we have investigated the use of Reduced Metagenome Sequencing (RMS) to estimate the composition of a microbial community. This involves double-digested restriction-associated DNA sequencing, which means only a smaller fraction of the genomes is sequenced. The read sets obtained by this approach have properties different from both amplicon and shotgun data, and analysis pipelines for those data types either cannot be used at all or do not exploit the full potential of RMS data. Results We suggest a procedure for analyzing such data, based on fragment clustering and a constrained ordinary least squares deconvolution for estimating the relative abundance of all community members. Mock community datasets show the potential to clearly separate strains even when the 16S is 100% identical and genome-wide differences are < 0.02, indicating that RMS has very high resolution. In a simulation study, we compare RMS to shotgun sequencing and show that we get improved abundance estimates when the community has many very closely related genomes. From a real dataset of infant guts, we show that RMS is capable of detecting a strain diversity gradient for Escherichia coli across time. Conclusion We find that RMS is a good alternative to either metabarcoding or shotgun sequencing when it comes to resolving microbial communities at the strain level. Like shotgun metagenomics, it requires a good database of reference genomes, and it is well suited for studies of the human gut or other communities where many reference genomes exist. A data analysis pipeline is offered as an R package at https://github.com/larssnip/microRMS.
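
The deconvolution step named in the abstract can be illustrated with a small sketch. The abstract does not state which constraint is used, so the code below assumes non-negativity of the abundances followed by normalization to proportions, and solves the problem with a plain projected-gradient loop. The function names, the toy matrix `A` (how many fragments of each cluster each genome contributes) and the read counts `y` are all hypothetical; this is not the microRMS implementation, which is an R package.

```python
def nnls_projected_gradient(A, y, iterations=5000):
    """Minimize ||A b - y||^2 subject to b >= 0 by projected gradient descent."""
    m, k = len(A), len(A[0])
    # The squared Frobenius norm of A is an upper bound on the largest
    # eigenvalue of A^T A, so 1 / bound is a safe (conservative) step size.
    bound = sum(a * a for row in A for a in row)
    step = 1.0 / bound
    b = [0.0] * k
    for _ in range(iterations):
        residual = [sum(A[i][j] * b[j] for j in range(k)) - y[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(k)]
        # Gradient step, then projection onto the non-negative orthant.
        b = [max(0.0, b[j] - step * grad[j]) for j in range(k)]
    return b


def relative_abundance(A, y):
    """Scale the non-negative solution so that the abundances sum to 1."""
    b = nnls_projected_gradient(A, y)
    total = sum(b)
    return [x / total for x in b] if total > 0 else b


# Hypothetical toy input: 5 fragment clusters (rows) x 3 genomes (columns).
A = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
# Read counts per cluster, generated without noise from coverages 60, 30, 10.
y = [90, 60, 30, 40, 10]

estimate = relative_abundance(A, y)
print([round(x, 4) for x in estimate])
```

For this noise-free toy input the estimate should come out close to 0.6, 0.3 and 0.1. With real read counts the fit is only approximate, and a dedicated solver would normally replace the hand-written loop.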


In this paper, we define a new two-parameter distribution, the new Lindley half Cauchy (NLHC) distribution, using the Lindley-G family of distributions; it accommodates increasing, decreasing and a variety of monotone failure rates. Statistical properties of the proposed distribution, such as the probability density function, cumulative distribution function, quantile function, and measures of skewness and kurtosis, are presented. We briefly describe three well-known estimation methods, namely maximum likelihood estimation (MLE), least-squares estimation (LSE) and the Cramér-von Mises (CVM) method. All computations are performed in R. Using the maximum likelihood method, we construct asymptotic confidence intervals for the model parameters. We verify empirically the potential of the new distribution in modeling a real data set.


Author(s):  
Arun Kumar Chaudhary ◽  
Vijay Kumar

In this study, we introduce a three-parameter probabilistic model derived from the type I half logistic-generating family, called the half logistic modified exponential distribution. The mathematical and statistical properties of this distribution are explored, and the behavior of its probability density, hazard rate, and quantile functions is investigated. The model parameters are estimated using three well-known estimation methods, namely maximum likelihood estimation (MLE), least-squares estimation (LSE) and Cramér-von Mises estimation (CVME). Further, we take a real data set and verify that the presented model is useful and more flexible for dealing with real data. KEYWORDS: Half-logistic distribution, Estimation, CVME, LSE, MLE
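
To make the three estimation methods concrete, here is a minimal sketch that applies them to a one-parameter exponential distribution, used purely as a stand-in because the abstract does not give the density of the proposed model. The sample `xs` is made up for illustration, the helper names are hypothetical, and the LSE and CVM objectives are the usual textbook forms, which may differ in detail from the paper's.

```python
import math


def exp_cdf(x, rate):
    """CDF of the stand-in exponential model, F(x) = 1 - exp(-rate * x)."""
    return 1.0 - math.exp(-rate * x)


def lse_objective(rate, xs):
    """Least squares: squared distance of F(x_(i)) from i / (n + 1)."""
    n = len(xs)
    return sum((exp_cdf(x, rate) - i / (n + 1)) ** 2
               for i, x in enumerate(sorted(xs), start=1))


def cvm_objective(rate, xs):
    """Cramer-von Mises: 1/(12n) plus squared distance from (2i - 1) / (2n)."""
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (exp_cdf(x, rate) - (2 * i - 1) / (2 * n)) ** 2
        for i, x in enumerate(sorted(xs), start=1))


def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate a local minimizer of f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2


# Hypothetical sample, made up for illustration only.
xs = [0.12, 0.35, 0.48, 0.71, 0.95, 1.30, 1.62, 2.40]

mle_rate = len(xs) / sum(xs)  # closed-form MLE for the exponential rate
lse_rate = golden_section_min(lambda r: lse_objective(r, xs), 1e-6, 20.0)
cvm_rate = golden_section_min(lambda r: cvm_objective(r, xs), 1e-6, 20.0)
print(mle_rate, lse_rate, cvm_rate)
```

For a multi-parameter model such as the one in the abstract, the same objectives would be minimized over a parameter vector with a general-purpose optimizer rather than a one-dimensional search.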


2021 ◽  
Vol 50 (5) ◽  
pp. 77-100
Author(s):  
Aidi khaoula ◽  
Sanku Dey ◽  
Devendra Kumar ◽  
Seddik-Ameur N

In this paper, we contribute to the distribution theory literature by introducing a new bounded distribution on the interval (0, 1), called the unit generalized inverse Weibull distribution (UGIWD), obtained by the transformation method. The proposed distribution exhibits increasing and bathtub-shaped hazard rate functions. We derive some basic statistical properties of the new distribution. Based on a complete sample, the model parameters are estimated by the methods of maximum likelihood, least squares, weighted least squares, percentiles, maximum product of spacings and Cramér-von Mises, and the estimators are compared using a Monte Carlo simulation study. In addition, bootstrap confidence intervals for the model parameters based on the aforementioned estimation methods are obtained. We illustrate the performance of the proposed distribution by means of one real data set, which shows that the new distribution is more appropriate than the unit Birnbaum-Saunders, unit gamma, unit Weibull, Kumaraswamy and unit Burr III distributions. Further, we construct chi-squared goodness-of-fit tests for the UGIWD using right-censored data, based on the Nikulin-Rao-Robson (NRR) statistic and its modification. The test criterion used is the modified chi-squared statistic Y^2, developed by Bagdonavičius and Nikulin (2011) for some parametric models when data are censored. The performance of the proposed test is shown by an intensive simulation study and an application to a real data set.


mSphere ◽  
2018 ◽  
Vol 3 (4) ◽  
Author(s):  
Adit Chaudhary ◽  
Imrose Kauser ◽  
Anirban Ray ◽  
Rachel Poretsky

ABSTRACT Urban streams are susceptible to stormwater and sewage inputs that can impact their ecological health and water quality. Microbial communities in streams play important functional roles, and their composition and metabolic potential can help assess ecological state and water quality. Although these environments are highly heterogenous, little is known about the influence of isolated perturbations, such as those resulting from rain events on urban stream microbiota. Here, we examined the microbial community composition and diversity in an urban stream during dry and wet weather conditions with both 16S rRNA gene sequencing across multiple years and shotgun metagenomics to more deeply analyze a single storm flow event. Metagenomics was used to assess population-level dynamics as well as shifts in the microbial community taxonomic profile and functional potential before and after a substantial rainfall. The results demonstrated general trends present in the stream under storm flow versus base flow conditions and also highlighted the influence of increased effluent flow following rain in shifting the stream microbial community from abundant freshwater taxa to those more associated with urban/anthropogenic settings. Shifts in the taxonomic composition were also linked to changes in functional gene content, particularly for transmembrane transport and organic substance biosynthesis. We also observed an increase in relative abundance of genes encoding degradation of organic pollutants and antibiotic resistance after rain. Overall, this study highlighted some differences in the microbial community of an urban stream under storm flow conditions and showed the impact of a storm flow event on the microbiome from an environmental and public health perspective. 
IMPORTANCE Urban streams in various parts of the world are facing increased anthropogenic pressure on their water quality, and storm flow events represent one such source of complex physical, chemical, and biological perturbations. Microorganisms are important components of these streams from both ecological and public health perspectives. Analysis of the effect of perturbations on the stream microbial community can help improve current knowledge on the impact such chronic disturbances can have on these water resources. This study examines microbial community dynamics during rain-induced storm flow conditions in an urban stream of the Chicago Area Waterway System. Additionally, using shotgun metagenomics we identified significant shifts in the microbial community composition and functional gene content following a high-rainfall event, with potential environment and public health implications. Previous work in this area has focused on specific genes/organisms or has not assessed immediate storm flow impact.


2021 ◽  
Vol 8 (1) ◽  
pp. 01-09
Author(s):  
Sanku Dey ◽  
Mahendra Saha ◽  
Sankar Goswami

This paper addresses different methods of estimating the unknown parameter of the one-parameter A(α) distribution from the frequentist point of view. We briefly describe different approaches, namely the maximum likelihood estimator, least squares and weighted least squares estimators, the maximum product of spacings estimator and the Cramér-von Mises estimator, and compare them using extensive numerical simulations. Next, we obtain parametric bootstrap confidence intervals for the parameter using the frequentist approaches. Finally, one real data set is analysed for illustrative purposes.
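
For reference, the estimators listed above are commonly written as follows. The abstract does not give these formulas, so the block shows the standard textbook forms for an ordered sample $x_{(1)} \le \dots \le x_{(n)}$ with density $f(x;\alpha)$ and distribution function $F(x;\alpha)$; the paper's own definitions may differ in detail.

```latex
\begin{align*}
\hat{\alpha}_{\mathrm{MLE}} &= \arg\max_{\alpha} \sum_{i=1}^{n} \log f\bigl(x_{i};\alpha\bigr), \\
\hat{\alpha}_{\mathrm{LSE}} &= \arg\min_{\alpha} \sum_{i=1}^{n}
  \Bigl[ F\bigl(x_{(i)};\alpha\bigr) - \frac{i}{n+1} \Bigr]^{2}, \\
\hat{\alpha}_{\mathrm{WLSE}} &= \arg\min_{\alpha} \sum_{i=1}^{n}
  \frac{(n+1)^{2}(n+2)}{i\,(n-i+1)}
  \Bigl[ F\bigl(x_{(i)};\alpha\bigr) - \frac{i}{n+1} \Bigr]^{2}, \\
\hat{\alpha}_{\mathrm{MPS}} &= \arg\max_{\alpha} \frac{1}{n+1} \sum_{i=1}^{n+1}
  \log \Bigl[ F\bigl(x_{(i)};\alpha\bigr) - F\bigl(x_{(i-1)};\alpha\bigr) \Bigr],
  \quad F\bigl(x_{(0)};\alpha\bigr) = 0,\; F\bigl(x_{(n+1)};\alpha\bigr) = 1, \\
\hat{\alpha}_{\mathrm{CVM}} &= \arg\min_{\alpha} \Bigl\{ \frac{1}{12n} + \sum_{i=1}^{n}
  \Bigl[ F\bigl(x_{(i)};\alpha\bigr) - \frac{2i-1}{2n} \Bigr]^{2} \Bigr\}.
\end{align*}
```

The least squares and weighted least squares criteria differ only in the weights, which are the reciprocals of the variances of the uniform order statistics.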


2021 ◽  
Author(s):  
Kristen D. Curry ◽  
Qi Wang ◽  
Michael G. Nute ◽  
Alona Tyshaieva ◽  
Elizabeth Reeves ◽  
...  

16S rRNA based analysis is the established standard for elucidating microbial community composition. While short read 16S analyses are largely confined to genus-level resolution at best since only a portion of the gene is sequenced, full-length 16S sequences have the potential to provide species-level accuracy. However, existing taxonomic identification algorithms are not optimized for the increased read length and error rate of long-read data. Here we present Emu, a novel approach that employs an expectation-maximization (EM) algorithm to generate taxonomic abundance profiles from full-length 16S rRNA reads. Results produced from one simulated data set and two mock communities prove Emu capable of accurate microbial community profiling while obtaining fewer false positives and false negatives than alternative methods. Additionally, we illustrate a real-world application of our new software by comparing clinical sample composition estimates generated by an established whole-genome shotgun sequencing workflow to those returned by full-length 16S sequences processed with Emu.
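
The expectation-maximization idea behind abundance profiling can be sketched generically as a mixture-proportion update. The abstract does not describe Emu's likelihood model, so the code below assumes that each read already comes with a likelihood under each candidate taxon; the function name `em_abundance` and the toy likelihood table are hypothetical and are not taken from the Emu software.

```python
def em_abundance(likelihoods, iterations=200):
    """Estimate taxon proportions from per-read likelihoods with a basic EM loop.

    likelihoods[i][j] is the assumed likelihood of read i given taxon j.
    """
    n_taxa = len(likelihoods[0])
    proportions = [1.0 / n_taxa] * n_taxa  # start from a uniform profile
    for _ in range(iterations):
        expected_counts = [0.0] * n_taxa
        for row in likelihoods:
            # E-step: posterior probability that this read came from each taxon.
            weighted = [p * lik for p, lik in zip(proportions, row)]
            total = sum(weighted)
            if total == 0.0:
                continue  # read is incompatible with every taxon; skip it
            for j, w in enumerate(weighted):
                expected_counts[j] += w / total
        # M-step: new proportions are the normalized expected read counts.
        count_sum = sum(expected_counts)
        if count_sum == 0.0:
            break
        proportions = [c / count_sum for c in expected_counts]
    return proportions


# Hypothetical toy input: 5 reads x 3 taxa; the third taxon explains no read.
read_likelihoods = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.5, 0.5, 0.0],
]

profile = em_abundance(read_likelihoods)
print([round(p, 3) for p in profile])
```

Ambiguous reads, such as the last row, are shared between taxa in proportion to the current abundance estimate, which is how this kind of scheme lets well-supported taxa absorb uncertain assignments over successive iterations.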


2009 ◽  
Vol 75 (9) ◽  
pp. 2889-2898 ◽  
Author(s):  
John Schellenberg ◽  
Matthew G. Links ◽  
Janet E. Hill ◽  
Tim J. Dumonceaux ◽  
Geoffrey A. Peters ◽  
...  

ABSTRACT We compared dideoxy sequencing of cloned chaperonin-60 universal target (cpn60 UT) amplicons to pyrosequencing of amplicons derived from vaginal microbial communities. In samples pooled from a number of individuals, the pyrosequencing method produced a data set that included virtually all of the sequences that were found within the clone library and revealed an additional level of taxonomic richness. However, the relative abundances of the sequences were different in the two datasets. These observations were expanded and confirmed by the analysis of paired clone library and pyrosequencing datasets from vaginal swabs taken from four individuals. Both for individuals with a normal vaginal microbiota and for those with bacterial vaginosis, the pyrosequencing method revealed a large number of low-abundance taxa that were missed by the clone library approach. In addition, we showed that the pyrosequencing method generates a reproducible profile of microbial community structure in replicate amplifications from the same community. We also compared the taxonomic composition of a vaginal microbial community determined by pyrosequencing of 16S rRNA amplicons to that obtained using cpn60 universal primers. We found that the profiles generated by the two molecular targets were highly similar, with slight differences in the proportional representation of the taxa detected. However, the number of operational taxonomic units was significantly higher in the cpn60 data set, suggesting that the protein-encoding gene provides improved species resolution over the 16S rRNA target. These observations demonstrate that pyrosequencing of cpn60 UT amplicons provides a robust, reliable method for deep sequencing of microbial communities.

