Equivolumetric Protocol Generates Library Sizes Proportional to Total Microbial Load in 16S Amplicon Sequencing

2021 ◽  
Vol 12 ◽  
Author(s):  
Giuliano Netto Flores Cruz ◽  
Ana Paula Christoff ◽  
Luiz Felipe Valter de Oliveira

High-throughput sequencing of 16S rRNA amplicons has been extensively employed to perform microbiome characterization worldwide. As a culture-independent methodology, it has allowed high-level profiling of bacterial composition directly from samples. However, most studies are limited to information regarding relative bacterial abundances (sample proportions), ignoring scenarios in which sample microbe biomass can vary widely. Here, we use an equivolumetric protocol for 16S rRNA amplicon library preparation capable of generating Illumina sequencing data responsive to input DNA, recovering proportionality between observed read counts and absolute bacterial abundances within each sample. Under specified conditions, we show that the estimation of colony-forming units (CFU), the most common unit of bacterial abundance in classical microbiology, is challenged mostly by resolution and taxon-to-taxon variation. We propose Bayesian cumulative probability models to address such issues. Our results indicate that predictive errors remain consistently below one order of magnitude for total microbial load and abundance of observed bacteria. We also demonstrate that our approach has the potential to generalize to previously unseen bacteria, but predictive performance is hampered by specific taxa of uncommon profile. Finally, it remains clear that high-throughput sequencing data are not inherently restricted to sample proportions only, and such technologies bear the potential to meet the working scales of traditional microbiology.
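
To make the distinction between relative and absolute abundance concrete, the following minimal Python sketch uses made-up read counts (the taxon names and numbers are illustrative assumptions, not data from the study). Two samples with identical composition but a 100-fold difference in total reads become indistinguishable once counts are converted to proportions. Only if library size tracks microbial load, as the abstract reports for the equivolumetric protocol, do the raw counts retain that information.

```python
def relative_abundance(counts):
    """Convert raw read counts per taxon into sample proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}

# Hypothetical read counts: same composition, 100-fold different library size.
low_load = {"taxon_A": 100, "taxon_B": 300}
high_load = {"taxon_A": 10_000, "taxon_B": 30_000}

# The proportions are identical, so relative data hide the difference in load.
print(relative_abundance(low_load))
print(relative_abundance(high_load))

# The ratio of library sizes is the only place the difference survives.
print(sum(high_load.values()) / sum(low_load.values()))
```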

2020 ◽  
Author(s):  
Giuliano Netto Flores Cruz ◽  
Ana Paula Christoff ◽  
Luiz Felipe Valter de Oliveira

Abstract

Background: Next-generation sequencing (NGS) has been extensively employed to perform microbiome characterization worldwide. As a culture-independent methodology, it has allowed high-level profiling of sample microbial composition. However, most studies are limited to information regarding relative bacterial abundances (sample proportions), ignoring scenarios in which sample microbe biomass can vary widely. Here, we develop an equivolumetric protocol for amplicon library preparation capable of generating NGS data responsive to input DNA, recovering proportionality between observed read counts and absolute bacterial abundances within each sample. Within a determined range, we show that the estimation of sample colony-forming units (CFU), the most common unit of bacterial abundance in classical microbiology, is challenged mostly by resolution and taxon-to-taxon variation. We propose Bayesian cumulative probability models to address such issues.

Results: Observed read counts were consistently proportional to input DNA, total microbial load, and bacterium-specific sample abundances, although a saturation tendency was observed as abundances increased. Using Bayesian cumulative probability models, predictive errors in sample CFU estimation remained consistently below one order of magnitude, as measured by the mean absolute log10-ratio (MALR). For total microbial load, observed MALR was no greater than 0.2 during both cross-validation and validation on a test dataset. For observed bacteria, estimation of taxon-specific CFU showed MALR values of at most 0.5. We also performed leave-one-group-out cross-validation to assess predictive performance for previously unseen bacteria. Most bacteria showed MALR no greater than 1; this threshold was exceeded only by Bacillus cereus.

Conclusions: Being able to estimate sample CFU in a high-throughput fashion has a wide range of applications, from the study of built environments to public health surveillance. This study shows that equivolumetric protocols along with cumulative probability models allow sample CFU estimation from microbiome datasets. Further, our approach has the potential to generalize to previously unmodeled bacteria, an important feature in high-throughput settings. Lastly, it remains clear that NGS data are not inherently restricted to sample proportions only, and microbiome science can finally meet the working scales of classical microbiology.
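
The abstract does not spell out how MALR is computed. Reading the name literally, a plausible definition is the mean of |log10(predicted / observed)| over all predictions. The Python sketch below works under that assumption; the function name `malr` and the example CFU values are illustrative and do not come from the study.

```python
import math

def malr(predicted, observed):
    """Mean absolute log10-ratio, assuming MALR = mean(|log10(pred / obs)|).

    Both inputs are sequences of strictly positive values (e.g. CFU).
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return sum(ratios) / len(ratios)

# Hypothetical CFU values: one prediction 10-fold too low, one 10-fold too high.
# Each is off by exactly one order of magnitude.
print(malr([1e3, 1e6], [1e4, 1e5]))
```

Under this reading, an MALR of 1 means predictions are off by a factor of ten on average. An MALR of 0.2 corresponds to a factor of about 1.6 (10^0.2), and 0.5 to a factor of about 3.2 (10^0.5).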


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yasemin Guenay-Greunke ◽  
David A. Bohan ◽  
Michael Traugott ◽  
Corinna Wallinger

Abstract High-throughput sequencing platforms are increasingly being used for targeted amplicon sequencing because they enable cost-effective sequencing of large sample sets. For meaningful interpretation of targeted amplicon sequencing data and comparison between studies, it is critical that bioinformatic analyses do not introduce artefacts and rely on detailed protocols to ensure that all methods are properly performed and documented. The analysis of large sample sets and the use of predefined indexes create challenges, such as adjusting the sequencing depth across samples and taking sequencing errors or index hopping into account. However, the potential biases these factors introduce to high-throughput amplicon sequencing data sets and how they may be overcome have rarely been addressed. Using the example of a nested metabarcoding analysis of 1920 carabid beetle regurgitates to assess plant feeding, we investigated: (i) the variation in sequencing depth of individually tagged samples and the effect of library preparation on the data output; (ii) the influence of sequencing errors within index regions and its consequences for demultiplexing; and (iii) the effect of index hopping. Our results demonstrate that despite library quantification, large variation in read counts and sequencing depth occurred among samples and that the sequencing error rate in bioinformatic software is essential for accurate adapter/primer trimming and demultiplexing. Moreover, setting an index hopping threshold to avoid incorrect assignment of samples is highly recommended.
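
The abstract recommends an index hopping threshold without stating how it was set. One way to express such a filter, shown here purely as an assumed illustration and not as the authors' procedure, is to discard a sequence's reads in any sample where they amount to less than a small fraction of that sequence's total reads across the run. The function name, the 0.1% rate, and the read counts below are all hypothetical.

```python
def apply_hopping_threshold(counts, rate=0.001):
    """Zero out per-sample read counts that fall below `rate` times the
    sequence's total reads across all samples in the run.

    counts: {sample_id: {sequence_id: read_count}}
    """
    # Total reads per sequence over the whole run.
    totals = {}
    for per_sample in counts.values():
        for seq, n in per_sample.items():
            totals[seq] = totals.get(seq, 0) + n

    # Keep a count only if it reaches the per-sequence threshold.
    return {
        sample: {
            seq: (n if n >= rate * totals[seq] else 0)
            for seq, n in per_sample.items()
        }
        for sample, per_sample in counts.items()
    }

# Hypothetical run: a plant sequence abundant in s1 leaks 3 reads into s2.
run = {
    "s1": {"plant_X": 9_000, "plant_Y": 0},
    "s2": {"plant_X": 3, "plant_Y": 500},
}
filtered = apply_hopping_threshold(run)
print(filtered)
```

In this toy run the three stray `plant_X` reads in `s2` fall below 0.1% of the 9,003 total and are removed, while the genuine counts are kept. A real analysis would need a rate calibrated to the sequencing platform and controls.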


2021 ◽  
Vol 9 ◽  
Author(s):  
Olivia N. Choi ◽  
Ammon Corl ◽  
Andrew Wolfenden ◽  
Avishai Lublin ◽  
Suzanne L. Ishaq ◽  
...  

Studies in both humans and model organisms suggest that the microbiome may play a significant role in host health, including digestion and immune function. Microbiota can offer protection from exogenous pathogens through colonization resistance, but microbial dysbiosis in the gastrointestinal tract can decrease resistance and is associated with pathogenesis. Little is known about the effects of potential pathogens, such as Salmonella, on the microbiome in wildlife, which are known to play an important role in disease transmission to humans. Culturing techniques have traditionally been used to detect pathogens, but recent studies have utilized high throughput sequencing of the 16S rRNA gene to characterize host-associated microbial communities (i.e., the microbiome) and to detect specific bacteria. Building upon this work, we evaluated the utility of high throughput 16S rRNA gene sequencing for potential bacterial pathogen detection in barn swallows (Hirundo rustica) and used these data to explore relationships between potential pathogens and microbiota. To accomplish this, we first compared the detection of Salmonella spp. in swallows using 16S rRNA data with standard culture techniques. Second, we examined the prevalence of Salmonella using 16S rRNA data and examined the relationship between Salmonella-presence or -absence and individual host factors. Lastly, we evaluated host-associated bacterial diversity and community composition in Salmonella-present vs. -absent birds. Out of 108 samples, we detected Salmonella in six (5.6%) samples based on culture, 25 (23.1%) samples with unrarefied 16S rRNA gene sequencing data, and three (2.8%) samples with both techniques. We found that sex, migratory status, and weight were correlated with Salmonella presence in swallows. In addition, bacterial community composition and diversity differed between birds based on Salmonella status. 
This study highlights the value of 16S rRNA gene sequencing data for monitoring pathogens in wild birds and investigating the ecology of host microbe-pathogen relationships, data which are important for prediction and mitigation of disease spillover into domestic animals and humans.
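
The detection figures above can be checked and broken down with simple arithmetic. The sketch assumes that the three samples detected "with both techniques" are included in the culture and 16S totals. The derived counts (culture only, 16S only, either, neither) follow from that assumption and are not reported in the abstract.

```python
total = 108
culture = 6    # Salmonella detected by culture
seq_16s = 25   # detected in unrarefied 16S rRNA gene sequencing data
both = 3       # detected by both (assumed to be included in the two totals)

def pct(n, denom=total):
    """Percentage of samples, rounded to one decimal place."""
    return round(100 * n / denom, 1)

# Counts derived under the overlap assumption stated above.
culture_only = culture - both
seq_only = seq_16s - both
either = culture + seq_16s - both
neither = total - either

print(pct(culture), pct(seq_16s), pct(both))
print(culture_only, seq_only, either, neither)
```

The rounded percentages reproduce the 5.6%, 23.1%, and 2.8% quoted in the abstract.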


Author(s):  
Giuliano Netto Flores Cruz ◽  
Ana Paula Christoff ◽  
Luiz Felipe Valter de Oliveira

Abstract Next-generation sequencing (NGS) has been extensively employed to perform microbiome characterization worldwide. As a culture-independent methodology, it has allowed high-level profiling of sample microbial composition. However, most studies are limited to information regarding relative bacterial abundances, ignoring scenarios in which sample microbe biomass can vary widely. Here, we develop an equivolumetric protocol for amplicon library preparation capable of generating NGS data responsive to input DNA, recovering proportionality between observed read counts and absolute bacterial abundances. Under specified conditions, we argue that the estimation of colony-forming units (CFU), the most common unit of bacterial abundance in classical microbiology, is challenged mostly by resolution and taxon-to-taxon variation. We propose Bayesian cumulative probability models to address such issues. Our results indicate that predictive errors vary consistently below one order of magnitude for observed bacteria. We also demonstrate our approach has the potential to generalize to previously unseen bacteria, but predictive performance is hampered by specific taxa of uncommon profile. Finally, it remains clear that NGS data are not inherently restricted to relative information only, and microbiome science can indeed meet the working scales of traditional microbiology.


2020 ◽  
Author(s):  
Andrés Vásquez-Domínguez ◽  
Luis Jaramillo-Valverde ◽  
Kelly S. Levano ◽  
Pedro Novoa-Bellota ◽  
Marco Machaguay-Romero ◽  
...  

ABSTRACT Genetic and microbiome studies of the ancient Caral-Supe civilization have not yet been published. The objective of this work is therefore to identify the microorganisms and possible diseases that existed in this ancient civilization using coprolite samples. To do this, two coprolite samples were analyzed by high-throughput sequencing of the 16S rRNA gene and an intergenic region (ITS).


mSystems ◽  
2016 ◽  
Vol 1 (4) ◽  
Author(s):  
Maxime Galan ◽  
Maria Razzauti ◽  
Emilie Bard ◽  
Maria Bernard ◽  
Carine Brouat ◽  
...  

ABSTRACT Several recent public health crises have shown that the surveillance of zoonotic agents in wildlife is important to prevent pandemic risks. High-throughput sequencing (HTS) technologies are potentially useful for this surveillance, but rigorous experimental processes are required for the use of these effective tools in such epidemiological contexts. In particular, HTS introduces biases into the raw data set that might lead to incorrect interpretations. We describe here a procedure for cleaning data before estimating reliable biological parameters, such as positivity, prevalence, and coinfection, using 16S rRNA amplicon sequencing on an Illumina MiSeq platform. This procedure, applied to 711 rodents collected in West Africa, detected several zoonotic bacterial species, including some at high prevalence, despite their never before having been reported for West Africa. In the future, this approach could be adapted for the monitoring of other microbes such as protists, fungi, and even viruses. The human impact on natural habitats is increasing the complexity of human-wildlife interactions and leading to the emergence of infectious diseases worldwide. Highly successful synanthropic wildlife species, such as rodents, will undoubtedly play an increasingly important role in transmitting zoonotic diseases. We investigated the potential for recent developments in 16S rRNA amplicon sequencing to facilitate the multiplexing of the large numbers of samples needed to improve our understanding of the risk of zoonotic disease transmission posed by urban rodents in West Africa. In addition to listing pathogenic bacteria in wild populations, as in other high-throughput sequencing (HTS) studies, our approach can estimate essential parameters for studies of zoonotic risk, such as prevalence and patterns of coinfection within individual hosts. However, the estimation of these parameters requires cleaning of the raw data to mitigate the biases generated by HTS methods. 
We present here an extensive review of these biases and of their consequences, and we propose a comprehensive trimming strategy for managing these biases. We demonstrated the application of this strategy using 711 commensal rodents, including 208 Mus musculus domesticus, 189 Rattus rattus, 93 Mastomys natalensis, and 221 Mastomys erythroleucus, collected from 24 villages in Senegal. Seven major genera of pathogenic bacteria were detected in their spleens: Borrelia, Bartonella, Mycoplasma, Ehrlichia, Rickettsia, Streptobacillus, and Orientia. Mycoplasma, Ehrlichia, Rickettsia, Streptobacillus, and Orientia have never before been detected in West African rodents. Bacterial prevalence ranged from 0% to 90% of individuals per site, depending on the bacterial taxon, rodent species, and site considered, and 26% of rodents displayed coinfection. The 16S rRNA amplicon sequencing strategy presented here has the advantage over other molecular surveillance tools of dealing with a large spectrum of bacterial pathogens without requiring assumptions about their presence in the samples. This approach is therefore particularly suitable to continuous pathogen surveillance in the context of disease-monitoring programs.


2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Axel Poulet ◽  
Maud Privat ◽  
Flora Ponelle ◽  
Sandrine Viala ◽  
Stephanie Decousus ◽  
...  

Screening for BRCA mutations in women with familial risk of breast or ovarian cancer is an ideal situation for high-throughput sequencing, providing large amounts of low-cost data. However, 454 (Roche) and Ion Torrent (Thermo Fisher) technologies produce homopolymer-associated indel errors, complicating their use in routine diagnostics. We developed software, named AGSA, which helps to detect false positive mutations in homopolymeric sequences. Seventy-two familial breast cancer cases were analysed in parallel by amplicon 454 pyrosequencing and Sanger dideoxy sequencing for genetic variations of the BRCA genes. All 565 variants detected by dideoxy sequencing were also detected by pyrosequencing. Furthermore, pyrosequencing detected 42 variants that were missed by the Sanger technique. Six amplicons contained homopolymer tracts in the coding sequence that were systematically misread by the software supplied by Roche. Read data plotted as histograms by AGSA software aided the analysis considerably and allowed validation of the majority of homopolymers. As an optimisation, an additional 250 patients were analysed using microfluidic amplification of regions of interest (Access Array, Fluidigm) of the BRCA genes, followed by 454 sequencing and AGSA analysis. AGSA complements a complete line of high-throughput diagnostic sequence analysis, reducing time and costs while increasing reliability, notably for homopolymer tracts.

