Machine Learning-assisted Identification of Bioindicators Predicts Medium-chain Carboxylate Production Performance of an Anaerobic Mixed Culture

Abstract Background: The ability to quantitatively predict ecophysiological functions of microbial communities provides an important step to engineer microbiota for desired functions related to specific biochemical conversions. Here, we present the quantitative prediction of medium-chain carboxylate production in two continuous anaerobic bioreactors from 16S rRNA gene dynamics in enrichment cultures. Results: By progressively shortening the hydraulic retention time from 8 days to 2 days with different temporal schemes in both bioreactors operated for 211 days, we achieved higher productivities and yields of the target products n-caproate and n-caprylate. The datasets generated from each bioreactor were used independently for training and testing in machine learning. A predictive model was generated by employing the random forest algorithm using 16S rRNA amplicon sequencing data. More than 90% accuracy in the prediction of n-caproate and n-caprylate productivities was achieved. Four inferred bioindicators, belonging to the genera Olsenella, Lactobacillus, Syntrophococcus and Clostridium IV, appear relevant to the higher carboxylate productivity at shorter hydraulic retention time. The recovery of metagenome-assembled genomes of these bioindicators confirmed their genetic potential to perform key steps of medium-chain carboxylate production. Conclusions: Shortening the hydraulic retention time of the continuous bioreactor systems makes it possible to shape the communities with desired chain elongation functions. Using machine learning, we demonstrated that 16S rRNA amplicon sequencing data can be used to predict bioreactor process performance quantitatively and accurately. Characterising and harnessing bioindicators holds promise to manage reactor microbiota towards selection of the target processes. Our mathematical framework is transferable to other ecosystem processes and microbial systems where community dynamics is linked to key functions. The general methodology can be adapted to data types of other functional categories such as genes, transcripts, proteins or metabolites.
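The link between hydraulic retention time (HRT) and productivity can be illustrated with a minimal sketch. It assumes a continuously fed, well-mixed reactor at steady state, where HRT is the working volume divided by the feed flow rate and volumetric productivity is the effluent product concentration divided by HRT. The function names, the 1 L working volume and the 2 g/L concentration are hypothetical and are not taken from the study.

```python
def flow_rate(working_volume_l, hrt_days):
    """Feed flow rate (L/day) needed to hold a given HRT in a continuous reactor."""
    if hrt_days <= 0:
        raise ValueError("HRT must be positive")
    return working_volume_l / hrt_days


def volumetric_productivity(concentration_g_per_l, hrt_days):
    """Steady-state productivity in g product per litre of reactor per day."""
    if hrt_days <= 0:
        raise ValueError("HRT must be positive")
    return concentration_g_per_l / hrt_days


if __name__ == "__main__":
    # Hypothetical reactor: 1 L working volume, 2 g/L product in the effluent.
    for hrt in (8, 4, 2):
        print(
            f"HRT {hrt} d: flow {flow_rate(1.0, hrt):.3f} L/d, "
            f"productivity {volumetric_productivity(2.0, hrt):.2f} g/(L d)"
        )
```

Under these assumptions, cutting the HRT from 8 to 2 days quadruples the flow rate, so productivity rises as long as the community keeps the effluent concentration from dropping proportionally.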


Machine Learning-Assisted Identification of Bioindicators Predicts Medium-Chain Carboxylate Production Performance of an Anaerobic Mixed Culture

10.21203/rs.3.rs-78714/v2 ◽

2021 ◽

Author(s):

Bin Liu ◽

Heike Sträuber ◽

Joao Pedro Saraiva ◽

Hauke Harms ◽

Sandra Godinho Silva ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

16S rRNA ◽

Retention Time ◽

Hydraulic Retention Time ◽

16S rRNA Genes ◽

rRNA Genes ◽

Support Vector ◽

Random Forest Regression ◽

Medium Chain

Abstract Background: The ability to quantitatively predict ecophysiological functions of microbial communities provides an important step to engineer microbiota for desired functions related to specific biochemical conversions. Here, we present the quantitative prediction of medium-chain carboxylate production in two continuous anaerobic bioreactors from 16S rRNA gene dynamics in enriched communities. Results: By progressively shortening the hydraulic retention time (HRT) from 8 days to 2 days with different temporal schemes in two bioreactors operated for 211 days, we achieved higher productivities and yields of the target products n-caproate and n-caprylate. The datasets generated from each bioreactor were used independently for training and testing machine learning algorithms that predict n-caproate and n-caprylate productivities from 16S rRNA gene data. Our dataset consisted of 14 and 40 samples from HRT of 8 and 2 days, respectively. Because the dataset was small and imbalanced, we compared linear regression, support vector machine and random forest regression algorithms on both the original dataset and a balanced dataset generated by synthetic minority oversampling. Further, we performed cross-validation to estimate model stability. Random forest regression performed best, producing the most consistent results with median error rates below 8%. More than 90% accuracy in the prediction of n-caproate and n-caprylate productivities was achieved. Four inferred bioindicators, belonging to the genera Olsenella, Lactobacillus, Syntrophococcus and Clostridium IV, appear relevant to the higher carboxylate productivity at shorter HRT. The recovery of metagenome-assembled genomes of these bioindicators confirmed their genetic potential to perform key steps of medium-chain carboxylate production. Conclusions: Shortening the hydraulic retention time of the continuous bioreactor systems makes it possible to shape the communities with desired chain elongation functions. Using machine learning, we demonstrated that 16S rRNA amplicon sequencing data can be used to predict bioreactor process performance quantitatively and accurately. Characterising and harnessing bioindicators holds promise to manage reactor microbiota towards selection of the target processes. Our mathematical framework is transferable to other ecosystem processes and microbial systems where community dynamics is linked to key functions. The general methodology used here can be adapted to data types of other functional categories such as genes, transcripts, proteins or metabolites.
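The balancing and error-reporting steps named in this abstract can be sketched in plain Python. This is a simplified illustration of the idea behind synthetic minority oversampling (interpolating between a minority sample and one of its nearest minority neighbours) and of a median relative error, not the authors' implementation. The function names, the random feature vectors and the choice of k are assumptions made for the example; only the 14 and 40 sample counts come from the abstract.

```python
import random
from statistics import median


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position on the segment between base and neighbour
        synthetic.append([a + t * (b - a) for a, b in zip(base, other)])
    return synthetic


def median_relative_error(y_true, y_pred):
    """Median of |prediction - observation| / |observation|."""
    return median(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred))


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical feature vectors standing in for the 14 samples at HRT 8 d.
    hrt8_samples = [[rng.random() for _ in range(4)] for _ in range(14)]
    # Add 26 synthetic samples so the minority group matches the 40 at HRT 2 d.
    balanced = hrt8_samples + smote_like(hrt8_samples, 40 - len(hrt8_samples))
    print(len(balanced))
    print(median_relative_error([10.0, 20.0, 40.0], [11.0, 19.0, 40.0]))
```

In practice the regression models and oversampling would more likely come from established libraries; the sketch only shows why interpolated samples stay inside the range spanned by the observed minority samples.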


Connecting structure to function with the recovery of over 1000 high-quality metagenome-assembled genomes from activated sludge using long-read sequencing

Nature Communications ◽

10.1038/s41467-021-22203-2 ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 2

Author(s):

Caitlin M. Singleton ◽

Francesca Petriglieri ◽

Jannie M. Kristensen ◽

Rasmus H. Kirkegaard ◽

Thomas Y. Michaelsen ◽

...

Keyword(s):

16S rRNA ◽

Wastewater Treatment Plants ◽

In Situ Hybridisation ◽

Amplicon Sequencing ◽

rRNA Genes ◽

Fluorescence In Situ Hybridisation ◽

Sequencing Data ◽

High Quality ◽

16S rRNA Amplicon Sequencing ◽

Long Read

Abstract Microorganisms play crucial roles in water recycling, pollution removal and resource recovery in the wastewater industry. The structure of these microbial communities is increasingly understood based on 16S rRNA amplicon sequencing data. However, such data cannot be linked to functional potential in the absence of high-quality metagenome-assembled genomes (MAGs) for nearly all species. Here, we use long-read and short-read sequencing to recover 1083 high-quality MAGs, including 57 closed circular genomes, from 23 Danish full-scale wastewater treatment plants. The MAGs account for ~30% of the community based on relative abundance, and meet the stringent MIMAG high-quality draft requirements including full-length rRNA genes. We use the information provided by these MAGs in combination with >13 years of 16S rRNA amplicon sequencing data, as well as Raman microspectroscopy and fluorescence in situ hybridisation, to uncover abundant undescribed lineages belonging to important functional groups.
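The MIMAG high-quality draft requirements mentioned here can be expressed as a simple filter. The sketch below assumes the commonly cited thresholds of the MIMAG standard (completeness above 90%, contamination below 5%, 5S, 16S and 23S rRNA genes present, and at least 18 tRNAs). The `Mag` record and the example bins are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical summary record for one metagenome-assembled genome."""
    name: str
    completeness: float   # percent
    contamination: float  # percent
    has_5s: bool
    has_16s: bool
    has_23s: bool
    trna_count: int


def is_high_quality(mag):
    """Check a MAG against the MIMAG high-quality draft thresholds."""
    return (
        mag.completeness > 90.0
        and mag.contamination < 5.0
        and mag.has_5s
        and mag.has_16s
        and mag.has_23s
        and mag.trna_count >= 18
    )


if __name__ == "__main__":
    bins = [
        Mag("bin_001", 97.2, 1.1, True, True, True, 20),
        Mag("bin_002", 95.0, 0.8, True, False, True, 19),  # 16S gene missing
        Mag("bin_003", 88.5, 2.0, True, True, True, 21),   # too incomplete
    ]
    print([m.name for m in bins if is_high_quality(m)])
```

The rRNA requirement is the one short-read assemblies most often fail, which is consistent with the abstract's emphasis on long reads and full-length rRNA genes.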


Dramatic differences in gut bacterial densities help to explain the relationship between diet and habitat in rainforest ants

10.1101/114512 ◽

2017 ◽

Cited By ~ 4

Author(s):

Jon G Sanders ◽

Piotr Lukasik ◽

Megan E Frederickson ◽

Jacob A Russell ◽

Ryuichi Koga ◽

...

Keyword(s):

16S rRNA ◽

Microbial Diversity ◽

Tropical Rainforest ◽

Amplicon Sequencing ◽

Sequencing Data ◽

Lowland Tropical Forest ◽

16S rRNA Amplicon Sequencing ◽

Microbial Symbionts ◽

Microbial Symbiosis ◽

Diversity Profiles

Abstract Abundance is a key parameter in microbial ecology, and important to estimates of potential metabolite flux, impacts of dispersal, and sensitivity of samples to technical biases such as laboratory contamination. However, modern amplicon-based sequencing techniques by themselves typically provide no information about the absolute abundance of microbes. Here, we use fluorescence microscopy and quantitative PCR as independent estimates of microbial abundance to test the hypothesis that microbial symbionts have enabled ants to dominate tropical rainforest canopies by facilitating herbivorous diets, and compare these methods to microbial diversity profiles from 16S rRNA amplicon sequencing. Through a systematic survey of ants from a lowland tropical forest, we show that the density of gut microbiota varies across several orders of magnitude among ant lineages, with median individuals from many genera only marginally above detection limits. Supporting the hypothesis that microbial symbiosis is important to dominance in the canopy, we find that the abundance of gut bacteria is positively correlated with stable isotope proxies of herbivory among canopy-dwelling ants, but not among ground-dwelling ants. Notably, these broad findings are much more evident in the quantitative data than in the 16S rRNA sequencing data. Our results help to resolve a longstanding question in tropical rainforest ecology, and have broad implications for the interpretation of sequence-based surveys of microbial diversity.
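Why amplicon data alone hide absolute abundance can be shown with a short sketch: read counts only give proportions, so an independent total (for example from qPCR) is needed to scale them. The function name, the taxon labels and all numbers below are hypothetical and chosen only for illustration.

```python
def absolute_abundance(read_counts, total_cells):
    """Scale amplicon read proportions by an independently measured total.

    read_counts: mapping of taxon name to 16S read count in one sample.
    total_cells: total abundance for that sample from another method
                 (e.g. qPCR copies), in whatever unit that method reports.
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("sample has no reads")
    return {
        taxon: total_cells * count / total_reads
        for taxon, count in read_counts.items()
    }


if __name__ == "__main__":
    # Two hypothetical ants with identical relative profiles ...
    profile = {"taxon_a": 800, "taxon_b": 200}
    # ... but total bacterial loads that differ by four orders of magnitude.
    print(absolute_abundance(profile, 1e3))
    print(absolute_abundance(profile, 1e7))
```

The two samples would look the same in a relative-abundance plot even though one carries 10,000 times more bacteria, which is the kind of difference the abstract reports as more evident in the quantitative data.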


A Bioinformatics Analysis workflow for 16S rRNA Amplicon Sequencing data v1 (protocols.io.bntpmemn)

protocols.io ◽

10.17504/protocols.io.bntpmemn ◽

2020 ◽

Cited By ~ 1

Author(s):

Lilan Hao

Keyword(s):

16S rRNA ◽

Bioinformatics Analysis ◽

Amplicon Sequencing ◽

Sequencing Data ◽

16S rRNA Amplicon Sequencing ◽

Analysis Workflow


Machine learning-assisted identification of bioindicators predicts medium-chain carboxylate production performance of an anaerobic mixed culture

International Chain Elongation Conference 2020 ◽

10.18174/icec2020.18013 ◽

2020 ◽

Author(s):

Bin Liu

Keyword(s):

Machine Learning ◽

Mixed Culture ◽

Production Performance ◽

Chain Elongation ◽

Medium Chain

Contribution to the International Chain Elongation Conference 2020 | ICEC 2020.


Hydrogen as a Co-electron Donor for Chain Elongation With Complex Communities

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2021.650631 ◽

2021 ◽

Vol 9 ◽

Author(s):

Flávio C. F. Baleeiro ◽

Sabine Kleinsteuber ◽

Heike Sträuber

Keyword(s):

Microbial Communities ◽

Electron Donor ◽

Amplicon Sequencing ◽

Sensu Stricto ◽

Lessons Learned ◽

Electron Donors ◽

Chain Elongation ◽

Medium Chain ◽

16S rRNA Amplicon Sequencing ◽

Hydrogenotrophic Methanogenesis

Electron donor scarcity is seen as one of the major issues limiting economic production of medium-chain carboxylates from waste streams. Previous studies suggest that co-fermentation of hydrogen in microbial communities that realize chain elongation relieves this limitation. To better understand how hydrogen co-feeding can support chain elongation, we enriched three different microbial communities from anaerobic reactors (A, B, and C with ascending levels of diversity) for their ability to produce medium-chain carboxylates from conventional electron donors (lactate or ethanol) or from hydrogen. In the presence of abundant acetate and CO₂, the effects of different abiotic parameters (pH values in acidic to neutral range, initial acetate concentration, and presence of chemical methanogenesis inhibitors) were tested along with the enrichment. The presence of hydrogen facilitated production of butyrate by all communities and improved production of i-butyrate and caproate by the two most diverse communities (B and C), accompanied by consumption of acetate, hydrogen, and lactate/ethanol (when available). Under optimal conditions, hydrogen increased the selectivity of conventional electron donors to caproate from 0.23 ± 0.01 mol e⁻/mol e⁻ to 0.67 ± 0.15 mol e⁻/mol e⁻ with a peak caproate concentration of 4.0 g L⁻¹. As a trade-off, the best-performing communities also showed hydrogenotrophic methanogenesis activity by Methanobacterium even at high concentrations of undissociated acetic acid of 2.9 g L⁻¹ and at low pH of 4.8. According to 16S rRNA amplicon sequencing, the suspected caproate producers were assigned to the family Anaerovoracaceae (Peptostreptococcales) and the genera Megasphaera (99.8% similarity to M. elsdenii), Caproiciproducens, and Clostridium sensu stricto 12 (97–100% similarity to C. luticellarii). Non-methanogenic hydrogen consumption correlated to the abundance of Clostridium sensu stricto 12 taxa (p < 0.01). If a robust methanogenesis inhibition strategy can be found, hydrogen co-feeding along with conventional electron donors can greatly improve selectivity to caproate in complex communities. The lessons learned can help design continuous hydrogen-aided chain elongation bioprocesses.
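The unit mol e⁻/mol e⁻ used for selectivity can be made concrete with a small electron-balance sketch. It assumes the usual degree-of-reduction bookkeeping for a neutral compound CₓHᵧO_z, in which full oxidation to CO₂ and H₂O releases 4x + y − 2z electrons per mole, and it assumes selectivity is the electrons recovered in the product divided by the electrons in the consumed donors. The study's exact definition may differ, and the function names and example amounts are hypothetical.

```python
def electrons_per_mol(c, h, o):
    """Electron equivalents per mole of a neutral C/H/O compound,
    counted for full oxidation to CO2 and H2O."""
    return 4 * c + h - 2 * o


# Electron equivalents (mol e- per mol), written for the undissociated forms.
ELECTRON_EQUIVALENTS = {
    "caproate": electrons_per_mol(6, 12, 2),  # C6H12O2
    "lactate": electrons_per_mol(3, 6, 3),    # C3H6O3
    "ethanol": electrons_per_mol(2, 6, 1),    # C2H6O
    "hydrogen": electrons_per_mol(0, 2, 0),   # H2
}


def electron_selectivity(product, product_mol, donors_consumed_mol):
    """Electrons recovered in the product per electron in the consumed donors."""
    donated = sum(
        ELECTRON_EQUIVALENTS[donor] * mol
        for donor, mol in donors_consumed_mol.items()
    )
    if donated <= 0:
        raise ValueError("no electron donor consumed")
    return ELECTRON_EQUIVALENTS[product] * product_mol / donated


if __name__ == "__main__":
    # Hypothetical batch: 0.5 mol caproate formed while 4 mol ethanol is consumed.
    print(round(electron_selectivity("caproate", 0.5, {"ethanol": 4.0}), 3))
```

Counting only lactate or ethanol in the denominator, as the phrase "selectivity of conventional electron donors" suggests, means co-fed hydrogen can raise the ratio by supplying electrons that are not charged against the conventional donors.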


Improving medium chain fatty acid productivity using chain elongation by reducing the hydraulic retention time in an upflow anaerobic filter

Bioresource Technology ◽

10.1016/j.biortech.2013.02.114 ◽

2013 ◽

Vol 136 ◽

pp. 735-738 ◽

Cited By ~ 74

Author(s):

T.I.M. Grootscholten ◽

K.J.J. Steinbusch ◽

H.V.M. Hamelers ◽

C.J.N. Buisman

Keyword(s):

Fatty Acid ◽

Retention Time ◽

Hydraulic Retention Time ◽

Chain Fatty Acid ◽

Medium Chain Fatty Acid ◽

Chain Elongation ◽

Medium Chain ◽

Anaerobic Filter ◽

Acid Productivity ◽

Fatty Acid Productivity


NanoRTax, a real-time pipeline for taxonomic and diversity analysis of nanopore 16S rRNA amplicon sequencing data

10.21203/rs.3.rs-938802/v1 ◽

2021 ◽

Author(s):

Héctor Rodriguez-Perez ◽

Laura Ciuffreda ◽

Carlos Flores

Keyword(s):

16S rRNA ◽

Real Time ◽

Amplicon Sequencing ◽

Sequencing Data ◽

Real Time Analysis ◽

16S rRNA Amplicon Sequencing ◽

Oxford Nanopore ◽

Long Read ◽

Cost Efficient ◽

User Friendly

Abstract The study of microbial communities and their applications has benefited from advances in sequencing techniques and bioinformatics tools. Long-read nanopore sequencing from Oxford Nanopore Technologies provides a portable and cost-efficient platform for sequencing assays, opening the possibility of applying it outside specialized environments and of analysing data in real time. To complement the existing efficient library preparation protocol with a streamlined analytic workflow, here we present NanoRTax, a Nextflow pipeline for nanopore 16S rRNA amplicon data that features state-of-the-art taxonomic classification tools and real-time capability. The pipeline is paired with a web-based visual interface to enable user-friendly inspection of the experiment in progress.


Bacterial Diversity of Breast Milk in Healthy Spanish Women: Evolution from Birth to Five Years Postpartum

Nutrients ◽

10.3390/nu13072414 ◽

2021 ◽

Vol 13 (7) ◽

pp. 2414

Author(s):

Laura Sanjulián ◽

Alexandre Lamas ◽

Rocío Barreiro ◽

Alberto Cepeda ◽

Cristina A. Fente ◽

...

Keyword(s):

Breast Milk ◽

16S rRNA ◽

Human Milk ◽

Alpha Diversity ◽

Amplicon Sequencing ◽

Maternal Body Mass Index ◽

16S rRNA Amplicon Sequencing ◽

Spanish Women ◽

Calcium Magnesium ◽

Abundant Genus

The objective of this work was to characterize the microbiota of breast milk in healthy Spanish mothers and to investigate the effects of lactation time on its diversity. A total of ninety-nine human milk samples were collected from healthy Spanish women and were assessed by means of next-generation sequencing of 16S rRNA amplicons and by qPCR. Firmicutes was the most abundant phylum, followed by Bacteroidetes, Actinobacteria, and Proteobacteria. Accordingly, Streptococcus was the most abundant genus. Lactation time had a strong influence on milk microbiota, correlating positively with Actinobacteria and Bacteroidetes, while Firmicutes remained relatively constant over lactation. 16S rRNA amplicon sequencing showed that the highest alpha-diversity was found in samples of prolonged lactation, along with wider differences between individuals. As for milk nutrients, calcium, magnesium, and selenium levels were potentially associated with Streptococcus and Staphylococcus abundance. Additionally, Proteobacteria was positively correlated with docosahexaenoic acid (DHA) levels in breast milk, and Staphylococcus with conjugated linoleic acid. Conversely, Streptococcus and trans-palmitoleic acid showed a negative association. Other factors such as maternal body mass index or diet also showed an influence on the structure of these microbial communities. Overall, human milk in Spanish mothers appeared to be a complex niche shaped by host factors and by its own nutrients, increasing in diversity over time.
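Alpha-diversity, as referred to above, is a within-sample measure. The abstract does not say which index was used, so the sketch below takes the Shannon index as an assumed example: H = −Σ pᵢ ln pᵢ over the taxon proportions pᵢ in one sample. The function name and the read counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) for one sample's taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum(
        (n / total) * math.log(n / total)
        for n in counts
        if n > 0  # taxa with zero reads contribute nothing
    )


if __name__ == "__main__":
    # Hypothetical genus-level read counts for two milk samples.
    early = [900, 50, 30, 20]       # dominated by a single genus
    late = [300, 250, 250, 200]     # more even community
    print(round(shannon_index(early), 3))
    print(round(shannon_index(late), 3))
```

A sample dominated by one genus scores lower than an even one with the same number of genera, which is the sense in which diversity can rise over lactation without new taxa appearing.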


Advantage of 16S rRNA amplicon sequencing in Helicobacter pylori diagnosis

Helicobacter ◽

10.1111/hel.12790 ◽

2021 ◽

Author(s):

Boldbaatar Gantuya ◽

Hashem B. El Serag ◽

Batsaikhan Saruuljavkhlan ◽

Dashdorj Azzaya ◽

Takashi Matsumoto ◽

...

Keyword(s):

Helicobacter Pylori ◽

16S rRNA ◽

Amplicon Sequencing ◽

16S rRNA Amplicon Sequencing
