Connections between freshwater carbon and nutrient cycles revealed through reconstructed population genomes

2018 ◽  
Author(s):  
Alexandra M. Linz ◽  
Shaomei He ◽  
Sarah L. R. Stevens ◽  
Karthik Anantharaman ◽  
Robin R. Rohwer ◽  
...  

Abstract Metabolic processes at the microbial scale influence ecosystem functions because microbes are responsible for much of the carbon and nutrient cycling in freshwater. One approach to predict the metabolic capabilities of microbial communities is to search for functional marker genes in metagenomes. However, this approach does not provide context about co-occurrence with other metabolic traits within an organism or detailed taxonomy about those organisms. Here, we combine a functional marker gene analysis with metabolic pathway prediction of microbial population genomes (MAGs) assembled from metagenomic time series in eutrophic Lake Mendota and humic Trout Bog to identify how carbon and nutrient cycles are connected in freshwater. We found that phototrophy, carbon fixation, and nitrogen fixation pathways co-occurred in Cyanobacteria MAGs in Lake Mendota and in Chlorobiales MAGs in Trout Bog. Cyanobacteria MAGs also had strong temporal correlations to functional marker genes for nitrogen fixation in several years. Genes encoding steps in the nitrogen and sulfur cycles varied in abundance and taxonomy by lake, potentially reflecting the availability and composition of inorganic nutrients in these systems. We were also able to identify which populations contained the greatest density and diversity of genes encoding glycoside hydrolases. Populations with many glycoside hydrolases also encoded pathways for sugar degradation. By using both MAGs and marker genes, we were better able to link functions to specific taxonomic groups in our metagenomic time series, enabling a more detailed understanding of freshwater microbial carbon and nutrient cycling.
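The co-occurrence idea in this abstract (several pathways predicted in the same population genome) can be illustrated with a small sketch. The MAG identifiers, pathway labels, and annotations below are invented for illustration only; they are not data or results from the study, and the function name `cooccurrence_by_taxon` is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy annotations: MAG id -> (taxon, set of predicted pathways).
# These values are invented for illustration and are not results from the study.
mags = {
    "mag_a": ("Cyanobacteria", {"phototrophy", "carbon_fixation", "nitrogen_fixation"}),
    "mag_b": ("Cyanobacteria", {"phototrophy", "carbon_fixation"}),
    "mag_c": ("Chlorobiales", {"phototrophy", "carbon_fixation", "nitrogen_fixation"}),
    "mag_d": ("Actinobacteria", {"sugar_degradation"}),
}


def cooccurrence_by_taxon(mags, required):
    """Count, per taxon, the MAGs that encode every pathway in `required`."""
    counts = defaultdict(int)
    for taxon, pathways in mags.values():
        if required <= pathways:  # subset test: all required pathways are present
            counts[taxon] += 1
    return dict(counts)


result = cooccurrence_by_taxon(
    mags, {"phototrophy", "carbon_fixation", "nitrogen_fixation"}
)
print(result)
```

A marker-gene search alone would report that each pathway is present somewhere in the community; the per-genome subset test is what ties the three traits to the same population.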

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e6075 ◽  
Author(s):  
Alexandra M. Linz ◽  
Shaomei He ◽  
Sarah L.R. Stevens ◽  
Karthik Anantharaman ◽  
Robin R. Rohwer ◽  
...  

Although microbes mediate much of the biogeochemical cycling in freshwater, the categories of carbon and nutrients currently used in models of freshwater biogeochemical cycling are too broad to be relevant on a microbial scale. One way to improve these models is to incorporate microbial data. Here, we analyze both genes and genomes from three metagenomic time series and propose specific roles for microbial taxa in freshwater biogeochemical cycles. Our metagenomic time series span multiple years and originate from a eutrophic lake (Lake Mendota) and a humic lake (Trout Bog Lake) with contrasting water chemistry. Our analysis highlights the role of polyamines in the nitrogen cycle, the diversity of diazotrophs between lake types, the balance of assimilatory vs. dissimilatory sulfate reduction in freshwater, the various associations between types of phototrophy and carbon fixation, and the density and diversity of glycoside hydrolases in freshwater microbes. We also investigated aspects of central metabolism such as hydrogen metabolism, oxidative phosphorylation, methylotrophy, and sugar degradation. Finally, by analyzing the dynamics over time in nitrogen fixation genes and Cyanobacteria genomes, we show that the potential for nitrogen fixation is linked to specific populations in Lake Mendota. This work represents an important step towards incorporating microbial data into ecosystem models and provides a better understanding of how microbes may participate in freshwater biogeochemical cycling.


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3812 ◽  
Author(s):  
Michael W. Hall ◽  
Robin R. Rohwer ◽  
Jonathan Perrie ◽  
Katherine D. McMahon ◽  
Robert G. Beiko

Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.
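As a rough illustration of clustering by temporal profile rather than by sequence identity, the sketch below groups sequences whose abundance time series are strongly correlated. It is not Ananke's implementation: Pearson correlation, single-linkage grouping, the `min_r` threshold, and the toy profiles are all assumptions chosen for brevity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def cluster_by_correlation(profiles, min_r=0.9):
    """Single-linkage grouping: join sequences whose profiles correlate at >= min_r."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n):
        # Union-find root lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= min_r:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical abundance counts over six time points (invented for illustration).
profiles = {
    "seq1": [1, 5, 1, 5, 1, 5],      # periodic
    "seq2": [2, 10, 2, 10, 2, 10],   # same rhythm, higher abundance
    "seq3": [5, 1, 5, 1, 5, 1],      # opposite phase
    "seq4": [0, 0, 0, 9, 9, 9],      # appears halfway through the series
}
clusters = cluster_by_correlation(profiles)
print(clusters)
```

Here `seq1` and `seq2` end up together because they rise and fall in step, regardless of how similar their sequences are, while the out-of-phase and late-appearing profiles stay separate.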


Author(s):  
Michael W Hall ◽  
Robin R Rohwer ◽  
Jonathan Perrie ◽  
Katherine D McMahon ◽  
Robert G Beiko

Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that clusters marker-gene data based on time-series profiles and provides interactive visualization of clusters. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.


2020 ◽  
Vol 21 (S18) ◽  
Author(s):  
Sudipta Acharya ◽  
Laizhong Cui ◽  
Yi Pan

Abstract Background In recent years, the use of multiple genomic and proteomic sources to investigate challenging bioinformatics problems has become immensely popular among researchers. One such problem is feature or gene selection: identifying relevant and non-redundant marker genes from high-dimensional gene expression data sets. In that context, designing an efficient feature selection algorithm that exploits knowledge from multiple biological resources may be an effective way to understand the spectrum of cancer or other diseases, with applications in specific epidemiology for a particular population. Results In the current article, we formulate feature selection and marker gene detection as a multi-view multi-objective clustering problem. To that end, we propose an Unsupervised Multi-View Multi-Objective clustering-based gene selection approach called UMVMO-select. Three important resources of biological data (gene ontology, protein interaction data, protein sequence) along with gene expression values are collectively utilized to design two different views. UMVMO-select aims to reduce the gene space with minimal or no loss of sample classification efficiency and determines relevant and non-redundant gene markers from three cancer gene expression benchmark data sets. Conclusion A thorough comparative analysis has been performed against five clustering and nine existing feature selection methods with respect to several internal and external validity metrics. The results obtained show that the proposed method outperforms these methods. Reported results are also validated through a proper biological significance test and heatmap plotting.


Author(s):  
Guohong Zeng ◽  
Jin Li ◽  
Yuxiu Ma ◽  
Qian Pu ◽  
Tian Xiao ◽  
...  

Abstract Saponins are antifungal compounds produced by Panax notoginseng to resist invasion by pathogens. Ilyonectria mors-panacis G3B was the dominant pathogen inducing root rot of P. notoginseng, and its ability to detoxify saponins was key to successful infection. To investigate the molecular mechanisms of saponin detoxification in I. mors-panacis G3B, we used high-throughput RNA-Seq and identified 557 and 1519 differentially expressed genes (DEGs) in I. mors-panacis G3B after saponin treatment for 4 h and 12 h, respectively, compared with untreated controls. Among these DEGs, 93 genes were simultaneously highly expressed after both 4 h and 12 h of saponin treatment; they mainly encode transporters, glycoside hydrolases, oxidation–reduction enzymes, and transcription factors, among others. In addition, 21 of these 93 up-regulated genes were putative PHI (Pathogen–Host Interaction) genes. In this report, we analyzed virulence-associated genes in I. mors-panacis G3B that may be involved in detoxifying saponins during infection of P. notoginseng. These genes provide a starting point for in-depth study of the pathogenicity of I. mors-panacis G3B and for developing appropriate root rot disease management strategies.


Author(s):  
Bennett J Kapili ◽  
Anne E Dekas

Abstract Motivation Linking microbial community members to their ecological functions is a central goal of environmental microbiology. When assigned taxonomy, amplicon sequences of metabolic marker genes can suggest such links, thereby offering an overview of the phylogenetic structure underpinning particular ecosystem functions. However, inferring microbial taxonomy from metabolic marker gene sequences remains a challenge, particularly for the frequently sequenced nitrogen fixation marker gene, nitrogenase reductase (nifH). Horizontal gene transfer in recent nifH evolutionary history can confound taxonomic inferences drawn from the pairwise identity methods used in existing software. Other methods for inferring taxonomy are not standardized and require manual inspection that is difficult to scale. Results We present Phylogenetic Placement for Inferring Taxonomy (PPIT), an R package that infers microbial taxonomy from nifH amplicons using both phylogenetic and sequence identity approaches. After users place query sequences on a reference nifH gene tree provided by PPIT (n = 6317 full-length nifH sequences), PPIT searches the phylogenetic neighborhood of each query sequence and attempts to infer microbial taxonomy. An inference is drawn only if references in the phylogenetic neighborhood are: (1) taxonomically consistent and (2) share sufficient pairwise identity with the query, thereby avoiding erroneous inferences due to known horizontal gene transfer events. We find that PPIT returns a higher proportion of correct taxonomic inferences than BLAST-based approaches at the cost of fewer total inferences. We demonstrate PPIT on deep-sea sediment and find that Deltaproteobacteria are the most abundant potential diazotrophs. Using this dataset we show that emending PPIT inferences based on visual inspection of query sequence placement can achieve taxonomic inferences for nearly all sequences in a query set. We additionally discuss how users can apply PPIT to the analysis of other marker genes. Availability PPIT is freely available to non-commercial users at https://github.com/bkapili/ppit. Installation includes a vignette that demonstrates package use and reproduces the nifH amplicon analysis discussed here. The raw nifH amplicon sequence data have been deposited in the GenBank, EMBL, and DDBJ databases under BioProject number PRJEB37167. Supplementary information Supplementary data are available at Bioinformatics online.
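The two-condition decision rule described in this abstract can be sketched as a small function. This is not the PPIT API (PPIT is an R package); the function name `infer_taxon`, the `min_identity` threshold of 0.9, and the reading that every neighboring reference must meet the identity threshold are assumptions made for illustration.

```python
def infer_taxon(neighbors, min_identity=0.9):
    """Return a taxon for a query, or None when no inference should be drawn.

    `neighbors` is a list of (taxon, pairwise_identity) pairs for the reference
    sequences in the query's phylogenetic neighborhood. The threshold and the
    "all neighbors must pass" rule are illustrative assumptions.
    """
    if not neighbors:
        return None
    taxa = {taxon for taxon, _ in neighbors}
    # Condition 1: the neighborhood must be taxonomically consistent.
    if len(taxa) != 1:
        return None
    # Condition 2: references must share sufficient identity with the query.
    if any(identity < min_identity for _, identity in neighbors):
        return None
    return taxa.pop()


# Hypothetical neighborhoods (taxa and identities invented for illustration).
consistent = [("Deltaproteobacteria", 0.95), ("Deltaproteobacteria", 0.93)]
mixed = [("Deltaproteobacteria", 0.95), ("Firmicutes", 0.96)]
distant = [("Deltaproteobacteria", 0.95), ("Deltaproteobacteria", 0.71)]

print(infer_taxon(consistent))
print(infer_taxon(mixed))
print(infer_taxon(distant))
```

Returning `None` rather than a best guess mirrors the trade-off the abstract reports: fewer total inferences in exchange for a higher proportion of correct ones.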


1985 ◽  
Vol 5 (9) ◽  
pp. 2265-2271
Author(s):  
S Chakrabarti ◽  
S Joffe ◽  
M M Seidman

Shuttle vector plasmids were constructed with directly repeated sequences flanking a marker gene. African green monkey kidney (AGMK) cells were infected with the constructions, and after a period of replication, the progeny plasmids were recovered and introduced into bacteria. Those colonies with plasmids that had lost the marker gene were identified, and the individual plasmids were purified and characterized by restriction enzyme digestion. Recombination between the repeated elements generated a plasmid with a precise deletion and a characteristic restriction pattern, which distinguished the recombined molecules from those with other defects in the marker gene. Recombination among the following different sequences was measured in this assay: (i) the simian virus 40 origin and enhancer region, (ii) the AGMK Alu sequence, and (iii) a sequence from plasmid pBR322. Similar frequencies of recombination among these sequences were found. Recombination occurred more frequently in Cos1 cells than in CV1 cells. In these experiments, the plasmid population with defective marker genes consisted of the recombined molecules and of the spontaneous deletion-insertion mutants described earlier. The frequency of the latter class was unaffected by the presence of the option for recombination represented by the direct repeats. Both recombination and deletion-insertion mutagenesis were stimulated by double-strand cleavage between the repeated sequences and adjacent to the marker, and the frequency of the deletion-insertion mutants in this experiment was again independent of the presence of the direct repeats. We concluded that although recombination and deletion-insertion mutagenesis were both stimulated by double-strand cleavage, the molecules which underwent the two types of change were drawn from separate pools.


2021 ◽  
Author(s):  
Song-Lin Ding ◽  
Joshua J. Royall ◽  
Phil Lesnar ◽  
Benjamin A.C. Facer ◽  
Kimberly A. Smith ◽  
...  

Increasing interest in studies of prenatal human brain development, particularly using new single-cell genomics and anatomical technologies to create cell atlases, creates a strong need for accurate and detailed anatomical reference atlases. In this study, we present two cellular-resolution digital anatomical atlases for prenatal human brain at post-conceptional weeks (PCW) 15 and 21. Both atlases were annotated on sequential Nissl-stained sections covering brain-wide structures on the basis of combined analysis of cytoarchitecture, acetylcholinesterase staining and an extensive marker gene expression dataset. This high information content dataset allowed reliable and accurate demarcation of developing cortical and subcortical structures and their subdivisions. Furthermore, using the anatomical atlases as a guide, spatial expression of 37 and 5 genes from the brains respectively at PCW 15 and 21 was annotated, illustrating reliable marker genes for many developing brain structures. Finally, the present study uncovered several novel developmental features, such as the lack of an outer subventricular zone in the hippocampal formation and entorhinal cortex, and the apparent extension of both cortical (excitatory) and subcortical (inhibitory) progenitors into the prenatal olfactory bulb. These comprehensive atlases provide useful tools for visualization, targeting, imaging and interpretation of brain structures of prenatal human brain, and for guiding and interpreting the next generation of cell census and connectome studies.


2020 ◽  
Author(s):  
Nikola Palevich ◽  
Paul H. Maclean ◽  
William J. Kelly ◽  
Sinead C. Leahy ◽  
Jasna Rakonjac ◽  
...  

Abstract Ruminants are essential for maintaining the global population and managing greenhouse gas emissions. In the rumen, bacterial species belonging to the genera Butyrivibrio and Pseudobutyrivibrio constitute the core bacterial rumen microbiome and are important degraders of plant-derived complex polysaccharides. Pseudobutyrivibrio xylanivorans MA3014 was selected for genome sequencing in order to examine its ability to break down and utilize plant polysaccharides. The complete genome sequence of MA3014 is 3.58 Mb, consists of three replicons (a chromosome, chromid and plasmid), has an overall G+C content of 39.6% and encodes 3,265 putative protein-coding genes (PCGs). Comparative pan-genomics of all cultivated and currently available P. xylanivorans genomes has revealed highly open genomes and a strong correlation of orthologous genes within this species of rumen bacteria. MA3014 is metabolically versatile and capable of utilizing a range of substrates for growth, from simple mono- or oligosaccharides to complex plant polysaccharides such as pectins, mannans, starch and hemicelluloses, with lactate, butyrate and formate as the principal fermentation end-products. The genes encoding these metabolic pathways have been identified, and MA3014 is predicted to encode an extensive repertoire of Carbohydrate-Active enZYmes (CAZymes) with 80 Glycoside Hydrolases (GHs), 28 Carbohydrate Esterases (CEs) and 51 Glycosyl Transferases (GTs), which suggests its role as an initiator of primary solubilization of plant matter in the rumen.

