Deep Sequencing of a Dimethylsulfoniopropionate-Degrading Gene (dmdA) by Using PCR Primer Pairs Designed on the Basis of Marine Metagenomic Data

2009 ◽  
Vol 76 (2) ◽  
pp. 609-617 ◽  
Author(s):  
Vanessa A. Varaljay ◽  
Erinn C. Howard ◽  
Shulei Sun ◽  
Mary Ann Moran

ABSTRACT In silico design and testing of environmental primer pairs with metagenomic data are beneficial for capturing a greater proportion of the natural sequence heterogeneity in microbial functional genes, as well as for understanding limitations of existing primer sets that were designed from more restricted sequence data. PCR primer pairs targeting 10 environmental clades and subclades of the dimethylsulfoniopropionate (DMSP) demethylase protein, DmdA, were designed using an iterative bioinformatic approach that took advantage of thousands of dmdA sequences captured in marine metagenomic data sets. Using the bioinformatically optimized primers, dmdA genes were amplified from composite free-living coastal bacterioplankton DNA (from 38 samples over 5 years and two locations) and sequenced using 454 technology. An average of 6,400 amplicons per primer pair represented more than 700 clusters of environmental dmdA sequences across all primers, with clusters defined conservatively at >90% nucleotide sequence identity (∼95% amino acid identity). Degenerate and inosine-based primers did not perform better than specific primer pairs in determining dmdA richness and sometimes captured a lower degree of richness of sequences from the same DNA sample. A comparison of dmdA sequences in free-living versus particle-associated bacteria in southeastern U.S. coastal waters showed that sequence richness in some dmdA subgroups differed significantly between size fractions, though most gene clusters were shared (52 to 91%) and most sequences were affiliated with the shared clusters (∼90%). The availability of metagenomic sequence data has significantly enhanced the design of quantitative PCR primer pairs for this key functional gene, providing robust access to the capabilities and activities of DMSP demethylating bacteria in situ.
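The clustering criterion mentioned in this abstract (clusters defined at >90% nucleotide sequence identity) can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm was used; the greedy, first-representative approach below, the function names identity and greedy_cluster, and the requirement that sequences are pre-aligned to equal length are all assumptions made for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold=0.90):
    """Greedy clustering (hypothetical stand-in for the authors' method).

    Each sequence joins the first cluster whose representative (the cluster's
    first member) it matches at more than `threshold` identity; otherwise it
    starts a new cluster.
    """
    clusters = []
    for s in seqs:
        for c in clusters:
            if identity(s, c[0]) > threshold:
                c.append(s)
                break
        else:
            # No existing representative was similar enough.
            clusters.append([s])
    return clusters
```

With this scheme the number of clusters depends on input order, which is one reason dedicated clustering tools are normally preferred over a sketch like this one.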

F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 726
Author(s):  
Mike W.C. Thang ◽  
Xin-Yi Chua ◽  
Gareth Price ◽  
Dominique Gorse ◽  
Matt A. Field

Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences. While software for detailing the composition of microbial communities using 16S rRNA marker genes is relatively mature, researchers are increasingly interested in identifying changes exhibited within microbial communities under differing environmental conditions. In order to gain maximum value from metagenomic sequence data we must improve the existing analysis environment by providing accessible and scalable computational workflows able to generate reproducible results. Here we describe a complete end-to-end open-source metagenomics workflow running within Galaxy for 16S differential abundance analysis. The workflow accepts 454 or Illumina sequence data (either overlapping or non-overlapping paired-end reads) and outputs lists of the operational taxonomic units (OTUs) exhibiting the greatest change under differing conditions. A range of analysis steps and graphing options are available, giving users a high level of control over their data and analyses. Additionally, users are able to input complex sample-specific metadata, which can be incorporated into differential analysis and used for grouping / colouring within graphs. Detailed tutorials containing sample data and existing workflows are available for three different input types: overlapping and non-overlapping read pairs as well as pre-generated Biological Observation Matrix (BIOM) files. Using the Galaxy platform we developed MetaDEGalaxy, a complete metagenomics differential abundance analysis workflow. MetaDEGalaxy is designed for bench scientists working with 16S data who are interested in comparative metagenomics. MetaDEGalaxy builds on momentum within the wider Galaxy metagenomics community with the hope that more tools will be added as existing methods mature.


2020 ◽  
Vol 87 (1) ◽  
Author(s):  
Rebecca Co ◽  
Laura A. Hug

ABSTRACT Improved sequencing technologies and the maturation of metagenomic approaches allow the identification of gene variants with potential industrial applications, including cellulases. Cellulase identification from metagenomic environmental surveys is complicated by inconsistent nomenclature and multiple categorization systems. Here, we summarize the current classification and nomenclature systems, with recommendations for improvements to these systems. Addressing the issues described will strengthen the annotation of cellulose-active enzymes from environmental sequence data sets—a rapidly growing resource in environmental and applied microbiology.


2016 ◽  
Author(s):  
Shea N Gardner ◽  
Sasha K Ames ◽  
Maya B Gokhale ◽  
Tom R Slezak ◽  
Jonathan Allen

Software for rapid, accurate, and comprehensive microbial profiling of metagenomic sequence data on a desktop will play an important role in large-scale clinical use of metagenomic data. Here we describe LMAT-ML (Livermore Metagenomics Analysis Toolkit-Marker Library), which can be run with 24 GB of DRAM, an amount available on many clusters, or with 16 GB of DRAM plus a 24 GB low-cost commodity flash drive (NVRAM), a cost-effective alternative for desktop or laptop users. We compared results from LMAT with five other rapid, low-memory tools for metagenome analysis for 131 Human Microbiome Project samples, and assessed discordant calls with BLAST. All the tools except LMAT-ML reported overly specific or incorrect species and strain resolution of reads that were in fact much more widely conserved across species, genera, and even families. Several of the tools misclassified reads from synthetic or vector sequence as microbial, or human reads as viral. We attribute the high numbers of false positive and false negative calls to a limited reference database with inadequate representation of known diversity. Our comparisons with real-world samples show that LMAT-ML is the only tool tested that classifies the majority of reads, and does so with high accuracy.


2017 ◽  
Author(s):  
Stuart M. Brown ◽  
Yuhan Hao ◽  
Hao Chen ◽  
Bobby P. Laungani ◽  
Thahmina A. Ali ◽  
...  

Abstract. Background: Metagenomic shotgun sequencing is becoming increasingly popular to study microbes associated with the human body and in environmental samples. A key goal of shotgun metagenomic sequencing is to identify gene functions and metabolic pathways that differ between samples or conditions. However, current methods to identify function in the large number of reads in a high-throughput sequence data file rely on the computationally intensive and low stringency approach of mapping each read to a generic database of proteins or reference microbial genomes. Results: We have developed an alternative analysis approach for shotgun metagenomic sequence data utilizing Bowtie2 DNA-DNA alignment of the reads to a database of well-annotated genes compiled from human microbiome data. This method is rapid, and provides high stringency matches (>90% DNA sequence identity) of shotgun metagenomics reads to genes with annotated functions. We demonstrate the use of this method with synthetic data, Human Microbiome Project shotgun metagenomic data sets, and data from a study of liver disease. Differentially abundant KEGG gene functions can be detected in these experiments. Conclusions: Functional annotation of metagenomic shotgun sequence reads can be accomplished by rapid DNA-DNA matching to a custom database of microbial sequences using the Bowtie2 sequence alignment tool. This method can be used for a variety of microbiome studies and allows functional analysis which is otherwise computationally demanding. This rapid annotation method is freely available as a Galaxy workflow within a Docker image.


2008 ◽  
Vol 72 (4) ◽  
pp. 557-578 ◽  
Author(s):  
Victor Kunin ◽  
Alex Copeland ◽  
Alla Lapidus ◽  
Konstantinos Mavromatis ◽  
Philip Hugenholtz

SUMMARY As random shotgun metagenomic projects proliferate and become the dominant source of publicly available sequence data, procedures for the best practices in their execution and analysis become increasingly important. Based on our experience at the Joint Genome Institute, we describe the chain of decisions accompanying a metagenomic project from the viewpoint of the bioinformatic analysis step by step. We guide the reader through a standard workflow for a metagenomic project beginning with presequencing considerations such as community composition and sequence data type that will greatly influence downstream analyses. We proceed with recommendations for sampling and data generation including sample and metadata collection, community profiling, construction of shotgun libraries, and sequencing strategies. We then discuss the application of generic sequence processing steps (read preprocessing, assembly, and gene prediction and annotation) to metagenomic data sets in contrast to genome projects. Different types of data analyses particular to metagenomes are then presented, including binning, dominant population analysis, and gene-centric analysis. Finally, data management issues are presented and discussed. We hope that this review will assist bioinformaticians and biologists in making better-informed decisions on their journey during a metagenomic project.


mBio ◽  
2015 ◽  
Vol 6 (1) ◽  
Author(s):  
Carolina Megumi Mizuno ◽  
Francisco Rodriguez-Valera ◽  
Rohit Ghai

ABSTRACT The genomes of four novel marine Actinobacteria have been assembled from large metagenomic data sets derived from the Mediterranean deep chlorophyll maximum (DCM). These are the first marine representatives belonging to the order Acidimicrobiales and only the second group of planktonic marine Actinobacteria to be described. Their streamlined genomes and photoheterotrophic lifestyle suggest that they are planktonic, free-living microbes. A novel rhodopsin clade, acidirhodopsins, related to freshwater actinorhodopsins, was found in these organisms. Their genomes suggest a capacity to assimilate C2 compounds, some using the glyoxylate bypass and others with the ethylmalonyl-coenzyme A (CoA) pathway. They are also able to derive energy from dimethylsulfopropionate (DMSP), sulfonate, and carbon monoxide oxidation, all commonly available in the marine habitat. These organisms appear to be prevalent in the deep photic zone at or around the DCM. The presence of sister clades to the marine Acidimicrobiales in freshwater aquatic habitats provides a new example of marine-freshwater transitions with potential evolutionary insights. IMPORTANCE Despite several studies showing the importance and abundance of planktonic Actinobacteria in the marine habitat, a representative genome was only recently described. In order to expand the genomic repertoire of marine Actinobacteria, we describe here the first Acidimicrobidae genomes of marine origin and provide insights about their ecology. They display metabolic versatility in the acquisition of carbon and appear capable of utilizing diverse sources of energy. One of the genomes harbors a new kind of rhodopsin related to the actinorhodopsin clade of freshwater origin that is widespread in the oceans. Our data also support their preference to inhabit the deep chlorophyll maximum and the deep photic zone.
This work contributes to the perception of marine actinobacterial groups as important players in the marine environment with distinct and important contributions to nutrient cycling in the oceans.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 726 ◽  
Author(s):  
Mike W.C. Thang ◽  
Xin-Yi Chua ◽  
Gareth Price ◽  
Dominique Gorse ◽  
Matt A. Field

Metagenomic sequencing is an increasingly common tool in environmental and biomedical sciences, yet analysis workflows remain immature relative to those in other fields, such as DNASeq and RNASeq analysis pipelines. While software for detailing the composition of microbial communities using 16S rRNA marker genes is constantly improving, researchers are increasingly interested in identifying changes exhibited within microbial communities under differing environmental conditions. In order to gain maximum value from metagenomic sequence data we must improve the existing analysis environment by providing accessible and scalable computational workflows able to generate reproducible results. Here we describe a complete end-to-end open-source metagenomics workflow running within Galaxy for 16S differential abundance analysis. The workflow accepts 454 or Illumina sequence data (either overlapping or non-overlapping paired-end reads) and outputs lists of the operational taxonomic units (OTUs) exhibiting the greatest change under differing conditions. A range of analysis steps and graphing options are available, giving users a high level of control over their data and analyses. Additionally, users are able to input complex sample-specific metadata, which can be incorporated into differential analysis and used for grouping / colouring within graphs. Detailed tutorials containing sample data and existing workflows are available for three different input types: overlapping and non-overlapping read pairs as well as pre-generated Biological Observation Matrix (BIOM) files. Using the Galaxy platform we developed MetaDEGalaxy, a complete metagenomics differential abundance analysis workflow. MetaDEGalaxy is designed for bench scientists working with 16S data who are interested in comparative metagenomics. MetaDEGalaxy builds on momentum within the wider Galaxy metagenomics community with the hope that more tools will be added as existing methods mature.


BMC Genomics ◽  
2011 ◽  
Vol 12 (Suppl 3) ◽  
pp. S12 ◽  
Author(s):  
Monzoorul Mohammed ◽  
Tarini Ghosh ◽  
Sudha Chadaram ◽  
Sharmila S Mande

2017 ◽  
Author(s):  
Adam L. Bazinet ◽  
Brian D. Ondov ◽  
Daniel D. Sommer ◽  
Shashikala Ratnayake

Abstract: When performing bioforensic casework, it is important to be able to reliably detect the presence of a particular organism in a metagenomic sample, even if the organism is only present in a trace amount. For this task, it is common to use a sequence classification program that determines the taxonomic affiliation of individual sequence reads by comparing them to reference database sequences. As metagenomic data sets often consist of millions or billions of reads that need to be compared to reference databases containing millions of sequences, such sequence classification programs typically use search heuristics and databases with reduced sequence diversity to speed up the analysis, which can lead to incorrect assignments. Thus, in a bioforensic setting where correct assignments are paramount, assignments of interest made by “first-pass” classifiers should be confirmed using the most precise methods and comprehensive databases available. In this study we present a blast-based method for validating the assignments made by less precise sequence classification programs, with optimal parameters for filtering of blast results determined via simulation of sequence reads from genomes of interest, and we apply the method to the detection of four pathogenic organisms. The software implementing the method is open source and freely available.
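The idea of confirming first-pass assignments by filtering blast results can be sketched as follows. This is not the authors' software: the function name confirmed_reads and the default thresholds (95% identity, 80 aligned bases) are hypothetical placeholders, since the abstract states only that the filtering parameters were determined by simulation. The sketch assumes blast output in the standard tabular format (-outfmt 6), whose first four columns are query ID, subject ID, percent identity, and alignment length.

```python
def confirmed_reads(blast_lines, min_pident=95.0, min_length=80):
    """Return IDs of reads with at least one hit passing both thresholds.

    blast_lines: iterable of tab-separated lines in BLAST tabular format.
    min_pident, min_length: illustrative cutoffs, not values from the study.
    """
    passed = set()
    for line in blast_lines:
        line = line.strip()
        # Skip blank lines and comment lines (as written by -outfmt 7).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query = fields[0]
        pident = float(fields[2])
        length = int(fields[3])
        if pident >= min_pident and length >= min_length:
            passed.add(query)
    return passed
```

A read classified by a faster tool would then count as validated only if its ID appears in the returned set; in practice one would also check that the passing hit belongs to the expected organism.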


Plant Disease ◽  
2012 ◽  
Vol 96 (8) ◽  
pp. 1222-1222 ◽  
Author(s):  
I. S. Myung ◽  
J. K. Choi ◽  
J. M. Wu ◽  
J. Y. Lee ◽  
H. L. Yoo ◽  
...  

In July 2011, bacterial stripe was observed in a commercial field of hog millet (Panicum miliaceum L.) in Chuncheon, Korea, with a disease incidence of 37% in the field. Symptoms on leaves included reddish-brown, long, narrow stripes that varied in length and were sharply delineated by uninfected adjacent vascular bundles. Eleven bacterial isolates (BC3107, BC3214 to BC3223) were recovered on trypticase soy agar from lesions surface sterilized in 70% ethanol for 1 min. The isolates, all obtained from different plants, were gram negative, oxidase positive, aerobic rods with two to four flagella. The isolates produced circular, cream-colored, nonfluorescent, butyrous colonies with entire margins on King's B medium. Using the Biolog Microbial Identification System, Version 4.2 (Biolog Inc., Hayward, CA), the isolates were identified as Acidovorax avenae subsp. avenae with Biolog similarity indices ranging from 0.52 to 0.72 after 24 hr. Characters for differentiating between Acidovorax spp. were tested according to Schaad et al. (2). The isolates were positive for gelatin liquefaction, nitrate reduction, lipase production, utilization of D-mannitol and sodium citrate, and an alkaline reaction in litmus milk. The isolates were negative for utilization of D-arabitol and did not amplify with PCR primer sets Aaaf5, Aaaf3/Aaar2, and Aacf2/Aacr2. Colonies were V–, V+, and V+ for utilization of D-fucose, maltose, and ethanol, respectively. Regions of the 16S rRNA (rrs) and the IGS were sequenced to aid in the identification of the isolates using reported PCR primer sets (1,4). A 1,426 bp fragment of the rrs region shared 100% similarity with all strains of A. avenae available in GenBank. Pathogenicity tests were separately performed for the 11 isolates in different greenhouses located in Suwon (National Academy of Agricultural Science) and Chuncheon (Gangwondo Agricultural Research and Extension Services) in Korea.
Pathogenicity was confirmed by clip inoculation with sterilized scissors dipped into cell suspensions containing 10^5 CFU/ml on three 8-day-old leaves of hog millet (two plants per isolate), rice (Oryza sativa L. cv. Hopyeong), and sweet corn (Zea mays L. cv. Daehak) in a greenhouse maintained at 28 to 32°C and 90% relative humidity. The isolates induced symptoms similar to those originally observed on hog millet 5 days after inoculation. No symptoms were observed on the control plants (hog millet, rice, and sweet corn), which were clipped with scissors dipped in sterilized distilled water. The identity of bacteria reisolated from the stripes on inoculated leaves was confirmed by analyzing sequences of the 16S-23S rRNA intergenic spacer region (IGS) (1). On the basis of physiological, pathological, and sequence data, the isolates were identified as A. avenae subsp. avenae. To our knowledge, this is the first report of bacterial stripe of hog millet caused by A. avenae subsp. avenae in Korea. The spread of the bacterial disease is expected to have a significant economic impact on hog millet culture in the fields of Gangwon Province in Korea. Nucleotide sequence data reported are available under accession numbers JQ743877 to JQ743887 for rrs of BC3207 and BC3214 to BC3223, and JQ743877 to JQ743887 for IGS of BC3207 and BC3214 to BC3223. References: (1) T. Barry et al. The PCR Methods Appl. 1:51, 1991. (2) N. W. Schaad et al. Syst. Appl. Microbiol. 31:434, 2008. (3) K. Tamura et al. Mol. Biol. Evol. 28:2731, 2011. (4) W. G. Weisburg et al. J. Bacteriol. 173:697, 1991.

