cpn60 barcode sequences accurately identify newly defined genera within the Lactobacillaceae

2021 ◽  
Author(s):  
Ishika Shukla ◽  
Janet E. Hill

Abstract: The cpn60 barcode sequence is established as an informative target for microbial species identification. Applications of cpn60 barcode sequencing are supported by the availability of “universal” PCR primers for its amplification and a curated reference database of cpn60 sequences, cpnDB. A recent reclassification of lactobacilli involving the definition of 23 new genera provided an opportunity to update cpnDB and to determine if the cpn60 barcode could be used for accurate identification of species consistent with the new framework. Analysis of 275 cpn60 sequences representing 258/269 of the validly named species in Lactobacillus, Paralactobacillus and the 23 newer genera showed that cpn60-based sequence relationships were consistent with the whole-genome-based phylogeny. Aligning or mapping full-length barcode sequences or a 150 bp subsequence resulted in accurate and unambiguous species identification in almost all cases. Taken together, our results show that the combination of available reference sequence data, “universal” barcode amplification primers, and the inherent sequence diversity within the cpn60 barcode makes it a useful target for the detection and identification of lactobacilli as defined by the latest taxonomic framework.

Significance and Impact of the Study: The genus Lactobacillus recently underwent a major reorganization resulting in the definition of 23 new genera. Lactobacilli are widespread in environmental and host-associated microbiomes and are exploited in food and biotechnology applications, making methods for their accurate identification desirable. Here we show that the combination of a reference sequence database, “universal” barcode amplification primers, and the inherent sequence diversity within the cpn60 barcode makes it a useful target for the detection and identification of lactobacilli as defined by the latest taxonomic framework.

2018 ◽  
Author(s):  
Joe Parker ◽  
Andrew Helmstetter ◽  
James Crowe ◽  
John Iacona ◽  
Dion Devey ◽  
...  

Abstract: The versatility of the current DNA sequencing platforms and the development of portable nanopore sequencers mean that it has never been easier to collect genetic data for unknown sample ID. DNA barcoding and meta-barcoding have become increasingly popular and barcode databases continue to grow at an impressive rate. However, the number of canonical genome assemblies (reference or draft) that are publicly available is relatively tiny, hindering the more widespread use of genome-scale DNA sequencing technology for accurate species identification and discovery. Here, we show that rapid raw-read reference datasets, or R4IDs for short, generated in a matter of hours on the Oxford Nanopore MinION, can bridge this gap and accelerate the generation of usable reference sequence data. By exploiting the long read length of this technology, shotgun genomic sequencing of a small portion of an organism’s genome can act as a suitable reference database despite the low sequencing coverage. These R4IDs can then be used for accurate species identification with minimal amounts of re-sequencing effort (1000s of reads). We demonstrated the capabilities of this approach with six vascular plant species for which we created R4IDs in the laboratory and then re-sequenced, live at the Kew Science Festival 2016. We further validated our method using simulations to determine the broader applicability of the approach. Our data analysis pipeline has been made available as a Dockerised workflow for simple, scalable deployment for a range of uses.


2019 ◽  
Author(s):  
Vanessa R. Marcelino ◽  
Philip T.L.C. Clausen ◽  
Jan P. Buchmann ◽  
Michelle Wille ◽  
Jonathan R. Iredell ◽  
...  

Abstract: High-throughput sequencing of DNA and RNA from environmental and host-associated samples (metagenomics and metatranscriptomics) is a powerful tool to assess which organisms are present in a sample. Taxonomic identification software usually aligns individual short sequence reads to a reference database, sometimes containing taxa with complete genomes only. This is a challenging task given that different species can share identical sequence regions and complete genome sequences are only available for a fraction of organisms. A recently developed approach to map sequence reads to reference databases involves weighing all high-scoring read mappings to the database as a whole to produce better-informed alignments. We used this novel concept in read mapping to develop a highly accurate metagenomic classification pipeline named CCMetagen. Using simulated fungal and bacterial metagenomes, we demonstrate that CCMetagen substantially outperforms other commonly used metagenome classifiers, attaining a 3–1580-fold increase in precision and a 2–922-fold increase in F1 scores for species-level classifications when compared to Kraken2, Centrifuge and KrakenUniq. CCMetagen is sufficiently fast and memory efficient to use the entire NCBI nucleotide collection (nt) as reference, enabling the assessment of species with incomplete genome sequence data from all biological kingdoms. Our pipeline efficiently produced a comprehensive overview of the microbiome of two biological data sets, including both eukaryotes and prokaryotes. CCMetagen is user-friendly and the results can be easily integrated into microbial community analysis software for streamlined and automated microbiome studies.


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2279 ◽  
Author(s):  
Javier F. Tabima ◽  
Sydney E. Everhart ◽  
Meredith M. Larsen ◽  
Alexandra J. Weisberg ◽  
Zhian N. Kamvar ◽  
...  

Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allows for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (microbe-id.org) and provided a working implementation for the genus Phytophthora (phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on GitHub and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.


Insects ◽  
2018 ◽  
Vol 9 (4) ◽  
pp. 159 ◽  
Author(s):  
Narin Sontigun ◽  
Kabkaew Sukontason ◽  
Jens Amendt ◽  
Barbara Zajac ◽  
Richard Zehner ◽  
...  

Blow flies are the first insect group to colonize a dead body, and thus correct species identification is a crucial step in forensic investigations for estimating the minimum postmortem interval, as developmental times are species-specific. Because traditional morphology-based identification is difficult, owing to the morphological similarity of closely related species and the lack of taxonomic keys covering all developmental stages, DNA-based identification has attracted increasing interest, especially in high-biodiversity areas such as Thailand. In this study, the effectiveness of long mitochondrial cytochrome c oxidase subunit I and II (COI and COII) sequences (1247 and 635 bp, respectively) in identifying 16 species of forensically relevant blow flies in Thailand (Chrysomya bezziana, Chrysomya chani, Chrysomya megacephala, Chrysomya nigripes, Chrysomya pinguis, Chrysomya rufifacies, Chrysomya thanomthini, Chrysomya villeneuvi, Lucilia cuprina, Lucilia papuensis, Lucilia porphyrina, Lucilia sinensis, Hemipyrellia ligurriens, Hemipyrellia pulchra, Hypopygiopsis infumata, and Hypopygiopsis tumrasvini) was assessed using distance-based (Kimura two-parameter distances based on Best Match, Best Close Match, and All Species Barcodes criteria) and tree-based (grouping taxa by sequence similarity in the neighbor-joining tree) methods. Analyses of the obtained sequence data demonstrated that COI and COII genes were effective markers for accurate species identification of the Thai blow flies. This study has not only demonstrated the genetic diversity of Thai blow flies, but also provided a reliable DNA reference database for further use in forensic entomology within the country and other regions where these species exist.
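
The distance-based criteria named in this abstract can be illustrated with a minimal Python sketch. It computes the standard Kimura two-parameter distance between two pre-aligned sequences and applies a Best Close Match style rule (assign the species of the nearest reference barcode only if its distance falls within a threshold). The function names, the toy sequences, and the threshold value are assumptions made for illustration; they are not taken from the study, which does not describe its software here.

```python
import math

PURINES = {"A", "G"}  # A<->G and C<->T substitutions are transitions


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log(term1 * math.sqrt(term2))


def best_close_match(query, references, threshold):
    """Return the species of the nearest reference if within threshold.

    references: list of (species, aligned_sequence) tuples (hypothetical layout).
    Returns None when no reference is close enough, and "ambiguous" when
    several species tie for the smallest distance.
    """
    if not references:
        raise ValueError("reference list is empty")
    scored = [(k2p_distance(query, seq), species) for species, seq in references]
    best = min(d for d, _ in scored)
    if best > threshold:
        return None
    species_at_best = {s for d, s in scored if d == best}
    return species_at_best.pop() if len(species_at_best) == 1 else "ambiguous"
```

Dropping the threshold check gives the plain Best Match rule. Real barcodes would first need a multiple sequence alignment, which this sketch assumes has already been done.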


Parasitology ◽  
2009 ◽  
Vol 136 (12) ◽  
pp. 1501-1507 ◽  
Author(s):  
W. GIBSON

Summary: The first step in studying the epidemiology of a disease is the accurate identification of the pathogen. Traditional reliance on morphological identification has given way to the use of molecular methods for the detection and identification of pathogens, greatly improving our understanding of epidemiology. For the African tsetse-transmitted trypanosomes, the growth of PCR methods for identification of trypanosomes has led to increased appreciation of trypanosome genetic diversity and discovery of hitherto unknown trypanosome species, as well as greater knowledge about the number and type of trypanosome infections circulating in mammalian hosts and vectors. Sequence data and phylogenetic analysis have provided quantitative information on the relatedness of different trypanosome species and allowed the new trypanosome genotypes discovered through the use of species identification methods in the field to be accurately placed in the phylogenetic tree.


2021 ◽  
Vol 7 ◽  
Author(s):  
Gillian Mitchell ◽  
Ruth N. Zadoks ◽  
Philip J. Skuce

Rumen fluke are parasitic trematodes that affect domestic and wild ruminants across a wide range of countries and habitats. There are 6 major genera of rumen fluke and over 70 recognized species. Accurate species identification is important to investigate the epidemiology, pathophysiology and economic impact of rumen fluke species, but paramphistomes are morphologically plastic, which has resulted in numerous instances of misclassification. Here, we present a universal approach to molecular identification of rumen fluke species, including different life-cycle stages (eggs, juvenile and mature fluke) and sample preservation methods (fresh, ethanol- or formalin-fixed, and paraffin wax-embedded). Among 387 specimens from 173 animals belonging to 10 host species and originating from 14 countries on 5 continents, 10 rumen fluke species were identified based on ITS-2 intergenic spacer sequencing, including members of the genera Calicophoron, Cotylophoron, Fischoederius, Gastrothylax, Orthocoelium, and Paramphistomum. Pairwise comparison of ITS-2 sequences from this study and GenBank showed >98.5% homology for 80% of intra-species comparisons and <98.5% homology for 97% of inter-species comparisons, suggesting that some sequence data may have been entered into public repositories with incorrect species attribution based on morphological analysis. We propose that ITS-2 sequencing could be used as a universal tool for rumen fluke identification across host and parasite species from diverse technical and geographical origins and form the basis of an international reference database for accurate species identification.
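
The pairwise comparison described in this abstract can be sketched in Python as follows. The sketch computes percent identity between pre-aligned sequences and flags pairs that contradict their species labels relative to a cutoff: same-species pairs at or below it, and different-species pairs at or above it. Only the 98.5% figure comes from the abstract. The record layout, function names, and the handling of gaps are assumptions made for illustration and may differ from the authors' procedure.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def flag_suspect_pairs(records, threshold=98.5):
    """List pairs whose identity disagrees with their species labels.

    records: list of (record_id, species, aligned_sequence) tuples
    (a hypothetical layout). A pair is flagged when two records share a
    species label but identity is at or below the threshold, or when the
    labels differ but identity is at or above it.
    """
    flagged = []
    for (id_a, sp_a, seq_a), (id_b, sp_b, seq_b) in combinations(records, 2):
        identity = percent_identity(seq_a, seq_b)
        same_species = sp_a == sp_b
        if (same_species and identity <= threshold) or \
           (not same_species and identity >= threshold):
            flagged.append((id_a, id_b, identity))
    return flagged
```

Flagged pairs are only candidates for review; a flag could reflect genuine intra-species variation as well as a mislabelled database entry.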


2021 ◽  
Author(s):  
Kelly L Sovacool ◽  
Sarah L Westcott ◽  
M Brodie Mumphrey ◽  
Gabrielle A Dotson ◽  
Patrick D. Schloss

Assigning amplicon sequences to operational taxonomic units (OTUs) is often an important step in characterizing the composition of microbial communities across large datasets. OptiClust, a de novo OTU clustering method, has been shown to produce higher quality OTU assignments than other methods and at comparable or faster speeds. A notable difference between de novo clustering and database-dependent reference clustering methods is that OTU assignments from de novo methods may change when new sequences are added to a dataset. However, in some cases one may wish to incorporate new samples into a previously clustered dataset without performing clustering again on all sequences, such as when comparing across datasets or deploying machine learning models where OTUs are features. Existing reference-based clustering methods produce consistent OTUs, but they only consider the similarity of each query sequence to a single reference sequence in an OTU, thus resulting in OTU assignments that are significantly worse than those generated by de novo methods. To provide an efficient and robust method to fit amplicon sequence data to existing OTUs, we developed the OptiFit algorithm. Inspired by OptiClust, OptiFit considers the similarity of all pairs of reference and query sequences in an OTU to produce OTUs of the best possible quality. We tested OptiFit using four microbiome datasets with two different strategies: by clustering to an external reference database or by splitting the dataset into a reference and query set and clustering the query sequences to the reference set after clustering it using OptiClust. The result is an improved implementation of closed and open-reference clustering. OptiFit produces OTUs of similar quality to OptiClust and at faster speeds when using the split dataset strategy, although the OTU quality and processing speed depend on the database chosen when using the external database strategy. OptiFit provides a suitable option for users who require consistent OTU assignments at the same quality afforded by de novo clustering methods.


2016 ◽  
Author(s):  
Javier F Tabima ◽  
Sydney E Everhart ◽  
Meredith M Larsen ◽  
Alexandra J Weisberg ◽  
Zhian N Kamvar ◽  
...  

Development of tools to identify species, genotypes, or novel strains of invasive organisms is critical for monitoring emergence and implementing rapid response measures. Molecular markers, although critical to identifying species or genotypes, require bioinformatic tools for analysis. However, user-friendly analytical tools for fast identification are not readily available. To address this need, we created a web-based set of applications called Microbe-ID that allows for customizing a toolbox for rapid species identification and strain genotyping using any genetic markers of choice. Two components of Microbe-ID, named Sequence-ID and Genotype-ID, implement species and genotype identification, respectively. Sequence-ID allows identification of species by using BLAST to query sequences for any locus of interest against a custom reference sequence database. Genotype-ID allows placement of an unknown multilocus marker in either a minimum spanning network or dendrogram with bootstrap support from a user-created reference database. Microbe-ID can be used for identification of any organism based on nucleotide sequences or any molecular marker type and several examples are provided. We created a public website for demonstration purposes called Microbe-ID (www.microbe-id.org) and provided a working implementation for the genus Phytophthora (www.phytophthora-id.org). In Phytophthora-ID, the Sequence-ID application allows identification based on ITS or cox spacer sequences. Genotype-ID groups individuals into clonal lineages based on simple sequence repeat (SSR) markers for the two invasive plant pathogen species P. infestans and P. ramorum. All code is open source and available on GitHub and CRAN. Instructions for installation and use are provided at https://github.com/grunwaldlab/Microbe-ID.

