DISCo-microbe: design of an identifiable synthetic community of microbes

PeerJ ◽

10.7717/peerj.8534 ◽

2020 ◽

Vol 8 ◽

pp. e8534 ◽

Cited By ~ 1

Author(s):

Dana L. Carper ◽

Travis J. Lawrence ◽

Alyssa A. Carrell ◽

Dale A. Pelletier ◽

David J. Weston

Keyword(s):

Microbial Community ◽

16S rRNA ◽

DNA Sequence ◽

Sequence Alignment ◽

Amplicon Sequencing ◽

Ribosomal Database Project ◽

Community Members ◽

DNA Sequence Alignment ◽

Diverse Community

Background: Microbiomes are extremely important for their host organisms, providing many vital functions and extending their hosts’ phenotypes. Natural studies of host-associated microbiomes can be difficult to interpret due to the high complexity of microbial communities, which hinders our ability to track and identify individual members along with the many factors that structure or perturb those communities. For this reason, researchers have turned to synthetic or constructed communities in which the identities of all members are known. However, due to the lack of tracking methods and the difficulty of creating a more diverse and identifiable community that can be distinguished through next-generation sequencing, most such in vivo studies have used only a few strains. Results: To address this issue, we developed DISCo-microbe, a program for the design of an identifiable synthetic community of microbes for use in in vivo experimentation. The program is composed of two modules: (1) create, which allows the user to generate a highly diverse community list from an input DNA sequence alignment using a custom nucleotide distance algorithm, and (2) subsample, which subsamples the community list either to represent a number of grouping variables, including taxonomic proportions, or to reach a user-specified maximum number of community members. As an example, we demonstrate the generation of a synthetic microbial community that can be distinguished through amplicon sequencing. The synthetic microbial community in this example consisted of 2,122 members from a starting DNA sequence alignment of 10,000 16S rRNA sequences from the Ribosomal Database Project. We generated simulated Illumina sequencing data from the constructed community and demonstrate that DISCo-microbe is capable of designing diverse communities with members distinguishable by amplicon sequencing.
Using the simulated data, we were able to recover sequences from 97–100% of community members using two different post-processing workflows. Furthermore, 97–99% of sequences were assigned to a community member, with zero sequences being misidentified. We then subsampled the community list using taxonomic proportions to mimic a natural plant host–associated microbiome, ultimately yielding a diverse community of 784 members. Conclusions: DISCo-microbe can create a highly diverse community list of microbes that can be distinguished through 16S rRNA gene sequencing, and has the ability to subsample (i.e., design) the community for the desired number of members and taxonomic proportions. Although developed for bacteria, the program allows for any alignment input from any taxonomic group, making it broadly applicable. The software and data are freely available from GitHub (https://github.com/dlcarper/DISCo-microbe) and the Python Package Index (PyPI).
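The two-module idea in the abstract above (build a list of mutually distinguishable sequences, then subsample it by group) can be illustrated with a minimal Python sketch. This is not DISCo-microbe's actual code or its custom distance algorithm: the function names, the gap-skipping mismatch count, the greedy selection order, and the `min_diff` and `targets` parameters are all assumptions made for illustration.

```python
def mismatch_count(a, b):
    """Count aligned columns where two sequences differ, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "-" and y != "-")


def create_community(alignment, min_diff=3):
    """Greedily keep sequences at least min_diff mismatches from every kept one.

    Hypothetical stand-in for a 'create' step; the real tool may differ.
    """
    community = []
    for name, seq in alignment.items():
        if all(mismatch_count(seq, alignment[k]) >= min_diff for k in community):
            community.append(name)
    return community


def subsample(community, group_of, targets):
    """Keep at most targets[group] members per group, in community order.

    Hypothetical stand-in for a 'subsample' step driven by taxonomic targets.
    """
    counts = {}
    kept = []
    for name in community:
        group = group_of[name]
        if counts.get(group, 0) < targets.get(group, 0):
            kept.append(name)
            counts[group] = counts.get(group, 0) + 1
    return kept


# Toy alignment (made-up sequences, not data from the paper).
alignment = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",  # 1 mismatch from s1, so it is rejected
    "s3": "TTGTACCA",  # 4 mismatches from s1, so it is kept
    "s4": "ACG-ACGT",  # only a gap column differs from s1, so it is rejected
    "s5": "GGCAACGT",  # 4 mismatches from s1 and 6 from s3, so it is kept
}
community = create_community(alignment, min_diff=3)

group_of = {"s1": "Proteobacteria", "s3": "Firmicutes", "s5": "Proteobacteria"}
design = subsample(community, group_of, {"Proteobacteria": 1, "Firmicutes": 1})
```

With this toy input, `community` is `["s1", "s3", "s5"]` and `design` is `["s1", "s3"]`. A greedy pass depends on input order, so a different ordering of the alignment could yield a different community.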

DISCo-microbe: Design of an identifiable synthetic community of microbes

10.7287/peerj.preprints.27898v1 ◽

2019 ◽

Author(s):

Dana L Carper ◽

Travis J Lawrence ◽

Alyssa A Carrell ◽

Dale A Pelletier ◽

David J Weston

Keyword(s):

Microbial Community ◽

16S rRNA ◽

DNA Sequence ◽

Sequence Alignment ◽

rRNA Gene ◽

Ribosomal Database Project ◽

Plant Host ◽

DNA Sequence Alignment ◽

Diverse Community

Background: Microbiomes are extremely important for their host organisms, providing many vital functions and extending their hosts’ phenotypes. Natural studies of host-associated microbiomes can be difficult to interpret due to the high complexity of microbial communities, which hinders our ability to track and identify individual members along with the many factors that structure or perturb those communities. For this reason, researchers have turned to synthetic or constructed communities in which the identities of all members are known. However, due to the lack of tracking methods and the difficulty of creating a more diverse and identifiable community that can be distinguished through next-generation sequencing, most such in vivo studies have used only a few strains. Results: To address this issue, we developed DISCo-microbe, a program for the design of an identifiable synthetic community of microbes for use in in vivo experimentation. The program is composed of two modules: (1) create, which allows the user to generate a highly diverse community list from an input DNA sequence alignment using a custom nucleotide distance algorithm, and (2) subsample, which subsamples the community list either to represent a number of grouping variables, including taxonomic proportions, or to reach a user-specified maximum number of community members. As an example, we demonstrate the generation of a synthetic microbial community that can be distinguished through amplicon sequencing. The synthetic microbial community in this example consisted of 2,340 members from a starting DNA sequence alignment of 10,000 16S rRNA sequences from the Ribosomal Database Project. We then subsampled the community list using taxonomic proportions to mimic a natural plant host–associated microbiome, ultimately yielding a diverse community of 853 members.
Conclusions: DISCo-microbe can create a highly diverse community list of microbes that can be distinguished through 16S rRNA gene sequencing, and has the ability to subsample (i.e., design) the community for the desired number of members and taxonomic proportions. Although developed for bacteria, the program allows for any alignment input from any taxonomic group, making it broadly applicable. The software and data are freely available from GitHub (https://github.com/dlcarper/DISCo-microbe) and the Python Package Index (PyPI).

DISCo-microbe: Design of an identifiable synthetic community of microbes

10.7287/peerj.preprints.27898 ◽

2019 ◽

Author(s):

Dana L Carper ◽

Travis J Lawrence ◽

Alyssa A Carrell ◽

Dale A Pelletier ◽

David J Weston

Keyword(s):

Microbial Community ◽

16S rRNA ◽

DNA Sequence ◽

Sequence Alignment ◽

rRNA Gene ◽

Ribosomal Database Project ◽

Plant Host ◽

DNA Sequence Alignment ◽

Diverse Community

Background: Microbiomes are extremely important for their host organisms, providing many vital functions and extending their hosts’ phenotypes. Natural studies of host-associated microbiomes can be difficult to interpret due to the high complexity of microbial communities, which hinders our ability to track and identify individual members along with the many factors that structure or perturb those communities. For this reason, researchers have turned to synthetic or constructed communities in which the identities of all members are known. However, due to the lack of tracking methods and the difficulty of creating a more diverse and identifiable community that can be distinguished through next-generation sequencing, most such in vivo studies have used only a few strains. Results: To address this issue, we developed DISCo-microbe, a program for the design of an identifiable synthetic community of microbes for use in in vivo experimentation. The program is composed of two modules: (1) create, which allows the user to generate a highly diverse community list from an input DNA sequence alignment using a custom nucleotide distance algorithm, and (2) subsample, which subsamples the community list either to represent a number of grouping variables, including taxonomic proportions, or to reach a user-specified maximum number of community members. As an example, we demonstrate the generation of a synthetic microbial community that can be distinguished through amplicon sequencing. The synthetic microbial community in this example consisted of 2,340 members from a starting DNA sequence alignment of 10,000 16S rRNA sequences from the Ribosomal Database Project. We then subsampled the community list using taxonomic proportions to mimic a natural plant host–associated microbiome, ultimately yielding a diverse community of 853 members.
Conclusions: DISCo-microbe can create a highly diverse community list of microbes that can be distinguished through 16S rRNA gene sequencing, and has the ability to subsample (i.e., design) the community for the desired number of members and taxonomic proportions. Although developed for bacteria, the program allows for any alignment input from any taxonomic group, making it broadly applicable. The software and data are freely available from GitHub (https://github.com/dlcarper/DISCo-microbe) and the Python Package Index (PyPI).

Molecular and serological detection of Ehrlichia spp. in cats on São Luís Island, Maranhão, Brazil

Revista Brasileira de Parasitologia Veterinária ◽

10.1590/s1984-29612012000100008 ◽

2012 ◽

Vol 21 (1) ◽

pp. 37-41 ◽

Cited By ~ 19

Author(s):

Maria do Socorro Costa de Oliveira Braga ◽

Marcos Rogério André ◽

Carla Roberta Freschi ◽

Márcia Cristina Alves Teixeira ◽

Rosangela Zacarias Machado

Keyword(s):

16S rRNA ◽

DNA Sequence ◽

Sequence Alignment ◽

Molecular Detection ◽

Parasite Species ◽

Serological Detection ◽

Domestic Cats ◽

DNA Sequence Alignment ◽

Gene 16S rRNA ◽

Tick Borne Disease

Ehrlichiosis is a tick-borne disease that affects both humans and animals. The few existing reports on ehrlichiosis in Brazilian cats have been based on observation of morulae in leukocytes and, more recently, on molecular detection of Ehrlichia sp. In this study, we assessed the occurrence of Ehrlichia sp. in the blood of 200 domestic cats in São Luís, Maranhão. Of the 200 animals tested, 11 (5.5%) were seropositive for Ehrlichia sp. and two (1%) were positive for Ehrlichia sp. by PCR. We also performed DNA sequence alignment of the 16S rRNA gene to establish the identity of the parasite species infecting these animals. One cat was infected with an Ehrlichia sp. showing 98% identity with E. canis, and another was infected with an Ehrlichia sp. showing 97% identity with E. chaffeensis. This is the first study on molecular detection of Ehrlichia sp. among domestic cats in São Luís, Maranhão.

Fast DNA Sequence Alignment Algorithm Based on Quality Score Using Improved Dynamic Programming and Fuzzy Gap Cost Control

Current Bioinformatics ◽

10.2174/1574893609666140523000227 ◽

2014 ◽

Vol 9 (5) ◽

pp. 540-547

Author(s):

Kwang Kim ◽

Hyun Park ◽

Doo Song

Keyword(s):

Dynamic Programming ◽

DNA Sequence ◽

Sequence Alignment ◽

Cost Control ◽

Quality Score ◽

Alignment Algorithm ◽

Sequence Alignment Algorithm ◽

DNA Sequence Alignment ◽

Improved Dynamic Programming

Ultra-accurate microbial amplicon sequencing with synthetic long reads

Microbiome ◽

10.1186/s40168-021-01072-3 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Benjamin J. Callahan ◽

Dmitry Grinevich ◽

Siddhartha Thakur ◽

Michael A. Balamotis ◽

Tuval Ben Yehezkel

Keyword(s):

Microbial Community ◽

16S rRNA ◽

Amplicon Sequencing ◽

Species Level ◽

Full Length ◽

16S rRNA Genes ◽

rRNA Genes ◽

Strain Identification ◽

Long Reads ◽

Long Read

Background: Out of the many pathogenic bacterial species that are known, only a fraction are readily identifiable directly from a complex microbial community using standard next-generation DNA sequencing. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge. Methods: Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads. Results: LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq perfectly recovered the full diversity of 16S rRNA genes from known strains in a synthetic microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, which exceeds the accuracy reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq sequencing maintains that accuracy for reads up to 6 kb in length. LoopSeq full-length 16S rRNA reads could accurately classify organisms down to the species level in rinsate from retail meat samples, and could differentiate strains within species identified by the CDC as potential foodborne pathogens. Conclusions: The order-of-magnitude improvement in length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain identification from complex- to low-biomass microbiome samples. The ability to generate accurate and long microbiome sequencing reads using standard short-read sequencers will accelerate the building of quality microbial sequence databases and remove a significant hurdle on the path to precision microbial genomics.

Design and Analysis of 8-bit Smith Waterman based DNA Sequence Alignment Accelerator's Core on ASIC Design Flow

2010 Fourth UKSim European Symposium on Computer Modeling and Simulation ◽

10.1109/ems.2010.31 ◽

2010 ◽

Cited By ~ 2

Author(s):

A.K. Halim ◽

Z.A. Majid ◽

M.A. Mansor ◽

S.A.M. Al Junid ◽

S. Mohamed ◽

...

Keyword(s):

DNA Sequence ◽

Sequence Alignment ◽

Design Flow ◽

ASIC Design ◽

DNA Sequence Alignment

A Memory-Efficient Accelerator for DNA Sequence Alignment with Two-Piece Affine Gap Tracebacks

2021 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas51556.2021.9401771 ◽

2021 ◽

Author(s):

Jing-Ping Wu ◽

Yi-Chien Lin ◽

Ying-Wei Wu ◽

Shih-Wei Hsieh ◽

Ching-Hsuan Tai ◽

...

Keyword(s):

DNA Sequence ◽

Sequence Alignment ◽

DNA Sequence Alignment ◽

Memory Efficient

Benchmark of algorithms for multiple DNA sequence alignment across livestock species

Translational Research in Veterinary Science ◽

10.12775/trvs.2020.009 ◽

2021 ◽

Vol 3 (2) ◽

pp. 41

Author(s):

Artur Bąk ◽

Grzegorz Migdałek ◽

Chandra Shekhar Pareek ◽

Kacper Żukowski

Keyword(s):

DNA Sequence ◽

Sequence Alignment ◽

Livestock Species ◽

DNA Sequence Alignment

Evaluation of the microbial community structure of potable water samples from occupied and unoccupied buildings using 16S rRNA amplicon sequencing

10.1101/2020.07.17.209346 ◽

2020 ◽

Author(s):

Kimothy L Smith ◽

Howard A Shuman ◽

Douglas Findeisen

Keyword(s):

Microbial Community ◽

16S rRNA ◽

Water Samples ◽

Ad Hoc ◽

Microbial Community Composition ◽

Amplicon Sequencing ◽

The Other ◽

Water Usage ◽

16S rRNA Amplicon Sequencing ◽

Oxford Nanopore

Abstract: We conducted two studies of water samples from buildings with normal occupancy and water usage compared to water from buildings that were unoccupied with little or no water usage due to the COVID-19 shutdown. Study 1 had 52 water samples obtained ad hoc from buildings in four metropolitan locations in different states in the US and a range of building types. Study 2 had 36 water samples obtained from two buildings in one metropolitan location with matched water sample types. One of the buildings had been continuously occupied, and the other substantially vacant for approximately 3 months. All water samples were analyzed using 16S rRNA amplicon sequencing with a MinION from Oxford Nanopore Technologies. More than 127 genera of bacteria were identified, including genera with members that are known to include more than 50 putative frank and opportunistic pathogens. While specific results varied among sample locations, 16S rRNA amplicon abundance and the diversity of bacteria were higher in water samples from unoccupied buildings than in those from normally occupied buildings, as was the abundance of sequenced amplicons of genera known to include pathogenic bacterial members. In both studies, Legionella amplicon abundance was relatively small compared to the abundance of the other bacteria in the samples. Indeed, when present, the relative abundance of Legionella amplicons was lower in samples from unoccupied buildings. Legionella did not predominate in any of the water samples and were found, on average, in 9.6% of samples in Study 1 and 8.3% of samples in Study 2. Synopsis: Comparison of microbial community composition in the plumbing of occupied and unoccupied buildings during the COVID-19 pandemic shutdown.

A Binary Integer Programming model for computing DNA Sequence Alignment

AL-Rafidain Journal of Computer Sciences and Mathematics ◽

10.33899/csmj.2010.163847 ◽

2010 ◽

Vol 7 (1) ◽

pp. 59-80

Author(s):

Nawar Qubat

Keyword(s):

Integer Programming ◽

DNA Sequence ◽

Sequence Alignment ◽

Programming Model ◽

Binary Integer Programming ◽

Integer Programming Model ◽

DNA Sequence Alignment