mock communities
Recently Published Documents


Total documents: 86 (five years: 49)
H-index: 14 (five years: 4)

2021 ◽  
Vol 12 ◽  
Author(s):  
Changwoo Park ◽  
Seung Bum Kim ◽  
Sang Ho Choi ◽  
Seil Kim

Microbial community analysis based on the 16S rRNA gene is used to investigate both beneficial and harmful microorganisms in various fields and environments. Recently, next-generation sequencing (NGS) technology has enabled rapid and accurate microbial community analysis. Despite these advantages of NGS-based metagenomics, sample transport, storage conditions, amplification, library preparation kits, sequencing, and bioinformatics procedures can bias microbial community analysis results. In this study, eight mock communities were pooled from genomic DNA of Lactobacillus acidophilus KCTC 3164T, Limosilactobacillus fermentum KCTC 3112T, Lactobacillus gasseri KCTC 3163T, Lacticaseibacillus paracasei subsp. paracasei KCTC 3510T, Limosilactobacillus reuteri KCTC 3594T, Lactococcus lactis subsp. lactis KCTC 3769T, Bifidobacterium animalis subsp. lactis KCTC 5854T, and Bifidobacterium breve KCTC 3220T. The genomic DNAs were quantified by droplet digital PCR (ddPCR) and mixed to form the mock communities. The mock communities were amplified with various 16S rRNA gene universal primer pairs and sequenced on the MiSeq, IonTorrent, MGIseq-2000, Sequel II, and MinION NGS platforms. In a comparison of primer-dependent bias, the microbial profiles of the V1-V2 and V3 regions were similar to the original ratio of the mock communities, while the microbial profiles of the V1-V3 region were relatively biased. In a comparison of platform-dependent bias, the sequence reads from short-read platforms (MiSeq, IonTorrent, and MGIseq-2000) showed lower bias than those from long-read platforms (Sequel II and MinION). Meanwhile, the sequence reads from the Sequel II and MinION platforms were relatively biased in some mock communities. Across all NGS platforms and regions, L. acidophilus was greatly underrepresented while Lactococcus lactis subsp. lactis was generally overrepresented. For all samples in this study, the bias index (BI) was calculated and PCA was performed for comparison. The samples with biased relative abundance showed high BI values and were separated in the PCA results. In particular, analysis of regions rich in AT and GC poses problems for genome assembly, which can lead to sequencing bias. Based on this comparative analysis, the development of reference material (RM) is proposed to calibrate the bias in microbiome analysis.
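The abstract does not define the bias index (BI), so the following is not the authors' metric. As a rough sketch of how deviation from a known mock composition might be quantified, it computes a per-taxon log2 ratio of observed to expected relative abundance. The function name `log2_fold_bias`, the taxon labels, and the read counts are all made up for illustration.

```python
import math


def log2_fold_bias(observed_counts, expected_fractions):
    """Per-taxon log2(observed fraction / expected fraction).

    Illustrative only: this is NOT the bias index (BI) from the study.
    Expected fractions are assumed to be positive (known mock design).
    """
    obs_total = sum(observed_counts.values())
    exp_total = sum(expected_fractions.values())
    bias = {}
    for taxon, expected in expected_fractions.items():
        obs_frac = observed_counts.get(taxon, 0) / obs_total
        exp_frac = expected / exp_total
        # A taxon that was expected but never observed gets -inf.
        bias[taxon] = math.log2(obs_frac / exp_frac) if obs_frac > 0 else float("-inf")
    return bias


# Hypothetical two-member mock community mixed 1:1 (made-up read counts).
expected = {"taxon_A": 0.5, "taxon_B": 0.5}
observed = {"taxon_A": 250, "taxon_B": 750}
bias = log2_fold_bias(observed, expected)
for taxon, value in bias.items():
    print(f"{taxon}: {value:+.2f}")
```

Negative values mark underrepresented taxa and positive values mark overrepresented ones; a value of zero means the observed share matches the design.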


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Daniel P. Dacey ◽  
Frédéric J. J. Chain

Background: Taxonomic classification of genetic markers for microbiome analysis is affected by the numerous choices made from sample preparation to bioinformatics analysis. Paired-end read merging is routinely used to capture the entire amplicon sequence when the read ends overlap. However, the exclusion of unmerged reads from further analysis can result in underestimating the diversity in the sequenced microbial community and is influenced by bioinformatic processes such as read trimming and the choice of reference database. A potential solution to overcome this is to concatenate (join) reads that do not overlap and keep them for taxonomic classification. The use of concatenated reads can outperform taxonomic recovery from single-end reads, but it remains unclear how their performance compares to merged reads. Using various sequenced mock communities with different amplicons, read length, read depth, taxonomic composition, and sequence quality, we tested how merging and concatenating reads performed for genus recall and precision in bioinformatic pipelines combining different parameters for read trimming and taxonomic classification using different reference databases. Results: The addition of concatenated reads to merged reads always increased pipeline performance. The top two performing pipelines both included read concatenation, with variable strengths depending on the mock community. The pipeline that combined merged and concatenated reads that were quality-trimmed performed best for mock communities with larger amplicons and higher average quality sequences. The pipeline that used length-trimmed concatenated reads outperformed quality trimming in mock communities with lower quality sequences but lost a significant amount of input sequences for taxonomic classification during processing. Genus-level classification was more accurate using the SILVA reference database than using Greengenes. Conclusions: Adding the concatenated sequences that could not be merged to the merged sequences increased the performance of taxonomic classification. This was especially beneficial in mock communities with larger amplicons. We have shown for the first time, using an in-depth comparison of pipelines containing merged vs concatenated reads combined with different trimming parameters and reference databases, the potential advantages of concatenating sequences in improving resolution in microbiome investigations.
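The difference between merging and concatenating a read pair can be sketched in a few lines. The example below is a simplified illustration, not the pipelines evaluated in the study: it requires an exact-match overlap (real mergers tolerate mismatches and use quality scores), and the function names, the `min_overlap` default, and the `N` spacer are assumptions chosen for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def merge_or_concatenate(forward, reverse, min_overlap=10, spacer="NNNNNNNN"):
    """Merge a read pair when the ends overlap exactly, otherwise join them.

    Returns (sequence, status), where status is "merged" or "concatenated".
    Assumption: exact-match overlap only; no mismatch tolerance or qualities.
    """
    reverse_rc = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_rc))
    # Try the longest candidate overlap first.
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == reverse_rc[:k]:
            return forward + reverse_rc[k:], "merged"
    # No usable overlap: keep both reads, separated by a spacer.
    return forward + spacer + reverse_rc, "concatenated"


# Toy 30 bp amplicon (made-up sequence).
amplicon = "ACGTTGCAAGGCTTAGCCATGGATCCGTAA"

# Reads of 20 and 22 bp overlap by 12 bp and can be merged.
merged_seq, merged_status = merge_or_concatenate(
    amplicon[:20], reverse_complement(amplicon[8:])
)

# Reads of 10 bp each leave a gap in the middle and are concatenated.
joined_seq, joined_status = merge_or_concatenate(
    amplicon[:10], reverse_complement(amplicon[20:])
)
print(merged_status, merged_seq)
print(joined_status, joined_seq)
```

In the first case the full amplicon is recovered; in the second, the two read ends are retained for classification even though the middle of the amplicon was never sequenced.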


2021 ◽  
Vol 5 ◽  
Author(s):  
Andreas Kolter ◽  
Birgit Gemeinholzer

The unprecedented ongoing biodiversity decline necessitates scalable means of monitoring in order to fully understand the underlying causes. DNA metabarcoding has the potential to provide a powerful tool for accurate and rapid biodiversity monitoring. Unfortunately, in many cases, a lack of universal standards undermines the widespread application of metabarcoding. One of the most important considerations in metabarcoding of plants, aside from selecting a potent barcode marker, is primer choice. Our study evaluates published ITS primers in silico and in vitro using mock communities, and presents newly designed primers. We were able to show that a large proportion of previously available ITS primers have unfavourable attributes. Our combined results support the recommendation of the introduced primers ITS-3p62plF1 and ITS-4unR1 as the best current universal plant-specific ITS2 primer combination. We also found that PCR optimisation, such as the addition of 5% DMSO, is essential to obtain meaningful results in ITS2 metabarcoding. Finally, we conclude that continuous quality assurance is indispensable for reliable metabarcoding results.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Charlotte Marie Ahle ◽  
Kristian Stødkilde-Jørgensen ◽  
Anja Poehlein ◽  
Wolfgang R. Streit ◽  
Jennifer Hüpeden ◽  
...  

Background: Staphylococci are important members of the human skin microbiome. Many staphylococcal species and strains are commensals of the healthy skin microbiota, while few play essential roles in skin diseases such as atopic dermatitis. To study the involvement of staphylococci in health and disease, it is essential to determine staphylococcal populations in skin samples beyond the genus and species level. Culture-independent approaches such as amplicon next-generation sequencing (NGS) are time- and cost-effective options. However, their suitability depends on the power of resolution. Results: Here we compare three amplicon NGS schemes that rely on different targets within the genes tuf and rpsK, designated tuf1, tuf2 and rpsK schemes. The schemes were tested on mock communities and on human skin samples. To obtain skin samples and build mock communities, skin swab samples of healthy volunteers were taken. In total, 254 staphylococcal strains were isolated and identified to the species level by MALDI-TOF mass spectrometry. A subset of ten strains belonging to different staphylococcal species were genome-sequenced. Two mock communities with nine and eighteen strains, respectively, as well as eight randomly selected skin samples were analysed with the three amplicon NGS methods. Our results imply that all three methods are suitable for species-level determination of staphylococcal populations. However, the novel tuf2-NGS scheme was superior in resolution power. It allowed unambiguous identification of Staphylococcus saccharolyticus and distinguished phylogenetically distinct clusters of Staphylococcus epidermidis. Conclusions: Powerful amplicon NGS approaches for the detection and relative quantification of staphylococci in human samples exist that can resolve populations to the species and, to some extent, to the subspecies level. Our study highlights strengths, weaknesses and pitfalls of three currently available amplicon NGS approaches to determine staphylococcal populations. Applied to the analysis of healthy and diseased skin, these approaches can be useful to attribute host-beneficial and -detrimental roles to skin-resident staphylococcal species and subspecies.


2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Sandra Reitmeier ◽  
Thomas C. A. Hitch ◽  
Nicole Treichel ◽  
Nikolaos Fikas ◽  
Bela Hausmann ◽  
...  

16S rRNA gene amplicon sequencing is a popular approach for studying microbiomes. However, some basic concepts have still not been investigated comprehensively. We studied the occurrence of spurious sequences using defined microbial communities based on data either from the literature or generated in three sequencing facilities and analyzed via both operational taxonomic unit (OTU) and amplicon sequence variant (ASV) approaches. OTU clustering and singleton removal, a commonly used approach, delivered approximately 50% (mock communities) to 80% (gnotobiotic mice) spurious taxa. The fraction of spurious taxa was generally lower based on ASV analysis, but varied depending on the gene region targeted and the barcoding system used. A relative abundance of 0.25% was found to be an effective threshold below which the analysis of spurious taxa can be prevented to a large extent in both OTU- and ASV-based analysis approaches. Using this cutoff improved the reproducibility of analysis, i.e., variation in richness estimates was reduced by 38% compared with singleton filtering using six human fecal samples across seven sequencing runs. Beta-diversity analysis of human fecal communities was markedly affected by both the filtering strategy and the type of phylogenetic distances used for comparison, highlighting the importance of carefully analyzing data before drawing conclusions on microbiome changes. In summary, handling of artifact sequences during bioinformatic processing of 16S rRNA gene amplicon data requires careful attention to avoid the generation of misleading findings. We propose the concept of effective richness to facilitate the comparison of alpha-diversity across studies.
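A relative-abundance cutoff like the 0.25% threshold described above could be applied per sample roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `filter_by_relative_abundance`, the choice to keep taxa exactly at the threshold, and the read counts are assumptions made for illustration.

```python
def filter_by_relative_abundance(counts, threshold=0.0025):
    """Keep taxa whose relative abundance in one sample is >= threshold.

    counts: dict mapping a taxon/OTU/ASV identifier to its read count.
    threshold: fraction of total reads (0.0025 corresponds to 0.25%).
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    return {taxon: n for taxon, n in counts.items() if n / total >= threshold}


# Made-up sample: ASV_3 holds 20 of 10,000 reads (0.2%) and is dropped.
sample = {"ASV_1": 9000, "ASV_2": 980, "ASV_3": 20}
filtered = filter_by_relative_abundance(sample)
print(filtered)
```

Unlike singleton removal, the cutoff scales with sequencing depth, so the same low-abundance taxon is treated consistently across runs of different size.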


2021 ◽  
Author(s):  
Thomas H.A. Haverkamp ◽  
Bjørn Spilsberg ◽  
Gro Skøien H.A. Johannessen ◽  
Mona Torp ◽  
Camilla Sekse

Background: Foodborne pathogens such as Campylobacter jejuni are responsible for a large fraction of the gastrointestinal infections worldwide associated with poultry meat. Campylobacter spp. can be found in the chicken fecal microbiome and can contaminate poultry meat during the slaughter process. The current standard methods to detect these pathogens at poultry farms use fecal droppings or boot swabs in combination with cultivation / PCR. In this study, we used air filters in combination with shotgun metagenomics for the detection of Campylobacter in poultry houses, and mock communities to test the applicability of this approach for the detection of foodborne pathogens. Results: The spiked mock communities showed that we could detect as little as 200 CFU Campylobacter per sample using our protocols. Since we were interested in detecting Campylobacter, a DNA extraction protocol for Gram-negative bacteria was chosen, and as expected, we found that the DNA extraction protocol created a substantial bias affecting the community composition of the mock communities. It can be expected that the same bias is present in the poultry house samples analyzed. We observed significant amounts of Campylobacter on the air filters using both real-time PCR and shotgun metagenomics, irrespective of the amount of spiked-in Campylobacter cells, suggesting that the flocks in both houses harboured Campylobacter spp. Interestingly, in both houses we found diverse microbial communities present in the indoor air. In addition, we tested the Campylobacter detection rate using shotgun metagenomics by spiking with different levels of C. jejuni cells in both the mock and the house samples. This showed that even with limited sequencing, Campylobacter is detectable in samples with low abundance. Conclusions: These results show that air sampling of poultry houses in combination with shotgun metagenomics can detect and identify Campylobacter spp. present at low levels. This is important since early detection of Campylobacter in food production can help to decrease the number of food-borne infections.


Author(s):  
Yi‐Chun Yeh ◽  
Jesse C. McNichol ◽  
David M. Needham ◽  
Erin B. Fichot ◽  
Lyria Berdjeb ◽  
...  

2021 ◽  
Vol 22 (S10) ◽  
Author(s):  
Zhenmiao Zhang ◽  
Lu Zhang

Background: Due to the complexity of microbial communities, de novo assembly of next-generation sequencing data is commonly unable to produce complete microbial genomes. Metagenome assembly binning has become an essential step that groups the fragmented contigs into clusters representing microbial genomes, based on the contigs’ nucleotide compositions and read depths. These features work well on long contigs but are not stable for short ones. Contigs can be linked by sequence overlap (assembly graph) or by the paired-end reads aligned to them (PE graph), and linked contigs have a high chance of being derived from the same cluster. Results: We developed METAMVGL, a multi-view graph-based metagenomic contig binning algorithm that integrates both assembly and PE graphs. It could strikingly rescue the short contigs and correct the binning errors from dead ends. METAMVGL learns the two graphs’ weights automatically and predicts the contig labels in a uniform multi-view label propagation framework. In experiments, we observed that METAMVGL made use of significantly more high-confidence edges from the combined graph and linked dead ends to the main graph. It also outperformed many state-of-the-art contig binning algorithms, including MaxBin2, MetaBAT2, MyCC, CONCOCT, SolidBin and GraphBin, on metagenomic sequencing data from simulation, two mock communities and Sharon infant fecal samples. Conclusions: Our findings demonstrate that METAMVGL outstandingly improves short contig binning and outperforms the other existing contig binning tools on metagenomic sequencing data from simulation, mock communities and infant fecal samples.
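The general idea of propagating bin labels over two combined contig graphs can be illustrated with a toy example. This is not METAMVGL: the graph weights here are fixed by hand rather than learned, propagation is a simple weighted neighbor vote, and every name (`propagate_labels`, `w_assembly`, `w_pe`, the contig and bin identifiers) is hypothetical.

```python
from collections import defaultdict


def propagate_labels(assembly_edges, pe_edges, seed_labels,
                     w_assembly=0.5, w_pe=0.5, max_iter=10):
    """Spread bin labels from labeled contigs to unlabeled neighbors.

    Edges from both graphs are combined; an edge present in both graphs
    gets the sum of the two weights. Seed labels are never overwritten.
    """
    weights = defaultdict(float)
    neighbors = defaultdict(set)
    for edges, w in ((assembly_edges, w_assembly), (pe_edges, w_pe)):
        for a, b in edges:
            weights[frozenset((a, b))] += w
            neighbors[a].add(b)
            neighbors[b].add(a)

    labels = dict(seed_labels)
    for _ in range(max_iter):
        updates = {}
        for contig in neighbors:
            if contig in labels:
                continue
            votes = defaultdict(float)
            for nb in neighbors[contig]:
                if nb in labels:
                    votes[labels[nb]] += weights[frozenset((contig, nb))]
            if votes:
                # Ties are broken alphabetically to stay deterministic.
                updates[contig] = max(sorted(votes), key=votes.get)
        if not updates:
            break
        labels.update(updates)
    return labels


# Toy graphs: c2-c3 is supported by both overlap and paired-end links.
assembly_edges = [("c1", "c2"), ("c2", "c3")]
pe_edges = [("c2", "c3"), ("c3", "c4"), ("c4", "c5")]
seeds = {"c1": "binA", "c5": "binB"}
result = propagate_labels(assembly_edges, pe_edges, seeds)
print(result)
```

Here contig c3 sits between the two bins; the doubly supported link to c2 outweighs the single paired-end link to c4, so c3 joins binA.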


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11120
Author(s):  
Gilda Varliero ◽  
Jared Wray ◽  
Cédric Malandain ◽  
Gary Barker

Many environmental and biomedical biomonitoring and detection studies aim to explore the presence of specific organisms or gene functionalities in microbiome samples. In such cases, when the study hypotheses can be answered with the exploration of a small number of genes, a targeted PCR approach is appropriate. However, due to the complexity of environmental microbial communities, the design of specific primers is challenging and can lead to non-specific results. We designed PhyloPrimer, the first user-friendly platform to semi-automate the design of taxon-specific oligos (i.e., PCR primers) for a gene of interest. The main strength of PhyloPrimer is the ability to retrieve and align GenBank gene sequences matching the user’s input, and to explore their relationships through an online dynamic tree. PhyloPrimer then designs oligos specific to the gene sequences selected from the tree and uses the sequences not selected from the tree to look for and maximize oligo differences between targeted and non-targeted sequences, thereby increasing oligo taxon-specificity (positive/negative consensus approach). Designed oligos are then checked for secondary structure with the nearest-neighbor (NN) calculation and for off-target matches with in silico PCR tests, which also process oligos with degenerate bases. Whilst the main function of PhyloPrimer is the design of taxon-specific oligos (down to the species level), the software can also be used for designing oligos to target a gene without any taxonomic specificity, for designing oligos from preselected sequences and for checking predesigned oligos. We validated the pipeline on four commercially available microbial mock communities using PhyloPrimer to design genus- and species-specific primers for the detection of Streptococcus species in the mock communities. The software performed well on these mock microbial communities and can be found at https://www.cerealsdb.uk.net/cerealgenomics/phyloprimer.


2021 ◽  
Author(s):  
Natalia García-García ◽  
Javier Tamames ◽  
Fernando Puente-Sánchez

Motivation: Advances in sequencing technologies have triggered the development of many bioinformatic tools aimed at analyzing these data. As these tools need to be tested, it is important to simulate datasets that resemble realistic conditions. Although there is a large amount of software dedicated to producing reads from in silico microbial communities, the simulated data often diverge widely from real situations. Results: Here, we introduce M&Ms, a user-friendly open-source bioinformatic tool to produce realistic amplicon datasets from reference sequences, based on pragmatic ecological parameters. This tool creates sequence libraries for in silico microbial communities with user-controlled richness, evenness, microdiversity, and source environment. M&Ms allows the user to generate simple to complex read datasets based on real parameters that can be used in developing bioinformatic software or in benchmarking current tools. M&Ms also provides additional figures and files with extensive details on how each synthetic community is composed, so that users can make informed choices when designing their benchmarking pipelines. Availability: The source code of M&Ms is freely available from https://github.com/ggnatalia/MMs

