COMPARISON OF WHOLE GENOME SEQUENCE-BASED METHODS AND PCR RIBOTYPING FOR SUBTYPING OF Clostridioides difficile

Clostridioides difficile is the most common cause of antibiotic-associated gastrointestinal infections. Capillary-electrophoresis (CE)-PCR ribotyping is currently the gold standard for C. difficile typing but lacks the discriminatory power to study transmission and outbreaks in detail. Newer molecular methods can differentiate strains better and provide standardized data that can be exchanged between laboratories. Using a well-characterized collection of diverse strains (N=630; 100 unique ribotypes (RTs)), we compared the discriminatory power of core genome multilocus sequence typing (cgMLST) (SeqSphere & EnteroBase), whole genome MLST (wgMLST) (EnteroBase) and single nucleotide polymorphism (SNP) analysis. A unique cgMLST profile (>6 allele differences) was observed in 82/100 RTs, indicating that cgMLST could distinguish most, but not all, RTs. Application of cgMLST in two outbreak settings with RT078 and RT181 (both known to have a low intra-RT allele difference) showed no distinction between outbreak and non-outbreak strains, in contrast to wgMLST and SNP analysis. We conclude that cgMLST has the potential to be an alternative to CE-PCR ribotyping. The method is reproducible, easy to standardize and offers higher discrimination. However, adjusted cut-off thresholds and epidemiological data are necessary to recognize outbreaks of some specific RTs. We propose using an allelic threshold of 3 alleles to identify outbreaks.
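
The effect of lowering the allelic threshold can be illustrated with a minimal sketch. It assumes cgMLST profiles are available as equal-length lists of allele numbers, with None marking loci that were not called. The function names, the toy profiles and the choice to skip missing loci pairwise are illustrative assumptions, not the procedure implemented in SeqSphere or EnteroBase.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci whose allele numbers differ, skipping loci missing in either profile."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def linked_pairs(profiles, threshold):
    """Return pairs of isolate names whose allele distance is <= threshold."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(sorted(profiles.items()), 2):
        if allele_distance(prof_a, prof_b) <= threshold:
            pairs.append((name_a, name_b))
    return pairs


# Toy profiles over 10 hypothetical loci (not real data).
profiles = {
    "iso1": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 1, 1, 1, 1, 2, 2],
    "iso3": [1, 1, 1, 1, 1, 2, 2, 2, 2, None],
}

print(linked_pairs(profiles, threshold=6))  # all three pairs are linked
print(linked_pairs(profiles, threshold=3))  # the iso1/iso3 pair (4 differences) drops out
```

With these toy profiles, a threshold of 6 links every pair, whereas a threshold of 3 no longer links iso1 and iso3. How missing loci are handled affects the counts, so any real comparison should follow the conventions of the typing scheme in use.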

Performance of Core Genome Multilocus Sequence Typing Compared to Capillary-Electrophoresis PCR Ribotyping and SNP Analysis of Clostridioides difficile

10.1101/2021.08.10.455895 ◽

2021 ◽

Author(s):

Amoe Baktash ◽

Jeroen Corver ◽

Celine Harmanus ◽

Wiep Klaas Smits ◽

Warren N. Fawley ◽

...

Keyword(s):

Capillary Electrophoresis ◽

Multilocus Sequence Typing ◽

Core Genome ◽

Discriminatory Power ◽

SNP Analysis ◽

Gastrointestinal Infections ◽

Clostridioides Difficile ◽

Backward Compatibility ◽

PCR Ribotyping ◽

Allele Difference

Clostridioides difficile is the most common cause of antibiotic-associated gastrointestinal infections. Capillary-electrophoresis (CE)-PCR ribotyping is currently the gold standard for C. difficile typing but lacks the discriminatory power to study transmission and outbreaks in detail. New molecular methods have the capacity to differentiate better, but backward compatibility with CE-PCR ribotyping must be assessed. Using a well-characterized collection of diverse strains (N=630; 100 unique ribotypes [RTs]), we aimed to investigate PCR ribotype prediction from core genome multilocus sequence typing (cgMLST). Additionally, we compared the discriminatory power of cgMLST (SeqSphere & EnteroBase) and whole genome MLST (wgMLST) (EnteroBase) with single nucleotide polymorphism (SNP) analysis. A unique cgMLST profile (>6 allele differences) was observed in 82/100 ribotypes, indicating sufficient backward compatibility. Intra-RT allele difference varied per ribotype and MLST clade. Application of cgMLST in two outbreak settings with ribotypes RT078 and RT181 (both known to have a low intra-ribotype allele difference) showed no distinction between outbreak and non-outbreak strains, in contrast to wgMLST and SNP analysis. We conclude that cgMLST has the potential to be an alternative to CE-PCR ribotyping. The method is reproducible, easy to standardize and offers higher discrimination. However, in some ribotype complexes adjusted cut-off thresholds and epidemiological data are necessary to recognize outbreaks. We propose to decrease the current threshold from 6 to 3 alleles to better identify outbreaks.

Genome typing and epidemiology of human listeriosis in New Zealand 1999-2018

Journal of Clinical Microbiology ◽

10.1128/jcm.00849-21 ◽

2021 ◽

Author(s):

Lucia Rivas ◽

Shevaun Paine ◽

Pierre-Yves Dupont ◽

Audrey Tiong ◽

Beverley Horn ◽

...

Keyword(s):

Genetic Diversity ◽

New Zealand ◽

Epidemiological Data ◽

Common Source ◽

SNP Analysis ◽

Whole Genome ◽

Single Nucleotide ◽

Low Genetic Diversity ◽

Geographical Regions ◽

Sequence Types

This study describes the epidemiology of listeriosis in New Zealand (NZ) between 1999 and 2018, as well as the retrospective whole genome sequencing (WGS) of 453 Listeria monocytogenes isolates corresponding to 95% of the human cases within this period. The average notified rate of listeriosis was 0.5 cases per 100,000 population, and non-pregnancy-associated cases were more prevalent than pregnancy-associated cases (average 19 and 5 cases per annum, respectively). WGS data were analyzed using multilocus sequence typing (MLST), including core-genome and whole-genome MLST (cgMLST and wgMLST), and single-nucleotide polymorphism (SNP) analysis. Thirty-nine sequence types (STs) were identified, the most common being ST1 (21.9%), ST4 (13.2%), ST2 (11.3%), ST120 (6.1%) and ST155 (6.4%). A total of 291 different cgMLST types were identified, the majority of which (n = 243) were represented by a single isolate, consistent with the observation that listeriosis is predominantly sporadic. Amongst the 49 cgMLST types containing two or more isolates, 18 cgMLST types contained 2-4 isolates (50 isolates in total, including three outbreak-associated isolates) that shared low genetic diversity (0-2 whole-genome alleles), some of which were dispersed in time or across geographical regions. SNP analysis produced results comparable to wgMLST. The low genetic diversity within these clusters suggests a potential common source, but incomplete epidemiological data impaired retrospective epidemiological investigations. Prospective use of WGS analysis, together with thorough exposure information from cases, could identify future outbreaks more rapidly, possibly including those that have gone undetected for some time across different geographical regions.

Epidemic Clostridioides difficile Ribotype 027 Lineages: Comparisons of Texas Versus Worldwide Strains

Open Forum Infectious Diseases ◽

10.1093/ofid/ofz013 ◽

2019 ◽

Vol 6 (2) ◽

Cited By ~ 7

Author(s):

Bradley T Endres ◽

Khurshida Begum ◽

Hua Sun ◽

Seth T Walk ◽

Ali Memariani ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Phylogenetic Trees ◽

Large Scale ◽

The United States ◽

Whole Genome Sequence ◽

SNP Analysis ◽

Whole Genome ◽

Ribotype 027 ◽

Clostridioides Difficile

Background: The epidemic Clostridioides difficile ribotype 027 strain resulted from the dissemination of 2 separate fluoroquinolone-resistant lineages: FQR1 and FQR2. Both lineages were reported to originate in North America; however, confirmatory large-scale investigations of C difficile ribotype 027 epidemiology using whole genome sequencing have not been undertaken in the United States. Methods: Whole genome sequencing and single-nucleotide polymorphism (SNP) analysis were performed on 76 clinical ribotype 027 isolates obtained from hospitalized patients in Texas with C difficile infection and compared with 32 previously sequenced worldwide strains. Maximum-likelihood phylogeny based on a set of core genome SNPs was used to construct phylogenetic trees investigating strain macro- and microevolution. Bayesian phylogenetic and phylogeographic analyses were used to incorporate temporal and geographic variables with the SNP strain analysis. Results: Whole genome sequence analysis identified 2841 SNPs, including 900 nonsynonymous mutations, 1404 synonymous substitutions, and 537 intergenic changes. Phylogenetic analysis separated the strains into 2 prominent groups, which grossly differed by 28 SNPs: the FQR1 and FQR2 lineages. Five isolates were identified as pre-epidemic strains. Phylogeny demonstrated unique clustering and resistance genes in Texas strains, indicating that spatiotemporal bias has defined the microevolution of ribotype 027 genetics. Conclusions: Clostridioides difficile ribotype 027 lineages emerged earlier than previously reported, coinciding with increased use of fluoroquinolones. Both FQR1 and FQR2 ribotype 027 epidemic lineages are present in Texas, but they have evolved geographically to represent region-specific public health threats.

Whole Genome Sequence Analysis of Brucella abortus Isolates from Various Regions of South Africa

Microorganisms ◽

10.3390/microorganisms9030570 ◽

2021 ◽

Vol 9 (3) ◽

pp. 570

Author(s):

Maphuti Betty Ledwaba ◽

Barbara Akorfa Glover ◽

Itumeleng Matle ◽

Giuseppe Profiti ◽

Pier Luigi Martelli ◽

...

Keyword(s):

South Africa ◽

Single Nucleotide Polymorphisms ◽

South African ◽

Genome Sequence ◽

Brucella Abortus ◽

Whole Genome Sequence ◽

SNP Analysis ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Single Nucleotide

The availability of whole genome sequences in public databases permits genome-wide comparative studies of various bacterial species. Whole genome sequence-single nucleotide polymorphism (WGS-SNP) analysis has been used in recent studies and allows the discrimination of various Brucella species and strains. In the present study, 13 Brucella spp. strains from cattle of various locations in provinces of South Africa were typed and discriminated. WGS-SNP analysis indicated a maximum pairwise distance ranging from 4 to 77 single nucleotide polymorphisms (SNPs) between the South African Brucella abortus virulent field strains. Moreover, the South African B. abortus strains grouped closely with B. abortus strains from Mozambique and Zimbabwe, as well as with strains from Eurasian countries such as Portugal and India. WGS-SNP analysis of South African B. abortus strains demonstrated that the same genotype circulated on one farm (Farm 1), whereas another farm (Farm 2) in the same province had two different genotypes. This indicates that brucellosis in South Africa spreads within the herd on some farms, whereas the introduction of infected animals is the mode of transmission on other farms. Three B. abortus vaccine S19 strains isolated from tissue and aborted material were identical, even though they originated from different herds and regions of South Africa. This might be due to the incorrect vaccination of animals older than the recommended age of 4–8 months or might be a problem associated with vaccine production.

Whole genome sequencing of Clostridioides difficile PCR ribotype 046 suggests transmission between pigs and humans

PLoS ONE ◽

10.1371/journal.pone.0244227 ◽

2020 ◽

Vol 15 (12) ◽

pp. e0244227

Author(s):

Anders Werner ◽

Paula Mölling ◽

Anna Fagerström ◽

Fredrik Dyrkell ◽

Dimitrios Arnellos ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

General Pattern ◽

SNP Analysis ◽

Whole Genome ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Neonatal Pigs ◽

Hospital Outbreak ◽

Clostridioides Difficile

Background: A zoonotic association has been suggested for several PCR ribotypes (RTs) of Clostridioides difficile. In central parts of Sweden, RT046 was found to be dominant in neonatal pigs at the same time as an RT046 hospital C. difficile infection (CDI) outbreak occurred in the southern parts of the country. Objective: To detect possible transmission of RT046 between pig farms and human CDI cases in Sweden and investigate the diversity of RT046 in the pig population using whole genome sequencing (WGS). Methods: WGS was performed on 47 C. difficile isolates from pigs (n = 22), the farm environment (n = 7) and human cases of CDI (n = 18). Two different core genome multilocus sequence typing (cgMLST) schemes were used together with a single nucleotide polymorphism (SNP) analysis, and the results were related to the time and location of isolation. Results: The pig isolates were closely related (≤6 cgMLST alleles differing in both cgMLST schemes) and conserved over time, and were clearly separated from isolates from the human hospital outbreak (≥76 and ≥90 cgMLST alleles differing in the two cgMLST schemes). However, two human isolates were closely related to the pig isolates, suggesting possible transmission. The SNP analysis was not more discriminatory than cgMLST. Conclusion: No general pattern suggesting zoonotic transmission was apparent between pigs and humans, although contrasting results from two isolates still make transmission possible. Our results support the need for high-resolution WGS typing when investigating hospital and environmental transmission of C. difficile.

Whole-Genome Single-Nucleotide-Polymorphism Analysis for Discrimination of Clostridium botulinum Group I Strains

Applied and Environmental Microbiology ◽

10.1128/aem.03934-13 ◽

2014 ◽

Vol 80 (7) ◽

pp. 2125-2132 ◽

Cited By ~ 23

Author(s):

Narjol Gonzalez-Escalona ◽

Ruth Timme ◽

Brian H. Raphael ◽

Donald Zink ◽

Shashi K. Sharma

Keyword(s):

Single Nucleotide Polymorphism ◽

Clostridium Botulinum ◽

Toxin Gene ◽

Polymorphism Analysis ◽

SNP Analysis ◽

Whole Genome ◽

Nucleotide Polymorphism ◽

Single Nucleotide ◽

Content Type ◽

Group I

Clostridium botulinum is a genetically diverse Gram-positive bacterium producing extremely potent neurotoxins (botulinum neurotoxins A through G [BoNT/A-G]). The complete genome sequences of three strains harboring only the BoNT/A1 nucleotide sequence are publicly available. Although these strains contain a toxin cluster (HA+OrfX−) associated with hemagglutinin genes, little is known about the genomes of subtype A1 strains (termed HA−OrfX+) that lack hemagglutinin genes in the toxin gene cluster. We sequenced the genomes of three BoNT/A1-producing C. botulinum strains: two strains with the HA+OrfX− cluster (69A and 32A) and one strain with the HA−OrfX+ cluster (CDC297). Whole-genome phylogenic single-nucleotide-polymorphism (SNP) analysis of these strains along with other publicly available C. botulinum group I strains revealed five distinct lineages. Strains 69A and 32A clustered with the C. botulinum type A1 Hall group, and strain CDC297 clustered with the C. botulinum type Ba4 strain 657. This study reports the use of whole-genome SNP sequence analysis for discrimination of C. botulinum group I strains and demonstrates the utility of this analysis in quickly differentiating C. botulinum strains harboring identical toxin gene subtypes. This analysis further supports previous work showing that strains CDC297 and 657 likely evolved from a common ancestor and independently acquired separate BoNT/A1 toxin gene clusters at distinct genomic locations.

Whole-Genome Sequencing Allows for Improved Identification of Persistent Listeria monocytogenes in Food-Associated Environments

Applied and Environmental Microbiology ◽

10.1128/aem.01049-15 ◽

2015 ◽

Vol 81 (17) ◽

pp. 6024-6037 ◽

Cited By ~ 76

Author(s):

Matthew J. Stasiewicz ◽

Haley F. Oliver ◽

Martin Wiedmann ◽

Henk C. den Bakker

Keyword(s):

Listeria Monocytogenes ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genetic Determinants ◽

Single Nucleotide ◽

Content Type ◽

Food Borne ◽

Clonal Spread

While the food-borne pathogen Listeria monocytogenes can persist in food-associated environments, there are no whole-genome sequence (WGS)-based methods to differentiate persistent from sporadic strains. Whole-genome sequencing of 188 isolates from a longitudinal study of L. monocytogenes in retail delis was used to (i) apply single-nucleotide polymorphism (SNP)-based phylogenetics for subtyping of L. monocytogenes, (ii) use SNP counts to differentiate persistent from repeatedly reintroduced strains, and (iii) identify genetic determinants of L. monocytogenes persistence. WGS analysis revealed three prophage regions that explained differences between three pairs of phylogenetically similar populations with pulsed-field gel electrophoresis types that differed by ≤3 bands. WGS-SNP-based phylogenetics found that putatively persistent L. monocytogenes represent SNP patterns (i) unique to a single retail deli, supporting persistence within the deli (11 clades), (ii) unique to a single state, supporting clonal spread within a state (7 clades), or (iii) spanning multiple states (5 clades). Isolates that formed one of 11 deli-specific clades differed by a median of 10 SNPs or fewer. Isolates from 12 putative persistence events had significantly fewer SNPs (median, 2 to 22 SNPs) than between isolates of the same subtype from other delis (median up to 77 SNPs), supporting persistence of the strain. In 13 events, nearly indistinguishable isolates (0 to 1 SNP) were found across multiple delis. No individual genes were enriched among persistent isolates compared to sporadic isolates. Our data show that WGS analysis improves food-borne pathogen subtyping and identification of persistent bacterial pathogens in food-associated environments.
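
Pairwise SNP counts and their median, as used above to compare isolates, can be sketched as follows. This is a minimal illustration that assumes the isolates are already available as equal-length aligned sequences; the isolate names, the toy sequences and the rule of ignoring positions that are not unambiguous A/C/G/T are assumptions made for the example, not the pipeline used in the study.

```python
from itertools import combinations
from statistics import median

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous base and differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment):
    """Map each pair of isolate names (sorted) to its SNP distance."""
    return {
        (name_a, name_b): snp_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Toy alignment with hypothetical isolate names (not real data).
alignment = {
    "deliA_1": "ACGTACGTAC",
    "deliA_2": "ACGTACGTAT",
    "deliB_1": "ACTTACGAAN",
}

distances = pairwise_snp_distances(alignment)
print(distances)
print("median SNP distance:", median(distances.values()))
```

Real analyses typically work from variant calls against a reference and apply quality filters first, so counts from a sketch like this are only comparable within one consistently processed dataset.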

Whole genome sequencing of orofacial cleft trios from the Gabriella Miller Kids First Pediatric Research Consortium identifies a new locus on chromosome 21

Human Genetics ◽

10.1007/s00439-019-02099-1 ◽

2019 ◽

Vol 139 (2) ◽

pp. 215-226 ◽

Cited By ~ 3

Author(s):

Nandita Mukhopadhyay ◽

Madison Bishop ◽

Michael Mortillo ◽

Pankaj Chopra ◽

Jacqueline B. Hetmanski ◽

...

Keyword(s):

Latin American ◽

Birth Defects ◽

Pediatric Research ◽

Chromosome 21 ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genetic Etiology ◽

Single Nucleotide ◽

Public Health Burden ◽

Kids First

Orofacial clefts (OFCs) are among the most prevalent craniofacial birth defects worldwide and create a significant public health burden. The majority of OFCs are non-syndromic, and the genetic etiology of non-syndromic OFCs is only partially determined. Here, we analyze whole genome sequence (WGS) data for association with risk of OFCs in European and Colombian families selected from a multicenter family-based OFC study. This is the first large-scale WGS study of OFC in parent–offspring trios, and a part of the Gabriella Miller Kids First Pediatric Research Program created for the study of childhood cancers and structural birth defects. WGS provides deeper and more specific genetic data than imputation on present-day single nucleotide polymorphic (SNP) marker panels. Genotypes of case–parent trios at single nucleotide variants (SNVs) and short insertions and deletions (indels) spanning the entire genome were called from their sequences using the human GRCh38 genome assembly, and analyzed for association using the transmission disequilibrium test. Among genome-wide significant associations, we identified a new locus on chromosome 21 in Colombian families, not previously observed in other larger OFC samples of Latin American ancestry. This locus is situated within a region known to be expressed during craniofacial development. Based on deeper investigation of this locus, we concluded that it contributed risk for OFCs exclusively in the Colombians. This study reinforces the ancestry differences seen in the genetic etiology of OFCs, and underscores the need for larger samples when studying OFCs and other birth defects in populations with diverse ancestry.
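
For readers unfamiliar with the transmission disequilibrium test, the classic biallelic form can be sketched in a few lines. The sketch assumes the counts of heterozygous parents who transmitted versus did not transmit the candidate allele to an affected child have already been tallied; the counts below are made up for illustration, and the abstract does not state which TDT variant or software the authors used.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic TDT: McNemar-type chi-square (1 df) on transmissions from heterozygous parents.

    transmitted:   number of heterozygous parents who passed the candidate allele on
    untransmitted: number of heterozygous parents who passed the other allele on
    Returns (chi-square statistic, p-value).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, not taken from the study.
chi2, p_value = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under the null hypothesis of no linkage or association, each heterozygous parent transmits either allele with equal probability, so a marked imbalance between the two counts is evidence against the null. A genome-wide scan would additionally require a multiple-testing threshold, which is outside the scope of this sketch.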

Whole Genome and Core Genome Multilocus Sequence Typing and Single Nucleotide Polymorphism Analyses of Listeria monocytogenes Isolates Associated with an Outbreak Linked to Cheese, United States, 2013

Applied and Environmental Microbiology ◽

10.1128/aem.00633-17 ◽

2017 ◽

Vol 83 (15) ◽

Cited By ~ 36

Author(s):

Yi Chen ◽

Yan Luo ◽

Heather Carleton ◽

Ruth Timme ◽

David Melka ◽

...

Keyword(s):

Single Nucleotide Polymorphism ◽

Clustering Analysis ◽

Multilocus Sequence Typing ◽

Core Genome ◽

Variant Calling ◽

Discriminatory Power ◽

Sufficient Evidence ◽

Whole Genome ◽

Nucleotide Polymorphism ◽

Single Nucleotide

Epidemiological findings of a listeriosis outbreak in 2013 implicated Hispanic-style cheese produced by company A, and pulsed-field gel electrophoresis (PFGE) and whole genome sequencing (WGS) were performed on clinical isolates and representative isolates collected from company A cheese and environmental samples during the investigation. The results strengthened the evidence for cheese as the vehicle. Surveillance sampling and WGS 3 months later revealed that the equipment purchased by company B from company A yielded an environmental isolate highly similar to all outbreak isolates. The results of whole genome and core genome multilocus sequence typing and single nucleotide polymorphism (SNP) analyses were compared to demonstrate the maximum discriminatory power obtained by using multiple analyses, which were needed to differentiate outbreak-associated isolates from a PFGE-indistinguishable isolate collected from a nonimplicated food source in 2012. This unrelated isolate differed from the outbreak isolates by only 7 to 14 SNPs, and as a result, the minimum spanning tree from the whole genome analyses and certain variant calling approaches and phylogenetic algorithms for core genome-based analyses could not differentiate it from the outbreak isolates. Our data also suggest that SNP/allele counts should always be combined with WGS clustering analysis generated by phylogenetically meaningful algorithms on a sufficient number of isolates, and that the SNP/allele threshold alone does not provide sufficient evidence to delineate an outbreak. The putative prophages were conserved across all the outbreak isolates. All outbreak isolates belonged to clonal complex 5 and serotype 1/2b and had an identical inlA sequence which did not have premature stop codons.

Importance: In this outbreak, multiple analytical approaches were used for maximum discriminatory power. A PFGE-matched, epidemiologically unrelated isolate had high genetic similarity to the outbreak-associated isolates, with as few as 7 SNP differences. Therefore, the SNP/allele threshold should not be used as the only evidence to define the scope of an outbreak. It is critical that the SNP/allele counts be complemented by WGS clustering analysis generated by phylogenetically meaningful algorithms to distinguish outbreak-associated isolates from epidemiologically unrelated isolates. Careful selection of a variant calling approach and phylogenetic algorithm is critical for core-genome-based analyses. The whole-genome-based analyses were able to construct the highly resolved phylogeny needed to support the findings of the outbreak investigation. Ultimately, epidemiologic evidence and multiple WGS analyses should be combined to increase confidence levels during outbreak investigations.

A multisite genomic epidemiology study of Clostridioides difficile infections in the U.S. supports differential roles of healthcare versus community spread for two common strains

10.1101/2020.11.28.20240192 ◽

2020 ◽

Author(s):

Arianna Miles-Jay ◽

Vincent B. Young ◽

Eric G. Pamer ◽

Tor C. Savidge ◽

Mini Kamboj ◽

...

Keyword(s):

Phylogenetic Analyses ◽

Clinical Care ◽

Whole Genome Sequence ◽

P Value ◽

Whole Genome ◽

Epidemiology Study ◽

Genomic Epidemiology ◽

Clostridioides Difficile ◽

Healthcare Associated ◽

The U.S

Clostridioides difficile is the leading cause of healthcare-associated infectious diarrhea. However, it is increasingly appreciated that healthcare-associated infections derive from both community and healthcare transmission, and that the primary sites of C. difficile transmission may be strain dependent. We conducted a multisite genomic epidemiology study to assess differential genomic evidence of healthcare vs. community spread for two of the most common C. difficile strains in the U.S.: sequence type (ST) 1 (associated with Ribotype 027) and ST2 (associated with Ribotype 014/020). Isolates recovered from stool specimens collected during standard clinical care at three geographically distinct U.S. medical centers between 2010 and 2018 underwent whole genome sequencing and phylogenetic analyses. ST1 and ST2 isolates both displayed some evidence of phylogenetic clustering by study site, but clustering was stronger and more apparent in ST1, consistent with our healthcare-based study more comprehensively sampling local transmission of ST1 compared to ST2 strains. Analyses of pairwise single nucleotide variant (SNV) distance distributions were also consistent with more evidence of healthcare transmission of ST1 compared to ST2, with 44% of ST1 isolates being within 2 SNVs of another isolate from the same geographic collection site compared to 5.5% of ST2 isolates (p < 0.001). Conversely, ST2 isolates were more likely to have close genetic neighbors across disparate geographic sites compared to ST1 isolates, further supporting non-healthcare routes of spread for ST2 and highlighting the potential for misattributing genomic similarity among ST2 isolates to recent healthcare transmission. Finally, we estimated a lower evolutionary rate for the ST2 lineage compared to the ST1 lineage using Bayesian timed phylogenomic analyses, and hypothesize that this may contribute to observed differences in geographic concordance among closely related isolates. Together, these findings suggest that ST1 and ST2, while both common causes of C. difficile infection in hospitals, show differential reliance on community and hospital spread. This conclusion supports the need for strain-specific criteria for interpreting genomic linkages and emphasizes the importance of considering differences in the epidemiology of circulating strains when devising interventions to reduce the burden of C. difficile infections.

Data summary: All whole genome sequence data were uploaded to the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under Bioproject accessions PRJNA595724, PRJNA561087, and PRJNA594943. Metadata that comply with patient privacy rules are included in the Supplementary Materials.
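
The same-site proximity measure reported above (the share of isolates within 2 SNVs of another isolate from the same collection site) can be sketched as below. The sketch assumes a precomputed table of pairwise SNV distances and a mapping from isolate to site; the function name, isolate labels, sites and distances are hypothetical and do not reproduce the authors' analysis.

```python
def fraction_with_close_same_site_neighbor(distances, site_of, max_snvs=2):
    """Fraction of isolates that have at least one other isolate from the same
    site within max_snvs single nucleotide variants.

    distances: dict mapping (isolate_a, isolate_b) pairs to SNV distance
    site_of:   dict mapping isolate name to collection site
    """
    if not site_of:
        raise ValueError("no isolates provided")
    close = set()
    for (iso_a, iso_b), snvs in distances.items():
        if snvs <= max_snvs and site_of[iso_a] == site_of[iso_b]:
            close.update((iso_a, iso_b))
    return len(close) / len(site_of)


# Hypothetical isolates, sites and distances (not real data).
site_of = {"i1": "siteA", "i2": "siteA", "i3": "siteB", "i4": "siteB"}
distances = {
    ("i1", "i2"): 1,   # same site, close
    ("i1", "i3"): 0,   # close, but different sites
    ("i1", "i4"): 15,
    ("i2", "i3"): 2,   # close, but different sites
    ("i2", "i4"): 14,
    ("i3", "i4"): 9,   # same site, not close
}

print(fraction_with_close_same_site_neighbor(distances, site_of))
```

In this toy data, i1 and i3 are genetically identical yet come from different sites, which is the kind of pattern the abstract describes for ST2: close genomic neighbors that would be misleading if read as evidence of recent healthcare transmission.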
