Improved Genomic Identification, Clustering, and Serotyping of Shiga Toxin-Producing Escherichia coli Using Cluster/Serotype-Specific Gene Markers

Author(s):  
Xiaomei Zhang ◽  
Michael Payne ◽  
Sandeep Kaur ◽  
Ruiting Lan

Shiga toxin-producing Escherichia coli (STEC) have more than 470 serotypes. The well-known STEC O157:H7 serotype is a leading cause of STEC infections in humans. However, the incidence of non-O157:H7 STEC serotypes associated with foodborne outbreaks and human infections has increased in recent years. Current detection and serotyping assays focus on O157 and the top six (“Big six”) non-O157 STEC serogroups. In this study, we performed phylogenetic analysis of nearly 41,000 publicly available STEC genomes representing 460 different STEC serotypes and identified 19 major and 229 minor STEC clusters. STEC cluster-specific gene markers were then identified through comparative genomic analysis. We further identified serotype-specific gene markers for the top 10 most frequent non-O157:H7 STEC serotypes. The cluster- or serotype-specific gene markers had 99.54% accuracy and more than 97.25% specificity when tested using 38,534 STEC and 14,216 non-STEC E. coli genomes, respectively. In addition, we developed a freely available in silico serotyping pipeline named STECFinder that combined these robust gene markers with established E. coli serotype-specific O and H antigen genes and stx genes for accurate identification, cluster determination and serotyping of STEC. STECFinder can assign 99.85% and 99.83% of 38,534 STEC isolates to STEC clusters using assembled genomes and Illumina reads, respectively, and can simultaneously predict stx subtypes and STEC serotypes. Using shotgun metagenomic sequencing reads of STEC-spiked food samples from a published study, we demonstrated that STECFinder can accurately detect the spiked STEC serotypes. The cluster/serotype-specific gene markers could also be adapted for culture-independent typing, facilitating rapid STEC typing. STECFinder is available as an installable package (https://github.com/LanLab/STECFinder) and will be useful for in silico STEC cluster identification and serotyping using genome data.
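
The abstract reports accuracy and specificity for the gene markers but does not spell out how these were computed. As a rough illustration only, the Python sketch below scores a single marker against labelled genomes using the common definitions sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The `MarkerCall` type, the function name, and the toy counts are all hypothetical and are not taken from the study or from STECFinder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarkerCall:
    """One genome scored against one marker (hypothetical record type)."""
    in_target_cluster: bool  # genome belongs to the cluster the marker targets
    marker_present: bool     # marker gene was detected in the genome


def sensitivity_specificity(calls):
    """Return (sensitivity, specificity) for a list of MarkerCall records."""
    tp = sum(c.in_target_cluster and c.marker_present for c in calls)
    fn = sum(c.in_target_cluster and not c.marker_present for c in calls)
    tn = sum(not c.in_target_cluster and not c.marker_present for c in calls)
    fp = sum(not c.in_target_cluster and c.marker_present for c in calls)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data, invented for illustration: 10 target genomes, 20 non-target genomes.
toy_calls = (
    [MarkerCall(True, True)] * 9
    + [MarkerCall(True, False)] * 1
    + [MarkerCall(False, False)] * 19
    + [MarkerCall(False, True)] * 1
)

sens, spec = sensitivity_specificity(toy_calls)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

On the toy data, 9 of 10 target genomes carry the marker and 19 of 20 non-target genomes lack it, so the two ratios are 0.90 and 0.95.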

2021 ◽  
Author(s):  
Xiaomei Zhang ◽  
Michael Payne ◽  
Thanh Nguyen ◽  
Sandeep Kaur ◽  
Ruiting Lan

Abstract: Shigella and enteroinvasive Escherichia coli (EIEC) cause human bacillary dysentery with similar invasion mechanisms and share similar physiological, biochemical and genetic characteristics. The ability to differentiate Shigella and EIEC from each other is important for clinical diagnostic and epidemiologic investigations. The existing genetic signatures may not discriminate between Shigella and EIEC. However, phylogenetically, Shigella and EIEC strains are composed of multiple clusters and are different forms of E. coli. In this study, we identified 10 Shigella clusters, 7 EIEC clusters and 53 sporadic types of EIEC by examining over 17,000 publicly available Shigella/EIEC genomes. We compared Shigella and EIEC accessory genomes to identify the cluster-specific gene markers or marker sets for the 17 clusters and 53 sporadic types. The gene markers showed 99.63% accuracy and more than 97.02% specificity. In addition, we developed a freely available in silico serotyping pipeline named Shigella EIEC Cluster Enhanced Serotype Finder (ShigEiFinder) by incorporating the cluster-specific gene markers and established Shigella/EIEC serotype specific O antigen genes and modification genes into typing. ShigEiFinder can process either paired end Illumina sequencing reads or assembled genomes and almost perfectly differentiated Shigella from EIEC with 99.70% and 99.81% cluster assignment accuracy for the assembled genomes and mapped reads respectively. ShigEiFinder was able to serotype over 59 Shigella serotypes and 22 EIEC serotypes and provided a high specificity with 99.40% for assembled genomes and 99.38% for mapped reads for serotyping. The cluster markers and our new serotyping tool, ShigEiFinder (https://github.com/LanLab/ShigEiFinder), will be useful for epidemiologic and diagnostic investigations.

Impact statement: The differentiation of Shigella strains from enteroinvasive E. coli (EIEC) is important for clinical diagnosis and public health epidemiologic investigations. The similarities between Shigella and EIEC strains make this differentiation very difficult as both share common ancestries within E. coli. However, Shigella and EIEC are phylogenetically separated into multiple clusters, making high resolution separation using cluster specific genomic markers possible. In this study, we identified 17 Shigella or EIEC clusters including five that were newly identified through examination of over 17,000 publicly available Shigella and EIEC genomes. We further identified an individual or a set of cluster-specific gene markers for each cluster using comparative genomic analysis. These markers can then be used to classify isolates into clusters and were used to develop an in silico pipeline, ShigEiFinder (https://github.com/LanLab/ShigEiFinder) for accurate differentiation, cluster typing and serotyping of Shigella and EIEC from Illumina sequencing reads or assembled genomes. This study will have broad application from understanding the evolution of Shigella/EIEC to diagnosis and epidemiology.

Data summary: Sequencing data have been deposited at the National Center for Biotechnology Information under BioProject number PRJNA692536.

Repositories: Raw sequence data are available from NCBI under the BioProject number PRJNA692536.


2019 ◽  
Vol 87 (10) ◽  
Author(s):  
Tracy H. Hazen ◽  
David A. Rasko

ABSTRACT Enteropathogenic Escherichia coli (EPEC) is a leading cause of moderate to severe diarrhea among young children in developing countries, and EPEC isolates can be subdivided into two groups. Typical EPEC (tEPEC) bacteria are characterized by the presence of both the locus of enterocyte effacement (LEE) and the plasmid-encoded bundle-forming pilus (BFP), which are involved in adherence and translocation of type III effectors into the host cells. Atypical EPEC (aEPEC) bacteria also contain the LEE but lack the BFP. In the current report, we describe the complete genome of outbreak-associated aEPEC isolate E110019, which carries four plasmids. Comparative genomic analysis demonstrated that the type III secreted effector EspT gene, an autotransporter gene, a hemolysin gene, and putative fimbrial genes are all carried on plasmids. Further investigation of 65 espT-containing E. coli genomes demonstrated that different espT alleles are associated with multiple plasmids that differ in their overall gene content from the E110019 espT-containing plasmid. EspT has been previously described with respect to its role in the ability of E110019 to invade host cells. While other type III secreted effectors of E. coli have been identified on insertion elements and prophages of the chromosome, we demonstrated in the current study that the espT gene is located on multiple unique plasmids. These findings highlight a role of plasmids in dissemination of a unique E. coli type III secreted effector that is involved in host invasion and severe diarrheal illness.


2008 ◽  
Vol 190 (20) ◽  
pp. 6881-6893 ◽  
Author(s):  
David A. Rasko ◽  
M. J. Rosovitz ◽  
Garry S. A. Myers ◽  
Emmanuel F. Mongodin ◽  
W. Florian Fricke ◽  
...  

ABSTRACT Whole-genome sequencing has been skewed toward bacterial pathogens as a consequence of the prioritization of medical and veterinary diseases. However, it is becoming clear that in order to accurately measure genetic variation within and between pathogenic groups, multiple isolates, as well as commensal species, must be sequenced. This study examined the pangenomic content of Escherichia coli. Six distinct E. coli pathovars can be distinguished using molecular or phenotypic markers, but only two of the six pathovars have been subjected to any genome sequencing previously. Thus, this report provides a seminal description of the genomic contents and unique features of three unsequenced pathovars, enterotoxigenic E. coli, enteropathogenic E. coli, and enteroaggregative E. coli. We also determined the first genome sequence of a human commensal E. coli isolate, E. coli HS, which will undoubtedly provide a new baseline from which workers can examine the evolution of pathogenic E. coli. Comparison of 17 E. coli genomes, 8 of which are new, resulted in identification of ∼2,200 genes conserved in all isolates. We were also able to identify genes that were isolate and pathovar specific. Fewer pathovar-specific genes were identified than anticipated, suggesting that each isolate may have independently developed virulence capabilities. Pangenome calculations indicate that E. coli genomic diversity represents an open pangenome model containing a reservoir of more than 13,000 genes, many of which may be uncharacterized but important virulence factors. This comparative study of the species E. coli, while descriptive, should provide the basis for future functional work on this important group of pathogens.
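
The distinction between genes conserved in all isolates (the core genome), the full gene reservoir (the pangenome), and isolate-specific genes can be illustrated with plain set operations. The Python sketch below is a minimal illustration under the assumption that each genome has already been reduced to a set of gene-family identifiers; the isolate names and gene labels are made up, and this is not the clustering or pangenome calculation used in the study.

```python
def core_and_pan(genomes):
    """Return (core, pan) gene sets for a dict of isolate name -> set of genes.

    core: genes present in every isolate (set intersection)
    pan:  genes present in at least one isolate (set union)
    """
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


def isolate_specific(genomes):
    """Return, for each isolate, the genes found in no other isolate."""
    result = {}
    for name, genes in genomes.items():
        others = set().union(*(g for n, g in genomes.items() if n != name))
        result[name] = genes - others
    return result


# Hypothetical toy input: three isolates, gene families labelled g1..g5.
toy_genomes = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2", "g4"},
    "isolate_C": {"g1", "g2", "g3", "g5"},
}

core, pan = core_and_pan(toy_genomes)
unique = isolate_specific(toy_genomes)
print(sorted(core), sorted(pan), unique)
```

In the toy input, `g1` and `g2` form the core, the pangenome has five gene families, and only `isolate_B` and `isolate_C` carry a gene seen nowhere else. An open pangenome, as described in the abstract, is one where the union keeps growing as more genomes are added.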


2015 ◽  
Vol 78 (4) ◽  
pp. 675-684 ◽  
Author(s):  
KRISTIN W. LIVEZEY ◽  
BETTINA GROSCHEL ◽  
MICHAEL M. BECKER

Escherichia coli O157:H7 and six serovars (O26, O103, O121, O111, O145, and O45) are frequently implicated in severe clinical illness worldwide. Standard testing methods using stx, eae, and O serogroup–specific gene sequences for detecting the top six non-O157 STEC bear the disadvantage that these genes may reside, independently, in different nonpathogenic organisms, leading to false-positive results. The ecf operon has previously been identified in the large enterohemolysin-encoding plasmid of eae-positive Shiga toxin–producing E. coli (STEC). Here, we explored the utility of the ecf operon as a single marker to detect eae-positive STEC from pure broth and primary meat enrichments. Analysis of 501 E. coli isolates demonstrated a strong correlation (99.6%) between the presence of the ecf1 gene and the combined presence of stx, eae, and ehxA genes. Two large studies were carried out to determine the utility of an ecf1 detection assay to detect non-O157 STEC strains in enriched meat samples in comparison to the results using the U.S. Department of Agriculture Food Safety and Inspection Service (FSIS) method that detects stx and eae genes. In ground beef samples (n = 1,065), the top six non-O157 STEC were detected in 4.0% of samples by an ecf1 detection assay and in 5.0% of samples by the stx- and eae-based method. In contrast, in beef samples composed largely of trim (n = 1,097), the top six non-O157 STEC were detected at 1.1% by both methods. Estimation of false-positive rates among the top six non-O157 STEC revealed a lower rate using the ecf1 detection method (0.5%) than using the eae and stx screening method (1.1%). Additionally, the ecf1 detection assay detected STEC strains associated with severe illness that are not included in the FSIS regulatory definition of adulterant STEC.


2020 ◽  
pp. JCM.02624-20
Author(s):  
Nguyen Thi Thu Huong ◽  
Atsushi Iguchi ◽  
Ritsuko Ohata ◽  
Hisahiro Kawai ◽  
Tadasuke Ooka ◽  
...  

Shiga toxin-producing Escherichia coli (STEC) is an important foodborne pathogen. Although most cases of STEC infection in humans are due to O157 and non-O157 serogroups, there are also reports of infection with STEC strains that cannot be serologically classified into any O-serogroup (O-serogroup untypeable, OUT). Recently, it has become clear that even OUT strains can be subclassified based on the diversity of O-antigen biosynthesis gene cluster (O-AGC) sequences. Cattle are thought to be a major reservoir of STEC strains belonging to various serotypes; however, the internal composition of OUT STEC strains in cattle remains unknown. In this study, we screened 366 STEC strains isolated from healthy cattle by using multiplex PCR kits including primers that targeted novel O-AGC types (Og-types) found in OUT E. coli and Shigella strains in previous studies. Interestingly, 94 (25.7%) of these strains could be classified into 13 novel Og-types. Genomic analysis revealed that the results of the in silico serotyping of novel Og-type strains were perfectly consistent with those of the PCR experiment. In addition, it was revealed that a dual Og8+OgSB17-type strain carried two types of O-AGCs from E. coli O8 and Shigella boydii type 17 tandemly inserted at the locus, with both antigens expressed on the cell surface. The results of this comprehensive analysis of cattle-derived STEC strains may help improve our understanding of the strains circulating in the environment. Additionally, the DNA-based serotyping systems used in this study could be used in future epidemiological studies and risk assessments of other STEC strains.


2016 ◽  
Vol 84 (8) ◽  
pp. 2362-2371 ◽  
Author(s):  
Tracy H. Hazen ◽  
Susan R. Leonard ◽  
Keith A. Lampel ◽  
David W. Lacher ◽  
Anthony T. Maurelli ◽  
...  

Enteroinvasive Escherichia coli (EIEC) is a unique pathovar that has a pathogenic mechanism nearly indistinguishable from that of Shigella species. In contrast to isolates of the four Shigella species, which are widespread and can be frequent causes of human illness, EIEC causes far fewer reported illnesses each year. In this study, we analyzed the genome sequences of 20 EIEC isolates, including 14 first described in this study. Phylogenomic analysis of the EIEC genomes demonstrated that 17 of the isolates are present in three distinct lineages that contained only EIEC genomes, compared to reference genomes from each of the E. coli pathovars and Shigella species. Comparative genomic analysis identified genes that were unique to each of the three identified EIEC lineages. While many of the EIEC lineage-specific genes have unknown functions, those with predicted functions included a colicin and putative proteins involved in transcriptional regulation or carbohydrate metabolism. In silico detection of the Shigella virulence plasmid (pINV), which is essential for the invasion of host cells, demonstrated that a form of pINV was present in nearly all EIEC genomes, but the Mxi-Spa-Ipa region of the plasmid that encodes the invasion-associated proteins was absent from several of the EIEC isolates. The comparative genomic findings in this study support the hypothesis that multiple EIEC lineages have evolved independently from multiple distinct lineages of E. coli via the acquisition of the Shigella virulence plasmid and, in some cases, the Shigella pathogenicity islands.


2011 ◽  
Vol 78 (1) ◽  
pp. 58-69 ◽  
Author(s):  
Minjung Park ◽  
Ju-Hoon Lee ◽  
Hakdong Shin ◽  
Minsik Kim ◽  
Jeongjoon Choi ◽  
...  

ABSTRACT Salmonella enterica and Escherichia coli O157:H7 are major food-borne pathogens causing serious illness. Phage SFP10, which revealed effective infection of both S. enterica and E. coli O157:H7, was isolated and characterized. SFP10 contains a 158-kb double-stranded DNA genome belonging to the Vi01 phage-like family Myoviridae. In vitro adsorption assays showed that the adsorption constant rates to both Salmonella enterica serovar Typhimurium and E. coli O157:H7 were 2.50 × 10⁻⁸ ml/min and 1.91 × 10⁻⁸ ml/min, respectively. One-step growth analysis revealed that SFP10 has a shorter latent period (25 min) and a larger burst size (>200 PFU) than ordinary Myoviridae phages, suggesting effective host infection and lytic activity. However, differential development of resistance to SFP10 in S. Typhimurium and E. coli O157:H7 was observed; bacteriophage-insensitive mutant (BIM) frequencies of 1.19 × 10⁻² CFU/ml for S. Typhimurium and 4.58 × 10⁻⁵ CFU/ml for E. coli O157:H7 were found, indicating that SFP10 should be active and stable for control of E. coli O157:H7 with minimal emergence of SFP10-resistant pathogens but may not be for S. Typhimurium. Specific mutation of rfaL in S. Typhimurium and E. coli O157:H7 revealed the O antigen as an SFP10 receptor for both bacteria. Genome sequence analysis of SFP10 and its comparative analysis with homologous Salmonella Vi01 and Shigella phiSboM-AG3 phages revealed that their tail fiber and tail spike genes share low sequence identity, implying that the genes are major host specificity determinants. This is the first report identifying specific infection and inhibition of Salmonella Typhimurium and E. coli O157:H7 by a single bacteriophage.


2021 ◽  
Vol 7 (12) ◽  
Author(s):  
Xiaomei Zhang ◽  
Michael Payne ◽  
Thanh Nguyen ◽  
Sandeep Kaur ◽  
Ruiting Lan

Shigella and enteroinvasive Escherichia coli (EIEC) cause human bacillary dysentery with similar invasion mechanisms and share similar physiological, biochemical and genetic characteristics. Differentiation of Shigella from EIEC is important for clinical diagnostic and epidemiological investigations. However, phylogenetically, Shigella and EIEC strains are composed of multiple clusters and are different forms of E. coli, making it difficult to find genetic markers to discriminate between Shigella and EIEC. In this study, we identified 10 Shigella clusters, seven EIEC clusters and 53 sporadic types of EIEC by examining over 17000 publicly available Shigella and EIEC genomes. We compared Shigella and EIEC accessory genomes to identify cluster-specific gene markers for the 17 clusters and 53 sporadic types. The cluster-specific gene markers showed 99.64% accuracy and more than 97.02% specificity. In addition, we developed a freely available in silico serotyping pipeline named Shigella EIEC Cluster Enhanced Serotype Finder (ShigEiFinder) by incorporating the cluster-specific gene markers and established Shigella and EIEC serotype-specific O antigen genes and modification genes into typing. ShigEiFinder can process either paired-end Illumina sequencing reads or assembled genomes and almost perfectly differentiated Shigella from EIEC with 99.70 and 99.74% cluster assignment accuracy for the assembled genomes and read mapping respectively. ShigEiFinder was able to serotype over 59 Shigella serotypes and 22 EIEC serotypes and provided a high specificity of 99.40% for assembled genomes and 99.38% for read mapping for serotyping. The cluster-specific gene markers and our new serotyping tool, ShigEiFinder (installable package: https://github.com/LanLab/ShigEiFinder, online tool: https://mgtdb.unsw.edu.au/ShigEiFinder/), will be useful for epidemiological and diagnostic investigations.
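
The idea of typing an isolate by cluster-specific markers or marker sets can be sketched in a few lines of Python. This is a simplified illustration, not ShigEiFinder's actual logic: it assumes the genes detected in a genome are already available as a set, treats a cluster as matched only when every marker in its set is present, and declines to call a cluster when zero or several clusters match. The function name, cluster labels, and marker names are hypothetical.

```python
def assign_cluster(detected_genes, marker_sets):
    """Return the single cluster whose full marker set is present, else None.

    detected_genes: set of gene identifiers found in the genome
    marker_sets:    dict of cluster name -> set of marker genes (hypothetical)
    """
    matches = [
        cluster
        for cluster, markers in marker_sets.items()
        if markers <= detected_genes  # all markers for this cluster detected
    ]
    if len(matches) == 1:
        return matches[0]
    return None  # no cluster matched, or the call is ambiguous


# Hypothetical marker sets: one single-gene marker and one two-gene set.
toy_markers = {
    "cluster_1": {"markerA"},
    "cluster_2": {"markerB", "markerC"},
}

print(assign_cluster({"markerB", "markerC", "other_gene"}, toy_markers))
print(assign_cluster({"markerB"}, toy_markers))
print(assign_cluster({"markerA", "markerB", "markerC"}, toy_markers))
```

The first call resolves to `cluster_2`; the second finds only part of a marker set and the third matches two clusters, so both return `None`. A real pipeline would also need thresholds for gene detection from reads or assemblies, which this sketch leaves out.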


2019 ◽  
Vol 8 (27) ◽  
Author(s):  
Amrita Salim ◽  
Pradeesh Babu ◽  
Keerthi Mohan ◽  
Manju Moorthy ◽  
Devika Raj ◽  
...  

ABSTRACT We report the draft genome sequence of Escherichia coli ASBT-1, a representative of E. coli sequence type 155 (ST155), obtained from India. Considering the known wide variety of pathogenic and antibiotic resistance potentials, this strain should be of great interest for detailed comparative genomic analysis.

