Improving hazard characterization in microbial risk assessment using next generation sequencing data and machine learning: Predicting clinical outcomes in shigatoxigenic Escherichia coli

2019 ◽  
Vol 292 ◽  
pp. 72-82 ◽  
Author(s):  
Patrick Murigu Kamau Njage ◽  
Pimlapas Leekitcharoenphon ◽  
Tine Hald
2019 ◽  
Author(s):  
Tom Hill ◽  
Robert L. Unckless

Abstract Copy number variants (CNV) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower quality or lower coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods or coverage estimation alone, and of equal power in high coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long read data. Available at: https://github.com/tomh1lll/dudeml
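For orientation, the "coverage estimation alone" baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' machine-learning method and not code from the dudeml repository: the function name `call_cnv_windows`, the fixed-size windows, and the ratio thresholds of 1.5 and 0.5 are all assumptions chosen for the example.

```python
from statistics import median


def call_cnv_windows(depths, dup_ratio=1.5, del_ratio=0.5):
    """Label each window 'dup', 'del', or 'normal' by depth relative to the median.

    depths    -- mean read depth per genomic window (hypothetical input)
    dup_ratio -- assumed threshold: ratio at or above this is called a duplication
    del_ratio -- assumed threshold: ratio at or below this is called a deletion
    """
    baseline = median(depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    calls = []
    for depth in depths:
        ratio = depth / baseline
        if ratio >= dup_ratio:
            calls.append("dup")
        elif ratio <= del_ratio:
            calls.append("del")
        else:
            calls.append("normal")
    return calls


# Toy depths: two windows at roughly double coverage, two at near-zero coverage.
window_depths = [30, 31, 29, 62, 60, 30, 2, 1, 30, 28]
calls = call_cnv_windows(window_depths)
print(calls)
```

With sparse or noisy reads, per-window depth fluctuates enough that fixed thresholds like these misfire, which is consistent with the abstract's claim that a learned classifier helps most in low coverage data.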


2016 ◽  
Vol 79 (4) ◽  
pp. 574-581 ◽  
Author(s):  
TRENNA BLAGDEN ◽  
WILLIAM SCHNEIDER ◽  
ULRICH MELCHER ◽  
JON DANIELS ◽  
JACQUELINE FLETCHER

ABSTRACT The Centers for Disease Control and Prevention recently emphasized the need for enhanced technologies to use in investigations of outbreaks of foodborne illnesses. To address this need, e-probe diagnostic nucleic acid analysis (EDNA) was adapted and validated as a tool for the rapid, effective identification and characterization of multiple pathogens in a food matrix. In EDNA, unassembled next generation sequencing data sets from food sample metagenomes are queried using pathogen-specific sequences known as electronic probes (e-probes). In this study, the query of mock sequence databases demonstrated the potential of EDNA for the detection of foodborne pathogens. The method was then validated using next generation sequencing data sets created by sequencing the metagenome of alfalfa sprouts inoculated with Escherichia coli O157:H7. Nonspecific hits in the negative control sample indicated the need for additional filtration of the e-probes to enhance specificity. There was no significant difference in the ability of an e-probe to detect the target pathogen based upon the length of the probe set oligonucleotides. The results from the queries of the sample database using E. coli e-probe sets were significantly different from those obtained using random decoy probe sets and exhibited 100% precision. The results support the use of EDNA as a rapid response methodology in foodborne outbreaks and investigations for establishing comprehensive microbial profiles of complex food samples.
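The core idea of querying unassembled reads with e-probes can be sketched roughly as below. This is an illustration under stated assumptions, not the EDNA pipeline: exact substring matching is swapped in for whatever sequence search the real tool performs, and the function names, probe sequences, and reads are all hypothetical. Sequences are assumed to be upper-case and limited to A, C, G, T, and N.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def count_probe_hits(reads, probes):
    """Count, per probe, how many reads contain it on either strand.

    reads  -- iterable of read sequences (unassembled, as in the abstract)
    probes -- dict mapping a probe name to its sequence (hypothetical e-probes)
    """
    hits = {name: 0 for name in probes}
    for read in reads:
        for name, probe in probes.items():
            if probe in read or reverse_complement(probe) in read:
                hits[name] += 1
    return hits


# Toy data: one target probe and one decoy probe, as in a decoy comparison.
probes = {"target_probe": "ACGGTTCA", "decoy_probe": "GATTACAG"}
reads = ["TTACGGTTCAGG", "CCTGAACCGTAA", "AAAAAAAAAAAA"]
hits = count_probe_hits(reads, probes)
print(hits)
```

Comparing hit counts for target probes against decoy probes mirrors the abstract's contrast between E. coli e-probe sets and random decoy probe sets, although the study's statistical test is not reproduced here.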


Risk Analysis ◽  
2018 ◽  
Author(s):  
Patrick Murigu Kamau Njage ◽  
Clementine Henri ◽  
Pimlapas Leekitcharoenphon ◽  
Michel‐Yves Mistou ◽  
Rene S. Hendriksen ◽  
...  
