Evaluation of parameters affecting performance and reliability of machine learning-based antibiotic susceptibility testing from whole genome sequencing data

2019 ◽  
Vol 15 (9) ◽  
pp. e1007349 ◽  
Author(s):  
Allison L. Hicks ◽  
Nicole Wheeler ◽  
Leonor Sánchez-Busó ◽  
Jennifer L. Rakeman ◽  
Simon R. Harris ◽  
...  
mSystems ◽  
2020 ◽  
Vol 5 (3) ◽  
Author(s):  
Nenad Macesic ◽  
Oliver J. Bear Don’t Walk ◽  
Itsik Pe’er ◽  
Nicholas P. Tatonetti ◽  
Anton Y. Peleg ◽  
...  

ABSTRACT Polymyxins are used as treatments of last resort for Gram-negative bacterial infections. Their increased use has led to concerns about emerging polymyxin resistance (PR). Phenotypic polymyxin susceptibility testing is resource intensive and difficult to perform accurately. The complex polygenic nature of PR and our incomplete understanding of its genetic basis make it difficult to predict PR using detection of resistance determinants. We therefore applied machine learning (ML) to whole-genome sequencing data from >600 Klebsiella pneumoniae clonal group 258 (CG258) genomes to predict phenotypic PR. Using a reference-based representation of genomic data with ML outperformed a rule-based approach that detected variants in known PR genes (area under receiver-operator curve [AUROC], 0.894 versus 0.791, P = 0.006). We noted modest increases in performance by using a bacterial genome-wide association study to filter relevant genomic features and by integrating clinical data in the form of prior polymyxin exposure. Conversely, reference-free representation of genomic data as k-mers was associated with decreased performance (AUROC, 0.692 versus 0.894, P = 0.015). When ML models were interpreted to extract genomic features, six of seven known PR genes were correctly identified by models without prior programming and several genes involved in stress responses and maintenance of the cell membrane were identified as potential novel determinants of PR. These findings are a proof of concept that whole-genome sequencing data can accurately predict PR in K. pneumoniae CG258 and may be applicable to other forms of complex antimicrobial resistance.

IMPORTANCE Polymyxins are last-resort antibiotics used to treat highly resistant Gram-negative bacteria. There are increasing reports of polymyxin resistance emerging, raising concerns of a postantibiotic era. Polymyxin resistance is therefore a significant public health threat, but current phenotypic methods for detection are difficult and time-consuming to perform. There have been increasing efforts to use whole-genome sequencing for detection of antibiotic resistance, but this has been difficult to apply to polymyxin resistance because of its complex polygenic nature. The significance of our research is that we successfully applied machine learning methods to predict polymyxin resistance in Klebsiella pneumoniae clonal group 258, a common health care-associated and multidrug-resistant pathogen. Our findings highlight that machine learning can be successfully applied even in complex forms of antibiotic resistance and represent a significant contribution to the literature that could be used to predict resistance in other bacteria and to other antibiotics.
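The abstract above contrasts a reference-based representation of genomic data with a reference-free representation as k-mers. As a rough illustration of what a k-mer representation looks like, the following minimal sketch counts overlapping k-mers in a toy sequence and turns the counts into a fixed-order feature vector. The function names, the toy sequence, and the choice of k = 3 are assumptions made for illustration only; the abstract does not specify the k-mer length or the tooling the authors used.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a DNA sequence (hypothetical helper)."""
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows that contain ambiguous bases such as N.
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def feature_vector(counts, vocabulary):
    """Map k-mer counts onto a fixed vocabulary order; absent k-mers become 0."""
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Toy sequence, not real genomic data.
toy_genome = "ATGCGATGCA"
counts = kmer_counts(toy_genome, k=3)
vocabulary = sorted(counts)
vector = feature_vector(counts, vocabulary)
print(dict(zip(vocabulary, vector)))
```

In practice the vocabulary would be shared across all genomes so that every isolate yields a vector of the same length, which a classifier can then consume.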


2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S42-S42
Author(s):  
David E Greenberg ◽  
Jiwoong Kim ◽  
Xiaowei Zhan ◽  
Samuel A Shelburne ◽  
...  

Abstract

Background: Multi-drug-resistant (MDR) P. aeruginosa (PA) infections continue to cause significant morbidity and mortality in various patient groups, including those with malignancies. Predicting antimicrobial resistance (AMR) from whole-genome sequencing data, if done rapidly, could aid in providing optimal care to patients.

Methods: To better understand the connections between DNA variation and phenotypic AMR in PA, we developed a new algorithm, variant mapping and prediction of antibiotic resistance (VAMPr), to build association and machine learning prediction models of AMR based on publicly available whole-genome sequencing and antibiotic susceptibility testing (AST) data. A validation cohort of contemporary PA bloodstream isolates was sequenced and AST was performed. Accuracy of predicting AMR for various PA–drug combinations was calculated.

Results: VAMPr was built from 3,393 bacterial isolates (83 PA isolates included) from 9 species that contained AST data for 29 antibiotics. A total of 14,615 variant genotypes were identified within the dataset, and 93 association and prediction models were built. The validation cohort comprised 120 PA bloodstream isolates from cancer patients. Approximately 15% of isolates were carbapenem resistant and approximately 20% were quinolone resistant. For drug-isolate combinations where >100 isolates were available, machine-learning prediction accuracies ranged from 75.6% (PA and ceftazidime; 90/119 correctly predicted) to 98.1% (PA and amikacin; 105/107 correctly predicted). Machine learning accurately identified known variants that strongly predicted resistance to various antibiotic classes. One example was the association between specific gyrA mutations (T83I; P < 0.00001) and quinolone resistance.

Conclusion: Machine learning predicted AMR in P. aeruginosa across a number of antibiotics with high accuracy. Given the genomic heterogeneity of PA, increased genomic data for this pathogen will aid in further improving prediction accuracy across all antibiotic classes.

Disclosures: Samuel L. Aitken, PharmD, Melinta Therapeutics: Grant/Research Support, Research Grant; Merck Sharp & Dohme: Advisory Board; Shionogi: Advisory Board
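The accuracy range reported in the abstract above follows directly from the quoted counts of correctly predicted isolates. The short sketch below reproduces that arithmetic; the function name and the dictionary layout are hypothetical, and only the counts (90/119 and 105/107) come from the abstract.

```python
def accuracy_percent(correct, total):
    """Return prediction accuracy as a percentage of isolates tested."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# Counts as quoted in the abstract: (correctly predicted, isolates tested).
reported = {
    "ceftazidime": (90, 119),
    "amikacin": (105, 107),
}

for drug, (correct, total) in reported.items():
    print(f"PA and {drug}: {accuracy_percent(correct, total):.1f}%")
```

Note that accuracy alone does not distinguish missed resistance from false resistance calls; the abstract reports only the overall proportion.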


Heredity ◽  
2021 ◽  
Author(s):  
Axel Jensen ◽  
Mette Lillie ◽  
Kristofer Bergström ◽  
Per Larsson ◽  
Jacob Höglund

Abstract The use of genetic markers in the context of conservation is largely being outcompeted by whole-genome data. Comparative studies between the two are sparse, and the knowledge about potential effects of this methodological shift is limited. Here, we used whole-genome sequencing data to assess the genetic status of peripheral populations of the wels catfish (Silurus glanis), and discuss the results in light of a recent microsatellite study of the same populations. The Swedish populations of the wels catfish have suffered severe declines during the last centuries and persist in only a few isolated water systems. Fragmented populations generally are at greater risk of extinction, for example due to loss of genetic diversity, and may thus require conservation actions. We sequenced individuals from the three remaining native populations (Båven, Emån, and Möckeln) and one reintroduced population of admixed origin (Helge å), and found that genetic diversity was highest in Emån but low overall, with strong differentiation among the populations. No signature of recent inbreeding was found, but a considerable number of short runs of homozygosity were present in all populations, likely linked to historically small population sizes and bottleneck events. Genetic substructure within any of the native populations was at best weak. Individuals from the admixed population Helge å shared most genetic ancestry with the Båven population (72%). Our results are largely in agreement with the microsatellite study, and stress the need to protect these isolated populations at the northern edge of the distribution of the species.
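The abstract above mentions runs of homozygosity without describing how they were detected. As a rough illustration of the concept only, the sketch below scans an ordered list of variant sites for stretches of consecutive homozygous calls that span at least a minimum length. The function name, the toy positions, and the length threshold are all assumptions; dedicated tools also account for genotyping error, missing data, and marker density, none of which this sketch handles.

```python
def runs_of_homozygosity(positions, is_heterozygous, min_length_bp):
    """Return (start, end) spans of consecutive homozygous sites.

    positions: variant site coordinates in ascending order.
    is_heterozygous: one boolean per site, True for a heterozygous call.
    min_length_bp: minimum span (end - start) for a run to be reported.
    """
    runs = []
    run_start = None
    run_end = None
    for pos, het in zip(positions, is_heterozygous):
        if het:
            # A heterozygous site closes the current run, if any.
            if run_start is not None and run_end - run_start >= min_length_bp:
                runs.append((run_start, run_end))
            run_start = None
        else:
            if run_start is None:
                run_start = pos
            run_end = pos
    # Close a run that extends to the last site.
    if run_start is not None and run_end - run_start >= min_length_bp:
        runs.append((run_start, run_end))
    return runs


# Toy data, not taken from the study.
positions = [100, 200, 300, 400, 500, 600, 700, 800]
het_flags = [False, False, False, True, False, False, False, False]
runs = runs_of_homozygosity(positions, het_flags, min_length_bp=250)
print(runs)
```

Lowering the threshold reports shorter runs as well, which is the kind of signal the abstract links to historically small population sizes rather than to recent inbreeding.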

