Using Machine Learning To Predict Antimicrobial MICs and Associated Genomic Features for Nontyphoidal Salmonella

2018 ◽  
Vol 57 (2) ◽  
Author(s):  
Marcus Nguyen ◽  
S. Wesley Long ◽  
Patrick F. McDermott ◽  
Randall J. Olsen ◽  
Robert Olson ◽  
...  

ABSTRACT Nontyphoidal Salmonella species are the leading bacterial cause of foodborne disease in the United States. Whole-genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate extreme gradient boosting (XGBoost)-based machine learning models for predicting MICs for 15 antibiotics. The MIC prediction models had an overall average accuracy of 95% within ±1 2-fold dilution step (confidence interval, 95% to 95%), an average very major error rate of 2.7% (confidence interval, 2.4% to 3.0%), and an average major error rate of 0.1% (confidence interval, 0.1% to 0.2%). The model predicted MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for the training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time, despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Our strategy for developing whole-genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.
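The "accuracy within ±1 2-fold dilution step" metric reported above can be illustrated with a minimal sketch. This is not the authors' code: the function names and the MIC values are hypothetical, and the sketch only assumes that a prediction counts as correct when it lies within one doubling or halving of the laboratory-measured MIC.

```python
import math

def within_one_dilution(predicted_mic, observed_mic):
    """Return True when two MICs differ by at most one 2-fold dilution step."""
    # On a log2 scale, each 2-fold dilution step is a distance of exactly 1.
    steps = abs(math.log2(predicted_mic) - math.log2(observed_mic))
    return steps <= 1.0 + 1e-9  # small tolerance for floating-point rounding

def dilution_accuracy(predicted, observed):
    """Fraction of predictions within +/-1 2-fold dilution step of the observed MIC."""
    hits = sum(within_one_dilution(p, o) for p, o in zip(predicted, observed))
    return hits / len(observed)

# Hypothetical MICs in micrograms per mL, for illustration only
# (these are not data from the study).
observed = [0.5, 1.0, 2.0, 8.0, 32.0]
predicted = [0.5, 2.0, 8.0, 4.0, 32.0]

accuracy = dilution_accuracy(predicted, observed)
print(f"Accuracy within +/-1 dilution step: {accuracy:.2f}")
```

In this toy example only the third prediction (8 versus an observed 2, two dilution steps apart) falls outside the tolerance. The very major and major error rates are not sketched here because they depend on clinical breakpoints that the abstract does not list.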

2018 ◽  
Author(s):  
Marcus Nguyen ◽  
S. Wesley Long ◽  
Patrick F. McDermott ◽  
Randall J. Olsen ◽  
Robert Olson ◽  
...  

Nontyphoidal Salmonella species are the leading bacterial cause of food-borne disease in the United States. Whole genome sequences and paired antimicrobial susceptibility data are available for Salmonella strains because of surveillance efforts from public health agencies. In this study, a collection of 5,278 nontyphoidal Salmonella genomes, collected over 15 years in the United States, was used to generate XGBoost-based machine learning models for predicting minimum inhibitory concentrations (MICs) for 15 antibiotics. The MIC prediction models have average accuracies between 95% and 96% within ±1 two-fold dilution factor and can predict MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains. By selecting diverse genomes for training sets, we show that highly accurate MIC prediction models can be generated with fewer than 500 genomes. We also show that our approach for predicting MICs is stable over time despite annual fluctuations in antimicrobial resistance gene content in the sampled genomes. Finally, using feature selection, we explore the important genomic regions identified by the models for predicting MICs. To date, this is one of the largest MIC modeling studies to be published. Our strategy for developing whole genome sequence-based models for surveillance and clinical diagnostics can be readily applied to other important human pathogens.


2018 ◽  
Vol 57 (2) ◽  
Author(s):  
Jonathan M. Monk

ABSTRACT Thanks to the genomics revolution, thousands of strain-specific whole-genome sequences are now accessible for a wide range of pathogenic bacteria. This availability enables big data informatics approaches to be used to study the spread and acquisition of antimicrobial resistance (AMR). In this issue of the Journal of Clinical Microbiology, Nguyen et al. (M. Nguyen, S. W. Long, P. F. McDermott, R. J. Olsen, R. Olson, R. L. Stevens, G. H. Tyson, S. Zhao, and J. J. Davis, J Clin Microbiol 57:e01260-18, 2019, https://doi.org/10.1128/JCM.01260-18) report the results obtained with their machine learning models based on whole-genome sequencing data to predict the MICs of antibiotics for 5,278 nontyphoidal Salmonella genomes collected over 15 years in the United States. Their major finding demonstrates that MICs can be predicted with an average accuracy of 95% within ±1 2-fold dilution step (confidence interval, 95% to 95%), an average very major error rate of 2.7%, and an average major error rate of 0.1%. Importantly, these models predict MICs with no a priori information about the underlying gene content or resistance phenotypes of the strains, enabling the possibility to identify AMR determinants and rapidly diagnose and prioritize antibiotic use directly from the organism sequence. Employing such tools to diagnose and limit the spread of resistance-conferring mechanisms could help ameliorate the looming antibiotic resistance crisis.


Author(s):  
Momen R. Mousa ◽  
Saleh R. Mousa ◽  
Marwa Hassan ◽  
Paul Carlson ◽  
Ibrahim A. Elnaml

Waterborne paint is the most common marking material used throughout the United States. Because of budget constraints, most transportation agencies repaint their markings based on a fixed schedule, which is questionable in relation to efficiency and economy. To overcome this problem, state agencies could evaluate the marking performance by utilizing measured retroreflectivity of waterborne paints applied in the National Transportation Product Evaluation Program (NTPEP) or by using retroreflectivity degradation models developed in previous studies. Generally, both options lack accuracy because of the high dimensionality and multi-collinearity of retroreflectivity data. Therefore, the objective of this study was to employ an advanced machine learning algorithm to develop performance prediction models for waterborne paints considering the variables that are believed to affect their performance. To achieve this objective, a total of 17,952 skip and wheel retroreflectivity measurements were collected from 10 test decks included in the NTPEP. Based on these data, two CatBoost models were developed with an acceptable level of accuracy; they can predict the skip and wheel retroreflectivity of waterborne paints for up to 3 years using only the initial measured retroreflectivity and the anticipated project conditions over the intended prediction horizon, such as line color, traffic, air temperature, and so forth. These models could be used by transportation agencies throughout the United States to 1) compare different products and select the best product for a specific project, and 2) determine the expected service life of a specific product based on a specified threshold retroreflectivity to plan for future restriping activities.


2021 ◽  
Vol 14 (5) ◽  
pp. 472
Author(s):  
Tyler C. Beck ◽  
Kyle R. Beck ◽  
Jordan Morningstar ◽  
Menny M. Benjamin ◽  
Russell A. Norris

Roughly 2.8% of annual hospitalizations are a result of adverse drug interactions in the United States, representing more than 245,000 hospitalizations. Drug–drug interactions commonly arise from major cytochrome P450 (CYP) inhibition. Various approaches are routinely employed in order to reduce the incidence of adverse interactions, such as altering drug dosing schemes and/or minimizing the number of drugs prescribed; however, often, a reduction in the number of medications cannot be achieved without impacting therapeutic outcomes. Nearly 80% of drugs fail in development due to pharmacokinetic issues, outlining the importance of examining cytochrome interactions during preclinical drug design. In this review, we examined the physiochemical and structural properties of small molecule inhibitors of CYPs 3A4, 2D6, 2C19, 2C9, and 1A2. Although CYP inhibitors tend to have distinct physiochemical properties and structural features, these descriptors alone are insufficient to predict major cytochrome inhibition probability and affinity. Machine learning based in silico approaches may be employed as a more robust and accurate way of predicting CYP inhibition. These various approaches are highlighted in the review.


2021 ◽  
pp. 1-4
Author(s):  
Mathieu D'Aquin ◽  
Stefan Dietze

The 29th ACM International Conference on Information and Knowledge Management (CIKM) was held online from the 19th to the 23rd of October 2020. CIKM is an annual computer science conference, focused on research at the intersection of information retrieval, machine learning, databases as well as semantic and knowledge-based technologies. Since it was first held in the United States in 1992, 28 conferences have been hosted in 9 countries around the world.


2020 ◽  
pp. 073346482097760
Author(s):  
Manka Nkimbeng ◽  
Yvonne Commodore-Mensah ◽  
Jacqueline L. Angel ◽  
Karen Bandeen-Roche ◽  
Roland J. Thorpe ◽  
...  

Acculturation and racial discrimination have been independently associated with physical function limitations in immigrant and United States (U.S.)-born populations. This study examined the relationships among acculturation, racial discrimination, and physical function limitations in N = 165 African immigrant older adults using multiple linear regression. The mean age was 62 years (SD = 8 years), and 61% were female. Older adults who resided in the United States for 10 years or more had more physical function limitations compared with those who resided here for less than 10 years (b = −2.62, 95% confidence interval [CI] = [−5.01, −0.23]). Compared to lower discrimination, those with high discrimination had more physical function limitations (b = −2.51, 95% CI = [−4.91, −0.17]), but this was no longer significant after controlling for length of residence and acculturation strategy. Residing in the United States for more than 10 years is associated with poorer physical function. Longitudinal studies with large, diverse samples of African immigrants are needed to confirm these associations.


2009 ◽  
Vol 99 (12) ◽  
pp. 1387-1393 ◽  
Author(s):  
M. Hodda ◽  
D. C. Cook

Potato cyst nematodes (PCN) (Globodera spp.) are quarantine pests with serious potential economic consequences. Recent new detections in Australia, Canada, and the United States have focussed attention on the consequences of spread and economic justifications for alternative responses. Here, a full assessment of the economic impact of PCN spread from a small initial incursion is presented. Models linking spread, population growth, and economic impact are combined to estimate costs of spread without restriction in Australia. Because the characteristics of the Australian PCN populations are currently unknown, the known ranges of parameters were used to obtain cost scenarios, an approach which makes the model predictions applicable generally. Our analysis indicates that mean annual costs associated with spread of PCN would increase rapidly initially, associated with increased testing. Costs would then increase more slowly to peak at over AUD$20 million per year ≈10 years into the future. Afterward, this annual cost would decrease slightly due to discounting factors. Mean annual costs over 20 years were $18.7 million, with a 90% confidence interval between AUD$11.9 million and AUD$27.0 million. Thus, cumulative losses to Australian agriculture over 20 years may exceed $370 million without action to prevent spread of PCN and entry to new areas.
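The cumulative figure quoted above follows from simple arithmetic on the reported mean annual cost. The short check below assumes only that the 20-year total is the mean annual cost multiplied by the number of years; the variable names are illustrative.

```python
# Figures taken from the abstract above (AUD, millions).
mean_annual_cost = 18.7
years = 20

# Cumulative cost as mean annual cost times the number of years.
cumulative_cost = mean_annual_cost * years

print(f"Cumulative cost over {years} years: about AUD${cumulative_cost:.0f} million")
```

The result, roughly AUD$374 million, is consistent with the statement that cumulative losses may exceed $370 million.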


2021 ◽  
Author(s):  
Satya Katragadda ◽  
Ravi Teja Bhupatiraju ◽  
Vijay Raghavan ◽  
Ziad Ashkar ◽  
Raju Gottumukkala

Abstract Background: Travel patterns of humans play a major part in the spread of infectious diseases. This was evident in the geographical spread of COVID-19 in the United States. However, how much local travel contributes to transmission of the virus, compared with travel across state boundaries, is unknown. This study evaluates the impact of local vs. visitor mobility in understanding the growth in the number of cases for infectious disease outbreaks. Methods: We use two different mobility metrics, namely the local risk and visitor risk, extracted from trip data generated from anonymized mobile phone data across all 50 states in the United States. We analyzed the impact on infection spread of using local trips alone and of adding the infection risk potential generated by visitors' trips from other states. We used the Diebold-Mariano test to compare across three machine learning models. Finally, we compared the performance of models that include visitor mobility for all three waves in the United States and across all 50 states. Results: We observe that visitor mobility impacts case growth and that including visitor mobility in forecasting the number of COVID-19 cases improves prediction accuracy by 34. Using the Diebold-Mariano test, we found that the performance improvement from including visitor mobility is statistically significant. We also observe that the significance was much higher during the first peak, from March to June 2020. Conclusion: With cases present everywhere (i.e., local and visitor), visitor mobility (even within the country) is shown to have a significant impact on growth in the number of cases. While it is not possible to account for other factors such as the impact of interventions, and differences in local mobility and visitor mobility, we find that these observations can be used to plan for both reopening and limiting visitors from regions where there is a high number of cases.
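The Diebold-Mariano comparison mentioned above can be sketched as follows. The abstract does not state the loss function or forecast horizon, so this sketch assumes squared-error loss and one-step-ahead forecasts (where the long-run variance reduces to the plain variance of the loss differential). The function name and the forecast errors are hypothetical, not taken from the study.

```python
import math
from statistics import NormalDist, mean

def diebold_mariano(errors_a, errors_b):
    """One-step-ahead Diebold-Mariano test with squared-error loss.

    Returns (statistic, two-sided p-value) under the asymptotic
    standard normal approximation.
    """
    # Loss differential: positive values mean model A had the larger loss.
    d = [ea ** 2 - eb ** 2 for ea, eb in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = mean(d)
    # Assumption: horizon 1, so only the lag-0 autocovariance is used.
    variance = sum((x - d_bar) ** 2 for x in d) / n
    statistic = d_bar / math.sqrt(variance / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(statistic)))
    return statistic, p_value

# Hypothetical forecast errors (forecast minus actual case counts).
errors_local_only = [3.0, -2.5, 4.0, -3.5, 2.0, -4.5, 3.0, -2.0]
errors_with_visitors = [1.0, -1.5, 2.0, -1.0, 0.5, -2.0, 1.5, -1.0]

dm_stat, dm_p = diebold_mariano(errors_local_only, errors_with_visitors)
print(f"DM statistic = {dm_stat:.2f}, p-value = {dm_p:.2g}")
```

A positive statistic with a small p-value would suggest that the second model (here, the one including visitor mobility) has significantly lower forecast loss. For multi-step forecasts or short series, a long-run variance estimate and a small-sample correction would normally be added.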


2021 ◽  
Vol 11 (23) ◽  
pp. 11227
Author(s):  
Arnold Kamis ◽  
Yudan Ding ◽  
Zhenzhen Qu ◽  
Chenchen Zhang

The purpose of this paper is to model the cases of COVID-19 in the United States from 13 March 2020 to 31 May 2020. Our novel contribution is that we have obtained highly accurate models focused on two different regimes, lockdown and reopen, modeling each regime separately. The predictor variables include aggregated individual movement as well as state population density, health rank, climate temperature, and political color. We apply a variety of machine learning methods to each regime: Multiple Regression, Ridge Regression, Elastic Net Regression, Generalized Additive Model, Gradient Boosted Machine, Regression Tree, Neural Network, and Random Forest. We discover that Gradient Boosted Machines are the most accurate in both regimes. The best models achieve a variance explained of 95.2% in the lockdown regime and 99.2% in the reopen regime. We describe the influence of the predictor variables as they change from regime to regime. Notably, we identify individual person movement, as tracked by GPS data, to be an important predictor variable. We conclude that government lockdowns are an extremely important de-densification strategy. Implications and questions for future research are discussed.
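"Variance explained" is commonly computed as the coefficient of determination, R². The sketch below shows that standard formula; whether the paper evaluates it on training or held-out data is not stated in the abstract, and the function name and numbers here are purely illustrative.

```python
def variance_explained(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error left over after the model's predictions.
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_residual / ss_total

# Hypothetical case counts and model predictions, for illustration only.
observed_cases = [10.0, 20.0, 30.0, 40.0]
predicted_cases = [12.0, 18.0, 33.0, 39.0]

r_squared = variance_explained(observed_cases, predicted_cases)
print(f"Variance explained: {r_squared:.1%}")
```

A value of 95.2% on this scale means the residual sum of squares is 4.8% of the total sum of squares.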

