Using Twitter Data and Machine Learning to Identify Outpatient Antibiotic Misuse: A Proof-of-Concept Study

2019 ◽  
Vol 6 (Supplement_2) ◽  
pp. S695-S695
Author(s):  
Timothy Sullivan

Abstract Background Outpatient antibiotic misuse is common, yet it is difficult to identify and prevent. Novel methods are needed to better identify unnecessary antibiotic use in the outpatient setting. Methods The Twitter developer platform was accessed to identify Tweets describing outpatient antibiotic use in the United States between November 2018 and March 2019. Unique English-language Tweets reporting recent antibiotic use were aggregated, reviewed, and labeled as describing possible misuse or not describing misuse. Possible misuse was defined as antibiotic use for a diagnosis or symptoms for which antibiotics are not indicated based on national guidelines, or the use of antibiotics without evaluation by a healthcare provider (Figure 1). Tweets were randomly divided into training and testing sets consisting of 80% and 20% of the data, respectively. Training set Tweets were preprocessed via a natural language processing pipeline, converted into numerical vectors, and used to generate a logistic regression algorithm to predict misuse in the testing set. Analyses were performed in Python using the scikit-learn and nltk libraries. Results 4000 Tweets were included, of which 1028 were labeled as describing possible outpatient antibiotic misuse. The algorithm correctly identified Tweets describing possible antibiotic misuse in the testing set with specificity = 94%, sensitivity = 55%, PPV = 75%, NPV = 87%, and area under the ROC curve = 0.91 (Figure 2). Conclusion A machine learning algorithm using Twitter data identified episodes of self-reported antibiotic misuse with good test performance, as defined by the area under the ROC curve. Analysis of Twitter data captured some episodes of antibiotic misuse, such as the use of non-prescribed antibiotics, that are not easily identified by other methods.
This approach could be used to generate novel insights into the causes and extent of antibiotic misuse in the United States, and to monitor antibiotic misuse in real time. Disclosures All authors: No reported disclosures.
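The workflow this abstract describes — vectorizing labeled Tweets, splitting 80/20, and fitting a logistic regression to predict misuse — can be sketched roughly as below. The toy texts, labels, and feature settings are illustrative assumptions, not the authors' actual data or pipeline:

```python
# Minimal sketch: label text, convert to numerical vectors,
# train logistic regression on 80%, evaluate on the held-out 20%.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy labeled examples (1 = possible misuse, 0 = no misuse) -- illustrative only.
texts = [
    "taking leftover antibiotics for my cold",
    "doctor prescribed antibiotics for my strep throat",
    "grabbed some amoxicillin from a friend for this cough",
    "finished the antibiotic course my physician gave me",
] * 10  # repeat so the split has enough samples to work with
labels = [1, 0, 1, 0] * 10

# Convert text to numerical vectors (bag of words).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# 80% training / 20% testing, as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0, stratify=labels
)

clf = LogisticRegression()
clf.fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```

A real pipeline would add nltk-style preprocessing (tokenization, stopword removal) before vectorization and report the confusion-matrix metrics the abstract cites.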

Author(s):  
Ari Z. Klein ◽  
Arjun Magge ◽  
Karen O’Connor ◽  
Haitao Cai ◽  
Davy Weissenbacher ◽  
...  

ABSTRACT The rapidly evolving outbreak of COVID-19 presents challenges for actively monitoring its spread. In this study, we assessed a social media mining approach for automatically analyzing the chronological and geographical distribution of users in the United States reporting personal information related to COVID-19 on Twitter. The results suggest that our natural language processing and machine learning framework could help provide an early indication of the spread of COVID-19.


Among the foremost challenges with big data is how to go about analyzing it. What new tools are needed to be able to properly investigate and model the large quantities of highly complex, often messy data? Chapter 4 addresses this question by introducing and briefly exploring the fields of Machine Learning, Natural Language Processing, and Social Network Analysis, focusing on how these methods and toolsets can be utilized to make sense of big data. The authors provide a broad overview of tools, ideas, and caveats for each of these fields. This chapter ends with a look at how one major public university in the United States, the University of Texas at Arlington, is beginning to address some of the questions surrounding big data in an institutional setting. A list of additional readings is provided.


2018 ◽  
Vol 21 ◽  
pp. 45-48
Author(s):  
Shilpa Balan ◽  
Sanchita Gawand ◽  
Priyanka Purushu

Cybersecurity plays a vital role in protecting the privacy and data of people. In recent times, there have been several issues relating to cyber fraud, data breaches, and cyber theft. Many people in the United States have been victims of identity theft. Thus, an understanding of cybersecurity plays an important role in protecting their information and devices. As the adoption of smart devices and social networking increases, cybersecurity awareness needs to be spread. This research aims to build a classification machine learning algorithm to determine the awareness of cybersecurity by the common masses in the United States. We were able to attain a good F-measure score when evaluating the performance of the classification model built for this study.
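The F-measure reported here is the harmonic mean of precision and recall, computable directly from prediction counts. The counts below are made-up values for illustration, not the study's results:

```python
# F-measure (F1) from raw prediction outcomes -- pure-Python illustration.
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Hypothetical confusion-matrix counts for an awareness classifier.
print(round(f_measure(true_pos=80, false_pos=10, false_neg=20), 3))
```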


2016 ◽  
Vol 3 (2) ◽  
pp. e21 ◽  
Author(s):  
Scott R Braithwaite ◽  
Christophe Giraud-Carrier ◽  
Josh West ◽  
Michael D Barnes ◽  
Carl Lee Hanson

Background One of the leading causes of death in the United States (US) is suicide, and new methods of assessment are needed to track its risk in real time. Objective Our objective is to validate the use of machine learning algorithms for Twitter data against empirically validated measures of suicidality in the US population. Methods Using a machine learning algorithm, the Twitter feeds of 135 Mechanical Turk (MTurk) participants were compared with validated, self-report measures of suicide risk. Results Our findings show that people at high suicidal risk can be easily differentiated from those who are not by machine learning algorithms, which accurately identified clinically significant suicide risk in 92% of cases (sensitivity: 53%, specificity: 97%, positive predictive value: 75%, negative predictive value: 93%). Conclusions Machine learning algorithms are efficient in differentiating people who are at suicidal risk from those who are not. Evidence for suicidality can be measured in nonclinical populations using social media data.
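The four metrics this abstract reports (and the antibiotic-misuse study above) all derive from a 2×2 confusion matrix. As a reference, here is the computation; the counts are chosen to approximately reproduce the reported values and are not the study's actual data:

```python
# Sensitivity, specificity, PPV, and NPV from a 2x2 confusion matrix.
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Illustrative counts only, picked to roughly match the reported figures.
m = diagnostic_metrics(tp=53, fp=18, tn=582, fn=47)
print({k: round(v, 2) for k, v in m.items()})
```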


2020 ◽  
Vol 6 (30) ◽  
pp. eabb5824 ◽  
Author(s):  
Meysam Alizadeh ◽  
Jacob N. Shapiro ◽  
Cody Buntain ◽  
Joshua A. Tucker

We study how easy it is to distinguish influence operations from organic social media activity by assessing the performance of a platform-agnostic machine learning approach. Our method uses public activity to detect content that is part of coordinated influence operations based on human-interpretable features derived solely from content. We test this method on publicly available Twitter data on Chinese, Russian, and Venezuelan troll activity targeting the United States, as well as the Reddit dataset of Russian influence efforts. To assess how well content-based features distinguish these influence operations from random samples of general and political American users, we train and test classifiers on a monthly basis for each campaign across five prediction tasks. Content-based features perform well across period, country, platform, and prediction task. Industrialized production of influence campaign content leaves a distinctive signal in user-generated content that allows tracking of campaigns from month to month and across different accounts.


Author(s):  
Momen R. Mousa ◽  
Saleh R. Mousa ◽  
Marwa Hassan ◽  
Paul Carlson ◽  
Ibrahim A. Elnaml

Waterborne paint is the most common marking material used throughout the United States. Because of budget constraints, most transportation agencies repaint their markings on a fixed schedule, which is questionable in relation to efficiency and economy. To overcome this problem, state agencies could evaluate marking performance by utilizing measured retroreflectivity of waterborne paints applied in the National Transportation Product Evaluation Program (NTPEP) or by using retroreflectivity degradation models developed in previous studies. Generally, both options lack accuracy because of the high dimensionality and multi-collinearity of retroreflectivity data. Therefore, the objective of this study was to employ an advanced machine learning algorithm to develop performance prediction models for waterborne paints considering the variables that are believed to affect their performance. To achieve this objective, a total of 17,952 skip and wheel retroreflectivity measurements were collected from 10 test decks included in the NTPEP. Based on these data, two CatBoost models were developed with an acceptable level of accuracy, which can predict the skip and wheel retroreflectivity of waterborne paints for up to 3 years using only the initial measured retroreflectivity and the anticipated project conditions over the intended prediction horizon, such as line color, traffic, air temperature, and so forth. These models could be used by transportation agencies throughout the United States to (1) compare different products and select the best product for a specific project, and (2) determine the expected service life of a specific product based on a specified threshold retroreflectivity to plan for future restriping activities.


2020 ◽  
Author(s):  
Carson Lam ◽  
Jacob Calvert ◽  
Gina Barnes ◽  
Emily Pellegrini ◽  
Anna Lynn-Palevsky ◽  
...  

BACKGROUND In the wake of COVID-19, the United States has developed a three-stage plan to outline the parameters for determining when states may reopen businesses and ease travel restrictions. The guidelines also identify subpopulations of Americans that should continue to stay at home due to being at high risk for severe disease should they contract COVID-19. These guidelines were based on population-level demographics, rather than individual-level risk factors. As such, they may misidentify individuals at high risk for severe illness who should therefore not return to work until vaccination or widespread serological testing is available. OBJECTIVE This study evaluated a machine learning algorithm for the prediction of serious illness due to COVID-19 using inpatient data collected from electronic health records. METHODS The algorithm was trained to identify patients for whom a diagnosis of COVID-19 was likely to result in hospitalization, and compared against four U.S. policy-based criteria: age over 65; having a serious underlying health condition; age over 65 or having a serious underlying health condition; and age over 65 and having a serious underlying health condition. RESULTS This algorithm identified 80% of patients at risk for hospitalization due to COVID-19, versus at most 62% identified by government guidelines. The algorithm also achieved a high specificity of 95%, outperforming government guidelines. CONCLUSIONS This algorithm may help to enable a broad reopening of the American economy while ensuring that patients at high risk for serious disease remain home until vaccination and testing become available.


Author(s):  
Timnit Gebru

This chapter discusses the role of race and gender in artificial intelligence (AI). The rapid permeation of AI into society has not been accompanied by a thorough investigation of the sociopolitical issues that cause certain groups of people to be harmed rather than advantaged by it. For instance, recent studies have shown that commercial automated facial analysis systems have much higher error rates for dark-skinned women, while having minimal errors on light-skinned men. Moreover, a 2016 ProPublica investigation uncovered that machine learning–based tools that assess crime recidivism rates in the United States are biased against African Americans. Other studies show that natural language–processing tools trained on news articles exhibit societal biases. While many technical solutions have been proposed to alleviate bias in machine learning systems, a holistic and multifaceted approach must be taken. This includes standardization bodies determining what types of systems can be used in which scenarios, making sure that automated decision tools are created by people from diverse backgrounds, and understanding the historical and political factors that disadvantage certain groups who are subjected to these tools.


2021 ◽  
Vol 14 (5) ◽  
pp. 472
Author(s):  
Tyler C. Beck ◽  
Kyle R. Beck ◽  
Jordan Morningstar ◽  
Menny M. Benjamin ◽  
Russell A. Norris

Roughly 2.8% of annual hospitalizations in the United States are a result of adverse drug interactions, representing more than 245,000 hospitalizations. Drug–drug interactions commonly arise from major cytochrome P450 (CYP) inhibition. Various approaches are routinely employed to reduce the incidence of adverse interactions, such as altering drug dosing schemes and/or minimizing the number of drugs prescribed; however, often, a reduction in the number of medications cannot be achieved without impacting therapeutic outcomes. Nearly 80% of drugs fail in development due to pharmacokinetic issues, outlining the importance of examining cytochrome interactions during preclinical drug design. In this review, we examined the physiochemical and structural properties of small molecule inhibitors of CYPs 3A4, 2D6, 2C19, 2C9, and 1A2. Although CYP inhibitors tend to have distinct physiochemical properties and structural features, these descriptors alone are insufficient to predict major cytochrome inhibition probability and affinity. Machine learning-based in silico approaches may be employed as a more robust and accurate way of predicting CYP inhibition. These various approaches are highlighted in the review.

