Analytics, Machine Learning & NLP – use in BioSurveillance and Public Health practice

This presentation summarizes ways in which analytics, machine learning (ML), and natural language processing (NLP) can improve accuracy and efficiency in biosurveillance and public health practice. Most surveillance environments and applications now produce an abundance of data. The two main challenges in today's applications are identifying and filtering the relevant messages from this ocean of big data, and then processing the resulting informative datasets to gain knowledge. The presentation also describes the simulation environment in which this ML platform was evaluated, consisting of devices/sensors, web/mobile applications, clinical records, internet queries, and social/news media, and covers the infrastructure needs of this operating environment.

Artificial Intelligence in News Media: Current Perceptions and Future Outlook

10.20944/preprints202110.0020.v2 ◽

2021 ◽

Author(s):

Mathias-Felipe de-Lima-Santos ◽

Wilson Ceron

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Computer Vision ◽

Science Fiction ◽

Language Processing ◽

News Media ◽

Future Research ◽

Production And Distribution ◽

News Industry ◽

Journalistic Field

In recent years, news media has been greatly disrupted by the potential of technologically driven approaches in the creation, production, and distribution of news products and services. Artificial intelligence (AI) has emerged from the realm of science fiction and has become a very real tool that can aid society in addressing many issues, including the challenges faced by the news industry. The ubiquity of computing has become apparent and has demonstrated the different approaches that can be achieved using AI. We analyzed the news industry’s AI adoption based on the seven subfields of AI: (i) machine learning; (ii) computer vision (CV); (iii) speech recognition; (iv) natural language processing (NLP); (v) planning, scheduling, and optimization; (vi) expert systems; and (vii) robotics. Our findings suggest that three subfields are being developed more in the news media: machine learning; computer vision; and planning, scheduling, and optimization. Other areas have not been fully deployed in the journalistic field. Most AI news projects rely on funds from tech companies such as Google. This limits AI’s potential to a small number of players in the news industry. We conclude by providing examples of how these subfields are being developed in journalism and present an agenda for future research.

Natural language processing and machine learning methods in public health surveillance: a narrative review (Preprint)

10.2196/preprints.26351 ◽

2020 ◽

Author(s):

Patrick James Ward ◽

April M Young

Keyword(s):

Public Health ◽

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Public Health Surveillance ◽

Health Surveillance ◽

Surveillance Data ◽

Online Media ◽

Traditional Surveillance

BACKGROUND Public health surveillance is critical to detecting emerging population health threats and improvements. Surveillance data has increased in size and complexity, posing challenges to data management and analysis. Natural language processing (NLP) and machine learning (ML) are valuable tools for analysis of unstructured data involving free-text and have been used in innovative ways to examine a variety of health outcomes. OBJECTIVE Given the cross-disciplinary applications of NLP and ML, research on their applications in surveillance has been disseminated in a variety of outlets. As such, the aim of this narrative review was to describe the current state of NLP and ML use in surveillance science and to identify directions in future research. METHODS Information was abstracted from articles describing the use of natural language processing and machine learning in public health surveillance identified through a PubMed search. RESULTS Twenty-two articles met review criteria, 12 involving traditional surveillance data sources and 10 involving online media sources for surveillance. Traditional surveillance sources analyzed with NLP and ML consisted primarily of death certificates (n=6), hospital data (n=5), and online media sources (e.g., Twitter) (n=8). CONCLUSIONS The reviewed articles demonstrate the potential of NLP and ML to enhance surveillance data through improving timeliness of surveillance, identifying cases in the absence of standardized case definitions, and enabling mining of social media for public health surveillance.

Web Search Engine Misinformation Notifier Extension (SEMiNExt): A Machine Learning Based Approach during COVID-19 Pandemic

Healthcare ◽

10.3390/healthcare9020156 ◽

2021 ◽

Vol 9 (2) ◽

pp. 156

Author(s):

Abdullah Bin Shams ◽

Ehsanul Hoque Apu ◽

Ashiqur Rahman ◽

Md. Mohsin Sarker Raihan ◽

Nazeeba Siddika ◽

...

Keyword(s):

Public Health ◽

Machine Learning ◽

Real Time ◽

Search Engine ◽

Language Processing ◽

Web Search ◽

Training Data ◽

Small Data ◽

Web Search Engine ◽

User Query

Misinformation, such as that on coronavirus disease 2019 (COVID-19) drugs, vaccination, or treatment presented by untrusted sources, has had dramatic consequences for public health. Authorities have deployed several surveillance tools to detect and slow down the rapid spread of misinformation online. Large quantities of unverified information are available online, and at present there is no real-time tool available to alert a user about false information during online health inquiries over a web search engine. To bridge this gap, we propose a web search engine misinformation notifier extension (SEMiNExt). Natural language processing (NLP) and machine learning algorithms have been successfully integrated into the extension. This enables SEMiNExt to read the user query from the search bar, classify the veracity of the query, and notify the user of its authenticity, all in real time, to prevent the spread of misinformation. Our results show that SEMiNExt under an artificial neural network (ANN) works best, with an accuracy of 93%, F1-score of 92%, precision of 92%, and recall of 93% when 80% of the data is used for training. Moreover, the ANN is able to predict with very high accuracy even for a small training data size. This is very important for early detection of new misinformation from a small data sample available online, which can significantly reduce the spread of misinformation and maximize public health safety. The SEMiNExt approach has introduced the possibility of improving online health management systems by showing misinformation notifications in real time, enabling safer web-based searching on health-related issues.
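The accuracy, precision, recall, and F1-score reported above are standard quantities derived from a binary confusion matrix. As a minimal sketch (not the authors' code), assuming a two-class setup in which "misinformation" is the positive class, the hypothetical helper below computes them from raw counts. The counts in the usage lines are made up for illustration and do not come from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives, and
    true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (hypothetical, not taken from the SEMiNExt study).
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Reporting all four figures together, as the abstract does, guards against a classifier that scores well on accuracy alone because one class dominates the data.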

Artificial Intelligence in News Media: Current Perceptions and Future Outlook

Journalism and Media ◽

10.3390/journalmedia3010002 ◽

2021 ◽

Vol 3 (1) ◽

pp. 13-26

Author(s):

Mathias-Felipe de-Lima-Santos ◽

Wilson Ceron

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Computer Vision ◽

Science Fiction ◽

Language Processing ◽

News Media ◽

Future Research ◽

Production And Distribution ◽

News Industry ◽

Journalistic Field

In recent years, news media has been greatly disrupted by the potential of technologically driven approaches in the creation, production, and distribution of news products and services. Artificial intelligence (AI) has emerged from the realm of science fiction and has become a very real tool that can aid society in addressing many issues, including the challenges faced by the news industry. The ubiquity of computing has become apparent and has demonstrated the different approaches that can be achieved using AI. We analyzed the news industry’s AI adoption based on the seven subfields of AI: (i) machine learning; (ii) computer vision (CV); (iii) speech recognition; (iv) natural language processing (NLP); (v) planning, scheduling, and optimization; (vi) expert systems; and (vii) robotics. Our findings suggest that three subfields are being developed more in the news media: machine learning; computer vision; and planning, scheduling, and optimization. Other areas have not been fully deployed in the journalistic field. Most AI news projects rely on funds from tech companies such as Google. This limits AI’s potential to a small number of players in the news industry. We conclude by providing examples of how these subfields are being developed in journalism and present an agenda for future research.

"Life is unrecognisable": A natural language processing study of COVID-19 impacts on Australian adults (Preprint)

10.2196/preprints.29213 ◽

2021 ◽

Author(s):

Jillian Ryan ◽

Hamza Sellak ◽

Emily Brindal

Keyword(s):

Public Health ◽

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Computer Algorithms ◽

Life Domains ◽

Multinomial Regression ◽

Positive Effects ◽

Negative Sentiment

BACKGROUND Natural language processing is a machine learning technique that uses intelligent computer algorithms to detect patterns and themes in unstructured datasets commonly containing text data. Machine learning can aid with understanding the impacts of novel and disruptive events, and therefore offers myriad public health applications. OBJECTIVE This study aims to explore community sentiment towards COVID-19 and the nature of the impacts that COVID-19 has had on people using natural language processing on a linked research dataset. METHODS Stanford CoreNLP was used to analyse and detect sentiment in qualitative COVID-19 impact stories from 3,483 Australian adults. Common themes were categorised according to the Theoretical Life Domains framework and a multinomial regression analysis was conducted to identify psychological and demographic predictors of sentiment. RESULTS About one-third of participants (33%) expressed negative sentiment towards COVID-19, while a further 44% expressed neutral sentiment and 23% expressed positive sentiment. Of the Theoretical Life Domains, behavioural regulation was by far the most commonly impacted life domain, followed by environmental context and resources, emotion, and social influences. Negative sentiment was predicted by financial stress and lower subjective wellbeing. CONCLUSIONS COVID-19 and its containment measures have had dramatic impacts on Australian adults. Effects on the ability to regulate health and social behaviours were among the most common impacts, which raises concerns about the effects of public health crises on chronic health and mental health conditions. Positive effects of COVID-19, related to greater flexibility in working arrangements and reductions in life ‘busyness’, were also documented.

Artificial Intelligence in News Media: Current Perceptions and Future Outlook

10.20944/preprints202110.0020.v1 ◽

2021 ◽

Author(s):

Mathias-Felipe de-Lima-Santos ◽

Wilson Ceron

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Computer Vision ◽

Science Fiction ◽

Language Processing ◽

News Media ◽

Future Research ◽

Production And Distribution ◽

News Industry ◽

Journalistic Field

In recent years, news media have been hugely disrupted by the potential of technologically driven approaches in the creation, production, and distribution of news products and services. Artificial intelligence (AI) has emerged from the realm of science fiction and has become a very real tool that can aid society in addressing many issues, including the challenges faced by the news industry. The ubiquity of computing has become apparent and has shown the different approaches that can be achieved using AI. We analyzed the news industry's AI adoption based on the seven subfields of AI: (i) machine learning; (ii) computer vision (CV); (iii) speech recognition; (iv) natural language processing (NLP); (v) planning, scheduling, and optimization; (vi) expert systems; and (vii) robotics. Our findings suggest that three subfields are being developed more in the news media: machine learning; planning, scheduling, and optimization; and computer vision. Other areas are still not fully deployed in the journalistic field. Most of the AI news projects rely on funds from tech companies, such as Google. This limits the potential of AI in the news industry to a small number of players. We conclude by providing examples of how these subfields are being developed in journalism and present an agenda for future research.

Whether the Weather Will Help Us Weather the COVID-19 Pandemic: Using Machine Learning to Measure Twitter Users' Perceptions

10.1101/2020.07.29.20164814 ◽

2020 ◽

Author(s):

Marichi Gupta ◽

Adity Bansal ◽

Bhav Jain ◽

Jillian Rochelle ◽

Atharv Oak ◽

...

Keyword(s):

Public Health ◽

Machine Learning ◽

Language Processing ◽

Scientific Evidence ◽

The Public ◽

Potential Impact ◽

Twitter Users ◽

Processing Techniques ◽

The Impact ◽

Weather’s Impact

Objective: The potential for weather to affect SARS-CoV-2 transmission has been an area of controversial discussion during the COVID-19 pandemic. Individuals' perceptions of the impact of weather can inform their adherence to public health guidelines; however, there is no measure of their perceptions. We quantified Twitter users' perceptions of the effect of weather and analyzed how they evolved with respect to real-world events and time. Materials and Methods: We collected 166,005 tweets posted between January 23 and June 22, 2020 and employed machine learning/natural language processing techniques to filter for relevant tweets, classify them by the type of effect they claimed, and identify topics of discussion. Results: We identified 28,555 relevant tweets and estimate that 40.4% indicate uncertainty about weather's impact, 33.5% indicate no effect, and 26.1% indicate some effect. We tracked changes in these proportions over time. Topic modeling revealed major latent areas of discussion. Discussion: There is no consensus among the public on weather's potential impact. Earlier months were characterized by tweets that were uncertain of weather's effect or claimed no effect; later, the portion of tweets claiming some effect of weather increased. Tweets claiming no effect of weather comprised the largest class by June. Major topics of discussion included comparisons to influenza's seasonality, President Trump's comments on weather's effect, and social distancing. Conclusion: There is a major gap between scientific evidence and public opinion of weather's impacts on COVID-19. We provide evidence of the public's misconceptions and topics of discussion, which can inform public health communications.

Identifying Key Target Audiences for Public Health Campaigns: Leveraging Machine Learning in the Case of Hookah Tobacco Smoking (Preprint)

10.2196/preprints.12443 ◽

2018 ◽

Author(s):

Kar-Hai Chu ◽

Jason Colditz ◽

Momin Malik ◽

Tabitha Yates ◽

Brian Primack

Keyword(s):

Public Health ◽

Machine Learning ◽

Social Media ◽

Language Processing ◽

Tobacco Smoking ◽

A Priori ◽

Machine Learning Techniques ◽

Systematic Research ◽

Health Campaigns ◽

Public Health Officials

BACKGROUND Hookah tobacco smoking (HTS) is a particularly important issue for public health professionals to address owing to its prevalence and deleterious health effects. Social media sites can be a valuable tool for public health officials to conduct informational health campaigns. Current social media platforms provide researchers with opportunities to better identify and target specific audiences and even individuals. However, we are not aware of systematic research attempting to identify audiences with mixed or ambivalent views toward HTS. OBJECTIVE The objective of this study was to (1) confirm previous research showing positively skewed HTS sentiment on Twitter using a larger dataset by leveraging machine learning techniques and (2) systematically identify individuals who exhibit mixed opinions about HTS via the Twitter platform and therefore represent key audiences for intervention. METHODS We prospectively collected tweets related to HTS from January to June 2016. We double-coded a subset of approximately 5000 randomly sampled tweets for sentiment toward HTS and used these data to train a machine learning classifier to assess the remaining approximately 556,000 HTS-related Twitter posts. Natural language processing software was used to extract linguistic features (ie, language-based covariates). The data were processed by machine learning tools and algorithms using R. Finally, we used the results to identify individuals who, because they had consistently posted both positive and negative content, might be ambivalent toward HTS and represent an ideal audience for intervention. RESULTS There were 561,960 HTS-related tweets: 373,911 were classified as positive and 183,139 were classified as negative. A set of 12,861 users met a priori criteria indicating that they posted both positive and negative tweets about HTS. CONCLUSIONS Sentiment analysis can allow researchers to identify audience segments on social media that demonstrate ambiguity toward key public health issues, such as HTS, and therefore represent ideal populations for intervention. Using large social media datasets can help public health officials to preemptively identify specific audience segments that would be most receptive to targeted campaigns.
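The final step described in this abstract, selecting users who posted both positive and negative content, can be illustrated with a short sketch. The study itself used R and does not state its a priori criteria here, so the Python function below, its name, and the `min_each` threshold are assumptions made purely for illustration, not the authors' method.

```python
from collections import defaultdict


def find_ambivalent_users(labeled_tweets, min_each=1):
    """Return users with at least `min_each` positive AND negative tweets.

    labeled_tweets: iterable of (user_id, label) pairs, where label is
    "positive" or "negative" (e.g. the output of a sentiment classifier).
    min_each: hypothetical threshold standing in for the study's
    unspecified a priori criteria.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for user_id, label in labeled_tweets:
        if label in ("positive", "negative"):
            counts[user_id][label] += 1
    # Keep only users who meet the threshold for both sentiment classes.
    return sorted(
        user_id
        for user_id, c in counts.items()
        if c["positive"] >= min_each and c["negative"] >= min_each
    )


# Made-up example data; user IDs and labels are hypothetical.
tweets = [
    ("a", "positive"), ("a", "negative"),
    ("b", "positive"),
    ("c", "negative"), ("c", "negative"), ("c", "positive"),
]
print(find_ambivalent_users(tweets))
```

Raising `min_each` makes the selection stricter, which is one plausible way to encode "consistently posted both positive and negative content".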

Recent Advances in Using Natural Language Processing to Address Public Health Research Questions Using Social Media and Consumer-Generated Data

Yearbook of Medical Informatics ◽

10.1055/s-0039-1677918 ◽

2019 ◽

Vol 28 (01) ◽

pp. 208-217 ◽

Cited By ~ 9

Author(s):

Mike Conway ◽

Mengke Hu ◽

Wendy W. Chapman

Keyword(s):

Public Health ◽

Mental Health ◽

Machine Learning ◽

Social Media ◽

Natural Language Processing ◽

Language Processing ◽

Online Health Communities ◽

Machine Learning Methods ◽

Health Communities ◽

Health Applications

Objective: We present a narrative review of recent work on the utilisation of Natural Language Processing (NLP) for the analysis of social media (including online health communities) specifically for public health applications. Methods: We conducted a literature review of NLP research that utilised social media or online consumer-generated text for public health applications, focussing on the years 2016 to 2018. Papers were identified in several ways, including PubMed searches and the inspection of recent conference proceedings from the Association for Computational Linguistics (ACL), the Conference on Human Factors in Computing Systems (CHI), and the International AAAI (Association for the Advancement of Artificial Intelligence) Conference on Web and Social Media (ICWSM). Popular data sources included Twitter, Reddit, various online health communities, and Facebook. Results: In the recent past, communicable diseases (e.g., influenza, dengue) have been the focus of much social media-based NLP health research. However, mental health and substance use and abuse (including the use of tobacco, alcohol, marijuana, and opioids) have been the subject of an increasing volume of research in the 2016-2018 period. Associated with this trend, the use of lexicon-based methods remains popular given the availability of psychologically validated lexical resources suitable for mental health and substance abuse research. Finally, we found that in the period under review “modern” machine learning methods (i.e., deep neural-network-based methods), while increasing in popularity, remain less widely used than “classical” machine learning methods.

County-Level Data Key to Effective Public Health Practice

PsycEXTRA Dataset ◽

10.1037/e600602007-003 ◽

2004 ◽

Keyword(s):

Public Health ◽

Public Health Practice ◽

County Level ◽

Health Practice ◽

Level Data ◽

Effective Public Health
