A Proof of Concept for Assessing Emergency Room Use with Primary Care Data and Natural Language Processing

2013 ◽  
Vol 52 (01) ◽  
pp. 33-42 ◽  
Author(s):  
M.-H. Kuo ◽  
P. Gooch ◽  
J. St-Maurice

Summary
Objective: The objective of this study was to undertake a proof of concept demonstrating the use of primary care data, natural language processing, and term extraction to assess emergency room use. The study extracted biopsychosocial concepts from primary care free text and related them to inappropriate emergency room use through odds ratios.
Methods: De-identified free-text notes were extracted from a primary care clinic in Guelph, Ontario and analyzed with a software toolkit that incorporated General Architecture for Text Engineering (GATE) and MetaMap components for natural language processing and term extraction.
Results: Over 10 million concepts were extracted from 13,836 patient records. Codes found in at least 1% of the sample were regressed against inappropriate emergency room use. 77 codes fell within the biopsychosocial realm, were highly statistically significant (p < 0.001), and had an OR > 2.0. Thematically, these codes involved mental health and pain-related concepts.
Conclusions: Analyzed thematically, mental health issues and pain are important themes; we conclude that pain and mental health problems are primary drivers of inappropriate emergency room use. Age and sex were not significant. This proof of concept demonstrates the feasibility of combining natural language processing and primary care data to analyze a system use question. As a first work, it supports further research and could be applied to investigate other, more complex problems.
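
A minimal sketch of the core statistical step described above, assuming a patient-by-code indicator matrix as input: each extracted concept code present in at least 1% of patients is related to inappropriate ER use via an odds ratio. The column names, toy data, and the use of Fisher's exact test (a simple stand-in for the study's regression) are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: relate extracted concept codes to inappropriate ER use
# via odds ratios. Toy data and column names are invented for illustration.
import pandas as pd
from scipy.stats import fisher_exact

# Each row is a patient: binary indicators for extracted concept codes,
# plus a binary outcome flag for inappropriate emergency room use.
df = pd.DataFrame({
    "code_chronic_pain": [1, 0, 1, 1, 0, 1, 0, 1],
    "code_anxiety":      [0, 1, 1, 0, 1, 1, 0, 0],
    "inappropriate_er":  [1, 0, 1, 0, 0, 1, 1, 1],
})

# Keep only codes present in at least 1% of the sample, as in the study.
codes = [c for c in df.columns
         if c.startswith("code_") and df[c].mean() >= 0.01]

# Fisher's exact test on each 2x2 table yields an odds ratio and p-value.
for code in codes:
    table = pd.crosstab(df[code], df["inappropriate_er"])
    oddsratio, p = fisher_exact(table)
    print(f"{code}: OR={oddsratio:.2f}, p={p:.3f}")
```

Codes with OR > 2.0 and very small p-values would then be grouped thematically, as the abstract describes.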

PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0257832
Author(s):  
Franziska Burger ◽  
Mark A. Neerincx ◽  
Willem-Paul Brinkman

The cognitive approach to psychotherapy aims to change patients’ maladaptive schemas, that is, overly negative views on themselves, the world, or the future. To obtain awareness of these views, they record their thought processes in situations that caused pathogenic emotional responses. The schemas underlying such thought records have, thus far, been largely manually identified. Using recent advances in natural language processing, we take this one step further by automatically extracting schemas from thought records. To this end, we asked 320 healthy participants on Amazon Mechanical Turk to each complete five thought records consisting of several utterances reflecting cognitive processes. Agreement between two raters on manually scoring the utterances with respect to how much they reflect each schema was substantial (Cohen’s κ = 0.79). Natural language processing software pretrained on all English Wikipedia articles from 2014 (GloVe embeddings) was used to represent words and utterances, which were then mapped to schemas using k-nearest neighbors algorithms, support vector machines, and recurrent neural networks. For the more frequently occurring schemas, all algorithms were able to leverage linguistic patterns. For example, the scores assigned to the Competence schema by the algorithms correlated with the manually assigned scores with Spearman correlations ranging between 0.64 and 0.76. For six of the nine schemas, a set of recurrent neural networks trained separately for each of the schemas outperformed the other algorithms. We present our results here as a benchmark solution, since we conducted this research to explore the possibility of automatically processing qualitative mental health data and did not aim to achieve optimal performance with any of the explored models. The dataset of 1600 thought records comprising 5747 utterances is published together with this article for researchers and machine learning enthusiasts to improve upon our outcomes. Based on our promising results, we see further opportunities for using free-text input and subsequent natural language processing in other common therapeutic tools, such as ecological momentary assessments, automated case conceptualizations, and, more generally, as an alternative to mental health scales.
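
The pipeline lends itself to a compact illustration. Below is a minimal sketch, assuming pretrained GloVe vectors (here loaded through gensim's downloader) and a k-nearest-neighbor mapping from averaged word vectors to one schema's rater scores; the utterances and scores are invented, not drawn from the published dataset.

```python
# Sketch of the embedding-plus-kNN approach: utterances become averaged GloVe
# vectors, which a k-nearest-neighbor model maps to schema scores.
import numpy as np
import gensim.downloader as api
from sklearn.neighbors import KNeighborsRegressor

# Pretrained GloVe vectors (Wikipedia 2014 + Gigaword), downloaded on first use.
glove = api.load("glove-wiki-gigaword-100")

def embed(utterance: str) -> np.ndarray:
    """Average the GloVe vectors of all in-vocabulary tokens."""
    tokens = [t for t in utterance.lower().split() if t in glove]
    return np.mean([glove[t] for t in tokens], axis=0)

# Hypothetical rater scores for a single schema (e.g., Competence).
train_utterances = ["i always fail at everything",
                    "nobody ever listens to me",
                    "i handled that meeting well"]
competence_scores = [1.0, 0.2, 0.0]

knn = KNeighborsRegressor(n_neighbors=1)
knn.fit([embed(u) for u in train_utterances], competence_scores)
print(knn.predict([embed("i am bad at my job")]))
```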


2021 ◽  
Author(s):  
Dan W Joyce ◽  
Andrey Kormilitzin ◽  
Julia Hamer-Hunt ◽  
Anthony James ◽  
Alejo Nevado-Holgado ◽  
...  

Abstract
Background: Accessing specialist secondary mental health care in the NHS in England requires a referral, usually from primary or acute care. Community mental health teams triage these referrals, deciding on the most appropriate team to meet patients’ needs. Referrals require resource-intensive review by clinicians and, often, collation and review of the patient’s history with services captured in their electronic health records (EHR). Triage processes are, however, opaque and often result in patients not receiving appropriate and timely access to care, which is a particular concern for some minority and under-represented groups. Our project, funded by the National Institute for Health Research (NIHR), will develop a clinical decision support tool (CDST) to deliver accurate, explainable and justified triage recommendations to assist clinicians and expedite access to secondary mental health care.
Methods: Our proposed CDST will be trained on narrative free-text data combining referral documentation and historical EHR records for patients in the UK-CRIS database. This high-volume dataset will enable training of end-to-end neural network natural language processing (NLP) to extract ‘signatures’ of patients who were (historically) triaged to different treatment teams. The resulting algorithm will be externally validated using data from different NHS trusts (Nottinghamshire Healthcare, Southern Health, West London and Oxford Health). We will use an explicit algorithmic fairness framework to mitigate the risk of unintended harm evident in some artificial intelligence (AI) healthcare applications. Consequently, the performance of the CDST will be explicitly evaluated in simulated triage team scenarios where the tool augments clinicians’ decision making, in contrast to traditional “human versus AI” performance metrics.
Discussion: The proposed CDST represents an important test case for AI applied to real-world process improvement in mental health. The project leverages recent advances in NLP while emphasizing the risks and benefits for patients of AI-augmented clinical decision making. The project’s ambition is to deliver a CDST that is scalable and can be deployed to any mental health trust in England to assist with digital triage.
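
Since the UK-CRIS data are not public and the neural models are still to be built, the following is only an illustrative stand-in showing the shape of the task (free-text referral in, triage team out), using a simple TF-IDF classifier rather than the proposed end-to-end neural NLP; all referral texts and team labels are invented.

```python
# Illustrative stand-in for digital triage: map referral free text to a team.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

referrals = [
    "low mood, self-harm risk, requests psychological therapy",
    "acute psychosis, recent hospital discharge, needs urgent follow-up",
    "memory decline in older adult, carer strain at home",
]
teams = ["psychological_therapies", "early_intervention", "older_adults"]

triage = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
triage.fit(referrals, teams)
print(triage.predict(["worsening hallucinations after medication change"]))
```

In the evaluation the protocol proposes, such a model's suggestions would augment a clinician's decision rather than replace it.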


Author(s):  
Margot Yann ◽  
Therese Stukel ◽  
Liisa Jaakkimainen ◽  
Karen Tu

Introduction: A number of challenges exist in analyzing unstructured free-text data in electronic medical records (EMRs). EMR text is difficult to represent and model due to its high dimensionality, heterogeneity, sparsity, incompleteness, random errors, and noise.
Objectives and Approach: Standard Natural Language Processing (NLP) tools make errors when applied to clinical notes because physicians use unconventional language involving polysemy, abbreviations, ambiguity, misspellings, variations, and negation. This paper presents a novel NLP framework, “Clinical Learning On Natural Expression” (CLONE), that automatically learns from a large primary care EMR database, analyzing free-text clinical notes from primary care practices. CLONE builds predictive clinical models, using text mining and a neural network approach to extract features and identify patterns. To demonstrate its effectiveness, we evaluate CLONE’s ability in a case study to identify patients with a specific chronic condition: congestive heart failure (CHF).
Results: A randomly selected sample of 7,500 patients from the Electronic Medical Record Administrative data Linked Database (EMRALD) is used. In this dataset, each patient’s medical chart includes a reference standard, manually reviewed by medical practitioners. The prevalence of CHF is approximately 2%. This low prevalence leads to another challenging problem in machine learning: imbalanced datasets. After pre-processing, we build deep learning models to represent and extract important medical information from free text, identifying CHF patients by analyzing patient charts. We evaluated the effectiveness of CLONE by comparing the predicted labels with the reference standards on a holdout test dataset. Compared with a number of alternative algorithms, we improve the overall accuracy to over 90% on the test dataset.
Conclusion/Implications: As the role of NLP in EMR data expands, the CLONE natural language processing framework can lead to a substantial reduction in manual processing while improving predictive accuracy.
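
The class-imbalance point generalizes well to a short sketch. The snippet below is a simplified analogue, not CLONE itself: it pairs TF-IDF chart features with a class-weighted linear classifier so that a roughly 2%-prevalence positive class is not swamped by the majority label; the charts and labels are toy data.

```python
# Simplified analogue of imbalanced CHF detection from chart text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

charts = ["shortness of breath, edema, reduced ejection fraction",
          "annual physical, no complaints",
          "knee pain after fall",
          "routine flu shot visit"]
labels = [1, 0, 0, 0]  # 1 = CHF per manual chart review (toy data)

# class_weight='balanced' reweights the rare positive class so the model
# does not simply predict the majority label.
model = make_pipeline(TfidfVectorizer(),
                      LinearSVC(class_weight="balanced"))
model.fit(charts, labels)
print(model.predict(["dyspnea on exertion with peripheral edema"]))
```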


2021 ◽  
Author(s):  
Ian R. Braun ◽  
Diane C. Bassham ◽  
Carolyn J. Lawrence-Dill

Abstract
Motivation: Finding similarity across phenotypic descriptions is not straightforward, with previous successes in computation requiring significant expert data curation. Natural language processing of free-text phenotype descriptions is often easier to apply than intensive curation. It is therefore critical to understand the extent to which these techniques can be used to organize and analyze biological datasets and enable biological discoveries.
Results: A wide variety of approaches from the natural language processing domain perform as well as similarity metrics over curated annotations for predicting shared phenotypes. These approaches also show promise both for helping curators organize and work through large datasets and for enabling researchers to explore relationships among available phenotype descriptions. Here we generate networks of phenotype similarity and share a web application for querying a dataset of associated plant genes using these text mining approaches. Example situations and species for which application of these techniques is most useful are discussed.
Availability: The dataset used in this work is available at https://git.io/JTutQ. The code for the analysis performed here is available at https://git.io/JTutN and https://git.io/JTuqv. The code for the web application discussed here is available at https://git.io/Jtv9J, and the application itself is available at https://quoats.dill-picl.org/.
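
As a rough illustration of how such a similarity network can be generated, the sketch below scores pairwise similarity between free-text phenotype descriptions with TF-IDF and cosine similarity, emitting edges above a threshold; the gene names, descriptions, and the 0.2 cutoff are invented and far simpler than the paper's full set of NLP approaches.

```python
# Minimal sketch: build a phenotype-similarity network from text descriptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

phenotypes = {
    "gene_a": "dwarf plants with reduced height",
    "gene_b": "reduced plant height and compact stature",
    "gene_c": "early flowering under long days",
}
names = list(phenotypes)
tfidf = TfidfVectorizer().fit_transform(phenotypes.values())
sim = cosine_similarity(tfidf)

# Emit network edges for description pairs above a similarity threshold.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if sim[i, j] > 0.2:
            print(names[i], names[j], round(float(sim[i, j]), 2))
```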


Author(s):  
Mario Jojoa Acosta ◽  
Gema Castillo-Sánchez ◽  
Begonya Garcia-Zapirain ◽  
Isabel de la Torre Díez ◽  
Manuel Franco-Martín

The use of artificial intelligence in health care has grown quickly. In this sense, we present our work on the application of Natural Language Processing (NLP) techniques as a tool to analyze the sentiment perception of users who answered two questions from the CSQ-8 questionnaire with raw Spanish free text. Their responses relate to mindfulness, a novel technique used to control stress and anxiety caused by different factors in daily life. As such, we proposed an online course in which this method was applied to improve the quality of life of health care professionals during the COVID-19 pandemic. We also carried out an evaluation of the satisfaction level of the participants involved, with a view to establishing strategies to improve future experiences. To perform this task automatically, we used NLP models such as Swivel embeddings, neural networks, and transfer learning to classify the inputs into three categories: negative, neutral, and positive. Due to the limited amount of data available (86 records for the first question and 68 for the second), transfer learning techniques were required. The length of the text had no limit from the user’s standpoint, and our approach attained a maximum accuracy of 93.02% and 90.53%, respectively, based on ground truth labeled by three experts. Finally, we proposed a complementary analysis using graphical text representations based on word frequency to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that applying NLP techniques with transfer learning to small amounts of data can achieve sufficient accuracy in sentiment analysis and text classification.
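
A brief sketch of the transfer-learning setup, assuming a publicly available pretrained Swivel sentence-embedding module from TensorFlow Hub as the frozen base; the module shown is an English-language one used purely for illustration (the study worked with Spanish text), and the example sentences and labels are invented.

```python
# Hedged sketch: transfer learning for 3-class sentiment on very little data.
import tensorflow as tf
import tensorflow_hub as hub

# Frozen pretrained Swivel embedding layer (an assumed, English-language module).
embed = hub.KerasLayer(
    "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1",
    input_shape=[], dtype=tf.string, trainable=False)

model = tf.keras.Sequential([
    embed,                                            # pretrained sentence embedding
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # negative / neutral / positive
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

texts = tf.constant(["the course was very helpful", "it was fine", "waste of time"])
labels = tf.constant([2, 1, 0])  # toy labels standing in for expert ratings
model.fit(texts, labels, epochs=10, verbose=0)
```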


AERA Open ◽  
2021 ◽  
Vol 7 ◽  
pp. 233285842110286
Author(s):  
Kylie L. Anglin ◽  
Vivian C. Wong ◽  
Arielle Boguslav

Though there is widespread recognition of the importance of implementation research, evaluators often face intense logistical, budgetary, and methodological challenges in their efforts to assess intervention implementation in the field. This article proposes a set of natural language processing techniques called semantic similarity as an innovative and scalable method of measuring implementation constructs. Semantic similarity methods are an automated approach to quantifying the similarity between texts. By applying semantic similarity to transcripts of intervention sessions, researchers can use the method to determine whether an intervention was delivered with adherence to a structured protocol, and the extent to which an intervention was replicated with consistency across sessions, sites, and studies. This article provides an overview of semantic similarity methods, describes their application within the context of educational evaluations, and provides a proof of concept using an experimental study of the impact of a standardized teacher coaching intervention.
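
To make the idea concrete, here is a minimal sketch, assuming a pretrained sentence-embedding model (the sentence-transformers library and the model name are our choices, not the authors'): each session transcript is scored against a line of the protocol script by cosine similarity, so low scores flag possible departures from the protocol.

```python
# Minimal sketch: score session transcripts against a protocol line.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

protocol = "Begin by reviewing the student's goals, then model the strategy aloud."
sessions = [
    "We started with her goals for the week and I demonstrated the strategy.",
    "We mostly talked about scheduling and upcoming holidays.",
]

protocol_vec = model.encode(protocol, convert_to_tensor=True)
for text in sessions:
    session_vec = model.encode(text, convert_to_tensor=True)
    score = util.cos_sim(protocol_vec, session_vec)
    print(round(float(score), 2), "-", text)
```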


2021 ◽  
Vol 28 (1) ◽  
pp. e100262
Author(s):  
Mustafa Khanbhai ◽  
Patrick Anyadi ◽  
Joshua Symons ◽  
Kelsey Flott ◽  
Ara Darzi ◽  
...  

Objectives: Unstructured free-text patient feedback contains rich information, and analysing these data manually would require personnel resources that are not available in most healthcare organisations. We undertook a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data.
Methods: Databases were systematically searched to identify articles published between January 2000 and December 2019 examining the use of NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded.
Results: Nineteen articles were included. The majority (80%) of studies applied language analysis techniques to patient feedback from social media sites (unsolicited), followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers.
Conclusion: NLP and ML have emerged as important tools for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.
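
For readers unfamiliar with the reported metrics, the sketch below shows the kind of supervised setup the review describes: a Naïve Bayes classifier over TF-IDF features of patient comments, evaluated with precision, recall and F-measure. The comments and labels are invented; no study in the review is being reproduced.

```python
# Illustrative supervised pipeline for free-text patient feedback.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import precision_recall_fscore_support

comments = ["staff were kind and attentive",
            "waited four hours with no updates",
            "clean ward and clear discharge advice",
            "rude receptionist and lost paperwork"]
labels = [1, 0, 1, 0]  # 1 = positive experience (toy labels)

clf = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(comments, labels)
pred = clf.predict(comments)
print(precision_recall_fscore_support(labels, pred, average="binary"))
```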


2020 ◽  
Vol 4 (Supplement_1) ◽  
pp. 183-183
Author(s):  
Javad Razjouyan ◽  
Jennifer Freytag ◽  
Edward Odom ◽  
Lilian Dindo ◽  
Aanand Naik

Abstract
Patient Priorities Care (PPC) is a model of care that aligns health care recommendations with the priorities of older adults with multiple chronic conditions. Social workers (SWs), after online training, document PPC in the patient’s electronic health record (EHR). Our goal is to identify free-text notes with PPC language using a natural language processing (NLP) model and to measure PPC adoption and its effect on long-term services and support (LTSS) use. Free-text notes from the EHR produced by trained SWs were passed through a hybrid NLP model that utilized rule-based and statistical machine learning. NLP accuracy was validated against chart review. Patients who received PPC were propensity matched with patients not receiving PPC (control) on age, gender, BMI, Charlson comorbidity index, facility and SW. The change in LTSS utilization across 6-month intervals was compared between groups with univariate analysis. Chart review indicated that 491 notes out of 689 had PPC language, and the NLP model reached a precision of 0.85, a recall of 0.90, an F1 of 0.87, and an accuracy of 0.91. Within-group analysis shows that the intervention group used LTSS 1.8 times more in the 6 months after the encounter compared with the 6 months prior. Between-group analysis shows that the intervention group had significantly higher LTSS utilization (p=0.012). An automated NLP model can be used to reliably measure the adoption of PPC by SWs. PPC seems to encourage use of LTSS, which may delay time to long-term care placement.
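
A hedged sketch of what a hybrid rule-based plus statistical NLP detector for PPC language might look like; the trigger phrases, notes, and labels below are invented, since the validated model itself is not described in detail here.

```python
# Hypothetical hybrid detector: rules fire on explicit PPC phrases, with a
# statistical classifier as the backstop for less explicit language.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

PPC_PATTERNS = [r"patient priorities", r"what matters most", r"health priorities"]

def rule_flag(note: str) -> bool:
    """Rule component: fire on explicit PPC trigger phrases."""
    return any(re.search(p, note, re.IGNORECASE) for p in PPC_PATTERNS)

notes = ["Discussed what matters most to the veteran and set one health goal.",
         "Routine follow-up on transportation needs.",
         "Reviewed the patient's health priorities and aligned the care plan.",
         "Completed housing assessment paperwork."]
labels = [1, 0, 1, 0]  # 1 = PPC language present per chart review (toy data)

stat_model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(notes, labels)

def detect_ppc(note: str) -> bool:
    """Hybrid decision: rules first, statistical classifier as the backstop."""
    return rule_flag(note) or bool(stat_model.predict([note])[0])

print(detect_ppc("Note documents the patient's health priorities."))
```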


2021 ◽  
pp. 1063293X2098297
Author(s):  
Ivar Örn Arnarsson ◽  
Otto Frost ◽  
Emil Gustavsson ◽  
Mats Jirstrand ◽  
Johan Malmqvist

Product development companies collect data in the form of Engineering Change Requests for logged design issues, tests, and product iterations. These documents are rich in unstructured data (e.g. free text). Previous research affirms that product developers find that current IT systems lack the capability to accurately retrieve relevant documents containing unstructured data. In this research, we demonstrate a method using Natural Language Processing and document clustering algorithms to find structurally or contextually related documents in databases containing Engineering Change Request documents. The aim is to radically decrease the time needed to effectively search for related engineering documents, organize search results, and create labeled clusters from these documents by utilizing Natural Language Processing algorithms. A domain knowledge expert at the case company evaluated the results and confirmed that the algorithms we applied managed to find relevant document clusters given the queries tested.
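
A compact sketch of the described approach, assuming TF-IDF features, k-means clustering, and top-weighted terms as cluster labels; the example change-request texts and the choice of two clusters are ours, for illustration only.

```python
# Minimal sketch: cluster Engineering Change Request texts and label clusters.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

requests = ["bracket cracks under vibration load, propose thicker gauge steel",
            "update wiring harness routing to clear the new bracket",
            "paint adhesion failure on rear panel in humidity test",
            "rear panel coating peels after salt spray test"]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(requests)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Label each cluster with its highest-weight terms.
terms = vectorizer.get_feature_names_out()
for c in range(2):
    top = [terms[i] for i in km.cluster_centers_[c].argsort()[::-1][:3]]
    print(f"cluster {c}: {top}")
```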

