Use of Natural Language Processing to identify Obsessive Compulsive Symptoms in patients with schizophrenia, schizoaffective disorder or bipolar disorder

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
David Chandran ◽  
Deborah Ahn Robbins ◽  
Chin-Kuo Chang ◽  
Hitesh Shetty ◽  
Jyoti Sanyal ◽  
...  

Abstract Obsessive and Compulsive Symptoms (OCS) or Obsessive Compulsive Disorder (OCD) in the context of schizophrenia or related disorders are of clinical importance as these are associated with a range of adverse outcomes. Natural Language Processing (NLP) applied to Electronic Health Records (EHRs) presents an opportunity to create large datasets to facilitate research in this area. This is a challenging endeavour, however, because of the wide range of ways in which these symptoms are recorded, and the overlap of terms used to describe OCS with those used to describe other conditions. We developed an NLP algorithm to extract OCS information from a large mental healthcare EHR data resource at the South London and Maudsley NHS Foundation Trust using its Clinical Record Interactive Search (CRIS) facility. We extracted documents from individuals who had received a diagnosis of schizophrenia, schizoaffective disorder, or bipolar disorder. These text documents, annotated by human coders, were used for developing and refining the NLP algorithm (600 documents), with an additional set reserved for final validation (300 documents). The developed NLP algorithm utilized a rules-based approach to identify each of the symptoms associated with OCS, and then combined them to determine the overall number of instances of OCS. After its implementation, the algorithm was shown to identify OCS with a precision and recall (with 95% confidence intervals) of 0.77 (0.65–0.86) and 0.67 (0.55–0.77), respectively. The development of this application demonstrated the potential to extract complex symptomatic data from mental healthcare EHRs using NLP to facilitate further analyses of these clinical symptoms and their relevance for prognosis and intervention response.
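The abstract does not reproduce the rule set, but a rules-based symptom matcher of the kind it describes can be sketched as follows. The term lists and the single negation cue below are invented for illustration; the published algorithm's term inventory and context handling are far more extensive.

```python
import re

# Hypothetical symptom rules; placeholders for the study's actual term lists.
SYMPTOM_PATTERNS = {
    "obsessions": re.compile(
        r"\b(obsession(s)?|obsessional thought(s)?|intrusive thought(s)?)\b", re.I),
    "compulsions": re.compile(
        r"\b(compulsion(s)?|compulsive (checking|washing|counting)|ritual(s)?)\b", re.I),
}
# A crude negation cue: a negating word within ~30 characters before the mention.
NEGATION = re.compile(r"\b(no|denies|denied|without)\b[^.]{0,30}$", re.I)

def find_ocs_mentions(document: str) -> dict:
    """Count non-negated OCS symptom mentions per symptom class."""
    counts = {name: 0 for name in SYMPTOM_PATTERNS}
    for sentence in re.split(r"(?<=[.!?])\s+", document):
        for name, pattern in SYMPTOM_PATTERNS.items():
            for match in pattern.finditer(sentence):
                # Skip mentions preceded by a simple negation cue.
                if NEGATION.search(sentence[: match.start()]):
                    continue
                counts[name] += 1
    return counts
```

Per-symptom counts of this kind could then be combined, as the authors describe, into an overall count of OCS instances per document.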

Author(s):  
Clifford Nangle ◽  
Stuart McTaggart ◽  
Margaret MacLeod ◽  
Jackie Caldwell ◽  
Marion Bennie

Abstract

Objectives: The Prescribing Information System (PIS) datamart, hosted by NHS National Services Scotland, receives around 90 million electronic prescription messages per year from GP practices across Scotland. Prescription messages contain information including drug name, quantity and strength stored as coded, machine-readable data, while prescription dose instructions are unstructured free text that is difficult to interpret and analyse in volume. The aim, using Natural Language Processing (NLP), was to extract drug dose amount, unit and frequency metadata from freely typed text in dose instructions to support calculating the intended number of days’ treatment. This then allows comparison with actual prescription frequency, treatment adherence and the impact upon prescribing safety and effectiveness.

Approach: An NLP algorithm was developed using the Ciao implementation of Prolog to extract dose amount, unit and frequency metadata from dose instructions held in the PIS datamart for drugs used in the treatment of gastrointestinal, cardiovascular and respiratory disease. Accuracy estimates were obtained by randomly sampling 0.1% of the distinct dose instructions from source records and comparing these with metadata extracted by the algorithm; an iterative approach was used to modify the algorithm to increase accuracy and coverage.

Results: The NLP algorithm was applied to 39,943,465 prescription instructions issued in 2014, consisting of 575,340 distinct dose instructions. For drugs used in the gastrointestinal, cardiovascular and respiratory systems (i.e. chapters 1, 2 and 3 of the British National Formulary (BNF)), the NLP algorithm successfully extracted drug dose amount, unit and frequency metadata from 95.1%, 98.5% and 97.4% of prescriptions respectively. However, instructions containing terms such as ‘as directed’ or ‘as required’ reduce the usability of the metadata by making it difficult to calculate the total dose intended for a specific time period: 7.9%, 0.9% and 27.9% of dose instructions contained terms meaning ‘as required’, while 3.2%, 3.7% and 4.0% contained terms meaning ‘as directed’, for drugs used in BNF chapters 1, 2 and 3 respectively.

Conclusion: The NLP algorithm developed can extract dose, unit and frequency metadata from text found in prescriptions issued to treat a wide range of conditions, and this information may be used to support calculating treatment durations, medicines adherence and cumulative drug exposure. The presence of terms such as ‘as required’ and ‘as directed’ has a negative impact on the usability of the metadata, and further work is required to determine the level of impact this has on calculating treatment durations and cumulative drug exposure.
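The production algorithm was written in Ciao Prolog and is not reproduced in the abstract. A toy Python sketch of the same idea, extracting amount, unit and frequency from a dose instruction, is shown below; the patterns are invented and cover only a few phrasings, far fewer than the real system handles.

```python
import re

# Illustrative patterns only; the production Prolog grammar is far broader.
DOSE_RE = re.compile(
    r"(?P<amount>\d+(\.\d+)?|one|two|three)\s*"
    r"(?P<unit>tablet(s)?|capsule(s)?|ml|puff(s)?)\s*"
    r"(?P<freq>once|twice|three times)\s+(a|per)\s+day",
    re.I,
)
WORD_NUMBERS = {"one": 1, "two": 2, "three": 3}
FREQ_PER_DAY = {"once": 1, "twice": 2, "three times": 3}

def parse_dose(instruction: str):
    """Extract (amount, unit, doses per day) from a free-text dose instruction.

    Returns None when no structured dose can be found, as with the
    'as directed' / 'as required' instructions discussed in the paper.
    """
    m = DOSE_RE.search(instruction)
    if not m:
        return None
    amount_text = m.group("amount").lower()
    amount = WORD_NUMBERS.get(amount_text)
    if amount is None:
        amount = float(amount_text)
    return amount, m.group("unit").lower(), FREQ_PER_DAY[m.group("freq").lower()]
```

The intended number of days’ treatment then follows as quantity divided by (amount × doses per day), which is the comparison against actual prescribing frequency that the authors describe.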


AI Magazine ◽  
2015 ◽  
Vol 36 (1) ◽  
pp. 99-102
Author(s):  
Tiffany Barnes ◽  
Oliver Bown ◽  
Michael Buro ◽  
Michael Cook ◽  
Arne Eigenfeldt ◽  
...  

The AIIDE-14 Workshop program was held Friday and Saturday, October 3–4, 2014 at North Carolina State University in Raleigh, North Carolina. The workshop program included five workshops covering a wide range of topics. The titles of the workshops held Friday were Games and Natural Language Processing, and Artificial Intelligence in Adversarial Real-Time Games. The titles of the workshops held Saturday were Diversity in Games Research, Experimental Artificial Intelligence in Games, and Musical Metacreation. This article presents short summaries of those events.


2004 ◽  
Vol 10 (1) ◽  
pp. 57-89 ◽  
Author(s):  
MARJORIE MCSHANE ◽  
SERGEI NIRENBURG ◽  
RON ZACHARSKI

The topic of mood and modality (MOD) is a difficult aspect of language description because, among other reasons, the inventory of modal meanings is not stable across languages, moods do not map neatly from one language to another, modality may be realised morphologically or by free-standing words, and modality interacts in complex ways with other modules of the grammar, like tense and aspect. Describing MOD is especially difficult if one attempts to develop a unified approach that not only provides cross-linguistic coverage, but is also useful in practical natural language processing systems. This article discusses an approach to MOD that was developed for and implemented in the Boas Knowledge-Elicitation (KE) system. Boas elicits knowledge about any language, L, from an informant who need not be a trained linguist. That knowledge then serves as the static resources for an L-to-English translation system. The KE methodology used throughout Boas is driven by a resident inventory of parameters, value sets, and means of their realisation for a wide range of language phenomena. MOD is one of those parameters, whose values are the inventory of attested and not yet attested moods (e.g. indicative, conditional, imperative), and whose realisations include flective morphology, agglutinating morphology, isolating morphology, words, phrases and constructions. Developing the MOD elicitation procedures for Boas amounted to wedding the extensive theoretical and descriptive research on MOD with practical approaches to guiding an untrained informant through this non-trivial task. We believe that our experience in building the MOD module of Boas offers insights not only into cross-linguistic aspects of MOD that have not previously been detailed in the natural language processing literature, but also into KE methodologies that could be applied more broadly.
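As an illustration only (these names are not the Boas internals), the parameter/value/realisation inventory the article describes could be represented along these lines, with an informant's answers for a hypothetical language L filled in:

```python
from dataclasses import dataclass, field

# Schematic rendering of a Boas-style parameter; field names are illustrative.
@dataclass
class Parameter:
    name: str
    values: list                     # attested or anticipated values, e.g. moods
    realisations: dict = field(default_factory=dict)  # value -> how L marks it

mod = Parameter(
    name="MOD",
    values=["indicative", "conditional", "imperative"],
)
# Hypothetical informant responses for a language L:
mod.realisations["conditional"] = {"means": "flective morphology", "marker": "-ra"}
mod.realisations["imperative"] = {"means": "free-standing word", "marker": "que"}
```

The point of such a structure is that the elicitation procedure can iterate over values and realisation means, which is what allows an untrained informant to be guided through the task.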


2021 ◽  
Author(s):  
Taishiro Kishimoto ◽  
Hironobu Nakamura ◽  
Yoshinobu Kano ◽  
Yoko Eguchi ◽  
Momoko Kitazawa ◽  
...  

Abstract

Introduction: Psychiatric disorders are diagnosed according to diagnostic criteria such as the DSM-5 and ICD-11. Basically, psychiatrists extract symptoms and make a diagnosis by conversing with patients. However, such processes often lack objectivity. In contrast, specific linguistic features can be observed in some psychiatric disorders, such as a loosening of associations in schizophrenia. The purposes of the present study are to quantify the language features of psychiatric disorders and neurocognitive disorders using natural language processing, and to identify features that differentiate disorders from one another and from healthy subjects.

Methods: This study will have a multi-center prospective design. Participants with major depressive disorder, bipolar disorder, schizophrenia, anxiety disorder (including obsessive compulsive disorder) and major and minor neurocognitive disorders, as well as healthy subjects, will be recruited. A psychiatrist or psychologist will conduct 30-to-60-min interviews with each participant, and these interviews will be recorded using a microphone headset. In addition, the severity of disorders will be assessed using clinical rating scales. Data will be collected from each participant at least twice during the study period, up to a maximum of five times.

Discussion: The overall goal of this proposed study, Understanding Psychiatric Illness Through Natural Language Processing (UNDERPIN), is to develop objective and easy-to-use biomarkers for diagnosing and assessing the severity of each psychiatric disorder using natural language processing. As of August 2021, we have collected a total of >900 datasets from >350 participants. To the best of our knowledge, this data sample is one of the largest in this field.

Trial registration: UMIN000032141, University Hospital Medical Information Network (UMIN).


2020 ◽  
Vol 34 (05) ◽  
pp. 7456-7463 ◽  
Author(s):  
Zied Bouraoui ◽  
Jose Camacho-Collados ◽  
Steven Schockaert

One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.
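The first steps of this pipeline can be illustrated with a toy sketch: given seed (head, tail) pairs for a relation, find corpus sentences containing both terms and mask them to form templates. The corpus and seeds below are invented; the actual method then fine-tunes a BERT-style language model to score instantiated templates, which this stdlib sketch does not attempt.

```python
import re

def extract_templates(corpus_sentences, seed_pairs):
    """Turn sentences that mention both members of a seed pair into
    relation templates by masking the pair with [X] and [Y]."""
    templates = []
    for sentence in corpus_sentences:
        for head, tail in seed_pairs:
            if re.search(rf"\b{re.escape(head)}\b", sentence) and \
               re.search(rf"\b{re.escape(tail)}\b", sentence):
                template = re.sub(rf"\b{re.escape(head)}\b", "[X]", sentence)
                template = re.sub(rf"\b{re.escape(tail)}\b", "[Y]", template)
                templates.append(template)
    return templates

# Toy corpus and a single seed pair for a capital-of relation:
corpus = ["Paris is the capital of France.", "The weather in Paris was mild."]
seeds = [("Paris", "France")]
print(extract_templates(corpus, seeds))  # only the first sentence qualifies
```

A pre-trained model would then be fine-tuned to predict, for a new word pair substituted into such a template, whether the pair is likely to be an instance of the relation.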


BMJ Open ◽  
2018 ◽  
Vol 8 (9) ◽  
pp. e025216 ◽  
Author(s):  
Karen Isabel Birnie ◽  
Robert Stewart ◽  
Anna Kolliakou

Objectives: Hallucinations are present in many conditions, notably psychosis. Although under-researched, atypical hallucinations, such as tactile, olfactory and gustatory hallucinations (TOGHs), may arise secondary to hypnotic drug use, particularly non-benzodiazepine hypnotics (‘Z drugs’). This retrospective case-control study investigated the frequency of TOGHs and their associations with prior Z drug use in a large mental healthcare database.

Methods: TOGHs were ascertained in 2014 using a bespoke natural language processing algorithm and were analysed against covariates (including use of Z drugs, demographic factors, diagnosis, disorder severity and other psychotropic medications) ascertained prior to 2014.

Results: In 43 339 patients with International Classification of Diseases, Tenth Edition schizophreniform or affective disorder diagnoses, 324 (0.75%) had any TOGH recorded (0.54% tactile, 0.24% olfactory, 0.06% gustatory hallucinations). TOGHs were associated with male gender, black ethnicity, schizophreniform diagnosis and higher disorder severity on the Health of the Nation Outcome Scales. In fully adjusted models, tactile and olfactory hallucinations remained independently associated with prior mention of Z drugs (ORs 1.86 and 1.60, respectively).

Conclusions: We successfully developed a natural language processing algorithm to identify instances of TOGHs in the clinical record. TOGHs overall, tactile and olfactory hallucinations were shown to be associated with prior mention of Z drugs. This may have implications for the diagnosis and treatment of patients with comorbid sleep and psychiatric conditions.
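For orientation, the unadjusted odds ratio behind such a case-control comparison is computed from a 2×2 exposure table. The counts below are invented; the ORs the study reports come from adjusted regression models, not this simple formula.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted odds ratio for a case-control 2x2 table:
    (exposed cases / unexposed cases) / (exposed controls / unexposed controls)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Toy counts, not the study's data: 30 of 300 cases and 1,000 of 28,000
# controls had prior mention of a Z drug.
print(odds_ratio(30, 270, 1000, 27000))  # prints 3.0
```

Fully adjusted models additionally condition on the covariates the authors list (demographics, diagnosis, severity, other psychotropics), which is why adjusted and unadjusted ORs generally differ.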


Author(s):  
S. Kavibharathi ◽  
S. Lakshmi Priyankaa ◽  
M.S. Kaviya ◽  
Dr.S. Vasanthi

The World Wide Web, including social networking sites, blogs and comment forums, holds a huge volume of user-generated emotion data arising from social events, product brands and political arguments. These comments reflect users' moods and have a substantial impact on readers, product suppliers and politicians. A key challenge for credible analysis is the lack of sufficient labelled data in the Natural Language Processing (NLP) field. Classifying content as positive or negative based on user feedback and live chat serves as the basis for a wide range of tasks concerned with the meaningful assessment of text content. Recurrent Neural Networks (RNNs) perform very well at text classification and at analysing unstructured social media data. In the proposed method, live chat messages are fed into an RNN-based sentiment analysis pipeline to predict their sentiment. Sentiment analysis and deep learning techniques are integrated to address this problem, since deep learning models learn features automatically. The proposed RNN-based sentiment classifier is applied to a range of text analysis and visualization problems, including retrospective analysis of product reviews.
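The computation an RNN sentiment classifier performs can be made concrete with a minimal, untrained Elman network in plain Python. The weights below are random placeholders; a real system would learn them from labelled data, and would typically use a deep learning framework rather than hand-rolled loops.

```python
import math
import random

# Tiny fixed sizes for the sketch: 4-dimensional token embeddings,
# 3 hidden units, one sigmoid sentiment output.
random.seed(0)
EMB, HID = 4, 3

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

W_xh, W_hh = rand_matrix(HID, EMB), rand_matrix(HID, HID)
w_out = [random.uniform(-0.5, 0.5) for _ in range(HID)]

def rnn_sentiment(token_embeddings):
    """Return a sentiment score in (0, 1) for a sequence of embedding vectors."""
    h = [0.0] * HID
    for x in token_embeddings:
        # h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1})
        h = [math.tanh(sum(W_xh[i][j] * x[j] for j in range(EMB)) +
                       sum(W_hh[i][j] * h[j] for j in range(HID)))
             for i in range(HID)]
    logit = sum(w_out[i] * h[i] for i in range(HID))
    return 1.0 / (1.0 + math.exp(-logit))
```

Training would adjust `W_xh`, `W_hh` and `w_out` by backpropagation through time against labelled positive/negative examples.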


2021 ◽  
Author(s):  
Edgar Bernier ◽  
Sebastien Perrier

Abstract Maximizing operational efficiency is a critical challenge in oil and gas production, and is particularly important for mature assets in the North Sea. The causes of production shortfalls are numerous and span a wide range of disciplines and technical and non-technical factors. The primary reason to apply Natural Language Processing (NLP) and text mining to several years of shortfall history was the need to efficiently support the evaluation of digital transformation use-case screenings and value-mapping exercises through a proper mapping of the issues faced. Naturally, this mapping also informed operational surveillance and maintenance strategies aimed at reducing production shortfalls. This paper presents a methodology in which the historical records of descriptions, comments and investigation results regarding production shortfalls are revisited, adding to existing shortfall classifications and statistics in two domains in particular: richer first root-cause mapping, and a series of advanced visualizations and analytics. The methodology combines natural-language pre-processing techniques with keyword-based text-mining and classification techniques. The limitations associated with the size and quality of these language datasets are described and the results discussed, highlighting the value of reaching a high level of data granularity while defeating the ‘more information, less attention’ bias. At the same time, visual designs are introduced to display efficiently the different dimensions of the data (impact, frequency evolution through time, location in terms of field and affected systems, root causes and other cause-related categories).

The ambition in the domain of visualization is to create user-experience-friendly shortfall analytics that can be displayed in smart rooms and collaborative rooms, where display efficiency is higher when user interactions are kept minimal, the number of charts is limited, and multiple dimensions do not collide. The paper is based on several applications across the North Sea. This case study, and the associated lessons learned about natural language processing and text mining applied to similarly concise technical data, answers several frequently asked questions about the value of the textual data records gathered over years.
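A keyword-based classification step of the kind combined here with NLP pre-processing can be sketched as follows. The categories and keywords are invented for illustration; a real deployment would derive them from the asset's own shortfall taxonomy.

```python
# Hypothetical root-cause categories and their trigger keywords.
ROOT_CAUSE_KEYWORDS = {
    "compressor": ["compressor", "recycle valve", "surge"],
    "well": ["choke", "scale", "wellhead"],
    "export": ["export", "pipeline", "nomination"],
}

def classify_shortfall(record: str) -> set:
    """Return the candidate root-cause categories for a free-text shortfall record."""
    text = record.lower()
    return {cause for cause, words in ROOT_CAUSE_KEYWORDS.items()
            if any(word in text for word in words)}
```

Counting the categories assigned across years of records yields the frequency and impact breakdowns that feed the visualizations described above.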


Author(s):  
Danielle S. McNamara ◽  
Arthur C. Graesser

Coh-Metrix provides indices for the characteristics of texts on multiple levels of analysis, including word characteristics, sentence characteristics, and the discourse relationships between ideas in text. Coh-Metrix was developed to provide a wide range of indices within one tool. This chapter describes Coh-Metrix and studies that have been conducted validating the Coh-Metrix indices. Coh-Metrix can be used to better understand differences between texts and to explore the extent to which linguistic and discourse features successfully distinguish between text types. Coh-Metrix can also be used to develop and improve natural language processing approaches. We also describe the Coh-Metrix Text Easability Component Scores, which provide a picture of text ease (and hence potential challenges). The Text Easability components provided by Coh-Metrix go beyond traditional readability measures by providing metrics of text characteristics on multiple levels of language and discourse.
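In the spirit of the simplest Coh-Metrix descriptive measures, two toy text indices (mean sentence length and type-token ratio) can be computed as below. Coh-Metrix itself provides a great many far richer indices spanning word, sentence and discourse levels; this sketch only illustrates the idea of text-level metrics.

```python
import re

def text_indices(text: str) -> dict:
    """Compute two toy text-level indices for a passage of English prose."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    return {
        # Average number of words per sentence.
        "mean_sentence_length": len(words) / len(sentences),
        # Lexical diversity: distinct words over total words.
        "type_token_ratio": len(set(words)) / len(words),
    }
```

Indices of this general shape, computed at multiple levels of language and discourse, are what allow comparisons of text ease and text type beyond traditional readability formulas.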


Author(s):  
Divaakar Siva Baala Sundaram ◽  
Shivaram P. Arunachalam ◽  
Devanshi N. Damani ◽  
Nasibeh Z. Farahani ◽  
Moein Enayati ◽  
...  

Abstract Hypertrophic Cardiomyopathy (HCM) is the most common genetic heart disease in the US and is known to cause sudden cardiac death (SCD) in young adults. While significant advancements have been made in HCM diagnosis and management, there is a need to identify HCM cases from electronic health record (EHR) data and to develop automated tools, based on natural language processing (NLP)-guided machine learning (ML) models, for accurate HCM case identification to improve management and reduce adverse outcomes for HCM patients. Cardiac Magnetic Resonance (CMR) imaging plays a significant role in HCM diagnosis and risk stratification. CMR reports, generated by clinician annotation, offer rich data in the form of cardiac measurements as well as narratives describing interpretation and phenotypic description. The purpose of this study was to develop an NLP-based interpretable model utilizing impressions extracted from CMR reports to automatically identify HCM patients. CMR reports of patients with a suspected HCM diagnosis between 1995 and 2019 were used in this study. Patients were classified into three categories: yes HCM, no HCM, and possible HCM. A random forest (RF) model was developed to evaluate the performance of both CMR measurements and impression features in identifying HCM patients. The RF model yielded an accuracy of 86% (608 features) and 85% (30 features). These results offer promise for the accurate identification of HCM patients from CMR reports in the EHR, supporting efficient clinical management and transforming health care delivery for these patients.
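One plausible shape for the impression features (not the authors' actual pipeline) is a bag-of-words count vector over a clinical vocabulary, which a random forest can then consume alongside numeric CMR measurements. The vocabulary and example impression below are invented.

```python
import re
from collections import Counter

def impression_features(impression: str, vocabulary: list) -> list:
    """Count occurrences of each vocabulary word in a CMR impression."""
    tokens = Counter(re.findall(r"[a-z]+", impression.lower()))
    return [tokens[word] for word in vocabulary]

# Hypothetical clinical vocabulary and impression text:
vocab = ["hypertrophy", "septal", "asymmetric", "normal"]
row = impression_features("Asymmetric septal hypertrophy consistent with HCM.", vocab)
print(row)  # prints [1, 1, 1, 0]
# Such rows, labelled yes/no/possible HCM, would then train a random forest
# classifier (e.g. sklearn.ensemble.RandomForestClassifier).
```

Feature counts like the study's 608 versus 30 would correspond to larger or pruned vocabularies plus the structured CMR measurements.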

