Quoted text in the mental healthcare electronic record: an analysis of the distribution and content of single-word quotations

Objective To investigate the distribution and content of quoted text within electronic health records (EHRs), using a previously developed natural language processing tool to generate a database of quotations. Design χ² and logistic regression were used to assess the profile of patients receiving mental healthcare for whom quotations exist. K-means clustering using pre-trained word embeddings (developed on general discharge summaries and on psychosis-specific mental health records) was used to group one-word quotations into semantically similar groups, which were then labelled by subjective human judgement. Setting EHRs from a large mental healthcare provider serving a geographic catchment area of 1.3 million residents in South London. Participants For analysis of distribution, 33 499 individuals receiving mental healthcare on 30 June 2019 in South London and Maudsley. For analysis of content, 1587 unique lemmatised words, each appearing a minimum of 20 times in the database of quotations created on 16 January 2020. Results The strongest individual indicator of quoted text is inpatient care in the preceding 12 months (OR 9.79, 95% CI 7.84 to 12.23). The next highest indicator is ethnicity, with those from a black background more likely to have quoted text than those from a white background (OR 2.20, 95% CI 2.08 to 2.33). Both are attenuated slightly in the adjusted model. Early psychosis intervention word embeddings subjectively produced categories pertaining to: mental illness, verbs, negative sentiment, people/relationships, mixed sentiment, aggression/violence and negative connotation. Conclusions The findings that inpatients and those from a black ethnic background more commonly have quoted text raise important questions around where clinical attention is focused and whether this may point to any systematic bias. Our study also shows that word embeddings trained on early psychosis intervention records are useful in categorising even small subsets of the clinical records represented by one-word quotations.
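
The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the two-dimensional vectors and the four example words below are hypothetical stand-ins for real pre-trained embeddings, and `kmeans` is a plain textbook implementation written for this illustration.

```python
import math
import random

def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means: returns a cluster index for each input vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)  # start from k distinct input points
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments

# Hypothetical toy embeddings; real ones would have hundreds of dimensions.
embeddings = {
    "angry": [0.9, 0.1],
    "violent": [0.8, 0.2],
    "mother": [0.1, 0.9],
    "brother": [0.2, 0.8],
}
words = list(embeddings)
labels = kmeans([embeddings[w] for w in words], k=2)
for word, label in zip(words, labels):
    print(word, label)
```

The cluster indices carry no meaning by themselves; as in the abstract, a human reader would still have to inspect each group and give it a name such as "aggression/violence" or "people/relationships".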

An evaluation of symptom domains in the 2 years before pregnancy as predictors of relapse in the perinatal period in women with severe mental illness

European Psychiatry ◽

10.1192/j.eurpsy.2021.18 ◽

2021 ◽

Vol 64 (1) ◽

Author(s):

Sharvari Khapre ◽

Robert Stewart ◽

Clare Taylor

Keyword(s):

Mental Illness ◽

Severe Mental Illness ◽

Language Processing ◽

Mental Healthcare ◽

Positive Association ◽

Perinatal Period ◽

Health Records ◽

Increased Risk ◽

Pregnancy And Postpartum ◽

Symptom Domains

Abstract Background Symptoms may be more useful prognostic markers for mental illness than diagnoses. We sought to investigate symptom domains in women with pre-existing severe mental illness (SMI; psychotic and bipolar disorder) as predictors of relapse risk during the perinatal period. Methods Data were obtained from electronic health records of 399 pregnant women with SMI diagnoses from a large south London mental healthcare provider. Symptoms within six domains characteristically associated with SMI (positive, negative, disorganization, mania, depression, and catatonia) recorded in clinical notes 2 years before pregnancy were identified with natural language processing algorithms to extract data from text, and associations investigated with hospitalization during pregnancy and 3 months postpartum. Results Seventy-six women (19%) relapsed during pregnancy and 107 (27%) relapsed postpartum. After adjusting for covariates, disorganization symptoms showed a positive association at borderline significance with relapse during pregnancy (adjusted odds ratio [aOR] = 1.36; 95% confidence interval [CI] = 0.99–1.87 per unit increase in number of symptoms) and depressive symptoms negatively with relapse postpartum (0.78; 0.62–0.98). Restricting the sample to women with at least one recorded symptom in any given domain, higher disorganization (1.84; 1.22–2.76), positive (1.50; 1.07–2.11), and manic (1.48; 1.03–2.11) symptoms were associated with relapse during pregnancy, and disorganization (1.54; 1.08–2.20) symptom domains were associated with relapse postpartum. Conclusions Positive, disorganization, and manic symptoms recorded in the 2 years before pregnancy were associated with increased risk of relapse during pregnancy and postpartum. The characterization of routine health records from text fields is relatively transferrable and could help inform predictive risk modelling.

Natural language processing tool for automatic diseases and drugs recognition from electronic health records in polish- pilot study

European Heart Journal ◽

10.1093/eurheartj/ehab724.3169 ◽

2021 ◽

Vol 42 (Supplement_1) ◽

Author(s):

C M Maciejewski ◽

M K Krajsman ◽

K O Ozieranski ◽

M B Basza ◽

M G Gawalko ◽

...

Keyword(s):

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Structured Data ◽

Health Records ◽

Funding Sources ◽

Natural Language Processing Tool ◽

Electronic Health ◽

Automatic Tool

Abstract Background An estimated 80% of data gathered in electronic health records is unstructured, textual information that cannot be utilized for research purposes until it is manually coded into a database. Manual coding is both a costly and a time-consuming process. Natural language processing (NLP) techniques may be utilized for extraction of structured data from text. However, little is known about the accuracy of data obtained through these methods. Purpose To evaluate the possibility of employing NLP techniques to obtain data on the risk factors needed for CHA2DS2VASc scale calculation and to detect antithrombotic medication prescribed in a population of atrial fibrillation (AF) patients of a cardiology ward. Methods An automatic tool for disease and drug recognition based on regular expression rules was designed through cooperation of physicians and IT specialists. Records of 194 AF patients discharged from a cardiology ward were manually reviewed by a physician-annotator as a comparator for the automatic approach. Results Median CHA2DS2VASc score calculated by the automatic tool was 3 points (IQR 2–4) versus 3 points (IQR 2–4) for the manual method (p=0.66). High agreement between CHA2DS2VASc scores calculated by both methods was present (Kendall's W=0.979; p<0.001). In terms of anticoagulant recognition, the automatic tool misclassified the drug prescribed in 4 cases. Conclusion NLP-based techniques are promising tools for obtaining structured data for research purposes from electronic health records in Polish. Tight cooperation of physicians and IT specialists is crucial for establishing accurate recognition patterns. Funding Acknowledgement Type of funding sources: None.
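
To make the rule-based approach concrete, here is a minimal sketch of regular-expression risk-factor detection feeding a CHA2DS2-VASc calculation. The patterns are hypothetical English examples (the tool in the abstract targeted Polish records and its actual rules are not given there), age and sex are assumed to come from structured fields, and negation such as "no history of diabetes" is deliberately not handled.

```python
import re

# Hypothetical English patterns, for illustration only.
RISK_PATTERNS = {
    "chf": re.compile(r"\bheart failure\b", re.I),
    "hypertension": re.compile(r"\bhypertension\b", re.I),
    "diabetes": re.compile(r"\bdiabetes\b", re.I),
    "stroke_tia": re.compile(r"\b(stroke|TIA|transient ischa?emic attack)\b", re.I),
    "vascular": re.compile(r"\b(myocardial infarction|peripheral artery disease)\b", re.I),
}

# Points per risk factor in the standard CHA2DS2-VASc score.
POINTS = {"chf": 1, "hypertension": 1, "diabetes": 1, "stroke_tia": 2, "vascular": 1}

def detect_risk_factors(text):
    """Return the set of risk factors whose pattern matches the note text."""
    return {name for name, pattern in RISK_PATTERNS.items() if pattern.search(text)}

def cha2ds2_vasc(text, age, female):
    """Score from free-text risk factors plus structured age and sex."""
    score = sum(POINTS[name] for name in detect_risk_factors(text))
    if age >= 75:
        score += 2
    elif age >= 65:
        score += 1
    if female:
        score += 1
    return score

note = "History of hypertension and diabetes. Prior stroke in 2015."
print(cha2ds2_vasc(note, age=70, female=True))
```

A production rule set would need negation and context handling, which is one reason the abstract stresses cooperation between physicians and IT specialists when writing the recognition patterns.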

Applying a natural language processing tool to electronic health records to assess performance on colonoscopy quality measures

Gastrointestinal Endoscopy ◽

10.1016/j.gie.2012.01.045 ◽

2012 ◽

Vol 75 (6) ◽

pp. 1233-1239.e14 ◽

Cited By ~ 46

Author(s):

Ateev Mehrotra ◽

Evan S. Dellon ◽

Robert E. Schoen ◽

Melissa Saul ◽

Faraz Bishehsari ◽

...

Keyword(s):

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Quality Measures ◽

Health Records ◽

Natural Language Processing Tool ◽

Electronic Health ◽

Colonoscopy Quality

Natural language processing of lifestyle modification documentation

Health Informatics Journal ◽

10.1177/1460458218824742 ◽

2019 ◽

Vol 26 (1) ◽

pp. 388-405 ◽

Cited By ~ 1

Author(s):

Kimberly Shoenbill ◽

Yiqiang Song ◽

Lisa Gress ◽

Heather Johnson ◽

Maureen Smith ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Lifestyle Modification ◽

Care Delivery ◽

First Line ◽

Health Records ◽

Obesity And Diabetes ◽

Natural Language Processing Tool ◽

Electronic Health

Lifestyle modification, including diet, exercise, and tobacco cessation, is the first-line treatment of many disorders including hypertension, obesity, and diabetes. Lifestyle modification data are not easily retrieved or used in research due to their textual nature. This study addresses this knowledge gap using natural language processing to automatically identify lifestyle modification documentation from electronic health records. Electronic health record notes from hypertension patients were analyzed using an open-source natural language processing tool to retrieve assessment and advice regarding lifestyle modification. These data were classified as lifestyle modification assessment or advice and mapped to a coded standard ontology. Combined lifestyle modification (advice and assessment) recall was 99.27 percent, precision 94.44 percent, and correct classification 88.15 percent. Through extraction and transformation of narrative lifestyle modification data to coded data, this critical information can be used in research, metric development, and quality improvement efforts regarding care delivery for multiple medical conditions that benefit from lifestyle modification.
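
For readers less familiar with the metrics reported here, the sketch below shows how recall and precision are computed from counts of true positives, false positives and false negatives. The counts are hypothetical and are not taken from the study.

```python
def precision(tp, fp):
    """Share of retrieved items that are correct."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of relevant items that were retrieved."""
    return tp / (tp + fn)

# Hypothetical counts, for illustration only.
tp, fp, fn = 90, 10, 30
print(f"precision = {precision(tp, fp):.2%}, recall = {recall(tp, fn):.2%}")
```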

Colonoscopy quality, quality measures, and a natural language processing tool for electronic health records

Gastrointestinal Endoscopy ◽

10.1016/j.gie.2012.02.031 ◽

2012 ◽

Vol 75 (6) ◽

pp. 1240-1242 ◽

Cited By ~ 5

Author(s):

John C. Deutsch

Keyword(s):

Natural Language Processing ◽

Electronic Health Records ◽

Natural Language ◽

Language Processing ◽

Quality Measures ◽

Health Records ◽

Natural Language Processing Tool ◽

Electronic Health ◽

Colonoscopy Quality

A natural language processing approach for identifying temporal disease onset information from mental healthcare text

Scientific Reports ◽

10.1038/s41598-020-80457-0 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Natalia Viani ◽

Riley Botelle ◽

Jack Kerwin ◽

Lucia Yin ◽

Rashmi Patel ◽

...

Keyword(s):

Mental Health ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Disease Onset ◽

Mental Healthcare ◽

Onset Date ◽

Health Records ◽

Appropriate Treatment ◽

Ranked List

Abstract Receiving timely and appropriate treatment is crucial for better health outcomes, and research on the contribution of specific variables is essential. In the mental health domain, an important research variable is the date of psychosis symptom onset, as longer delays in treatment are associated with worse intervention outcomes. The growing adoption of electronic health records (EHRs) within mental health services provides an invaluable opportunity to study this problem at scale retrospectively. However, disease onset information is often only available in open text fields, requiring natural language processing (NLP) techniques for automated analyses. Since this variable can be documented at different points during a patient’s care, NLP methods that model clinical and temporal associations are needed. We address the identification of psychosis onset by: 1) manually annotating a corpus of mental health EHRs with disease onset mentions, 2) modelling the underlying NLP problem as a paragraph classification approach, and 3) combining multiple onset paragraphs at the patient level to generate a ranked list of likely disease onset dates. For 22/31 test patients (71%) the correct onset date was found among the top-3 NLP predictions. The proposed approach was also applied at scale, allowing an onset date to be estimated for 2483 patients.
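
Step 3 and the top-3 evaluation can be sketched as follows. The abstract does not say how paragraph-level predictions are combined, so summing classifier scores per candidate date is an assumption made purely for illustration, as are the function names and the example dates and scores.

```python
from collections import defaultdict

def rank_onset_dates(paragraph_predictions):
    """Combine (candidate_date, score) pairs for one patient into a ranked list.

    Assumption: scores for the same date are summed across paragraphs.
    """
    totals = defaultdict(float)
    for date, score in paragraph_predictions:
        totals[date] += score
    # Highest total first; ties are broken by the earlier date.
    return sorted(totals, key=lambda d: (-totals[d], d))

def top_k_hit(ranked, true_date, k=3):
    """True if the annotated onset date is among the first k predictions."""
    return true_date in ranked[:k]

# Hypothetical paragraph-level predictions for a single patient.
predictions = [("2010-05", 0.6), ("2012-01", 0.9), ("2010-05", 0.7), ("2015-03", 0.2)]
ranked = rank_onset_dates(predictions)
print(ranked)
print(top_k_hit(ranked, "2012-01"))
```

Averaging `top_k_hit` over all test patients gives the kind of figure reported in the abstract, where 22 of 31 patients corresponds to roughly 71%.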

Learning emotional word embeddings for sentiment analysis

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201993 ◽

2021 ◽

pp. 1-13

Author(s):

Qingtian Zeng ◽

Xishi Zhao ◽

Xiaohui Hu ◽

Hua Duan ◽

Zhongying Zhao ◽

...

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

State Of The Art ◽

Research Problem ◽

Emotional Word ◽

Classification Model ◽

Data Sets ◽

Word Embeddings ◽

Real World Data ◽

Text Documents

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis. This method first applies pre-trained word vectors to represent document features using two different linear weighting methods. Then, the resulting document vectors are input to a classification model and used to train a text sentiment classifier, which is based on a neural network. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
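
The first step, building a document vector as a linear weighting of pre-trained word vectors, might look like the sketch below. The abstract does not name its two weighting schemes, so the uniform average and the per-word weights shown here are assumptions, and the tiny vocabulary is hypothetical.

```python
def document_vector(tokens, word_vectors, weights=None):
    """Weighted linear combination of word vectors, normalised by total weight.

    Tokens without a pre-trained vector are skipped. Weights are assumed to be
    positive; a token missing from `weights` defaults to weight 1.0.
    """
    known = [t for t in tokens if t in word_vectors]
    if not known:
        raise ValueError("no token has a pre-trained vector")
    dim = len(word_vectors[known[0]])
    doc = [0.0] * dim
    total_weight = 0.0
    for token in known:
        w = 1.0 if weights is None else weights.get(token, 1.0)
        total_weight += w
        for i, x in enumerate(word_vectors[token]):
            doc[i] += w * x
    return [x / total_weight for x in doc]

# Hypothetical two-dimensional vectors, for illustration only.
vectors = {"good": [1.0, 0.0], "film": [0.0, 1.0]}
tokens = ["good", "film", "unseen"]
uniform = document_vector(tokens, vectors)
weighted = document_vector(tokens, vectors, weights={"good": 3.0, "film": 1.0})
print(uniform, weighted)
```

In the model described above, vectors like these would then be passed to a neural sentiment classifier, and training that classifier is what propagates emotional polarity back into the word vectors.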

Machine learning with persistent homology and chemical word embeddings improves prediction accuracy and interpretability in metal-organic frameworks

Scientific Reports ◽

10.1038/s41598-021-88027-8 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Aditi S. Krishnapriyan ◽

Joseph Montoya ◽

Maciej Haranczyk ◽

Jens Hummelshøj ◽

Dmitriy Morozov

Keyword(s):

Machine Learning ◽

Language Processing ◽

Metal Organic Framework ◽

Persistent Homology ◽

Level Structure ◽

Chemical Information ◽

Material System ◽

Word Embeddings ◽

Structure Property ◽

Metal Organic

Abstract Machine learning has emerged as a powerful approach in materials discovery. Its major challenge is selecting features that create interpretable representations of materials, useful across multiple prediction tasks. We introduce an end-to-end machine learning model that automatically generates descriptors that capture a complex representation of a material’s structure and chemistry. This approach builds on computational topology techniques (namely, persistent homology) and word embeddings from natural language processing. It automatically encapsulates geometric and chemical information directly from the material system. We demonstrate our approach on multiple nanoporous metal–organic framework datasets by predicting methane and carbon dioxide adsorption across different conditions. Our results show considerable improvement in both accuracy and transferability across targets compared to models constructed from the commonly-used, manually-curated features, consistently achieving an average 25–30% decrease in root-mean-squared-deviation and an average increase of 40–50% in R² scores. A key advantage of our approach is interpretability: Our model identifies the pores that correlate best to adsorption at different pressures, which contributes to understanding atomic-level structure–property relationships for materials design.

Interoperability frameworks linking mHealth applications to electronic record systems

BMC Health Services Research ◽

10.1186/s12913-021-06473-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Kagiso Ndlovu ◽

Maurice Mars ◽

Richard E. Scott

Keyword(s):

Primary Healthcare ◽

Developing World ◽

Healthcare Delivery ◽

Implementation Strategies ◽

Electronic Record ◽

Digital Information ◽

Patient Centered ◽

Health Records ◽

Healthcare Needs ◽

The Impact

Abstract Background mHealth presents innovative approaches to enhance primary healthcare delivery in developing countries like Botswana. The impact of mHealth solutions can be improved if they are interoperable with eRecord systems such as electronic health records, electronic medical records and patient health records. eHealth interoperability frameworks exist but their availability and utility for linking mHealth solutions to eRecords in developing world settings like Botswana is unknown. The recently adopted eHealth Strategy for Botswana recognises interoperability as an issue and mHealth as a potential solution for some healthcare needs, but does not address linking the two. Aim This study reviewed published reviews of eHealth interoperability frameworks for linking mHealth solutions with eRecords, and assessed their relevance to informing interoperability efforts with respect to Botswana’s eHealth Strategy. Methods A structured literature review and analysis of published reviews of eHealth interoperability frameworks was performed to determine if any are relevant to linking mHealth with eRecords. The Botswanan eHealth Strategy was reviewed. Results Four articles presented and reviewed eHealth interoperability frameworks that support linking of mHealth interventions to eRecords and associated implementation strategies. While the frameworks were developed for specific circumstances and therefore were based upon varying assumptions and perspectives, they entailed aspects that are relevant and could be drawn upon when developing an mHealth interoperability framework for Botswana. Common emerging themes of infrastructure, interoperability standards, data security and usability were identified and discussed; all of which are important in the developing world context such as in Botswana. The Botswana eHealth Strategy recognises interoperability, mHealth, and eRecords as distinct issues, but not linking of mHealth solutions with eRecords. 
Conclusions Delivery of healthcare is shifting from hospital-based to patient-centered primary healthcare and community-based settings, using mHealth interventions. The impact of mHealth solutions can be improved if data generated from them are converted into digital information ready for transmission and incorporation into eRecord systems. The Botswana eHealth Strategy stresses the need to have interoperable eRecords, but mHealth solutions must not be left out. Literature insight about mHealth interoperability with eRecords can inform implementation strategies for Botswana and elsewhere.
