clinical narrative Latest Research Papers

Using topic modelling for unsupervised annotation of electronic health records to identify an outbreak of disease in UK dogs

PLoS ONE ◽

10.1371/journal.pone.0260402 ◽

2021 ◽

Vol 16 (12) ◽

pp. e0260402

Author(s):

Peter-John Mäntylä Noble ◽

Charlotte Appleton ◽

Alan David Radford ◽

Goran Nenadic

Keyword(s):

Electronic Health Records ◽

Time Course ◽

Latent Dirichlet Allocation ◽

Disease Outbreak ◽

Free Text ◽

Topic Modelling ◽

Sentinel Network ◽

Health Records ◽

Clinical Narrative ◽

Electronic Health

A key goal of disease surveillance is to identify outbreaks of known or novel diseases in a timely manner. Such an outbreak occurred in the UK associated with acute vomiting in dogs between December 2019 and March 2020. We tracked this outbreak using the clinical free text component of anonymised electronic health records (EHRs) collected from a sentinel network of participating veterinary practices. We sourced the free text (narrative) component of each EHR supplemented with one of 10 practitioner-derived main presenting complaints (MPCs), with the ‘gastroenteric’ MPC identifying cases involved in the disease outbreak. Such clinician-derived annotation systems can suffer from poor compliance requiring retrospective, often manual, coding, thereby limiting real-time usability, especially where an outbreak of a novel disease might not present clinically as a currently recognised syndrome or MPC. Here, we investigate the use of an unsupervised method of EHR annotation using latent Dirichlet allocation topic-modelling to identify topics inherent within the clinical narrative component of EHRs. The model comprised 30 topics which were used to annotate EHRs spanning the natural disease outbreak and investigate whether any given topic might mirror the outbreak time-course. Narratives were annotated using the Gensim Library LdaModel module for the topic best representing the text within them. Counts for narratives labelled with one of the topics significantly matched the disease outbreak based on the practitioner-derived ‘gastroenteric’ MPC (Spearman correlation 0.978); no other topics showed a similar time course. Using artificially injected outbreaks, it was possible to see other topics that would match other MPCs including respiratory disease. The underlying topics were readily evaluated using simple word-cloud representations and using a freely available package (LDAVis) providing rapid insight into the clinical basis of each topic. 
This work clearly shows that unsupervised record annotation using topic modelling linked to simple text visualisations can provide an easily interrogable method to identify and characterise outbreaks and other anomalies of known and previously uncharacterised diseases based on changes in clinical narratives.
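The comparison between a topic's time course and the practitioner-derived MPC time course rests on a Spearman rank correlation. The following is a minimal sketch of that calculation using only the Python standard library; it is not the authors' code, and the weekly counts are invented purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical weekly counts (assumption, not data from the paper):
# narratives labelled with one topic vs. records with the 'gastroenteric' MPC.
topic_counts = [12, 15, 30, 55, 80, 62, 33, 18]
mpc_counts = [20, 22, 41, 70, 95, 78, 45, 30]

rho = spearman(topic_counts, mpc_counts)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on rank order, a topic whose weekly counts rise and fall in step with the MPC counts scores close to 1 even when the absolute numbers differ.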

Automated Modeling of Clinical Narrative with High Definition Natural Language Processing Using Solor and Analysis Normal Form

10.3233/shti210822 ◽

2021 ◽

Author(s):

Melissa P. Resnick ◽

Frank LeHouillier ◽

Steven H. Brown ◽

Keith E. Campbell ◽

Diane Montella ◽

...

Keyword(s):

Natural Language Processing ◽

Decision Support ◽

Normal Form ◽

Natural Language ◽

Language Processing ◽

Gold Standard ◽

Snomed Ct ◽

High Definition ◽

Automated Modeling ◽

Clinical Narrative

Objective: One important concept in informatics is data that meet the principles of Findability, Accessibility, Interoperability and Reusability (FAIR). Standards, such as terminologies (findability), assist with important tasks like interoperability, Natural Language Processing (NLP) (accessibility) and decision support (reusability). One terminology, Solor, integrates SNOMED CT, LOINC and RxNorm. We describe Solor, HL7 Analysis Normal Form (ANF), and their use with the high definition natural language processing (HD-NLP) program. Methods: We used HD-NLP to process 694 clinical narratives previously modeled by human experts into Solor and ANF. We compared HD-NLP output to the expert gold standard for 20% of the sample. Each clinical statement was judged “correct” if HD-NLP output matched ANF structure and Solor concepts, or “incorrect” if any ANF structure or Solor concepts were missing or incorrect. Judgements were summed to give totals for “correct” and “incorrect”. Results: 113 (80.7%) correct, 26 (18.6%) incorrect, and 1 error. Inter-rater reliability was 97.5% with Cohen’s kappa of 0.948. Conclusion: The HD-NLP software provides usable complex standards-based representations for important clinical statements designed to drive clinical decision support (CDS).
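The results combine simple proportions with Cohen's kappa for inter-rater agreement. The sketch below shows how both can be computed. The reported percentages are consistent with a denominator of 140 (113 + 26 + 1). The two rating lists are hypothetical and do not come from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Percentages from the abstract, recomputed over 113 + 26 + 1 = 140 statements.
total = 113 + 26 + 1
pct_correct = round(100 * 113 / total, 1)
pct_incorrect = round(100 * 26 / total, 1)

# Hypothetical judgements from two raters (assumption, for illustration only).
rater_a = ["correct"] * 6 + ["incorrect"] * 4
rater_b = ["correct"] * 5 + ["incorrect"] * 5
kappa = cohens_kappa(rater_a, rater_b)
print(pct_correct, pct_incorrect, round(kappa, 3))
```

Kappa discounts the agreement two raters would reach by chance, which is why it is typically lower than the raw agreement percentage reported alongside it.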

693. Performance of ICD Code Versus Discharge Summary based Query for Endocarditis Cohort Identification

Open Forum Infectious Diseases ◽

10.1093/ofid/ofab466.890 ◽

2021 ◽

Vol 8 (Supplement_1) ◽

pp. S448-S448

Author(s):

H Nina Kim ◽

Ayushi Gupta ◽

Kristine F Lan ◽

Jenell C Stewart ◽

Shireesha Dhanireddy ◽

...

Keyword(s):

Infective Endocarditis ◽

International Classification Of Diseases ◽

Discharge Summary ◽

Admission Diagnosis ◽

Test Characteristics ◽

Clinical Narrative ◽

Classification Of Diseases ◽

Icd 10 ◽

Icd Codes ◽

Discharge Summaries

Background: Studies on infective endocarditis (IE) have relied on International Classification of Diseases (ICD) codes to identify cases, but few have validated this method, which may be prone to misclassification. Examination of clinical narrative data could offer greater accuracy and richness. Methods: We evaluated two algorithms for IE identification from 7/1/2015 to 7/31/2019: (1) a standard query of ICD codes for IE (ICD-9: 424.9, 424.91, 424.99, 421.0, 421.1, 421.9, 112.81, 036.42 and ICD-10: I38, I39, I33, I33.9, B37.6 and A39.51) with or without procedure codes for echocardiogram (93303-93356) and (2) a key word, pattern-based text query of discharge summaries (DS) that selected on the term “endocarditis” in fields headed by “Discharge Diagnosis” or “Admission Diagnosis” or similar. Further coding extracted the nature and type of valve and the organism responsible for the IE if present in DS. All identified cases were chart reviewed using pre-specified criteria for true IE. Positive predictive value (PPV) was calculated as the total number of verified cases over the algorithm-selected cases. Sensitivity was the total number of algorithm-matched cases over a final list of 166 independently identified true IE cases from ID and Cardiology services. Specificity was defined as 119 pre-adjudicated non-cases minus the number of algorithm-matched cases, divided by 119. Results: The ICD-based query identified 612 individuals from July 2015 to July 2019 who had a hospital billing code for infective endocarditis; of these, 534 also had an echocardiogram. The DS query identified 387 cases. PPV for the DS query was 84.5% (95% confidence interval [CI] 80.6%, 87.8%) compared with 72.4% (95% CI 68.7%, 75.8%) for ICD only and 75.8% (95% CI 72.0%, 79.3%) for ICD + echo queries. Sensitivity was 75.9% for the DS query and 86.8-93.4% for the ICD queries. Specificity was high (>94%) for all queries. The DS query also yielded valve data (prosthetic, tricuspid, pulmonic, aortic or mitral) in 60% and microbiologic data in 73% of identified cases, with an accuracy of 94% and 90% respectively when assessed by chart review. Table 1. Test Characteristics of Three Electronic Health Record Queries for Infective Endocarditis. Conclusion: Compared to traditional ICD-based queries, text-based queries of discharge summaries have the potential to improve precision of IE case ascertainment and extract key clinical variables. Disclosures: All Authors: No reported disclosures
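The test characteristics defined in the Methods reduce to a few ratios. The sketch below computes them and adds a Wilson score interval, which is one common way to obtain a 95% CI for a proportion; the abstract does not say which interval method was used. The counts 327 and 126 are back-calculated from the reported percentages and are assumptions, as is the count of 5 matched non-cases.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def ppv(verified_cases, selected_cases):
    """Verified true cases over all cases the algorithm selected."""
    return verified_cases / selected_cases


def sensitivity(matched_true_cases, all_true_cases):
    """Algorithm-matched true cases over the independent list of true cases."""
    return matched_true_cases / all_true_cases


def specificity(matched_non_cases, all_non_cases):
    """Non-cases the algorithm did not match, over all adjudicated non-cases."""
    return (all_non_cases - matched_non_cases) / all_non_cases


# DS query: 387 selected cases and 166 true cases are from the abstract.
# 327 and 126 are ASSUMED counts, back-calculated from 84.5% and 75.9%.
ds_ppv = ppv(327, 387)
ds_ci = wilson_interval(327, 387)
ds_sens = sensitivity(126, 166)
# 5 matched non-cases is a purely hypothetical value for illustration.
ds_spec = specificity(5, 119)

print(f"PPV {ds_ppv:.1%} (95% CI {ds_ci[0]:.1%}, {ds_ci[1]:.1%})")
print(f"Sensitivity {ds_sens:.1%}, specificity {ds_spec:.1%}")
```

Under these assumed counts the Wilson interval comes out close to the reported PPV interval for the DS query, which suggests the back-calculated numerator is plausible, though it remains an inference.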

Clinical Concept Extraction with Lexical Semantics to Support Automatic Annotation

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph182010564 ◽

2021 ◽

Vol 18 (20) ◽

pp. 10564

Author(s):

Asim Abbas ◽

Muhammad Afzal ◽

Jamil Hussain ◽

Taqdir Ali ◽

Hafiz Syed Muhammad Bilal ◽

...

Keyword(s):

Deep Learning ◽

Clinical Decision Support Systems ◽

Data Driven ◽

Semantic Features ◽

Automatic Annotation ◽

Learning Approaches ◽

Rule Based ◽

Concept Extraction ◽

Clinical Narrative ◽

Better Than

Extracting clinical concepts, such as problems, diagnoses, and treatments, from unstructured clinical narrative documents enables data-driven approaches such as machine and deep learning to support advanced applications such as clinical decision-support systems, the assessment of disease progression, and the intelligent analysis of treatment efficacy. Various tools such as cTAKES, Sophia, MetaMap, and other rule-based approaches and algorithms have been used for automatic concept extraction. Recently, machine- and deep-learning approaches have been used to extract, classify, and accurately annotate terms and phrases. However, the requirement of an annotated dataset, which is labor-intensive to produce, impedes the success of data-driven approaches. A rule-based mechanism could support the process of annotation, but existing rule-based approaches fail to adequately capture contextual, syntactic, and semantic patterns. This study introduces a comprehensive rule-based system that automatically extracts clinical concepts from unstructured narratives with higher accuracy and transparency. The proposed system is a pipelined approach, capable of recognizing clinical concepts of three types (problem, treatment, and test) in the dataset collected from a published repository as part of the i2b2 2010 challenge. The system’s performance is compared with that of three existing systems: QuickUMLS, BIO-CRF, and the Rules (i2b2) model. The proposed system’s average F1-score of 72.94% was 13% better than QuickUMLS, 3% better than BIO-CRF, and 30.1% better than the Rules (i2b2) model. Individually, the system performance was noticeably higher for problem-related concepts, with an F1-score of 80.45%, followed by treatment-related concepts and test-related concepts, with F1-scores of 76.06% and 55.3%, respectively. 
The proposed methodology significantly improves the performance of concept extraction from unstructured clinical narratives by exploiting the linguistic and lexical semantic features. The approach can ease the automatic annotation process of clinical data, which ultimately improves the performance of supervised data-driven applications trained with these data.
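The F1-scores above are derived from precision and recall over extracted concepts. As a minimal sketch, the function below scores predictions against gold annotations under exact matching of (start, end, type) tuples. The abstract does not state the matching criterion, so exact matching is an assumption here, and the spans are invented.

```python
def precision_recall_f1(gold, predicted):
    """Exact-match precision, recall and F1 over sets of concept annotations."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (start offset, end offset, concept type).
gold = [
    (0, 12, "problem"),
    (20, 29, "treatment"),
    (35, 44, "test"),
    (50, 60, "problem"),
]
predicted = [
    (0, 12, "problem"),     # correct
    (20, 29, "treatment"),  # correct
    (35, 44, "problem"),    # right span, wrong type: counts as an error
    (70, 75, "test"),       # spurious span
]

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

Scoring each concept type separately with the same function gives per-type F1 values of the kind reported for problem, treatment, and test concepts.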

To be or not to be three: a clinical narrative, an unanswered question

Couple and Family Psychoanalysis ◽

10.33212/cfp.v11n2.2021.113 ◽

2021 ◽

Vol 11 (2) ◽

pp. 113-128

Author(s):

Jill Savege Scharff

Keyword(s):

Family Of Origin ◽

Social Research ◽

Couple Relationship ◽

Unanswered Question ◽

The Unconscious ◽

Potential Child ◽

Psychoanalytic Literature ◽

Clinical Narrative ◽

Psychoanalytic Understanding ◽

Psychoanalytic Writing

The psychoanalytic literature deals with parenthood as a developmental stage but barely addresses the couple's preconception of fertility intentions. The author reviews the available literature from social research and psychoanalytic writing. Working with a couple over family of origin conflicts, she uncovers the hidden conflict over the wish to have or not have a child, reveals unconscious fantasies about the potential child, and deals with conflict in the otherwise compatible couple relationship itself. The author offers this clinical vignette to extend psychoanalytic understanding of the unconscious fantasies involved. She concludes with a discussion of transference towards the couple therapist as an infection to be avoided, an annoying parent to speed away from, and a disturbing child about whom the couple was ambivalent.

Transformation and Interpretation: The Case of Adam, A Clinical Narrative and Discussion

The Psychoanalytic Quarterly ◽

10.1080/00332828.2021.1938873 ◽

2021 ◽

Vol 90 (3) ◽

pp. 439-467

Author(s):

Margaret Ann Fitzpatrick Hanly ◽

Siri Erika Gullestad ◽

Robert S. White ◽

Ricardo Bernardi

Keyword(s):

Clinical Narrative

Oral ulcers in children- a clinical narrative overview

Italian Journal of Pediatrics ◽

10.1186/s13052-021-01097-2 ◽

2021 ◽

Vol 47 (1) ◽

Author(s):

Corinne Légeret ◽

Raoul Furlano

Keyword(s):

Treatment Options ◽

Correct Diagnosis ◽

Pemphigus Vulgaris ◽

Nutritional Deficiencies ◽

Stevens Johnson Syndrome ◽

Mucous Membrane Pemphigoid ◽

Duration Of Symptoms ◽

Family Doctors ◽

Oral Ulcers ◽

Clinical Narrative

Abstract: The prevalence of oral ulcers in children is reported to be 9%; however, diagnosis of oral lesions can be challenging, as they are a nonspecific symptom of several diseases. Differential diagnosis can range from classic infectious diseases of childhood (e.g. herpangina, hand-foot-and-mouth disease) through nutritional deficiencies, gastrointestinal disorders and inflammations (e.g. pemphigus vulgaris, lichen planus, mucous membrane pemphigoid) to side effects of medications (Stevens-Johnson syndrome) or chronic diseases (e.g. sarcoidosis, systemic lupus erythematosus, familial Mediterranean fever). Therefore, children with oral ulcers are treated by many different specialists such as dentists, family doctors, paediatricians, rheumatologists, haematologists, gastroenterologists and otorhinolaryngologists. A systematic literature search and a narrative literature review of the 48 diseases potentially connected to oral ulcers were performed. A tabular overview organised by duration of symptoms and size of the lesions was created to support the clinician in making a correct diagnosis; in addition, different treatment options are presented.

Multi-faceted Semantic Clustering With Text-derived Phenotypes

10.1101/2021.05.26.21257830 ◽

2021 ◽

Author(s):

Luke T Slater ◽

John A Williams ◽

Andreas Karwath ◽

Hilary Fanning ◽

Simon Ball ◽

...

Keyword(s):

Semantic Similarity ◽

Unitary Similarity ◽

Human Phenotype ◽

Semantic Clustering ◽

Formal Ontologies ◽

Clinical Narrative ◽

Nuanced Understanding ◽

Complex Relationships ◽

Evaluation Techniques ◽

Similarity Scores

Identification of ontology concepts in clinical narrative text enables the creation of phenotype profiles that can be associated with clinical entities, such as patients or drugs. Constructing patient phenotype profiles using formal ontologies enables their analysis via semantic similarity, in turn enabling the use of background knowledge in clustering or classification analyses. However, traditional semantic similarity approaches collapse complex relationships between patient phenotypes into a unitary similarity score for each pair of patients. Moreover, single scores may be based only on matching terms with the greatest information content (IC), ignoring other dimensions of patient similarity. This process necessarily leads to a loss of information in the resulting representation of patient similarity, and is especially apparent when using very large text-derived and highly multi-morbid phenotype profiles. Moreover, it renders finding a biological explanation for similarity very difficult: the black box problem. In this article, we explore the generation of multiple semantic similarity scores for patients based on different facets of their phenotypic manifestation, which we define through different sub-graphs in the Human Phenotype Ontology. We further present a new methodology for deriving sets of qualitative class descriptions for groups of entities described by ontology terms. Leveraging this strategy to obtain meaningful explanations for our semantic clusters alongside other evaluation techniques, we show that semantic clustering with ontology-derived facets enables the representation, and thus identification, of clinically relevant phenotype relationships not easily recoverable using overall clustering alone. In this way, we demonstrate the potential of faceted semantic clustering for gaining a deeper and more nuanced understanding of text-derived patient phenotypes.
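The idea of scoring patients separately per ontology facet can be illustrated with a toy example. The sketch below uses a Resnik-style measure (IC of the most informative common ancestor) combined with a best-match average, restricted to terms under a chosen facet root. Both choices are assumptions, since the abstract does not name the measures used. The ontology and patient profiles are invented and are not real Human Phenotype Ontology terms.

```python
import math

# Toy ontology as child -> parents (hypothetical terms, not real HPO classes).
PARENTS = {
    "root": [],
    "cardio": ["root"],
    "neuro": ["root"],
    "arrhythmia": ["cardio"],
    "tachycardia": ["arrhythmia"],
    "bradycardia": ["arrhythmia"],
    "seizure": ["neuro"],
    "headache": ["neuro"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(profiles):
    """IC(t) = -log of the fraction of patients annotated with t or a descendant."""
    counts = {}
    for terms in profiles.values():
        closure = set().union(*(ancestors(t) for t in terms))
        for term in closure:
            counts[term] = counts.get(term, 0) + 1
    return {t: -math.log(c / len(profiles)) for t, c in counts.items()}


def resnik(t1, t2, ic):
    """IC of the most informative ancestor shared by both terms."""
    common = ancestors(t1) & ancestors(t2)
    return max((ic.get(t, 0.0) for t in common), default=0.0)


def facet_similarity(a, b, facet_root, ic):
    """Best-match average over the terms of each patient under facet_root."""
    fa = [t for t in a if facet_root in ancestors(t)]
    fb = [t for t in b if facet_root in ancestors(t)]
    if not fa or not fb:
        return None  # facet not represented in one of the profiles
    best_a = [max(resnik(x, y, ic) for y in fb) for x in fa]
    best_b = [max(resnik(x, y, ic) for x in fa) for y in fb]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# Hypothetical patient phenotype profiles.
profiles = {
    "p1": {"tachycardia", "seizure"},
    "p2": {"bradycardia", "seizure"},
    "p3": {"tachycardia", "headache"},
    "p4": {"headache"},
}
ic = information_content(profiles)
scores = {
    facet: facet_similarity(profiles["p1"], profiles["p2"], facet, ic)
    for facet in ("cardio", "neuro")
}
print(scores)
```

In this toy data, patients p1 and p2 score higher on the neurological facet (identical term) than on the cardiac facet (sibling terms), a distinction that a single overall score would blur.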

Comparison of Machine Learning Algorithms for the Prediction of Current Procedural Terminology (CPT) Codes from Pathology Reports

10.1101/2021.03.13.21253502 ◽

2021 ◽

Author(s):

Joshua Levy ◽

Nishitha Vattikonda ◽

Christian Haudenschild ◽

Brock Christensen ◽

Louis Vaickus

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Machine Learning Algorithms ◽

Pathology Report ◽

Free Text ◽

Medical Procedure ◽

Current Procedural Terminology ◽

Learning Methods ◽

Clinical Narrative ◽

Pathology Reports

Background: Pathology reports serve as an auditable trail of a patient’s clinical narrative containing important free text pertaining to diagnosis, prognosis and specimen processing. Recent works have utilized sophisticated natural language processing (NLP) pipelines which include rule-based or machine learning analytics to uncover patterns from text to inform clinical endpoints and biomarker information. While deep learning methods have come to the forefront of NLP, there have been limited comparisons with the performance of other machine learning methods in extracting key insights for prediction of medical procedure information (Current Procedural Terminology; CPT codes), which informs insurance claims, medical research, and healthcare policy and utilization. Additionally, the utility of combining and ranking information from multiple report subfields as compared to exclusively using the diagnostic field for the prediction of CPT codes and signing pathologist remains unclear. Methods: After passing pathology reports through a preprocessing pipeline, we utilized advanced topic modeling techniques such as UMAP and LDA to identify topics with diagnostic relevance in order to characterize a cohort of 93,039 pathology reports at the Dartmouth-Hitchcock Department of Pathology and Laboratory Medicine (DPLM). We separately compared XGBoost, SVM, and BERT methodologies for prediction of 38 different CPT codes using 5-fold cross validation, using both the diagnostic text only as well as text from all subfields. We performed similar analyses for characterizing text from a group of the twenty pathologists with the most pathology report sign-outs. Finally, we interpreted report and cohort level important words using TF-IDF, Shapley Additive Explanations (SHAP), attention, and integrated gradients. Results: We identified 10 topics for both the diagnostic-only and all-fields text, which pertained to diagnostic and procedural information respectively. The topics were associated with select CPT codes, pathologists and report clusters. Operating on the diagnostic text alone, XGBoost performed similarly to BERT for prediction of CPT codes. When utilizing all report subfields, XGBoost outperformed BERT for prediction of CPT codes, though XGBoost and BERT performed similarly for prediction of signing pathologist. Both XGBoost and BERT outperformed SVM. Utilizing additional subfields of the pathology report increased prediction accuracy for the CPT code and pathologist classification tasks. Misclassification of pathologist was largely subspecialty related. We identified text that is CPT and pathologist specific. Conclusions: Our approach generated CPT code predictions with an accuracy higher than that reported in previous literature. While diagnostic text is an important information source for NLP pipelines in pathology, additional insights may be extracted from other report subfields. Although deep learning approaches did not outperform XGBoost approaches, they may lend valuable information to pipelines that combine image, text and -omics information. Future resource-saving opportunities exist for utilizing pathology reports to help hospitals detect mis-billing and estimate productivity metrics that pertain to pathologist compensation (RVUs).
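TF-IDF is one of the techniques listed for surfacing important words in a report. The following sketch shows the basic weighting on a handful of invented report snippets. It uses the plain, unsmoothed formulation (term frequency times the log of inverse document frequency), which is an assumption, since libraries and papers differ in the exact variant.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using unsmoothed TF-IDF."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical report snippets (assumption, not from the study cohort).
reports = [
    "skin biopsy basal cell carcinoma",
    "colon biopsy tubular adenoma",
    "skin excision melanoma margins",
]
weights = tf_idf(reports)

# Rank the terms of the first report by weight, highest first.
top_terms = sorted(weights[0], key=weights[0].get, reverse=True)
print(top_terms)
```

Terms shared across reports, such as "biopsy", receive lower weights than terms unique to one report, which is why the measure tends to highlight report-specific vocabulary.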

A Complex Model of Clinical Narrative Information for the Diagnostic Act

10.1101/2021.02.21.21252158 ◽

2021 ◽

Author(s):

David Chartash ◽

Marc B Rosenman ◽

Johan Bollen ◽

Markus Dickinson ◽

Stephen M Downs

Keyword(s):

Medical Record ◽

Causal Reasoning ◽

Clinical Medicine ◽

Clinical Care ◽

Complex Model ◽

Diagnostic Information ◽

Structure Theory ◽

Adequate Model ◽

Clinical Narrative ◽

Unstructured Information

Background: The act of diagnosis is one which precipitates semiotic closure, the complex integration of signs and symptoms through cognitive perspectives to ultimately activate causal reasoning and calibrate the assignment of a disease entity to the patient. In writing about this act, physicians encode both structured and unstructured information into the medical record. Unstructured information contains a latent structure which entwines both the cognitive components of the diagnostic act and the linguistic patterns associated with clinical documentation. Existing models of clinical language primarily use a physical or dialogic model of information as their basis, and do not adequately account for the complexity inherent in the diagnostic act. Methods: Framing the diagnostic information collected in clinical care as a narrative, we developed a model representative of said information, accounting for its content and structure, as well as the inherent complexity therein. Using an exemplar text, we present the use of known predication and semantic relations from ontological (the Unified Medical Language System) and linguistic theory (Rhetorical Structure Theory) to facilitate the operationalization of the model, and analyze the result. Results: The resulting model is demonstrated to be complex, representative of the clinical narrative text, and is fundamentally aligned with the clinical acts of both documentation and diagnosis. We find the model’s representation of the cognitive aspects of narrative consistent with models of reading, as well as an adequate model of information as presented by clinical medicine and the clinical sub-language. Conclusions: We present a model to represent diagnostic information in the physician’s note which accounts for the clinical and textual narrative precipitated by the cognition involved in encoding said information into the unstructured medical record. This model prepends the development of (computational) linguistic models of the clinical sublanguage within the physician’s note as it relates to diagnosis, beyond the information level of the lexical unit. Such analysis would facilitate better reflection on the structure and meaning of the clinical note, offering improvements to medical education and care.
