Improving early diagnosis of rare diseases using Natural Language Processing in unstructured medical records: an illustration from Dravet syndrome

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Tommaso Lo Barco ◽  
Mathieu Kuchenbuch ◽  
Nicolas Garcelon ◽  
Antoine Neuraz ◽  
Rima Nabbout

Abstract
Background: The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in health care. A promising use of big data in this field is to develop models to support early diagnosis and to establish natural history. Dravet Syndrome (DS) is a rare developmental and epileptic encephalopathy that commonly begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed beyond the age of 2 years, as it is difficult to differentiate DS at onset from FS. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. These concepts would allow earlier detection of patients with DS, resulting in earlier referral to expert centers that can provide early diagnosis and care.
Methods: Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing, a computer technology for processing written information. Using the Unified Medical Language System Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) with confirmed diagnosis after the age of 4 years. A phenome-wide analysis was performed to evaluate the statistical associations between the phenotypes of DS and FS, based on concepts found in the reports produced before 2 years and using a series of logistic regressions.
Results: We found a significantly higher representation of concepts related to seizure phenotypes distinguishing DS from FS in the first phases, namely a higher recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, in relation to neurodevelopment and gait disorders.
Conclusions: Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.
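The phenome-wide step described in the Methods can be illustrated with a minimal sketch that tests each concept for association with cohort membership. This is not the authors' pipeline: the abstract does not state which covariates, software, or multiple-testing correction were used. The sketch assumes one unadjusted logistic regression per concept with a single binary predictor (concept present or absent), in which case the fitted odds ratio equals the odds ratio of the 2x2 table and a Wald test can be computed directly. The function name, the 0.5 correction for empty cells, and the Bonferroni threshold are all assumptions made for illustration.

```python
import math


def concept_associations(ds_patients, fs_patients, alpha=0.05):
    """Test each concept for association with DS versus FS.

    ds_patients / fs_patients: lists with one set of concept labels per
    patient (concepts found in that patient's reports before age 2).
    Hypothetical helper; not taken from the paper.
    """
    concepts = set().union(*ds_patients, *fs_patients)
    rows = []
    for concept in sorted(concepts):
        a = sum(concept in p for p in ds_patients)  # DS, concept present
        b = len(ds_patients) - a                    # DS, concept absent
        c = sum(concept in p for p in fs_patients)  # FS, concept present
        d = len(fs_patients) - c                    # FS, concept absent
        if 0 in (a, b, c, d):
            # Assumption: add 0.5 to every cell so the odds ratio is defined.
            a, b, c, d = (x + 0.5 for x in (a, b, c, d))
        odds_ratio = (a * d) / (b * c)
        # Wald standard error of the log odds ratio.
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        z = math.log(odds_ratio) / se
        # Two-sided p-value from the standard normal distribution.
        p_value = math.erfc(abs(z) / math.sqrt(2))
        rows.append({"concept": concept, "odds_ratio": odds_ratio,
                     "p_value": p_value})
    # Assumption: Bonferroni correction across all tested concepts.
    threshold = alpha / len(rows) if rows else alpha
    for row in rows:
        row["significant"] = row["p_value"] < threshold
    return sorted(rows, key=lambda r: r["p_value"])


if __name__ == "__main__":
    # Toy data with placeholder labels, not real cohort data.
    ds = [{"concept_A", "concept_B"}] * 8 + [{"concept_B"}] * 2
    fs = [{"concept_A", "concept_B"}] * 1 + [{"concept_B"}] * 9
    for row in concept_associations(ds, fs):
        print(row)
```

A real analysis would more plausibly fit the regressions with a statistics package so that covariates such as age at report or number of reports can be included; the closed form above only holds for the unadjusted, single-predictor case.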

2021 ◽  
Author(s):  
Tommaso Lo Barco ◽  
Mathieu Kuchenbuch ◽  
Nicolas Garcelon ◽  
Antoine Neuraz ◽  
Rima Nabbout

Abstract
Background: The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in health care. A promising use of big data in this field is to develop models to support early diagnosis and to establish natural history. Dravet Syndrome (DS) is a rare Developmental and Epileptic Encephalopathy that commonly begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed beyond the age of two years, as it is difficult to differentiate DS at onset from FS. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of two years than in those of individuals with FS. These concepts would allow earlier detection of patients with DS, resulting in earlier referral to expert centers that can provide early diagnosis and care.
Methods: Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing, a computer technology for processing written information. Using the Unified Medical Language System Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) with confirmed diagnosis after the age of four years. A phenome-wide analysis was performed to evaluate the statistical associations between the phenotypes of DS and FS, based on concepts found in the reports produced before two years and using a series of logistic regressions.
Results: We found a significantly higher representation of concepts related to seizure phenotype distinguishing DS from FS in the first phases, namely a higher recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, in relation to neurodevelopment and gait disorders.
Conclusions: Narrative medical reports of individuals younger than two years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.


2018 ◽  
Vol 13 (10) ◽  
pp. S772
Author(s):  
X. Sui ◽  
T. Liu ◽  
Q. Huang ◽  
Y. Hou ◽  
Y. Wang ◽  
...  

2019 ◽  
Vol 08 (02) ◽  
pp. 031-037
Author(s):  
Tyler J. Burr ◽  
Karen L. Skjei

Abstract
Dravet's syndrome (DS) or severe myoclonic epilepsy of infancy is a rare, genetic, and infantile-onset epileptic encephalopathy. DS presents with recurrent febrile seizures and/or febrile status epilepticus in developmentally normal infants, and subsequently evolves into a drug-resistant mixed-seizure disorder with developmental arrest or regression. As many defining clinical features of DS do not become evident until 3 to 4 years of age, diagnosis is often delayed. Early seizure control, particularly the prevention of status epilepticus in infancy, has been shown to correlate with better long-term outcomes. Thus, early diagnosis and seizure control are crucial. Several treatment algorithms have been published in recent years to guide antiepileptic drug selection and escalation. Last year, two agents, stiripentol and cannabidiol, were approved by the U.S. Food and Drug Administration specifically for use in DS, and a third has been submitted (fenfluramine). Additional therapies, including serotonin modulators lorcaserin and trazodone, verapamil, and several first-in-class medications, are currently in various phases of investigation.


Author(s):  
Matthias Hölscher ◽  
Rudiger Buchkremer

Rare diseases in their entirety have a substantial impact on the healthcare market, as they affect a large number of patients worldwide. Governments provide financial support for diagnosis and treatment. Market orientation is crucial for any market participant to achieve business profitability. However, the market for rare diseases is opaque. The authors compare results from search engines and healthcare databases utilizing natural language processing. The approach starts with an information retrieval process, applying the MeSH thesaurus. The results are prioritized and visualized using word clouds. In total, the chapter examines 30 rare diseases and about 500,000 search results from the databases PubMed and FindZebra and the search engine Google. The authors compare their results to the search for common diseases. The authors conclude that FindZebra and Google provide relatively good results for the evaluation of therapies and diagnoses. However, the quantity of the findings from professional databases such as PubMed remains unsurpassed.

