Medical Reports Analysis Using Natural Language Processing for Disease Classification

Author(s):  
Sumathi S. ◽  
Indumathi S. ◽  
Rajkumar S.

Text classification in the medical domain could make large volumes of medical data easier to handle. Documents can be segregated by disease type, which can be determined by extracting the decisive key texts from the original document. Because of the many nuances involved in understanding language in general, algorithms require large volumes of text-based data to learn patterns properly. The problem with existing systems such as MedScape, MedLinePlus, Wrappin, and MedHunt is that they involve human interaction and are time-consuming when handling a large volume of data. Automating this task could remove much of the manpower involved, which in turn speeds up the classification of medical documents and helps address the shortage of medical technicians in third world countries.
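As a rough illustration of segregating documents by disease type from key terms, the following is a minimal sketch only. The category names, the `KEY_TERMS` table, and the `classify_report` helper are hypothetical and are not taken from the paper, which does not describe its method at this level of detail. A real system would learn its terms from a large labeled corpus.

```python
import re
from collections import Counter

# Hypothetical key terms per disease category (assumption for illustration).
KEY_TERMS = {
    "diabetes": {"insulin", "glucose", "hyperglycemia"},
    "epilepsy": {"seizure", "seizures", "convulsion", "eeg"},
    "asthma": {"wheezing", "inhaler", "bronchospasm"},
}


def classify_report(text):
    """Return the category whose key terms occur most often, or None if none match."""
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for category, terms in KEY_TERMS.items():
        counts[category] = sum(1 for tok in tokens if tok in terms)
    best, hits = counts.most_common(1)[0]
    return best if hits > 0 else None


report = "Patient presented with a prolonged seizure; EEG was ordered."
print(classify_report(report))  # prints "epilepsy"
```

Counting exact term matches ignores negation ("no seizure") and spelling variants, which is one reason learned models trained on large volumes of text are preferred in practice.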

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Tommaso Lo Barco ◽  
Mathieu Kuchenbuch ◽  
Nicolas Garcelon ◽  
Antoine Neuraz ◽  
Rima Nabbout

Abstract

Background: The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in healthcare. A promising use of big data in this field is to develop models to support early diagnosis and to establish natural history. Dravet Syndrome (DS) is a rare developmental and epileptic encephalopathy that commonly begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed until after 2 years of age, as it is difficult to differentiate DS at onset from FS. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. These concepts would allow earlier detection of patients with DS, resulting in earlier referral to expert centers that can provide early diagnosis and care.

Methods: Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing, a computer technology that processes written information. Using the Unified Medical Language System (UMLS) Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) with a confirmed diagnosis after the age of 4 years. A phenome-wide analysis was performed, evaluating the statistical associations between the phenotypes of DS and FS based on concepts found in the reports produced before 2 years and using a series of logistic regressions.

Results: We found a significantly higher representation of concepts related to seizure phenotypes distinguishing DS from FS in the first phases, namely the major recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, in relation to neurodevelopment and gait disorders.

Conclusions: Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.
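The idea of testing each concept for association with one cohort versus the other can be sketched as follows. This is not the authors' pipeline: the study used a series of logistic regressions, whereas this sketch swaps in a plain per-concept odds ratio from a 2x2 table, which matches the exponentiated coefficient of a logistic regression with a single binary predictor only when no continuity correction is applied. The cohort data below are invented toy values for illustration, and `odds_ratio` is a hypothetical helper.

```python
# Toy data, invented for illustration only (NOT from the study):
# each set holds the concepts found in one individual's reports before age 2.
ds_cohort = [
    {"complex febrile seizure", "gait disorder"},
    {"complex febrile seizure", "status epilepticus"},
    {"complex febrile seizure", "simple febrile seizure"},
    {"simple febrile seizure"},
]
fs_cohort = [
    {"simple febrile seizure"},
    {"simple febrile seizure", "complex febrile seizure"},
    {"simple febrile seizure"},
    {"simple febrile seizure", "gait disorder"},
]


def odds_ratio(concept, cases, controls):
    """Odds ratio of a concept appearing in cases versus controls."""
    a = sum(concept in person for person in cases)     # cases with the concept
    b = len(cases) - a                                 # cases without it
    c = sum(concept in person for person in controls)  # controls with the concept
    d = len(controls) - c                              # controls without it
    # Haldane-Anscombe correction: add 0.5 to every cell if any cell is zero,
    # so the ratio stays finite.
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


# Scan every concept seen in either cohort.
all_concepts = sorted(set().union(*ds_cohort, *fs_cohort))
for concept in all_concepts:
    print(concept, round(odds_ratio(concept, ds_cohort, fs_cohort), 2))
```

An odds ratio above 1 means the concept appears relatively more often in the DS toy cohort. A real phenome-wide analysis would also need significance testing and correction for the many concepts tested.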


2018 ◽  
Vol 13 (10) ◽  
pp. S772
Author(s):  
X. Sui ◽  
T. Liu ◽  
Q. Huang ◽  
Y. Hou ◽  
Y. Wang ◽  
...  

2014 ◽  
Vol 08 (03) ◽  
pp. 249-255
Author(s):  
Joseph R. Barr ◽  
Dimitri Popolov

This paper discusses principles for the design of natural language processing (NLP) systems that automatically extract data from doctors' notes, laboratory results and other medical documents in free-form text. We argue that rather than searching for "atom units of meaning" in the text and then trying to generalize them to a broader set of documents through an increasingly complicated system of rules, an NLP practitioner should take a concept as a whole and treat it as a meaningful unit of text. This simplifies the rules and makes the NLP system easier to maintain and adapt. The departure point is purely practical; however, a deeper investigation of typical problems with the implementation of such systems leads us to a discussion of broader linguistic theories underlying NLP practices, such as theories of metaphor and models of human communication.

