A System for Medical Information Extraction and Verification from Unstructured Text

2020 ◽  
Vol 34 (08) ◽  
pp. 13314-13319
Author(s):  
Damir Juric ◽  
Giorgos Stoilos ◽  
Andre Melo ◽  
Jonathan Moore ◽  
Mohammad Khodadadi

A wealth of medical knowledge has been encoded in terminologies like SNOMED CT, NCI, FMA, and more. However, these resources usually lack information such as relations between diseases, symptoms, and risk factors, which prevents their use in diagnostic or other decision-making applications. In this paper we present a pipeline for extracting such information from unstructured text and enriching medical knowledge bases. Our approach uses Semantic Role Labelling (SRL) and is unsupervised. We show how we dealt with several deficiencies of SRL-based extraction, such as copula verbs and relations expressed through nouns, and how we assign scores to extracted triples. The system has so far extracted about 120K relations, and in-house doctors have verified about 5K relationships. We compared the output of the system with a manually constructed network of diseases, symptoms, and risk factors built by doctors over the course of a year. Our results show that our pipeline extracts good-quality, precise relations and speeds up the knowledge acquisition process considerably.
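
The abstract does not spell out how copula verbs and noun-expressed relations are handled, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes SRL output is already available as a loosely PropBank-style dictionary (`verb`, `ARG0`, `ARG1`, `ARG2`); the frame layout, the `COPULAS` set, the `NOUN_RELATION` pattern, and the function name `frame_to_triple` are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical list of copula forms; a real system would use lemmas.
COPULAS = {"be", "is", "are", "was", "were"}

# Hypothetical pattern for relations expressed through nouns,
# e.g. "a symptom of influenza" or "a risk factor for diabetes".
NOUN_RELATION = re.compile(
    r"^(?:(?:an?|the)\s+)?(?P<rel>\w+(?:\s+\w+)*?)\s+(?P<prep>of|for)\s+(?P<obj>.+)$",
    re.IGNORECASE,
)


def frame_to_triple(frame: Dict[str, str]) -> Optional[Triple]:
    """Turn one SRL frame into a (subject, relation, object) triple, if possible."""
    verb = frame.get("verb", "").lower()
    if verb in COPULAS:
        # The copula itself carries no relation; look inside the complement.
        match = NOUN_RELATION.match(frame.get("ARG2", ""))
        if match and "ARG1" in frame:
            relation = "_".join(match.group("rel").lower().split())
            relation = f"{relation}_{match.group('prep').lower()}"
            return (frame["ARG1"], relation, match.group("obj"))
        return None
    # Ordinary verb: agent and patient become subject and object.
    if "ARG0" in frame and "ARG1" in frame:
        return (frame["ARG0"], verb, frame["ARG1"])
    return None
```

With these assumptions, a frame for "Fever is a symptom of influenza" yields the relation `symptom_of` instead of the uninformative verb "is", while a frame for "Smoking causes lung cancer" keeps the verb as the relation.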

2019 ◽  
Vol 15 (3) ◽  
pp. 359-382 ◽  
Author(s):  
Nassim Abdeldjallal Otmani ◽  
Malik Si-Mohammed ◽  
Catherine Comparot ◽  
Pierre-Jean Charrel

Purpose: The purpose of this study is to propose a framework for extracting medical information from the Web using domain ontologies. Patient–doctor conversations have become prevalent on the Web. For instance, services like HealthTap or AskTheDoctors allow patients to ask doctors health-related questions. However, most online health-care consumers still struggle to express their questions efficiently, mainly because of the gap in language and knowledge between experts and laypeople. Extracting information from these layman descriptions, which typically lack expert terminology, is challenging. This hinders the efficiency of underlying applications such as information retrieval. Herein, an ontology-driven approach is proposed, which aims at extracting information from such sparse descriptions using a meta-model.
Design/methodology/approach: A meta-model is designed to bridge the gap between the vocabulary of medical experts and that of the consumers of health services. The meta-model is mapped to SNOMED-CT to access the comprehensive medical vocabulary, as well as to WordNet to improve the coverage of layman terms during information extraction. To assess the potential of the approach, an information extraction prototype based on syntactical patterns is implemented.
Findings: The evaluation of the approach on the gold standard corpus defined in Task 1 of ShARe CLEF 2013 showed promising results: an F-score of 0.79 for recognizing medical concepts in real-life medical documents.
Originality/value: The originality of the proposed approach lies in the way information is extracted. The context defined through a meta-model proved to be efficient for the task of information extraction, especially from layman descriptions.
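
For readers unfamiliar with the F-score reported above, here is a minimal sketch of how precision, recall, and the balanced F-score (F1) are commonly computed for concept recognition. It assumes exact matching of annotated spans; the abstract does not say whether the evaluation used strict or relaxed matching, and the function name and the span tuples in the usage note are hypothetical.

```python
from typing import Hashable, Set, Tuple


def precision_recall_f1(
    gold: Set[Hashable], predicted: Set[Hashable]
) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 over two sets of annotations."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

As a made-up usage example (not data from the study): if a gold standard holds four spans and a system predicts four spans of which three coincide with the gold ones, precision, recall, and F1 are all 0.75.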


1987 ◽  
Vol 26 (01) ◽  
pp. 31-39 ◽  
Author(s):  
S. Lester ◽  
R. Rada

Summary: The Medical Subject Headings (MeSH) of the National Library of Medicine may be viewed as a semantic network. The relationships in this semantic network are of a broader-than/narrower-than type. A knowledge base of this type may be augmented by adding new terms and new relationships to the network. The Current Medical Information and Terminology (CMIT) of the American Medical Association represents a rich source of relationships for the disease terms of MeSH. A subset of MeSH was augmented with the knowledge from a subset of CMIT using a matching and similarity strategy. The matching portion of the experiment showed that about half of CMIT may be directly merged with MeSH based on exact and partial matches and utilization of alternate and synonym terms from CMIT. The similarity portion of the experiment showed that a method of merging based on similarity of features is a workable approach to incorporating knowledge into MeSH when lexical matches are not available. Evaluation of the resulting merged knowledge base suggested that the etiology property of CMIT was the most highly inherited property. The augmented knowledge base was used as a basis for an automatic indexer. The indexer was less accurate after augmentation than before. One key difficulty stemmed from the way that CMIT was encoded into MeSH. More powerful encodings of CMIT into MeSH are being pursued. Building on MeSH, CMIT, and other such knowledge bases that already exist on the computer is one way to try to develop intelligent medical computer systems.
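
The two-stage strategy described above (lexical matching first, feature similarity as a fallback) can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure from the study: "partial match" is taken here to mean "same words in a different order", Jaccard overlap stands in for the unspecified similarity measure, and all function names, the threshold, and the sample terms are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def normalize(term: str) -> str:
    """Lowercase a term and collapse commas and whitespace."""
    return " ".join(term.lower().replace(",", " ").split())


def lexical_match(
    term: str, synonyms: Iterable[str], mesh_terms: Iterable[str]
) -> Optional[Tuple[str, str]]:
    """Try exact, then synonym, then word-order-insensitive matching."""
    index = {normalize(m): m for m in mesh_terms}
    if normalize(term) in index:
        return index[normalize(term)], "exact"
    for synonym in synonyms:
        if normalize(synonym) in index:
            return index[normalize(synonym)], "synonym"
    tokens = set(normalize(term).split())
    for normalized, original in index.items():
        # Assumed definition of a partial match: identical word sets.
        if tokens and set(normalized.split()) == tokens:
            return original, "partial"
    return None


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two feature sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(
    features: Set[str], mesh_features: Dict[str, Set[str]], threshold: float = 0.5
) -> Optional[str]:
    """Fallback: pick the MeSH term whose features overlap most, if above threshold."""
    best = max(
        mesh_features.items(),
        key=lambda item: jaccard(features, item[1]),
        default=None,
    )
    if best is None or jaccard(features, best[1]) < threshold:
        return None
    return best[0]
```

In this sketch, a term is handed to `most_similar` only when `lexical_match` returns `None`, mirroring the abstract's description of similarity as the approach used when lexical matches are not available.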


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Chao Zhang ◽  
Deyu Li ◽  
Yan Yan

In medical science, disease diagnosis is a difficult task for medical experts, who must deal with a large amount of uncertain medical information. Moreover, different medical experts may hold views about the medical knowledge base that differ slightly from those of their colleagues. Thus, to solve the problems of uncertain data analysis and group decision making in disease diagnosis, we propose a new rough set model, called the dual hesitant fuzzy multigranulation rough set over two universes, by combining dual hesitant fuzzy set theory and multigranulation rough set theory. Within this framework, we present the definition and some basic properties of the proposed model. Finally, we give a general approach that is applied to a decision-making problem in disease diagnosis, and we demonstrate its effectiveness with a numerical example.


1987 ◽  
Vol 26 (01) ◽  
pp. 3-12 ◽  
Author(s):  
J. M. Martin ◽  
L. Benamghar ◽  
B. Junod ◽  
P. Marrel

Summary: The problems of assisting in the medical decision-making process are attracting more and more attention. Actually, a certain number of computer systems have considerably improved the availability of medical data. However, we encounter some difficulties when extending these systems. In order to surmount these problems, it is necessary to proceed further in the analysis and comprehension of medical information and processes. To accomplish this goal, it is necessary to have a better understanding of the way in which a group of medical data is derived from one piece of medical knowledge and also how a chunk of medical knowledge is related to its corresponding medical data. This article is a beginning in the study of the transition from medical data to health knowledge, and this transition represents only part of the global entity, the nature, the representation, and use of medical information.


1989 ◽  
Vol 28 (02) ◽  
pp. 78-85 ◽  
Author(s):  
R. Linnarsson ◽  
O. Wigertz

Abstract: The medical information systems of the future will probably include the entire medical record as well as a knowledge base, providing decision support for the physician during patient care. Data dictionaries will play an important role in integrating the medical knowledge bases with the clinical databases. This article presents an infological data model of such an integrated medical information system. Medical events, medical terms, and medical facts are the basic concepts that constitute the model. To allow the transfer of information and knowledge between systems, the data dictionary should be organized with regard to several common classification schemes of medical nomenclature.


1999 ◽  
Vol 38 (04/05) ◽  
pp. 279-286 ◽  
Author(s):  
L. L. Weed

Abstract: It is widely recognised that accessing and processing medical information in libraries and patient records is a burden beyond the capacities of the physician’s unaided mind in the conditions of medical practice. Physicians are quite capable of tremendous intellectual feats but cannot possibly do it all. The way ahead requires the development of a framework in which the brilliant pieces of understanding are routinely assembled into a working unit of social machinery that is coherent and as error free as possible – a challenge in which we ourselves are among the working parts to be organized and brought under control. Such a framework of intellectual rigor and discipline in the practice of medicine can only be achieved if knowledge is embedded in tools; the system requiring the routine use of those tools in all decision making by both providers and patients.


1990 ◽  
Vol 29 (04) ◽  
pp. 386-392 ◽  
Author(s):  
R. Degani ◽  
G. Bortolan

Abstract: The main lines of the program designed for the interpretation of ECGs, developed in Padova by LADSEB-CNR with the cooperation of the Medical School of the University of Padova, are described. In particular, the strategies used for (i) morphology recognition, (ii) measurement evaluation, and (iii) linguistic decision making are illustrated. The main aspect that distinguishes this program from other approaches to computerized electrocardiography is its ability to manage imprecision in both the measurements and the medical knowledge through the use of fuzzy-set methodologies. So-called possibility distributions are used to represent ill-defined parameters as well as threshold limits for diagnostic criteria. In this way, smooth conclusions are derived when the evidence does not support a crisp decision. The influence of the CSE project on the evolution of the Padova program is illustrated.
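
To make the idea of a soft threshold concrete, here is a minimal sketch of how a possibility distribution can replace a crisp cut-off. It is a generic fuzzy-set illustration, not the Padova program's actual criteria: the linear ramp shape, the function names `soft_threshold` and `fuzzy_and`, and the numeric limits in the usage note are all assumptions chosen for readability.

```python
def soft_threshold(value: float, low: float, high: float) -> float:
    """Degree (0..1) to which `value` exceeds a fuzzy threshold.

    Below `low` the criterion is clearly not met (0.0), above `high`
    it is clearly met (1.0), and in between the degree rises linearly.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def fuzzy_and(*degrees: float) -> float:
    """Combine criteria with the minimum, a common fuzzy conjunction."""
    return min(degrees)
```

For example, with purely illustrative limits of 110 and 130 for some measurement, a value of 120 satisfies the criterion to degree 0.5 instead of flipping abruptly between "normal" and "abnormal"; combining it via `fuzzy_and` with a second criterion met to degree 0.8 gives an overall degree of 0.5.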

