Development of a Medical-text Parsing Algorithm Based on Character Adjacent Probability Distribution for Japanese Radiology Reports

2008 ◽  
Vol 47 (06) ◽  
pp. 513-521 ◽  
Author(s):  
S. Terae ◽  
M. Uesugi ◽  
K. Ogasawara ◽  
T. Sakurai ◽  
N. Nishimoto

Summary Objectives: The objectives of this study were to investigate the transitional probability distribution of medical term boundaries between characters and to develop a parsing algorithm specifically for medical texts. Methods: Medical terms in Japanese computed tomography (CT) reports were identified using the ChaSen morphological analysis system. MeSH-based medical terms (51,385 entries), obtained from the metathesaurus in the Unified Medical Language System (UMLS, 2005AA), were added as a medical dictionary for ChaSen. A radiographer corrected the set of results containing 300 parsed CT reports. In addition, two radiologists checked the medical term parsing of 200 CT sentences. Results: We obtained modified inter-annotator agreement scores for the text corrected by the radiologists. We retrieved the transitional probability as the conditional probability of a uni-gram, bi-gram, and tri-gram. The highest transitional probability P(C_i | C_{i-2} C_{i-1}) was 1.00. As an example of an anatomical location, the term “pulmonary hilum” was parsed as a tri-gram. Conclusions: Retrieval of transitional probability will improve the accuracy of parsing compound medical terms.
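The tri-gram conditional probability mentioned in the summary can be illustrated with a minimal sketch. It assumes a plain maximum-likelihood estimate from raw character counts, which may differ from the authors' actual procedure; the function name `trigram_transition_probs` and the three toy strings are hypothetical and are not taken from the study's CT reports.

```python
from collections import Counter


def trigram_transition_probs(texts):
    """Estimate P(c_i | c_{i-2} c_{i-1}) from raw character counts."""
    context_counts = Counter()   # how often each two-character context is followed by anything
    trigram_counts = Counter()   # how often each (context, next character) pair occurs
    for text in texts:
        for i in range(2, len(text)):
            context = text[i - 2:i]
            trigram_counts[(context, text[i])] += 1
            context_counts[context] += 1
    return {
        (context, char): count / context_counts[context]
        for (context, char), count in trigram_counts.items()
    }


# Hypothetical toy corpus, not data from the paper.
corpus = ["肺門部リンパ節", "肺門部腫瘤", "右肺門部"]
probs = trigram_transition_probs(corpus)
print(probs[("肺門", "部")])
```

A probability close to 1.00 for a context means the next character is almost fully determined, which suggests the three characters belong to one term rather than straddling a boundary.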

2019 ◽  
Vol 44 (1) ◽  
Author(s):  
Julian Varghese ◽  
Michael Fujarski ◽  
Martin Dugas

Abstract StudyPortal was implemented as the first multilingual search platform for geographic visualization of clinical trials and scientific articles. The platform queries information from ClinicalTrials.gov, PubMed, a geodatabase and geographic maps to enable geospatial study search and real-time rendering of study locations or research networks on a map. Thus, disease-specific clinical studies or whole research networks can be shown in geographic proximity. Moreover, a semantic layer enables multilingual disease input and autosuggestion of medical terms based on the Unified Medical Language System. The portal is accessible at https://studyportal.uni-muenster.de. This paper presents details on the implementation of the novel search platform, its search evaluation and future work.


Author(s):  
Hannes Seuss ◽  
Peter Dankerl ◽  
Matthias Ihle ◽  
Andrea Grandjean ◽  
Rebecca Hammon ◽  
...  

Purpose Projects involving collaborations between different institutions require data security via selective de-identification of words or phrases. A semi-automated de-identification tool was developed and evaluated on different types of medical reports natively and after adapting the algorithm to the text structure. Materials and Methods A semi-automated de-identification tool was developed and evaluated for its sensitivity and specificity in detecting sensitive content in written reports. Data from 4671 pathology reports (4105 + 566 in two different formats), 2804 medical reports, 1008 operation reports, and 6223 radiology reports of 1167 patients suffering from breast cancer were de-identified. The content was itemized into four categories: direct identifiers (name, address), indirect identifiers (date of birth/operation, medical ID, etc.), medical terms, and filler words. The software was tested natively (without training) in order to establish a baseline. The reports were manually edited and the model re-trained for the next test set. After manually editing 25, 50, 100, 250, 500 and if applicable 1000 reports of each type re-training was applied. Results In the native test, 61.3 % of direct and 80.8 % of the indirect identifiers were detected. The performance (P) increased to 91.4 % (P25), 96.7 % (P50), 99.5 % (P100), 99.6 % (P250), 99.7 % (P500) and 100 % (P1000) for direct identifiers and to 93.2 % (P25), 97.9 % (P50), 97.2 % (P100), 98.9 % (P250), 99.0 % (P500) and 99.3 % (P1000) for indirect identifiers. Without training, 5.3 % of medical terms were falsely flagged as critical data. The performance increased, after training, to 4.0 % (P25), 3.6 % (P50), 4.0 % (P100), 3.7 % (P250), 4.3 % (P500), and 3.1 % (P1000). Roughly 0.1 % of filler words were falsely flagged. Conclusion Training of the developed de-identification tool continuously improved its performance. 
Training with roughly 100 edited reports enables reliable detection and labeling of sensitive data in different types of medical reports.


2021 ◽  
pp. medethics-2020-107192
Author(s):  
David Shaw ◽  
Alex Manara ◽  
Anne Laure Dalle Ave

In this paper, we discuss the largely neglected topic of semantics in medicine and the associated ethical issues. We analyse several key medical terms from the informed perspective of the healthcare professional, the lay perspective of the patient and the patient’s family, and the descriptive perspective of what the term actually signifies objectively. The choice of a particular medical term may deliver different meanings when viewed from these differing perspectives. Consequently, several ethical issues may arise. Technical terms that are not commonly understood by lay people may be used by physicians, consciously or not, and may obscure the understanding of the situation by lay people. The choice of particular medical terms may be accidental use of jargon, an attempt to ease the communication of psychologically difficult information, or an attempt to justify a preferred course of action and/or to manipulate the decision-making process.


2021 ◽  
pp. e2021092
Author(s):  
Gulsen Akoglu ◽  
Pelin Esme ◽  
Irem Yildiz

Background: The use of medical terms and folk names (euphemisms) affects a patient’s understanding of diseases and perceptions of severity. Objectives: We determine the psychological effects on patients with hidradenitis suppurativa of the medical and folk names of their disease. Methods: This was a cross-sectional and exploratory study conducted at a tertiary referral university hospital in Turkey. A questionnaire on the medical and folk names of hidradenitis suppurativa was administered to 31 males and 25 females. Results: The patients expressed that they found the medical term hidradenitis suppurativa to be incomprehensible because it is a foreign term. When they heard it for the first time, it evoked negative responses such as confusion and worry about their health. Half of the patients preferred their doctors to use a more understandable and pronounceable name. More than 80% of patients expressed feeling depressed and stigmatized by the folk name of their disease. They preferred the terms boils, abscesses, or hidradenitis when referring to their disease. Conclusion: Both medical and folk names for hidradenitis suppurativa have negative effects on patients, and most patients feel stigmatized by either term.


2004 ◽  
Vol 10 (3) ◽  
pp. 295
Author(s):  
Jung Ae Lee ◽  
Hwa Jeong Seo ◽  
Kee Won Kim ◽  
Mingoo Kim ◽  
Seung Kwon Hong ◽  
...  

1999 ◽  
Vol 38 (04/05) ◽  
pp. 303-307 ◽  
Author(s):  
I. Antipov ◽  
W. Hersh ◽  
C. A. Smith ◽  
M. Mailhot ◽  
H. J. Lowe

Abstract This paper describes preliminary work evaluating automated semantic indexing of radiology imaging reports to represent images stored in the Image Engine multimedia medical record system at the University of Pittsburgh Medical Center. The authors used the SAPHIRE indexing system to automatically identify important biomedical concepts within radiology reports and represent these concepts with terms from the 1998 edition of the U.S. National Library of Medicine’s Unified Medical Language System (UMLS) Metathesaurus. This automated UMLS indexing was then compared with manual UMLS indexing of the same reports. Human indexing identified appropriate UMLS Metathesaurus descriptors for 81% of the important biomedical concepts contained in the report set. SAPHIRE automatically identified UMLS Metathesaurus descriptors for 64% of the important biomedical concepts contained in the report set. The overall conclusions of this pilot study were that the UMLS Metathesaurus provided adequate coverage of the majority of the important concepts contained within the radiology report test set and that SAPHIRE could automatically identify and translate almost two thirds of these concepts into appropriate UMLS descriptors. Further work is required to improve both the recall and precision of this automated concept extraction process.


2014 ◽  
Vol 6 (2) ◽  
pp. 70-78 ◽  
Author(s):  
Kencana Wulan Argakusumah ◽  
Seng Hansun

A dictionary is a reference book containing words and phrases, usually arranged alphabetically and accompanied by descriptions of their meaning, usage, or translation. One kind of dictionary is a dictionary of terms, and a medical dictionary is a dictionary of medical terms. Printed medical dictionaries are typically thick books that are cumbersome and slow to use. To overcome this, a medical dictionary application is needed. Such an application requires a search process that keeps lookup time short. Searching uses the Boyer-Moore algorithm, a string search algorithm chosen for its fast search time: because it matches the pattern from right to left, it shortens search time and simplifies use of the medical dictionary application. The application was built for Android using the Java programming language with an SQLite database. In testing, all 45 respondents (100%) stated that the application returned accurate results for searched words. Index Terms - medical dictionary application, search string, Boyer-Moore algorithm, Java, SQLite
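The right-to-left matching described above can be sketched as follows. This is an illustration only, not the application's Java code: it implements Boyer-Moore with the bad-character rule alone (the full algorithm also applies a good-suffix rule), and the function name `boyer_moore_bad_char` and the sample strings are hypothetical.

```python
def boyer_moore_bad_char(text, pattern):
    """Return the index of the first match of pattern in text, or -1 if absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Rightmost index at which each character occurs in the pattern.
    last = {ch: i for i, ch in enumerate(pattern)}
    s = 0  # current alignment of the pattern against the text
    while s <= n - m:
        j = m - 1
        # Compare characters from the right end of the pattern to the left.
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        if j < 0:
            return s  # every character matched
        # Shift so the mismatched text character lines up with its rightmost
        # occurrence in the pattern, or skip past it if it does not occur.
        # The shift is always at least one position.
        s += max(1, j - last.get(text[s + j], -1))
    return -1


entry = "hidradenitis suppurativa"
print(boyer_moore_bad_char(entry, "suppurativa"))
```

Because a mismatch on a character that does not appear in the pattern lets the search jump a full pattern length, many text characters are never inspected, which is the source of the speed advantage in typical cases.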


2020 ◽  
Vol 27 (10) ◽  
pp. 1585-1592 ◽  
Author(s):  
Yonatan Bitton ◽  
Raphael Cohen ◽  
Tamar Schifter ◽  
Eitan Bachmat ◽  
Michael Elhadad ◽  
...  

Abstract Objective In Hebrew online health communities, participants commonly write medical terms that appear as transliterated forms of a source term in English. Such transliterations introduce high variability in text and challenge text-analytics methods. To reduce their variability, medical terms must be normalized, such as by linking them to Unified Medical Language System (UMLS) concepts. We present a method to identify both transliterated and translated Hebrew medical terms and link them with UMLS entities. Materials and Methods We investigate the effect of linking terms in Camoni, a popular Israeli online health community in Hebrew. Our method, MDTEL (Medical Deep Transliteration Entity Linking), includes (1) an attention-based recurrent neural network encoder-decoder to transliterate words and map UMLS from English to Hebrew, (2) an unsupervised method for creating a transliteration dataset in any language without manually labeled data, and (3) an efficient way to identify and link medical entities in the Hebrew corpus to UMLS concepts, by producing a high-recall list of candidate medical terms in the corpus, and then filtering the candidates to relevant medical terms. Results We carry out experiments on 3 disease-specific communities: diabetes, multiple sclerosis, and depression. MDTEL tagging and normalizing on Camoni posts achieved 99% accuracy, 92% recall, and 87% precision. When tagging and normalizing terms in queries from the Camoni search logs, UMLS-normalized queries improved search results in 46% of the cases. Conclusions Cross-lingual UMLS entity linking from Hebrew is possible and improves search performance across communities. Annotated datasets, annotation guidelines, and code are made available online (https://github.com/yonatanbitton/mdtel).


Author(s):  
I Gusti Agung Sri Rwa Jayantini ◽  
Ida Bagus Putra Yadnya ◽  
I Nyoman Suparwa ◽  
Ida Ayu Made Puspani

This paper focuses on identifying meaning components in the translation of English medical terms into Indonesian. The data used in this study are the English medical term disorder and its Indonesian equivalent penyakit (disease). The two terms are purposively chosen as the data of the present study, which is a comparative research on the lexical meaning investigation in two different languages. The investigation involving a particular term in one language and its equivalent in the other language is worth doing since the lexicons in every language have their own specific concepts that may be synonymous, yet they are not always interchangeable in all contexts. The analysis into meaning components is called decomposition by means of several semantic theories to analyse the meaning of a lexical item (Löbner 2013). Here, the meaning components of the two compared terms are demonstrated through a semantic approach, particularly Natural Semantic Metalanguage (NSM) supported by the investigation on their synonyms and how the terms are used in different contexts. The results show that the meaning components of a particular term in one language like the English term disorder are not always found in the Indonesian term penyakit, or, conversely, some of the meaning components of the Indonesian term do not always exist in the English term.

