medical texts
Recently Published Documents


TOTAL DOCUMENTS: 492 (FIVE YEARS 162)
H-INDEX: 13 (FIVE YEARS 2)

2022 ◽  
Vol 3 (1) ◽  
pp. 1-27
Author(s):  
Md Momin Al Aziz ◽  
Tanbir Ahmed ◽  
Tasnia Faequa ◽  
Xiaoqian Jiang ◽  
Yiyu Yao ◽  
...  

Technological advancements in data science have offered us affordable storage and efficient algorithms to query a large volume of data. Our health records are a significant part of this data, which is pivotal for healthcare providers and can be utilized for our well-being. The clinical note in electronic health records is one such category: it collects a patient’s complete medical information at different timesteps of patient care and is available in the form of free text. Thus, these unstructured textual notes contain events from a patient’s admission to discharge, which can prove to be significant for future medical decisions. However, since these texts also contain sensitive information about the patient and the attending medical professionals, such notes cannot be shared publicly. This privacy issue has thwarted timely discoveries on this plethora of untapped information. Therefore, in this work, we intend to generate synthetic medical texts from a private or sanitized (de-identified) clinical text corpus and analyze their utility rigorously in different metrics and levels. Experimental results support the applicability of our generated data, as it achieves more than 80% accuracy in different pragmatic classification problems and matches (or outperforms) the original text data.


2022 ◽  
Vol 22 (1) ◽  
Author(s):  
Zhao Shuai ◽  
Diao Xiaolin ◽  
Yuan Jing ◽  
Huo Yanni ◽  
Cui Meng ◽  
...  

Abstract Background Automated ICD coding on medical texts via machine learning has been a hot topic. Related studies from the medical field rely heavily on conventional bag-of-words (BoW) as the feature extraction method, and do not commonly use more complicated methods, such as word2vec (W2V) and large pretrained models like BERT. This study aimed at uncovering the most effective feature extraction methods for coding models by comparing BoW, W2V and BERT variants. Methods We experimented with a Chinese dataset from Fuwai Hospital, which contains 6947 records and 1532 unique ICD codes, and a public Spanish dataset, which contains 1000 records and 2557 unique ICD codes. We designed coding tasks with different code frequency thresholds (denoted as $f_s$), with a lower threshold indicating a more complex task. Using traditional classifiers, we compared BoW, W2V and BERT variants on accomplishing these coding tasks. Results When $f_s$ was equal to or greater than 140 for the Fuwai dataset, and 60 for the Spanish dataset, the BERT variants with the whole network fine-tuned were the best method, leading to a Micro-F1 of 93.9% for the Fuwai data when $f_s=200$, and a Micro-F1 of 85.41% for the Spanish dataset when $f_s=180$. When $f_s$ fell below 140 for the Fuwai dataset, and 60 for the Spanish dataset, BoW turned out to be the best, leading to a Micro-F1 of 83% for the Fuwai dataset when $f_s=20$, and a Micro-F1 of 39.1% for the Spanish dataset when $f_s=20$. Our experiments also showed that both the BERT variants and BoW possessed good interpretability, which is important for medical applications of coding models. Conclusions This study shed light on building promising machine learning models for automated ICD coding by revealing the most effective feature extraction methods. 
Concretely, our results indicated that fine-tuning the whole network of the BERT variants was the optimal method for tasks covering only frequent codes, especially codes that represented unspecified diseases, while BoW was the best for tasks involving both frequent and infrequent codes. The frequency threshold at which the best-performing method changed differed between datasets due to factors like language and codeset.
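
As a rough illustration of the evaluation setup described in this abstract, the sketch below filters ICD codes by a frequency threshold and computes a micro-averaged F1 score. It is not the authors' code. The abstract does not define the threshold precisely, so the reading used here (keep a code only if it appears in at least `f_s` records) is an assumption, and the function names, records and predictions are hypothetical toy values.

```python
from collections import Counter


def filter_codes_by_frequency(records, f_s):
    """Keep only codes found in at least f_s records (assumed reading of the threshold)."""
    counts = Counter(code for codes in records for code in set(codes))
    kept = {code for code, n in counts.items() if n >= f_s}
    filtered = [sorted(set(codes) & kept) for codes in records]
    return filtered, kept


def micro_f1(gold, predicted):
    """Micro-averaged F1: pool true/false positives and false negatives over all records."""
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        g, p = set(g), set(p)
        tp += len(g & p)
        fp += len(p - g)
        fn += len(g - p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: each record is the list of codes assigned to one document.
records = [["I10", "E11"], ["I10"], ["I10", "J45"], ["E11"]]
filtered, kept = filter_codes_by_frequency(records, f_s=2)

# Hypothetical classifier output for the same four records.
predicted = [["I10"], ["I10"], ["I10", "E11"], ["E11"]]
score = micro_f1(filtered, predicted)
print(round(score, 3))  # 0.8
```

With a lower `f_s`, more rare codes survive the filter, which matches the abstract's remark that a lower threshold makes the task more complex.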


2021 ◽  
Vol 24 ◽  
Author(s):  
Ioannis Petropoulos

Although the extraordinary progress in medicine since the 19th century has made Hippocrates and Galen irrelevant, Greek and Greek-derived terms continue to be used in the medical sciences today. The marked ability of the Greek language to form compounds facilitated the expansion of its medical lexicon. Greek medicine evolved far longer than its modern counterpart; its enduring cachet has lent it an atemporality. This article traces the main stages in the history of the nearly continuous reception of Greek medical nomenclature across more than two millennia. The process is shown to have been inseparable from the transmission and editing of Greek medical texts and their translation into Latin, Arabic, and eventually into vernacular languages. The article also sheds incidental light on the history of translation and transliteration in Europe and the Arab world.


2021 ◽  
Vol 51 (44) ◽  
pp. 54-70
Author(s):  
Marlene Erschbamer

Himalayan peoples bathe in hot springs for medical and spiritual therapy. Included in local myths, hot springs are natural features that form a part of cultural memory and are social, cultural, religious, and medical venues. They also represent the tension between economic growth and environmental protection and, consequently, the competition between different parts of people’s identities. By analyzing religious, historical, and medical texts in combination with biographical accounts, a comprehensive picture of the cultural and religious significance of hot springs in the Himalayas is presented. The focus lies on Buddhist-influenced societies within the Tibetan Cultural Area, that is, those parts of the Himalayas that have been influenced by Tibetan culture.


Author(s):  
Alexander Sboev ◽  
Anton Selivanov ◽  
Ivan Moloshnikov ◽  
Roman Rybka ◽  
Artem Gryaznov ◽  
...  

Analyzing virtual media to predict society’s reaction to events or processes is a task of great relevance today, especially where it concerns meaningful information on healthcare problems. Internet sources contain a large amount of pharmacologically meaningful information useful for pharmacovigilance and drug repurposing. Analyzing information on this scale demands methods whose development requires a corpus with labeled relations among entities. Until now, no such Russian-language datasets have existed. This paper considers the first Russian-language dataset in which labeled entity pairs are divided into multiple contexts within a single text (by the drugs used, by different users, by cases of use, etc.), and a method based on the XLM-RoBERTa language model, previously trained on medical texts, to evaluate the state-of-the-art accuracy for the task of identifying four types of relationships among entities: ADR–Drugname, Drugname–Diseasename, Drugname–SourceInfoDrug, Diseasename–Indication. As shown on the presented dataset from the Russian Drug Review Corpus, the developed method achieves an F1-score of 81.2% (obtained using cross-validation and averaged over the four types of relationships), which is 7.8% higher than the basic classifiers.


Molecules ◽  
2021 ◽  
Vol 26 (22) ◽  
pp. 6933
Author(s):  
Martina Bottoni ◽  
Fabrizia Milani ◽  
Paolo M. Galimberti ◽  
Lucia Vignati ◽  
Patrizia Luise Romanini ◽  
...  

This work is based on the study of 150 majolica vases dating back to the mid-XVII century that once preserved medicinal remedies prepared in the ancient Pharmacy annexed to the Ospedale Maggiore Ca’ Granda in Milan (Lombardy, Italy). The Hortus simplicium was created in 1641 as a source of plant-based ingredients for those remedies. The main objective of the present work is to lay the knowledge base for the restoration of the ancient Garden for educational and informative purposes. Therefore, the following complementary phases were carried out: (i) the analysis of the inscriptions on the jars, along with the survey of historical medical texts, allowing for the positive identification of the plant ingredients of the remedies and their ancient use as medicines; (ii) the bibliographic research in modern pharmacological literature in order to validate or refute the historical uses; (iii) the realization of the checklist of plants potentially present in cultivation at the ancient Garden, concurrently with the comparison with the results of a previous in situ archaeobotanical study concerning pollen grains. For the species selection, drug amounts in the remedies and pedoclimatic conditions of the study area were also considered. Out of the 150 vases, 108 contained plant-based remedies, corresponding to 148 taxa. The remedies mainly treated gastrointestinal and respiratory disorders. At least one of the medicinal uses was validated in scientific literature for 112 out of the 148 examined species. Finally, a checklist of 40 taxa, presumably hosted in the Hortus simplicium, was assembled.


2021 ◽  
Vol 2 (6) ◽  
pp. 214-220
Author(s):  
Irina V. Telezhko ◽  

The problem of adequately translating the features of lexical units in German medical texts is considered. Lexical and terminological difficulties are identified and analyzed: the translation of terms, borrowings, abbreviations, translator’s false friends, and eponyms. Methods and techniques for translating problematic medical lexical units, as well as variants of interlanguage correspondences, are presented. Examples of successfully overcoming lexical and terminological difficulties in translating medical texts from German into Russian are given. The results of the study confirm the need to expand the scope of the study of professional medical discourse by considering industry-specific lexical units in the translation of medical texts.


2021 ◽  
Vol 21 (S9) ◽  
Author(s):  
Cheng Yan ◽  
Yuanzhe Zhang ◽  
Kang Liu ◽  
Jun Zhao ◽  
Yafei Shi ◽  
...  

Abstract Background Many medical mentions can be extracted from the huge volume of available medical texts. In order to make use of these medical mentions, a prerequisite step is to link them to a medical domain knowledge base (KB). Linking a mention to a well-defined, unambiguous KB is a necessary part of downstream applications such as disease diagnosis and prescription of drugs. This demand becomes more urgent in colloquial and informal situations like online medical consultation, where the medical language is more casual and vaguer. In this article, we propose an unsupervised method to link Chinese medical symptom mentions to the ICD10 classification in a colloquial setting. Methods We propose an unsupervised entity linking model using multi-instance learning (MIL). Our approach builds on a basic unsupervised entity linking method (named BEL), which in this paper is an embedding similarity-based EL model, and uses the MIL training paradigm to boost the performance of BEL. First, we construct a dataset from an unlabeled large-scale Chinese medical consultation corpus with the help of BEL. Subsequently, we use a variety of encoders to obtain the representations of the mention context and the ICD10 entities. Then the representations are fed into a ranking network to score candidate entities. Results We evaluate the proposed model on a test dataset annotated by professional doctors. The evaluation results show that our method achieves 60.34% accuracy, exceeding the fundamental BEL by 1.72%. Conclusions We propose an unsupervised method for entity linking in the medical domain that uses the MIL training paradigm. We annotate a test set for evaluation. The experimental results show that our model performs better than the fundamental model BEL, and provides an insight for future research.
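
The abstract describes BEL only as an embedding similarity-based entity linker, so the following is a minimal sketch of that general idea rather than the authors' model: candidate entities are ranked by cosine similarity to the mention vector. The three-dimensional vectors, the entity names and the helper names `cosine` and `rank_candidates` are all hypothetical; a real system would obtain embeddings from a trained encoder and would add the MIL training and ranking network on top.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(mention_vec, entity_vecs):
    """Return (entity, score) pairs sorted from most to least similar to the mention."""
    scored = [(name, cosine(mention_vec, vec)) for name, vec in entity_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for knowledge-base symptom entities.
entity_vecs = {
    "headache": [0.9, 0.1, 0.0],
    "cough": [0.1, 0.9, 0.1],
    "fever": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a colloquial mention such as "my head hurts".
mention_vec = [0.8, 0.2, 0.1]

ranking = rank_candidates(mention_vec, entity_vecs)
print(ranking[0][0])  # headache
```

The top-ranked entity is taken as the link for the mention. In the weakly supervised setting the abstract describes, output of this kind could serve as noisy labels for building a training set.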


Author(s):  
Adam Gabriel Dobrakowski ◽  
Agnieszka Mykowiecka ◽  
Małgorzata Marciniak ◽  
Wojciech Jaworski ◽  
Przemysław Biecek

Abstract Medical free-text records store a lot of useful information that can be exploited in developing computer-supported medicine. However, extracting knowledge from unstructured text is difficult and depends on the language. In the paper, we apply Natural Language Processing methods to process raw medical texts in Polish and propose a new methodology for clustering of patients’ visits. We (1) extract medical terminology from a corpus of free-text clinical records, (2) annotate data with medical concepts, (3) compute vector representations of medical concepts and validate them on the proposed term analogy tasks, (4) compute visit representations as vectors, (5) introduce a new method for clustering of patients’ visits and (6) apply the method to a corpus of 100,000 visits. We use several approaches to visual exploration that facilitate interpretation of segments. With our method, we obtain stable and separated segments of visits which are positively validated against final medical diagnoses. In this paper we show how an algorithm for segmentation of medical free-text records may be used to aid medical doctors. In addition to this, we share an implementation of the described methods, with examples, as an open-source package.
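
Steps (3) and (4) of this abstract can be pictured with a small sketch. The abstract does not say how visit vectors are computed or how the analogy tasks are scored, so two assumptions are made here: a visit is represented by the mean of its concept vectors, and an analogy "a is to b as c is to ?" is answered by the term closest in cosine similarity to `b - a + c`. The vectors, the English terms and the function names are hypothetical and do not come from the authors' open-source package.

```python
import math

# Hypothetical 3-dimensional concept vectors; real ones would be learned from the corpus.
concept_vecs = {
    "eye": [1.0, 0.0, 0.0],
    "ophthalmologist": [1.0, 0.0, 1.0],
    "heart": [0.0, 1.0, 0.0],
    "cardiologist": [0.0, 1.0, 1.0],
    "cough": [0.5, 0.5, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def visit_vector(concepts):
    """Assumed visit representation: component-wise mean of the visit's concept vectors."""
    vectors = [concept_vecs[c] for c in concepts]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def solve_analogy(a, b, c):
    """Answer 'a is to b as c is to ?' with the nearest term to b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(concept_vecs[a], concept_vecs[b], concept_vecs[c])]
    candidates = [term for term in concept_vecs if term not in (a, b, c)]
    return max(candidates, key=lambda term: cosine(target, concept_vecs[term]))


print(visit_vector(["eye", "ophthalmologist"]))          # [1.0, 0.0, 0.5]
print(solve_analogy("eye", "ophthalmologist", "heart"))  # cardiologist
```

Visit vectors of this kind could then be passed to any standard clustering algorithm. The clustering method the abstract introduces is not specified there and is not reproduced here.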

