Unified Medical Language System
Recently Published Documents


TOTAL DOCUMENTS

135
(FIVE YEARS 54)

H-INDEX

15
(FIVE YEARS 4)

2021 ◽  
Vol 268 ◽  
pp. 552-561
Author(s):  
Katherine M Reitz ◽  
Daniel E Hall ◽  
Myrick C Shinall ◽  
Paula K Shireman ◽  
Jonathan C Silverstein

2021 ◽  
Author(s):  
Ilya Tyagin ◽  
Ilya Safro

In this paper, we present an approach for interpretable visualization of scientific hypotheses that is based on semantic concept interconnectivity and on network-based and topic modeling methods. Our visualization approach has numerous adjustable parameters, which provide domain experts with additional flexibility in their decision-making process. We also make use of the Unified Medical Language System metadata by integrating it directly into the resulting topics and adding variability to hypothesis resolution. To demonstrate the proposed approach in action, we deployed the end-to-end hypothesis generation pipeline AGATHA, which was evaluated by BioCreative VII experts with COVID-19-related queries.


2021 ◽  
Author(s):  
E. C. Wood ◽  
Amy K. Glen ◽  
Lindsey G. Kvarfordt ◽  
Finn Womack ◽  
Liliana Acevedo ◽  
...  

Background: Biomedical translational science is increasingly leveraging computational reasoning on large repositories of structured knowledge (such as the Unified Medical Language System (UMLS), the Semantic Medline Database (SemMedDB), ChEMBL, DrugBank, and the Small Molecule Pathway Database (SMPDB)) and data in order to facilitate discovery of new therapeutic targets and modalities. Since 2016, the NCATS Biomedical Data Translator project has been working to federate autonomous reasoning agents and knowledge providers within a distributed system for answering translational questions. Within that project and within the field more broadly, there is an urgent need for an open-source framework that can efficiently and reproducibly build an integrated, standards-compliant, and comprehensive biomedical knowledge graph that can be either downloaded in standard serialized form or queried via a public application programming interface (API) that accords with the FAIR data principles.
Results: To create a knowledge provider system within the Translator project, we have developed RTX-KG2, an open-source software system for building a biomedical knowledge graph and for hosting a web API to query it. RTX-KG2 uses an Extract-Transform-Load (ETL) approach to integrate 70 knowledge sources (including the aforementioned sources) into a single knowledge graph. The semantic layer and schema for RTX-KG2 follow the standard Biolink metamodel to maximize interoperability within Translator. RTX-KG2 is currently being used by multiple Translator reasoning agents, both in its downloadable form and via its SmartAPI-registered web interface. JavaScript Object Notation (JSON) serializations of RTX-KG2 are available for download in both pre-canonicalized form and canonicalized form (in which synonym concepts are merged). The current canonicalized version (KG2.7.3) of RTX-KG2 contains 6.4M concept nodes and 39.3M relationship edges with a rich set of 77 relationship types.
Conclusion: RTX-KG2 is the first open-source knowledge graph of which we are aware that integrates UMLS, SemMedDB, ChEMBL, DrugBank, SMPDB, and 65 additional knowledge sources within a knowledge graph that conforms to the Biolink standard for its semantic layer and schema at the intersections of these databases. RTX-KG2 is publicly available for querying via its API at arax.ncats.io/api/rtxkg2/v1.2/openapi.json. The code to build RTX-KG2 is publicly available at github:RTXteam/RTX-KG2.
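
The canonicalized form mentioned in this abstract, in which synonym concepts are merged, can be illustrated with a minimal sketch. This is not the RTX-KG2 code or its data format: the function name `canonicalize_edges`, the `EX:` identifiers, and the predicate names are hypothetical and chosen only for illustration. The sketch assumes that a synonym-to-canonical mapping has already been computed, and shows how remapping edge endpoints collapses duplicate edges.

```python
def canonicalize_edges(edges, canonical_id):
    """Remap edge endpoints to canonical node IDs and drop duplicate edges.

    edges: iterable of (subject, predicate, object) triples.
    canonical_id: dict mapping a synonym node ID to its canonical node ID;
                  IDs absent from the dict are treated as already canonical.
    """
    merged = set()
    for subj, pred, obj in edges:
        merged.add((canonical_id.get(subj, subj), pred, canonical_id.get(obj, obj)))
    return sorted(merged)


# Hypothetical toy data: two source identifiers for the same drug concept.
EDGES = [
    ("EX:aspirin_a", "treats", "EX:headache"),
    ("EX:aspirin_b", "treats", "EX:headache"),
    ("EX:aspirin_b", "interacts_with", "EX:warfarin"),
]
SYNONYMS = {"EX:aspirin_b": "EX:aspirin_a"}

CANONICAL_EDGES = canonicalize_edges(EDGES, SYNONYMS)
for edge in CANONICAL_EDGES:
    print(edge)
```

In this toy case three pre-canonicalized edges become two, because the two "treats" edges refer to the same concept once the synonyms are merged.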


2021 ◽  
Vol 30 (01) ◽  
pp. 189-189

Le DH. UFO: A tool for unifying biomedical ontology-based semantic similarity calculation, enrichment analysis and visualization. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235670
Robinson PN, Ravanmehr V, Jacobsen JOB, Danis D, Zhang XA, Carmody LC, Gargano MA, Thaxton CL, Core UNCB, Karlebach G, Reese J, Holtgrewe M, Kohler S, McMurry JA, Haendel MA, Smedley D. Interpretable Clinical Genomics with a Likelihood Ratio Paradigm. https://www.cell.com/ajhg/fulltext/S0002-9297(20)30230-5
Slater LT, Gkoutos GV, Hoehndorf R. Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-020-01336-2
Zheng F, Shi J, Yang Y, Zheng WJ, Cui L. A transformation-based method for auditing the IS-A hierarchy of biomedical terminologies in the Unified Medical Language System. https://pubmed.ncbi.nlm.nih.gov/32918476/


2021 ◽  
Vol 3 ◽  
Author(s):  
Samuel Dobbie ◽  
Huw Strafford ◽  
W. Owen Pickrell ◽  
Beata Fonferko-Shadrach ◽  
Carys Jones ◽  
...  

Across various domains, such as health and social care, law, news, and social media, increasing quantities of unstructured text are being produced. These potential data sources often contain rich information that could be used for domain-specific and research purposes. However, the unstructured nature of free-text data poses a significant challenge for its utilisation, because domain experts must intervene manually and extensively to label the embedded information. Annotation tools can assist with this process by providing functionality that enables the accurate capture and transformation of unstructured texts into structured annotations, which can be used individually or as part of larger Natural Language Processing (NLP) pipelines. We present Markup (https://www.getmarkup.com/), an open-source, web-based annotation tool that is undergoing continued development for use across all domains. Markup incorporates NLP and Active Learning (AL) technologies to enable rapid and accurate annotation using custom user configurations, predictive annotation suggestions, and automated mapping suggestions to both domain-specific ontologies, such as the Unified Medical Language System (UMLS), and custom, user-defined ontologies. We demonstrate a real-world use case in which Markup was used in a healthcare setting to annotate structured information from unstructured clinic letters, and the captured annotations were used to build and test NLP applications.


JAMIA Open ◽  
2021 ◽  
Vol 4 (3) ◽  
Author(s):  
Jennifer L Wilson ◽  
Mike Wong ◽  
Nicholas Stepanov ◽  
Dragutin Petkovic ◽  
Russ Altman

Objectives: We sought to cluster biological phenotypes using semantic similarity and create an easy-to-install, stable, and reproducible tool.
Materials and Methods: We generated Phenotype Clustering (PhenClust), a novel application of semantic similarity for interpreting biological phenotype associations, using the Unified Medical Language System (UMLS) metathesaurus, demonstrated the tool's application, and developed Docker containers with stable installations of two UMLS versions.
Results: PhenClust identified disease clusters for drug network-associated phenotypes and a meta-analysis of drug target candidates. The Dockerized containers eliminated the requirement that the user install the UMLS metathesaurus.
Discussion: Clustering phenotypes summarized all phenotypes associated with a drug network and two drug candidates. Docker containers can support dissemination and reproducibility of tools that are otherwise limited due to insufficient software support.
Conclusion: PhenClust can improve interpretation of high-throughput biological analyses where many phenotypes are associated with a query, and the Dockerized PhenClust achieved our objective of decreasing installation complexity.


Author(s):  
Sarah Dahir ◽  
Abderrahim El Qadi ◽  
Hamid Bennis

Information retrieval (IR) in the medical domain is considered a challenging task for many reasons. Short health queries tend to lack information on the user's intent, and the target corpus may not have sufficient information for relevance feedback. Even if users obtain documents relevant to their queries, they may find the technical terms difficult to understand. In this paper, we propose an approach for reformulating health queries based on graph matching between two external linked data sources: DBpedia and the Unified Medical Language System (UMLS). DBpedia has broad topic coverage and less noise than Wikipedia articles, and UMLS is specific to the medical domain. We also introduce degree centrality to measure graph connectivity and to select the most effective candidate terms for query expansion. Experimental results on the MEDLINE collection using Okapi BM25 as the retrieval model showed that our approach outperformed related methods and that the two sources achieved very good retrieval results. They helped diversify the retrieved documents and improve recall.
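
As a rough illustration of how degree centrality can rank candidate terms for query expansion, the following sketch computes normalized degree centrality (degree divided by n - 1) on a small undirected concept graph and returns the best-connected terms that are not already in the query. It is a sketch under stated assumptions, not the authors' implementation: the toy graph, the term names, and the helper `top_expansion_terms` are hypothetical, and the graph-matching step between DBpedia and UMLS is not shown.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as edge pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # self-loops do not contribute to degree here
        neighbors[u].add(v)
        neighbors[v].add(u)
    n = len(neighbors)
    if n <= 1:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def top_expansion_terms(edges, query_terms, k=2):
    """Pick the k most central terms that are not already part of the query."""
    scores = degree_centrality(edges)
    candidates = [(term, s) for term, s in scores.items() if term not in query_terms]
    # Highest centrality first; ties are broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [term for term, _ in candidates[:k]]


# Hypothetical concept graph around the query term "heart attack".
EDGES = [
    ("heart attack", "myocardial infarction"),
    ("myocardial infarction", "chest pain"),
    ("myocardial infarction", "coronary artery disease"),
    ("coronary artery disease", "chest pain"),
    ("heart attack", "chest pain"),
]

EXPANSION = top_expansion_terms(EDGES, {"heart attack"}, k=2)
print(EXPANSION)
```

The alphabetical tie-break is a design choice made here only so that the output is reproducible; a real system would more likely combine centrality with a relevance score.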


2021 ◽  
Author(s):  
Mahdi Abdollahi ◽  
Xiaoying Gao ◽  
Yi Mei ◽  
S Ghosh ◽  
J Li

Document classification (DC) is the task of assigning pre-defined labels to unseen documents by utilizing a model trained on the available labeled documents. DC has attracted much attention in medical fields recently because many issues can be formulated as a classification problem. It can assist doctors in decision making, and correct decisions can reduce medical expenses. Medical documents have special attributes that distinguish them from other texts and make them difficult to analyze. For example, their many acronyms, abbreviations, and short expressions make it more challenging to extract information. The classification accuracy of current medical DC methods is not satisfactory. The goal of this work is to enhance the input feature sets of the DC method to improve accuracy. To this end, a novel two-stage approach is proposed. In the first stage, a domain-specific dictionary, namely the Unified Medical Language System (UMLS), is employed to extract the key features belonging to the most relevant concepts, such as diseases or symptoms. In the second stage, Particle Swarm Optimisation (PSO) is applied to select more related features from the features extracted in the first stage. The performance of the proposed approach is evaluated on the 2010 Informatics for Integrating Biology and the Bedside (i2b2) data set, which is a widely used medical text dataset. The experimental results show that the proposed method substantially improves classification accuracy.


2021 ◽  
Author(s):  
Mahdi Abdollahi ◽  
Xiaoying Gao ◽  
Yi Mei ◽  
S Ghosh ◽  
J Li

Document classification (DC) is one of the most broadly investigated natural language processing tasks. Medical document classification can support doctors in making decisions and improve medical services. Since the data in document classification often appear in raw form, such as medical discharge notes, extracting meaningful information to use as features is a challenging task. Medical documents contain many specialized words and expressions, which make them more challenging to analyze. The classification accuracy of available methods in the medical field is not good enough. This work aims to improve the quality of the input feature sets to increase accuracy. A new three-stage approach is proposed. In the first stage, the Unified Medical Language System (UMLS), which is a medical-specific dictionary, is used to extract meaningful phrases by considering disease or symptom concepts. In the second stage, all possible pairs of the extracted concepts are created as new features. In the third stage, Particle Swarm Optimisation (PSO) is employed to select features from the features extracted and constructed in the previous stages. The experimental results show that the proposed three-stage method achieved substantial improvement over the existing medical DC approaches.
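
The second stage described in this abstract, creating all possible pairs of the extracted concepts as new features, can be sketched with `itertools.combinations`. This is only an illustration: the function name, the `|` separator, and the example concepts are hypothetical, and deduplicating and sorting the concepts before pairing is an assumption made here so that each unordered pair appears exactly once. The UMLS extraction stage and the PSO selection stage are not shown.

```python
from itertools import combinations


def construct_pair_features(concepts):
    """Build one feature per unordered pair of distinct concepts.

    Assumption: concepts are deduplicated and sorted first, so the pair
    (a, b) and the pair (b, a) map to the same feature name.
    """
    unique = sorted(set(concepts))
    return [f"{a}|{b}" for a, b in combinations(unique, 2)]


# Hypothetical concepts extracted from one document in the first stage.
CONCEPTS = ["fever", "cough", "asthma", "cough"]

PAIR_FEATURES = construct_pair_features(CONCEPTS)
print(PAIR_FEATURES)
```

With m distinct concepts this produces m(m - 1)/2 pair features, which grows quadratically and gives one reason a feature selection stage follows.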

