PhenClust, a standalone tool for identifying trends within sets of biological phenotypes using semantic similarity and the Unified Medical Language System metathesaurus

Abstract
Objectives: We sought to cluster biological phenotypes using semantic similarity and create an easy-to-install, stable, and reproducible tool.
Materials and Methods: Using the Unified Medical Language System (UMLS) metathesaurus, we generated Phenotype Clustering (PhenClust), a novel application of semantic similarity for interpreting biological phenotype associations. We demonstrated the tool’s application and developed Docker containers with stable installations of two UMLS versions.
Results: PhenClust identified disease clusters for drug network-associated phenotypes and a meta-analysis of drug target candidates. The Dockerized containers eliminated the requirement that the user install the UMLS metathesaurus.
Discussion: Clustering phenotypes summarized all phenotypes associated with a drug network and two drug candidates. Docker containers can support dissemination and reproducibility of tools that are otherwise limited due to insufficient software support.
Conclusion: PhenClust can improve interpretation of high-throughput biological analyses where many phenotypes are associated with a query, and the Dockerized PhenClust achieved our objective of decreasing installation complexity.
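The abstract does not say which clustering algorithm PhenClust uses, so the following is only a minimal sketch of the general idea: group phenotype terms whose pairwise semantic similarity meets a threshold. The function names, the threshold, and the toy similarity scores are all hypothetical and are not taken from PhenClust or the UMLS.

```python
from itertools import combinations


def cluster_by_similarity(terms, similarity, threshold):
    """Group terms into connected components of the graph whose edges
    join every pair with similarity >= threshold (single-linkage style)."""
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the component root, shortening the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(terms, 2):
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), []).append(t)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical toy scores; a real tool would derive these from the UMLS.
TOY_SCORES = {
    frozenset(("asthma", "bronchitis")): 0.8,
    frozenset(("eczema", "psoriasis")): 0.7,
    frozenset(("asthma", "eczema")): 0.2,
}


def toy_similarity(a, b):
    # Pairs missing from the toy table are treated as unrelated.
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


phenotypes = ["asthma", "bronchitis", "eczema", "psoriasis"]
clusters = cluster_by_similarity(phenotypes, toy_similarity, threshold=0.5)
print(clusters)
```

With these toy scores and a threshold of 0.5, the respiratory and skin terms end up in separate groups. Raising the threshold above every score leaves each term in its own cluster.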

Semantic Similarity Measures between Terms in the Biomedical Domain within frame work Unified Medical Language System (UMLS)

International Journal of Computer Applications Technology and Research ◽

10.7753/ijcatr0708.1007 ◽

2018 ◽

Vol 7 (8) ◽

pp. 331-340

Author(s):

Abdelhakeem M. B. Abdelrahman ◽

Dr. Ahmad Kayed

Keyword(s):

Semantic Similarity ◽

Similarity Measures ◽

Biomedical Domain ◽

Language System ◽

Unified Medical Language System ◽

Medical Language ◽

Frame Work

Best Paper Selection

Yearbook of Medical Informatics ◽

10.1055/s-0041-1726509 ◽

2021 ◽

Vol 30 (01) ◽

pp. 189-189

Keyword(s):

Semantic Similarity ◽

Likelihood Ratio ◽

Enrichment Analysis ◽

Biomedical Ontology ◽

Biomedical Ontologies ◽

Clinical Genomics ◽

Language System ◽

Unified Medical Language System ◽

Medical Language ◽

Similarity Calculation

Le DH. UFO: A tool for unifying biomedical ontology-based semantic similarity calculation, enrichment analysis and visualization. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235670

Robinson PN, Ravanmehr V, Jacobsen JOB, Danis D, Zhang XA, Carmody LC, Gargano MA, Thaxton CL, Core UNCB, Karlebach G, Reese J, Holtgrewe M, Kohler S, McMurry JA, Haendel MA, Smedley D. Interpretable Clinical Genomics with a Likelihood Ratio Paradigm. https://www.cell.com/ajhg/fulltext/S0002-9297(20)30230-5

Slater LT, Gkoutos GV, Hoehndorf R. Towards semantic interoperability: finding and repairing hidden contradictions in biomedical ontologies. https://bmcmedinformdecismak.biomedcentral.com/articles/10.1186/s12911-020-01336-2

Zheng F, Shi J, Yang Y, Zheng WJ, Cui L. A transformation-based method for auditing the IS-A hierarchy of biomedical terminologies in the Unified Medical Language System. https://pubmed.ncbi.nlm.nih.gov/32918476/

Unified Medical Language System

10.32388/urur42 ◽

2020 ◽

Cited By ~ 2

Keyword(s):

Language System ◽

Unified Medical Language System ◽

Medical Language

Using Semantic and Structural Properties of the Unified Medical Language System to Discover Potential Terminological Relationships

Journal of the American Medical Informatics Association ◽

10.1197/jamia.m2931 ◽

2009 ◽

Vol 16 (3) ◽

pp. 346-353 ◽

Cited By ~ 10

Author(s):

C. O. Patel ◽

J. J. Cimino

Keyword(s):

Structural Properties ◽

Language System ◽

Unified Medical Language System ◽

Medical Language

Auditing the Unified Medical Language System with Semantic Methods

Journal of the American Medical Informatics Association ◽

10.1136/jamia.1998.0050041 ◽

1998 ◽

Vol 5 (1) ◽

pp. 41-51 ◽

Cited By ~ 48

Author(s):

J. J. Cimino

Keyword(s):

Language System ◽

Unified Medical Language System ◽

Medical Language

The outline of Unified Medical Language System (UMLS) Knowledge Sources.

Journal of Information Processing and Management ◽

10.1241/johokanri.41.15 ◽

1998 ◽

Vol 41 (1) ◽

pp. 15-23

Author(s):

Koreni KAWANO

Keyword(s):

Knowledge Sources ◽

Language System ◽

Unified Medical Language System ◽

Medical Language

Unified Medical Language System

Electronic Health Record ◽

10.1002/9781118479612.ch16 ◽

2012 ◽

pp. 145-152 ◽

Cited By ~ 1

Keyword(s):

Language System ◽

Unified Medical Language System ◽

Medical Language

IAIMS and UMLS at Columbia-Presbyterian Medical Center

Medical Decision Making ◽

10.1177/0272989x9101104s17 ◽

1991 ◽

Vol 11 (4_suppl) ◽

pp. S89-S93 ◽

Cited By ~ 4

Author(s):

James J. Cimino ◽

Soumitra Sengupta

Keyword(s):

Information Management ◽

Management System ◽

Medical Center ◽

Information Management System ◽

Language System ◽

Unified Medical Language System ◽

Medical Language ◽

Academic Information

The authors use an example to illustrate combining Integrated Academic Information Management System (IAIMS) components (applications) into an integral whole, to facilitate using the components simultaneously or in sequence. They examine a model for classifying IAIMS systems, proposing ways in which the Unified Medical Language System (UMLS) can be exploited in them.

Mining Biomedical Data Using MetaMap Transfer (MMTx) and the Unified Medical Language System (UMLS)

Gene Function Analysis - Methods in Molecular Biology™ ◽

10.1007/978-1-59745-547-3_9 ◽

2007 ◽

pp. 153-169 ◽

Cited By ~ 15

Author(s):

John D. Osborne ◽

Simon Lin ◽

Lihua Julie Zhu ◽

Warren A. Kibbe

Keyword(s):

Biomedical Data ◽

Language System ◽

Unified Medical Language System ◽

Medical Language

Use of word and graph embedding to measure semantic relatedness between Unified Medical Language System concepts

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa136 ◽

2020 ◽

Vol 27 (10) ◽

pp. 1538-1546 ◽

Cited By ~ 1

Author(s):

Yuqing Mao ◽

Kin Wah Fung

Keyword(s):

Word Sense Disambiguation ◽

Graph Embedding ◽

Semantic Relatedness ◽

Word Sense ◽

Medical Subject Headings ◽

Network Graph ◽

Convolutional Network ◽

Language System ◽

Unified Medical Language System ◽

Medical Language

Abstract
Objective: The study sought to explore the use of deep learning techniques to measure the semantic relatedness between Unified Medical Language System (UMLS) concepts.
Materials and Methods: Concept sentence embeddings were generated for UMLS concepts by applying the word embedding models BioWordVec and various flavors of BERT to concept sentences formed by concatenating UMLS terms. Graph embeddings were generated by the graph convolutional networks and 4 knowledge graph embedding models, using graphs built from UMLS hierarchical relations. Semantic relatedness was measured by the cosine between the concepts’ embedding vectors. Performance was compared with 2 traditional path-based (shortest path and Leacock-Chodorow) measurements and the publicly available concept embeddings, cui2vec, generated from large biomedical corpora. The concept sentence embeddings were also evaluated on a word sense disambiguation (WSD) task. Reference standards used included the semantic relatedness and semantic similarity datasets from the University of Minnesota, concept pairs generated from the Standardized MedDRA Queries and the MeSH (Medical Subject Headings) WSD corpus.
Results: Sentence embeddings generated by BioWordVec outperformed all other methods used individually in semantic relatedness measurements. Graph convolutional network graph embedding uniformly outperformed path-based measurements and was better than some word embeddings for the Standardized MedDRA Queries dataset. When used together, combined word and graph embedding achieved the best performance in all datasets. For WSD, the enhanced versions of BERT outperformed BioWordVec.
Conclusions: Word and graph embedding techniques can be used to harness terms and relations in the UMLS to measure semantic relatedness between concepts. Concept sentence embedding outperforms path-based measurements and cui2vec, and can be further enhanced by combining with graph embedding.
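The two families of measures named in this abstract can be illustrated with a small sketch: cosine similarity between embedding vectors, and a path-based Leacock-Chodorow score over an is-a hierarchy. This is not the authors' code. The toy hierarchy, the concept names, and the function names are hypothetical. The Leacock-Chodorow form used here, -log(path length in nodes / (2 × maximum depth)), is one common formulation and may differ in detail from the variant the study used.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def path_length_in_nodes(edges, start, goal):
    """Number of nodes on the shortest path between two concepts, found by
    breadth-first search over the hierarchy treated as an undirected graph."""
    neighbors = {}
    for child, parent in edges:
        neighbors.setdefault(child, set()).add(parent)
        neighbors.setdefault(parent, set()).add(child)
    queue = deque([(start, 1)])
    seen = {start}
    while queue:
        node, count = queue.popleft()
        if node == goal:
            return count
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, count + 1))
    return None  # the two concepts are not connected


def leacock_chodorow(edges, a, b, max_depth):
    """Assumed formulation: -log(path length in nodes / (2 * max_depth))."""
    length = path_length_in_nodes(edges, a, b)
    if length is None:
        return None
    return -math.log(length / (2 * max_depth))


# Hypothetical toy is-a hierarchy as (child, parent) pairs; depth counted in nodes is 3.
TOY_EDGES = [
    ("lung disease", "disease"),
    ("skin disease", "disease"),
    ("asthma", "lung disease"),
    ("bronchitis", "lung disease"),
    ("eczema", "skin disease"),
]

near = leacock_chodorow(TOY_EDGES, "asthma", "bronchitis", max_depth=3)
far = leacock_chodorow(TOY_EDGES, "asthma", "eczema", max_depth=3)
print(near > far)  # siblings score higher than concepts in different branches
print(cosine([1.0, 2.0], [2.0, 4.0]))  # parallel vectors give a cosine close to 1
```

The path-based score depends only on the hierarchy, while the cosine depends only on the embedding vectors, which is why the two can be compared or combined as the abstract describes.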
