A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships

Abstract. Objective: Large and complex terminologies, such as Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT), are prone to errors and inconsistencies. Abstraction networks are compact summarizations of the content and structure of a terminology and have been shown to support terminology quality assurance. In this paper, we introduce an abstraction network derivation methodology that can be applied to SNOMED CT target hierarchies whose classes are defined using only hierarchical relationships (i.e., without attribute relationships) and to similar description-logic-based terminologies. Methods: We introduce the tribal abstraction network (TAN), based on the notion of a tribe (a subhierarchy rooted at a child of a hierarchy root), assuming only the existence of concepts with multiple parents. The TAN summarizes a hierarchy that does not have attribute relationships using sets of concepts, called tribal units, that belong to exactly the same multiple tribes. Tribal units are further divided into refined tribal units, which contain closely related concepts. A quality assurance methodology that utilizes TAN summarizations is introduced. Results: A TAN is derived for the Observable entity hierarchy of SNOMED CT, summarizing its content. A TAN-based quality assurance review of the concepts of the hierarchy is performed, and erroneous concepts are shown to appear more frequently in large refined tribal units than in small refined tribal units. Furthermore, more erroneous concepts appear in large refined tribal units of more tribes than of fewer tribes. Conclusions: In this paper we introduce the TAN for summarizing SNOMED CT target hierarchies. A TAN was derived for the Observable entity hierarchy of SNOMED CT. A quality assurance methodology utilizing the TAN was introduced and demonstrated.
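The grouping idea described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `parents` dictionary, the function name `tribal_units`, and the toy concept names are all hypothetical, the hierarchy is assumed to be acyclic, and refined tribal units are not modeled. Each child of the root is treated as a tribe root, every concept inherits the tribes of its parents, and concepts are grouped by the exact set of tribes they belong to, keeping only sets of two or more tribes.

```python
from collections import defaultdict


def tribal_units(parents, root):
    """Group concepts by the exact set of tribes they belong to.

    parents: dict mapping each concept to a list of its parents
             (a hypothetical toy encoding of IS-A links, assumed acyclic).
    root:    the hierarchy root; each of its children is a tribe root.
    Returns a dict {frozenset of tribe roots: set of concepts}, restricted
    to concepts that belong to two or more tribes.
    """
    memo = {}

    def tribes_of(concept):
        # A child of the root is the root of its own tribe; every other
        # concept inherits the union of its parents' tribes.
        if concept in memo:
            return memo[concept]
        tribes = set()
        for parent in parents.get(concept, []):
            if parent == root:
                tribes.add(concept)
            else:
                tribes |= tribes_of(parent)
        memo[concept] = frozenset(tribes)
        return memo[concept]

    units = defaultdict(set)
    for concept in parents:
        if concept == root:
            continue
        tribes = tribes_of(concept)
        if len(tribes) >= 2:  # only concepts shared by multiple tribes
            units[tribes].add(concept)
    return dict(units)


# Toy hierarchy (hypothetical names): A, B and C are children of the root.
toy_parents = {
    "A": ["Root"], "B": ["Root"], "C": ["Root"],
    "a1": ["A"],
    "ab1": ["A", "B"],
    "ab2": ["a1", "B"],
    "abc": ["ab1", "C"],
}
toy_units = tribal_units(toy_parents, "Root")
for tribes, concepts in toy_units.items():
    print(sorted(tribes), sorted(concepts))
```

In the toy hierarchy, `ab1` and `ab2` fall into the unit for tribes A and B, while `abc` forms its own unit for tribes A, B and C. Concepts that belong to a single tribe, such as `a1`, are left out of the result.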

SNOMED CT Implementation

Methods of Information in Medicine ◽

10.3414/me11-02-0023 ◽

2012 ◽

Vol 51 (06) ◽

pp. 529-538 ◽

Cited By ~ 17

Author(s):

K. Rosenbeck Gøeg ◽

A. Randorff Højen

Keyword(s):

Quality Assurance ◽

Clinical Practice ◽

Electronic Health Record ◽

Quality Criteria ◽

Clinical Information ◽

Health Record ◽

Snomed Ct ◽

Depth Analysis ◽

Systematized Nomenclature Of Medicine ◽

Future Work

Summary: Clinical practice, as well as research and quality assurance, benefits from unambiguous clinical information resulting from the use of a common terminology like the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT). A common terminology is necessary to enable consistent reuse of data and to support semantic interoperability. Managing the use of terminology for large cross-specialty Electronic Health Record systems (EHR systems), or beyond the level of a single EHR system, requires that mappings are kept consistent. The objective of this study is to provide a clear methodology for SNOMED CT mapping to enhance the applicability of SNOMED CT despite incompleteness and redundancy. Such mapping guidelines are presented based on an in-depth analysis of 14 different EHR templates retrieved from five Danish and Swedish EHR systems. Each mapping is assessed against defined quality criteria, and mapping guidelines are specified. Future work will include guideline validation.

Quality assurance and enrichment of biological and biomedical ontologies and terminologies

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01342-4 ◽

2020 ◽

Vol 20 (S10) ◽

Author(s):

Ankur Agrawal ◽

Licong Cui

Keyword(s):

Quality Assurance ◽

Cancer Registries ◽

Supplement Issue ◽

Biomedical Ontologies ◽

Snomed Ct ◽

Unified Medical Language System ◽

Domain Specific ◽

Healthcare Settings ◽

Domain Specific Knowledge ◽

Enrichment Techniques

Abstract: Biological and biomedical ontologies and terminologies are used to organize and store domain-specific knowledge, to standardize terminology usage, and to improve interoperability. The growing number of such ontologies and terminologies and their increasing adoption in clinical, research and healthcare settings call for effective and efficient quality assurance and semantic enrichment techniques for these ontologies and terminologies. In this editorial, we provide an introductory summary of nine articles included in this supplement issue on quality assurance and enrichment of biological and biomedical ontologies and terminologies. The articles cover a range of standards including SNOMED CT, National Cancer Institute Thesaurus, Unified Medical Language System, North American Association of Central Cancer Registries and OBO Foundry Ontologies.

An exploration of the properties of the CORE problem list subset and how it facilitates the implementation of SNOMED CT

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocu022 ◽

2015 ◽

Vol 22 (3) ◽

pp. 649-658 ◽

Cited By ~ 6

Author(s):

Kin Wah Fung ◽

Julia Xu

Keyword(s):

Electronic Health Records ◽

Growth Pattern ◽

Problem List ◽

Snomed Ct ◽

Health Records ◽

The Core ◽

Core Subset ◽

Electronic Health ◽

Systematized Nomenclature Of Medicine ◽

Core Problem

Abstract. Objective: Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is the emergent international health terminology standard for encoding clinical information in electronic health records. The CORE Problem List Subset was created to facilitate the terminology’s implementation. This study evaluates the CORE Subset’s coverage and examines its growth pattern as source datasets are being incorporated. Methods: Coverage of frequently used terms and the corresponding usage of the covered terms were assessed by “leave-one-out” analysis of the eight datasets constituting the current CORE Subset. The growth pattern was studied using a retrospective experiment, growing the Subset one dataset at a time and examining the relationship between the size of the starting subset and the coverage of frequently used terms in the incoming dataset. Linear regression was used to model that relationship. Results: On average, the CORE Subset covered 80.3% of the frequently used terms of the left-out dataset, and the covered terms accounted for 83.7% of term usage. There was a significant positive correlation between the CORE Subset’s size and the coverage of the frequently used terms in an incoming dataset. This implies that the CORE Subset will grow at a progressively slower pace as it gets bigger. Conclusion: The CORE Problem List Subset is a useful resource for the implementation of Systematized Nomenclature of Medicine Clinical Terms in electronic health records. It offers good coverage of frequently used terms, which account for a high proportion of term usage. If future datasets are incorporated into the CORE Subset, it is likely that its size will remain small and manageable.
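A leave-one-out coverage analysis of the kind described in the Methods can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the study's procedure: each dataset is represented as a hypothetical dictionary from term to usage count, the subset built from the remaining datasets is taken to be the plain union of their terms, and the site names and counts are invented toy values.

```python
def leave_one_out_coverage(datasets):
    """Return {dataset name: (term coverage, usage coverage)}.

    datasets: dict mapping a dataset name to a dict {term: usage count}.
    For each held-out dataset, the subset is assumed to be the union of
    the terms of all other datasets (a simplification for illustration).
    """
    results = {}
    for name, held_out in datasets.items():
        subset = set()
        for other_name, terms in datasets.items():
            if other_name != name:
                subset.update(terms)  # adds the dictionary's keys (terms)
        covered = [term for term in held_out if term in subset]
        # Share of the held-out dataset's terms found in the subset.
        term_coverage = len(covered) / len(held_out)
        # Share of the held-out dataset's usage carried by covered terms.
        usage_coverage = sum(held_out[t] for t in covered) / sum(held_out.values())
        results[name] = (term_coverage, usage_coverage)
    return results


# Invented toy data: three sites with term usage counts.
toy_datasets = {
    "site1": {"hypertension": 50, "diabetes": 30, "rare finding": 20},
    "site2": {"hypertension": 40, "diabetes": 10},
    "site3": {"hypertension": 5, "asthma": 5},
}
coverage = leave_one_out_coverage(toy_datasets)
for site, (term_cov, usage_cov) in coverage.items():
    print(f"{site}: terms {term_cov:.1%}, usage {usage_cov:.1%}")
```

In the toy data, two of the three terms in `site1` appear elsewhere, and those two terms account for 80 of its 100 recorded uses, which shows how usage coverage can exceed term coverage when the uncovered terms are rarely used.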

A Context-based Crowd Sourcing Tool for Quality Assurance of SNOMED CT

10.1109/bibm52615.2021.9669688 ◽

2021 ◽

Author(s):

Kashifuddin Qazi ◽

Ankur Agrawal

Keyword(s):

Quality Assurance ◽

Crowd Sourcing ◽

Snomed Ct

Evaluation of the Content Coverage of SNOMED CT Representing ICNP Seven-axis Version 1 Concepts

Methods of Information in Medicine ◽

10.3414/me11-01-0004 ◽

2011 ◽

Vol 50 (05) ◽

pp. 472-478 ◽

Cited By ~ 10

Author(s):

C. Lundberg ◽

A. Coenen ◽

D. Konicek ◽

H. A. Park

Keyword(s):

Coverage Rate ◽

Snomed Ct ◽

Validation Process ◽

The Third ◽

Model Version ◽

Content Coverage ◽

Systematized Nomenclature Of Medicine ◽

Mapping Result

Summary. Objectives: The purpose of this study is to evaluate the ability of SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) to represent the concepts of the ICNP version 1 – the seven-axis model. Methods: The first author mapped 1658 concepts of the ICNP version 1 to SNOMED CT using CLUE browser 5.0. The second author from SNOMED Terminology Solutions – with a team of SNOMED CT experts – and the third author from the ICN with a team of ICNP experts validated the mapping result. If there was any disagreement during the validation process, the three of us convened online meetings to reach a consensus. Results: In total, SNOMED CT covered 1331 out of 1658 (80%) ICNP seven-axis model concepts, ranging from a 61% coverage rate of the Actions Axis concepts to a 94% coverage rate of the Judgment axis concepts. Conclusions: SNOMED CT can represent most (80%) of the ICNP version 1 concepts. However, improvements in the ICNP version 1 in terms of concept naming and definition, and the addition of missing concepts to SNOMED CT, would lead to a greater harmonization of the ICNP seven-axis model version 1 concepts with SNOMED CT.

Quality Assurance of UMLS Semantic Type Assignments Using SNOMED CT Hierarchies

Methods of Information in Medicine ◽

10.3414/me14-01-0104 ◽

2016 ◽

Vol 55 (02) ◽

pp. 158-165 ◽

Cited By ~ 10

Author(s):

Y. Chen ◽

Z. He ◽

M. Halper ◽

L. Chen ◽

H. Gu

Keyword(s):

Quality Assurance ◽

Error Rate ◽

Semantic Network ◽

Snomed Ct ◽

Semantic Type ◽

Domain Experts ◽

Unified Medical Language System ◽

Type Assignment ◽

Semantic Types ◽

High Level

Summary. Background: The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS’s Semantic Network (SN) with its collection of 133 high-level semantic types serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus’s concepts via the assignment of one or more types to each concept. Due to the scope and complexity of the Metathesaurus, errors are all but inevitable in this semantic-type assignment process. Objectives: To develop a semi-automated methodology to help assure the quality of semantic-type assignments within the UMLS. Methods: The methodology uses a cross-validation strategy involving SNOMED CT’s hierarchies in combination with UMLS semantic types. Semantically uniform, disjoint concept groups are generated programmatically by partitioning the collection of all concepts in the same SNOMED CT hierarchy according to their respective semantic-type assignments in the UMLS. Domain experts are then called upon to review the concepts in any group having a small number of concepts. It is our hypothesis that a semantic-type assignment combination applicable only to a very small number of concepts in a SNOMED CT hierarchy is an indicator of potential problems. Results: The methodology was applied to the UMLS 2013AA release along with the SNOMED CT from January 2013. An overall error rate of 33% was found for concepts proposed by the quality-assurance methodology. Supporting our hypothesis, that number was four times higher than the error rate found in control samples. Conclusion: The results show that the quality-assurance methodology can aid in effective and efficient identification of UMLS semantic-type assignment errors.
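The partitioning step in the Methods can be sketched in Python as below. This is an illustrative sketch, not the published tool: the input layout, the function name `small_semantic_groups`, the `max_size` threshold of 2, and the placeholder concept and type names are all assumptions, since the abstract does not state what counts as a "small" group.

```python
from collections import defaultdict


def small_semantic_groups(assignments, max_size=2):
    """Partition concepts and return the groups small enough to review.

    assignments: dict {concept: (hierarchy, iterable of semantic types)},
                 a hypothetical flat view of SNOMED CT hierarchy membership
                 joined with UMLS semantic-type assignments.
    max_size:    assumed cut-off for a "small" group.
    """
    groups = defaultdict(list)
    for concept, (hierarchy, semantic_types) in assignments.items():
        # Concepts in the same hierarchy with exactly the same combination
        # of semantic types fall into the same disjoint group.
        groups[(hierarchy, frozenset(semantic_types))].append(concept)
    return {
        key: sorted(members)
        for key, members in groups.items()
        if len(members) <= max_size
    }


# Placeholder data: concept and type names are invented for illustration.
toy_assignments = {
    "concept 1": ("Procedure", ["Type X"]),
    "concept 2": ("Procedure", ["Type X"]),
    "concept 3": ("Procedure", ["Type X"]),
    "concept 4": ("Procedure", ["Type X", "Type Y"]),
    "concept 5": ("Finding", ["Type Z"]),
}
flagged = small_semantic_groups(toy_assignments)
for (hierarchy, types), members in flagged.items():
    print(hierarchy, sorted(types), members)
```

With the toy data, the three Procedure concepts sharing a single type form a group that is too large to be flagged, while the lone concept carrying the two-type combination is returned for expert review, mirroring the hypothesis that rare combinations signal potential problems.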

Effect of vocabulary mapping for conditions on phenotype cohorts

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy124 ◽

2018 ◽

Vol 25 (12) ◽

pp. 1618-1625 ◽

Cited By ~ 22

Author(s):

George Hripcsak ◽

Matthew E Levine ◽

Ning Shang ◽

Patrick B Ryan

Keyword(s):

Knowledge Engineering ◽

Total Error ◽

False Negative ◽

International Classification Of Diseases ◽

Error Rates ◽

Small Error ◽

Snomed Ct ◽

Order Of Magnitude ◽

Systematized Nomenclature Of Medicine ◽

Gold Standards

Abstract. Objective: To study the effect on patient cohorts of mapping condition (diagnosis) codes from source billing vocabularies to a clinical vocabulary. Materials and Methods: Nine International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9-CM) concept sets were extracted from eMERGE network phenotypes, translated to Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) concept sets, and applied to patient data that were mapped from source ICD9-CM and ICD10-CM codes to SNOMED CT codes using Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) vocabulary mappings. The original ICD9-CM concept set and a concept set extended to ICD10-CM were used to create patient cohorts that served as gold standards. Results: Four phenotype concept sets could be translated to SNOMED CT without ambiguities and performed perfectly with respect to the gold standards. The other 5 lost performance when 2 or more ICD9-CM or ICD10-CM codes mapped to the same SNOMED CT code. The patient cohorts had a total error (false positive and false negative) of up to 0.15% compared to querying ICD9-CM source data and up to 0.26% compared to querying ICD9-CM and ICD10-CM data. Knowledge engineering was required to produce that performance; simple automated methods to generate concept sets had errors up to 10% (one outlier at 250%). Discussion: The translation of data from source vocabularies to SNOMED CT resulted in very small error rates that were an order of magnitude smaller than other error sources.
Conclusion: It appears possible to map diagnoses from disparate vocabularies to a single clinical vocabulary and carry out research using a single set of definitions, thus improving efficiency and transportability of research.
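The way many-to-one mappings can degrade a translated concept set may be illustrated with a small Python sketch. Everything here is hypothetical: the codes are placeholders rather than real ICD or SNOMED CT codes, the function names are invented, and dividing false positives plus false negatives by the gold-standard cohort size is an assumed definition of total error, since the abstract does not give the denominator.

```python
def translate_concept_set(source_set, mapping):
    """Translate a source concept set through a source-to-target mapping.

    Returns the translated target set and the source codes outside the
    original set that map onto a translated target code (the ambiguity
    that arises when two or more source codes share one target code).
    """
    target_set = {mapping[code] for code in source_set if code in mapping}
    leaked = {
        code
        for code, target in mapping.items()
        if target in target_set and code not in source_set
    }
    return target_set, leaked


def total_error(gold_cohort, mapped_cohort):
    """False positives plus false negatives, relative to the gold cohort.

    The denominator is an assumption made for this sketch.
    """
    false_positives = len(mapped_cohort - gold_cohort)
    false_negatives = len(gold_cohort - mapped_cohort)
    return (false_positives + false_negatives) / len(gold_cohort)


# Placeholder codes: SRC-2 (in the set) and SRC-3 (not in it) share TGT-B.
toy_mapping = {"SRC-1": "TGT-A", "SRC-2": "TGT-B", "SRC-3": "TGT-B"}
toy_source_set = {"SRC-1", "SRC-2"}
toy_targets, toy_leaked = translate_concept_set(toy_source_set, toy_mapping)

# Placeholder patients keyed by the source code recorded for each.
toy_patients = {"p1": "SRC-1", "p2": "SRC-2", "p3": "SRC-3", "p4": "SRC-1"}
gold = {p for p, code in toy_patients.items() if code in toy_source_set}
mapped = {p for p, code in toy_patients.items() if toy_mapping[code] in toy_targets}
print(sorted(toy_leaked), total_error(gold, mapped))
```

In this toy case the patient recorded with `SRC-3` enters the mapped cohort as a false positive, because that code shares a target with a code in the original set. Resolving such collisions by hand is the kind of knowledge engineering the abstract says was needed to reach the reported error rates.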
