scholarly journals A tribal abstraction network for SNOMED CT target hierarchies without attribute relationships

2014 ◽  
Vol 22 (3) ◽  
pp. 628-639 ◽  
Author(s):  
Christopher Ochs ◽  
James Geller ◽  
Yehoshua Perl ◽  
Yan Chen ◽  
Ankur Agrawal ◽  
...  

Abstract Objective Large and complex terminologies, such as Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT), are prone to errors and inconsistencies. Abstraction networks are compact summarizations of the content and structure of a terminology. Abstraction networks have been shown to support terminology quality assurance. In this paper, we introduce an abstraction network derivation methodology which can be applied to SNOMED CT target hierarchies whose classes are defined using only hierarchical relationships (ie, without attribute relationships) and similar description-logic-based terminologies. Methods We introduce the tribal abstraction network (TAN), based on the notion of a tribe—a subhierarchy rooted at a child of a hierarchy root, assuming only the existence of concepts with multiple parents. The TAN summarizes a hierarchy that does not have attribute relationships using sets of concepts, called tribal units that belong to exactly the same multiple tribes. Tribal units are further divided into refined tribal units which contain closely related concepts. A quality assurance methodology that utilizes TAN summarizations is introduced. Results A TAN is derived for the Observable entity hierarchy of SNOMED CT, summarizing its content. A TAN-based quality assurance review of the concepts of the hierarchy is performed, and erroneous concepts are shown to appear more frequently in large refined tribal units than in small refined tribal units. Furthermore, more erroneous concepts appear in large refined tribal units of more tribes than of fewer tribes. Conclusions In this paper we introduce the TAN for summarizing SNOMED CT target hierarchies. A TAN was derived for the Observable entity hierarchy of SNOMED CT. A quality assurance methodology utilizing the TAN was introduced and demonstrated.

2012 ◽  
Vol 51 (06) ◽  
pp. 529-538 ◽  
Author(s):  
K. Rosenbeck Gøeg ◽  
A. Randorff Højen

SummaryClinical practice as well as research and quality-assurance benefit from unambiguous clinical information resulting from the use of a common terminology like the Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT). A common terminology is a necessity to enable consistent reuse of data, and supporting semantic interoperability. Managing use of terminology for large cross specialty Electronic Health Record systems (EHR systems) or just beyond the level of single EHR systems requires that mappings are kept consistent. The objective of this study is to provide a clear methodology for SNOMED CT mapping to enhance applicability of SNOMED CT despite incompleteness and redundancy. Such mapping guidelines are presented based on an in depth analysis of 14 different EHR templates retrieved from five Danish and Swedish EHR systems. Each mapping is assessed against defined quality criteria and mapping guidelines are specified. Future work will include guideline validation.


2020 ◽  
Vol 20 (S10) ◽  
Author(s):  
Ankur Agrawal ◽  
Licong Cui

AbstractBiological and biomedical ontologies and terminologies are used to organize and store various domain-specific knowledge to provide standardization of terminology usage and to improve interoperability. The growing number of such ontologies and terminologies and their increasing adoption in clinical, research and healthcare settings call for effective and efficient quality assurance and semantic enrichment techniques of these ontologies and terminologies. In this editorial, we provide an introductory summary of nine articles included in this supplement issue for quality assurance and enrichment of biological and biomedical ontologies and terminologies. The articles cover a range of standards including SNOMED CT, National Cancer Institute Thesaurus, Unified Medical Language System, North American Association of Central Cancer Registries and OBO Foundry Ontologies.


2015 ◽  
Vol 22 (3) ◽  
pp. 649-658 ◽  
Author(s):  
Kin Wah Fung ◽  
Julia Xu

Abstract Objective Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is the emergent international health terminology standard for encoding clinical information in electronic health records. The CORE Problem List Subset was created to facilitate the terminology’s implementation. This study evaluates the CORE Subset’s coverage and examines its growth pattern as source datasets are being incorporated. Methods Coverage of frequently used terms and the corresponding usage of the covered terms were assessed by “leave-one-out” analysis of the eight datasets constituting the current CORE Subset. The growth pattern was studied using a retrospective experiment, growing the Subset one dataset at a time and examining the relationship between the size of the starting subset and the coverage of frequently used terms in the incoming dataset. Linear regression was used to model that relationship. Results On average, the CORE Subset covered 80.3% of the frequently used terms of the left-out dataset, and the covered terms accounted for 83.7% of term usage. There was a significant positive correlation between the CORE Subset’s size and the coverage of the frequently used terms in an incoming dataset. This implies that the CORE Subset will grow at a progressively slower pace as it gets bigger. Conclusion The CORE Problem List Subset is a useful resource for the implementation of Systematized Nomenclature of Medicine Clinical Terms in electronic health records. It offers good coverage of frequently used terms, which account for a high proportion of term usage. If future datasets are incorporated into the CORE Subset, it is likely that its size will remain small and manageable.


2011 ◽  
Vol 50 (05) ◽  
pp. 472-478 ◽  
Author(s):  
C. Lundberg ◽  
A. Coenen ◽  
D. Konicek ◽  
H. A. Park

SummaryObjectives: The purpose of this study is to evaluate the ability of SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) to represent the concepts of the ICNP version 1 – the seven-axis model.Methods: The first author mapped 1658 concepts of the ICNP version 1 to SNOMED CT using CLUE browser 5.0. The second author from SNOMED Terminology Solutions – with a team of SNOMED CT experts – and the third author from the ICN with a team of ICNP experts validated the mapping result. If there was any disagreement during the validation process, the three of us convened online meetings to reach a consensus.Results: In total, SNOMED CT covered 1331 out of 1658 (80%) ICNP seven-axis model concepts ranging from a 61% coverage rate of the Actions Axis concepts to a 94% coverage rate of the Judgment axis concepts.Conclusions: SNOMED CT can represent most (80%) of the ICNP version 1 concepts. However, improvements in the ICNP version 1 in terms of concept naming and definition, and the addition of missing concepts to SNOMED CT, would lead to a greater harmonization of the ICNP seven-axis model version 1 concepts with SNOMED CT.


2016 ◽  
Vol 55 (02) ◽  
pp. 158-165 ◽  
Author(s):  
Y. Chen ◽  
Z. He ◽  
M. Halper ◽  
L. Chen ◽  
H. Gu

SummaryBackground: The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS’s Semantic Network (SN) with its collection of 133 high-level semantic types serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus’s concepts via the assignment of one or more types to each concept. Due to the scope and complexity of the Metathesaurus, errors are all but inevitable in this semantic-type assignment process.Objectives: To develop a semi-automated methodology to help assure the quality of semantic-type assignments within the UMLS.Methods: The methodology uses a cross- validation strategy involving SNOMED CT’s hierarchies in combination with UMLS se -mantic types. Semantically uniform, disjoint concept groups are generated programmatically by partitioning the collection of all concepts in the same SNOMED CT hierarchy according to their respective semantic-type assignments in the UMLS. Domain experts are then called upon to review the concepts in any group having a small number of concepts. It is our hypothesis that a semantic-type assignment combination applicable only to a very small number of concepts in a SNOMED CT hierarchy is an indicator of potential problems.Results: The methodology was applied to the UMLS 2013AA release along with the SNOMED CT from January 2013. An overall error rate of 33% was found for concepts proposed by the quality-assurance methodology. Supporting our hypothesis, that number was four times higher than the error rate found in control samples.Conclusion: The results show that the quality-assurance methodology can aid in effective and efficient identification of UMLS semantic-type assignment errors.


2018 ◽  
Vol 25 (12) ◽  
pp. 1618-1625 ◽  
Author(s):  
George Hripcsak ◽  
Matthew E Levine ◽  
Ning Shang ◽  
Patrick B Ryan

Abstract Objective To study the effect on patient cohorts of mapping condition (diagnosis) codes from source billing vocabularies to a clinical vocabulary. Materials and Methods Nine International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9-CM) concept sets were extracted from eMERGE network phenotypes, translated to Systematized Nomenclature of Medicine - Clinical Terms concept sets, and applied to patient data that were mapped from source ICD9-CM and ICD10-CM codes to Systematized Nomenclature of Medicine - Clinical Terms codes using Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) vocabulary mappings. The original ICD9-CM concept set and a concept set extended to ICD10-CM were used to create patient cohorts that served as gold standards. Results Four phenotype concept sets were able to be translated to Systematized Nomenclature of Medicine - Clinical Terms without ambiguities and were able to perform perfectly with respect to the gold standards. The other 5 lost performance when 2 or more ICD9-CM or ICD10-CM codes mapped to the same Systematized Nomenclature of Medicine - Clinical Terms code. The patient cohorts had a total error (false positive and false negative) of up to 0.15% compared to querying ICD9-CM source data and up to 0.26% compared to querying ICD9-CM and ICD10-CM data. Knowledge engineering was required to produce that performance; simple automated methods to generate concept sets had errors up to 10% (one outlier at 250%). Discussion The translation of data from source vocabularies to Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) resulted in very small error rates that were an order of magnitude smaller than other error sources. Conclusion It appears possible to map diagnoses from disparate vocabularies to a single clinical vocabulary and carry out research using a single set of definitions, thus improving efficiency and transportability of research.


2007 ◽  
Vol 39 (3) ◽  
pp. 183-195 ◽  
Author(s):  
Olivier Bodenreider ◽  
Barry Smith ◽  
Anand Kumar ◽  
Anita Burgun
Keyword(s):  

Sign in / Sign up

Export Citation Format

Share Document