Quality assurance and enrichment of biological and biomedical ontologies and terminologies

2020 ◽  
Vol 20 (S10) ◽  
Author(s):  
Ankur Agrawal ◽  
Licong Cui

Abstract: Biological and biomedical ontologies and terminologies are used to organize and store various domain-specific knowledge to provide standardization of terminology usage and to improve interoperability. The growing number of such ontologies and terminologies and their increasing adoption in clinical, research and healthcare settings call for effective and efficient quality assurance and semantic enrichment techniques for these ontologies and terminologies. In this editorial, we provide an introductory summary of nine articles included in this supplement issue for quality assurance and enrichment of biological and biomedical ontologies and terminologies. The articles cover a range of standards including SNOMED CT, the National Cancer Institute Thesaurus, the Unified Medical Language System, the North American Association of Central Cancer Registries and OBO Foundry ontologies.

2016 ◽  
Vol 55 (02) ◽  
pp. 158-165 ◽  
Author(s):  
Y. Chen ◽  
Z. He ◽  
M. Halper ◽  
L. Chen ◽  
H. Gu

Summary
Background: The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS's Semantic Network (SN), with its collection of 133 high-level semantic types, serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus's concepts via the assignment of one or more types to each concept. Due to the scope and complexity of the Metathesaurus, errors are all but inevitable in this semantic-type assignment process.
Objectives: To develop a semi-automated methodology to help assure the quality of semantic-type assignments within the UMLS.
Methods: The methodology uses a cross-validation strategy involving SNOMED CT's hierarchies in combination with UMLS semantic types. Semantically uniform, disjoint concept groups are generated programmatically by partitioning the collection of all concepts in the same SNOMED CT hierarchy according to their respective semantic-type assignments in the UMLS. Domain experts are then called upon to review the concepts in any group having a small number of concepts. It is our hypothesis that a semantic-type assignment combination applicable only to a very small number of concepts in a SNOMED CT hierarchy is an indicator of potential problems.
Results: The methodology was applied to the UMLS 2013AA release along with the SNOMED CT release of January 2013. An overall error rate of 33% was found for concepts proposed by the quality-assurance methodology. Supporting our hypothesis, that number was four times higher than the error rate found in control samples.
Conclusion: The results show that the quality-assurance methodology can aid in effective and efficient identification of UMLS semantic-type assignment errors.
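The partition-and-review step described in the Methods above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the concept IDs and semantic-type names are invented, and the size threshold is an assumed parameter.

```python
from collections import defaultdict

def flag_small_groups(concepts, threshold=3):
    """Partition the concepts of one SNOMED CT hierarchy by their UMLS
    semantic-type assignment combination; groups with few members are
    flagged as candidates for expert review."""
    groups = defaultdict(list)
    for concept_id, semantic_types in concepts:
        # The full combination of assigned types is the partition key.
        groups[frozenset(semantic_types)].append(concept_id)
    return {types: members
            for types, members in groups.items()
            if len(members) <= threshold}

# Hypothetical concepts from one hierarchy: (id, assigned semantic types)
hierarchy = [
    ("C01", {"Disease or Syndrome"}),
    ("C02", {"Disease or Syndrome"}),
    ("C03", {"Disease or Syndrome"}),
    ("C04", {"Disease or Syndrome"}),
    ("C05", {"Disease or Syndrome", "Finding"}),  # rare combination
]
flagged = flag_small_groups(hierarchy)
# Only the rare type combination ("Disease or Syndrome" + "Finding")
# is flagged; its single member C05 goes to domain experts for review.
```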


Author(s):  
William Van Woensel ◽  
Chad Armstrong ◽  
Malavan Rajaratnam ◽  
Vaibhav Gupta ◽  
Syed Sibte Raza Abidi

Electronic Medical Records (EMRs) are increasingly being deployed at primary points of care and clinics for digital record keeping, increasing productivity and improving communication. In practice, however, there still exists an often incomplete picture of patient profiles, not only because of disconnected EMR systems but also due to incomplete EMR data entry – often caused by clinician time constraints and lack of data entry restrictions. To complete a patient’s partial EMR data, we plausibly infer missing causal associations between medical EMR concepts, such as diagnoses and treatments, for situations that lack sufficient raw data to enable machine learning methods. We follow a knowledge-based approach, where we leverage open medical knowledge sources such as SNOMED-CT and ICD, combined with knowledge-based reasoning with explainable inferences, to infer clinical encounter information from incomplete medical records. To bootstrap this process, we apply a semantic Extract-Transform-Load process to convert an EMR database into an enriched domain-specific Knowledge Graph.
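As a toy illustration of this style of knowledge-based completion (not the authors' actual pipeline), one can encode diagnosis-to-treatment associations as rules and propose missing, explainable links for an incomplete record. The rule table and clinical terms below are hypothetical placeholders, not real SNOMED CT or ICD codes.

```python
# Hypothetical rules linking diagnoses to expected treatments, standing
# in for associations mined from sources such as SNOMED CT and ICD.
RULES = {
    "type-2-diabetes": ["metformin"],
    "hypertension": ["ace-inhibitor"],
}

def infer_missing_links(record):
    """Return plausible (diagnosis, 'treated_by', treatment) triples for
    diagnoses in the record that lack a recorded matching treatment."""
    inferred = []
    for dx in record.get("diagnoses", []):
        for rx in RULES.get(dx, []):
            if rx not in record.get("treatments", []):
                # Each proposed link is traceable to the rule behind it,
                # keeping the inference explainable.
                inferred.append((dx, "treated_by", rx))
    return inferred

patient = {"diagnoses": ["type-2-diabetes", "hypertension"],
           "treatments": ["metformin"]}
# Only the hypertension treatment is missing, so only that link is proposed.
```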


2014 ◽  
Vol 22 (3) ◽  
pp. 640-648 ◽  
Author(s):  
Jonathan M Mortensen ◽  
Evan P Minty ◽  
Michael Januszyk ◽  
Timothy E Sweeney ◽  
Alan L Rector ◽  
...  

Abstract
Objectives: The verification of biomedical ontologies is an arduous process that typically involves peer review by subject-matter experts. This work evaluated the ability of crowdsourcing methods to detect errors in SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) and to address the challenges of scalable ontology verification.
Methods: We developed a methodology to crowdsource ontology verification that uses micro-tasking combined with a Bayesian classifier. We then conducted a prospective study in which both the crowd and domain experts verified a subset of SNOMED CT comprising 200 taxonomic relationships.
Results: The crowd identified errors as well as any single expert at about one-quarter of the cost. The inter-rater agreement (κ) between the crowd and the experts was 0.58; the inter-rater agreement between experts themselves was 0.59, suggesting that the crowd is nearly indistinguishable from any one expert. Furthermore, the crowd identified 39 previously undiscovered, critical errors in SNOMED CT (eg, 'septic shock is a soft-tissue infection').
Discussion: The results show that the crowd can indeed identify errors in SNOMED CT that experts also find, and the results suggest that our method will likely perform well on similar ontologies. The crowd may be particularly useful in situations where an expert is unavailable, budget is limited, or an ontology is too large for manual error checking. Finally, our results suggest that the online anonymous crowd could successfully complete other domain-specific tasks.
Conclusions: We have demonstrated that the crowd can address the challenges of scalable ontology verification, completing not only intuitive, common-sense tasks, but also expert-level, knowledge-intensive tasks.
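The agreement statistic reported above, Cohen's κ, can be computed directly from two raters' verdicts. A minimal sketch for binary verdicts follows; the example data is invented, not the study's.

```python
def cohens_kappa(rater_a, rater_b):
    """Inter-rater agreement (Cohen's kappa) between two raters, e.g.
    crowd verdicts versus expert verdicts on the same relationships."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_chance = 0.0
    for label in set(rater_a) | set(rater_b):
        p_chance += (rater_a.count(label) / n) * (rater_b.count(label) / n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical verdicts (1 = relationship correct, 0 = error):
crowd  = [1, 1, 0, 0]
expert = [1, 1, 0, 0]
# Perfect agreement yields kappa = 1.0; chance-level agreement yields 0.0.
```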


2014 ◽  
Vol 10 (3) ◽  
pp. 249-261 ◽  
Author(s):  
Tessa Sanderson ◽  
Jo Angouri

The active involvement of patients in decision-making and the focus on patient expertise in managing chronic illness constitute a priority in many healthcare systems, including the NHS in the UK. With easier access to health information, patients are almost expected to be (or present themselves as) an 'expert patient' (Ziebland 2004). This paper draws on the meta-analysis of interview data collected for identifying treatment outcomes important to patients with rheumatoid arthritis (RA). Taking a discourse approach to identity, the discussion focuses on the resources used in the negotiation and co-construction of expert identities, including domain-specific knowledge, access to institutional resources, and ability to self-manage. The analysis shows that expertise is both projected (institutionally sanctioned) and claimed by the patient (self-defined). We close the paper by highlighting the limitations of our pilot study and suggesting avenues for further research.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pilar López-Úbeda ◽  
Alexandra Pomares-Quimbaya ◽  
Manuel Carlos Díaz-Galiano ◽  
Stefan Schulz

Abstract
Background: Controlled vocabularies are fundamental resources for information extraction from clinical texts using natural language processing (NLP). Standard language resources available in the healthcare domain such as the UMLS Metathesaurus or SNOMED CT are widely used for this purpose, but with limitations such as lexical ambiguity of clinical terms. However, most such terms are unambiguous within text limited to a given clinical specialty. This is one rationale, among others, for classifying clinical texts by the clinical specialty to which they belong.
Results: This paper addresses this limitation by proposing and applying a method that automatically extracts Spanish medical terms, classified and weighted per sub-domain, using Spanish MEDLINE titles and abstracts as input. The hypothesis is that biomedical NLP tasks benefit from collections of domain terms that are specific to clinical sub-domains. We use PubMed queries that generate sub-domain-specific corpora from Spanish titles and abstracts, from which token n-grams are collected and metrics of relevance, discriminatory power, and broadness per sub-domain are computed. The generated term set, called the Spanish core vocabulary about clinical specialties (SCOVACLIS), was made available to the scientific community and used in a text classification problem, obtaining improvements of 6 percentage points in the F-measure compared to the baseline using a Multilayer Perceptron, thus demonstrating the hypothesis that a specialized term set improves NLP tasks.
Conclusion: The creation and validation of SCOVACLIS support the hypothesis that specific term sets reduce the level of ambiguity when compared to a specialty-independent, broad-scope vocabulary.
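The per-sub-domain term weighting can be approximated with a tf-idf-style score: frequency within a sub-domain corpus, scaled down for terms that appear across many sub-domains. This is a rough stand-in for the paper's relevance and discriminatory-power metrics, and the Spanish terms below are invented examples.

```python
import math
from collections import Counter

def subdomain_term_weights(corpora):
    """Weight each term per sub-domain: within-domain frequency times an
    idf-like factor penalizing terms shared across sub-domains."""
    doc_freq = Counter()          # in how many sub-domains each term occurs
    counts = {}
    for domain, terms in corpora.items():
        counts[domain] = Counter(terms)
        doc_freq.update(set(terms))
    n_domains = len(corpora)
    weights = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        weights[domain] = {
            term: (freq / total) * math.log(n_domains / doc_freq[term])
            for term, freq in counter.items()
        }
    return weights

# Hypothetical token streams per clinical sub-domain:
corpora = {
    "cardiologia": ["infarto", "infarto", "paciente"],
    "dermatologia": ["melanoma", "paciente"],
}
w = subdomain_term_weights(corpora)
# "infarto" is specific to cardiology and gets a positive weight, while
# "paciente" occurs in every sub-domain and its weight drops to zero.
```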


Semantic Web ◽  
2020 ◽  
pp. 1-45
Author(s):  
Valentina Anita Carriero ◽  
Aldo Gangemi ◽  
Maria Letizia Mancinelli ◽  
Andrea Giovanni Nuzzolese ◽  
Valentina Presutti ◽  
...  

Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good-quality ontology engineering. There are several ODP repositories where ODPs are shared, as well as ontology design methodologies recommending their reuse. Performing rigorous testing is recommended as well, for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage (CH) and has been developed by using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the needs of the CH domain: for example, requirements were gathered from an open, diverse community of consumers, a new ODP was defined, and many existing ODPs were specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo's development.


1998 ◽  
Vol 10 (1) ◽  
pp. 1-34 ◽  
Author(s):  
Alfonso Caramazza ◽  
Jennifer R. Shelton

We claim that the animate and inanimate conceptual categories represent evolutionarily adapted domain-specific knowledge systems that are subserved by distinct neural mechanisms, thereby allowing for their selective impairment in conditions of brain damage. On this view, (some of) the category-specific deficits that have recently been reported in the cognitive neuropsychological literature—for example, the selective damage or sparing of knowledge about animals—are truly categorical effects. Here, we articulate and defend this thesis against the dominant, reductionist theory of category-specific deficits, which holds that the categorical nature of the deficits is the result of selective damage to noncategorically organized visual or functional semantic subsystems. On the latter view, the sensory/functional dimension provides the fundamental organizing principle of the semantic system. Since, according to the latter theory, sensory and functional properties are differentially important in determining the meaning of the members of different semantic categories, selective damage to the visual or the functional semantic subsystem will result in a category-like deficit. A review of the literature and the results of a new case of category-specific deficit will show that the domain-specific knowledge framework provides a better account of category-specific deficits than the sensory/functional dichotomy theory.


Author(s):  
Shaw C. Feng ◽  
William Z. Bernstein ◽  
Thomas Hedberg ◽  
Allison Barnard Feeney

The need for capturing knowledge in digital form in design, process planning, production, and inspection has increasingly become an issue in manufacturing industries as the variety and complexity of product lifecycle applications increase. Both knowledge and data need to be well managed for quality assurance, lifecycle impact assessment, and design improvement. Some technical barriers exist today that inhibit industry from fully utilizing design, planning, processing, and inspection knowledge. The primary barrier is the lack of a well-accepted mechanism that enables users to integrate data and knowledge. This paper prescribes knowledge management to address this lack of mechanisms for integrating, sharing, and updating domain-specific knowledge in smart manufacturing (SM). Aspects of the knowledge constructs include conceptual design, detailed design, process planning, material property, production, and inspection. The main contribution of this paper is a methodology describing what knowledge manufacturing organizations access, update, and archive in the context of SM. The case study in this paper provides some example knowledge objects to enable SM.

