Quality assurance and enrichment of biological and biomedical ontologies and terminologies

2020 ◽  
Vol 20 (S10) ◽  
Author(s):  
Ankur Agrawal ◽  
Licong Cui

Abstract: Biological and biomedical ontologies and terminologies are used to organize and store various domain-specific knowledge to provide standardization of terminology usage and to improve interoperability. The growing number of such ontologies and terminologies and their increasing adoption in clinical, research and healthcare settings call for effective and efficient quality assurance and semantic enrichment techniques for these ontologies and terminologies. In this editorial, we provide an introductory summary of nine articles included in this supplement issue for quality assurance and enrichment of biological and biomedical ontologies and terminologies. The articles cover a range of standards including SNOMED CT, the National Cancer Institute Thesaurus, the Unified Medical Language System, the North American Association of Central Cancer Registries and OBO Foundry ontologies.

2016 ◽  
Vol 55 (02) ◽  
pp. 158-165 ◽  
Author(s):  
Y. Chen ◽  
Z. He ◽  
M. Halper ◽  
L. Chen ◽  
H. Gu

Summary
Background: The Unified Medical Language System (UMLS) is one of the largest biomedical terminological systems, with over 2.5 million concepts in its Metathesaurus repository. The UMLS's Semantic Network (SN), with its collection of 133 high-level semantic types, serves as an abstraction layer on top of the Metathesaurus. In particular, the SN elaborates an aspect of the Metathesaurus's concepts via the assignment of one or more types to each concept. Due to the scope and complexity of the Metathesaurus, errors are all but inevitable in this semantic-type assignment process.
Objectives: To develop a semi-automated methodology to help assure the quality of semantic-type assignments within the UMLS.
Methods: The methodology uses a cross-validation strategy involving SNOMED CT's hierarchies in combination with UMLS semantic types. Semantically uniform, disjoint concept groups are generated programmatically by partitioning the collection of all concepts in the same SNOMED CT hierarchy according to their respective semantic-type assignments in the UMLS. Domain experts are then called upon to review the concepts in any group having a small number of concepts. It is our hypothesis that a semantic-type assignment combination applicable only to a very small number of concepts in a SNOMED CT hierarchy is an indicator of potential problems.
Results: The methodology was applied to the UMLS 2013AA release along with the SNOMED CT release of January 2013. An overall error rate of 33% was found for concepts proposed by the quality-assurance methodology. Supporting our hypothesis, that number was four times higher than the error rate found in control samples.
Conclusion: The results show that the quality-assurance methodology can aid in effective and efficient identification of UMLS semantic-type assignment errors.
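The partition-and-review step described in the Methods above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the concept IDs and semantic-type names are invented, and the size threshold is an assumed parameter.

```python
from collections import defaultdict

def flag_small_groups(concepts, threshold=3):
    """Partition the concepts of one SNOMED CT hierarchy by their UMLS
    semantic-type assignment combination; groups with few members are
    flagged as candidates for expert review."""
    groups = defaultdict(list)
    for concept_id, semantic_types in concepts:
        # The full combination of assigned types is the partition key.
        groups[frozenset(semantic_types)].append(concept_id)
    return {types: members
            for types, members in groups.items()
            if len(members) <= threshold}

# Hypothetical concepts from one hierarchy: (id, assigned semantic types)
hierarchy = [
    ("C01", {"Disease or Syndrome"}),
    ("C02", {"Disease or Syndrome"}),
    ("C03", {"Disease or Syndrome"}),
    ("C04", {"Disease or Syndrome"}),
    ("C05", {"Disease or Syndrome", "Finding"}),  # rare combination
]
flagged = flag_small_groups(hierarchy)
# Only the rare type combination ("Disease or Syndrome" + "Finding")
# is flagged; its single member C05 goes to domain experts for review.
```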


Author(s):  
William Van Woensel ◽  
Chad Armstrong ◽  
Malavan Rajaratnam ◽  
Vaibhav Gupta ◽  
Syed Sibte Raza Abidi

Electronic Medical Records (EMRs) are increasingly being deployed at primary points of care and clinics for digital record keeping, increasing productivity and improving communication. In practice, however, there still exists an often incomplete picture of patient profiles, not only because of disconnected EMR systems but also due to incomplete EMR data entry – often caused by clinician time constraints and lack of data entry restrictions. To complete a patient’s partial EMR data, we plausibly infer missing causal associations between medical EMR concepts, such as diagnoses and treatments, for situations that lack sufficient raw data to enable machine learning methods. We follow a knowledge-based approach, where we leverage open medical knowledge sources such as SNOMED-CT and ICD, combined with knowledge-based reasoning with explainable inferences, to infer clinical encounter information from incomplete medical records. To bootstrap this process, we apply a semantic Extract-Transform-Load process to convert an EMR database into an enriched domain-specific Knowledge Graph.
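As a toy illustration of this style of knowledge-based completion (not the authors' actual pipeline), one can encode diagnosis-to-treatment associations as rules and propose missing, explainable links for an incomplete record. The rule table and clinical terms below are hypothetical placeholders, not real SNOMED CT or ICD codes.

```python
# Hypothetical rules linking diagnoses to expected treatments, standing
# in for associations mined from sources such as SNOMED CT and ICD.
RULES = {
    "type-2-diabetes": ["metformin"],
    "hypertension": ["ace-inhibitor"],
}

def infer_missing_links(record):
    """Return plausible (diagnosis, 'treated_by', treatment) triples for
    diagnoses in the record that lack a recorded matching treatment."""
    inferred = []
    for dx in record.get("diagnoses", []):
        for rx in RULES.get(dx, []):
            if rx not in record.get("treatments", []):
                # Each proposed link is traceable to the rule behind it,
                # keeping the inference explainable.
                inferred.append((dx, "treated_by", rx))
    return inferred

patient = {"diagnoses": ["type-2-diabetes", "hypertension"],
           "treatments": ["metformin"]}
# Only the hypertension treatment is missing, so only that link is proposed.
```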


2014 ◽  
Vol 22 (3) ◽  
pp. 640-648 ◽  
Author(s):  
Jonathan M Mortensen ◽  
Evan P Minty ◽  
Michael Januszyk ◽  
Timothy E Sweeney ◽  
Alan L Rector ◽  
...  

Abstract
Objectives: The verification of biomedical ontologies is an arduous process that typically involves peer review by subject-matter experts. This work evaluated the ability of crowdsourcing methods to detect errors in SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms) and to address the challenges of scalable ontology verification.
Methods: We developed a methodology to crowdsource ontology verification that uses micro-tasking combined with a Bayesian classifier. We then conducted a prospective study in which both the crowd and domain experts verified a subset of SNOMED CT comprising 200 taxonomic relationships.
Results: The crowd identified errors as well as any single expert at about one-quarter of the cost. The inter-rater agreement (κ) between the crowd and the experts was 0.58; the inter-rater agreement between experts themselves was 0.59, suggesting that the crowd is nearly indistinguishable from any one expert. Furthermore, the crowd identified 39 previously undiscovered, critical errors in SNOMED CT (eg, 'septic shock is a soft-tissue infection').
Discussion: The results show that the crowd can indeed identify errors in SNOMED CT that experts also find, and the results suggest that our method will likely perform well on similar ontologies. The crowd may be particularly useful in situations where an expert is unavailable, budget is limited, or an ontology is too large for manual error checking. Finally, our results suggest that the online anonymous crowd could successfully complete other domain-specific tasks.
Conclusions: We have demonstrated that the crowd can address the challenges of scalable ontology verification, completing not only intuitive, common-sense tasks, but also expert-level, knowledge-intensive tasks.
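The agreement statistic reported above, Cohen's κ, can be computed directly from two raters' verdicts. A minimal sketch for binary verdicts follows; the example data is invented, not the study's.

```python
def cohens_kappa(rater_a, rater_b):
    """Inter-rater agreement (Cohen's kappa) between two raters, e.g.
    crowd verdicts versus expert verdicts on the same relationships."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_chance = 0.0
    for label in set(rater_a) | set(rater_b):
        p_chance += (rater_a.count(label) / n) * (rater_b.count(label) / n)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical verdicts (1 = relationship correct, 0 = error):
crowd  = [1, 1, 0, 0]
expert = [1, 1, 0, 0]
# Perfect agreement yields kappa = 1.0; chance-level agreement yields 0.0.
```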


2014 ◽  
Vol 10 (3) ◽  
pp. 249-261 ◽  
Author(s):  
Tessa Sanderson ◽  
Jo Angouri

The active involvement of patients in decision-making and the focus on patient expertise in managing chronic illness constitute a priority in many healthcare systems, including the NHS in the UK. With easier access to health information, patients are almost expected to be (or present themselves as) an 'expert patient' (Ziebland 2004). This paper draws on the meta-analysis of interview data collected for identifying treatment outcomes important to patients with rheumatoid arthritis (RA). Taking a discourse approach to identity, the discussion focuses on the resources used in the negotiation and co-construction of expert identities, including domain-specific knowledge, access to institutional resources, and ability to self-manage. The analysis shows that expertise is both projected (institutionally sanctioned) and claimed by the patient (self-defined). We close the paper by highlighting the limitations of our pilot study and suggesting avenues for further research.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pilar López-Úbeda ◽  
Alexandra Pomares-Quimbaya ◽  
Manuel Carlos Díaz-Galiano ◽  
Stefan Schulz

Abstract
Background: Controlled vocabularies are fundamental resources for information extraction from clinical texts using natural language processing (NLP). Standard language resources available in the healthcare domain such as the UMLS Metathesaurus or SNOMED CT are widely used for this purpose, but with limitations such as lexical ambiguity of clinical terms. However, most such terms are unambiguous within text limited to a given clinical specialty. This is one rationale, among others, for classifying clinical texts by the clinical specialty to which they belong.
Results: This paper addresses this limitation by proposing and applying a method that automatically extracts Spanish medical terms, classified and weighted per sub-domain, using Spanish MEDLINE titles and abstracts as input. The hypothesis is that biomedical NLP tasks benefit from collections of domain terms that are specific to clinical sub-domains. We use PubMed queries that generate sub-domain-specific corpora from Spanish titles and abstracts, from which token n-grams are collected and metrics of relevance, discriminatory power, and broadness per sub-domain are computed. The generated term set, called the Spanish core vocabulary about clinical specialties (SCOVACLIS), was made available to the scientific community and used in a text classification problem, obtaining improvements of 6 percentage points in the F-measure compared to the baseline using a Multilayer Perceptron, thus demonstrating the hypothesis that a specialized term set improves NLP tasks.
Conclusion: The creation and validation of SCOVACLIS support the hypothesis that specific term sets reduce the level of ambiguity when compared to a specialty-independent, broad-scope vocabulary.
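The per-sub-domain term weighting can be approximated with a tf-idf-style score: frequency within a sub-domain corpus, scaled down for terms that appear across many sub-domains. This is a rough stand-in for the paper's relevance and discriminatory-power metrics, and the Spanish terms below are invented examples.

```python
import math
from collections import Counter

def subdomain_term_weights(corpora):
    """Weight each term per sub-domain: within-domain frequency times an
    idf-like factor penalizing terms shared across sub-domains."""
    doc_freq = Counter()          # in how many sub-domains each term occurs
    counts = {}
    for domain, terms in corpora.items():
        counts[domain] = Counter(terms)
        doc_freq.update(set(terms))
    n_domains = len(corpora)
    weights = {}
    for domain, counter in counts.items():
        total = sum(counter.values())
        weights[domain] = {
            term: (freq / total) * math.log(n_domains / doc_freq[term])
            for term, freq in counter.items()
        }
    return weights

# Hypothetical token streams per clinical sub-domain:
corpora = {
    "cardiologia": ["infarto", "infarto", "paciente"],
    "dermatologia": ["melanoma", "paciente"],
}
w = subdomain_term_weights(corpora)
# "infarto" is specific to cardiology and gets a positive weight, while
# "paciente" occurs in every sub-domain and its weight drops to zero.
```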


Semantic Web ◽  
2020 ◽  
pp. 1-45
Author(s):  
Valentina Anita Carriero ◽  
Aldo Gangemi ◽  
Maria Letizia Mancinelli ◽  
Andrea Giovanni Nuzzolese ◽  
Valentina Presutti ◽  
...  

Ontology Design Patterns (ODPs) have become an established and recognised practice for guaranteeing good-quality ontology engineering. There are several ODP repositories where ODPs are shared, as well as ontology design methodologies recommending their reuse. Performing rigorous testing is recommended as well, for supporting ontology maintenance and validating the resulting resource against its motivating requirements. Nevertheless, it is less than straightforward to find guidelines on how to apply such methodologies for developing domain-specific knowledge graphs. ArCo is the knowledge graph of Italian Cultural Heritage (CH) and has been developed by using eXtreme Design (XD), an ODP- and test-driven methodology. During its development, XD has been adapted to the needs of the CH domain: for example, requirements were gathered from an open, diverse community of consumers, a new ODP was defined, and many existing ODPs were specialised to address specific CH requirements. This paper presents ArCo and describes how to apply XD to the development and validation of a CH knowledge graph, also detailing the (intellectual) process implemented for matching the encountered modelling problems to ODPs. Relevant contributions also include a novel web tool for supporting unit-testing of knowledge graphs, a rigorous evaluation of ArCo, and a discussion of methodological lessons learned during ArCo's development.


1998 ◽  
Vol 10 (1) ◽  
pp. 1-34 ◽  
Author(s):  
Alfonso Caramazza ◽  
Jennifer R. Shelton

We claim that the animate and inanimate conceptual categories represent evolutionarily adapted domain-specific knowledge systems that are subserved by distinct neural mechanisms, thereby allowing for their selective impairment in conditions of brain damage. On this view, (some of) the category-specific deficits that have recently been reported in the cognitive neuropsychological literature—for example, the selective damage or sparing of knowledge about animals—are truly categorical effects. Here, we articulate and defend this thesis against the dominant, reductionist theory of category-specific deficits, which holds that the categorical nature of the deficits is the result of selective damage to noncategorically organized visual or functional semantic subsystems. On the latter view, the sensory/functional dimension provides the fundamental organizing principle of the semantic system. Since, according to the latter theory, sensory and functional properties are differentially important in determining the meaning of the members of different semantic categories, selective damage to the visual or the functional semantic subsystem will result in a category-like deficit. A review of the literature and the results of a new case of category-specific deficit will show that the domain-specific knowledge framework provides a better account of category-specific deficits than the sensory/functional dichotomy theory.


Author(s):  
Shaw C. Feng ◽  
William Z. Bernstein ◽  
Thomas Hedberg ◽  
Allison Barnard Feeney

The need for capturing knowledge in digital form in design, process planning, production, and inspection has increasingly become an issue in manufacturing industries as the variety and complexity of product lifecycle applications increase. Both knowledge and data need to be well managed for quality assurance, lifecycle impact assessment, and design improvement. Some technical barriers exist today that inhibit industry from fully utilizing design, planning, processing, and inspection knowledge. The primary barrier is the lack of a well-accepted mechanism that enables users to integrate data and knowledge. This paper prescribes knowledge management to address this lack of mechanisms for integrating, sharing, and updating domain-specific knowledge in smart manufacturing (SM). Aspects of the knowledge constructs include conceptual design, detailed design, process planning, material property, production, and inspection. The main contribution of this paper is a methodology describing what knowledge manufacturing organizations access, update, and archive in the context of SM. The case study in this paper provides some example knowledge objects to enable SM.

