The Ontologist

Author(s):  
Kurt Cagle

This paper covers the design details of the Ontologist, a MarkLogic-based project that uses RDF triples, XQuery search capabilities, and RESTful services to provide controlled vocabularies, taxonomy management, and semantic wikis.

2010 ◽  
Author(s):  
G. Beuchelt ◽  
T. Kehoe ◽  
P. J. Miller ◽  
R. Modeen ◽  
C. Partridge ◽  
...  

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pilar López-Úbeda ◽  
Alexandra Pomares-Quimbaya ◽  
Manuel Carlos Díaz-Galiano ◽  
Stefan Schulz

Background – Controlled vocabularies are fundamental resources for information extraction from clinical texts using natural language processing (NLP). Standard language resources available in the healthcare domain such as the UMLS metathesaurus or SNOMED CT are widely used for this purpose, but with limitations such as lexical ambiguity of clinical terms. However, most of them are unambiguous within text limited to a given clinical specialty. This is one rationale besides others to classify clinical texts by the clinical specialty to which they belong. Results – This paper addresses this limitation by proposing and applying a method that automatically extracts Spanish medical terms classified and weighted per sub-domain, using Spanish MEDLINE titles and abstracts as input. The hypothesis is that biomedical NLP tasks benefit from collections of domain terms that are specific to clinical subdomains. We use PubMed queries that generate sub-domain specific corpora from Spanish titles and abstracts, from which token n-grams are collected and metrics of relevance, discriminatory power, and broadness per sub-domain are computed. The generated term set, called Spanish core vocabulary about clinical specialties (SCOVACLIS), was made available to the scientific community and used in a text classification problem, obtaining improvements of 6 percentage points in the F-measure compared to the baseline using Multilayer Perceptron, thus demonstrating the hypothesis that a specialized term set improves NLP tasks. Conclusion – The creation and validation of SCOVACLIS support the hypothesis that specific term sets reduce the level of ambiguity when compared to a specialty-independent and broad-scope vocabulary.


Author(s):  
John Zobolas ◽  
Vasundra Touré ◽  
Martin Kuiper ◽  
Steven Vercruysse

Summary – We present a set of software packages that provide uniform access to diverse biological vocabulary resources that are instrumental for current biocuration efforts and tools. The Unified Biological Dictionaries (UniBioDicts or UBDs) provide a single query-interface for accessing the online API services of leading biological data providers. Given a search string, UBDs return a list of matching term, identifier and metadata units from databases (e.g. UniProt), controlled vocabularies (e.g. PSI-MI) and ontologies (e.g. GO, via BioPortal). This functionality can be connected to input fields (user-interface components) that offer autocomplete lookup for these dictionaries. UBDs create a unified gateway for accessing life science concepts, helping curators find annotation terms across resources (based on descriptive metadata and unambiguous identifiers), and helping data users search and retrieve the right query terms. Availability and implementation – The UBDs are available through npm and the code is available in the GitHub organisation UniBioDicts (https://github.com/UniBioDicts) under the Affero GPL license. Supplementary information – Supplementary data are available at Bioinformatics online.


1984 ◽  
Vol 8 (2) ◽  
pp. 63-66 ◽  
Author(s):  
C.P.R. Dubois

The controlled vocabulary versus the free text approach to information retrieval is reviewed from the mid 1960s to the early 1980s. The dominance of the free text approach following the Cranfield tests is increasingly coming into question as a result of tests on existing online data bases and case studies. This is supported by two case studies on the Coffeeline data base. The differences and values of the two approaches are explored considering thesauri as semantic maps. It is suggested that the most appropriate evaluatory technique for indexing languages is to study the actual use made of various techniques in a wide variety of search environments. Such research is becoming more urgent. Economic and other reasons for the scarcity of online thesauri are reviewed and suggestions are made for methods to secure revenue from thesaurus display facilities. Finally, the promising outlook for renewed development of controlled vocabularies with more effective online display techniques is mentioned, although such development must be based on firm research of user behaviour and needs.


2014 ◽  
Vol 70 (1) ◽  
pp. 173-193 ◽  
Author(s):  
Charlie Mayor ◽  
Lyn Robinson

Purpose – The purpose of this article is to evaluate the development and use of the gene ontology (GO), a scientific vocabulary widely used in molecular biology databases, with particular reference to the relation between the theoretical basis of the GO, and the pragmatics of its application. Design/methodology/approach – The study uses a combination of bibliometric analysis, content analysis and discourse analysis. These analyses focus on details of the ways in which the terms of the ontology are amended and deleted, and in which they are applied by users. Findings – Although the GO is explicitly based on an objective realist epistemology, a considerable extent of subjectivity and social factors are evident in its development and use. It is concluded that bio-ontologies could beneficially be extended to be pluralist, while remaining objective, taking a view of concepts closer to that of more traditional controlled vocabularies. Originality/value – This is one of very few studies which evaluate the development of a formal ontology in relation to its conceptual foundations, and the first to consider the GO in this way.


2007 ◽  
Vol 26 (3) ◽  
pp. 13 ◽  
Author(s):  
Louise F. Spiteri

This article examines the linguistic structure of folksonomy tags collected over a thirty-day period from the daily tag logs of Del.icio.us, Furl, and Technorati. The tags were evaluated against the National Information Standards Organization (NISO) guidelines for the construction of controlled vocabularies. The results indicate that the tags correspond closely to the NISO guidelines pertaining to types of concepts expressed, the predominance of single terms and nouns, and the use of recognized spelling. Problem areas pertain to the inconsistent use of count nouns and the incidence of ambiguous tags in the form of homographs, abbreviations, and acronyms. With the addition of guidelines for the construction of unambiguous tags and links to useful external reference sources, folksonomies could serve as a powerful, flexible tool for increasing the user-friendliness and interactivity of public library catalogs, and also may be useful for encouraging other activities, such as informal online communities of readers and user-driven readers’ advisory services.


2021 ◽  
Author(s):  
Nurdan Atalan Çayırezmez ◽  
Piraye Hacıgüzeller ◽  
Tuna Kalayci

This article provides a brief overview of archaeological digital archiving in Turkey. It introduces the legal framework and the stakeholders involved in conducting archaeological excavations and surveys. The current situation in archiving born-digital and digitised documentation produced during archaeological fieldwork is then introduced. Existing repositories serving as hubs for archaeological and heritage archiving are listed and briefly discussed. Analysis of online publishing practices for archaeological digital resources points to an eclectic landscape that only minimally complies with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. We conclude that guidelines for best practice in metadata and semantic technologies, locally applicable standards (especially controlled vocabularies), technical know-how, and a larger acceptance of open data and scholarship remain much-needed assets for archaeological digital archiving in Turkey. We also conclude that the future promises progress towards more interoperable archaeological digital archives thanks to international training, network and knowledge transfer opportunities (e.g. SEADDA Project).

