Integrating corpus data in dynamic knowledge bases

Terminological information is a key element in the construction of a knowledge base. In order for a knowledge base to be useful to different users, terminological information should be extracted from corpora so as to reflect the different pragmatic nuances. Puertoterm is a knowledge base in the field of Coastal Engineering, which has made use of corpus information to develop terminological entries. It also includes contextual information in such a way that this information interacts with other elements of the knowledge base. We describe the methodology followed in the project regarding corpus design, retrieval of lexical information, conceptual organization of the domain of Coastal Engineering, and the elaboration of terminological entries.


Framing Terminology: A Process-Oriented Approach

Meta: Journal des traducteurs ◽

10.7202/019916ar ◽

2009 ◽

Vol 50 (4) ◽

Cited By ~ 7

Author(s):

Pamela Faber Benítez ◽

Carlos Márquez Linares ◽

Miguel Vega Expósito

Keyword(s):

Dynamic Process ◽

Coastal Engineering ◽

Semantic Structure ◽

Entire System ◽

Frame Semantics ◽

Conceptual Organization ◽

Adequate Representation ◽

Corpus Data ◽

Process Oriented ◽

Case Frames

The frame notion used in Frame Semantics can be traced to case frames, which were said to characterize a small abstract situation in such a way that if one wished to understand the semantic structure of a verb it was necessary to understand the properties of the entire scene that it activated. A frame has been more broadly defined as any system of concepts related in such a way that one concept evokes the entire system. In this sense, it bears an obvious affinity with terminology, which is also based on such conceptual organization. However, despite the fact that Frame Semantics has been usefully applied to lexicology and syntax, so far it has not been systematically applied to terminology. This paper argues for a frame-based organization of specialized fields in which a dynamic process-oriented frame provides the conceptual underpinnings for the location of sub-hierarchies of concepts within a specialized domain event, and the elaboration of a definition template, thus opening the door to a more adequate representation of specialized fields as well as supplying a better way of linking terms to concepts. The domain of coastal engineering is used as an example because the entities in play take part in processes that are difficult to describe only by means of conceptual trees. Through the use of corpus data we demonstrate how it is possible to represent such an event and create a dynamic frame which enriches and enhances the understanding of specialized field concepts.


Open Knowledge Base: Resources and Units of Knowledge

10.31219/osf.io/7vayt ◽

2020 ◽

Author(s):

Matheus Pereira Lobo

Keyword(s):

Knowledge Base ◽

Knowledge Bases

This paper highlights two categories of knowledge bases: one built as a repository of links, and the other based on units of knowledge.


Building an OWL ontology with Xper3

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25614 ◽

2018 ◽

Vol 2 ◽

pp. e25614 ◽

Cited By ~ 1

Author(s):

Florian Pellen ◽

Sylvain Bouquin ◽

Isabelle Mougenot ◽

Régine Vignes-Lebbe

Keyword(s):

Semantic Web ◽

Knowledge Base ◽

Expressive Power ◽

Knowledge Bases ◽

User Friendliness ◽

Standard Format ◽

Ontology Language ◽

Closed World ◽

Web Standards ◽

Description Framework

Xper3 (Vignes Lebbe et al. 2016) is a collaborative knowledge base publishing platform that, since its launch in November 2013, has been adopted by more than 2,000 users (Pinel et al. 2017). This is mainly due to its user-friendly interface and the simplicity of its data model. The data are stored in MySQL relational databases, but the exchange format uses the TDWG standard format SDD (Structured Descriptive Data; Hagedorn et al. 2005). However, each Xper3 knowledge base is a closed world that the author(s) may or may not share with the scientific community or the public via publishing content and/or identification key (Kopfstein 2016). The explicit taxonomic, geographic and phenotypic limits of a knowledge base are not always well defined in the metadata fields. Conversely, terminology vocabularies such as the Phenotype and Trait Ontology (PATO) and the Plant Ontology (PO), and software to edit them, such as Protégé and Phenoscape, are essential to the semantic web but difficult to handle for biologists without computer skills. These ontologies constitute open worlds and are themselves expressed as RDF (Resource Description Framework) triples. Protégé offers visualization and reasoning capabilities for these ontologies (Gennari et al. 2003, Musen 2015). Our challenge is to combine the user friendliness of Xper3 with the expressive power of OWL (Web Ontology Language), the W3C standard for building ontologies. We therefore focused on analyzing the representation of the same taxonomic contents under Xper3 and under different models in OWL. After this critical analysis, we chose a description model that allows automatic export of SDD to OWL and can be easily enriched. We will present the results obtained and their validation on two knowledge bases, one on parasitic crustaceans (Sacculina) and the second on current ferns and fossils (Corvez and Grand 2014). The evolution of the Xper3 platform and the perspectives offered by this link with semantic web standards will be discussed.


Improving the Quality of Linked Data Using Statistical Distributions

Information Retrieval and Management ◽

10.4018/978-1-5225-5191-1.ch074 ◽

2018 ◽

pp. 1638-1664 ◽

Cited By ~ 1

Author(s):

Heiko Paulheim ◽

Christian Bizer

Keyword(s):

Knowledge Base ◽

Linked Data ◽

Relational Databases ◽

Knowledge Bases ◽

Structured Data ◽

Data Sources ◽

Data Sets ◽

Statistical Distributions ◽

The Web

Linked Data on the Web is either created from structured data sources (such as relational databases), from semi-structured sources (such as Wikipedia), or from unstructured sources (such as text). In the latter two cases, the generated Linked Data will likely be noisy and incomplete. In this paper, we present two algorithms that exploit statistical distributions of properties and types for enhancing the quality of incomplete and noisy Linked Data sets: SDType adds missing type statements, and SDValidate identifies faulty statements. Neither of the algorithms uses external knowledge, i.e., they operate only on the data itself. We evaluate the algorithms on the DBpedia and NELL knowledge bases, showing that they are both accurate as well as scalable. Both algorithms have been used for building the DBpedia 3.9 release: With SDType, 3.4 million missing type statements have been added, while using SDValidate, 13,000 erroneous RDF statements have been removed from the knowledge base.
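The idea of inferring missing types from the statistical distribution of properties can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy triples, the 0.5 threshold, and the equal weighting of properties are all assumptions made for illustration, and only properties in subject position are considered. Each property votes for the types that its typed subjects usually have, and an untyped resource receives the types whose averaged vote passes the threshold.

```python
from collections import Counter, defaultdict

def learn_type_distributions(triples, types):
    """For each property, estimate P(type | subject uses that property)."""
    counts = defaultdict(Counter)  # property -> counts of subject types
    totals = Counter()             # property -> number of typed subjects seen
    for subject, prop, _obj in triples:
        if subject in types:
            totals[prop] += 1
            for t in types[subject]:
                counts[prop][t] += 1
    return {p: {t: c / totals[p] for t, c in counts[p].items()} for p in counts}

def predict_types(resource, triples, dists, threshold=0.5):
    """Average the per-property type distributions (equal weights assumed)."""
    props = [p for s, p, _ in triples if s == resource and p in dists]
    if not props:
        return {}
    scores = Counter()
    for p in props:
        for t, prob in dists[p].items():
            scores[t] += prob / len(props)
    return {t: s for t, s in scores.items() if s >= threshold}

# Toy data (hypothetical): Hamburg has no type statement yet.
triples = [
    ("Berlin", "population", "3.6M"), ("Berlin", "mayor", "X"),
    ("Paris", "population", "2.1M"), ("Paris", "mayor", "Y"),
    ("Einstein", "birthPlace", "Ulm"),
    ("Hamburg", "population", "1.8M"), ("Hamburg", "mayor", "Z"),
]
types = {"Berlin": {"City"}, "Paris": {"City"}, "Einstein": {"Person"}}

dists = learn_type_distributions(triples, types)
hamburg_types = predict_types("Hamburg", triples, dists)
print(hamburg_types)
```

Like the approach described in the abstract, the sketch uses no external knowledge: both the distributions and the predictions come from the data set itself.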


Formal Query Building with Query Structure Prediction for Complex Question Answering over Knowledge Base

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/519 ◽

2020 ◽

Author(s):

Yongrui Chen ◽

Huiying Li ◽

Yuncheng Hua ◽

Guilin Qi

Keyword(s):

Knowledge Base ◽

Structure Prediction ◽

Question Answering ◽

State Transition ◽

Knowledge Bases ◽

Second Stage ◽

Transition Strategy ◽

Query Structure ◽

Two Stages ◽

Complex Question

Formal query building is an important part of complex question answering over knowledge bases. It aims to build correct executable queries for questions. Recent methods try to rank candidate queries generated by a state-transition strategy. However, this candidate generation strategy ignores the structure of queries, resulting in a considerable number of noisy queries. In this paper, we propose a new formal query building approach that consists of two stages. In the first stage, we predict the query structure of the question and leverage the structure to constrain the generation of the candidate queries. We propose a novel graph generation framework to handle the structure prediction task and design an encoder-decoder model to predict the argument of the predetermined operation in each generative step. In the second stage, we follow the previous methods to rank the candidate queries. The experimental results show that our formal query building approach outperforms existing methods on complex questions while staying competitive on simple questions.


Data mining for building knowledge bases: techniques, architectures and applications

The Knowledge Engineering Review ◽

10.1017/s0269888916000047 ◽

2016 ◽

Vol 31 (2) ◽

pp. 97-123 ◽

Cited By ~ 4

Author(s):

Alfred Krzywicki ◽

Wayne Wobcke ◽

Michael Bain ◽

John Calvo Martinez ◽

Paul Compton

Keyword(s):

Data Mining ◽

Knowledge Base ◽

Question Answering ◽

Knowledge Bases ◽

Event Extraction ◽

Data Sources ◽

Small Scale ◽

Knowledge Mining ◽

Practical Applications ◽

Unstructured Text

Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.


Contextual information in terminological knowledge bases: A multimodal approach

Journal of Pragmatics ◽

10.1016/j.pragma.2009.12.008 ◽

2010 ◽

Vol 42 (7) ◽

pp. 1928-1950 ◽

Cited By ~ 11

Author(s):

Arianne Reimerink ◽

Mercedes García de Quesada ◽

Silvia Montero-Martínez

Keyword(s):

Contextual Information ◽

Knowledge Bases ◽

Multimodal Approach


Towards a Food Safety Knowledge Base Applicable in Crisis Situations and Beyond

BioMed Research International ◽

10.1155/2015/830809 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Alexander Falenski ◽

Armin A. Weiser ◽

Christian Thöns ◽

Bernd Appel ◽

Annemarie Käsbohrer ◽

...

Keyword(s):

Food Safety ◽

Knowledge Base ◽

Knowledge Bases ◽

Decision Makers ◽

Mitigation Measures ◽

Base Concept ◽

Food Quality And Safety ◽

Contamination Event ◽

Safety Knowledge ◽

Crisis Situations

In case of contamination in the food chain, fast action is required in order to reduce the numbers of affected people. In such situations, being able to predict the fate of agents in foods would help risk assessors and decision makers in assessing the potential effects of a specific contamination event and thus enable them to deduce the appropriate mitigation measures. One efficient strategy supporting this is using model based simulations. However, application in crisis situations requires ready-to-use and easy-to-adapt models to be available from the so-called food safety knowledge bases. Here, we illustrate this concept and its benefits by applying the modular open source software tools PMM-Lab and FoodProcess-Lab. As a fictitious sample scenario, an intentional ricin contamination at a beef salami production facility was modelled. Predictive models describing the inactivation of ricin were reviewed, relevant models were implemented with PMM-Lab, and simulations on residual toxin amounts in the final product were performed with FoodProcess-Lab. Due to the generic and modular modelling concept implemented in these tools, they can be applied to simulate virtually any food safety contamination scenario. Apart from the application in crisis situations, the food safety knowledge base concept will also be useful in food quality and safety investigations.


Extending Knowledge Base Update into First-Order Knowledge Bases

New Frontiers in Applied Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-69052-8_66 ◽

2008 ◽

pp. 631-640

Author(s):

Ewa Madalińska-Bugaj ◽

Witold Łukaszewicz

Keyword(s):

Knowledge Base ◽

Knowledge Bases ◽

First Order


Web knowledge

Agency and the Semantic Web ◽

10.1093/oso/9780199292486.003.0008 ◽

2006 ◽

Author(s):

Christopher Walton

Keyword(s):

Semantic Web ◽

Knowledge Representation ◽

Knowledge Base ◽

Web Applications ◽

Semantic Network ◽

Knowledge Bases ◽

Web Standards ◽

Description Framework ◽

The Web

In the introductory chapter of this book, we discussed the means by which knowledge can be made available on the Web. That is, the representation of the knowledge in a form by which it can be automatically processed by a computer. To recap, we identified two essential steps that were deemed necessary to achieve this task:

1. We discussed the need to agree on a suitable structure for the knowledge that we wish to represent. This is achieved through the construction of a semantic network, which defines the main concepts of the knowledge, and the relationships between these concepts. We presented an example network that contained the main concepts to differentiate between kinds of cameras. Our network is a conceptualization, or an abstract view of a small part of the world. A conceptualization is defined formally in an ontology, which is in essence a vocabulary for knowledge representation.

2. We discussed the construction of a knowledge base, which is a store of knowledge about a domain in machine-processable form; essentially a database of knowledge. A knowledge base is constructed through the classification of a body of information according to an ontology. The result will be a store of facts and rules that describe the domain. Our example described the classification of different camera features to form a knowledge base. The knowledge base is expressed formally in the language of the ontology over which it is defined.

In this chapter we elaborate on these two steps to show how we can define ontologies and knowledge bases specifically for the Web. This will enable us to construct Semantic Web applications that make use of this knowledge. The chapter is devoted to a detailed explanation of the syntax and pragmatics of the RDF, RDFS, and OWL Semantic Web standards. The resource description framework (RDF) is an established standard for knowledge representation on the Web. Taken together with the associated RDF Schema (RDFS) standard, we have a language for representing simple ontologies and knowledge bases on the Web.
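The two steps described in this abstract, an ontology that fixes the vocabulary and a knowledge base of facts and rules classified against it, can be sketched in a few lines of Python. This is an illustrative toy, not the chapter's own example: the class names, the camera instances, and the helper functions are hypothetical, and plain tuples stand in for RDF triples. The single rule applied is a subclass rule analogous to the one RDFS defines for `rdfs:subClassOf`.

```python
# Ontology: a small class hierarchy (subclass -> superclass); names are hypothetical.
SUBCLASS_OF = {"DSLR": "Camera", "Compact": "Camera", "Camera": "Device"}

# Knowledge base: facts stored as (subject, predicate, object) triples.
FACTS = {
    ("cam1", "type", "DSLR"),
    ("cam1", "hasLens", "interchangeable"),
    ("cam2", "type", "Compact"),
}

def superclasses(cls):
    """Yield cls followed by each of its ancestors in the hierarchy."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)

def infer_types(facts):
    """Rule: if x has type C and C is a subclass of D, then x also has type D."""
    inferred = set(facts)
    for s, p, o in facts:
        if p == "type":
            for c in superclasses(o):
                inferred.add((s, "type", c))
    return inferred

def instances_of(cls, facts):
    """Return the sorted subjects whose stated or inferred type is cls."""
    return sorted(s for s, p, o in infer_types(facts) if p == "type" and o == cls)

cameras = instances_of("Camera", FACTS)
print(cameras)
```

The stated facts never say that `cam1` is a `Camera`; that statement follows from the rule and the ontology, which is the sense in which a knowledge base holds both facts and rules.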
