Natural language query formalization to SPARQL for querying knowledge bases using Rasa

AbstractThe idea of data to be semantically linked and the subsequent usage of this linked data with modern computer applications has been one of the most important aspects of Web 3.0. However, the actualization of this aspect has been challenging due to the difficulties associated with building knowledge bases and using formal languages to query them. In this regard, SPARQL, a recursive acronym for standard query language and protocol for Linked Open Data and Resource Description Framework databases, is a most popular formal querying language. Nonetheless, writing SPARQL queries is known to be difficult, even for experts. Natural language query formalization, which involves semantically parsing natural language queries to their formal language equivalents, has been an essential step in overcoming this steep learning curve. Recent work in the field has seen the usage of artificial intelligence (AI) techniques for language modelling with adequate accuracy. This paper discusses a design for creating a closed domain ontology, which is then used by an AI-powered chat-bot that incorporates natural language query formalization for querying linked data using Rasa for entity extraction after intent recognition. A precision–recall analysis is performed using in-built Rasa tools in conjunction with our own testing parameters, and it is found that our system achieves a precision of 0.78, recall of 0.79 and F1-score of 0.79, which are better than the current state of the art.

Download Full-text

An Unsupervised Approach for Determining Link Specifications

International Journal of Information Technology and Web Engineering ◽

10.4018/ijitwe.2018100106 ◽

2018 ◽

Vol 13 (4) ◽

pp. 104-123

Author(s):

Khayra Bencherif ◽

Mimoun Malki ◽

Djamel Amar Bensaber

Keyword(s):

Linked Data ◽

Open Data ◽

Real Data ◽

Knowledge Bases ◽

Structured Data ◽

Data Sets ◽

Novel Approach ◽

Link Discovery ◽

Unsupervised Approach

This article describes how the Linked Open Data Cloud project allows data providers to publish structured data on the web according to the Linked Data principles. In this context, several link discovery frameworks have been developed for connecting entities contained in knowledge bases. In order to achieve a high effectiveness for the link discovery task, a suitable link configuration is required to specify the similarity conditions. Unfortunately, such configurations are specified manually; which makes the link discovery task tedious and more difficult for the users. In this article, the authors address this drawback by proposing a novel approach for the automatic determination of link specifications. The proposed approach is based on a neural network model to combine a set of existing metrics into a compound one. The authors evaluate the effectiveness of the proposed approach in three experiments using real data sets from the LOD Cloud. In addition, the proposed approach is compared against link specifications approaches to show that it outperforms them in most experiments.

Download Full-text

The Linked Data Wiki: Leveraging Organizational Knowledge Bases with Linked Open Data

Communications in Computer and Information Science - Knowledge Discovery, Knowledge Engineering and Knowledge Management ◽

10.1007/978-3-030-15640-4_15 ◽

2019 ◽

pp. 294-319 ◽

Cited By ~ 1

Author(s):

Matthias T. Frank ◽

Stefan Zander

Keyword(s):

Linked Data ◽

Open Data ◽

Organizational Knowledge ◽

Knowledge Bases ◽

Linked Open Data

Download Full-text

Balancing control, usability and visibility of linked open government data to create public value

International Journal of Public Sector Management ◽

10.1108/ijpsm-02-2018-0062 ◽

2019 ◽

Vol 32 (5) ◽

pp. 451-466 ◽

Cited By ~ 1

Author(s):

Benedikt Simon Hitz-Gamper ◽

Oliver Neumann ◽

Matthias Stürmer

Keyword(s):

Linked Data ◽

Open Data ◽

Knowledge Bases ◽

Open Government ◽

Public Value ◽

Content Type ◽

Triple Store ◽

Open Government Data ◽

Government Data ◽

Governance Modes

Purpose Linked data is a technical standard to structure complex information and relate independent sets of data. Recently, governments have started to use this technology for bridging separated data “(silos)” by launching linked open government data (LOGD) portals. The purpose of this paper is to explore the role of LOGD as a smart technology and strategy to create public value. This is achieved by enhancing the usability and visibility of open data provided by public organizations. Design/methodology/approach In this study, three different LOGD governance modes are deduced: public agencies could release linked data via a dedicated triple store, via a shared triple store or via an open knowledge base. Each of these modes has different effects on usability and visibility of open data. Selected case studies illustrate the actual use of these three governance modes. Findings According to this study, LOGD governance modes present a trade-off between retaining control over governmental data and potentially gaining public value by the increased use of open data by citizens. Originality/value This study provides recommendations for public sector organizations for the development of their data publishing strategy to balance control, usability and visibility considering also the growing popularity of open knowledge bases such as Wikidata.

Download Full-text

A Natural Language Query Method for Linked Data

10.1109/bdee52938.2021.00012 ◽

2021 ◽

Author(s):

Zheng Xiao ◽

Yang Xiao

Keyword(s):

Natural Language ◽

Linked Data ◽

Natural Language Query

Download Full-text

Hindi Language Interface to Database

ITM Web of Conferences ◽

10.1051/itmconf/20203201007 ◽

2020 ◽

Vol 32 ◽

pp. 01007

Author(s):

Rachana Dubey ◽

Tejal Kawale ◽

Twisha Choudhary ◽

Vaibhav Narawade

Keyword(s):

Natural Language ◽

Data Storage ◽

Query Language ◽

Database Systems ◽

Sources Of Information ◽

Common People ◽

Storage And Retrieval ◽

Natural Language Query ◽

Sql Query ◽

Information Database

In our everyday lives we require information to accomplish daily tasks. Database is one of the most important sources of information. Database systems have been widely used in data storage and retrieval. However, to extract information from databases, we need to have some knowledge of database languages like SQL. But SQL has predefined structures and format, so it is hard for the non-expert users to formulate the desired query. To override this complexity, we have turned to natural language to retrieve information from database, which can be an ideal channel between a non-technical user and the application. But the application cannot understand natural language so an interface is required. This interface is capable of converting the user’s natural language query to an equivalent database language query. In this paper, we address the system architecture for translating a Hindi sentence in the form of an audio to an equivalent SQL query. The users don’t need to learn any formal query language; hence it’s easy to use for common people.

Download Full-text

Special Thematic Section on Semantic Models for Natural Language Processing (Preface)

Cybernetics and Information Technologies ◽

10.2478/cait-2018-0008 ◽

2018 ◽

Vol 18 (1) ◽

pp. 93-94

Author(s):

Kiril Simov ◽

Petya Osenova

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Question Answering ◽

Open Data ◽

Knowledge Bases ◽

Lexical Resources ◽

Text Simplification ◽

Semantic Models ◽

Semantic Resources

Abstract With the availability of large language data online, cross-linked lexical resources (such as BabelNet, Predicate Matrix and UBY) and semantically annotated corpora (SemCor, OntoNotes, etc.), more and more applications in Natural Language Processing (NLP) have started to exploit various semantic models. The semantic models have been created on the base of LSA, clustering, word embeddings, deep learning, neural networks, etc., and abstract logical forms, such as Minimal Recursion Semantics (MRS) or Abstract Meaning Representation (AMR), etc. Additionally, the Linguistic Linked Open Data Cloud has been initiated (LLOD Cloud) which interlinks linguistic data for improving the tasks of NLP. This cloud has been expanding enormously for the last four-five years. It includes corpora, lexicons, thesauri, knowledge bases of various kinds, organized around appropriate ontologies, such as LEMON. The semantic models behind the data organization as well as the representation of the semantic resources themselves are a challenge to the NLP community. The NLP applications that extensively rely on the above discussed models include Machine Translation, Information Extraction, Question Answering, Text Simplification, etc.

Download Full-text

Natural language query filtration in the Conceptual Query Language

Proceedings of the Thirtieth Hawaii International Conference on System Sciences ◽

10.1109/hicss.1997.661696 ◽

2002 ◽

Cited By ~ 5

Author(s):

V. Owei ◽

Hyeun-Suk Rhee ◽

S. Navathe

Keyword(s):

Natural Language ◽

Query Language ◽

Natural Language Query

Download Full-text

Complex Program Induction for Querying Knowledge Bases in the Absence of Gold Programs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00262 ◽

2019 ◽

Vol 7 ◽

pp. 185-200 ◽

Cited By ~ 4

Author(s):

Amrita Saha ◽

Ghulam Ahmed Ansari ◽

Abhishek Laddha ◽

Karthik Sankaranarayanan ◽

Soumen Chakrabarti

Keyword(s):

Natural Language ◽

Question Answering ◽

Knowledge Bases ◽

Combinatorial Explosion ◽

Complex Queries ◽

Present Complex ◽

Complex Program ◽

Natural Language Query ◽

High Level ◽

Complex Question

Recent years have seen increasingly complex question-answering on knowledge bases (KBQA) involving logical, quantitative, and comparative reasoning over KB subgraphs. Neural Program Induction (NPI) is a pragmatic approach toward modularizing the reasoning process by translating a complex natural language query into a multi-step executable program. While NPI has been commonly trained with the ‘‘gold’’ program or its sketch, for realistic KBQA applications such gold programs are expensive to obtain. There, practically only natural language queries and the corresponding answers can be provided for training. The resulting combinatorial explosion in program space, along with extremely sparse rewards, makes NPI for KBQA ambitious and challenging. We present Complex Imperative Program Induction from Terminal Rewards (CIPITR), an advanced neural programmer that mitigates reward sparsity with auxiliary rewards, and restricts the program space to semantically correct programs using high-level constraints, KB schema, and inferred answer type. CIPITR solves complex KBQA considerably more accurately than key-value memory networks and neural symbolic machines (NSM). For moderately complex queries requiring 2- to 5-step programs, CIPITR scores at least 3× higher F1 than the competing systems. On one of the hardest class of programs (comparative reasoning) with 5–10 steps, CIPITR outperforms NSM by a factor of 89 and memory networks by 9 times. 1

Download Full-text

Evaluating the usability of natural language query languages and interfaces to Semantic Web knowledge bases

Journal of Web Semantics ◽

10.1016/j.websem.2010.06.001 ◽

2010 ◽

Vol 8 (4) ◽

pp. 377-393 ◽

Cited By ~ 71

Author(s):

Esther Kaufmann ◽

Abraham Bernstein

Keyword(s):

Semantic Web ◽

Natural Language ◽

Knowledge Bases ◽

Query Languages ◽

Natural Language Query ◽

Web Knowledge

Download Full-text

Exploiting Linked Data for Open and Configurable Named Entity Extraction

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213015400126 ◽

2015 ◽

Vol 24 (02) ◽

pp. 1540012 ◽

Cited By ~ 8

Author(s):

Pavlos Fafalios ◽

Manolis Baritakis ◽

Yannis Tzitzikas

Keyword(s):

Linked Data ◽

Question Answering ◽

Knowledge Bases ◽

Semantic Knowledge ◽

Configuration Model ◽

Generic Model ◽

Entity Extraction ◽

Named Entity ◽

Domain Specific ◽

Named Entity Extraction

Named Entity Extraction (NEE) is the process of identifying entities in texts and, very commonly, linking them to related (Web) resources. This task is useful in several applications, e.g. for question answering, annotating documents, post-processing of search results, etc. However, existing NEE tools lack an open or easy configuration although this is very important for building domain-specific applications. For example, supporting a new category of entities, or specifying how to link the detected entities with online resources, is either impossible or very laborious. In this paper, we show how we can exploit semantic information (Linked Data) at real-time for configuring (handily) a NEE system and we propose a generic model for configuring such services. To explicitly define the semantics of the proposed model, we introduce an RDF/S vocabulary, called “Open NEE Configuration Model”, which allows a NEE service to describe (and publish as Linked Data) its entity mining capabilities, but also to be dynamically configured. To allow relating the output of a NEE process with an applied configuration, we propose an extension of the Open Annotation Data Model which also enables an application to run advanced queries over the annotated data. As a proof of concept, we present X-Link, a fully-configurable NEE framework that realizes this approach. Contrary to the existing tools, X-Link allows the user to easily define the categories of entities that are interesting for the application at hand by exploiting one or more semantic Knowledge Bases. The user is also able to update a category and specify how to semantically link and enrich the identified entities. This enhanced configurability allows X-Link to be easily configured for different contexts for building domain-specific applications. To test the approach, we conducted a task-based evaluation with users that demonstrates its usability, and a case study that demonstrates its feasibility.

Download Full-text