RDF Keyword Search Query Processing via Tensor Calculus

Author(s):  
Roberto De Virgilio
Author(s):  
Paolo Cappellari ◽  
Roberto De Virgilio ◽  
Antonio Maccioni ◽  
Mark Roantree

2018 ◽  
Vol 14 (3) ◽  
pp. 299-316 ◽  
Author(s):  
Chang-Sup Park

Purpose This paper proposes a new keyword search method on graph data that improves the relevance of search results and reduces the duplication of content nodes in the answer trees produced by previous approaches based on distinct root semantics. Those approaches are restricted to finding answer trees with different root nodes and thus often generate results containing answer trees with low relevance to the query or duplicate content nodes. The proposed method allows limited redundancy in the root nodes of top-k answer trees to produce more effective query results. Design/methodology/approach A measure of redundancy in a set of answer trees with respect to their root nodes is defined and, based on this measure, a set of answer trees with limited root redundancy is proposed as the result of a keyword query on graph data. For efficient query processing, an index on the useful paths in the graph, built from inverted lists and a hash map, is suggested. Based on this path index, a top-k query processing algorithm is presented that finds the most relevant and diverse answer trees given a maximum amount of root redundancy allowed for a set of answer trees. Findings Experiments on real graph datasets show that the proposed approach produces query answers that are more diverse in their content nodes and more relevant to the query than those of the previous approach based on distinct root semantics. Originality/value This paper is the first to take redundancy in the root nodes of answer trees into account in order to improve the relevance and reduce the content node redundancy of query results relative to the previous distinct root semantics. It can satisfy users' various information needs on large and complex graph data using keyword-based queries.
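
As an illustration of top-k selection under a root redundancy limit, the following minimal sketch may help. It is not the paper's algorithm: the abstract does not give the redundancy measure, so the sketch assumes it is the number of selected trees minus the number of distinct roots, and it uses a simple greedy pass instead of the path index. The names `AnswerTree`, `root_redundancy` and `top_k_limited_redundancy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnswerTree:
    root: str      # identifier of the root node
    score: float   # relevance to the query (higher is better)


def root_redundancy(trees):
    """Assumed measure: how many trees reuse a root already used by another tree."""
    return len(trees) - len({t.root for t in trees})


def top_k_limited_redundancy(candidates, k, max_redundancy):
    """Greedily pick up to k trees by score while keeping root redundancy bounded."""
    selected = []
    for tree in sorted(candidates, key=lambda t: t.score, reverse=True):
        if len(selected) == k:
            break
        # Accept the tree only if the limit still holds after adding it.
        if root_redundancy(selected + [tree]) <= max_redundancy:
            selected.append(tree)
    return selected
```

Under this assumed measure, `max_redundancy=0` reduces to distinct root semantics, while larger values let a highly relevant tree share its root with an already selected one.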


2000 ◽  
Vol 33 (1-6) ◽  
pp. 119-135 ◽  
Author(s):  
Daniela Florescu ◽  
Donald Kossmann ◽  
Ioana Manolescu

2016 ◽  
Vol 1 (1) ◽  
pp. 40-44
Author(s):  
Suchetadevi M. Gaikwad ◽  
Sanjay B. Thakare

As the deep web grows, there has been increased interest in methods that help locate deep-web interfaces efficiently. However, because of the huge volume and varying nature of the deep web, achieving both wide coverage and high efficiency is difficult. We propose a three-stage framework, an Enhanced Crawler, for efficiently gathering deep-web interfaces. In the first stage, the enhanced crawler performs site-based searching for center pages using automated search engines, which avoids visiting a large number of pages and saves time. In the second stage, the enhanced crawler achieves fast in-site browsing by fetching the most relevant links with an adaptive link ranking. For further enhancement, our system ranks and prioritizes websites and uses a link tree data structure to achieve deep coverage. In the third stage, our system provides a pre-query processing mechanism that helps users write their search queries easily by providing character-by-character keyword search with ranked indexing.
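
The character-by-character keyword search of the third stage can be pictured with a small prefix lookup over a ranked keyword index. This is only a sketch under assumptions: the abstract does not describe the index structure, so a sorted list with binary search is used here, and the class name `PrefixSuggester` and the example scores are hypothetical.

```python
from bisect import bisect_left


class PrefixSuggester:
    """Suggest indexed keywords for a typed prefix, best-ranked first."""

    def __init__(self, keyword_scores):
        self._scores = dict(keyword_scores)   # keyword -> rank score
        self._keys = sorted(self._scores)     # sorted for binary search

    def suggest(self, prefix, limit=5):
        # Jump to the first keyword that is >= prefix, then scan forward
        # while keywords still start with the prefix.
        start = bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        # Highest score first; ties broken alphabetically.
        matches.sort(key=lambda k: (-self._scores[k], k))
        return matches[:limit]
```

Calling `suggest` again after each typed character narrows the candidate list, which is the behaviour the abstract refers to as char-by-char search.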


2015 ◽  
Vol 11 (1) ◽  
pp. 33-53
Author(s):  
Abubakar Roko ◽  
Shyamala Doraisamy ◽  
Azrul Hazri Jantan ◽  
Azreen Azman

Purpose – The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that shifts the task of generating structured queries from the user to the search engine while retaining the simple keyword search query interface. A more effective way of searching an XML database is to use structured queries. However, expressing queries in a query language proves difficult for most users, since it requires learning the language and knowing the underlying data schema. On the other hand, the success of Web search engines has made many users familiar with keyword search, and they therefore prefer a keyword search query interface for searching XML data. Design/methodology/approach – Existing query structuring approaches require users to provide structural hints in their input keyword queries even though their interface is keyword-based. Other problems with existing systems include their failure to take keyword query ambiguities into account during query structuring and the question of how to select, among the generated structured queries, the one that best represents a given keyword query. To address these problems, this study allows users to submit a schema-independent keyword query, uses named entity recognition (NER) to categorize query keywords and resolve query ambiguities, and computes semantic information for a node from its data content. Algorithms are proposed that find user search intentions and convert them into a set of ranked structured queries. Findings – Experiments with the Sigmod and IMDB datasets were conducted to evaluate the effectiveness of the method. The experimental results show that XKQSS is about 20 per cent more effective in return node identification than XReal, a state-of-the-art system for XML retrieval. Originality/value – Existing systems do not take keyword query ambiguities into account. XKQSS consists of two guidelines based on NER that help to resolve these ambiguities before converting the submitted query. It also includes a ranking function that computes a score for each generated query using both semantic information and data statistics, as opposed to the data-statistics-only approach used by existing systems.


2010 ◽  
Vol 23 (5) ◽  
pp. 491-504 ◽  
Author(s):  
Kefeng Xuan ◽  
Geng Zhao ◽  
David Taniar ◽  
Maytham Safar ◽  
Bala Srinivasan

2021 ◽  
Vol 3 (1) ◽  
pp. 263-283
Author(s):  
Callum Hughes ◽  
Maxim Filimonov ◽  
Alison Wray ◽  
Irena Spasić

Idioms are multi-word expressions whose meaning cannot always be deduced from the literal meaning of constituent words. A key feature of idioms that is central to this paper is their peculiar mixture of fixedness and variability, which poses challenges for their retrieval from large corpora using traditional search approaches. These challenges hinder insights into idiom usage, affecting users who are conducting linguistic research as well as those involved in language education. To facilitate access to idiom examples taken from real-world contexts, we introduce an information retrieval system designed specifically for idioms. Given a search query that represents an idiom, typically in its canonical form, the system expands it automatically to account for the most common types of idiom variation including inflection, open slots, adjectival or adverbial modification and passivisation. As a by-product of query expansion, other types of idiom variation captured include derivation, compounding, negation, distribution across multiple clauses as well as other unforeseen types of variation. The system was implemented on top of Elasticsearch, an open-source, distributed, scalable, real-time search engine. Flexible retrieval of idioms is supported by a combination of linguistic pre-processing of the search queries, their translation into a set of query clauses written in a query language called Query DSL, and analysis, an indexing process that involves tokenisation and normalisation. Our system outperformed the phrase search in terms of recall and outperformed the keyword search in terms of precision. Out of the three, our approach was found to provide the best balance between precision and recall. By providing a fast and easy way of finding idioms in large corpora, our approach can facilitate further developments in fields such as linguistics, language education and natural language processing.
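
To give a rough idea of what translating an idiom into Query DSL clauses might look like, here is a minimal sketch. It is not the authors' translation: it only combines an exact `match_phrase` clause with a second `match_phrase` clause whose `slop` tolerates a few intervening words (for example inserted modifiers), and it assumes that inflection is handled by the index analyser. The function name `idiom_query`, the field name `text` and the default gap of 2 are hypothetical.

```python
import json


def idiom_query(idiom, field="text", max_gap=2):
    """Build an Elasticsearch request body (as a dict) for a canonical idiom."""
    phrase = " ".join(idiom.lower().split())
    return {
        "query": {
            "bool": {
                "should": [
                    # Clause 1: the idiom exactly as typed.
                    {"match_phrase": {field: {"query": phrase}}},
                    # Clause 2: the same words with up to max_gap extra
                    # positions between them, e.g. inserted adjectives.
                    {"match_phrase": {field: {"query": phrase, "slop": max_gap}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(idiom_query("spill the beans"), indent=2))
```

Documents matching both clauses score higher than those matching only the relaxed one, so exact occurrences tend to be ranked first. The resulting dictionary could be sent as the body of a search request; no client call is shown because the index layout is not given in the abstract.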

