Path-Oriented Keyword Search Query over RDF

Effective keyword query structuring using NER for XML retrieval

International Journal of Web Information Systems ◽

10.1108/ijwis-06-2014-0022 ◽

2015 ◽

Vol 11 (1) ◽

pp. 33-53

Author(s):

Abubakar Roko ◽

Shyamala Doraisamy ◽

Azrul Hazri Jantan ◽

Azreen Azman

Keyword(s):

Semantic Information ◽

Keyword Search ◽

Query Languages ◽

Entity Recognition ◽

Experimental Result ◽

Query Interface ◽

Search Query ◽

Keyword Query ◽

Xml Retrieval ◽

Content Type

Purpose – The purpose of this paper is to propose and evaluate XKQSS, a query structuring method that shifts the task of generating structured queries from the user to the search engine while retaining the simple keyword search interface. Structured queries are a more effective way to search XML databases. However, expressing queries in a query language proves difficult for most users, since it requires learning the language and knowing the underlying data schema. At the same time, the success of Web search engines has made many users familiar with keyword search, so they prefer a keyword query interface for searching XML data. Design/methodology/approach – Existing query structuring approaches require users to provide structural hints in their input keyword queries even though their interface is keyword-based. Existing systems also fail to take keyword query ambiguities into account during query structuring, and do not address how to select the generated structured query that best represents a given keyword query. To address these problems, this study allows users to submit a schema-independent keyword query, uses named entity recognition (NER) to categorize query keywords and so resolve query ambiguities, and computes semantic information for a node from its data content. Algorithms are proposed that find user search intentions and convert them into a set of ranked structured queries. Findings – Experiments with the Sigmod and IMDB datasets were conducted to evaluate the effectiveness of the method. The experimental results show that XKQSS is about 20 per cent more effective in terms of return node identification than XReal, a state-of-the-art system for XML retrieval. Originality/value – Existing systems do not take keyword query ambiguities into account. XKQSS includes two NER-based guidelines that help to resolve these ambiguities before the submitted query is converted. It also includes a ranking function that computes a score for each generated query using both semantic information and data statistics, as opposed to the statistics-only approach used by existing systems.

Leaving No Stone Unturned: Flexible Retrieval of Idiomatic Expressions from a Large Text Corpus

Machine Learning and Knowledge Extraction ◽

10.3390/make3010013 ◽

2021 ◽

Vol 3 (1) ◽

pp. 263-283

Author(s):

Callum Hughes ◽

Maxim Filimonov ◽

Alison Wray ◽

Irena Spasić

Keyword(s):

Language Processing ◽

Query Expansion ◽

Retrieval System ◽

Keyword Search ◽

Query Language ◽

Language Education ◽

Information Retrieval System ◽

Search Query ◽

Linguistic Research ◽

Adverbial Modification

Idioms are multi-word expressions whose meaning cannot always be deduced from the literal meaning of constituent words. A key feature of idioms that is central to this paper is their peculiar mixture of fixedness and variability, which poses challenges for their retrieval from large corpora using traditional search approaches. These challenges hinder insights into idiom usage, affecting users who are conducting linguistic research as well as those involved in language education. To facilitate access to idiom examples taken from real-world contexts, we introduce an information retrieval system designed specifically for idioms. Given a search query that represents an idiom, typically in its canonical form, the system expands it automatically to account for the most common types of idiom variation including inflection, open slots, adjectival or adverbial modification and passivisation. As a by-product of query expansion, other types of idiom variation captured include derivation, compounding, negation, distribution across multiple clauses as well as other unforeseen types of variation. The system was implemented on top of Elasticsearch, an open-source, distributed, scalable, real-time search engine. Flexible retrieval of idioms is supported by a combination of linguistic pre-processing of the search queries, their translation into a set of query clauses written in a query language called Query DSL, and analysis, an indexing process that involves tokenisation and normalisation. Our system outperformed the phrase search in terms of recall and outperformed the keyword search in terms of precision. Out of the three, our approach was found to provide the best balance between precision and recall. By providing a fast and easy way of finding idioms in large corpora, our approach can facilitate further developments in fields such as linguistics, language education and natural language processing.
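
As a rough illustration of translating an idiom into a Query DSL clause, the sketch below builds a `match_phrase` query whose `slop` value tolerates inserted modifiers and open slots. It is not the authors' expansion procedure: the field name `text`, the `SLOT_WORDS` set, the helper `build_idiom_query`, and the default slop allowance are all assumptions made for this example. Inflection is assumed to be handled by the index analyser (for example, stemming), which is not shown here.

```python
import json

# Hypothetical placeholder words that mark an open slot in a canonical idiom.
SLOT_WORDS = {"someone's", "one's", "someone", "something"}


def build_idiom_query(idiom, field="text", extra_slop=2):
    """Build an Elasticsearch-style request body for one idiom.

    Slot placeholders are dropped from the phrase, and each dropped
    placeholder adds one position of slop on top of `extra_slop`, so that
    words filling the slot (or modifiers) may sit between the fixed words.
    """
    tokens = idiom.lower().split()
    kept = [t for t in tokens if t not in SLOT_WORDS]
    slots = len(tokens) - len(kept)
    return {
        "query": {
            "match_phrase": {
                field: {
                    "query": " ".join(kept),
                    "slop": slots + extra_slop,
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the request body that would be sent to a search endpoint.
    print(json.dumps(build_idiom_query("pull someone's leg"), indent=2))
```

A phrase query with slop sits between the two baselines mentioned in the abstract: it is looser than an exact phrase search and stricter than a bag-of-keywords search.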

Which Ranking for Effective Keyword Search Query over RDF Graphs?

Information Reuse and Integration in Academia and Industry ◽

10.1007/978-3-7091-1538-1_8 ◽

2013 ◽

pp. 163-186

Author(s):

Roberto De Virgilio

Keyword(s):

Keyword Search ◽

Search Query ◽

Rdf Graphs

RDF Keyword Search Query Processing via Tensor Calculus

On the Move to Meaningful Internet Systems: OTM 2012 - Lecture Notes in Computer Science ◽

10.1007/978-3-642-33615-7_22 ◽

2012 ◽

pp. 780-788 ◽

Cited By ~ 1

Author(s):

Roberto De Virgilio

Keyword(s):

Query Processing ◽

Keyword Search ◽

Search Query ◽

Tensor Calculus

A Path-Oriented RDF Index for Keyword Search Query Processing

Lecture Notes in Computer Science - Database and Expert Systems Applications ◽

10.1007/978-3-642-23091-2_31 ◽

2011 ◽

pp. 366-380 ◽

Cited By ~ 13

Author(s):

Paolo Cappellari ◽

Roberto De Virgilio ◽

Antonio Maccioni ◽

Mark Roantree

Keyword(s):

Query Processing ◽

Keyword Search ◽

Search Query

An Approach to Trie Based Keyword Search for Search Engines

International Journal of Library and Information Services ◽

10.4018/ijlis.2017010101 ◽

2017 ◽

Vol 6 (1) ◽

pp. 1-16

Author(s):

Pranav Murali

Keyword(s):

Data Structure ◽

Search Engines ◽

Keyword Search ◽

Relevant Information ◽

Binary Search ◽

Access Time ◽

Search Query ◽

Search Technique ◽

Indexing Methods ◽

Good Improvement

Search engines use indexing techniques to minimize the time taken to find information relevant to a search query. They maintain a keyword list that may reside either in memory or in external storage, such as a hard disk. While a pure binary search can be used for this purpose, it suffers from performance issues when keywords are stored in external storage. Some search engine implementations use a B-tree and sparse indexes to reduce access time. This paper aims to reduce keyword access time further. It presents a keyword search technique that combines a trie data structure with a new keyword prefixing method. Experimental results show a good improvement in performance over pure binary search. The merits of incorporating a trie-based approach into contemporary indexing methods are also discussed. The keyword prefixing method is described, and some salient steps in the process of keyword generation are outlined.
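
For readers unfamiliar with the data structure, a minimal in-memory trie for keyword lookup might look like the sketch below. It shows only the generic trie; the paper's keyword prefixing method is not described in the abstract and is not reproduced here. The class and method names (`Trie`, `insert`, `contains`, `with_prefix`) are chosen for this example.

```python
class TrieNode:
    """One node per character; `is_keyword` marks the end of a stored keyword."""

    __slots__ = ("children", "is_keyword")

    def __init__(self):
        self.children = {}
        self.is_keyword = False


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, keyword):
        """Store a keyword, creating nodes along its character path."""
        node = self.root
        for ch in keyword:
            node = node.children.setdefault(ch, TrieNode())
        node.is_keyword = True

    def _walk(self, prefix):
        """Return the node reached by following `prefix`, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, keyword):
        """Exact lookup: cost grows with keyword length, not list size."""
        node = self._walk(keyword)
        return node is not None and node.is_keyword

    def with_prefix(self, prefix):
        """Return all stored keywords that start with `prefix`, sorted."""
        start = self._walk(prefix)
        if start is None:
            return []
        found, stack = [], [(start, prefix)]
        while stack:
            node, word = stack.pop()
            if node.is_keyword:
                found.append(word)
            for ch, child in node.children.items():
                stack.append((child, word + ch))
        return sorted(found)
```

Unlike binary search over a sorted keyword list, which needs a number of comparisons that grows with the logarithm of the list size, a trie lookup follows one node per character of the query keyword.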

Machine Learning of Motor Vehicle Accident Categories from Narrative Data

Methods of Information in Medicine ◽

10.1055/s-0038-1634680 ◽

1996 ◽

Vol 35 (04/05) ◽

pp. 309-316 ◽

Cited By ~ 4

Author(s):

M. R. Lehto ◽

G. S. Sorock

Keyword(s):

Machine Learning ◽

Bayesian Model ◽

Keyword Search ◽

Motor Vehicle ◽

Motor Vehicle Accident ◽

Computer Search ◽

Vehicle Accident ◽

Learning Technique ◽

Expert Ratings ◽

Keyword Searches

Bayesian inferencing as a machine learning technique was evaluated for identifying pre-crash activity and crash type from accident narratives describing 3,686 motor vehicle crashes. It was hypothesized that a Bayesian model could learn from a computer search for 63 keywords related to accident categories. Learning was described in terms of the ability to accurately classify previously unclassifiable narratives not containing the original keywords. When narratives contained keywords, the results obtained using both the Bayesian model and keyword search corresponded closely to expert ratings (P(detection)≥0.9, and P(false positive)≤0.05). For narratives not containing keywords, when the threshold used by the Bayesian model was varied between p>0.5 and p>0.9, the overall probability of detecting a category assigned by the expert varied between 67% and 12%. False positives correspondingly varied between 32% and 3%. These latter results demonstrated that the Bayesian system learned from the results of the keyword searches.
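
The general idea of learning from keyword-search results can be sketched with a small naive Bayes classifier. This is an illustration under stated assumptions, not the model used in the study: the function names, the Laplace smoothing, and the tiny invented narratives in the usage block are all hypothetical. Narratives labelled by a keyword search serve as training data, and the classifier then assigns a category to a narrative that contains none of the keywords, provided the posterior probability exceeds a threshold.

```python
import math
from collections import Counter, defaultdict


def label_by_keywords(text, keywords):
    """Keyword search: return the first category whose keyword occurs in text."""
    words = text.lower().split()
    for category, kws in keywords.items():
        if any(k in words for k in kws):
            return category
    return None


def train(labelled):
    """Count words per category from (text, category) pairs."""
    word_counts, cat_counts, vocab = defaultdict(Counter), Counter(), set()
    for text, category in labelled:
        cat_counts[category] += 1
        for w in text.lower().split():
            word_counts[category][w] += 1
            vocab.add(w)
    return word_counts, cat_counts, vocab


def posterior(text, model):
    """Naive Bayes posterior over categories, with add-one smoothing."""
    word_counts, cat_counts, vocab = model
    total = sum(cat_counts.values())
    log_scores = {}
    for category in cat_counts:
        n = sum(word_counts[category].values())
        score = math.log(cat_counts[category] / total)
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[category][w] + 1) / (n + len(vocab)))
        log_scores[category] = score
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def classify(text, model, threshold=0.5):
    """Return the most probable category if its posterior exceeds threshold."""
    category, p = max(posterior(text, model).items(), key=lambda kv: kv[1])
    return category if p > threshold else None


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    keywords = {"rear_end": ["rear"], "rollover": ["rolled"]}
    narratives = [
        "vehicle struck from behind while stopped at light rear ended",
        "car rolled over after leaving the road on a curve",
    ]
    labelled = [(n, label_by_keywords(n, keywords)) for n in narratives]
    model = train(labelled)
    print(classify("truck stopped at light was struck from behind", model, 0.9))
```

Raising the threshold trades detections for fewer false positives, which is the trade-off the abstract reports when the threshold moves from p>0.5 to p>0.9.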
