inverted list Latest Research Papers

Advances in geographic information technologies have led to rapid growth in geo-textual data. Many indexing and search algorithms have been developed to handle such data, which carries spatial, textual and temporal information. Earlier indexing and search algorithms targeted applications in which the trajectory or velocity vector of each object is known in advance, so that the future position of the object can be predicted. In real-time applications such as emergency management systems and traffic monitoring, object movements are unpredictable and future positions cannot be predicted. Techniques are therefore required to answer geo-textual kNN queries when the velocity vectors or trajectories of moving objects and moving queries are not known. For moving objects, capturing the current position of each object and maintaining the spatial index efficiently is essential. Earlier hybrid indexing techniques are based on the R-tree spatial index. The nodes of the R-tree are split or merged to track the locations of continuously moving objects, which raises the maintenance cost compared with a grid index. This paper proposes a solution for creating and maintaining a hybrid index for moving objects and queries that combines a grid with inverted lists. It also proposes a method for finding geo-textual nearest neighbours for static and moving queries using the hybrid index and conceptual partitioning of the grid. In the reported experiments, the hybrid index yields an overall gain of 30 to 40 percent over the non-hybrid index, depending on the grid size chosen for mapping the data space and on the query parameters.
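To make the grid-plus-inverted-list idea concrete, here is a minimal sketch, not the paper's implementation. It assumes points in the unit square, a single query keyword, and a simple ring-by-ring cell traversal in place of the conceptual partitioning the abstract mentions. The class name `GridInvertedIndex` and its methods are hypothetical.

```python
import math
from collections import defaultdict


class GridInvertedIndex:
    """Uniform grid over the unit square; each cell holds keyword -> object ids."""

    def __init__(self, cells_per_side):
        self.n = cells_per_side
        self.cells = defaultdict(lambda: defaultdict(set))  # cell -> keyword -> ids
        self.objects = {}  # id -> (x, y, keywords)

    def _cell(self, x, y):
        def clamp(v):
            return min(self.n - 1, max(0, int(v * self.n)))
        return clamp(x), clamp(y)

    def remove(self, oid):
        old = self.objects.pop(oid, None)
        if old is None:
            return
        ox, oy, keywords = old
        cell = self._cell(ox, oy)
        for kw in keywords:
            self.cells[cell][kw].discard(oid)

    def upsert(self, oid, x, y, keywords):
        """Insert or move an object; only the old and new cells are touched."""
        self.remove(oid)
        self.objects[oid] = (x, y, frozenset(keywords))
        cell = self._cell(x, y)
        for kw in keywords:
            self.cells[cell][kw].add(oid)

    def knn(self, qx, qy, keyword, k):
        """Return up to k ids carrying `keyword`, nearest first."""
        cx, cy = self._cell(qx, qy)
        side = 1.0 / self.n
        found = []  # (distance, id)
        for r in range(self.n):
            for ix in range(cx - r, cx + r + 1):
                for iy in range(cy - r, cy + r + 1):
                    on_ring = max(abs(ix - cx), abs(iy - cy)) == r
                    inside = 0 <= ix < self.n and 0 <= iy < self.n
                    if not (on_ring and inside):
                        continue
                    postings = self.cells.get((ix, iy))
                    if postings is None:
                        continue
                    for oid in postings.get(keyword, ()):
                        ox, oy, _ = self.objects[oid]
                        found.append((math.hypot(ox - qx, oy - qy), oid))
            found.sort()
            # Every object in an unvisited cell lies farther than r * side away.
            if len(found) >= k and found[k - 1][0] <= r * side:
                break
        return [oid for _, oid in found[:k]]
```

A position update is a removal from one cell's inverted lists and an insertion into another's, with no node splits or merges. That is the maintenance advantage over the R-tree that the abstract describes.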


Selectivity Estimation on Set Containment Search

Data Science and Engineering ◽

10.1007/s41019-019-00104-1 ◽

2019 ◽

Vol 4 (3) ◽

pp. 254-268 ◽

Cited By ~ 1

Author(s):

Yang Yang ◽

Wenjie Zhang ◽

Ying Zhang ◽

Xuemin Lin ◽

Liping Wang

Keyword(s):

Random Sampling ◽

Simple Random Sampling ◽

Divide And Conquer ◽

Selectivity Estimation ◽

Vocabulary Size ◽

Set Containment ◽

Inverted List ◽

Efficient Sampling ◽

Occurrence Patterns ◽

Sampling Approach

Abstract: In this paper, we study the problem of selectivity estimation on set containment search. Given a query record Q and a record dataset $\mathcal{S}$, we aim to accurately and efficiently estimate the selectivity of set containment search of query Q over $\mathcal{S}$. We first extend existing distinct-value estimation techniques to solve this problem and develop IL-GKMV, an approach based on inverted lists and the G-KMV sketch. Our analysis shows that the performance of IL-GKMV degrades as the vocabulary size increases. Motivated by the limitations of existing techniques and the inherent challenges of the problem, we turn to sampling and propose OT-Sampling, an approach based on an ordered trie structure. OT-Sampling partitions records by element frequency and occurrence patterns and is significantly more accurate than both simple random sampling and IL-GKMV. To further enhance performance, we present DC-Sampling, a divide-and-conquer sampling approach that uses an inclusion/exclusion prefix to exploit pruning opportunities. We also consider weighted set containment selectivity estimation and devise a stratified random sampling approach named StrRS. We theoretically analyze the proposed techniques with respect to various accuracy estimators. Comprehensive experiments on nine real datasets verify the effectiveness and efficiency of the proposed techniques.
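As a point of reference, the simple random sampling baseline that the abstract compares against can be sketched as follows. This is an illustrative sketch, not the authors' code. It assumes that a record matches when it contains every element of Q (the abstract does not spell out the containment direction), and the function name `estimate_selectivity` is hypothetical.

```python
import random


def estimate_selectivity(records, query, sample_size, seed=0):
    """Estimate how many records contain every element of `query`.

    Draws a simple random sample without replacement and scales the
    hit ratio up to the full dataset size.
    """
    if not records:
        return 0.0
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    q = set(query)
    hits = sum(1 for rec in sample if q <= set(rec))
    return hits / len(sample) * len(records)
```

With a small sample, rare queries often produce zero hits and therefore an estimate of zero. That is the kind of inaccuracy that frequency-aware partitioning, such as the OT-Sampling approach named in the abstract, is meant to reduce.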


An Efficient and Effective Index Structure for Query Evaluation in Search Engines

Advances in Computer and Electrical Engineering - Advanced Methodologies and Technologies in Network Architecture, Mobile Computing, and Data Analytics ◽

10.4018/978-1-5225-7598-6.ch127 ◽

2019 ◽

pp. 1730-1743

Author(s):

Yangjun Chen

Keyword(s):

Search Engines ◽

Main Idea ◽

Effective Index ◽

Index Structure ◽

Query Evaluation ◽

Set Intersection ◽

Inverted List ◽

Interval Sequence

In this chapter, the authors discuss an efficient and effective index mechanism for search engines that supports both conjunctive and disjunctive queries. The main idea is to decompose an inverted list into a collection of disjoint sub-lists. The authors associate each word with an interval sequence, which is created by applying a kind of tree coding to a trie structure constructed over all the word sequences in a database. Each interval, rather than each word, is then associated with an inverted sub-list. In this way, both set intersection and union can be carried out through a series of simple interval containment checks. Experiments have been conducted, and they show that the new index is promising. The maintenance of indices when documents are inserted or deleted is also discussed in detail.
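The abstract does not specify the tree coding, so the following is only a sketch of one common scheme: each node receives a [start, end] interval from a depth-first traversal, and one node is an ancestor of another exactly when its interval contains the other's. The function names and the toy tree are hypothetical.

```python
def label_intervals(children, root):
    """Assign each node a (start, end) interval by depth-first numbering.

    `children` maps a node to the list of its child nodes. The interval
    of a descendant always nests inside the interval of its ancestor.
    """
    intervals = {}
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter
        counter += 1
        for child in children.get(node, []):
            visit(child)
        intervals[node] = (start, counter - 1)

    visit(root)
    return intervals


def contains(outer, inner):
    """Interval containment check: True when `inner` lies within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]
```

With such labels, deciding whether two trie nodes lie on the same root-to-leaf path reduces to two integer comparisons. This is the kind of cheap check the abstract refers to.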


An Efficient and Effective Index Structure for Query Evaluation in Search Engines

Encyclopedia of Information Science and Technology, Fourth Edition ◽

10.4018/978-1-5225-2255-3.ch695 ◽

2018 ◽

pp. 7995-8005

Author(s):

Yangjun Chen

Keyword(s):

Search Engines ◽

Main Idea ◽

Effective Index ◽

Index Structure ◽

Query Evaluation ◽

Set Intersection ◽

Inverted List ◽

Interval Sequence

In this chapter, we discuss an efficient and effective index mechanism for search engines that supports both conjunctive and disjunctive queries. The main idea is to decompose an inverted list into a collection of disjoint sub-lists. We associate each word with an interval sequence, which is created by applying a kind of tree coding to a trie structure constructed over all the word sequences in a database. Each interval, rather than each word, is then associated with an inverted sub-list. In this way, both set intersection and union can be carried out through a series of simple interval containment checks. Experiments have been conducted, and they show that the new index is promising. The maintenance of indexes when documents are inserted or deleted is also discussed in detail.


Inverted List Caching for Topical Index Shards

Lecture Notes in Computer Science - Advances in Information Retrieval ◽

10.1007/978-3-319-76941-7_47 ◽

2018 ◽

pp. 577-583

Author(s):

Zhuyun Dai ◽

Jamie Callan

Keyword(s):

Inverted List


An Experimental Study of Bitmap Compression vs. Inverted List Compression

Proceedings of the 2017 ACM International Conference on Management of Data - SIGMOD '17 ◽

10.1145/3035918.3064007 ◽

2017 ◽

Cited By ~ 21

Author(s):

Jianguo Wang ◽

Chunbin Lin ◽

Yannis Papakonstantinou ◽

Steven Swanson

Keyword(s):

Experimental Study ◽

Inverted List


Effective keyword query processing with an extended answer structure in large graph databases

International Journal of Web Information Systems ◽

10.1108/ijwis-11-2013-0030 ◽

2014 ◽

Vol 10 (1) ◽

pp. 65-84 ◽

Cited By ~ 1

Author(s):

Chang-Sup Park ◽

Sungchae Lim

Keyword(s):

Query Processing ◽

Search Algorithm ◽

Previous Method ◽

Search Method ◽

Graph Databases ◽

Information Need ◽

Keyword Query ◽

Processing Scheme ◽

Content Type ◽

Inverted List

Purpose – The paper aims to propose an effective method for processing keyword-based queries over graph-structured databases, which are widely used in applications such as XML, the semantic web, and social network services. To satisfy users' information needs, it proposes an extended answer structure for keyword queries, inverted list indexes on keywords and nodes, and query processing algorithms that exploit the inverted lists. The study aims to provide more effective and relevant answers to a given query than previous approaches, and to do so efficiently. Design/methodology/approach – A new measure of the relevance of nodes to a given keyword query is defined, and based on this measure a new answer tree structure is proposed that places no constraint on the number of keyword nodes chosen for each query keyword. For efficient query processing, an inverted list-style index is suggested that pre-computes connectivity and relevance information for the nodes in the graph. A query processing algorithm based on the pre-constructed inverted lists is then designed; it aggregates the list entries for each graph node relevant to the given keywords and identifies the top-k root nodes of the answer trees most relevant to the query. The basic search method is further enhanced with extended inverted lists, which store additional relevance information about related entries so that the relevance score of a node can be estimated more closely and the top-k answers found more efficiently. Findings – Experiments with real datasets and various test queries were conducted to evaluate the effectiveness and performance of the proposed methods in comparison with one of the previous approaches. The results show that the proposed methods with an extended answer structure produce more effective top-k results than the compared method for most of the queries, especially those with OR semantics. The extended inverted list and enhanced search algorithm are shown to substantially improve execution performance over the basic search method. Originality/value – This paper proposes a new extended answer structure and query processing scheme for keyword queries on graph databases that can satisfy users' information needs expressed as a keyword set with various semantics.
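The aggregation step described under Design/methodology/approach can be illustrated with a simplified sketch. It assumes each inverted list maps a keyword to (node, relevance) entries and that a node's score is the plain sum of its per-keyword relevances. The paper's actual relevance measure and list layout are not given in the abstract, and the function name `top_k_roots` is hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_roots(inverted_lists, keywords, k, require_all=True):
    """Aggregate per-keyword list entries by node and return the k best nodes.

    require_all=True mimics AND semantics (a node must match every keyword);
    require_all=False mimics OR semantics (any matching keyword counts).
    """
    scores = defaultdict(float)
    matched = defaultdict(set)
    for kw in keywords:
        for node, relevance in inverted_lists.get(kw, []):
            scores[node] += relevance
            matched[node].add(kw)
    needed = set(keywords)
    candidates = [
        (score, node)
        for node, score in scores.items()
        if not require_all or matched[node] == needed
    ]
    return [node for _, node in heapq.nlargest(k, candidates)]
```

This version scans every list in full. The extended inverted lists mentioned in the abstract are intended to tighten score estimates so that the top-k answers can be found with less work.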


Design and Implementation of a Cache System in Web Search Engines

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.462-463.1106 ◽

2013 ◽

Vol 462-463 ◽

pp. 1106-1109

Author(s):

Hong Yuan Ma

Keyword(s):

Search Engine ◽

Search Engines ◽

Web Search ◽

Communication Architecture ◽

Design And Implementation ◽

Web Search Engine ◽

Inverted List ◽

Web Search Engines ◽

Application Processor ◽

Cache System

A Web search engine caches the results of queries that users issue frequently, which is an effective way to improve its efficiency. In this paper, we share experience from the design and implementation of a Web search engine cache system. We present three design principles: logical layer processing, an event-based communication architecture, and avoiding frequent data copying. We also introduce the architecture used in practice, which comprises a connection processor, an application processor, a query results caching processor, an inverted list caching processor, and a list intersection caching processor. Experiments are conducted on our cache system using a real Web search engine query log.
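A query results cache of the kind listed above can be sketched in a few lines. The abstract does not state the eviction policy, so least-recently-used (LRU) eviction is an assumption here, and the class name `QueryResultCache` is hypothetical.

```python
from collections import OrderedDict


class QueryResultCache:
    """Fixed-capacity cache from query string to result list, LRU eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first

    def get(self, query):
        if query not in self.entries:
            return None  # cache miss: the caller falls back to the index
        self.entries.move_to_end(query)  # mark as most recently used
        return self.entries[query]

    def put(self, query, results):
        self.entries[query] = results
        self.entries.move_to_end(query)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

The same structure could, in principle, hold inverted lists or list intersections keyed by term or term pair, which matches the other two caching processors named in the abstract.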


AN INVERTED LIST BASED APPROACH TO GENERATE OPTIMISED PATH IN DSR IN MANETS

International Journal of Computer Applications Technology and Research ◽

10.7753/ijcatr0203.1018 ◽

2013 ◽

Vol 2 (3) ◽

pp. 302-305

Author(s):

Kusum Lata ◽

Sophia Dhankhar

Keyword(s):

Inverted List


inverted list
Recently Published Documents

COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List

Efficient Geo-Textual Hybrid Indexing Techniques for Moving Objects and Queries

Selectivity Estimation on Set Containment Search

An Efficient and Effective Index Structure for Query Evaluation in Search Engines

An Efficient and Effective Index Structure for Query Evaluation in Search Engines

Inverted List Caching for Topical Index Shards

An Experimental Study of Bitmap Compression vs. Inverted List Compression

Effective keyword query processing with an extended answer structure in large graph databases

Design and Implementation of a Cache System in Web Search Engines

AN INVERTED LIST BASED APPROACH TO GENERATE OPTIMISED PATH IN DSR IN MANETS
