inverted index Latest Research Papers

Forward and Backward-Secure Range-Searchable Symmetric Encryption

Abstract Dynamic searchable symmetric encryption (DSSE) allows a client to query or update an outsourced encrypted database. Range queries are commonly needed. Previous range-searchable schemes either do not support updates natively (SIGMOD’16) or use file indexes of many long bit-vectors for distinct keywords, which only support toggling updates via homomorphically flipping the presence bit (ESORICS’18). We propose a generic upgrade of any (inverted-index) DSSE to support range queries (a.k.a. range DSSE), without homomorphic encryption, and a specific instantiation with a new trade-off that reduces client-side storage. Our schemes achieve forward security, an important property that mitigates file injection attacks. Moreover, we identify a variant of injection attacks against the first somewhat dynamic scheme (ESORICS’18). We also extend the definition of backward security to range DSSE and show that our schemes are compatible with a generic upgrade of backward security (CCS’17). We comprehensively analyze the computation and communication overheads, including implementation details of client-side index-related operations omitted by prior schemes. We show high empirical efficiency for million-scale databases over a million-scale keyword space.
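As a rough illustration of how a keyword-based inverted index can answer range queries at all, the sketch below uses a binary-tree (dyadic) covering of the value domain. This is a common textbook technique and an assumption here; the abstract does not say that the proposed schemes work this way. The sketch is unencrypted, and the names `keywords_for_value`, `dyadic_cover`, and `range_search` are hypothetical.

```python
from collections import defaultdict

def keywords_for_value(value, bits):
    """Tree nodes (level, prefix) on the path from the root to the leaf for `value`."""
    return [(level, value >> (bits - level)) for level in range(bits + 1)]

def dyadic_cover(lo, hi, bits):
    """Smallest set of tree nodes whose sub-ranges exactly tile [lo, hi]."""
    cover = []

    def visit(level, prefix):
        span = bits - level
        start = prefix << span
        end = ((prefix + 1) << span) - 1
        if end < lo or start > hi:
            return                      # node lies outside the range
        if lo <= start and end <= hi:
            cover.append((level, prefix))   # node lies fully inside
            return
        visit(level + 1, 2 * prefix)        # partial overlap: descend
        visit(level + 1, 2 * prefix + 1)

    visit(0, 0)
    return cover

def build_index(records, bits):
    """Plain inverted index: tree node -> set of record ids (no encryption here)."""
    index = defaultdict(set)
    for record_id, value in records.items():
        for node in keywords_for_value(value, bits):
            index[node].add(record_id)
    return index

def range_search(index, lo, hi, bits):
    """Answer a range query as a union of ordinary keyword lookups."""
    result = set()
    for node in dyadic_cover(lo, hi, bits):
        result |= index.get(node, set())
    return result
```

Each record is indexed under `bits + 1` keywords, and a range query touches at most about `2 * bits` keywords, so any single-keyword index could in principle serve as the back end.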

Using Inverted Index for Fingerprint Search

Journal of Information and Data Management ◽

10.5753/jidm.2021.1918 ◽

2021 ◽

Vol 12 (5) ◽

Author(s):

Johnny Marcos S. Soares ◽

Luciano Barbosa ◽

Paulo Antonio Leal Rego ◽

Regis Pires Magalhães ◽

Jose Antônio F. de Macêdo

Keyword(s):

Information Retrieval ◽

Penetration Rate ◽

Locality Sensitive Hashing ◽

Inverted Index ◽

Text Documents ◽

Data Set ◽

Textual Information ◽

Data Indexing ◽

Biometric Information ◽

Fingerprint Data

Fingerprints are the most widely used biometric information for identifying people. As the volume of fingerprint data grows, indexing techniques become essential for efficient search. In this work, we devise a solution that applies a traditional inverted index, widely used in textual information retrieval, to fingerprint search. It first converts fingerprints to text documents using techniques such as Minutia Cylinder-Code and Locality-Sensitive Hashing, and then indexes them in inverted files. In the experimental evaluation, our approach obtained an error rate of 0.42% at a penetration rate of 10% on the FVC2002 DB1a data set, surpassing some established methods.
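The pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each fingerprint has already been reduced to a list of fixed-length binary vectors (a stand-in for Minutia Cylinder-Code cylinders, which are not computed here), uses simple bit-sampling as the locality-sensitive hash, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def make_hashes(n_bits, n_hashes, bits_per_hash, seed=0):
    """Bit-sampling LSH: each hash reads a fixed random subset of bit positions."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(n_bits), bits_per_hash)) for _ in range(n_hashes)]

def to_terms(vectors, hashes):
    """Turn binary vectors into text-like terms, one term per (hash, vector) pair."""
    terms = set()
    for vec in vectors:
        for hash_id, positions in enumerate(hashes):
            bits = "".join(str(vec[p]) for p in positions)
            terms.add(f"h{hash_id}_{bits}")
    return terms

def build_index(fingerprints, hashes):
    """Inverted file: term -> set of fingerprint ids containing that term."""
    index = defaultdict(set)
    for fp_id, vectors in fingerprints.items():
        for term in to_terms(vectors, hashes):
            index[term].add(fp_id)
    return index

def search(index, query_vectors, hashes, top_k=3):
    """Rank candidates by the number of terms shared with the query."""
    votes = Counter()
    for term in to_terms(query_vectors, hashes):
        for fp_id in index.get(term, ()):
            votes[fp_id] += 1
    return [fp_id for fp_id, _ in votes.most_common(top_k)]
```

Only fingerprints that share at least one term with the query are ever touched, which is what keeps the penetration rate low in this kind of design.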

DMSE: Dynamic Multi-keyword Search Encryption based on inverted index

Journal of Systems Architecture ◽

10.1016/j.sysarc.2021.102255 ◽

2021 ◽

pp. 102255

Author(s):

Yanrong Liang ◽

Yanping Li ◽

Kai Zhang ◽

Lina Ma

Keyword(s):

Keyword Search ◽

Inverted Index

Deep SIMBAD: Active Landmark-based Self-localization Using Similarity-based Scene Descriptor

10.31224/osf.io/8uf9t ◽

2021 ◽

Author(s):

Kanji Tanaka

Keyword(s):

Robot Control ◽

Nearest Neighbor ◽

Inverted Index ◽

Localization Task ◽

State Recognition ◽

Q Learning ◽

Recognition Ability ◽

Ill Posed ◽

Visual Place Recognition ◽

Next Best View

Landmark-based robot self-localization has attracted recent research interest as an efficient, maintenance-free approach to visual place recognition (VPR) across domains (e.g., times of day, weather, seasons). However, landmark-based self-localization can be an ill-posed problem for a passive observer (e.g., manual robot control), as many viewpoints may not provide an effective landmark view. Here, we consider the active self-localization task performed by an active observer, and present a novel reinforcement-learning (RL)-based next-best-view (NBV) planner. Our contributions are summarized as follows. (1) SIMBAD-based VPR: We present a compact scene descriptor based on landmark ranking by introducing a deep-learning extension of similarity-based pattern recognition (SIMBAD). (2) VPR-to-NBV knowledge transfer: We tackle the challenge of RL under uncertainty (i.e., active self-localization) by transferring the VPR's state recognition ability to NBV. (3) NNQL-based NBV: We view the available VPR as the experience database by adapting a nearest-neighbor-based approximation of Q-learning (NNQL). The result is an extremely compact data structure that compresses both the VPR and NBV modules into a single incremental inverted index. Experiments using the public NCLT dataset validate the effectiveness of the proposed approach.

THE PROCESS OF DETECTING BLOCKS WITH REPETITIONS AND EXCESS BUILDING USING A LANGUAGE-INDEPENDENT INCREASE DETECTOR

HERALD of Khmelnytskyi national university ◽

10.31891/2307-5732-2021-297-3-39-45 ◽

2021 ◽

Vol 297 (3) ◽

pp. 39-45

Author(s):

N. PRAVORSKA ◽

O. BARMAC ◽

D. MEDZATIY ◽

T. SHESTAKEVYCH ◽

◽

...

Keyword(s):

Data Structure ◽

Programming Language ◽

Simple Algorithm ◽

Inverted Index ◽

Graphical Methods ◽

Program Code ◽

Executable Code ◽

Software Code ◽

Automated Tools ◽

Global Data

To avoid software malfunctions caused by errors, which occur even in code written by professionals, a number of automated tools are used to evaluate software code. A variety of detectors are commonly used to find errors that arise from duplicate blocks of executable code. Such a detector is most useful when it does not depend on the programming language and uses a simple algorithm for finding cloned blocks of code. The language-independent repetition detector is built on a clone index, a global data structure that resembles a typical inverted index. Because the approach works on the text, the method can serve as a basis for language-independent research. In recent years, additional methods that analyze source and executable code at a finer granularity have become increasingly popular, and there are attempts to avoid unnecessary recalculation by transferring information between versions. After reviewing the work of researchers dealing with this problem, we propose an approach to improve methods for detecting repetition and redundancy in program code, based on a language-independent incremental repetition detector (MNIDP). Most additional research relies on tree-like and graphical methods, i.e., methods that are strictly dependent on the programming language. The MNIDP approach instead takes the text as its basis. The technique is not strictly language-independent, because it includes a tokenization stage, but the desired result is achieved with minor adjustments. The paper provides a detailed analysis of the detector's internal components and explains how it works at each stage of the detection process.
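A clone index of the kind described above can be sketched in a few lines. This is an illustrative assumption about its general shape, not the MNIDP detector itself: the regex tokenizer is deliberately crude, the window size is arbitrary, and the names `build_clone_index`, `find_clones`, and `remove_file` are hypothetical.

```python
import hashlib
import re
from collections import defaultdict

# Crude, language-agnostic tokenizer: identifiers, integers, or single symbols.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def tokenize(source):
    return TOKEN_RE.findall(source)

def build_clone_index(files, window=5):
    """Map the hash of every `window`-token chunk to the places where it occurs."""
    index = defaultdict(list)
    for name, source in files.items():
        tokens = tokenize(source)
        for start in range(len(tokens) - window + 1):
            chunk = " ".join(tokens[start:start + window])
            key = hashlib.sha1(chunk.encode("utf-8")).hexdigest()
            index[key].append((name, start))
    return index

def find_clones(index):
    """A chunk is a clone candidate when it occurs at more than one location."""
    return [locations for locations in index.values() if len(locations) > 1]

def remove_file(index, name):
    """Incremental update: drop one file's entries without rebuilding the index."""
    for key in list(index):
        index[key] = [loc for loc in index[key] if loc[0] != name]
        if not index[key]:
            del index[key]
```

Like an inverted index in text retrieval, the structure maps a "term" (here a hashed token window) to its postings, so a changed file can be handled by removing its postings and re-indexing only that file.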

Neural methods for effective, efficient, and exposure-aware information retrieval

ACM SIGIR Forum ◽

10.1145/3476415.3476434 ◽

2021 ◽

Vol 55 (1) ◽

pp. 1-2

Author(s):

Bhaskar Mitra

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Large Scale ◽

Web Search ◽

Real Life ◽

Inverted Index ◽

Information Need ◽

Product Model ◽

Performance Improvements ◽

Deep Model

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking of documents (or short passages) in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms not seen during training, such as a person's name or a product model number, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections, such as the document index of a commercial Web search engine, containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is developing the Duet principle [Mitra et al., 2017], which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into any arbitrary deep model, which enables large-scale precomputation and the use of an inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
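To make the query term independence idea concrete: if a model scores a query as the sum of per-term scores against the document, every term-document score can be computed offline and stored in posting lists. The sketch below is a toy illustration under that assumption; `toy_term_score` is a hypothetical stand-in for a neural scorer, and none of the names come from the thesis.

```python
from collections import defaultdict

def toy_term_score(term, doc_tokens):
    """Stand-in for a learned model's score of one query term against one document."""
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(term) / len(doc_tokens)

def precompute_index(docs, vocabulary, term_score=toy_term_score):
    """Offline step: store one score per (term, document) pair in an inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for term in vocabulary:
            score = term_score(term, tokens)
            if score > 0:
                index[term][doc_id] = score
    return index

def retrieve(index, query, top_k=10):
    """Online step: sum the precomputed per-term scores; no model call is needed."""
    totals = defaultdict(float)
    for term in query.lower().split():
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

The design choice is the usual trade-off: the model loses the ability to reason about interactions between query terms, and in exchange the expensive scoring happens once at indexing time.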

Top-k Graph Similarity Search Based on Hierarchical Inverted Index

2021 11th International Conference on Information Science and Technology (ICIST) ◽

10.1109/icist52614.2021.9440632 ◽

2021 ◽

Author(s):

Zhongqing Wang ◽

Yan Yang ◽

Yingli Zhong

Keyword(s):

Similarity Search ◽

Inverted Index ◽

Graph Similarity

ELII: A novel inverted index for fast temporal query, with application to a large Covid-19 EHR dataset

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2021.103744 ◽

2021 ◽

Vol 117 ◽

pp. 103744

Author(s):

Yan Huang ◽

Xiaojin Li ◽

Guo-Qiang Zhang

Keyword(s):

Inverted Index ◽

Temporal Query

Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm

Entropy ◽

10.3390/e23030296 ◽

2021 ◽

Vol 23 (3) ◽

pp. 296

Author(s):

Andrzej Chmielowiec ◽

Paweł Litwin

Keyword(s):

Compression Algorithm ◽

Binary Sequences ◽

Inverted Index ◽

Small Decrease ◽

Database Applications ◽

Main Application ◽

Fixed Length ◽

Number Of Zeros ◽

Index Compression ◽

Inverted Index Compression

This article deals with the compression of binary sequences with a given number of ones, which can also be considered as a list of indexes of a given length. The first part of the article shows that the entropy $H$ of random $n$-element binary sequences with exactly $k$ elements equal to one satisfies the inequalities $k \log_2(0.48 \cdot n/k) < H < k \log_2(2.72 \cdot n/k)$. Based on this result, we propose a simple coding using fixed-length words. Its main application is the compression of random binary sequences with a large disproportion between the number of zeros and the number of ones. Importantly, the proposed solution allows for much faster decompression than Golomb-Rice coding, with a relatively small decrease in compression efficiency. The proposed algorithm can be particularly useful for database applications in which decompression speed is much more important than the degree of index list compression.
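The entropy bounds quoted above are easy to check numerically. The sketch below assumes the sequences are drawn uniformly from all $n$-element sequences with exactly $k$ ones, so that $H = \log_2 \binom{n}{k}$; the function names are hypothetical.

```python
import math

def entropy_bits(n, k):
    """Entropy of a uniform choice among all n-bit sequences with exactly k ones."""
    return math.log2(math.comb(n, k))

def entropy_bounds(n, k):
    """Lower and upper bounds on the entropy as stated in the abstract."""
    lower = k * math.log2(0.48 * n / k)
    upper = k * math.log2(2.72 * n / k)
    return lower, upper

if __name__ == "__main__":
    for n, k in [(100, 3), (10_000, 50), (1_000_000, 1_000)]:
        lower, upper = entropy_bounds(n, k)
        h = entropy_bits(n, k)
        print(f"n={n} k={k}: {lower:.1f} < {h:.1f} < {upper:.1f}")
```

Dividing these figures by $k$ gives the number of bits per stored index, which is the budget that any fixed-length word coding has to work within.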

inverted index
Recently Published Documents

A multi-keyword parallel ciphertext retrieval scheme based on inverted index under the robot distributed system

Forward and Backward-Secure Range-Searchable Symmetric Encryption

Using Inverted Index for Fingerprint Search

DMSE: Dynamic Multi-keyword Search Encryption based on inverted index

Deep SIMBAD: Active Landmark-based Self-localization Using Similarity-based Scene Descriptor

THE PROCESS OF DETECTING BLOCKS WITH REPETITIONS AND EXCESS BUILDING USING A LANGUAGE-INDEPENDENT INCREASE DETECTOR

Neural methods for effective, efficient, and exposure-aware information retrieval

Top-k Graph Similarity Search Based on Hierarchical Inverted Index

ELII: A novel inverted index for fast temporal query, with application to a large Covid-19 EHR dataset

Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm
