Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm

Entropy, 2021, Vol 23 (3), pp. 296
Author(s): Andrzej Chmielowiec, Paweł Litwin

This article deals with the compression of binary sequences containing a given number of ones, which can equivalently be viewed as index lists of a given length. The first part of the article shows that the entropy H of a random n-element binary sequence with exactly k elements equal to one satisfies the inequalities k·log₂(0.48·n/k) < H < k·log₂(2.72·n/k). Based on this result, we propose a simple coding that uses fixed-length words. Its main application is the compression of random binary sequences in which the number of zeros greatly exceeds the number of ones. Importantly, the proposed solution allows much faster decompression than Golomb-Rice coding, at the cost of a relatively small decrease in compression efficiency. The proposed algorithm can be particularly useful in database applications where decompression speed matters much more than the degree of index list compression.
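The inequalities quoted in the abstract can be checked numerically. The sketch below is not taken from the article: it assumes the sequences are drawn uniformly from all n-element binary sequences with exactly k ones, so that H = log₂ C(n, k), and the function names `entropy_bits` and `entropy_bounds` are hypothetical.

```python
from math import comb, log2


def entropy_bits(n: int, k: int) -> float:
    """Entropy in bits, ASSUMING a uniform choice among the C(n, k)
    binary sequences of length n with exactly k ones."""
    return log2(comb(n, k))


def entropy_bounds(n: int, k: int) -> tuple[float, float]:
    """Lower and upper bounds as quoted in the abstract:
    k*log2(0.48*n/k) < H < k*log2(2.72*n/k)."""
    lower = k * log2(0.48 * n / k)
    upper = k * log2(2.72 * n / k)
    return lower, upper


if __name__ == "__main__":
    # Sparse cases: far more zeros than ones, as in the abstract.
    for n, k in [(1_000, 10), (100_000, 100), (1_000_000, 1_000)]:
        lower, upper = entropy_bounds(n, k)
        h = entropy_bits(n, k)
        print(f"n={n}, k={k}: {lower:.1f} < {h:.1f} < {upper:.1f}")
```

Under this uniform-distribution assumption, both bounds grow like k·log₂(n/k), which suggests why a code spending a fixed number of bits per index can stay close to the entropy when ones are rare.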

2020, Vol 53 (6), pp. 1-36
Author(s): Giulio Ermanno Pibiri, Rossano Venturini

Author(s): V. Glory, S. Domnic

The inverted index is used in most Information Retrieval Systems (IRS) to achieve fast query response times, and compression schemes are applied to the inverted index to improve the efficiency of the IRS. In this chapter, the authors study and analyze various compression techniques used for indexing. They also present a new compression technique based on FastPFOR, called New FastPFOR. The storage structure and integer representation of the proposed method can improve its performance in both compression and decompression. A study of existing work shows that recent methods give good results in either compression or decoding, but not in both, so their decompression performance is often unsatisfactory. The authors propose New FastPFOR to achieve better decompression performance. To evaluate the proposed method, they run experiments on TREC collections. The results show that the proposed method can achieve better decompression performance than the existing techniques.


2019, Vol 13 (2), pp. 343-356
Author(s): Xingshen Song, Yuexiang Yang, Yu Jiang, Kun Jiang

2016, Vol 46 (12), pp. 3059-3072
Author(s): Jose Maria Luna, Alberto Cano, Mykola Pechenizkiy, Sebastian Ventura
