Evaluation of lossless and lossy algorithms for the compression of scientific datasets in netCDF-4 or HDF5 files

2019 ◽  
Vol 12 (9) ◽  
pp. 4099-4113
Author(s):  
Xavier Delaunay ◽  
Aurélie Courtois ◽  
Flavien Gouillon

Abstract. The increasing volume of scientific datasets requires the use of compression to reduce data storage and transmission costs, especially for the oceanographic or meteorological datasets generated by Earth observation mission ground segments. These data are mostly produced in netCDF files. Indeed, the netCDF-4/HDF5 file formats are widely used throughout the global scientific community because of the useful features they offer. HDF5 in particular offers a dynamically loaded filter plugin so that users can write compression/decompression filters, for example, and process the data before reading or writing them to disk. This study evaluates lossy and lossless compression/decompression methods through netCDF-4 and HDF5 tools on analytical and real scientific floating-point datasets. We also introduce the Digit Rounding algorithm, a new relative error-bounded data reduction method inspired by the Bit Grooming algorithm. The Digit Rounding algorithm offers a high compression ratio while keeping a given number of significant digits in the dataset. It achieves a higher compression ratio than the Bit Grooming algorithm with slightly lower compression speed.
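The idea of keeping a given number of significant digits can be illustrated with a small sketch. The function below is not the published Digit Rounding algorithm, which works on the binary mantissa and adapts the number of bits kept to each value's exponent so that the rounded data compress better with a lossless coder; it only shows the simpler notion of relative error-bounded rounding to a fixed number of significant decimal digits. The function name and parameters are illustrative.

```python
import numpy as np

def round_significant(values, nsd):
    """Round each value so that roughly `nsd` significant decimal digits
    are preserved (relative error-bounded reduction).

    Simplified illustration only; the published Digit Rounding algorithm
    instead chooses how many mantissa bits to keep per value.
    """
    values = np.asarray(values, dtype=np.float64)
    nonzero = values != 0
    # Decimal order of magnitude of each nonzero value.
    magnitude = np.zeros_like(values)
    magnitude[nonzero] = np.floor(np.log10(np.abs(values[nonzero])))
    # Scale so the first `nsd` digits sit left of the decimal point,
    # round to the nearest integer, then scale back.
    scale = 10.0 ** (nsd - 1 - magnitude)
    return np.round(values * scale) / scale

# Keeping 3 significant digits before applying a lossless compressor.
data = np.array([3.14159265, 0.00271828, 123456.789])
print(round_significant(data, 3))  # -> 3.14, 0.00272, 123000.0
```

After such a rounding step the values carry far less entropy, so a subsequent lossless codec such as Deflate reaches a noticeably higher compression ratio.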

2018 ◽  
Author(s):  
Xavier Delaunay ◽  
Aurélie Courtois ◽  
Flavien Gouillon

Abstract. The increasing volume of scientific datasets requires the use of compression to reduce data storage and transmission costs, especially for the oceanographic or meteorological datasets generated by Earth observation mission ground segments. These data are mostly produced as NetCDF files. Indeed, the NetCDF-4/HDF5 file formats are widely used throughout the global scientific community because of the useful features they offer. In particular, HDF5 provides a dynamically loaded filter plugin functionality that allows users to write filters, such as compression/decompression filters, to process the data before reading or writing them to disk. In this work, we evaluate the performance of lossy and lossless compression/decompression methods through NetCDF-4 and HDF5 tools on analytical and real scientific floating-point datasets. We also introduce the Digit Rounding algorithm, a new relative error-bounded data reduction method inspired by the Bit Grooming algorithm. The Digit Rounding algorithm allows a high compression ratio while preserving a given number of significant digits in the dataset. It achieves a higher compression ratio than the Bit Grooming algorithm while keeping a similar compression speed.
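As a concrete illustration of the lossless compression features mentioned above, the sketch below enables the HDF5 deflate and shuffle filters on a netCDF-4 variable through the netCDF4-python bindings. The file name, dimension sizes, and variable contents are placeholders; a lossy method such as Digit Rounding would additionally be registered as an HDF5 dynamically loaded filter, which is not shown here.

```python
import numpy as np
from netCDF4 import Dataset

# Minimal sketch: writing a netCDF-4 variable with the deflate (zlib) and
# shuffle filters enabled. File name and dimensions are placeholders.
with Dataset("example_sst.nc", "w", format="NETCDF4") as ds:
    ds.createDimension("lat", 180)
    ds.createDimension("lon", 360)
    # shuffle=True applies the HDF5 byte-shuffle filter before deflate,
    # which usually improves the compression ratio on floating-point data.
    # complevel ranges from 1 (fastest) to 9 (smallest output).
    sst = ds.createVariable("sst", "f4", ("lat", "lon"),
                            zlib=True, complevel=4, shuffle=True)
    sst.units = "degree_Celsius"
    sst[:, :] = np.random.default_rng(0).normal(15.0, 5.0, size=(180, 360))
```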


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1817
Author(s):  
Jiawen Xue ◽  
Li Yin ◽  
Zehua Lan ◽  
Mingzhu Long ◽  
Guolin Li ◽  
...  

This paper proposes a novel 3D discrete cosine transform (DCT) based image compression method for medical endoscopic applications. Because of the high correlation among the color components of wireless capsule endoscopy (WCE) images, the original 2D Bayer data pattern is reconstructed into a new 3D data pattern, and a 3D DCT is applied to compress the 3D data with a high compression ratio and high quality. To keep the computational complexity of the 3D DCT low, an optimized multiplication-free 4-point DCT butterfly structure is proposed. To suit the unique characteristics of the 3D data pattern, the quantization and zigzag scan are adapted accordingly. To further improve the visual quality of decompressed images, a frequency-domain filter is proposed to eliminate blocking artifacts adaptively. Experiments show that our method attains an average compression ratio (CR) of 22.94:1 with a peak signal-to-noise ratio (PSNR) of 40.73 dB, outperforming state-of-the-art methods.
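To make the transform-and-quantize pipeline concrete, here is a small sketch of 3D-DCT block coding on a 4x4x4 cube. It uses SciPy's general dctn/idctn rather than the paper's multiplication-free 4-point butterfly, and the uniform quantization step is an arbitrary placeholder, so it only illustrates the principle of energy compaction followed by quantization.

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(4, 4, 4)).astype(np.float64)

# Forward 3D DCT (type-II, orthonormal): concentrates the block's energy
# in a few low-frequency coefficients.
coeffs = dctn(block, type=2, norm="ortho")

# Uniform quantization: small high-frequency coefficients round to zero,
# so the block compresses well with the subsequent zigzag scan and
# entropy coding (not shown).
step = 8.0
quantized = np.round(coeffs / step)

# Decoder side: dequantize and invert the transform.
reconstructed = idctn(quantized * step, type=2, norm="ortho")
mse = np.mean((block - reconstructed) ** 2)
psnr = 10.0 * np.log10(255.0 ** 2 / mse)
print(f"nonzero coefficients: {np.count_nonzero(quantized)}/64, PSNR: {psnr:.1f} dB")
```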


2019 ◽  
Vol 28 (06) ◽  
pp. 1950106
Author(s):  
Qian Dong ◽  
Bing Li

Hardware-based dictionary compression is widely adopted to meet the high-speed requirements of real-time data processing. A hash function helps manage a large dictionary and improves the compression ratio, but it is prone to collisions, so some phrases returned by the match search are not true matches. This paper presents a novel match search approach called dual chaining hash refining, which improves the efficiency of the match search. Experimental results show that our method has a clear advantage in compression speed over previously published approaches that use a single hash function.
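The idea of filtering hash collisions during match search can be sketched in software as follows. The abstract gives no implementation details of the hardware design, so this is only a hedged illustration of the general principle: candidates retrieved through a primary hash chain are rejected by a second, independent hash before the byte-wise comparison. All names, hash functions, and constants below are invented for the example.

```python
from collections import defaultdict

MIN_MATCH = 4  # minimum phrase length considered a match

def h1(phrase: bytes) -> int:
    """Primary hash: selects the dictionary chain (illustrative only)."""
    return (phrase[0] * 33 + phrase[1] * 7 + phrase[2]) & 0xFFF

def h2(phrase: bytes) -> int:
    """Secondary hash: cheaply filters chain entries that collided on h1."""
    return (phrase[1] * 131 + phrase[3]) & 0xFF

def find_match(data: bytes, pos: int, chains):
    """Return (distance, length) of the best match for data[pos:], if any."""
    phrase = data[pos:pos + MIN_MATCH]
    if len(phrase) < MIN_MATCH:
        return None
    best = None
    for cand, cand_h2 in chains.get(h1(phrase), ()):
        if cand_h2 != h2(phrase):  # reject false candidates early
            continue
        length = 0
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length >= MIN_MATCH and (best is None or length > best[1]):
            best = (pos - cand, length)
    return best

def insert(data: bytes, pos: int, chains) -> None:
    """Add the phrase starting at pos to its chain, tagged with its h2."""
    phrase = data[pos:pos + MIN_MATCH]
    if len(phrase) == MIN_MATCH:
        chains[h1(phrase)].append((pos, h2(phrase)))

# Tiny usage example.
chains = defaultdict(list)
text = b"abcdabcdabcd"
for i in range(8):
    insert(text, i, chains)
print(find_match(text, 8, chains))  # (8, 4): a 4-byte match found 8 bytes back
```

The design choice illustrated here is that discarding a false candidate costs a single integer comparison instead of a byte-wise read of the dictionary contents.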


2020 ◽  
Vol 262 ◽  
pp. 114560 ◽  
Author(s):  
Zhuyong Yang ◽  
Niranjan Miganakallu ◽  
Tyler Miller ◽  
Jeremy Worm ◽  
Jeffrey Naber ◽  
...  
