Stream-Based Lossless Data Compression Applying Adaptive Entropy Coding for Hardware-Based Implementation

Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 159 ◽  
Author(s):  
Shinichi Yamagiwa ◽  
Eisaku Hayakawa ◽  
Koichi Marumo

Driven by the strong demand for very high-speed processor I/O, the physical performance of hardware I/O has grown drastically over the past decade. However, recent Big Data applications still demand larger I/O bandwidth and lower latency. Because raw I/O performance no longer improves so drastically, it is time to consider other ways to increase it. To overcome this challenge, we focus on lossless data compression technology, which decreases the amount of data travelling through the communication path. Recent Big Data applications treat data streams that flow continuously and, because of their speed, never allow processing to stall. An elegant hardware-based data compression technology is therefore demanded. This paper proposes a novel lossless data compression scheme called ASE coding. It encodes streaming data using an entropy coding approach: ASE coding instantly assigns the fewest bits to each piece of compressed data according to the number of occupied entries in a look-up table. This paper describes the detailed mechanism of ASE coding. Furthermore, performance evaluations demonstrate that ASE coding adaptively shrinks streaming data and works with a small amount of hardware resources, without stalling or buffering any part of the data stream.
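
A minimal sketch of the table-driven idea behind such coders may help: symbols already present in a small look-up table are replaced by an index whose width is the fewest bits needed to address the currently occupied entries, while unseen symbols are emitted raw and registered. All names and details below are illustrative assumptions, not the authors' exact ASE algorithm.

```python
import math

class AdaptiveTableCoder:
    """Toy table-driven adaptive coder (illustrative, not the ASE spec)."""

    def __init__(self, table_size=16):
        self.table = []                  # look-up table of recent symbols
        self.table_size = table_size

    def index_bits(self):
        # fewest bits needed to address the currently occupied entries
        return max(1, math.ceil(math.log2(max(2, len(self.table)))))

    def encode(self, stream):
        out = []                         # (kind, value, bit_width) tokens
        for sym in stream:
            if sym in self.table:
                # hit: emit a compact index whose width adapts to table fill
                out.append(("hit", self.table.index(sym), self.index_bits()))
            else:
                # miss: emit the raw 8-bit symbol and register it
                out.append(("miss", sym, 8))
                if len(self.table) >= self.table_size:
                    self.table.pop(0)    # evict the oldest entry
                self.table.append(sym)
        return out

coder = AdaptiveTableCoder()
tokens = coder.encode(b"abababab")
print(tokens)  # two 8-bit misses, then 1-bit index hits
```

Because each symbol is encoded the moment it arrives and the table updates in place, nothing in the stream needs to be buffered, which is the property the paper emphasizes for hardware implementation.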

2010 ◽  
Vol 56 (4) ◽  
pp. 351-355
Author(s):  
Marcin Rodziewicz

Joint Source-Channel Coding in Dictionary Methods of Lossless Data Compression

Limitations on the memory and resources of communication systems require powerful data compression methods. Decompression of a compressed data stream is very sensitive to errors that arise during transmission over noisy channels, so error-correction coding is also required. One solution to this problem is the application of joint source and channel coding. This paper describes methods of joint source-channel coding based on the popular data compression algorithms LZ'77 and LZSS. These methods can introduce some error resiliency into a compressed data stream without degrading the compression ratio. We analyze joint source and channel coding algorithms based on these compression methods and present novel extensions of them. We also present simulation results showing the usefulness and achievable quality of the analyzed algorithms.
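
For reference, the dictionary methods the paper builds on work roughly as in the sketch below: a sliding window is searched for the longest match, which is emitted as an (offset, length) reference, and unmatched bytes are emitted as literals. This is a plain LZSS-style toy; it assumes nothing about the paper's joint source-channel extensions.

```python
def lzss_compress(data, window=255, min_match=3, max_match=18):
    """Minimal LZSS sketch emitting ('lit', byte) or ('ref', offset, length)."""
    i, out = 0, []
    while i < len(data):
        best_len, best_off = 0, 0
        start = max(0, i - window)
        # search the sliding window for the longest match
        for j in range(start, i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            out.append(("ref", best_off, best_len))   # back-reference token
            i += best_len
        else:
            out.append(("lit", data[i]))              # literal byte
            i += 1
    return out

print(lzss_compress(b"abcabcabcabc"))
# [('lit', 97), ('lit', 98), ('lit', 99), ('ref', 3, 9)]
```

The joint source-channel variants discussed in the paper exploit redundancy in how such tokens are chosen and encoded to detect or recover from channel errors.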


Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 348 ◽  
Author(s):  
Lei Li ◽  
Anand Vidyashankar ◽  
Guoqing Diao ◽  
Ejaz Ahmed

Big data and streaming data are encountered in a variety of contemporary applications in business and industry. In such cases, it is common to use random projections to reduce the dimension of the data, yielding compressed data. These data, however, possess various anomalies, such as heterogeneity, outliers, and round-off errors, which are hard to detect due to volume and processing challenges. This paper describes a new robust and efficient methodology, based on the Hellinger distance, for analyzing the compressed data. Using large-sample methods and numerical experiments, it is demonstrated that routine use of the robust estimation procedure is feasible. The role of double limits in understanding efficiency and robustness, which is of independent interest, is also brought out.
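
The compression step referred to here is a standard random projection; a minimal sketch follows, with the dimensions and the Gaussian projection chosen purely for illustration. The paper's contribution, the robust Hellinger-distance estimation, would then operate on the compressed rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10_000, 500, 50                 # samples, original dim, reduced dim
X = rng.normal(size=(n, d))               # one batch of (streaming) data
R = rng.normal(size=(d, k)) / np.sqrt(k)  # Gaussian random projection matrix
X_compressed = X @ R                      # compressed data, shape (n, k)
# Pairwise geometry is approximately preserved (Johnson-Lindenstrauss),
# which is what makes downstream estimation on X_compressed sensible.
print(X.shape, "->", X_compressed.shape)
```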


2016 ◽  
Vol 78 (10) ◽  
Author(s):  
Rizwan Patan ◽  
Rajasekhara Babu M.

Achieving high energy efficiency for streaming data without degrading response time requires an energy-efficient, stream-optimized model for Big Data stream computing. This paper proposes an energy-efficient, traffic-aware resource scheduling and re-streaming stream structure, entitled re-storm, to replace the default scheduling strategy of Storm. The model is described in three parts. First, a mathematical relation among energy consumption, low response time, and high-traffic streams is established. Second, various approaches are provided for reducing energy consumption without affecting response time, yielding high overall performance in Big Data stream computing. Third, re-storm deploys energy-efficient, traffic-aware scheduling on the Storm platform: it allocates worker nodes online using a hot-swapping technique, consolidating tasks for energy savings through graph partitioning. Moreover, re-storm achieves high energy efficiency and low response time at all data-arrival speeds and is well suited to allocating worker nodes in a Storm topology. Experimental results compare re-storm against existing strategies that address energy issues without reducing response time, across different data stream speed levels; they show that re-storm achieves higher energy efficiency and lower response time than all existing approaches.
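
As a rough illustration of traffic-aware consolidation through graph partitioning, the toy placement below co-locates the task pairs exchanging the most tuples on the same worker, subject to a capacity cap. It is a hypothetical stand-in for re-storm's scheduler, not the published algorithm.

```python
def place_tasks(edges, n_workers, capacity=2):
    """edges: list of (task_a, task_b, traffic). Returns {task: worker}."""
    placement, loads = {}, [0] * n_workers
    # greedily handle the heaviest-traffic edges first
    for a, b, traffic in sorted(edges, key=lambda e: -e[2]):
        for t in (a, b):
            if t in placement:
                continue
            partner = b if t == a else a
            w = placement.get(partner)          # prefer the partner's worker
            if w is None or loads[w] >= capacity:
                w = loads.index(min(loads))     # else the least-loaded worker
            placement[t] = w
            loads[w] += 1
    return placement

edges = [("spout", "parse", 100), ("parse", "count", 80), ("count", "sink", 10)]
print(place_tasks(edges, n_workers=2))
# {'spout': 0, 'parse': 0, 'count': 1, 'sink': 1}
```

Keeping the heaviest edge ("spout" to "parse") on one worker removes its inter-node traffic, which is the kind of consolidation that translates into energy savings.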


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Bin He ◽  
Yonggang Li

Wireless sensor networks (WSNs) are increasingly being utilized to monitor the structural health of underground subway tunnels, showing many promising advantages over traditional monitoring schemes. Meanwhile, as the network size increases, the system becomes incapable of handling big data efficiently in terms of data communication, transmission, and storage. Considered a feasible solution to these issues, data compression can reduce the volume of data travelling between sensor nodes. In this paper, an optimization algorithm based on spatial and temporal data compression is proposed to cope with these issues in WSNs deployed in the underground tunnel environment. Spatial and temporal correlation functions are introduced for data compression and data recovery. It is verified that the proposed algorithm is applicable to WSNs in underground tunnels.
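
The split between compression at the node and recovery at the sink can be pictured with a toy temporal scheme: a node transmits a sample only when it deviates from the last transmitted value by more than a tolerance, and the sink reconstructs the series by holding the last received value. The paper's spatial and temporal correlation functions are richer; the sketch below, with hypothetical names and thresholds, shows only the general mechanism.

```python
def temporal_compress(readings, eps=0.5):
    """Node side: suppress samples within eps of the last transmitted value."""
    sent, last = [], None
    for t, v in enumerate(readings):
        if last is None or abs(v - last) > eps:
            sent.append((t, v))      # transmit the time-stamped sample
            last = v
    return sent

def temporal_recover(sent, length):
    """Sink side: hold the last transmitted value between samples."""
    recovered, last, pairs = [], 0.0, dict(sent)
    for t in range(length):
        last = pairs.get(t, last)
        recovered.append(last)
    return recovered

readings = [20.0, 20.1, 20.2, 21.5, 21.6, 23.0]
sent = temporal_compress(readings)
print(sent)                                   # 3 of 6 samples transmitted
print(temporal_recover(sent, len(readings)))  # recovered within eps
```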


2016 ◽  
Vol 15 (8) ◽  
pp. 6991-6998
Author(s):  
Idris Hanafi ◽  
Amal Abdel-Raouf

The increasing amount and size of data being handled by data analytic applications running on Hadoop has created a need for faster data processing. One effective method for handling big data sizes is compression. Data compression not only makes network I/O processing faster but also provides better utilization of resources. However, this approach defeats one of Hadoop's main purposes: the parallelism of map and reduce tasks. The number of map tasks created is determined by the size of the file, so compressing a large file reduces the number of mappers, which in turn decreases parallelism; consequently, standard Hadoop takes longer to process the data. In this paper, we propose the design and implementation of a Parallel Compressed File Decompressor (P-Codec) that improves the performance of Hadoop when processing compressed data. P-Codec includes two modules. The first module decompresses data upon retrieval by a data node during the phase of uploading the data to the Hadoop Distributed File System (HDFS); this reduces job runtime by removing the burden of decompression during the MapReduce phase. The second module is a decompressed-map-task divider that increases parallelism by dynamically changing the map-task split sizes based on the size of the final decompressed block. Our experimental results using five different MapReduce benchmarks show an average improvement of approximately 80% compared to standard Hadoop.
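
The second module's split-size adjustment can be pictured as below: once the final decompressed size of a block is known, it is divided into near-equal splits of roughly one HDFS block each, so the number of mappers again tracks the uncompressed data size. The constants and function names are illustrative assumptions, not P-Codec's actual code.

```python
HDFS_BLOCK = 128 * 1024 * 1024      # assumed 128 MiB target per mapper

def split_sizes(decompressed_len, target=HDFS_BLOCK):
    """Divide one decompressed block into near-equal map splits."""
    n_splits = max(1, -(-decompressed_len // target))   # ceiling division
    base = decompressed_len // n_splits
    sizes = [base] * n_splits
    sizes[-1] += decompressed_len - base * n_splits     # absorb remainder
    return sizes

# a block that decompresses to 1 GiB yields 8 mappers instead of 1
print(len(split_sizes(1024 * 1024 * 1024)))
```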

