Stream-Based Lossless Data Compression Applying Adaptive Entropy Coding for Hardware-Based Implementation

Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 159 ◽  
Author(s):  
Shinichi Yamagiwa ◽  
Eisaku Hayakawa ◽  
Koichi Marumo

Driven by the strong demand for very high-speed processor I/O, the physical performance of hardware I/O has grown drastically over the past decade. However, recent Big Data applications still demand larger I/O bandwidth and lower latency. Because raw I/O performance no longer improves so drastically, it is time to consider other ways to increase it. To overcome this challenge, we focus on lossless data compression technology, which decreases the amount of data travelling through the communication path. Recent Big Data applications treat data streams that flow continuously and, because of their speed, never allow processing to stall. An elegant hardware-based data compression technology is therefore demanded. This paper proposes a novel lossless data compression scheme called ASE coding. It encodes streaming data using an entropy coding approach: ASE coding instantly assigns the fewest bits to each piece of compressed data according to the number of occupied entries in a look-up table. This paper describes the detailed mechanism of ASE coding. Furthermore, performance evaluations demonstrate that ASE coding adaptively shrinks streaming data and works with a small amount of hardware resources, without stalling or buffering any part of the data stream.
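
A minimal sketch of the table-driven idea behind such coders may help: symbols already present in a small look-up table are replaced by an index whose width is the fewest bits needed to address the currently occupied entries, while unseen symbols are emitted raw and registered. All names and details below are illustrative assumptions, not the authors' exact ASE algorithm.

```python
import math

class AdaptiveTableCoder:
    """Toy table-driven adaptive coder (illustrative, not the ASE spec)."""

    def __init__(self, table_size=16):
        self.table = []                  # look-up table of recent symbols
        self.table_size = table_size

    def index_bits(self):
        # fewest bits needed to address the currently occupied entries
        return max(1, math.ceil(math.log2(max(2, len(self.table)))))

    def encode(self, stream):
        out = []                         # (kind, value, bit_width) tokens
        for sym in stream:
            if sym in self.table:
                # hit: emit a compact index whose width adapts to table fill
                out.append(("hit", self.table.index(sym), self.index_bits()))
            else:
                # miss: emit the raw 8-bit symbol and register it
                out.append(("miss", sym, 8))
                if len(self.table) >= self.table_size:
                    self.table.pop(0)    # evict the oldest entry
                self.table.append(sym)
        return out

coder = AdaptiveTableCoder()
tokens = coder.encode(b"abababab")
print(tokens)  # two 8-bit misses, then 1-bit index hits
```

Because each symbol is encoded the moment it arrives and the table updates in place, nothing in the stream needs to be buffered, which is the property the paper emphasizes for hardware implementation.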

2010 ◽  
Vol 56 (4) ◽  
pp. 351-355
Author(s):  
Marcin Rodziewicz

Joint Source-Channel Coding in Dictionary Methods of Lossless Data Compression

Limitations on the memory and resources of communication systems require powerful data compression methods. Decompression of a compressed data stream is very sensitive to errors that arise during transmission over noisy channels, so error-correction coding is also required. One solution to this problem is the application of joint source and channel coding. This paper describes methods of joint source-channel coding based on the popular data compression algorithms LZ'77 and LZSS. These methods can introduce some error resiliency into a compressed data stream without degrading the compression ratio. We analyze joint source and channel coding algorithms based on these compression methods and present novel extensions of them. We also present simulation results showing the usefulness and achievable quality of the analyzed algorithms.
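
For reference, the dictionary methods the paper builds on work roughly as in the sketch below: a sliding window is searched for the longest match, which is emitted as an (offset, length) reference, and unmatched bytes are emitted as literals. This is a plain LZSS-style toy; it assumes nothing about the paper's joint source-channel extensions.

```python
def lzss_compress(data, window=255, min_match=3, max_match=18):
    """Minimal LZSS sketch emitting ('lit', byte) or ('ref', offset, length)."""
    i, out = 0, []
    while i < len(data):
        best_len, best_off = 0, 0
        start = max(0, i - window)
        # search the sliding window for the longest match
        for j in range(start, i):
            length = 0
            while (length < max_match and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            out.append(("ref", best_off, best_len))   # back-reference token
            i += best_len
        else:
            out.append(("lit", data[i]))              # literal byte
            i += 1
    return out

print(lzss_compress(b"abcabcabcabc"))
# [('lit', 97), ('lit', 98), ('lit', 99), ('ref', 3, 9)]
```

The joint source-channel variants discussed in the paper exploit redundancy in how such tokens are chosen and encoded to detect or recover from channel errors.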


Entropy ◽  
2019 ◽  
Vol 21 (4) ◽  
pp. 348 ◽  
Author(s):  
Lei Li ◽  
Anand Vidyashankar ◽  
Guoqing Diao ◽  
Ejaz Ahmed

Big data and streaming data are encountered in a variety of contemporary applications in business and industry. In such cases, it is common to use random projections to reduce the dimension of the data, yielding compressed data. These data, however, possess various anomalies, such as heterogeneity, outliers, and round-off errors, which are hard to detect due to volume and processing challenges. This paper describes a new robust and efficient methodology, based on the Hellinger distance, for analyzing the compressed data. Using large-sample methods and numerical experiments, it is demonstrated that routine use of the robust estimation procedure is feasible. The role of double limits in understanding efficiency and robustness, which is of independent interest, is also brought out.
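
The compression step referred to here is a standard random projection; a minimal sketch follows, with the dimensions and the Gaussian projection chosen purely for illustration. The paper's contribution, the robust Hellinger-distance estimation, would then operate on the compressed rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 10_000, 500, 50                 # samples, original dim, reduced dim
X = rng.normal(size=(n, d))               # one batch of (streaming) data
R = rng.normal(size=(d, k)) / np.sqrt(k)  # Gaussian random projection matrix
X_compressed = X @ R                      # compressed data, shape (n, k)
# Pairwise geometry is approximately preserved (Johnson-Lindenstrauss),
# which is what makes downstream estimation on X_compressed sensible.
print(X.shape, "->", X_compressed.shape)
```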


2016 ◽  
Vol 78 (10) ◽  
Author(s):  
Rizwan Patan ◽  
Rajasekhara Babu M.

Achieving high energy efficiency for streaming data without degrading response time requires an energy-efficient, stream-optimized model for Big Data stream computing. This paper proposes an energy-efficient, traffic-aware resource scheduling and re-streaming stream structure, entitled re-storm, to replace the default scheduling strategy of Storm. The model is described in three parts. First, a mathematical relation among energy consumption, low response time, and high-traffic streams is established. Second, various approaches are provided for reducing energy consumption without affecting response time, yielding high overall performance in Big Data stream computing. Third, re-storm deploys energy-efficient, traffic-aware scheduling on the Storm platform: it allocates worker nodes online using a hot-swapping technique, consolidating tasks for energy savings through graph partitioning. Moreover, re-storm achieves high energy efficiency and low response time at all data-arrival speeds and is well suited to allocating worker nodes in a Storm topology. Experimental results compare re-storm against existing strategies that address energy issues without reducing response time, across different data stream speed levels; they show that re-storm achieves higher energy efficiency and lower response time than all existing approaches.
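
As a rough illustration of traffic-aware consolidation through graph partitioning, the toy placement below co-locates the task pairs exchanging the most tuples on the same worker, subject to a capacity cap. It is a hypothetical stand-in for re-storm's scheduler, not the published algorithm.

```python
def place_tasks(edges, n_workers, capacity=2):
    """edges: list of (task_a, task_b, traffic). Returns {task: worker}."""
    placement, loads = {}, [0] * n_workers
    # greedily handle the heaviest-traffic edges first
    for a, b, traffic in sorted(edges, key=lambda e: -e[2]):
        for t in (a, b):
            if t in placement:
                continue
            partner = b if t == a else a
            w = placement.get(partner)          # prefer the partner's worker
            if w is None or loads[w] >= capacity:
                w = loads.index(min(loads))     # else the least-loaded worker
            placement[t] = w
            loads[w] += 1
    return placement

edges = [("spout", "parse", 100), ("parse", "count", 80), ("count", "sink", 10)]
print(place_tasks(edges, n_workers=2))
# {'spout': 0, 'parse': 0, 'count': 1, 'sink': 1}
```

Keeping the heaviest edge ("spout" to "parse") on one worker removes its inter-node traffic, which is the kind of consolidation that translates into energy savings.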


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Bin He ◽  
Yonggang Li

Wireless sensor networks (WSNs) are increasingly being utilized to monitor the structural health of underground subway tunnels, showing many promising advantages over traditional monitoring schemes. Meanwhile, as the network size increases, the system becomes incapable of handling big data efficiently in terms of data communication, transmission, and storage. Considered a feasible solution to these issues, data compression can reduce the volume of data travelling between sensor nodes. In this paper, an optimization algorithm based on spatial and temporal data compression is proposed to cope with these issues in WSNs deployed in the underground tunnel environment. Spatial and temporal correlation functions are introduced for data compression and data recovery. It is verified that the proposed algorithm is applicable to WSNs in underground tunnels.
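
The split between compression at the node and recovery at the sink can be pictured with a toy temporal scheme: a node transmits a sample only when it deviates from the last transmitted value by more than a tolerance, and the sink reconstructs the series by holding the last received value. The paper's spatial and temporal correlation functions are richer; the sketch below, with hypothetical names and thresholds, shows only the general mechanism.

```python
def temporal_compress(readings, eps=0.5):
    """Node side: suppress samples within eps of the last transmitted value."""
    sent, last = [], None
    for t, v in enumerate(readings):
        if last is None or abs(v - last) > eps:
            sent.append((t, v))      # transmit the time-stamped sample
            last = v
    return sent

def temporal_recover(sent, length):
    """Sink side: hold the last transmitted value between samples."""
    recovered, last, pairs = [], 0.0, dict(sent)
    for t in range(length):
        last = pairs.get(t, last)
        recovered.append(last)
    return recovered

readings = [20.0, 20.1, 20.2, 21.5, 21.6, 23.0]
sent = temporal_compress(readings)
print(sent)                                   # 3 of 6 samples transmitted
print(temporal_recover(sent, len(readings)))  # recovered within eps
```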


2016 ◽  
Vol 15 (8) ◽  
pp. 6991-6998
Author(s):  
Idris Hanafi ◽  
Amal Abdel-Raouf

The increasing amount and size of data being handled by data analytic applications running on Hadoop has created a need for faster data processing. One effective method for handling big data sizes is compression. Data compression not only makes network I/O processing faster but also provides better utilization of resources. However, this approach defeats one of Hadoop's main purposes: the parallelism of map and reduce tasks. The number of map tasks created is determined by the size of the file, so compressing a large file reduces the number of mappers, which in turn decreases parallelism; consequently, standard Hadoop takes longer to process the data. In this paper, we propose the design and implementation of a Parallel Compressed File Decompressor (P-Codec) that improves the performance of Hadoop when processing compressed data. P-Codec includes two modules. The first module decompresses data upon retrieval by a data node during the phase of uploading the data to the Hadoop Distributed File System (HDFS); this reduces job runtime by removing the burden of decompression during the MapReduce phase. The second module is a decompressed-map-task divider that increases parallelism by dynamically changing the map-task split sizes based on the size of the final decompressed block. Our experimental results using five different MapReduce benchmarks show an average improvement of approximately 80% compared to standard Hadoop.
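
The second module's split-size adjustment can be pictured as below: once the final decompressed size of a block is known, it is divided into near-equal splits of roughly one HDFS block each, so the number of mappers again tracks the uncompressed data size. The constants and function names are illustrative assumptions, not P-Codec's actual code.

```python
HDFS_BLOCK = 128 * 1024 * 1024      # assumed 128 MiB target per mapper

def split_sizes(decompressed_len, target=HDFS_BLOCK):
    """Divide one decompressed block into near-equal map splits."""
    n_splits = max(1, -(-decompressed_len // target))   # ceiling division
    base = decompressed_len // n_splits
    sizes = [base] * n_splits
    sizes[-1] += decompressed_len - base * n_splits     # absorb remainder
    return sizes

# a block that decompresses to 1 GiB yields 8 mappers instead of 1
print(len(split_sizes(1024 * 1024 * 1024)))
```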

