Storage Preservation Using Big Data Based Intelligent Compression Scheme

Author(s):  
Ramya. S ◽  
Gokula Krishnan. V

Big data has reached a maturity that leads it into a productive phase. This means that most of the main issues with big data have been addressed to a degree that remote storage has become interesting for full commercial exploitation. However, concerns over data compression still prevent many users from migrating data to remote storage. Client-side data compression in particular ensures that multiple uploads of the same content consume only the network bandwidth and storage space of a single upload. Compression is actively used by a number of backup providers as well as various other services. Unfortunately, compressed data is pseudorandom and thus cannot be deduplicated; as a consequence, current schemes have to sacrifice storage efficiency entirely. This system presents a scheme that permits a more fine-grained trade-off, built on the novel idea of differentiating data according to its popularity. Based on this idea, a compression scheme is designed that guarantees semantic storage preservation for unpopular data and provides scalable storage and bandwidth benefits for popular data. A variable data chunk similarity algorithm analyzes the data chunks and stores the original data in compressed format, and an encryption algorithm is included to secure the data. Finally, a backup recovery system can be used at the time of blocking, and frequent login access is also analyzed.
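The abstract does not detail the chunking or the similarity test; the following is a minimal sketch, assuming fixed-size chunking, a SHA-256 fingerprint as the similarity key, and zlib for the compressed store. The chunk size and popularity threshold are illustrative parameters, not values from the paper.

```python
import hashlib
import zlib
from collections import defaultdict

CHUNK_SIZE = 4096          # illustrative chunk size, not from the paper
POPULARITY_THRESHOLD = 3   # chunks seen this many times count as "popular"

store = {}                 # fingerprint -> compressed chunk (stored once)
counts = defaultdict(int)  # fingerprint -> number of uploads seen

def upload(data: bytes):
    """Split data into chunks, deduplicate repeated chunks, compress before storing."""
    refs = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        fp = hashlib.sha256(chunk).hexdigest()    # similarity test by exact fingerprint
        counts[fp] += 1
        if fp not in store:
            store[fp] = zlib.compress(chunk)      # keep only one compressed copy
        refs.append(fp)
    return refs

def is_popular(fp: str) -> bool:
    """Popularity indicator used to decide which data earns the dedup/bandwidth benefit."""
    return counts[fp] >= POPULARITY_THRESHOLD
```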

Author(s):  
M. Chinnadurai ◽  
A. Jayashri

Cloud computing has reached a maturity that leads it into a productive phase. This means that most of the main problems with cloud computing have been addressed to a degree that clouds have become interesting for full commercial exploitation. However, concerns over data security still prevent many users from migrating data to remote storage. Client-side deduplication in particular ensures that multiple uploads of the same content consume only the network bandwidth and storage space of a single upload. Deduplication is actively used by a number of cloud backup providers as well as various cloud services. Unfortunately, encrypted data is pseudorandom and thus cannot be deduplicated; as a consequence, current schemes have to entirely sacrifice either security or storage efficiency. This system presents a scheme that permits a more fine-grained trade-off. The intuition is that outsourced data may require different levels of protection depending on how popular it is, for example content shared by many users. The proposed system therefore differentiates data according to its popularity and implements an encryption scheme that guarantees semantic security for unpopular data while providing weaker security but better storage and bandwidth benefits for popular data. The proposed data deduplication can thus be effective for popular data, while semantically secure encryption protects unpopular content. Finally, a backup recovery system can be used at the time of blocking, and frequent login access is also analyzed.
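The abstract does not specify the ciphers; the sketch below only illustrates the popularity-based trade-off, assuming convergent (message-derived-key) encryption for popular data, so identical plaintexts deduplicate, and random-key, semantically secure encryption for unpopular data. The function names and threshold are illustrative, not the paper's scheme.

```python
import hashlib
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

POPULARITY_THRESHOLD = 3  # illustrative; the paper derives popularity differently

def encrypt_convergent(plaintext: bytes):
    """Popular data: key and nonce are derived from the content, so identical
    plaintexts yield identical ciphertexts and can be deduplicated server-side."""
    key = hashlib.sha256(plaintext).digest()
    nonce = hashlib.sha256(b"nonce" + plaintext).digest()[:12]
    return key, nonce + AESGCM(key).encrypt(nonce, plaintext, None)

def encrypt_semantic(plaintext: bytes):
    """Unpopular data: a fresh random key and nonce give semantic security,
    at the cost of losing deduplication."""
    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    return key, nonce + AESGCM(key).encrypt(nonce, plaintext, None)

def encrypt_by_popularity(plaintext: bytes, upload_count: int):
    """Choose the scheme from the (illustrative) popularity threshold."""
    if upload_count >= POPULARITY_THRESHOLD:
        return encrypt_convergent(plaintext)
    return encrypt_semantic(plaintext)
```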


2018 ◽  
Vol 5 (2) ◽  
pp. 95-118 ◽  
Author(s):  
Bharat S Rawal ◽  
Songjie Liang ◽  
Shiva Gautam ◽  
Harsha Kumara Kalutarage ◽  
P Vijayakumar

To cope with the Big Data explosion, the Nth Order Binary Encoding (NOBE) algorithm with the Split-protocol has been proposed. In earlier papers, applications of the Split-protocol for security, reliability, availability, and HPC have been demonstrated, and the encoding has been implemented. This technology will significantly reduce network traffic, improve the transmission rate, and augment the capacity for data storage. In addition to data compression, improved privacy and security is an inherent benefit of the proposed method. It is possible to encode the data recursively up to N times and to use a unique combination of NOBE's parameters to generate encryption keys, providing additional security and privacy for data in flight or at rest. This paper describes the design and a preliminary demonstration of the NOBE algorithm, serving as a foundation for application implementers. It also reports the outcomes of computational studies concerning the performance of the underlying implementation.
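NOBE's actual per-order transform is not described in this abstract; the sketch below only illustrates the general idea it names, applying an invertible, parameterized encoding pass recursively N times and treating the ordered parameter list as key material. The transform here is a placeholder, not the published algorithm.

```python
import hashlib

def encode_pass(data: bytes, param: int) -> bytes:
    """One invertible encoding pass: XOR with a keystream derived from `param`.
    Placeholder transform; it stands in for whatever NOBE does per order."""
    stream = hashlib.sha256(param.to_bytes(8, "big")).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))

def encode_recursive(data: bytes, params: list) -> bytes:
    """Apply N passes; the ordered parameter list acts as the key material."""
    for p in params:
        data = encode_pass(data, p)
    return data

def decode_recursive(data: bytes, params: list) -> bytes:
    """Undo the passes in reverse order (XOR passes are self-inverse)."""
    for p in reversed(params):
        data = encode_pass(data, p)
    return data
```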


2020 ◽  
Vol 2020 ◽  
pp. 1-22
Author(s):  
Qin Jiancheng ◽  
Lu Yiqin ◽  
Zhong Yu

With the advent of IR (Industrial Revolution) 4.0, the spread of sensors in the IoT (Internet of Things) may generate massive data, which will challenge the limited sensor storage and network bandwidth. Hence, the study of big data compression is valuable in the field of sensors. One problem is how to compress long data streams efficiently within the finite memory of a sensor. To maintain performance, traditional compression techniques have to treat the data streams at a small, insufficient scale, which reduces the compression ratio. To solve this problem, this paper proposes a block-split coding algorithm named the "CZ-Array algorithm" and implements it in the shareware named "ComZip." CZ-Array can use a relatively small data window to cover a configurable large scale, which benefits the compression ratio. It is fast, with time complexity O(N), and fits big data compression. The experimental results indicate that ComZip with CZ-Array obtains a better compression ratio than gzip, lz4, bzip2, and p7zip on multiple-stream data compression, and its speed is competitive among these general-purpose data compression tools. In addition, CZ-Array is concise and suits parallel hardware implementation in sensors.
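The CZ-Array internals are not given in the abstract; the sketch below is one plausible illustration of block-split coding for multiple-stream data, regrouping interleaved fixed-size records by stream before compression so that a small window sees similar data together, compared against compressing the interleaved stream as-is. The record size, stream count, and use of zlib are assumptions for illustration, not part of ComZip.

```python
import zlib

RECORD_SIZE = 16   # illustrative fixed record size
NUM_STREAMS = 4    # illustrative number of interleaved sensor streams

def split_and_compress(data: bytes, level=6) -> bytes:
    """Regroup interleaved records by stream, then compress.
    Grouping similar records lets a small compression window find more matches."""
    arrays = [bytearray() for _ in range(NUM_STREAMS)]
    for i in range(0, len(data), RECORD_SIZE):
        arrays[(i // RECORD_SIZE) % NUM_STREAMS] += data[i:i + RECORD_SIZE]
    return zlib.compress(b"".join(arrays), level)

def interleaved_compress(data: bytes, level=6) -> bytes:
    """Baseline: compress the interleaved stream as-is, for comparison."""
    return zlib.compress(data, level)
```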


2021 ◽  
Vol 22 (4) ◽  
pp. 401-412
Author(s):  
Hrachya Astsatryan ◽  
Arthur Lalayan ◽  
Aram Kocharyan ◽  
Daniel Hagimont

The MapReduce framework manages Big Data sets by splitting large datasets into a set of distributed blocks and processing them in parallel. Data compression and in-memory file systems are widely used methods in Big Data processing to reduce resource-intensive I/O operations and correspondingly improve the I/O rate. The article presents a performance-efficient, modular, configurable, and robust decision-making service that relies on data compression and in-memory data storage indicators. The service consists of Recommendation and Prediction modules; it predicts the execution time of a given job based on metrics and recommends the best configuration parameters to improve the performance of the Hadoop and Spark frameworks. Several CPU- and data-intensive applications and micro-benchmarks, including Log Analyzer, WordCount, and K-Means, have been evaluated to demonstrate the performance improvement.
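The abstract does not give the prediction model or the exact parameter space; the sketch below assumes a fitted execution-time predictor and a small grid of candidate settings (compression codec and an in-memory storage flag) and simply recommends the configuration with the lowest predicted runtime. All names, metrics, and constants are illustrative assumptions.

```python
from itertools import product

# Illustrative candidate configuration space (not the service's actual parameters)
CODECS = ["none", "snappy", "lz4", "zstd"]
IN_MEMORY = [False, True]

def predict_runtime(job_metrics: dict, codec: str, in_memory: bool) -> float:
    """Placeholder for the Prediction module: estimate job runtime in seconds
    from job metrics and the candidate settings (factors below are made up)."""
    base = job_metrics["input_gb"] * job_metrics["sec_per_gb"]
    io_factor = {"none": 1.0, "snappy": 0.8, "lz4": 0.78, "zstd": 0.7}[codec]
    return base * io_factor * (0.6 if in_memory else 1.0)

def recommend(job_metrics: dict) -> dict:
    """Recommendation module sketch: pick the settings with the lowest predicted runtime."""
    best = min(product(CODECS, IN_MEMORY),
               key=lambda c: predict_runtime(job_metrics, *c))
    return {"io.compression.codec": best[0], "use_in_memory_fs": best[1]}

# Usage example: a 100 GB job that processes roughly 1 GB per 30 s
print(recommend({"input_gb": 100, "sec_per_gb": 30}))
```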


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Jianmin Wang ◽  
Yukun Xia ◽  
Wenbin Zhao ◽  
Yuhang Zhang ◽  
Feng Wu

Big data is massive and heterogeneous. With the rapid increase in data quantity and the diversification of user access, traditional database and access control methods can no longer meet the requirements of big data storage and flexible access control. To solve this problem, an entity relationship completion and authority management method is proposed. By combining a weighted graph convolutional neural network with an attention mechanism, a knowledge base completion model is given. On this basis, the authority management model is formally defined and the process of multilevel trust access control is designed. The effectiveness of the proposed method is verified by experiments, and the authority management of the knowledge base is shown to be more fine-grained and more secure.
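The paper's formal model is not reproduced in the abstract; the sketch below only illustrates a multilevel trust check in which a request succeeds when the subject's trust level meets the sensitivity level attached to a knowledge base triple. The level names and the relation metadata are illustrative assumptions.

```python
from dataclasses import dataclass

# Illustrative trust levels; the paper's actual levels are not given in the abstract
TRUST_LEVELS = {"public": 0, "internal": 1, "confidential": 2, "secret": 3}

@dataclass
class Triple:
    head: str
    relation: str
    tail: str
    sensitivity: str  # minimum trust level required to read this triple

@dataclass
class Subject:
    name: str
    trust: str

def can_access(subject: Subject, triple: Triple) -> bool:
    """Multilevel trust check: allow access only if the subject's trust level
    is at least the sensitivity level of the knowledge base triple."""
    return TRUST_LEVELS[subject.trust] >= TRUST_LEVELS[triple.sensitivity]

# Usage example
t = Triple("alice", "salary", "90k", sensitivity="confidential")
print(can_access(Subject("auditor", "confidential"), t))  # True
print(can_access(Subject("intern", "internal"), t))       # False
```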


2020 ◽  
Vol 12 (11) ◽  
pp. 190
Author(s):  
Elarbi Badidi ◽  
Zineb Mahrez ◽  
Essaid Sabir

Demographic growth in urban areas means that modern cities face challenges in ensuring a steady supply of water and electricity, smart transport, livable space, better health services, and citizens’ safety. Advances in sensing, communication, and digital technologies promise to mitigate these challenges. Hence, many smart cities have taken a new step in moving away from internal information technology (IT) infrastructure to utility-supplied IT delivered over the Internet. The benefit of this move is to manage the vast amounts of data generated by the various city systems, including the water and electricity systems, the waste management system, the transportation system, public space management systems, health and education systems, and many more. Furthermore, many smart city applications are time-sensitive and need to analyze data quickly to react promptly to the various events occurring in a city. The new and emerging paradigms of edge and fog computing promise to address big data storage and analysis in the field of smart cities. Here, we review existing service delivery models in smart cities and present our perspective on adopting these two emerging paradigms. We specifically describe the design of a fog-based data pipeline to address the latency and network bandwidth requirements of time-sensitive smart city applications.
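The article describes the pipeline at an architectural level; the sketch below is a minimal illustration of a fog-stage step that reacts to urgent readings locally (low latency) and forwards only compact aggregates to the cloud (low bandwidth). The threshold, batch size, field names, and callback functions are assumptions, not the article's design.

```python
import statistics
import time

LOCAL_ALERT_THRESHOLD = 80.0   # illustrative: urgent readings handled at the fog node
BATCH_SIZE = 100               # illustrative: aggregate before sending to the cloud

def fog_stage(readings, send_to_cloud, raise_local_alert):
    """Process sensor readings at the fog node: react locally to time-sensitive
    events without a cloud round trip, and forward only aggregates upstream."""
    batch = []
    for r in readings:                       # r = {"sensor": ..., "value": ..., "ts": ...}
        if r["value"] >= LOCAL_ALERT_THRESHOLD:
            raise_local_alert(r)             # latency-critical path stays at the edge
        batch.append(r["value"])
        if len(batch) >= BATCH_SIZE:
            send_to_cloud({"mean": statistics.mean(batch),
                           "max": max(batch),
                           "count": len(batch),
                           "ts": time.time()})
            batch.clear()
```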


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 240
Author(s):  
Shinichi Yamagiwa ◽  
Koichi Marumo ◽  
Suzukaze Kuwabara

It is becoming popular to implement environments in which IoT edge devices, such as sensor devices, communicate remotely with cloud servers, for example when artificial intelligence algorithms are applied to the system. In such situations, which handle big data, lossless data compression is one solution for reducing the data volume. In particular, stream-based data compression technology is of interest for such systems because it compresses an indefinitely continuous data stream with very small delay. However, during continuous data compression, an exception code cannot be inserted into the compressed data without additional mechanisms, such as data framing or packetizing, as used in networking technologies. The exception code indicates configurations for the compressor/decompressor and/or its peripheral logic, and it is used in real time to configure the parameters of those components. To implement the exception code, the data compression algorithm must include a mechanism that clearly distinguishes the original data from the exception code. However, conventional algorithms do not include such a mechanism. This paper proposes novel methods to implement the exception code, called the exception symbol, in look-up-table-based data compression. Additionally, we describe implementation details of the method by applying it to stream-based data compression algorithms. Because some of the proposed mechanisms need to reserve entries in the table, we also discuss the effect on data compression performance through experimental evaluations.
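The paper's encoding formats are not given in the abstract; the sketch below only illustrates the basic idea of reserving one look-up-table entry as an exception symbol so the decompressor can distinguish in-stream configuration codes from ordinary data. It is a toy pass-through coder, not a compressor, and the symbol values are assumptions.

```python
EXCEPTION = 0xFF   # reserved table entry: never encodes data directly
LITERAL   = 0x00   # sub-code: the next byte is a literal data byte equal to EXCEPTION

def encode(data: bytes, config_events=()):
    """Toy coder with a reserved exception symbol. Data bytes pass through the
    (identity) table; the reserved value is escaped, so EXCEPTION in the output
    always introduces either a literal escape or a configuration code."""
    out = bytearray()
    events = dict(config_events)                   # position -> config code (1..255)
    for i, b in enumerate(data):
        if i in events:
            out += bytes([EXCEPTION, events[i]])   # configuration change at this point
        if b == EXCEPTION:
            out += bytes([EXCEPTION, LITERAL, b])  # escape the reserved value
        else:
            out.append(b)                          # normal table entry
    return bytes(out)

def decode(stream: bytes):
    """Split the stream back into data bytes and (position, config code) events."""
    data, events = bytearray(), []
    it = iter(stream)
    for b in it:
        if b == EXCEPTION:
            sub = next(it)
            if sub == LITERAL:
                data.append(next(it))
            else:
                events.append((len(data), sub))
        else:
            data.append(b)
    return bytes(data), events
```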

