Unbalanced Big Data-Compatible Cloud Storage Method Based on Redundancy Elimination Technology

2022 ◽  
Vol 2022 ◽  
pp. 1-10
Author(s):  
Tingting Yu

In order to meet users' requirements for speed, capacity, storage efficiency, and security, and with the goal of reducing data redundancy and data storage space, an unbalanced big data-compatible cloud storage method based on redundancy elimination technology is proposed. A new big data acquisition platform is designed based on Hadoop and NoSQL technologies, through which efficient acquisition of unbalanced data is realized. The collected data are classified and processed by a classifier. The classified unbalanced big data are compressed with the Huffman algorithm, and data security is improved by encryption. Based on the data processing results, redundancy processing is carried out using a data deduplication algorithm, and a cloud platform is designed to store the deduplicated data in the cloud. The results show that the proposed method achieves a high deduplication rate and deduplication speed, requires less storage space, and effectively reduces the data storage burden.
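The abstract does not give code for the compression and deduplication steps; the following Python sketch only illustrates the general idea of Huffman compression followed by hash-indexed deduplication. The SHA-256 fingerprints and the in-memory index are illustrative assumptions, not details taken from the paper.

```python
import hashlib
import heapq
from collections import Counter

def huffman_code_table(data: bytes) -> dict:
    """Build a Huffman code table (byte value -> bit string) from byte frequencies."""
    freq = Counter(data)
    # Heap entries: (frequency, tie_breaker, tree), where tree is a byte value or a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    if len(heap) == 1:                       # degenerate case: only one distinct byte value
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse into both branches
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: assign the accumulated code
            codes[node] = prefix
    walk(heap[0][2], "")
    return codes

def huffman_compress(data: bytes) -> str:
    """Encode the data as a bit string (packing bits into bytes is omitted for brevity)."""
    table = huffman_code_table(data)
    return "".join(table[b] for b in data)

class DedupStore:
    """Hash-indexed store: only the first copy of each block is compressed and kept."""
    def __init__(self):
        self.index = {}                      # SHA-256 digest -> compressed payload

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()   # fingerprint the raw block
        if digest not in self.index:                # store only unseen content
            self.index[digest] = huffman_compress(data)
        return digest

store = DedupStore()
key1 = store.put(b"unbalanced big data block")
key2 = store.put(b"unbalanced big data block")      # duplicate: not stored again
assert key1 == key2 and len(store.index) == 1
```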

Webology ◽  
2021 ◽  
Vol 18 (Special Issue 01) ◽  
pp. 288-301
Author(s):  
G. Sujatha ◽  
Dr. Jeberson Retna Raj

Data storage is one of the significant cloud services available to cloud users. Since the volume of outsourced information grows extremely large, data deduplication needs to be implemented in the cloud storage space for efficient utilization. The cloud storage space supports all kinds of digital data, such as text, audio, video, and images. In a hash-based deduplication system, a cryptographic hash value is calculated for every data item irrespective of its type and stored in memory for future reference; duplicate copies are identified using these hash values alone. The problem in this existing scenario is the size of the hash table: in the worst case, all hash values must be checked to find a duplicate copy, irrespective of data type, and not every kind of digital data suits the same hash table structure. In this study, we propose an approach that maintains multiple hash tables, one for each type of digital data. Having a dedicated hash table for each data type improves the search time for duplicate data.
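A minimal sketch of the proposed idea follows: one dedicated hash table per digital data type, so a duplicate lookup only scans the fingerprints of the matching type. The type names and the choice of SHA-256 are illustrative assumptions.

```python
import hashlib

# One dedicated hash table per digital data type, instead of a single global table.
TYPES = ("text", "audio", "video", "image")

class TypedDedupIndex:
    def __init__(self):
        self.tables = {t: {} for t in TYPES}    # data type -> its own hash table

    def is_duplicate(self, data_type: str, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        table = self.tables[data_type]          # only this type's table is searched
        if digest in table:
            return True                         # duplicate copy detected
        table[digest] = len(content)            # remember the new object
        return False

index = TypedDedupIndex()
print(index.is_duplicate("image", b"\x89PNG..."))   # False: first copy is indexed
print(index.is_duplicate("image", b"\x89PNG..."))   # True: found in the image table only
```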


2019 ◽  
Vol 8 (4) ◽  
pp. 2329-2333

Frequently, real-world entities have two or more representations in databases. Duplicate records do not share a common key and often contain errors, which makes duplicate matching a difficult task. Errors are introduced as the result of transcription mistakes, incomplete information, lack of standard formats, or any combination of these factors. In big data storage, the data is extremely large, and storing it efficiently is difficult. To address this, the Hadoop framework provides HDFS, which manages data by maintaining replicas, but this replication increases duplication. In the proposed method, the big data stream is fed to a fixed-size chunking algorithm to create fixed-size chunks. This manuscript presents an exhaustive survey of the literature on crowdsourcing-based big data deduplication techniques. In our method, the map output is generated, and the MapReduce model is then applied to determine whether the hash values are duplicates: it compares them with the hash values already stored in the big data storage space, and any hash value already present there is identified as a duplicate. If the hash values are duplicates, the data is not stored in the Hadoop Distributed File System (HDFS); otherwise, the data is stored in HDFS. We also cover various deduplication techniques for crowdsourced data.
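The fixed-size chunking and MapReduce-style duplicate check described above can be sketched as follows. This is a simplified single-process illustration, not the authors' Hadoop implementation; the chunk size and SHA-256 hashing are assumptions.

```python
import hashlib

CHUNK_SIZE = 4096   # fixed chunk size in bytes (illustrative value)

def fixed_size_chunks(stream: bytes, size: int = CHUNK_SIZE):
    """Split the incoming big-data stream into fixed-size chunks."""
    for offset in range(0, len(stream), size):
        yield stream[offset:offset + size]

def map_phase(chunks):
    """Map: emit one (hash, chunk) pair per chunk."""
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]

def reduce_phase(mapped, stored_hashes):
    """Reduce: keep a chunk only if its hash is not already in storage."""
    unique = {}
    for digest, chunk in mapped:
        if digest in stored_hashes or digest in unique:
            continue                      # duplicate: would not be written to HDFS
        unique[digest] = chunk            # new chunk: would be written to HDFS and indexed
    return unique

stored_hashes = set()                     # hashes already present in big-data storage
stream = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
new_chunks = reduce_phase(map_phase(fixed_size_chunks(stream)), stored_hashes)
stored_hashes.update(new_chunks)          # only the two unique 4 KB chunks are stored
print(len(new_chunks))                    # -> 2
```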


Cloud computing is well known today because of its enormous data storage capacity and fast access to information over the network. It gives an individual user unlimited storage space and makes information available and accessible anytime, anywhere. Cloud service providers can optimize information storage by incorporating data deduplication into cloud storage, since data deduplication removes the redundant and replicated data that occur in the cloud environment. This paper presents a literature survey of the various deduplication techniques that have been applied to cloud data storage. To better ensure secure deduplication in the cloud, this paper examines both file-level and block-level data deduplication.
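To illustrate the two granularities the survey examines, the following sketch contrasts file-level and block-level fingerprinting; the block size and the use of SHA-256 are illustrative assumptions.

```python
import hashlib

BLOCK_SIZE = 1024   # illustrative block size

def file_level_fingerprints(path_to_bytes: dict) -> dict:
    """File-level dedup: one hash per whole file; a file is either a duplicate or it is not."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in path_to_bytes.items()}

def block_level_fingerprints(data: bytes) -> list:
    """Block-level dedup: hash every fixed-size block so blocks shared across files dedupe."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

# Two files that differ only in their last block: file-level sees no duplication,
# while block-level still deduplicates every unchanged block.
f1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))     # four distinct blocks
f2 = f1[:3 * BLOCK_SIZE] + bytes([9]) * BLOCK_SIZE           # last block changed
print(len(set(file_level_fingerprints({"a": f1, "b": f2}).values())))   # -> 2 distinct files
shared = set(block_level_fingerprints(f1)) & set(block_level_fingerprints(f2))
print(len(shared))                                                      # -> 3 shared blocks
```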


In the cryptocurrency era, blockchain is one of the most rapidly growing information technologies for securing data. Data tampering and authentication problems generally occur in centralized servers while sharing and storing data. Blockchain provides a platform for big data and cloud storage that enhances security by protecting against malicious users. In this paper, we give an exhaustive description of blockchain and its need, features, and applications. Blockchain is analyzed for different domains such as big data, cloud, the Internet of Things, and mobile cloud, and the V's of big data are compared with those of blockchain. A SWOT (Strengths, Weaknesses, Opportunities, Threats) analysis is performed to address the merits and limitations of blockchain technology. Data security, data storage, data sharing, and data authentication through blockchain technology are surveyed, and the challenges to be overcome in big data and cloud storage are discussed. The detailed comparative analysis shows that blockchain technology overcomes the problems of big data storage and of data security in the cloud.


2021 ◽  
Author(s):  
Yang Wang

Abstract In the information age, with the development of big data intelligence, Intellectual Property (IP) related data is growing geometrically, so the demand for data storage space is also growing, and distributed platforms for intellectual property data based on cloud storage are emerging one after another. Cloud computing platforms offer huge storage space and powerful computing capability, and with this, the privacy and security issues of cloud platforms are also receiving more attention. Because the defining feature of cloud storage is that storage is provided as a service, it places higher requirements on intellectual property services. This paper first introduces domestic IP cloud platform services from the three perspectives of government support, state-owned enterprises, and private enterprises. Secondly, four typical distributed platforms built on commercial resources are selected to introduce their respective operation modes, and the problems faced by domestic IP service modes are summarized. The paper then compares and discusses the current situation of domestic IP distributed platforms. In view of the current domestic intellectual property service mode, and taking TSITE IP as an example, the paper proposes design and construction strategies for intellectual property protection, a distributed platform for intellectual property operation services, and an operation service mode for the information age.


Cloud computing is an efficient technology for storing huge amounts of data files securely. However, the content owner cannot control data access by unauthorized clients, nor control how the data is stored and used. Some previous approaches provide data access control together with data deduplication for cloud storage systems, but encrypted data in cloud storage is not handled effectively by current industrial deduplication solutions: deduplication remains unguarded against brute-force attacks and fails to support data access control. Data deduplication is a widely used and efficient data-reduction technique that eliminates multiple copies of redundant data, reducing the space needed to store the data and saving bandwidth. To overcome the above problems, an Efficient Content Discovery and Preserving De-duplication (ECDPD) algorithm was proposed that detects the client file range and block range for deduplication when storing data files in the cloud storage system. Data access control is actively supported by ECDPD. Experimental evaluations show that the proposed ECDPD method reduces Data Uploading Time (DUT) by 3.802 milliseconds and Data Downloading Time (DDT) by 3.318 milliseconds compared with existing approaches.
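The ECDPD algorithm itself is not specified in this abstract; the sketch below only illustrates, under stated assumptions, what a two-level (file-range, then block-range) duplicate check before upload might look like, using SHA-256 fingerprints and in-memory indexes. It is not the authors' method.

```python
import hashlib

BLOCK_SIZE = 4096   # illustrative block size

def upload_with_two_level_check(data: bytes, file_index: set, block_index: set) -> list:
    """Two-level duplicate check before upload: whole file first, then per block.
    Returns the list of block digests that actually need to be transferred."""
    file_digest = hashlib.sha256(data).hexdigest()
    if file_digest in file_index:
        return []                                   # entire file already stored: upload nothing
    to_upload = []
    for i in range(0, len(data), BLOCK_SIZE):
        block_digest = hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        if block_digest not in block_index:         # only unseen blocks are transferred
            block_index.add(block_digest)
            to_upload.append(block_digest)
    file_index.add(file_digest)
    return to_upload

file_index, block_index = set(), set()
first = upload_with_two_level_check(b"report-v1" * 1000, file_index, block_index)
second = upload_with_two_level_check(b"report-v1" * 1000, file_index, block_index)
print(len(first), len(second))   # -> 3 0 : all blocks uploaded once, nothing the second time
```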

