STRENGTHENING THE PRODUCTIVITY OF STORAGE FOR BIG DATA STORAGE
SYSTEMS USING DISTRIBUTED DEDUPLICATION
Cloud storage is one of the key features of cloud computing: it lets cloud users outsource large volumes of data without upgrading their devices. However, data storage at Cloud Service Providers (CSPs) suffers from data redundancy. Data deduplication aims to eliminate redundant data segments and keep a single instance of each data set, no matter how many users own the same data. Because the blocks of a file are spread across many servers, every block has to be downloaded before the file can be restored, which degrades system performance. We propose a data recovery module for cloud storage servers that improves file access efficiency and reduces the time and network bandwidth spent on retrieval. The proposed method uses device coding to store blocks in distributed cloud storage and MD5 (Message Digest 5) to verify data integrity. Running the recovery algorithm lets the user retrieve a file directly from the cloud servers without downloading every block. The proposed scheme improves time efficiency and speeds up access to stored data. It also reduces bandwidth consumption and user-side processing overhead when downloading a data file.
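As a rough illustration of the deduplication and integrity-check ideas described above, the following minimal sketch stores fixed-size blocks keyed by their MD5 digest, so identical blocks are kept only once across files. It is not the proposed scheme: the class name `DedupStore`, the fixed block size, and the in-memory dictionaries standing in for distributed servers are all assumptions made for the example, and the coding and recovery steps are not modelled.

```python
import hashlib


class DedupStore:
    """Hypothetical in-memory block store with MD5-based deduplication."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # assumed fixed-size chunking
        self.blocks = {}              # MD5 hex digest -> block bytes
        self.files = {}               # file name -> ordered list of digests

    def put(self, name, data):
        """Split data into blocks and store each distinct block once."""
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.md5(block).hexdigest()
            # A block whose digest is already known is not stored again.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        """Reassemble a file, checking each block against its digest."""
        out = bytearray()
        for digest in self.files[name]:
            block = self.blocks[digest]
            if hashlib.md5(block).hexdigest() != digest:
                raise ValueError("integrity check failed for block " + digest)
            out += block
        return bytes(out)


store = DedupStore(block_size=4)
store.put("a.bin", b"AAAABBBBAAAA")  # three blocks, two of them identical
store.put("b.bin", b"AAAABBBB")      # both blocks already stored
unique_blocks = len(store.blocks)
restored = store.get("a.bin")
```

In this sketch the two files reference five blocks in total but only two distinct blocks are stored, and each block is re-hashed on read so that corruption would be detected before the file is returned.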