Composing private and censorship-resistant solutions for distributed storage

Author(s): Dorian Burihabwa

Cloud storage has durably entered the stage as the go-to solution for business and personal storage. By virtually extending storage capacity to infinity, it enables companies and individuals to focus on content creation without fear of running out of space or losing data. But as users entrust more and more data to the cloud, they also have to accept a loss of control over the data they offload to it. At a time when online services seem to make a significant part of their profits by exploiting customer data, concerns over the privacy and integrity of that data naturally arise. Are online documents read by the storage provider or its employees? Is their content shared with third-party partners of the provider? What happens if the provider goes bankrupt? Whatever answer the storage provider offers, the loss of control should be cause for concern. Storage providers, in turn, have to worry about trust and reliability: as they build distributed solutions to accommodate their customers' needs, these concerns of control extend to the infrastructure they operate. Reconciling security, confidentiality, resilience and performance over large sets of distributed storage nodes is a tricky balancing act, and even when a suitable balance can be found, it often comes at the expense of increased storage overhead. In this dissertation, we try to mitigate these issues by focusing on three aspects. First, we study solutions that empower users with flexible tooling ensuring security, integrity and redundancy in distributed storage settings. By leveraging public cloud storage offerings to build a configurable file system and storage middleware, we show that securing cloud storage from the client side is an effective way of maintaining control. Second, we build a distributed archive whose resilience goes beyond standard redundancy schemes. To achieve this, we implement Recast, which relies on a data entanglement scheme that encodes and distributes data over a set of storage nodes to ensure durability at a manageable cost. Finally, we look into offsetting the increase in storage overhead by means of data reduction. This is made possible by Generalised Deduplication, a scheme that improves over classical data deduplication by detecting similarities beyond exact matches.
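
The following is a minimal, illustrative sketch of the idea behind generalised deduplication, not the dissertation's actual transform: each chunk is split into a "basis" shared by similar chunks and a small per-chunk "deviation". Here the basis is simply the high nibble of every byte; real systems derive it from error-correcting-code transforms.

```python
import hashlib

MASK = 0xF0   # toy transform: the high nibble of each byte forms the basis

basis_store = {}   # basis fingerprint -> basis bytes, stored once
chunk_index = {}   # chunk fingerprint -> (basis fingerprint, deviation bytes)

def put_chunk(chunk: bytes) -> str:
    basis = bytes(b & MASK for b in chunk)       # shared by similar chunks
    deviation = bytes(b & 0x0F for b in chunk)   # small per-chunk difference
    basis_id = hashlib.sha256(basis).hexdigest()
    basis_store.setdefault(basis_id, basis)      # deduplicated across chunks
    chunk_id = hashlib.sha256(chunk).hexdigest()
    chunk_index[chunk_id] = (basis_id, deviation)
    return chunk_id

def get_chunk(chunk_id: str) -> bytes:
    basis_id, deviation = chunk_index[chunk_id]
    return bytes(b | d for b, d in zip(basis_store[basis_id], deviation))

# Two chunks that differ only in the low bits of some bytes share one basis,
# which classical exact-match deduplication would store twice.
c1 = put_chunk(b"hello world")
c2 = put_chunk(b"hemmo wovld")
assert len(basis_store) == 1 and get_chunk(c1) == b"hello world"
```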

2018, Vol. 10 (4), pp. 43-66
Author(s): Shubhanshi Singhal, Pooja Sharma, Rajesh Kumar Aggarwal, Vishal Passricha

This article describes how data deduplication efficiently eliminates redundant data by selecting and storing only a single instance of it, and why the technique is becoming popular in storage systems. Digital data is growing much faster than storage volumes, which underlines the importance of data deduplication for scientists and researchers. Data deduplication is considered the most successful and efficient technique of data reduction because it is computationally efficient and offers lossless data reduction. It is applicable to various storage systems, such as local storage, distributed storage, and cloud storage. The article discusses the background, components, and key features of data deduplication, helping the reader to understand the design issues and challenges in this field.
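
As a concrete illustration of the single-instance storage described above, the sketch below keeps each unique block once, keyed by its content hash; duplicate writes only add a reference (names and structure are illustrative, not from the article).

```python
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> block data, stored once
        self.refcount = {}  # fingerprint -> number of logical references

    def write(self, data: bytes) -> str:
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.blocks:
            self.blocks[fp] = data                 # only the first copy is stored
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        return fp                                  # caller keeps the fingerprint

    def read(self, fp: str) -> bytes:
        return self.blocks[fp]

store = DedupStore()
f1 = store.write(b"same payload")
f2 = store.write(b"same payload")   # deduplicated: no additional storage used
assert f1 == f2 and len(store.blocks) == 1
```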


2018, Vol. 10 (2), pp. 70-89
Author(s): Jun Li, Mengshu Hou

This article describes how deduplication technology is introduced in cloud storage to reduce the amount of stored data. By adopting this technology, duplicated data can be eliminated and users can lower their storage requirements. However, deduplication also reduces data availability. To solve this problem, the authors propose a method to improve data availability in deduplication storage systems. Based on data-chunk reference counts and access frequency, it adds redundant information for data chunks to ensure availability while keeping the storage overhead to a minimum. Extensive experiments are conducted to evaluate the effectiveness of the improved method, with WFD, CDC, and sliding-block deduplication used for comparison. The results show that the proposed method achieves higher data availability than the conventional method while adding little storage overhead.
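
A rough sketch of the underlying idea follows: chunks referenced by many files, or accessed frequently, receive extra replicas so that deduplication does not concentrate risk on a single copy. The thresholds and replica counts are illustrative assumptions, not the article's exact policy.

```python
def replicas_for_chunk(ref_count: int, access_freq: float,
                       base_replicas: int = 1) -> int:
    """Decide how many copies of a deduplicated chunk to keep."""
    extra = 0
    if ref_count > 100:      # many files depend on this one chunk
        extra += 2
    elif ref_count > 10:
        extra += 1
    if access_freq > 0.9:    # chunk is in the hot set (normalised access frequency)
        extra += 1
    return base_replicas + extra

# A chunk shared by 250 files and read often gets 4 copies; a cold, rarely
# shared chunk keeps the single deduplicated copy.
print(replicas_for_chunk(250, 0.95))  # -> 4
print(replicas_for_chunk(1, 0.05))    # -> 1
```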


Author(s): Hema S and Dr. Kangaiammal A

Cloud services increase data availability in order to offer flawless service to the client. Because of this increased availability, more redundancy and more memory space are required to store the data. Cloud computing requires substantial storage and efficient protection for all types of data. With the amount of data produced increasing exponentially over time, storing replicated data contents is inevitable, and storage optimization approaches become an important prerequisite for enormous storage domains like cloud storage. Data deduplication is a technique that compresses data by eliminating replicated copies of identical data; it is widely used in cloud storage to conserve bandwidth and minimize storage space. Although data deduplication eliminates redundancy and replication, it also introduces significant data privacy and security problems for the end user. Considering this, a novel security-based deduplication model is proposed in this work to reduce the hash value of a given file and provide additional security for cloud storage. In the proposed method, the hash value of a given file is reduced using the Distributed Storage Hash Algorithm (DSHA), and the file is encrypted with an Improved Blowfish Encryption Algorithm (IBEA) to provide security. The framework also proposes an enhanced fuzzy-based intrusion detection system (EFIDS) that defines rules for the major attacks and alerts the system automatically. Finally, the combination of data exclusion and encryption allows cloud users to manage their cloud storage effectively by avoiding repeated data encroachment; it also saves bandwidth and alerts the system to attackers. The results of experiments reveal that the discussed algorithm yields improved throughput and bytes saved per second in comparison with other chunking algorithms.
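
The details of DSHA and IBEA are not given in the abstract, so the sketch below only illustrates the general flow it describes, with SHA-256 and Fernet (from the widely used "cryptography" package) standing in for the paper's algorithms: the client fingerprints the file for deduplication and encrypts it before upload.

```python
import hashlib
from cryptography.fernet import Fernet

def prepare_upload(file_bytes: bytes, key: bytes):
    # Fingerprint the plaintext so the server can detect duplicates.
    fingerprint = hashlib.sha256(file_bytes).hexdigest()
    # Encrypt before the data leaves the client, so the provider never sees plaintext.
    ciphertext = Fernet(key).encrypt(file_bytes)
    return fingerprint, ciphertext

key = Fernet.generate_key()   # in practice: per-user key management, out of scope here
fp, ct = prepare_upload(b"quarterly-report.pdf contents", key)
# The client sends `fp` first; the server requests `ct` only if `fp` is new.
```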


Author(s): Shivansh Mishra, Surjit Singh

Deduplication is the process of removing duplicate data by storing only one copy of the original data and replacing the others with a reference to it. When data is stored in the cloud, client-side deduplication helps reduce storage and communication overheads from both the client and server perspectives. Secure deduplication is the practice by which the data stored on the cloud is protected from external influences so that clients maintain the privacy of their data while the server still benefits from deduplication. This is done by encrypting the data into ciphertext, using different schemes, such that it makes sense only to the original client. The schemes created for secure deduplication on cloud storage solve the problem of duplicate detection in encrypted ciphertext. This chapter provides a brief overview of secure deduplication on cloud storage along with the issues encountered during its implementation. The chapter also includes a literature review and comparison of some deduplication techniques.
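
A common building block behind such schemes is convergent (message-locked) encryption: the key is derived from the content itself, so identical plaintexts produce identical ciphertexts and the server can deduplicate them without learning the key. The sketch below (using the "cryptography" package, deliberately with a fixed nonce to make encryption deterministic) is a minimal illustration, not any specific scheme from the chapter.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def mle_encrypt(plaintext: bytes):
    key = hashlib.sha256(plaintext).digest()        # K = H(M)
    nonce = b"\x00" * 16                            # fixed nonce: determinism is the point
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    ciphertext = enc.update(plaintext) + enc.finalize()
    tag = hashlib.sha256(ciphertext).hexdigest()    # dedup tag the server compares
    return key, ciphertext, tag

k1, c1, t1 = mle_encrypt(b"shared attachment")
k2, c2, t2 = mle_encrypt(b"shared attachment")
assert c1 == c2 and t1 == t2   # two independent clients produce the same ciphertext
# Each client keeps its own copy of `key`; the server stores `ciphertext` once.
```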


2021
Author(s): Ruba S, A.M. Kalpana

Abstract: Deduplication is a data redundancy removal method designed to save system storage resources through redundant data reduction in cloud storage. Nowadays, deduplication techniques are increasingly applied in cloud data centers as cloud computing grows, and many deduplication methods have been proposed to eliminate redundant data in cloud storage. For secure deduplication, previous works typically introduce third-party auditors for data integrity verification, but this may lead to data leakage through the auditors themselves. Conventional methods also face difficulties in big-data deduplication when trying to reconcile the two conflicting aims of a high duplicate elimination ratio and high deduplication throughput. In this paper, an improved blockchain-based secure data deduplication scheme with efficient cryptographic methods is presented to save cloud storage securely. In the proposed method, an attribute-based role key generation (ARKG) method is constructed in a hierarchical tree manner to generate a role key when data owners upload their data to the cloud service provider (CSP) and to allow authorized users to download the data. The smart contract (the agreement between the data owner and the CSP) uses SHA-256 (Secure Hash Algorithm-256) to generate a tamper-proof ledger for data integrity, in which data is protected from illegal modification, and duplicate detection is performed through hash tags, also formed with SHA-256. Message-Locked Encryption (MLE) is employed to encrypt data uploaded by the data owners to the CSP. The experimental results show that the proposed secure deduplication scheme can deliver higher throughput and a low duplicate elimination ratio.
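
To illustrate the tamper-proof ledger idea in isolation (a simplification, not the paper's full smart-contract and ARKG setup), the sketch below chains SHA-256 hashes so that modifying any stored record invalidates every later entry.

```python
import hashlib, json, time

ledger = []

def append_record(owner: str, file_tag: str) -> dict:
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {"owner": owner, "file_tag": file_tag,
            "timestamp": time.time(), "prev": prev_hash}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    ledger.append(body)
    return body

def verify_ledger() -> bool:
    prev = "0" * 64
    for entry in ledger:
        copy = {k: v for k, v in entry.items() if k != "hash"}
        recomputed = hashlib.sha256(json.dumps(copy, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False            # a record was altered after being appended
        prev = entry["hash"]
    return True

append_record("alice", "tag-of-uploaded-ciphertext")
append_record("bob", "tag-of-uploaded-ciphertext")   # same tag -> duplicate detected
assert verify_ledger()
```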


2017, Vol. 2 (7), pp. 14-19
Author(s): X. Alphonse Inbaraj, A. Seshagiri Rao

Security has been a concern since the early days of computing, when a computer was isolated in a room and a threat could only be posed by malicious insiders. To support authorized data deduplication in cloud computing, encryption is applied before data is outsourced. Data deduplication stores a single copy of identical data in cloud storage and keeps bandwidth consumption low. Third-party control generates a spectrum of concerns caused by the lack of transparency and limited user control. For example, a cloud provider may subcontract some resources from a third party whose level of trust is questionable. There are examples where subcontractors failed to maintain customer data, and others where the third party was not a subcontractor but a hardware supplier and the data loss was caused by poor-quality storage devices [12]. To overcome these integrity and security problems, this paper makes a first attempt at applying data coloring and watermarking techniques to shared data objects. A Merkle hash tree [11] is then applied to tighten access control for sensitive data in both private and public clouds.
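
For reference, a Merkle hash tree of the kind cited above [11] can be built as in the short sketch below: leaves are hashes of data blocks, each internal node hashes the concatenation of its children, and the root is a compact commitment to the whole object.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list) -> bytes:
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"block-0", b"block-1", b"block-2", b"block-3"])
# Changing any single block changes the root, which is what makes the tree
# useful for integrity checks and fine-grained access verification.
```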


Cloud storage can be described as a service model where raw or processed data is stored, handled, and backed up remotely while remaining accessible to multiple users simultaneously over a network. Among its ideal features are reliability, easy deployment, disaster recovery, data security, accessibility and, on top of that, lower overall storage costs, which remove the burden of purchasing and maintaining storage technology. In today's technology landscape, massive amounts of data are produced every day, so handling such big data on demand has become a challenging task for current data storage systems. The process of eliminating redundant copies of data, and thereby reducing the storage overhead, is termed data deduplication (DD). One of the ultimate aims of this research is to achieve ideal deduplication on client-side secured data. When clients' data are encrypted with different keys, cross-user deduplication becomes practically impossible, while sharing a single encryption key among multiple users leads to an insecure system that falls short of clients' expectations. The proposed research adopts the Message-Locked Encryption (MLE) technique, which looks for redundant files in the cloud before uploading the client's file and thereby reduces storage. Since redundant files are not re-uploaded, network bandwidth is considerably reduced compared with uploading the same content several times.
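
A minimal sketch of the check-before-upload flow such an approach relies on is shown below: the client sends only a fingerprint first and transfers the file body only when the server has not seen it. The class and method names are illustrative, not the paper's API.

```python
import hashlib

class StorageServer:
    def __init__(self):
        self.objects = {}                       # fingerprint -> stored ciphertext

    def has(self, fingerprint: str) -> bool:
        return fingerprint in self.objects

    def upload(self, fingerprint: str, ciphertext: bytes) -> None:
        self.objects[fingerprint] = ciphertext

def client_put(server: StorageServer, ciphertext: bytes) -> bool:
    """Return True only if the file body was actually transferred."""
    fingerprint = hashlib.sha256(ciphertext).hexdigest()
    if server.has(fingerprint):                 # duplicate: save storage and bandwidth
        return False
    server.upload(fingerprint, ciphertext)
    return True

server = StorageServer()
assert client_put(server, b"mle-ciphertext") is True    # first uploader sends the body
assert client_put(server, b"mle-ciphertext") is False   # second user: no transfer needed
```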


2014, Vol. 556-562, pp. 6223-6227
Author(s): Chao Ling Li, Yue Chen

To deduplicate sensitive data in a cloud storage center, a scheme called MHT-Dedup, based on the Merkle Hash Tree (MHT), is proposed. It achieves cross-user file-level client-side deduplication and local block-level client-side deduplication concurrently. It first encrypts the file at block granularity, then authenticates the file ciphertext to find duplicated files via Proofs of oWnership (PoW) and checks the hashes of block plaintexts to find duplicated blocks. In the PoW protocol of MHT-Dedup, an authenticating binary tree is generated from the tags of the encrypted blocks to reliably identify duplicated files. MHT-Dedup resolves the conflict between data deduplication and encryption, achieves file-level and block-level deduplication concurrently, prevents misuse of the storage system by users, resists inside and outside attacks on data confidentiality, and prevents target collision attacks on files and brute-force attacks on blocks.
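
The sketch below gives the flavour of such a PoW check (a simplification under stated assumptions, not the full MHT-Dedup protocol): a Merkle tree is built over the tags of encrypted blocks, the server keeps only the root, and a client claiming to own the file must answer a challenge on a randomly chosen leaf with a valid sibling path.

```python
import hashlib, secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaf_tags):
    """Return all tree levels; levels[0] are the leaves, levels[-1][0] is the root."""
    levels = [list(leaf_tags)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                      # duplicate the last node on odd levels
            cur = cur + [cur[-1]]
        levels.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling path for one leaf, as the prover would."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))   # (hash, sibling_is_left)
        index //= 2
    return path

def verify(root, leaf_tag, path):
    """Recompute the root from a leaf tag and its sibling path."""
    node = leaf_tag
    for sibling, sibling_is_left in path:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == root

tags = [H(b"encrypted-block-%d" % i) for i in range(5)]  # tags of encrypted blocks
levels = build_tree(tags)
root = levels[-1][0]                         # the server stores only this root

challenge = secrets.randbelow(len(tags))     # server challenges a random block
proof = prove(levels, challenge)             # prover answers from its own copy
assert verify(root, tags[challenge], proof)
```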

