Composing private and censorship-resistant solutions for distributed storage

Author(s): Dorian Burihabwa

Cloud storage has durably entered the stage as the go-to solution for business and personal storage. By virtually extending storage capacity to infinity, it enables companies and individuals to focus on content creation without fear of running out of space or losing data. But as users entrust more and more data to the cloud, they also have to accept a loss of control over the data they offload to it. At a time when online services seem to make a significant part of their profits by exploiting customer data, concerns over the privacy and integrity of that data naturally arise. Are online documents read by the storage provider or its employees? Is their content shared with third-party partners of the provider? What happens if the provider goes bankrupt? Whatever answer the storage provider offers, the loss of control should be cause for concern. Storage providers, in turn, have to worry about trust and reliability: as they build distributed solutions to accommodate their customers' needs, these concerns of control extend to the infrastructure they operate. Reconciling security, confidentiality, resilience and performance over large sets of distributed storage nodes is a tricky balancing act, and even when a suitable balance can be found, it often comes at the expense of increased storage overhead. In this dissertation, we try to mitigate these issues by focusing on three aspects. First, we study solutions that empower users with flexible tooling ensuring security, integrity and redundancy in distributed storage settings. By leveraging public cloud storage offerings to build a configurable file system and storage middleware, we show that securing cloud storage from the client side is an effective way of maintaining control. Second, we build a distributed archive whose resilience goes beyond standard redundancy schemes. To achieve this, we implement Recast, which relies on a data entanglement scheme that encodes and distributes data over a set of storage nodes to ensure durability at a manageable cost. Finally, we look into offsetting the increase in storage overhead by means of data reduction. This is made possible by Generalised Deduplication, a scheme that improves over classical data deduplication by detecting similarities beyond exact matches.
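
The following is a minimal, illustrative sketch of the idea behind generalised deduplication, not the dissertation's actual transform: each chunk is split into a "basis" shared by similar chunks and a small per-chunk "deviation". Here the basis is simply the high nibble of every byte; real systems derive it from error-correcting-code transforms.

```python
import hashlib

MASK = 0xF0   # toy transform: the high nibble of each byte forms the basis

basis_store = {}   # basis fingerprint -> basis bytes, stored once
chunk_index = {}   # chunk fingerprint -> (basis fingerprint, deviation bytes)

def put_chunk(chunk: bytes) -> str:
    basis = bytes(b & MASK for b in chunk)       # shared by similar chunks
    deviation = bytes(b & 0x0F for b in chunk)   # small per-chunk difference
    basis_id = hashlib.sha256(basis).hexdigest()
    basis_store.setdefault(basis_id, basis)      # deduplicated across chunks
    chunk_id = hashlib.sha256(chunk).hexdigest()
    chunk_index[chunk_id] = (basis_id, deviation)
    return chunk_id

def get_chunk(chunk_id: str) -> bytes:
    basis_id, deviation = chunk_index[chunk_id]
    return bytes(b | d for b, d in zip(basis_store[basis_id], deviation))

# Two chunks that differ only in the low bits of some bytes share one basis,
# which classical exact-match deduplication would store twice.
c1 = put_chunk(b"hello world")
c2 = put_chunk(b"hemmo wovld")
assert len(basis_store) == 1 and get_chunk(c1) == b"hello world"
```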

2018, Vol. 10 (4), pp. 43-66
Author(s): Shubhanshi Singhal, Pooja Sharma, Rajesh Kumar Aggarwal, Vishal Passricha

This article describes how data deduplication efficiently eliminates redundant data by selecting and storing only a single instance of it, and why the technique is becoming popular in storage systems. Digital data is growing much faster than storage volumes, which underlines the importance of data deduplication for scientists and researchers. Data deduplication is considered the most successful and efficient technique of data reduction because it is computationally efficient and offers lossless data reduction. It is applicable to various storage systems, such as local storage, distributed storage, and cloud storage. The article discusses the background, components, and key features of data deduplication, helping the reader to understand the design issues and challenges in this field.
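
As a concrete illustration of the single-instance storage described above, the sketch below keeps each unique block once, keyed by its content hash; duplicate writes only add a reference (names and structure are illustrative, not from the article).

```python
import hashlib

class DedupStore:
    def __init__(self):
        self.blocks = {}    # fingerprint -> block data, stored once
        self.refcount = {}  # fingerprint -> number of logical references

    def write(self, data: bytes) -> str:
        fp = hashlib.sha256(data).hexdigest()
        if fp not in self.blocks:
            self.blocks[fp] = data                 # only the first copy is stored
        self.refcount[fp] = self.refcount.get(fp, 0) + 1
        return fp                                  # caller keeps the fingerprint

    def read(self, fp: str) -> bytes:
        return self.blocks[fp]

store = DedupStore()
f1 = store.write(b"same payload")
f2 = store.write(b"same payload")   # deduplicated: no additional storage used
assert f1 == f2 and len(store.blocks) == 1
```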


2018, Vol. 10 (2), pp. 70-89
Author(s): Jun Li, Mengshu Hou

This article describes how deduplication technology is introduced in cloud storage to reduce the amount of stored data. By adopting this technology, duplicated data can be eliminated and users can lower their storage requirements. However, deduplication also reduces data availability. To solve this problem, the authors propose a method to improve data availability in deduplication storage systems. Based on data-chunk reference counts and access frequency, it adds redundant information for data chunks to ensure availability while keeping the storage overhead to a minimum. Extensive experiments are conducted to evaluate the effectiveness of the improved method, with WFD, CDC, and sliding-block deduplication used for comparison. The results show that the proposed method achieves higher data availability than the conventional method while adding little storage overhead.
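
A rough sketch of the underlying idea follows: chunks referenced by many files, or accessed frequently, receive extra replicas so that deduplication does not concentrate risk on a single copy. The thresholds and replica counts are illustrative assumptions, not the article's exact policy.

```python
def replicas_for_chunk(ref_count: int, access_freq: float,
                       base_replicas: int = 1) -> int:
    """Decide how many copies of a deduplicated chunk to keep."""
    extra = 0
    if ref_count > 100:      # many files depend on this one chunk
        extra += 2
    elif ref_count > 10:
        extra += 1
    if access_freq > 0.9:    # chunk is in the hot set (normalised access frequency)
        extra += 1
    return base_replicas + extra

# A chunk shared by 250 files and read often gets 4 copies; a cold, rarely
# shared chunk keeps the single deduplicated copy.
print(replicas_for_chunk(250, 0.95))  # -> 4
print(replicas_for_chunk(1, 0.05))    # -> 1
```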


Author(s): Hema S and Dr. Kangaiammal A

Cloud services increase data availability in order to offer flawless service to the client. Because of this increased availability, more redundancy and more memory space are required to store the data. Cloud computing requires substantial storage and efficient protection for all types of data. With the amount of data produced increasing exponentially over time, storing replicated data contents is inevitable, and storage optimization approaches become an important prerequisite for enormous storage domains like cloud storage. Data deduplication is a technique that compresses data by eliminating replicated copies of identical data; it is widely used in cloud storage to conserve bandwidth and minimize storage space. Although data deduplication eliminates redundancy and replication, it also introduces significant data privacy and security problems for the end user. Considering this, a novel security-based deduplication model is proposed in this work to reduce the hash value of a given file and provide additional security for cloud storage. In the proposed method, the hash value of a given file is reduced using the Distributed Storage Hash Algorithm (DSHA), and the file is encrypted with an Improved Blowfish Encryption Algorithm (IBEA) to provide security. The framework also proposes an enhanced fuzzy-based intrusion detection system (EFIDS) that defines rules for the major attacks and alerts the system automatically. Finally, the combination of data exclusion and encryption allows cloud users to manage their cloud storage effectively by avoiding repeated data encroachment; it also saves bandwidth and alerts the system to attackers. The results of experiments reveal that the discussed algorithm yields improved throughput and bytes saved per second in comparison with other chunking algorithms.
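
The details of DSHA and IBEA are not given in the abstract, so the sketch below only illustrates the general flow it describes, with SHA-256 and Fernet (from the widely used "cryptography" package) standing in for the paper's algorithms: the client fingerprints the file for deduplication and encrypts it before upload.

```python
import hashlib
from cryptography.fernet import Fernet

def prepare_upload(file_bytes: bytes, key: bytes):
    # Fingerprint the plaintext so the server can detect duplicates.
    fingerprint = hashlib.sha256(file_bytes).hexdigest()
    # Encrypt before the data leaves the client, so the provider never sees plaintext.
    ciphertext = Fernet(key).encrypt(file_bytes)
    return fingerprint, ciphertext

key = Fernet.generate_key()   # in practice: per-user key management, out of scope here
fp, ct = prepare_upload(b"quarterly-report.pdf contents", key)
# The client sends `fp` first; the server requests `ct` only if `fp` is new.
```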


Author(s): Shivansh Mishra, Surjit Singh

Deduplication is the process of removing duplicate data by storing only one copy of the original data and replacing the others with a reference to it. When data is stored in the cloud, client-side deduplication helps reduce storage and communication overheads from both the client and server perspectives. Secure deduplication is the practice by which the data stored on the cloud is protected from external influences so that clients maintain the privacy of their data while the server still benefits from deduplication. This is done by encrypting the data into ciphertext, using different schemes, such that it makes sense only to the original client. The schemes created for secure deduplication on cloud storage solve the problem of duplicate detection in encrypted ciphertext. This chapter provides a brief overview of secure deduplication on cloud storage along with the issues encountered during its implementation. The chapter also includes a literature review and comparison of some deduplication techniques.
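
A common building block behind such schemes is convergent (message-locked) encryption: the key is derived from the content itself, so identical plaintexts produce identical ciphertexts and the server can deduplicate them without learning the key. The sketch below (using the "cryptography" package, deliberately with a fixed nonce to make encryption deterministic) is a minimal illustration, not any specific scheme from the chapter.

```python
import hashlib
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def mle_encrypt(plaintext: bytes):
    key = hashlib.sha256(plaintext).digest()        # K = H(M)
    nonce = b"\x00" * 16                            # fixed nonce: determinism is the point
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    ciphertext = enc.update(plaintext) + enc.finalize()
    tag = hashlib.sha256(ciphertext).hexdigest()    # dedup tag the server compares
    return key, ciphertext, tag

k1, c1, t1 = mle_encrypt(b"shared attachment")
k2, c2, t2 = mle_encrypt(b"shared attachment")
assert c1 == c2 and t1 == t2   # two independent clients produce the same ciphertext
# Each client keeps its own copy of `key`; the server stores `ciphertext` once.
```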


2021
Author(s): Ruba S, A.M. Kalpana

Abstract: Deduplication is a data redundancy removal method designed to save system storage resources through redundant data reduction in cloud storage. Nowadays, deduplication techniques are increasingly applied in cloud data centers as cloud computing grows, and many deduplication methods have been proposed to eliminate redundant data in cloud storage. For secure deduplication, previous works typically introduce third-party auditors for data integrity verification, but this may lead to data leakage through the auditors themselves. Conventional methods also face difficulties in big-data deduplication when trying to reconcile the two conflicting aims of a high duplicate elimination ratio and high deduplication throughput. In this paper, an improved blockchain-based secure data deduplication scheme with efficient cryptographic methods is presented to save cloud storage securely. In the proposed method, an attribute-based role key generation (ARKG) method is constructed in a hierarchical tree manner to generate a role key when data owners upload their data to the cloud service provider (CSP) and to allow authorized users to download the data. The smart contract (the agreement between the data owner and the CSP) uses SHA-256 (Secure Hash Algorithm-256) to generate a tamper-proof ledger for data integrity, in which data is protected from illegal modification, and duplicate detection is performed through hash tags, also formed with SHA-256. Message-Locked Encryption (MLE) is employed to encrypt data uploaded by the data owners to the CSP. The experimental results show that the proposed secure deduplication scheme can deliver higher throughput and a low duplicate elimination ratio.
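
To illustrate the tamper-proof ledger idea in isolation (a simplification, not the paper's full smart-contract and ARKG setup), the sketch below chains SHA-256 hashes so that modifying any stored record invalidates every later entry.

```python
import hashlib, json, time

ledger = []

def append_record(owner: str, file_tag: str) -> dict:
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    body = {"owner": owner, "file_tag": file_tag,
            "timestamp": time.time(), "prev": prev_hash}
    body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    ledger.append(body)
    return body

def verify_ledger() -> bool:
    prev = "0" * 64
    for entry in ledger:
        copy = {k: v for k, v in entry.items() if k != "hash"}
        recomputed = hashlib.sha256(json.dumps(copy, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False            # a record was altered after being appended
        prev = entry["hash"]
    return True

append_record("alice", "tag-of-uploaded-ciphertext")
append_record("bob", "tag-of-uploaded-ciphertext")   # same tag -> duplicate detected
assert verify_ledger()
```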


2017, Vol. 2 (7), pp. 14-19
Author(s): X. Alphonse Inbaraj, A. Seshagiri Rao

Security has been a concern since the early days of computing, when a computer was isolated in a room and a threat could only be posed by malicious insiders. To support authorized data deduplication in cloud computing, encryption is applied before data is outsourced. Data deduplication stores a single copy of identical data in cloud storage and keeps bandwidth consumption low. Third-party control generates a spectrum of concerns caused by the lack of transparency and limited user control. For example, a cloud provider may subcontract some resources from a third party whose level of trust is questionable. There are examples where subcontractors failed to maintain customer data, and others where the third party was not a subcontractor but a hardware supplier and the data loss was caused by poor-quality storage devices [12]. To overcome these integrity and security problems, this paper makes a first attempt at applying data coloring and watermarking techniques to shared data objects. A Merkle hash tree [11] is then applied to tighten access control for sensitive data in both private and public clouds.
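
For reference, a Merkle hash tree of the kind cited above [11] can be built as in the short sketch below: leaves are hashes of data blocks, each internal node hashes the concatenation of its children, and the root is a compact commitment to the whole object.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks: list) -> bytes:
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"block-0", b"block-1", b"block-2", b"block-3"])
# Changing any single block changes the root, which is what makes the tree
# useful for integrity checks and fine-grained access verification.
```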


Cloud storage can be described as a service model where raw or processed data is stored, handled, and backed up remotely while remaining accessible to multiple users simultaneously over a network. Among its ideal features are reliability, easy deployment, disaster recovery, data security, accessibility and, on top of that, lower overall storage costs, which remove the burden of purchasing and maintaining storage technology. In today's technology landscape, massive amounts of data are produced every day, so handling such big data on demand has become a challenging task for current data storage systems. The process of eliminating redundant copies of data, and thereby reducing the storage overhead, is termed data deduplication (DD). One of the ultimate aims of this research is to achieve ideal deduplication on client-side secured data. When clients' data are encrypted with different keys, cross-user deduplication becomes practically impossible, while sharing a single encryption key among multiple users leads to an insecure system that falls short of clients' expectations. The proposed research adopts the Message-Locked Encryption (MLE) technique, which looks for redundant files in the cloud before uploading the client's file and thereby reduces storage. Since redundant files are not re-uploaded, network bandwidth is considerably reduced compared with uploading the same content several times.
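
A minimal sketch of the check-before-upload flow such an approach relies on is shown below: the client sends only a fingerprint first and transfers the file body only when the server has not seen it. The class and method names are illustrative, not the paper's API.

```python
import hashlib

class StorageServer:
    def __init__(self):
        self.objects = {}                       # fingerprint -> stored ciphertext

    def has(self, fingerprint: str) -> bool:
        return fingerprint in self.objects

    def upload(self, fingerprint: str, ciphertext: bytes) -> None:
        self.objects[fingerprint] = ciphertext

def client_put(server: StorageServer, ciphertext: bytes) -> bool:
    """Return True only if the file body was actually transferred."""
    fingerprint = hashlib.sha256(ciphertext).hexdigest()
    if server.has(fingerprint):                 # duplicate: save storage and bandwidth
        return False
    server.upload(fingerprint, ciphertext)
    return True

server = StorageServer()
assert client_put(server, b"mle-ciphertext") is True    # first uploader sends the body
assert client_put(server, b"mle-ciphertext") is False   # second user: no transfer needed
```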


2014, Vol. 556-562, pp. 6223-6227
Author(s): Chao Ling Li, Yue Chen

To deduplicate sensitive data in a cloud storage center, a scheme called MHT-Dedup, based on the Merkle Hash Tree (MHT), is proposed. It achieves cross-user file-level client-side deduplication and local block-level client-side deduplication concurrently. It first encrypts the file at block granularity, then authenticates the file ciphertext to find duplicated files via Proofs of oWnership (PoW) and checks the hashes of block plaintexts to find duplicated blocks. In the PoW protocol of MHT-Dedup, an authenticating binary tree is generated from the tags of the encrypted blocks to reliably identify duplicated files. MHT-Dedup resolves the conflict between data deduplication and encryption, achieves file-level and block-level deduplication concurrently, prevents misuse of the storage system by users, resists inside and outside attacks on data confidentiality, and prevents target collision attacks on files and brute-force attacks on blocks.
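
The sketch below gives the flavour of such a PoW check (a simplification under stated assumptions, not the full MHT-Dedup protocol): a Merkle tree is built over the tags of encrypted blocks, the server keeps only the root, and a client claiming to own the file must answer a challenge on a randomly chosen leaf with a valid sibling path.

```python
import hashlib, secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaf_tags):
    """Return all tree levels; levels[0] are the leaves, levels[-1][0] is the root."""
    levels = [list(leaf_tags)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:                      # duplicate the last node on odd levels
            cur = cur + [cur[-1]]
        levels.append([H(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling path for one leaf, as the prover would."""
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))   # (hash, sibling_is_left)
        index //= 2
    return path

def verify(root, leaf_tag, path):
    """Recompute the root from a leaf tag and its sibling path."""
    node = leaf_tag
    for sibling, sibling_is_left in path:
        node = H(sibling + node) if sibling_is_left else H(node + sibling)
    return node == root

tags = [H(b"encrypted-block-%d" % i) for i in range(5)]  # tags of encrypted blocks
levels = build_tree(tags)
root = levels[-1][0]                         # the server stores only this root

challenge = secrets.randbelow(len(tags))     # server challenges a random block
proof = prove(levels, challenge)             # prover answers from its own copy
assert verify(root, tags[challenge], proof)
```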

