A Secure Method for Managing Data in Cloud
Storage using Deduplication and Enhanced Fuzzy
Based Intrusion Detection Framework

Cloud services increase data availability so as to offer flawless service to the client. Because of increasing data availability, more redundancies and more memory space are required to store such data. Cloud computing requires essential storage and efficient protection for all types of data. With the amount of data produced seeing an exponential increase with time, storing the replicated data contents is inevitable. Hence, using storage optimization approaches becomes an important pre-requisite for enormous storage domains like cloud storage. Data deduplication is the technique which compresses the data by eliminating the replicated copies of similar data and it is widely utilized in cloud storage to conserve bandwidth and minimize the storage space. Despite the data deduplication eliminates data redundancy and data replication; it likewise presents significant data privacy and security problems for the end-user. Considering this, in this work, a novel security-based deduplication model is proposed to reduce a hash value of a given file size and provide additional security for cloud storage. In proposed method the hash value of a given file is reduced employing Distributed Storage Hash Algorithm (DSHA) and to provide security the file is encrypted by using an Improved Blowfish Encryption Algorithm (IBEA). This framework also proposes the enhanced fuzzy based intrusion detection system (EFIDS) by defining rules for the major attacks, thereby alert the system automatically. Finally the combination of data exclusion and security encryption technique allows cloud users to effectively manage their cloud storage by avoiding repeated data encroachment. It also saves bandwidth and alerts the system from attackers. The results of experiments reveal that the discussed algorithm yields improved throughput and bytes saved per second in comparison with other chunking algorithms.

Download Full-text

An extensive research survey on data integrity and deduplication towards privacy in cloud storage

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i2.pp2011-2022 ◽

2020 ◽

Vol 10 (2) ◽

pp. 2011

Author(s):

Anil Kumar G. ◽

Shantala C. P.

Keyword(s):

Cloud Storage ◽

Data Privacy ◽

Storage System ◽

Data Integrity ◽

Data Deduplication ◽

Research Issues ◽

Existing Problems ◽

Research Survey ◽

Challenging Tasks ◽

Future Direction

Owing to the highly distributed nature of the cloud storage system, it is one of the challenging tasks to incorporate a higher degree of security towards the vulnerable data. Apart from various security concerns, data privacy is still one of the unsolved problems in this regards. The prime reason is that existing approaches of data privacy doesn't offer data integrity and secure data deduplication process at the same time, which is highly essential to ensure a higher degree of resistance against all form of dynamic threats over cloud and internet systems. Therefore, data integrity, as well as data deduplication is such associated phenomena which influence data privacy. Therefore, this manuscript discusses the explicit research contribution toward data integrity, data privacy, and data deduplication. The manuscript also contributes towards highlighting the potential open research issues followed by a discussion of the possible future direction of work towards addressing the existing problems.

Download Full-text

A Global Survey on Data Deduplication

International Journal of Grid and High Performance Computing ◽

10.4018/ijghpc.2018100103 ◽

2018 ◽

Vol 10 (4) ◽

pp. 43-66 ◽

Cited By ~ 3

Author(s):

Shubhanshi Singhal ◽

Pooja Sharma ◽

Rajesh Kumar Aggarwal ◽

Vishal Passricha

Keyword(s):

Data Reduction ◽

Cloud Storage ◽

Storage Systems ◽

Distributed Storage ◽

Digital Data ◽

Data Deduplication ◽

Computationally Efficient ◽

Redundant Data ◽

Single Instance ◽

Local Storage

This article describes how data deduplication efficiently eliminates the redundant data by selecting and storing only single instance of it and becoming popular in storage systems. Digital data is growing much faster than storage volumes, which shows the importance of data deduplication among scientists and researchers. Data deduplication is considered as most successful and efficient technique of data reduction because it is computationally efficient and offers a lossless data reduction. It is applicable to various storage systems, i.e. local storage, distributed storage, and cloud storage. This article discusses the background, components, and key features of data deduplication which helps the reader to understand the design issues and challenges in this field.

Download Full-text

Improving Data Availability for Deduplication in Cloud Storage

International Journal of Grid and High Performance Computing ◽

10.4018/ijghpc.2018040106 ◽

2018 ◽

Vol 10 (2) ◽

pp. 70-89 ◽

Cited By ~ 3

Author(s):

Jun Li ◽

Mengshu Hou

Keyword(s):

Conventional Method ◽

Cloud Storage ◽

Storage System ◽

Data Availability ◽

Redundant Information ◽

Storage Requirement ◽

Improved Method ◽

Data Deduplication ◽

Access Frequency ◽

Storage Overhead

This article describes how in order to reduce the amount of data, deduplication technology is introduced in the cloud storage. Adopting this technology, the duplicated data can be eliminated, users can conserve the storage requirement. However, deduplication technology also increases the data unavailability. To solve this problem, the authors propose a method to improve data availability in the deduplication storage system. It is based on the data chunk reference count and access frequency, and increases redundant information for the data chunks, to ensure data availability and minimize storage overhead. Extensive experiments are conducted to evaluate effectiveness of the improved method. WFD, CDC, and sliding block deduplication technology are used for comparison. The experimental results show that the proposed method can achieve higher data availability than the conventional method and increase little storage overhead.

Download Full-text

A Novel Intrusion Detection System for Pervasive Cloud Services

Proceedings of the Fifth International Conference on Parallel, Distributed, Grid and Cloud Computing for Engineering ◽

10.4203/ccp.111.18 ◽

2017 ◽

Author(s):

L. Sellami ◽

D. Idoughi ◽

K. Sellami ◽

P. F. Tiako

Keyword(s):

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Cloud Services

Download Full-text

Data Privacy Security Guaranteed Network Intrusion Detection System Based on Federated Learning

IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) ◽

10.1109/infocomwkshps51825.2021.9484545 ◽

2021 ◽

Author(s):

Jibo Shi ◽

Bin Ge ◽

Yang Liu ◽

Yu Yan ◽

Shuang Li

Keyword(s):

Intrusion Detection ◽

Intrusion Detection System ◽

Data Privacy ◽

Detection System ◽

Network Intrusion Detection ◽

Network Intrusion ◽

Network Intrusion Detection System

Download Full-text

Trust Hardware Based Secured Privacy Preserving Computation System for Three-Dimensional Data

Electronics ◽

10.3390/electronics10131546 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1546

Author(s):

Munan Yuan ◽

Xiaofeng Li ◽

Xiru Li ◽

Haibo Tan ◽

Jinlin Xu

Keyword(s):

Data Privacy ◽

High Performance ◽

Distributed Storage ◽

Three Dimensional ◽

Privacy Preserving ◽

Sensitive Data ◽

Privacy And Security ◽

Time Consumption ◽

Blockchain Technology ◽

3D Data

Three-dimensional (3D) data are easily collected in an unconscious way and are sensitive to lead biological characteristics exposure. Privacy and ownership have become important disputed issues for the 3D data application field. In this paper, we design a privacy-preserving computation system (SPPCS) for sensitive data protection, based on distributed storage, trusted execution environment (TEE) and blockchain technology. The SPPCS separates a storage and analysis calculation from consensus to build a hierarchical computation architecture. Based on a similarity computation of graph structures, the SPPCS finds data requirement matching lists to avoid invalid transactions. With TEE technology, the SPPCS implements a dual hybrid isolation model to restrict access to raw data and obscure the connections among transaction parties. To validate confidential performance, we implement a prototype of SPPCS with Ethereum and Intel Software Guard Extensions (SGX). The evaluation results derived from test datasets show that (1) the enhanced security and increased time consumption (490 ms in this paper) of multiple SGX nodes need to be balanced; (2) for a single SGX node to enhance data security and preserve privacy, an increased time consumption of about 260 ms is acceptable; (3) the transaction relationship cannot be inferred from records on-chain. The proposed SPPCS implements data privacy and security protection with high performance.

Download Full-text

PaaSword: A Holistic Data Privacy and Security by Design Framework for Cloud Services

Journal of Grid Computing ◽

10.1007/s10723-017-9394-2 ◽

2017 ◽

Vol 15 (2) ◽

pp. 219-234 ◽

Cited By ~ 11

Author(s):

Yiannis Verginadis ◽

Antonis Michalas ◽

Panagiotis Gouvas ◽

Gunther Schiefer ◽

Gerald Hübsch ◽

...

Keyword(s):

Data Privacy ◽

Cloud Services ◽

Design Framework ◽

Privacy And Security ◽

Security By Design

Download Full-text

Efficient top representative for multi-authorship encrypted cloud data to assist cognitive search

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189130 ◽

2020 ◽

Vol 39 (6) ◽

pp. 8079-8089

Author(s):

P. Shanthi ◽

A. Umamakeswari

Keyword(s):

Cloud Storage ◽

Data Privacy ◽

New Technologies ◽

Keyword Search ◽

Information Leakage ◽

Cloud Services ◽

Security And Privacy ◽

Business World ◽

Cloud Data ◽

Privacy And Confidentiality

Cloud computing is gaining ground in the digital and business world. It delivers storage service for user access using Internet as a medium. Besides the numerous benefits of cloud services, migrating to public cloud storage leads to security and privacy concerns. Encryption method protects data privacy and confidentiality. However, encrypted data stored in cloud storage reduces the flexibility in processing data. Therefore, the development of new technologies to search top representatives from encrypted public storage is the current requirement. This paper presents a similarity-based keyword search for multi-author encrypted documents. The proposed Authorship Attribute-Based Ranked Keyword Search (AARKS) encrypts documents using user attributes, and returns ranked results to authorized users. The scheme assigns weight to index vectors by finding the dominant keywords of the specific authority document collection. Search using the proposed indexing prunes away branches and processes only fewer nodes. Re-weighting documents using the relevant feedback also improves user experience. The proposed scheme ensures the privacy and confidentiality of data supporting the cognitive search for encrypted cloud data. Experiments are performed using the Enron dataset and simulated using a set of queries. The precision obtained for the proposed ranked retrieval is 0.7262. Furthermore, information leakage to a cloud server is prevented, thereby proving its suitability for public storage.

Download Full-text

Improving the Efficiency of Deduplication Process by Dedicated Hash Table for each Digital Data Type in Cloud Storage System

Webology ◽

10.14704/web/v18si01/web18060 ◽

2021 ◽

Vol 18 (Special Issue 01) ◽

pp. 288-301

Author(s):

G. Sujatha ◽

Dr. Jeberson Retna Raj

Keyword(s):

Data Storage ◽

Cloud Storage ◽

Storage System ◽

Hash Table ◽

Cloud Services ◽

Data Type ◽

Digital Data ◽

Storage Space ◽

Data Deduplication ◽

Worst Case

Data storage is one of the significant cloud services available to the cloud users. Since the magnitude of information outsourced grows extremely high, there is a need of implementing data deduplication technique in the cloud storage space for efficient utilization. The cloud storage space supports all kind of digital data like text, audio, video and image. In the hash-based deduplication system, cryptographic hash value should be calculated for all data irrespective of its type and stored in the memory for future reference. Using these hash value only, duplicate copies can be identified. The problem in this existing scenario is size of the hash table. To find a duplicate copy, all the hash values should be checked in the worst case irrespective of its data type. At the same time, all kind of digital data does not suit with same structure of hash table. In this study we proposed an approach to have multiple hash tables for different digital data. By having dedicated hash table for each digital data type will improve the searching time of duplicate data.

Download Full-text

STRENGTHNING THE PRODUCTIVITY OF STORAGE FOR BIG DATA STORAGE SYSTEMS USING DISTRIBUTED DEDUPLICATION

International Journal For Innovative Engineering and Management Research ◽

10.48047/ijiemr/v09/i12/114 ◽

2020 ◽

pp. 691-694

Keyword(s):

Data Storage ◽

Cloud Storage ◽

Service Providers ◽

Cloud Service ◽

Similar Data ◽

Data Deduplication ◽

Data Set ◽

Recovery Algorithm ◽

Network Bandwidth ◽

File Access

Cloud storage is one of the key features of cloud computing, which helps cloud users outsource large numbers of data without upgrading their devices. However, Cloud Service Providers (CSPs) data storage faces problems with data redundancy. The data deduplication technique aims at eliminating redundant information segments and maintains one single instance of the data set, even if any number of users own similar data set. Since blocks of data are spread on many servers, each block of the file has to be downloaded before restoring the file to decrease system output. We suggest a cloud storage server data recovery module to improve file access efficiency and reduce time spent on network bandwidth. Device coding is used in the suggested method to store blocks in distributed cloud storage, and for data integrity, MD5 (Message Digest 5) is used. Running recovery algorithm helps the user to retrieve a file directly from the cloud servers without downloading every block. The scheme proposed improves system time efficiency and the ability to access the stored data quickly. This reduces bandwidth consumption and reduces overhead user processing while downloading the data file.

Download Full-text