Comparative Analysis for Content Defined Chunking Algorithms in Data Deduplication

Webology ◽  
2021 ◽  
Vol 18 (Special Issue 02) ◽  
pp. 255-268
Author(s):  
D. Viji ◽  
Dr. S. Revathy

Data deduplication eliminates redundant data and thereby reduces storage consumption. Nowadays, more and more data is generated and stored in the cloud repeatedly, so a large volume of storage is consumed. Data deduplication tries to reduce data volumes, so that disk space and network bandwidth, and hence the cost and energy consumption of running storage systems, can be reduced. In the data deduplication method, data is broken into small chunks or blocks. A hash ID is calculated for each block and then compared with the existing blocks to detect duplicates. Blocks may be of fixed or variable size; compared with fixed-size blocks, variable-size chunking gives better results. The chunking process is therefore the initial task of deduplication and determines how good the final result is. In this paper, we discuss various content-defined chunking algorithms and compare their performance on chunking properties such as chunking speed, processing time, and throughput.
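
As a rough illustration of the chunking-and-fingerprinting pipeline described above, the following Python sketch splits a byte stream at content-defined cut points found with a simple rolling hash and indexes each chunk by its SHA-256 hash ID. The window, mask, and chunk-size parameters are illustrative assumptions, not values from the paper, and the sketch does not reproduce any specific algorithm compared in it.

```python
import hashlib
import random

# Illustrative parameters (not taken from the paper).
MIN_CHUNK = 2 * 1024      # minimum chunk size in bytes
MAX_CHUNK = 64 * 1024     # maximum chunk size in bytes
MASK = 0x1FFF             # gives an average chunk size of roughly 8 KiB

# Gear-style table mapping each byte value to a pseudo-random 32-bit word.
random.seed(42)
GEAR = [random.getrandbits(32) for _ in range(256)]

def cdc_chunks(data: bytes):
    """Yield content-defined chunks: cut where the rolling hash matches the mask."""
    start, i, h, n = 0, 0, 0, len(data)
    while i < n:
        h = ((h << 1) + GEAR[data[i]]) & 0xFFFFFFFF
        i += 1
        length = i - start
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield data[start:i]
            start, h = i, 0
    if start < n:
        yield data[start:]

def deduplicate(data: bytes):
    """Keep one copy per unique chunk, keyed by its SHA-256 hash ID."""
    store = {}     # hash ID -> chunk bytes (unique chunks only)
    recipe = []    # ordered hash IDs needed to rebuild the original data
    for chunk in cdc_chunks(data):
        hid = hashlib.sha256(chunk).hexdigest()
        store.setdefault(hid, chunk)   # a duplicate chunk is not stored again
        recipe.append(hid)
    return store, recipe

if __name__ == "__main__":
    payload = b"A" * 100_000 + b"one inserted edit" + b"A" * 100_000
    store, recipe = deduplicate(payload)
    print(len(recipe), "chunks,", len(store), "unique")
```

Because the cut points depend on content rather than fixed offsets, an insertion in the middle of the data only disturbs the chunks around the edit, which is why variable-size chunking typically deduplicates better than fixed-size blocking.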

2018 ◽  
Vol 10 (4) ◽  
pp. 43-66 ◽  
Author(s):  
Shubhanshi Singhal ◽  
Pooja Sharma ◽  
Rajesh Kumar Aggarwal ◽  
Vishal Passricha

This article describes how data deduplication efficiently eliminates redundant data by selecting and storing only a single instance of it, and why it is becoming popular in storage systems. Digital data is growing much faster than storage volumes, which underlines the importance of data deduplication for scientists and researchers. Data deduplication is considered the most successful and efficient data reduction technique because it is computationally efficient and offers lossless data reduction. It is applicable to various storage systems, i.e., local storage, distributed storage, and cloud storage. This article discusses the background, components, and key features of data deduplication, helping the reader understand the design issues and challenges in this field.
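
The single-instance principle mentioned above can be illustrated with a small, hypothetical store that keeps one copy per unique content hash and a reference count per copy; this is a generic sketch, not a component described in the article.

```python
import hashlib

class SingleInstanceStore:
    """Toy single-instance store: identical content is kept once and shared
    via a reference count. A generic sketch, not a component of the article."""

    def __init__(self):
        self.blobs = {}   # content hash -> the single stored copy
        self.refs = {}    # content hash -> how many logical files point to it
        self.names = {}   # file name -> content hash

    def put(self, name: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        if digest not in self.blobs:          # first copy: actually store it
            self.blobs[digest] = content
        self.refs[digest] = self.refs.get(digest, 0) + 1
        self.names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]

    def stored_bytes(self) -> int:
        return sum(len(b) for b in self.blobs.values())

store = SingleInstanceStore()
for fname in ("a.txt", "b.txt", "c.txt"):
    store.put(fname, b"the same attachment saved three times")
print(store.stored_bytes())   # size of one copy, not three
```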


2018 ◽  
Vol 7 (2.8) ◽  
pp. 13
Author(s):  
B Tirapathi Reddy ◽  
M V. P. Chandra Sekhara Rao

Storing data in the cloud has become a necessity as users accumulate abundant data every day and run out of physical storage devices. However, a majority of the data in cloud storage is redundant. Data deduplication using convergent key encryption is the mechanism popularly used to eliminate redundant data items in cloud storage, but convergent key encryption suffers from various drawbacks. For instance, if data items are deduplicated based on a convergent key, an unauthorized user can compromise the cloud storage simply by presenting a guessed hash of a file. Ensuring ownership of the data items is therefore essential to protect them. As the Cuckoo filter offers a minimal false positive rate with minimal space overhead, our mechanism uses it to provide the proof of ownership.
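
The following Python sketch illustrates the two ideas named above: a convergent key derived from the content itself (so identical plaintexts encrypt to identical, deduplicable ciphertexts) and a proof-of-ownership challenge that a guessed hash alone cannot answer. The cipher here is a toy SHA-256 keystream rather than real encryption, and a plain set stands in for the Cuckoo filter used in the paper; all names and parameters are illustrative assumptions.

```python
import hashlib, os

def convergent_key(plaintext: bytes) -> bytes:
    """Convergent key: derived from the content, so identical plaintexts
    always yield the same key and hence the same ciphertext."""
    return hashlib.sha256(plaintext).digest()

def toy_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """Deterministic toy cipher (SHA-256 keystream XOR). Illustration only,
    NOT cryptographically secure; a real system would use a block cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(plaintext):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(p ^ k for p, k in zip(plaintext, stream))

class DedupServer:
    """Keeps one ciphertext per content and, before linking a 'duplicate'
    upload to an existing copy, runs a proof-of-ownership challenge so a
    guessed hash alone is not enough. (The paper uses a Cuckoo filter for
    the membership index; a plain set stands in for it here.)"""

    def __init__(self):
        self.index = set()   # fingerprints of stored ciphertexts
        self.blobs = {}      # fingerprint -> ciphertext

    def upload(self, ciphertext: bytes) -> str:
        fp = hashlib.sha256(ciphertext).hexdigest()
        if fp not in self.index:
            self.index.add(fp)
            self.blobs[fp] = ciphertext
        return fp

    def challenge(self, fp: str):
        blob = self.blobs[fp]
        start = int.from_bytes(os.urandom(2), "big") % max(1, len(blob) - 16)
        return start, 16      # "prove you hold bytes [start, start + 16)"

    def verify(self, fp: str, start: int, length: int, answer: bytes) -> bool:
        expected = hashlib.sha256(self.blobs[fp][start:start + length]).digest()
        return answer == expected

# Client side: encrypt under the convergent key, then answer the challenge.
data = b"quarterly-report.pdf contents ..." * 100
ct = toy_encrypt(data, convergent_key(data))

server = DedupServer()
fp = server.upload(ct)
start, length = server.challenge(fp)
proof = hashlib.sha256(ct[start:start + length]).digest()
print(server.verify(fp, start, length, proof))   # True only if the client holds the data
```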


Author(s):  
Sumit Kumar Mahana ◽  
Rajesh Kumar Aggarwal

In the present digital scenario, data is of prime significance for individuals and even more so for organizations. As time passes, the volume of data being produced increases exponentially, which poses a serious concern: the huge amount of redundant data stored on the cloud places an unacceptable load on the cloud storage systems themselves. Therefore, a storage optimization strategy is a fundamental prerequisite for cloud storage systems. Data deduplication is a storage optimization strategy that deletes identical copies of redundant data, optimizes bandwidth, improves utilization of storage space, and hence minimizes storage cost. To guarantee security, the data stored on the cloud must be kept in encrypted form. Consequently, performing deduplication safely over encrypted information in the cloud is a challenging job. This chapter discusses various existing data deduplication techniques, with a focus on securing data on the cloud, that address this challenge.


The enormous growth of digital data, especially data in unstructured formats, poses a tremendous challenge for data analysis as well as for data storage systems, and it steadily increases the cost and degrades the performance of backup systems. Traditional systems do not provide any optimization technique to keep duplicated data from being backed up. Data deduplication has therefore become an essential and economical capacity optimization technique that eliminates redundant data. The following paper reviews the deduplication process, the types of deduplication, and the techniques available for data deduplication. In addition, many approaches proposed by various researchers for deduplication in big data storage systems are studied and compared.


2018 ◽  
Vol 8 (11) ◽  
pp. 2216
Author(s):  
Jiahui Jin ◽  
Qi An ◽  
Wei Zhou ◽  
Jiakai Tang ◽  
Runqun Xiong

Network bandwidth is a scarce resource in big data environments, so data locality is a fundamental problem for data-parallel frameworks such as Hadoop and Spark. This problem is exacerbated in multicore server-based clusters, where multiple tasks running on the same server compete for the server’s network bandwidth. Existing approaches solve this problem by scheduling computational tasks near the input data and considering the server’s free time, data placements, and data transfer costs. However, such approaches usually set identical values for data transfer costs, even though a multicore server’s data transfer cost increases with the number of data-remote tasks; as a result, data-processing time is not minimized effectively. As a solution, we propose DynDL (Dynamic Data Locality), a novel data-locality-aware task-scheduling model that handles dynamic data transfer costs for multicore servers. DynDL offers greater flexibility than existing approaches by using a set of non-decreasing functions to evaluate dynamic data transfer costs. We also propose online and offline algorithms (based on DynDL) that minimize data-processing time and adaptively adjust data locality. Although the scheduling problem behind DynDL is NP-complete (nondeterministic polynomial-complete), we prove that the offline algorithm runs in quadratic time and generates optimal results for DynDL’s specific uses. Using a series of simulations and real-world executions, we show that, in terms of data-processing time, our algorithms are 30% better than algorithms that do not consider dynamic data transfer costs. Moreover, they can adaptively adjust data localities based on the server’s free time, data placement, and network bandwidth, and schedule tens of thousands of tasks within subseconds or seconds.
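
The cost model can be illustrated with a small greedy placement sketch in which each server charges a non-decreasing transfer cost that grows with the number of data-remote tasks already assigned to it. This is a simplified illustration of the idea, not the authors' online or offline algorithm, and all parameters are assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Server:
    name: str
    free_at: float = 0.0          # time at which the server becomes free
    remote_tasks: int = 0         # data-remote tasks assigned so far
    # Non-decreasing cost: each extra remote task makes transfers slower.
    transfer_cost: Callable[[int], float] = lambda k: 2.0 + 0.5 * k

@dataclass
class Task:
    name: str
    duration: float
    data_on: str                  # server holding the task's input data

def schedule(tasks: List[Task], servers: Dict[str, Server]):
    """Greedy locality-aware placement: pick the server with the smallest
    finish time, where a remote placement pays the dynamic transfer cost."""
    plan = []
    for task in tasks:
        best, best_finish = None, float("inf")
        for s in servers.values():
            remote = s.name != task.data_on
            cost = s.transfer_cost(s.remote_tasks) if remote else 0.0
            finish = s.free_at + cost + task.duration
            if finish < best_finish:
                best, best_finish = s, finish
        remote = best.name != task.data_on
        best.free_at = best_finish
        best.remote_tasks += int(remote)
        plan.append((task.name, best.name, "remote" if remote else "local", best_finish))
    return plan

servers = {n: Server(n) for n in ("s1", "s2")}
tasks = [Task(f"t{i}", duration=4.0, data_on="s1") for i in range(4)]
for row in schedule(tasks, servers):
    print(row)
```

In the toy run above, the scheduler moves some tasks off the server that holds the data only while the growing remote-transfer cost still pays off, which is the trade-off DynDL models explicitly.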


10.28945/3033 ◽  
2006 ◽  
Author(s):  
G. Adesola Aderounmu ◽  
Bosede Oyatokun ◽  
Matthew Adigun

This paper presents a comparative analysis of the Remote Method Invocation (RMI) and Mobile Agent (MA) paradigms used to implement an information storage and retrieval system in a distributed computing environment. A simulation program was developed in an object-oriented programming language to measure the performance of MA and RMI, using search time, fault tolerance, and invocation cost as the performance parameters in this research work. Experimental results showed that the Mobile Agent paradigm offers superior performance compared to the RMI paradigm: it achieves faster computation and incurs lower invocation cost by making local invocations instead of remote invocations over the network, thereby reducing network bandwidth consumption. Finally, MA has better fault tolerance than RMI; with a probability of failure pr = 0.1, the mobile agent degrades gracefully.
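
A back-of-envelope model makes the invocation-cost argument concrete: RMI pays a network round trip per call, while a mobile agent pays roughly one trip to migrate and one to return results, with all intermediate invocations local. The Python sketch below uses assumed latency, bandwidth, and payload figures, not the paper's simulation parameters.

```python
# Toy cost model comparing RMI with a Mobile Agent (MA).
# All parameters below are illustrative assumptions, not the paper's values.
LATENCY = 0.05          # seconds per network round trip
BANDWIDTH = 1_000_000   # bytes per second
LOCAL_CALL = 0.001      # seconds for a local invocation

def rmi_time(calls: int, request_bytes: int, reply_bytes: int) -> float:
    """Every invocation crosses the network."""
    per_call = LATENCY + (request_bytes + reply_bytes) / BANDWIDTH
    return calls * per_call

def agent_time(calls: int, agent_bytes: int, result_bytes: int) -> float:
    """Migrate once, invoke locally, ship the aggregated results back once."""
    migrate = LATENCY + agent_bytes / BANDWIDTH
    work = calls * LOCAL_CALL
    ship_back = LATENCY + result_bytes / BANDWIDTH
    return migrate + work + ship_back

for n in (10, 100, 1000):
    print(n, "calls:",
          "RMI", round(rmi_time(n, 200, 2_000), 3), "s,",
          "MA", round(agent_time(n, 50_000, 20_000), 3), "s")
```

As the number of invocations grows, the agent's fixed migration overhead is amortized while RMI's per-call network cost keeps accumulating, which matches the qualitative conclusion of the paper.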


2013 ◽  
Vol 2 (3) ◽  
pp. 58-71
Author(s):  
Tudorica Bogdan George

The application presented in the following subsections intends to cover one of the noticeable gaps in the NoSQL domain, namely the relative lack of working administration tools for the new large-scale data storage systems. Following a comparative analysis of the NoSQL solutions on the market, the MongoDB system was chosen as the target application for this step of development, for reasons mainly related to proven performance, flexibility, existing market presence, and ease of use.
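
As a minimal example of the kind of administrative probe such a tool might issue against MongoDB, the following sketch uses the official pymongo driver to read server status and per-database storage statistics; the connection string and the specific metrics shown are assumptions for illustration, not details taken from the article.

```python
# Minimal administrative probe against a MongoDB instance using pymongo.
# The connection string is a placeholder; adjust it for the target deployment.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017", serverSelectionTimeoutMS=2000)

# Server-wide health information: uptime, connection counts, etc.
status = client.admin.command("serverStatus")
print("uptime (s):", status["uptime"])
print("current connections:", status["connections"]["current"])

# Per-database storage statistics, the raw material for an admin dashboard.
for name in client.list_database_names():
    stats = client[name].command("dbStats")
    print(name, "data size (bytes):", stats["dataSize"])
```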

