A Novel File-Level Continuous Data Protection System

2012 ◽  
Vol 566 ◽  
pp. 406-413
Author(s):  
Si Han ◽  
Wen Bin Yao ◽  
Bo Shi Liu ◽  
Cong Wang

Continuous Data Protection (CDP) is a data recovery method that protects file systems against malicious attacks and user mistakes. This paper proposes BCFBS (BUPT Continuous File Backup System), a continuous data protection architecture at the file level. Compared with other approaches, it uses a caching technique to preserve consistency between file versions, thereby speeding up both file-version backup and space recycling. Furthermore, BCFBS combines file-type filtering and adjustable backup frequency with incremental backup to offset the storage waste of traditional CDP. Experimental results demonstrate that BCFBS can reduce storage space usage by 50%.
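The abstract names three space-saving measures (file-type filtering, a limit on backup frequency, and incremental backup) without giving implementation details. The following is only a minimal sketch of how such checks might be chained on each write event; the class `VersionStore`, the suffix list, and the five-second interval are hypothetical and are not taken from BCFBS. Skipping writes whose content is unchanged is used here as a crude stand-in for incremental backup.

```python
import hashlib
import time

# Hypothetical settings, chosen only for illustration.
SKIPPED_SUFFIXES = (".tmp", ".swp", ".log")  # file types excluded from backup
MIN_INTERVAL_S = 5.0                         # minimum seconds between versions of one file


class VersionStore:
    """Toy in-memory version store; not the BCFBS implementation."""

    def __init__(self):
        self.versions = {}     # path -> list of (timestamp, content digest)
        self.last_backup = {}  # path -> timestamp of the last stored version

    def on_write(self, path, data, now=None):
        """Return True if a new version was recorded for this write event."""
        now = time.time() if now is None else now
        if path.endswith(SKIPPED_SUFFIXES):
            return False  # file-type filter
        last = self.last_backup.get(path)
        if last is not None and now - last < MIN_INTERVAL_S:
            return False  # backup-frequency limit
        digest = hashlib.sha256(data).hexdigest()
        history = self.versions.setdefault(path, [])
        if history and history[-1][1] == digest:
            return False  # content unchanged since the last stored version
        history.append((now, digest))
        self.last_backup[path] = now
        return True
```

A real system would store deltas or file contents rather than digests alone; the sketch only shows where each filter could sit in the decision path.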

2021 ◽  
Vol 2 (2) ◽  
pp. 1-16
Author(s):  
Ru Yang ◽  
Yuhui Deng ◽  
Yi Zhou ◽  
Ping Huang

Restoring data is the main purpose of data backup in storage systems. The fragmentation issue, caused by physically scattering logically continuous data across a variety of disk locations, degrades the restoring performance of a deduplication system. Rewriting algorithms alleviate the fragmentation problem and thereby improve the restoring speed of a deduplication system. However, rewriting methods sacrifice a substantial part of the deduplication ratio, wasting considerable storage space. Furthermore, traditional backup approaches treat file metadata and chunk metadata identically, which causes frequent on-disk metadata accesses. In this article, we start by analyzing storage characteristics of backup metadata. An intriguing finding shows that with 10 million files, the file metadata takes up only approximately 340 MB. Motivated by this finding, we propose a Classified-Metadata based Restoring method (CMR) that classifies backup metadata into file metadata and chunk metadata. Because the file metadata takes up so little space, CMR maintains all file metadata in memory, whereas chunk metadata are aggressively prefetched to memory in a greedy manner. A deduplication system with CMR in place exhibits three salient features: (i) it avoids rewriting algorithms’ additional overhead by reducing the number of disk reads in a restoring process, (ii) it increases the restoring throughput without sacrificing the deduplication ratio, and (iii) it thoroughly leverages the hardware resources to boost the restoring performance. To quantitatively evaluate the performance of CMR, we compare it against two state-of-the-art approaches, namely, a history-aware rewriting method (HAR) and a context-based rewriting scheme (CAP). The experimental results show that, compared to HAR and CAP, CMR reduces the restoring time by 27.2% and 29.3%, respectively. Moreover, the deduplication ratio is improved by 1.91% and 4.36%, respectively.
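Two points in the abstract lend themselves to a small illustration: the figure of roughly 340 MB for 10 million files works out to about 34 bytes of file metadata per file (taking 1 MB as 10^6 bytes), and the split between in-memory file metadata and prefetched chunk metadata can be mimicked with a two-tier lookup. The sketch below is an assumption-laden toy, not the CMR implementation: the class `ClassifiedMetadata`, its parameters, and the simple sequential prefetch are all hypothetical, since the abstract does not describe the actual prefetching policy.

```python
from collections import OrderedDict


def bytes_per_file(total_mb, n_files, mb=10**6):
    """Average metadata size per file, assuming 1 MB = 10**6 bytes by default."""
    return total_mb * mb / n_files


class ClassifiedMetadata:
    """Toy two-tier metadata lookup (hypothetical; not the paper's code)."""

    def __init__(self, file_recipes, disk_chunk_meta, cache_size=1024, prefetch=8):
        # File metadata (file -> ordered chunk ids) is held entirely in memory.
        self.file_recipes = dict(file_recipes)
        # Chunk metadata is a list of (chunk_id, location) in simulated on-disk order.
        self.disk_order = list(disk_chunk_meta)
        self.index = {cid: i for i, (cid, _) in enumerate(self.disk_order)}
        self.cache = OrderedDict()  # LRU cache of chunk_id -> location
        self.cache_size = cache_size
        self.prefetch = prefetch
        self.disk_reads = 0         # number of simulated disk accesses

    def _lookup(self, chunk_id):
        if chunk_id in self.cache:
            self.cache.move_to_end(chunk_id)
            return self.cache[chunk_id]
        # Cache miss: one simulated disk read fetches the chunk plus its successors.
        self.disk_reads += 1
        start = self.index[chunk_id]
        batch = self.disk_order[start:start + 1 + self.prefetch]
        for cid, loc in reversed(batch):  # requested chunk ends up most recent
            self.cache[cid] = loc
            self.cache.move_to_end(cid)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict least recently used
        return batch[0][1]

    def restore_plan(self, name):
        """Return the chunk locations needed to restore one file, in order."""
        return [self._lookup(cid) for cid in self.file_recipes[name]]
```

With chunks laid out contiguously, restoring a four-chunk file with `prefetch=3` costs one simulated disk read, against four reads with `prefetch=0`, which is the kind of reduction in metadata accesses the abstract attributes to prefetching.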


2011 ◽  
Vol 2 ◽  
pp. 54-69
Author(s):  
Leon Mugoh ◽  
Ismail Lukandu Ateya ◽  
Bernard Shibwabo Kasamani

2011 ◽  
Vol 22 (10) ◽  
pp. 2523-2537
Author(s):  
Xiao LI ◽  
Yu-An TAN ◽  
Yuan-Zhang LI

2021 ◽  
pp. 8-17
Author(s):  
Amer Ramadan ◽  

This paper reports on an in-depth examination of the impact of backing filesystems on Docker performance in the context of Linux container-based virtualization. The experimental design was a 3x3x4 arrangement, i.e., we considered three different numbers of Docker containers, three filesystems (Ext4, XFS and Btrfs), and four application workloads related to Web server I/O activity, e-mail server I/O activity, file server I/O activity and random file access I/O activity, respectively. The experimental results indicate that Ext4 is the best-performing filesystem among those considered for the given experimental settings. In addition, the XFS filesystem is not suitable for workloads that are dominated by synchronous random write components (e.g., characteristic of the mail workload), while the Btrfs filesystem is not suitable for workloads dominated by random write and sequential write components (e.g., the file server workload).
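The 3x3x4 design described above amounts to 36 experimental configurations. As a small sketch, the full grid can be enumerated as follows; the abstract does not state the container counts, so the labels `N1`, `N2`, `N3` are placeholders, and the workload names are shorthand for the four I/O profiles it lists.

```python
from itertools import product

# Placeholder labels: the actual container counts are not given in the abstract.
container_counts = ["N1", "N2", "N3"]
filesystems = ["Ext4", "XFS", "Btrfs"]
# Shorthand names for the four I/O workloads listed in the abstract.
workloads = ["web server", "mail server", "file server", "random file access"]

# Every combination of container count, filesystem, and workload.
configurations = list(product(container_counts, filesystems, workloads))

print(len(configurations))  # 3 * 3 * 4 combinations
```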

