The Multi-Level Metadata Indexing in Mass Storage System

2012 ◽  
Vol 532-533 ◽  
pp. 818-822
Author(s):  
De Jiao Niu ◽  
Yong Zhao Zhan ◽  
Tao Cai

Metadata query plays an important role in mass storage systems. An efficient indexing algorithm can reduce the time and space costs that largely determine the efficiency of a mass storage system. In existing metadata management algorithms, time and space consumption is typically large and volatile. This paper presents a novel metadata indexing algorithm in which metadata queries are based on a two-level indexing strategy. Metadata is classified into two categories: active metadata and non-active metadata. A Bloom filter is used to generate a binary string for active metadata, and a B-tree is used to index each active partition, while a suitable hash function is selected for each non-active metadata partition. The results show that the multi-level metadata indexing algorithm can reduce the time and space costs of metadata queries.
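The two-level lookup described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `BloomFilter` and `TwoLevelIndex`, the filter size, and the number of hash functions are all assumptions. It also uses a single active and a single non-active partition, a sorted list with binary search as a stand-in for the B-tree, and a Python dict in place of a per-partition hash function.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Fixed-size Bloom filter: false positives possible, false negatives not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits      # assumed size, not taken from the paper
        self.num_hashes = num_hashes  # assumed count, not taken from the paper
        self.bits = 0                 # a Python int used as the binary string

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class TwoLevelIndex:
    """Active metadata: Bloom filter plus ordered index. Non-active: hash table."""

    def __init__(self):
        self.active_filter = BloomFilter()
        self.active_keys = []   # sorted keys; stand-in for a B-tree
        self.active_meta = []   # metadata records, parallel to active_keys
        self.inactive = {}      # dict as the hash index for non-active metadata

    def _find_active(self, path):
        i = bisect_left(self.active_keys, path)
        found = i < len(self.active_keys) and self.active_keys[i] == path
        return i, found

    def insert(self, path, meta, active):
        if not active:
            self.inactive[path] = meta
            return
        self.active_filter.add(path)
        i, found = self._find_active(path)
        if found:
            self.active_meta[i] = meta
        else:
            self.active_keys.insert(i, path)
            self.active_meta.insert(i, meta)

    def lookup(self, path):
        # Level 1: the Bloom filter cheaply rules out most non-active paths.
        if self.active_filter.might_contain(path):
            i, found = self._find_active(path)
            if found:
                return self.active_meta[i]
        # Level 2: fall back to the hash index (also covers false positives).
        return self.inactive.get(path)
```

A lookup for a path that is not active usually stops at the Bloom filter test and goes directly to the hash index; a false positive only costs one extra search of the ordered index.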

2017 ◽  
Vol 898 ◽  
pp. 062003
Author(s):  
Qiulan Huang ◽  
Ran Du ◽  
YaoDong Cheng ◽  
Jingyan Shi ◽  
Gang Chen ◽  
...  

2012 ◽  
Vol 490-495 ◽  
pp. 1034-1038
Author(s):  
Si Ma ◽  
Tao Cai ◽  
Yong Zhao Zhan

Metadata management algorithms play an important role in file system performance, and the file system is the most common way of accessing data in mass storage systems, so the metadata management algorithm is very important to the performance of a mass storage system. In this paper we analyze current metadata management algorithms and introduce a metadata dynamic hashing algorithm to address their poor adaptability. We implement a prototype on Lustre. After testing system performance with common tools while adjusting several parameters of the metadata dynamic hashing algorithm, we find that the I/O performance of the prototype is superior to that of Lustre. Furthermore, we analyze the impact of the dynamic hashing function's parameters on the I/O performance of the prototype.


2018 ◽  
Vol 228 ◽  
pp. 01011
Author(s):  
Haifeng Zhong ◽  
Jianying Xiong

Wide-area (WAN) Internet storage systems based on a Distributed Hash Table (DHT) use fully distributed data and metadata management to build an extensible and efficient mass storage system for Internet-based applications. However, such systems work in highly dynamic environments, and frequent joining and leaving of nodes leads to high communication costs. This paper therefore proposes a new hierarchical metadata routing management mechanism based on DHT, which makes full use of the node stabilization point to reduce the maintenance overhead of the overlay. Analysis shows that the algorithm can effectively improve efficiency and enhance stability.
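The abstract does not give the details of the mechanism, so the following Python sketch only illustrates the general idea of a hierarchical overlay, under stated assumptions: stable nodes alone form a consistent-hashing ring, and transient nodes attach beneath a stable node, so their joins and departures do not change the ring. The names `ring_hash` and `HierarchicalRing`, the 32-bit identifier space, and the grouping rule are all hypothetical.

```python
import hashlib
from bisect import bisect_right


def ring_hash(value):
    """Map a string to a point on a 2**32 identifier ring (assumed size)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")


class HierarchicalRing:
    """Two-level overlay: stable nodes on the ring, transient nodes below them."""

    def __init__(self, stable_nodes):
        # Upper level: only stable nodes receive a position on the ring.
        self.ring = sorted((ring_hash(n), n) for n in stable_nodes)
        self.points = [point for point, _ in self.ring]
        # Lower level: transient nodes grouped under one stable node each.
        self.members = {n: set() for n in stable_nodes}

    def owner(self, key):
        """Return the stable node that follows hash(key) clockwise on the ring."""
        i = bisect_right(self.points, ring_hash(key)) % len(self.ring)
        return self.ring[i][1]

    def join(self, transient_node):
        """Attach a transient node to a stable node; the ring is not modified."""
        parent = self.owner(transient_node)
        self.members[parent].add(transient_node)
        return parent

    def leave(self, transient_node):
        """Detach a transient node; again, no ring maintenance is needed."""
        for group in self.members.values():
            group.discard(transient_node)
```

In this sketch, metadata routing for a key depends only on the stable ring, so churn among transient nodes updates one membership set rather than triggering overlay-wide maintenance.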


2015 ◽  
Vol 608 ◽  
pp. 012013
Author(s):  
Pier Paolo Ricci ◽  
Alessandro Cavalli ◽  
Luca Dell'Agnello ◽  
Matteo Favaro ◽  
Daniele Gregori ◽  
...  

Author(s):  
HIROFUMI FUJII ◽  
RYOSUKE ITOH ◽  
ATSUSHI MANABE ◽  
AKIYA MIYAMOTO ◽  
YOUHEI MORITA ◽  
...  

2019 ◽  
Vol 214 ◽  
pp. 04009
Author(s):  
Alessandro Cavalli ◽  
Daniele Cesini ◽  
Enrico Fattibene ◽  
Andrea Prosperini ◽  
Vladimir Sapunenko

IBM Spectrum Protect (ISP) software, one of the leading solutions in data protection, contributes to the data management infrastructure operated at CNAF, the central computing and storage facility of INFN (Istituto Nazionale di Fisica Nucleare, the Italian National Institute for Nuclear Physics). It is used to manage about 55 petabytes of scientific data produced by the LHC (Large Hadron Collider at CERN) and other experiments in which INFN is involved, stored on tape resources as the highest-latency storage tier within the HSM (Hierarchical Space Management) environment. To accomplish this task, ISP works together with IBM Spectrum Scale (formerly GPFS, General Parallel File System) and GEMSS (Grid Enabled Mass Storage System), an in-house software layer that manages migration and recall queues. We also perform backup and archive operations for the main IT services running at CNAF, such as mail servers, configurations, repositories, documents, and logs. In this paper we present the current configuration of the HSM infrastructure and the backup and recovery service, with particular attention to issues related to the growing amount of scientific data expected in the coming years.

