Metadata Indexing Sub-System for Distributed File System

2011 ◽  
Vol 143-144 ◽  
pp. 864-868
Author(s):  
De Jiao Niu ◽  
Tao Cai ◽  
Yong Zhao Zhan ◽  
Shi Guang Ju

The efficiency of metadata indexing is important to the performance of a distributed file system. The time and space costs of current metadata management algorithms are unstable. In this paper, we use a B-tree to index the metadata of a distributed file system. Lustre is an open-source distributed file system that uses a hash function to manage metadata. We implement a prototype of the metadata indexing sub-system on Lustre and use IOzone to test the I/O performance of Lustre with and without the sub-system. The simulation results show that Lustre with the metadata indexing sub-system is more adaptable than Lustre with the hash-based metadata management algorithm.
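The contrast between an ordered index and a hash table can be illustrated with a small sketch. This is not the authors' implementation: Python has no standard B-tree, so a sorted list searched with `bisect` stands in for one (same ordered-lookup semantics, without the paged node structure). The class name `OrderedMetadataIndex` and its methods are hypothetical.

```python
import bisect


class OrderedMetadataIndex:
    """Sorted-key index; a stand-in for a B-tree over file paths (assumption)."""

    def __init__(self):
        self._keys = []     # file paths, kept in sorted order
        self._records = []  # metadata records, parallel to _keys

    def insert(self, path, record):
        i = bisect.bisect_left(self._keys, path)
        if i < len(self._keys) and self._keys[i] == path:
            self._records[i] = record  # overwrite an existing entry
        else:
            self._keys.insert(i, path)
            self._records.insert(i, record)

    def lookup(self, path):
        # Binary search: O(log n) comparisons, as in a B-tree descent.
        i = bisect.bisect_left(self._keys, path)
        if i < len(self._keys) and self._keys[i] == path:
            return self._records[i]
        return None

    def list_prefix(self, prefix):
        # Keys sharing a prefix are contiguous in sorted order, so a
        # directory listing is a short scan; a hash table would need a full scan.
        i = bisect.bisect_left(self._keys, prefix)
        result = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            result.append(self._keys[i])
            i += 1
        return result


index = OrderedMetadataIndex()
for p in ["/a/x", "/b/y", "/a/z", "/ab"]:
    index.insert(p, {"size": len(p)})
print(index.list_prefix("/a/"))
```

The ordered layout is what allows range and prefix queries; point lookups cost a logarithmic number of comparisons instead of the expected constant time of a hash table.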

2012 ◽  
Vol 532-533 ◽  
pp. 818-822
Author(s):  
De Jiao Niu ◽  
Yong Zhao Zhan ◽  
Tao Cai

Metadata queries play an important role in mass storage systems. An efficient indexing algorithm reduces the time and space costs that largely determine the efficiency of a mass storage system. In existing metadata management algorithms, these costs are typically large and volatile. This paper presents a novel metadata indexing algorithm whose query procedure is based on a two-level indexing strategy. Metadata is classified into two categories: active and non-active. A Bloom filter generates a binary string for active metadata, and a B-tree indexes each active partition, while a suitable hash function is selected for each non-active metadata partition. The results show that the multi-level metadata indexing algorithm reduces the time and space costs of metadata queries.
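A rough sketch of the two-level idea described in this abstract follows. It is an illustration under assumptions, not the paper's algorithm: the Bloom filter acts as a membership gate for active metadata, a sorted list stands in for the per-partition B-tree, and a Python `dict` stands in for the hash-based non-active partition. All class names, the filter size, and the number of hash functions are hypothetical.

```python
import bisect
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive several bit positions by salting one cryptographic hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class TwoLevelIndex:
    def __init__(self):
        self.active_filter = BloomFilter()
        self.active_keys = []     # sorted paths (B-tree stand-in)
        self.active_records = []  # parallel to active_keys
        self.inactive = {}        # hash table for non-active metadata

    def put(self, path, record, active):
        if not active:
            self.inactive[path] = record
            return
        self.active_filter.add(path)
        i = bisect.bisect_left(self.active_keys, path)
        if i < len(self.active_keys) and self.active_keys[i] == path:
            self.active_records[i] = record
        else:
            self.active_keys.insert(i, path)
            self.active_records.insert(i, record)

    def get(self, path):
        # Level 1: the filter cheaply rules out most non-active keys.
        if self.active_filter.might_contain(path):
            i = bisect.bisect_left(self.active_keys, path)
            if i < len(self.active_keys) and self.active_keys[i] == path:
                return self.active_records[i]
        # Level 2 (or a Bloom false positive): fall back to the hash table.
        return self.inactive.get(path)
```

Because a Bloom filter never yields false negatives, an active key is always found in the ordered index; a false positive only costs one extra search before the fallback.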


2012 ◽  
Vol 214 ◽  
pp. 584-590 ◽  
Author(s):  
De Jiao Niu ◽  
Tao Cai ◽  
Yong Zhao Zhan ◽  
Shi Guang Ju

Cloud storage is a hot topic in current research. Unlike previous work, we emphasize the importance of the metadata cache in the study of cloud storage, because the efficiency of the distributed file system strongly affects cloud storage. Metadata operations account for more than 50% of all file operations, so an efficient metadata management strategy is important. This paper has three parts. We start with a brief introduction to cloud storage. We then propose a metadata caching algorithm for cloud storage and discuss its performance. A prototype incorporating the proposed metadata caching algorithm is implemented on Lustre to evaluate its performance. The experimental results show that the metadata caching subsystem can improve the performance of cloud storage.
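The abstract does not state which caching policy the authors use. As a minimal sketch of why a client-side metadata cache reduces load on the metadata server, the example below assumes a least-recently-used (LRU) policy; the class `MetadataCache`, the `fetch_from_server` callback, and the capacity value are all hypothetical.

```python
from collections import OrderedDict


class MetadataCache:
    """LRU cache in front of a metadata server (policy is an assumption)."""

    def __init__(self, fetch_from_server, capacity=2):
        self._fetch = fetch_from_server  # called only on a cache miss
        self._capacity = capacity
        self._entries = OrderedDict()    # oldest entry first
        self.hits = 0
        self.misses = 0

    def stat(self, path):
        if path in self._entries:
            self._entries.move_to_end(path)  # mark as most recently used
            self.hits += 1
            return self._entries[path]
        self.misses += 1
        record = self._fetch(path)
        self._entries[path] = record
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return record


def fake_server(path):
    # Stand-in for a round trip to the metadata server.
    return {"path": path, "size": len(path)}


cache = MetadataCache(fake_server, capacity=2)
for p in ["/a", "/a", "/b", "/c", "/a"]:
    cache.stat(p)
print(cache.hits, cache.misses)
```

A real deployment would also need an invalidation or lease mechanism so that cached attributes do not go stale when another client modifies the file; that is omitted here.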


Apache Hadoop is a free, open-source Java framework from the Apache Software Foundation. It stores large amounts of data efficiently and at low cost. Hadoop has two core components: HDFS (Hadoop Distributed File System) and MapReduce. HDFS is a file system that is highly fault-tolerant and can be deployed on low-cost hardware. It provides high-speed access to the relevant data. The Hadoop architecture is cluster-based and consists of two kinds of nodes, DataNode and NameNode, which perform an internal activity known as a heartbeat to manage data storage on the distributed file system; MapReduce is performed internally to show the clustering of distributed data on the localhost of the SSH server website. Large quantities of data need to be stored in a distributed file structure, and Hadoop plays an important role here: it maintains large-volume storage and replicates data to provide security and recovery of big data for analysis and prediction.


2014 ◽  
Vol 556-562 ◽  
pp. 4009-4013
Author(s):  
Yi Jiang ◽  
Qiang Xiao ◽  
Rong Huang ◽  
An Ping Xiong

With the development of information technology, distributed file systems are widely used for massive information storage. A distributed file system usually uses a metadata server to provide quick access to files by directory, so the organization and management of metadata are key to file system performance. Existing mass storage systems generally manage metadata with directory subtree partitioning or hashing. To address the problems of existing metadata management strategies, such as low metadata access efficiency, ineffective load balancing, and poor extensibility, this paper puts forward a dynamic metadata load-balancing strategy based on hash tags, in which tags act as the partition granularity and hot metadata tags are hashed again to balance load. The experimental results show that the modified metadata management strategy based on hash tags achieves greater system throughput and lower average response time than the one based on tags.
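The abstract gives only the outline of the hash-tag strategy, so the following is a loose sketch under stated assumptions rather than the paper's method: the tag of a file is taken to be its parent directory, a tag is routed to a metadata server by hashing the tag, and once a tag's access count exceeds a threshold its entries are hashed a second time on the full path so they spread across servers. The names `TagRouter`, `stable_hash`, and `hot_threshold` are hypothetical.

```python
import hashlib
from collections import Counter


def stable_hash(text):
    # Python's built-in hash() is salted per process, so use SHA-256 instead.
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


class TagRouter:
    def __init__(self, num_servers, hot_threshold):
        self.num_servers = num_servers
        self.hot_threshold = hot_threshold
        self.access_counts = Counter()  # accesses seen per tag

    @staticmethod
    def tag_of(path):
        # Assumption: the tag is the parent directory of the file.
        return path.rsplit("/", 1)[0] or "/"

    def route(self, path):
        tag = self.tag_of(path)
        self.access_counts[tag] += 1
        if self.access_counts[tag] > self.hot_threshold:
            # Hot tag: hash again on the full path to spread the load.
            return stable_hash(path) % self.num_servers
        # Normal tag: every entry under the tag lands on one server.
        return stable_hash(tag) % self.num_servers


router = TagRouter(num_servers=4, hot_threshold=100)
print({router.route(f"/data/f{i}") for i in range(10)})
```

Switching a tag from cold to hot changes where its entries live, so a real system would also have to migrate or forward the existing metadata; this sketch covers routing only.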


2012 ◽  
Vol 490-495 ◽  
pp. 1034-1038
Author(s):  
Si Ma ◽  
Tao Cai ◽  
Yong Zhao Zhan

Metadata management algorithms play an important role in file system performance, and the file system is the most common way of accessing data in mass storage systems, so the metadata management algorithm is very important to the performance of a mass storage system. In this paper, we analyze current metadata management algorithms and introduce a metadata dynamic hashing algorithm to address their poor adaptability. We implement a prototype on Lustre. After testing system performance with common tools while adjusting several parameters of the metadata dynamic hashing algorithm, we find that the I/O performance of the prototype is superior to that of Lustre. Furthermore, we analyze the impact of the parameters of the dynamic hashing function on the I/O performance of the prototype.


2014 ◽  
Vol 36 (5) ◽  
pp. 1047-1064 ◽  
Author(s):  
Bin LIAO ◽  
Jiong YU ◽  
Tao ZHANG ◽  
Xing-Yao YANG

2010 ◽  
Vol 33 (10) ◽  
pp. 1873-1880 ◽  
Author(s):  
Chun-Cong XU ◽  
Xiao-Meng HUANG ◽  
Nuo WU ◽  
Ning-Wei SUN ◽  
Guang-Wen YANG
