Metadata Indexing Sub-System for Distributed File System

The efficiency of metadata indexing is important to the performance of a distributed file system. The time and space costs of current metadata management algorithms are unstable. In this paper, we use a B-tree to index the metadata of a distributed file system. Lustre is an open-source distributed file system in which a hash function is used to manage the metadata. We implement a prototype of the metadata indexing sub-system on Lustre and use Iozone to test the I/O performance of Lustre with and without the sub-system. The simulation results show that Lustre with the metadata indexing sub-system has higher adaptability than Lustre with the hash-based metadata management algorithm.
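As a rough illustration of why an ordered index can behave differently from a hash table, the sketch below keeps file paths in sorted order and answers both exact lookups and directory-prefix scans. It is not the paper's implementation: a sorted list searched with Python's `bisect` module stands in for a B-tree, and the class name `SortedMetadataIndex`, its methods, and the sample paths are all hypothetical.

```python
import bisect


class SortedMetadataIndex:
    """Ordered path index; a sorted list is used as a stand-in for a B-tree."""

    def __init__(self):
        self._keys = []     # file paths, kept in sorted order
        self._records = []  # metadata records, parallel to _keys

    def insert(self, path, record):
        i = bisect.bisect_left(self._keys, path)
        if i < len(self._keys) and self._keys[i] == path:
            self._records[i] = record  # path already indexed: overwrite
        else:
            self._keys.insert(i, path)
            self._records.insert(i, record)

    def lookup(self, path):
        # Binary search, analogous to descending a B-tree to a leaf.
        i = bisect.bisect_left(self._keys, path)
        if i < len(self._keys) and self._keys[i] == path:
            return self._records[i]
        return None

    def scan_prefix(self, prefix):
        # Ordered keys make "everything under this directory" a contiguous
        # range, which a plain hash table cannot provide without a full scan.
        start = bisect.bisect_left(self._keys, prefix)
        matches = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            matches.append(key)
        return matches


index = SortedMetadataIndex()
index.insert("/home/a/x.txt", {"size": 1})
index.insert("/home/b/y.txt", {"size": 2})
index.insert("/home/a/w.txt", {"size": 3})
print(index.lookup("/home/a/x.txt"))
print(index.scan_prefix("/home/a/"))
```

A real B-tree additionally bounds the cost of inserts by splitting nodes, whereas inserting into a Python list shifts elements; the sketch only illustrates the ordered-lookup behaviour.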


A Zones-Based Metadata Management Method for Distributed File System

Trustworthy Computing and Services - Communications in Computer and Information Science, 2014, pp. 169-175
DOI: 10.1007/978-3-662-43908-1_22
Cited By ~ 1

Author(s): Xiaowei Xie, Yu Yang, Yueming Lu

Keyword(s): File System, Distributed File System, Metadata Management, Management Method


The Multi-Level Metadata Indexing in Mass Storage System

Advanced Materials Research, 2012, Vol 532-533, pp. 818-822
DOI: 10.4028/www.scientific.net/amr.532-533.818

Author(s): De Jiao Niu, Yong Zhao Zhan, Tao Cai

Keyword(s): Hash Function, Storage System, Bloom Filter, Metadata Management, Mass Storage, Mass Storage System, Multi Level, Temporal And Spatial, Query Algorithm, Management Algorithms

Metadata query plays an important role in mass storage systems. An efficient indexing algorithm can reduce the time and space costs that largely determine the efficiency of a mass storage system. Typically, the time and space consumption of existing metadata management algorithms is large and volatile. In this paper, a novel metadata indexing algorithm is presented. The metadata query algorithm is based on a two-level indexing strategy. Metadata is classified into two categories: active metadata and non-active metadata. A Bloom filter is used to generate a binary string for active metadata, and a B-tree is used to index each active partition, while a suitable hash function is selected for each non-active metadata partition. The results show that the multi-level metadata indexing algorithm can reduce the time and space costs of metadata queries.
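One possible reading of the two-level strategy is sketched below: a Bloom filter gives a quick "possibly active" test, active entries live in an ordered index, and everything else falls back to a hash table. This is an interpretation of the abstract, not the authors' algorithm. The names `BloomFilter` and `TwoLevelIndex`, the filter size, the number of hash functions, and the use of a sorted list in place of a B-tree are all assumptions.

```python
import bisect
import hashlib


class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit string."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class TwoLevelIndex:
    """Active metadata: Bloom filter + ordered index. Non-active: hash table."""

    def __init__(self):
        self.active_filter = BloomFilter()
        self.active_keys = []     # sorted list standing in for a B-tree
        self.active_records = []
        self.inactive = {}        # dict standing in for a hashed partition

    def add_active(self, key, record):
        self.active_filter.add(key)
        i = bisect.bisect_left(self.active_keys, key)
        if i < len(self.active_keys) and self.active_keys[i] == key:
            self.active_records[i] = record
        else:
            self.active_keys.insert(i, key)
            self.active_records.insert(i, record)

    def add_inactive(self, key, record):
        self.inactive[key] = record

    def query(self, key):
        if self.active_filter.might_contain(key):
            i = bisect.bisect_left(self.active_keys, key)
            if i < len(self.active_keys) and self.active_keys[i] == key:
                return self.active_records[i]
        # Filter said "absent", or it was a false positive: try the hash level.
        return self.inactive.get(key)


index = TwoLevelIndex()
index.add_active("/hot/a", {"size": 10})
index.add_inactive("/cold/b", {"size": 20})
print(index.query("/hot/a"), index.query("/cold/b"), index.query("/none"))
```

The fallback after a failed ordered lookup matters because a Bloom filter can report false positives, so a "possibly active" answer cannot be taken as final.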


Research on Metadata Management Scheme of Distributed File System

2015 International Conference on Computer Science and Applications (CSA), 2015
DOI: 10.1109/csa.2015.25

Author(s): Lin Huo, Ran Yi

Keyword(s): File System, Distributed File System, Metadata Management, Management Scheme


Metadata Caching Subsystem for Cloud Storage

Applied Mechanics and Materials, 2012, Vol 214, pp. 584-590
DOI: 10.4028/www.scientific.net/amm.214.584
Cited By ~ 1

Author(s): De Jiao Niu, Tao Cai, Yong Zhao Zhan, Shi Guang Ju

Keyword(s): Cloud Storage, File System, Experimental Results, Distributed File System, Metadata Management, Metadata Cache

Cloud storage is a hot topic in current research. Unlike previous work, we emphasize the importance of the metadata cache in the study of cloud storage, because the efficiency of the distributed file system strongly affects cloud storage. Metadata operations account for more than 50% of all file operations, so an efficient metadata management strategy is important. This paper has three parts. We start with a brief introduction to cloud storage. Then a metadata caching algorithm for cloud storage is proposed, and an additional discussion of its performance is provided. A prototype that incorporates the proposed metadata caching algorithm is implemented on Lustre to evaluate its performance. A comparison of the experimental results shows that the metadata caching subsystem can improve the performance of cloud storage.


A Distribution of Nodes in Big Data using Hadoop Open Source System

International Journal of Innovative Technology and Exploring Engineering - Special Issue, 2020, Vol 9 (3), pp. 106-110
DOI: 10.35940/ijitee.c8459.019320

Keyword(s): Big Data, Open Source, Data Storage, High Speed, File System, Fault Tolerant, Heart Beat, Distributed File System, Process Data, Hadoop Distributed File System

Apache Hadoop is a free, open-source Java framework under the Apache Software Foundation. It stores large amounts of data efficiently and at low cost. Hadoop has two core components: HDFS (Hadoop Distributed File System) and MapReduce. HDFS is a file system that is highly fault-tolerant and can be deployed on low-cost hardware. It provides high-speed access to the relevant data. The Hadoop architecture is based on a cluster consisting of two kinds of nodes, the DataNode and the NameNode, which perform an internal activity known as the heartbeat to manage data storage on the distributed file system, and MapReduce is performed internally to show the clustering of distributed data on the localhost of the SSH server website. Large quantities of data need to be stored in a distributed file structure, and Hadoop plays an important role here: maintaining large-volume storage, duplicating data for security and recovery, and supporting the analysis of and prediction from big data.
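The heartbeat idea can be illustrated with a small simulation in which a coordinator records the last time each storage node reported in and treats nodes that stay silent past a timeout as dead. This is a generic sketch and does not use or reproduce the Hadoop API: the class `HeartbeatMonitor`, the node names, and the 10-second timeout are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class HeartbeatMonitor:
    """Hypothetical NameNode-side bookkeeping for DataNode heartbeats."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node id -> time of most recent heartbeat

    def record_heartbeat(self, node_id, now):
        self.last_seen[node_id] = now

    def live_nodes(self, now):
        # A node is live if its last heartbeat is within the timeout window.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )

    def dead_nodes(self, now):
        # Silent for longer than the timeout: candidates for re-replication.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=10)
monitor.record_heartbeat("datanode-1", now=0)
monitor.record_heartbeat("datanode-2", now=0)
monitor.record_heartbeat("datanode-1", now=8)  # datanode-2 stays silent
print(monitor.live_nodes(now=15))  # ['datanode-1']
print(monitor.dead_nodes(now=15))  # ['datanode-2']
```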


The Metadata Dynamic Load-Balancing Strategy of Distributed Filesystem Based on Hash Tags

Applied Mechanics and Materials, 2014, Vol 556-562, pp. 4009-4013
DOI: 10.4028/www.scientific.net/amm.556-562.4009

Author(s): Yi Jiang, Qiang Xiao, Rong Huang, An Ping Xiong

Keyword(s): Dynamic Load, Load Balance, Management Strategy, File System, Storage System, Information Storage, Distributed File System, System Throughput, Metadata Management, Mass Storage

With the development of information technology, distributed file systems are widely used for massive information storage. A distributed file system usually uses a metadata server to provide quick access to files by directory, so the organization and management of metadata are key to file system performance. In general, existing mass storage systems use directory subtree partitioning and hash algorithms to manage metadata. To address the problems of existing metadata management strategies for distributed file systems, such as low metadata access efficiency, ineffective load balancing, and poor extensibility, a dynamic metadata load-balancing strategy based on hash tags is put forward, in which tags act as the partition granularity and hot metadata tags are hashed again to balance the load. The experimental results show that the modified metadata management strategy based on hash tags achieves greater system throughput and lower average response time than the one based on tags.
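The abstract does not say how hot tags are detected or rehashed, so the following is only one plausible sketch of the idea. Each tag normally maps to a single metadata server; once a tag's access count reaches a threshold, lookups for that tag are hashed again on the tag and path together, spreading its entries across servers. The class `TagPartitioner`, the threshold, the server count, and the choice of SHA-256 are all assumptions.

```python
import hashlib


def stable_hash(text):
    """Deterministic hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")


class TagPartitioner:
    """Hypothetical tag-granularity placement with rehashing of hot tags."""

    def __init__(self, num_servers, hot_threshold):
        self.num_servers = num_servers
        self.hot_threshold = hot_threshold
        self.access_counts = {}
        self.hot_tags = set()

    def record_access(self, tag):
        count = self.access_counts.get(tag, 0) + 1
        self.access_counts[tag] = count
        if count >= self.hot_threshold:
            # A real system would also migrate the tag's existing metadata.
            self.hot_tags.add(tag)

    def server_for(self, tag, path):
        if tag in self.hot_tags:
            # Second-level hash: spread a hot tag's entries over all servers.
            return stable_hash(f"{tag}/{path}") % self.num_servers
        # Cold tag: the whole tag lives on one server.
        return stable_hash(tag) % self.num_servers


partitioner = TagPartitioner(num_servers=4, hot_threshold=3)
paths = [f"file{i}" for i in range(50)]

cold_servers = {partitioner.server_for("logs", p) for p in paths}
for _ in range(3):
    partitioner.record_access("logs")
hot_servers = {partitioner.server_for("logs", p) for p in paths}

print(len(cold_servers), len(hot_servers))
```

Before the threshold is reached, all 50 paths resolve to one server; afterwards they spread over several, which is the load-balancing effect the abstract describes.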


Implementation and Analysis of the File System Based on Metadata Dynamic Hashing

Advanced Materials Research, 2012, Vol 490-495, pp. 1034-1038
DOI: 10.4028/www.scientific.net/amr.490-495.1034

Author(s): Si Ma, Tao Cai, Yong Zhao Zhan

Keyword(s): System Performance, File System, Storage System, Metadata Management, Mass Storage, Management Algorithm, Mass Storage System, Hashing Function, Hashing Algorithm, Dynamic Hashing

Metadata management algorithms play an important role in file system performance, and the file system is the most common way of accessing data in mass storage systems, so the metadata management algorithm is very important to the performance of a mass storage system. In this paper, we analyze current metadata management algorithms and introduce a metadata dynamic hashing algorithm to address their poor adaptability. We implement a prototype on Lustre. After testing system performance with common tools while adjusting several parameters of the metadata dynamic hashing algorithm, we find that the I/O performance of the prototype is superior to that of Lustre. Furthermore, we analyze the impact of the dynamic hashing function's parameters on the I/O performance of the prototype.
