Combination of data replication and scheduling algorithm for improving data availability in Data Grids

2013 ◽  
Vol 36 (2) ◽  
pp. 711-722 ◽  
Author(s):  
Najme Mansouri ◽  
Gholam Hosein Dastghaibyfard ◽  
Ehsan Mansouri
2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Priyanka Vashisht ◽  
Rajesh Kumar ◽  
Anju Sharma

In data grids, scientific and business applications produce huge volumes of data that must be transferred among distributed and heterogeneous nodes. Data replication provides a solution for managing data files efficiently in large grids: it enhances data availability, which reduces the overall access time of a file. In this paper, an agent-based algorithm for data grids, namely EDRA, is proposed and implemented. EDRA performs dynamic replication and takes the hierarchical structure into account when selecting the best replica. The decision is based on scheduling parameters: the bandwidth, load gauge, and computing capacity of the node. Scheduling in a data grid helps reduce data access time, and considering the scheduling parameters distributes the load evenly across the nodes. EDRA is implemented using the data grid simulator OptorSim, with the European Data Grid CMS test bed topology. The simulation compares EDRA with BHR, LRU, and No Replication. The results show the efficiency of EDRA in terms of mean job execution time, network usage, and storage usage of nodes.
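The abstract names three scheduling parameters (bandwidth, load gauge, computing capacity) but does not state how they are combined. The following is a minimal illustrative sketch, not EDRA itself: it assumes a simple weighted sum over normalized values. The `ReplicaSite` type, the `select_best_replica` function, the weights, and the convention that a load gauge of 1.0 means a fully loaded node are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ReplicaSite:
    """A node holding a replica (hypothetical type, not from the paper)."""
    name: str
    bandwidth_mbps: float  # bandwidth between this site and the requester
    load: float            # assumed load gauge in [0, 1]; 1.0 = fully loaded
    capacity: float        # relative computing capacity; higher is better


def select_best_replica(
    sites: Sequence[ReplicaSite],
    w_bw: float = 0.4,     # assumed weights; the abstract gives none
    w_load: float = 0.3,
    w_cap: float = 0.3,
) -> ReplicaSite:
    """Return the site with the highest weighted score."""
    if not sites:
        raise ValueError("no replica sites to choose from")

    # Normalize bandwidth and capacity by the best value among candidates.
    max_bw = max(s.bandwidth_mbps for s in sites) or 1.0
    max_cap = max(s.capacity for s in sites) or 1.0

    def score(s: ReplicaSite) -> float:
        return (
            w_bw * s.bandwidth_mbps / max_bw
            + w_load * (1.0 - s.load)   # lightly loaded nodes score higher
            + w_cap * s.capacity / max_cap
        )

    return max(sites, key=score)


candidates = [
    ReplicaSite("A", bandwidth_mbps=100.0, load=0.9, capacity=1.0),
    ReplicaSite("B", bandwidth_mbps=80.0, load=0.2, capacity=1.0),
    ReplicaSite("C", bandwidth_mbps=10.0, load=0.0, capacity=0.5),
]
best = select_best_replica(candidates)
print(best.name)
```

With these assumed weights, site B is chosen over site A despite A's higher bandwidth, because A is heavily loaded. This illustrates how weighing load alongside bandwidth can spread requests across nodes.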


Author(s):  
Mohammad Shorfuzzaman ◽  
Rasit Eskicioglu ◽  
Peter Graham

Data Grids provide services and infrastructure for distributed data-intensive applications that need to access, transfer, and modify massive datasets stored at distributed locations around the world. For example, the next generation of scientific applications, such as many in high-energy physics, molecular modeling, and earth sciences, will involve large collections of data created from simulations or experiments. The size of these data collections is expected to be of multi-terabyte or even petabyte scale in many applications. Efficient, reliable, secure, and fast access to such large data is hindered by the high latencies of the Internet. Managing and accessing multiple petabytes of data in Grid environments, while ensuring data availability and access optimization, are challenges that must be addressed. To improve data access efficiency, data can be replicated at multiple locations so that a user can access the data from a site near where it will be processed. In addition to reducing data access time, replication in Data Grids also uses network and storage resources more efficiently. In this chapter, the state of current research on data replication and the challenges arising for the new generation of data-intensive grid environments are reviewed, and open problems are identified. First, fundamental data replication strategies are reviewed that offer high data availability, low bandwidth consumption, increased fault tolerance, and improved scalability of the overall system. Then, specific algorithms for selecting appropriate replicas and maintaining replica consistency are discussed. The impact of data replication on job scheduling performance in Data Grids is also analyzed. A set of appropriate metrics, including access latency, bandwidth savings, server load, and storage overhead, for use in making critical comparisons of various data replication techniques is also discussed.
Overall, this chapter provides a comprehensive study of replication techniques in Data Grids that not only serves as a tool for understanding this evolving research area but also provides a reference to which future efforts may be mapped.


2001 ◽  
Vol 02 (03) ◽  
pp. 317-329 ◽  
Author(s):  
MUSTAFA MAT DERIS ◽  
ALI MAMAT ◽  
PUA CHAI SENG ◽  
MOHD YAZID SAMAN

This article addresses the performance of a data replication protocol in terms of data availability and communication costs. Specifically, we present a new protocol, called the Three Dimensional Grid Structure (TDGS) protocol, to manage data replication in distributed systems. The protocol provides high availability for read and write operations with limited fault tolerance at low communication cost. With the TDGS protocol, a read operation is limited to two data copies, while a write operation requires a minimal number of copies. In comparison to other protocols, TDGS requires lower communication cost for an operation while providing higher data availability.


2014 ◽  
Vol 14 (2) ◽  
pp. 24-30 ◽  
Author(s):  
Gundala Swathi ◽  
R. Saravanan

In recent years, synchronization has become a major issue for secure transmission in mobile ad hoc networks. When an attacker modifies the time synchronization algorithm, the nodes will have faulty estimates of other nodes' locations, leading to chaos. While transmitting under these adverse conditions, packets might be lost or sent to wrong locations. Data replication and data diffusion are two methods used to solve the problem of data availability. In this paper we propose an algorithm for secure multi-hop transmission in the presence of external attacks.


2012 ◽  
Vol 28 (7) ◽  
pp. 1045-1057 ◽  
Author(s):  
Ming-Chang Lee ◽  
Fang-Yie Leu ◽  
Ying-ping Chen

2018 ◽  
Vol 113 ◽  
pp. 115-126 ◽  
Author(s):  
Leila Azari ◽  
Amir Masoud Rahmani ◽  
Helder A. Daniel ◽  
Nooruldeen Nasih Qader