Replica Placement Strategy for Data Grid Environment

2013
Vol 5 (1)
pp. 70-81
Author(s):
Mohammed K. Madi
Yuhanis Yusof
Suhaidi Hassan

A Data Grid is an infrastructure that manages huge amounts of data files and provides intensive computational resources across geographically distributed collaborations. To increase resource availability and to ease resource sharing in such an environment, there is a need for replication services. Data replication is one of the methods used to improve the performance of data access in distributed systems by storing multiple copies of data files at distributed sites. Replica placement is the process of identifying where to place copies of replicated data files in a Grid system. Existing work identifies suitable sites based on the number of requests for and the read cost of the required file. Such approaches consume large amounts of bandwidth and increase computational time. The authors propose a replica placement strategy (RPS) that finds the best locations to store replicas based on four criteria, namely, 1) Read Cost, 2) File Transfer Time, 3) Sites' Workload, and 4) Replication Sites. OptorSim is used to evaluate the performance of this replica placement strategy. The simulation results show that RPS requires less execution time and consumes less network usage than the existing Simple Optimizer and LFU (Least Frequently Used) approaches.
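The abstract does not spell out how the four criteria are combined, so the following is only a minimal sketch in Python (not the authors' implementation) of a placement decision driven by them; the weights, the linear scoring formula, and the reading of the "Replication Sites" criterion as "skip sites that already hold a copy" are assumptions for illustration only.

# Hypothetical sketch: rank candidate sites for a new replica using the four
# criteria named in the abstract (read cost, file transfer time, site workload,
# replication sites). Weights and formula are illustrative, not from the paper.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    read_cost: float      # aggregate cost of serving reads of the file from this site
    transfer_time: float  # estimated time to copy the file to this site (seconds)
    workload: float       # current workload of the site (e.g. normalised queue length)
    holds_replica: bool   # "Replication Sites": sites already holding a copy are skipped

def placement_score(site: Site, w_read: float = 0.4, w_xfer: float = 0.3, w_load: float = 0.3) -> float:
    """Lower score = better placement candidate (all three criteria are treated as costs)."""
    return w_read * site.read_cost + w_xfer * site.transfer_time + w_load * site.workload

def choose_placement(sites: list[Site]) -> Site | None:
    candidates = [s for s in sites if not s.holds_replica]
    return min(candidates, key=placement_score) if candidates else None

if __name__ == "__main__":
    grid = [
        Site("site_A", read_cost=12.0, transfer_time=4.0, workload=0.7, holds_replica=False),
        Site("site_B", read_cost=8.0, transfer_time=9.0, workload=0.2, holds_replica=False),
        Site("site_C", read_cost=5.0, transfer_time=2.0, workload=0.4, holds_replica=True),
    ]
    best = choose_placement(grid)
    print(f"Place the new replica at {best.name}")

In practice the criteria would have to be normalised to comparable scales before weighting; the sketch only shows the shape of the decision, not the paper's actual cost model.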

Author(s):  
Ghalem Belalem

Data grids have become an interesting and popular domain in the grid community (Foster and Kesselman, 2004). Generally, grids are proposed as solutions for large-scale systems, where data replication is a well-known technique used to reduce access latency and bandwidth consumption, and to increase availability. In spite of the advantages of replication, there are many problems that should be solved, such as:
• the replica placement problem, which determines the optimal locations of replicated data in order to reduce the storage cost and the data access cost (Xu et al., 2002);
• the replica selection problem, which determines which replica will be accessed when a read or write operation needs to be executed (Ranganathan and Foster, 2001);
• the degree-of-replication problem, which consists in finding a minimal number of replicas without reducing the performance of user applications;
• the replica consistency problem, which concerns the consistency of a set of replicated data; consistency provides the user with a completely coherent view of all the replicas (Gray et al., 1996).
Our principal aim in this article is to integrate into the consistency management service an approach based on an economic model for resolving conflicts detected in the data grid.


2014
Vol 2014
pp. 1-10
Author(s):
Priyanka Vashisht
Rajesh Kumar
Anju Sharma

In data grids, scientific and business applications produce huge volumes of data that need to be transferred among the distributed and heterogeneous nodes of the grid. Data replication provides a solution for managing data files efficiently in large grids, enhancing data availability and thereby reducing the overall file access time. In this paper an algorithm, namely EDRA, which uses agents in the data grid, is proposed and implemented. EDRA combines dynamic replication with a hierarchical structure that is taken into account for the selection of the best replica. The decision for selecting the best replica is based on scheduling parameters: bandwidth, load gauge, and the computing capacity of the node. Scheduling in the data grid helps reduce data access time, and the load is distributed evenly across the nodes of the grid by considering these scheduling parameters. EDRA is implemented using the data grid simulator OptorSim, with the European Data Grid CMS testbed topology. The simulation results compare BHR, LRU, No Replication, and EDRA, and show the efficiency of the EDRA algorithm in terms of mean job execution time, network usage, and node storage usage.
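The abstract names the scheduling parameters but not the selection rule, so the following is only a hypothetical Python sketch of how a "best replica" might be chosen from bandwidth, load gauge, and computing capacity; the multiplicative score is an assumption, not EDRA's actual agent logic.

# Hypothetical sketch: pick the replica holder with the best combination of the
# three scheduling parameters named in the abstract. The scoring rule is assumed.
from dataclasses import dataclass

@dataclass
class ReplicaHolder:
    node: str
    bandwidth_mbps: float      # available bandwidth between this node and the requester
    load: float                # load gauge in [0, 1]; higher means busier
    computing_capacity: float  # relative computing capacity of the node

def replica_score(r: ReplicaHolder) -> float:
    """Higher score = preferred replica: favour high bandwidth and capacity, low load."""
    return r.bandwidth_mbps * r.computing_capacity * (1.0 - r.load)

def select_best_replica(holders: list[ReplicaHolder]) -> ReplicaHolder:
    return max(holders, key=replica_score)

if __name__ == "__main__":
    holders = [
        ReplicaHolder("node_1", bandwidth_mbps=100.0, load=0.8, computing_capacity=1.0),
        ReplicaHolder("node_2", bandwidth_mbps=60.0, load=0.1, computing_capacity=0.9),
    ]
    print("Fetch replica from", select_best_replica(holders).node)

A real agent-based scheduler would refresh these measurements continuously; the sketch only shows how the three parameters could be folded into a single ranking.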


Author(s):  
H. O. Colijn

Many labs today wish to transfer data between their EDS systems and their existing PCs and minicomputers. Our lab has implemented SpectraPlot, a low-cost PC-based system that allows offline examination and plotting of spectra. We adopted this system in order to make more efficient use of our microscopes and EDS consoles, to provide hardcopy output for an older EDS system, and to allow students to access their data after leaving the university. As shown in Fig. 1, we have three EDS systems (one of which is located in another building) which can store data on 8-inch RT-11 floppy disks. We transfer data from these systems to a DEC MINC computer using “SneakerNet”, which consists of putting on a pair of sneakers and running down the hall. We then use the Kermit file transfer program to download the data files, with error checking, from the MINC to the PC.


2014
Vol 42 (2/3)
pp. 120-124
Author(s):
Lijun Zeng
Xiaoxia Yao
Juanjuan Liu
Qiang Zhu

Purpose – The purpose of this paper is to provide a detailed overview of the China Academic Library and Information System (CALIS) document supply service platform (CDSSP) – its historical development, network structure and future development plans – and discuss how its members make use of and benefit from its various components. Design/methodology/approach – The authors provide a first-person account based on their professional positions at the CALIS Administrative Center. Findings – CDSSP comprises five application systems: a unified authentication system, a SaaS-based interlibrary loan (ILL) and document delivery (DD) service system, an ILL central scheduling and settlement system, a File Transfer Protocol (FTP) service system, and a service integration interface system. These systems work together to meet the needs of member libraries, other information service institutions, and their end users. CDSSP is widely used by more than 1,100 libraries based on a cloud service strategy. Each year more than 100,000 ILL and DD transactions are processed by this platform. Originality/value – The development of CDSSP makes it possible for CALIS to provide a one-stop information retrieval and supply service. At the same time, it greatly promotes resource sharing among member libraries.


Author(s):  
Sandro Fiore
Alessandro Negro
Salvatore Vadacca
Massimo Cafaro
Giovanni Aloisio
...  

Grid computing is an emerging and enabling technology that allows organizations to easily share, integrate and manage resources in a distributed environment. A Computational Grid allows running millions of jobs in parallel, but the huge amount of generated data raises another interesting problem: the management (classification, storage, discovery, etc.) of distributed data, i.e., a Data Grid-specific issue. In the last decade, many efforts concerning the management of data (grid storage services, metadata services, grid database access and integration services, etc.) have identified data management as a real challenge for next-generation petascale grid environments. This work provides an architectural overview of the GRelC DAS, a grid database access service developed in the context of the GRelC Project and currently used for production/tutorial activities in both gLite- and Globus-based grid environments.

