Minimizing data access latency in data grids by neighborhood-based data replication and job scheduling

2010 ◽

pp. 1041-1056

Author(s):

Ghalem Belalem

Keyword(s):

Large Scale ◽

Data Access ◽

Data Grid ◽

Management Service ◽

Replica Placement ◽

Data Grids ◽

Large Scale Systems ◽

Consistency Management ◽

Access Latency ◽

Replicated Data

Data grids have become an interesting and popular domain in grid community (Foster and Kesselmann, 2004). Generally, the grids are proposed as solutions for large scale systems, where data replication is a well-known technique used to reduce access latency and bandwidth, and increase availability. In splitting of the advantages of replication, there are many problems that should be solved such as, • The replica placement that determines the optimal locations of replicated data in order to reduce the storage cost and data access (Xu et al, 2002); • The problem of determining which replica will be accessed to in terms of consistency when we need to execute a read or write operation (Ranganathan and Foster, 2001); • The problem of degree of replication which consists in finding a minimal number of replicas without reducing the performance of user applications; • The problem of replica consistency that concerns the consistency of a set of replicated data. This consistency provides a completely coherent view of all the replicas for a user (Gray et al 1996). Our principal aim, in this article, is to integrate into consistency management service, an approach based on an economic model for resolving conflicts detected in the data grid.

Download Full-text

Hybrid L2 NUCA Design and Management Considering Data Access Latency, Energy Efficiency, and Storage Lifetime

IEEE Transactions on Very Large Scale Integration (VLSI) Systems ◽

10.1109/tvlsi.2016.2540069 ◽

2016 ◽

Vol 24 (10) ◽

pp. 3118-3131

Author(s):

Seunghan Lee ◽

Kyungsu Kang ◽

Jongpil Jung ◽

Chong-Min Kyung

Keyword(s):

Energy Efficiency ◽

Data Access ◽

Access Latency ◽

And Storage ◽

Data Access Latency

Download Full-text

Job scheduling and data replication on data grids

Future Generation Computer Systems ◽

10.1016/j.future.2007.02.008 ◽

2007 ◽

Vol 23 (7) ◽

pp. 846-860 ◽

Cited By ~ 76

Author(s):

Ruay-Shiung Chang ◽

Jih-Sheng Chang ◽

Shin-Yi Lin

Keyword(s):

Job Scheduling ◽

Data Replication ◽

Data Grids

Download Full-text

A Control-Plane Perspective on Reducing Data Access Latency in LTE Networks

Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking - MobiCom '17 ◽

10.1145/3117811.3117838 ◽

2017 ◽

Cited By ~ 16

Author(s):

Yuanjie Li ◽

Zengwen Yuan ◽

Chunyi Peng

Keyword(s):

Data Access ◽

Control Plane ◽

Lte Networks ◽

Access Latency ◽

Data Access Latency

Download Full-text

Leveraging on Deep Memory Hierarchies to Minimize Energy Consumption and Data Access Latency on Single-Chip Cloud Computers

IEEE Transactions on Sustainable Computing ◽

10.1109/tsusc.2017.2706620 ◽

2017 ◽

Vol 2 (2) ◽

pp. 154-166 ◽

Cited By ~ 4

Author(s):

Tahir Maqsood ◽

Nikos Tziritas ◽

Thanasis Loukopoulos ◽

Sajjad A. Madani ◽

Samee U. Khan ◽

...

Keyword(s):

Energy Consumption ◽

Data Access ◽

Single Chip ◽

Memory Hierarchies ◽

Access Latency ◽

Data Access Latency

Download Full-text

Performance Analysis of Latency-Aware Data Management in Industrial IoT Networks

Sensors ◽

10.3390/s18082611 ◽

2018 ◽

Vol 18 (8) ◽

pp. 2611 ◽

Cited By ~ 12

Author(s):

Theofanis Raptis ◽

Andrea Passarella ◽

Marco Conti

Keyword(s):

Data Management ◽

Ieee 802.15.4 ◽

Data Access ◽

Distributed Data ◽

Access Latency ◽

Critical Data ◽

Industrial Iot ◽

Important Challenge ◽

The Given ◽

Data Access Latency

Maintaining critical data access latency requirements is an important challenge of Industry 4.0. The traditional, centralized industrial networks, which transfer the data to a central network controller prior to delivery, might be incapable of meeting such strict requirements. In this paper, we exploit distributed data management to overcome this issue. Given a set of data, the set of consumer nodes and the maximum access latency that consumers can tolerate, we consider a method for identifying and selecting a limited set of proxies in the network where data needed by the consumer nodes can be cached. The method targets at balancing two requirements; data access latency within the given constraints and low numbers of selected proxies. We implement the method and evaluate its performance using a network of WSN430 IEEE 802.15.4-enabled open nodes. Additionally, we validate a simulation model and use it for performance evaluation in larger scales and more general topologies. We demonstrate that the proposed method (i) guarantees average access latency below the given threshold and (ii) outperforms traditional centralized and even distributed approaches.

Download Full-text

Efficient Dynamic Replication Algorithm Using Agent for Data Grid

The Scientific World JOURNAL ◽

10.1155/2014/767016 ◽

2014 ◽

Vol 2014 ◽

pp. 1-10 ◽

Cited By ~ 7

Author(s):

Priyanka Vashisht ◽

Rajesh Kumar ◽

Anju Sharma

Keyword(s):

Data Replication ◽

Data Access ◽

Data Grid ◽

Data Availability ◽

Access Time ◽

Test Bed ◽

Data Grids ◽

Dynamic Replication ◽

Data Files ◽

Using Data

In data grids scientific and business applications produce huge volume of data which needs to be transferred among the distributed and heterogeneous nodes of data grids. Data replication provides a solution for managing data files efficiently in large grids. The data replication helps in enhancing the data availability which reduces the overall access time of the file. In this paper an algorithm, namely, EDRA using agents for data grid, has been proposed and implemented. EDRA consists of dynamic replication of hierarchical structure taken into account for the selection of best replica. Decision for selecting the best replica is based on scheduling parameters. The scheduling parameters are bandwidth, load gauge, and computing capacity of the node. The scheduling in data grid helps in reducing the data access time. The distribution of the load on the nodes of data grid is done evenly by considering scheduling parameters. EDRA is implemented using data grid simulator, namely, OptorSim. European Data Grid CMS test bed topology is used in this experiment. The simulation results are obtained by comparing BHR, LRU, No Replication, and EDRA. The result shows the efficiency of EDRA algorithm in terms of mean job execution time, network usage, and storage usage of node.

Download Full-text

Reducing data access latency in SDSM systems using runtime optimizations

10.1145/1923947.1923965 ◽

2010 ◽

Author(s):

Javier Bueno ◽

Xavier Martorell ◽

Juan José Costa ◽

Toni Cortés ◽

Eduard Ayguadé ◽

...

Keyword(s):

Data Access ◽

Access Latency ◽

Data Access Latency

Download Full-text

The State of the Art and Open Problems in Data Replication in Grid Environments

Handbook of Research on Scalable Computing Technologies ◽

10.4018/978-1-60566-661-7.ch022 ◽

2010 ◽

pp. 486-516 ◽

Cited By ~ 2

Author(s):

Mohammad Shorfuzzaman ◽

Rasit Eskicioglu ◽

Peter Graham

Keyword(s):

Data Replication ◽

Data Access ◽

The State ◽

Data Availability ◽

Data Grids ◽

Open Problems ◽

Data Intensive ◽

Grid Environments ◽

Bandwidth Savings ◽

And Storage

Data Grids provide services and infrastructure for distributed data-intensive applications that need to access, transfer and modify massive datasets stored at distributed locations around the world. For example, the next-generation of scientific applications such as many in high-energy physics, molecular modeling, and earth sciences will involve large collections of data created from simulations or experiments. The size of these data collections is expected to be of multi-terabyte or even petabyte scale in many applications. Ensuring efficient, reliable, secure and fast access to such large data is hindered by the high latencies of the Internet. The need to manage and access multiple petabytes of data in Grid environments, as well as to ensure data availability and access optimization are challenges that must be addressed. To improve data access efficiency, data can be replicated at multiple locations so that a user can access the data from a site near where it will be processed. In addition to the reduction of data access time, replication in Data Grids also uses network and storage resources more efficiently. In this chapter, the state of current research on data replication and arising challenges for the new generation of data-intensive grid environments are reviewed and open problems are identified. First, fundamental data replication strategies are reviewed which offer high data availability, low bandwidth consumption, increased fault tolerance, and improved scalability of the overall system. Then, specific algorithms for selecting appropriate replicas and maintaining replica consistency are discussed. The impact of data replication on job scheduling performance in Data Grids is also analyzed. A set of appropriate metrics including access latency, bandwidth savings, server load, and storage overhead for use in making critical comparisons of various data replication techniques is also discussed. Overall, this chapter provides a comprehensive study of replication techniques in Data Grids that not only serves as a tool to understanding this evolving research area but also provides a reference to which future e orts may be mapped.

Download Full-text