Predictive File Replication on the Data Grids

Author(s):  
ChenHan Liao ◽  
Na Helian ◽  
Sining Wu ◽  
Mamunur M. Rashid

Most replication methods either monitor the popularity of files or use complicated cost functions to decide whether a replication or deletion decision should be issued. However, by the time a replication decision is issued, the popularity of the files has changed and may already have affected access latency and resource usage. This article proposes a decision-tree-based predictive file replication strategy that forecasts files’ future popularity based on their characteristics on the Grids. In the OptorSim simulation environment, the proposed strategy outperformed two other replication strategies, LRU and Economic, in terms of mean job time and effective network usage.
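
For orientation, the LRU baseline named in the abstract can be sketched as below. This is a minimal illustration of least-recently-used replica eviction in general, not the article's or OptorSim's implementation; the class name `LRUReplicaStore`, the `access` method, and the file sizes are all hypothetical.

```python
from collections import OrderedDict


class LRUReplicaStore:
    """Toy storage element that evicts the least recently used replica."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self.replicas = OrderedDict()  # file name -> size in MB, oldest first

    def access(self, name, size_mb):
        """Return True on a local hit; otherwise try to replicate and return False."""
        if name in self.replicas:
            self.replicas.move_to_end(name)  # mark as most recently used
            return True
        if size_mb > self.capacity_mb:
            return False  # file can never fit, so it is only read remotely
        # Evict from the least recently used end until the new replica fits.
        while self.used_mb + size_mb > self.capacity_mb:
            _, evicted_size = self.replicas.popitem(last=False)
            self.used_mb -= evicted_size
        self.replicas[name] = size_mb
        self.used_mb += size_mb
        return False


# Hypothetical access trace: five requests for 100 MB files on a 200 MB store.
store = LRUReplicaStore(capacity_mb=200)
hits = [store.access(name, 100) for name in ["a", "b", "a", "c", "b"]]
```

A strategy like this reacts only after a file has been requested, which is the limitation the predictive approach described above is meant to address.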

2010 ◽  
Vol 2 (1) ◽  
pp. 69-86 ◽  


Author(s):  
Supriya Raheja

Background: Extending CPU schedulers with fuzzy logic has been shown to improve them because of its unique capability of handling imprecise information. However, other generalized forms of fuzzy sets can be used to further improve the performance of the scheduler. Objectives: This paper introduces a novel approach to designing an intuitionistic fuzzy inference system for a CPU scheduler. Methods: The proposed inference system is implemented with a priority scheduler. The proposed scheduler can dynamically handle the impreciseness of both priority and estimated execution time. It also makes the system adaptive through continuous feedback, and it schedules tasks according to dynamically generated priorities. To demonstrate the performance of the proposed scheduler, a simulation environment was implemented and the scheduler was compared with three baseline schedulers (a conventional priority scheduler, a fuzzy-based priority scheduler, and a vague-based priority scheduler). Results: The proposed scheduler is also compared with the shortest-job-first CPU scheduler, which is known to be an optimized solution among schedulers. Conclusion: Simulation results demonstrate the effectiveness and efficiency of the intuitionistic fuzzy-based priority scheduler. Moreover, its results are optimized, being comparable to those of shortest job first.
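
To make the comparison with shortest job first concrete, the sketch below computes the average waiting time of a plain priority order and of a shortest-job-first order. It is a simplified illustration that assumes non-preemptive scheduling with all jobs arriving at time zero; it does not model the intuitionistic fuzzy inference system, and the job list and the function `average_waiting_time` are hypothetical.

```python
def average_waiting_time(bursts):
    """Average waiting time when jobs run back to back in the given order."""
    waited, clock = 0, 0
    for burst in bursts:
        waited += clock  # this job waited for everything scheduled before it
        clock += burst
    return waited / len(bursts)


# Hypothetical jobs: (name, priority, estimated burst time).
# A lower priority value means the job runs earlier.
jobs = [("A", 2, 8), ("B", 1, 4), ("C", 3, 1)]

# Conventional priority scheduling: order by priority only.
by_priority = [burst for _, _, burst in sorted(jobs, key=lambda job: job[1])]
# Shortest job first: order by estimated burst time only.
by_sjf = sorted(burst for _, _, burst in jobs)

priority_wait = average_waiting_time(by_priority)
sjf_wait = average_waiting_time(by_sjf)
```

Under these assumptions shortest job first gives the lowest average waiting time, which is why it serves as a reference point for priority-based schedulers.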


Author(s):  
Ghalem Belalem

Data grids have become an interesting and popular domain in the grid community (Foster and Kesselman, 2004). Generally, grids are proposed as solutions for large-scale systems, where data replication is a well-known technique used to reduce access latency and bandwidth consumption and to increase availability. In spite of the advantages of replication, many problems remain to be solved, such as:
• The replica placement problem, which determines the optimal locations of replicated data in order to reduce storage cost and data access cost (Xu et al., 2002);
• The problem of determining which replica will be accessed, in terms of consistency, when a read or write operation must be executed (Ranganathan and Foster, 2001);
• The problem of the degree of replication, which consists of finding a minimal number of replicas without reducing the performance of user applications;
• The problem of replica consistency, which concerns the consistency of a set of replicated data. This consistency provides the user with a completely coherent view of all the replicas (Gray et al., 1996).
Our principal aim in this article is to integrate into the consistency management service an approach based on an economic model for resolving conflicts detected in the data grid.


Fuzzy Systems ◽  
2017 ◽  
pp. 516-539
Author(s):  
Nazanin Saadat ◽  
Amir Masoud Rahmani

One of the challenges of data grids is to access widely distributed data quickly and efficiently while providing maximum data availability with minimum latency. Data replication is an efficient way to address this challenge: by creating and storing replicas, it makes it possible to access similar data in different locations of the data grid and can shorten the time needed to fetch files. However, as the number and storage size of grid sites are limited, an optimized and effective replacement algorithm is needed to improve the efficiency of replication. In this paper, the authors propose a novel two-level replacement algorithm that uses a Fuzzy Replica Preserving Value Evaluator System (FRPVES) to evaluate the value of each replica. The algorithm was tested using OptorSim, a grid simulator developed by the European DataGrid project. Simulation results show that the proposed algorithm performs better than other algorithms in terms of job execution time, total number of replications, and effective network usage.


2020 ◽  
Vol 34 (31) ◽  
pp. 2050346
Author(s):  
Rajesh Kondabala ◽  
Vijay Kumar ◽  
Amjad Ali ◽  
Manjit Kaur

In this paper, a novel astrophysics-based prediction framework is developed for estimating the binding affinity of a glucose binder. The proposed framework uses molecular properties to predict the binding affinity. It also uses an astrophysics-learning strategy that incorporates concepts from Kepler’s law during the prediction process. The proposed framework is compared with 10 regression algorithms on the ZINC dataset. Experimental results reveal that the proposed framework predicts binding affinity with 99.30% accuracy, whereas a decision tree achieves 97.14% accuracy. Cross-validation results show that the proposed framework provides better accuracy than the other existing models. The developed framework enables researchers to screen glucose binders rapidly. It also reduces the computational time for designing small glucose-binding molecules.


2020 ◽  
Vol 16 (3) ◽  
pp. 291-307
Author(s):  
Ahmed Berkennou ◽  
Ghalem Belalem ◽  
Said Limam

Connected objects have become increasingly popular in recent years, leading to the connection of more than 50 billion objects by the end of 2020. This large number of objects will generate a huge amount of data, which is currently processed and stored in the cloud. Fog Computing presents a promising solution to the problems of high latency and heavy network traffic encountered in the cloud. As Fog infrastructures are dense, heterogeneous, and geo-distributed, managing data to satisfy user demand in such a context is very complicated. In this work, we propose a data management strategy called ‘RMS-HaFC’ that takes the characteristics of the Fog Computing environment into account. To do so, we propose a hierarchical multi-layer model, on which we design a migration and replication strategy based on data popularity. These strategies duplicate files dynamically and store them in different locations to improve the response time of user requests and minimize system energy consumption without overloading the network. The strategy was evaluated using the iFogSim simulator, and the experimental results obtained are very promising.

