A Case for Splitting a File for Data Placement in a Distributed Scientific Workflow

Author(s):  
Hindol Bhattacharya ◽  
Matangini Chattopadhyay ◽  
Samiran Chattopadhyay
Author(s):  
Zhanghui Liu ◽  
Tao Xiang ◽  
Bing Lin ◽  
Xinshu Ye ◽  
Haijiang Wang ◽  
...  

2020 ◽  
Vol 13 (5) ◽  
pp. 871-883
Author(s):  
Avinash Kaur ◽  
Pooja Gupta ◽  
Parminder Singh ◽  
Manpreet Singh

Background: Many communities and enterprises deploy scientific workflow applications on cloud services. Aims: The main aim of the cloud service provider is to execute the workflows with minimal budget and makespan. Most existing techniques for optimizing budget and makespan were designed for traditional computing platforms and do not apply to cloud computing platforms, which have their own resource management methods and service-based pricing strategies. Methods: In this paper, we study the joint optimization of cost and makespan when scheduling workflows in IaaS clouds and propose a novel workflow scheduling scheme that also incorporates data placement. Results: In this scheme, the DPO-HEFT (Data Placement Oriented HEFT) algorithm is developed, which closely integrates the data placement mechanism with the list scheduling heuristic HEFT. Extensive experiments using real-world and synthetic workflows demonstrate the efficacy of our scheme. Conclusion: Our scheme can achieve significantly better cost and makespan trade-off fronts with remarkably higher hypervolume and can run up to hundreds of times faster than the state-of-the-art algorithms.


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Zheyi Chen ◽  
Xu Zhao ◽  
Bing Lin

In hybrid cloud environments, reasonable data placement strategies are critical to the efficient execution of scientific workflows. Because of varying loads, bandwidth fluctuations, and network congestion between data centers, as well as the dynamics of hybrid cloud environments, the data transmission time is uncertain, which poses major challenges to efficient data placement for scientific workflows. However, most traditional data placement solutions focus on deterministic cloud environments, which leads to excessive data transmission time for scientific workflows. To address this problem, we propose an adaptive discrete particle swarm optimization algorithm based on fuzzy theory and genetic algorithm operators (DPSO-FGA) to minimize the fuzzy data transmission time of scientific workflows. DPSO-FGA can rationally place scientific workflow data while meeting data privacy requirements and the capacity limits of data centers. Simulation results show that DPSO-FGA can effectively reduce the fuzzy data transmission time of scientific workflows in hybrid cloud environments.


2019 ◽  
Vol 15 (7) ◽  
pp. 4254-4265
Author(s):  
Bing Lin ◽  
Fangning Zhu ◽  
Jianshan Zhang ◽  
Jiaqing Chen ◽  
Xing Chen ◽  
...  

2014 ◽  
Vol 22 (3) ◽  
pp. 277
Author(s):  
Qiao Huijie ◽  
Lin Congtian ◽  
Wang Jiangning ◽  
Ji Liqiang
