Particle Swarm Approach to Scheduling Work-Flow Applications in Distributed Data-Intensive Computing Environments

Author(s):  
Hongbo Liu ◽  
Shichang Sun ◽  
Ajith Abraham
2019 ◽  
Vol 8 (3) ◽  
pp. 7440-7446

In a distributed data-intensive computing environment, assigning tasks to specific machines securely is a major challenge for the job scheduling problem. The complexity of this problem increases with the size of the job, and it is hard to solve effectively. Several metaheuristic algorithms, including the particle swarm optimization (PSO) strategy and the variable neighborhood particle swarm optimization (VNPSO) technique, are used to tackle the job scheduling problem in distributed computing. To satisfy the security constraints and minimize the cost function while allocating tasks to machines, we propose a modified PSO with a scout adjustment (MPSO-SA) algorithm, which uses a cyclic term called the mutation operator to obtain the best cost function. The performance of the proposed MPSO-SA scheduling mechanism is compared with the genetic algorithm (GA), PSO and VNPSO techniques, and the experimental results show that the proposed method reduces the probability of risk under security constraints and has better convergence properties than the existing methods.
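The general idea of PSO-based task assignment with a mutation step can be illustrated with a minimal sketch. This is not the MPSO-SA algorithm from the abstract: the cost function here is plain makespan on identical machines with no security constraints, and the names `pso_schedule`, `decode`, `makespan`, the parameter values, and the mutation rule (re-drawing a coordinate at random with probability `p_mut`) are all assumptions made for illustration.

```python
import random

def makespan(assignment, task_len, n_machines):
    """Finish time of the busiest machine for a task -> machine assignment."""
    load = [0.0] * n_machines
    for task, machine in enumerate(assignment):
        load[machine] += task_len[task]
    return max(load)

def decode(position, n_machines):
    """Map a continuous particle position to one machine index per task."""
    return [min(int(x), n_machines - 1) for x in position]

def pso_schedule(task_len, n_machines, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, p_mut=0.05, seed=0):
    """Hypothetical PSO scheduler: returns (assignment, makespan)."""
    rng = random.Random(seed)
    n = len(task_len)
    hi = float(n_machines)

    def cost(p):
        return makespan(decode(p, n_machines), task_len, n_machines)

    # Random initial positions in [0, n_machines), zero initial velocities.
    pos = [[rng.uniform(0, hi) for _ in range(n)] for _ in range(n_particles)]
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                x = pos[i][d] + vel[i][d]
                # Assumed mutation rule: occasionally re-draw the coordinate.
                if rng.random() < p_mut:
                    x = rng.uniform(0, hi)
                pos[i][d] = min(max(x, 0.0), hi)  # keep inside the search box
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return decode(gbest, n_machines), gbest_cost
```

Calling `pso_schedule([4, 7, 2, 9, 5, 3], 3)` returns a machine index for each of the six tasks together with the makespan of that assignment. A security-aware variant would add a risk penalty to `cost`, which this sketch leaves out.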


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1471
Author(s):  
Jun-Yeong Lee ◽  
Moon-Hyun Kim ◽  
Syed Asif Raza Shah ◽  
Sang-Un Ahn ◽  
Heejun Yoon ◽  
...  

Data are important and ever growing in data-intensive scientific environments. Such growth in research data requires data storage systems that play pivotal roles in data management and analysis for scientific discoveries. Redundant Array of Independent Disks (RAID), a well-known storage technology that combines multiple disks into a single large logical volume, has been widely used for data redundancy and performance improvement. However, it requires RAID-capable hardware or software to build a RAID-enabled disk array, and RAID-based storage is difficult to scale up. To mitigate these problems, many distributed file systems have been developed and are actively used in various environments, especially in data-intensive computing facilities, where a tremendous amount of data has to be handled. In this study, we investigated and benchmarked several distributed file systems, namely Ceph, GlusterFS, Lustre and EOS, for data-intensive environments. In our experiment, we configured the distributed file systems under a Reliable Array of Independent Nodes (RAIN) structure and a Filesystem in Userspace (FUSE) environment. Our results identify the characteristics of each file system that affect read and write performance depending on the features of the data, which have to be considered in data-intensive computing environments.


2013 ◽  
Vol 29 (3) ◽  
pp. 739-750 ◽  
Author(s):  
Lizhe Wang ◽  
Jie Tao ◽  
Rajiv Ranjan ◽  
Holger Marten ◽  
Achim Streit ◽  
...  

2013 ◽  
Vol 756-759 ◽  
pp. 3318-3323
Author(s):  
Qi Zhi Deng ◽  
Long Bo Zhang ◽  
Xin Qian ◽  
Ya Li Chen ◽  
Feng Ying Wang

To improve the scalability of data processing and the data availability that data mining techniques require in data-intensive computing, this paper presents a new tree learning method. By introducing MapReduce, the tree learning method based on SPRINT achieves good scalability when handling large datasets. Moreover, we define the split-point search as a series of distributed computations, each of which is implemented with the MapReduce model. A new data structure called the class distribution table is introduced to assist the calculation of the histogram. Experiments and analysis of the results show that the algorithm has strong data mining capabilities for data-intensive computing environments.
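The split-point search described above can be sketched as a map step that emits per-record counts and a reduce step that aggregates them into a table of class counts per attribute value. The sketch below runs in a single process and only imitates the MapReduce pattern. The table layout and the names `map_record`, `reduce_counts` and `best_split` are assumptions, not the paper's actual class distribution table. It scores splits with the Gini index, the criterion SPRINT uses.

```python
from collections import Counter, defaultdict

def map_record(record):
    """Map step: emit ((value, label), 1) for one (value, label) record."""
    value, label = record
    yield (value, label), 1

def reduce_counts(pairs):
    """Reduce step: sum counts into a table {value: Counter({label: count})}."""
    table = defaultdict(Counter)
    for (value, label), n in pairs:
        table[value][label] += n
    return table

def gini(counts, total):
    """Gini impurity of a class histogram holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def best_split(table):
    """Scan distinct values in order and return (threshold, weighted Gini)
    for the best split of the form 'value <= threshold'."""
    above = Counter()
    for hist in table.values():
        above.update(hist)
    n = sum(above.values())
    below = Counter()
    best = (None, float("inf"))
    for v in sorted(table)[:-1]:          # the largest value cannot split
        below.update(table[v])            # move this value's counts below
        above.subtract(table[v])
        nb = sum(below.values())
        na = n - nb
        score = nb / n * gini(below, nb) + na / n * gini(above, na)
        if score < best[1]:
            best = (v, score)
    return best

records = [(1, "a"), (2, "a"), (3, "b"), (4, "b"), (2, "a")]
pairs = (kv for r in records for kv in map_record(r))
table = reduce_counts(pairs)
split = best_split(table)
```

In a real MapReduce job the pairs would be shuffled by key between the two steps; here they are simply chained through a generator. Because only the aggregated table is scanned, the split search touches one histogram per distinct value rather than every record.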

