Dynamic Scheduler
Recently Published Documents


TOTAL DOCUMENTS

38
(FIVE YEARS 13)

H-INDEX

6
(FIVE YEARS 2)

Author(s):  
Luan Teylo ◽  
Alan L. Nunes ◽  
Alba C. M. A. Melo ◽  
Cristina Boeres ◽  
Lucia Maria de A. Drummond ◽  
...  

2021 ◽  
Author(s):  
R. Amela ◽  
R. Badia ◽  
S. Böhm ◽  
R. Tosi ◽  
C. Soriano ◽  
...  

This deliverable focuses on the profiling activities developed in the project with the partners' applications. To carry out these profiling activities, two benchmarks were defined in collaboration with WP5. The first benchmark is an embarrassingly parallel benchmark that performs a read and then multiple writes of the same object, with the objective of stressing the memory and storage systems and evaluating the overhead when these reads and writes are performed in parallel. The second benchmark is based on the Continuation Multi-Level Monte Carlo (C-MLMC) algorithm. While this algorithm is normally executed using multiple levels, for the profiling and performance-analysis objectives the execution of a single level was sufficient, since the subsequent levels have similar performance characteristics. Additionally, while the simulation tasks can be executed in parallel (as multi-threaded tasks), in the benchmark single-threaded tasks were used in order to increase the number of simulations to be scheduled and thus stress the scheduling engine. A set of experiments based on these two benchmarks was executed on the MareNostrum 4 supercomputer, using PyCOMPSs as the underlying programming model and dynamic scheduler of the tasks involved in the executions. While the first benchmark was executed several times in a single iteration, the second benchmark was executed iteratively, with cycles of 1) execution and trace generation; 2) performance analysis; 3) improvements. This enabled several improvements in both the benchmark and the PyCOMPSs scheduler. The initial iterations focused on the C-MLMC structure itself, refactoring the code to remove fine-grained and sequential tasks and merge them into larger-granularity tasks. The following iterations focused on improving the PyCOMPSs scheduler, removing existing bottlenecks and increasing its performance by making the scheduler a multi-threaded engine. While the results can still be improved, we are satisfied with them, since the granularity of the simulations run in this evaluation step is much finer than the one that will be used in the real scenarios. The deliverable finishes with some recommendations that should be followed throughout the project in order to obtain good performance in the execution of the project codes.
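
To illustrate how the first benchmark maps onto PyCOMPSs, the sketch below expresses the read-then-parallel-writes pattern with the public @task API. This is a minimal sketch, not the deliverable's actual benchmark code; the file paths, object sizes, and number of writes are assumptions.

```python
# Minimal sketch of the first benchmark's structure: read one object, then
# write it many times in parallel. Paths and counts are illustrative only.
from pycompss.api.task import task
from pycompss.api.api import compss_barrier


@task(returns=1)
def read_object(path):
    # Single read of the shared input object (stresses the read path).
    with open(path, "rb") as f:
        return f.read()


@task()
def write_object(data, path):
    # Each write runs as an independent task, so the writes proceed in
    # parallel and stress the memory and storage systems under contention.
    with open(path, "wb") as f:
        f.write(data)


def benchmark(input_path, n_writes):
    data = read_object(input_path)                 # future; no sync needed yet
    for i in range(n_writes):
        write_object(data, f"/tmp/copy_{i}.bin")   # hypothetical output paths
    compss_barrier()                               # wait for all writes
```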


2020 ◽  
Vol 6 (2) ◽  
pp. 575-585
Author(s):  
James Hall ◽  
Klaus Moessner ◽  
Richard MacKenzie ◽  
Francois Carrez ◽  
Chuan Heng Foh

Most present-day applications are data- and compute-intensive, which led to the invention of technologies like Hadoop. Hadoop uses the MapReduce framework for parallel processing of big-data applications using the computing resources of multiple nodes. Hadoop is designed for cluster environments and has some limitations when executed in cloud environments. Hadoop on the cloud has nevertheless become a common choice due to the easy establishment of infrastructure and the pay-as-you-use model. Hadoop performance on cloud infrastructures is affected by the virtualization overhead of the cloud environment. The execution times of Hadoop on the cloud can be improved if the virtual resources are used effectively to schedule tasks, by studying the resource-usage characteristics of the tasks and the resource availability of the nodes. The proposed work builds a dynamic scheduler for the Hadoop framework that makes scheduling decisions dynamically based on job resource usage and node load. The results of the proposed work indicate an improvement of up to 23% in the execution time of Hadoop MapReduce applications.
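
The scheduling decision described in this abstract can be illustrated with a short sketch. The code below is not Hadoop's actual scheduler interface; the Node and TaskProfile fields and the headroom heuristic are assumptions made for illustration.

```python
# Illustrative sketch (not Hadoop's scheduler API): place a task on the node
# whose spare capacity best covers the task's observed resource profile.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    cpu_free: float   # fraction of CPU currently idle, 0..1
    mem_free: float   # fraction of memory currently free, 0..1


@dataclass
class TaskProfile:
    cpu_demand: float  # measured per-task CPU usage, 0..1
    mem_demand: float  # measured per-task memory usage, 0..1


def pick_node(task: TaskProfile, nodes: list[Node]) -> Optional[Node]:
    """Return the node that can fit the task with the most headroom left."""
    candidates = [n for n in nodes
                  if n.cpu_free >= task.cpu_demand
                  and n.mem_free >= task.mem_demand]
    if not candidates:
        return None  # defer the task until some node's load drops
    # Prefer the node with the largest remaining headroom after placement.
    return max(candidates,
               key=lambda n: (n.cpu_free - task.cpu_demand)
                           + (n.mem_free - task.mem_demand))
```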


2019 ◽  
Vol 8 (3) ◽  
pp. 3424-3428

Real-time traffic, also called inelastic traffic, consists of flows that must be delivered within a specified period of time. Several kinds of applications generate such flows: multimedia, audio-video conferencing, webinars, interactive gaming, webcams, Internet TV, etc. Interactive applications demand speedy delivery; flows reaching the destination after the deadline are considered useless. Real-time traffic therefore imposes rigid delivery requirements: all real-time flows should be delivered on time and accumulated at the destination. The proposed Dynamic Scheduler relies on a Dynamic Packet Scheduling ratio that changes dynamically with the number of flows accumulated in the two queues. Flows are scheduled based on the maximum number of flows allowed on the path, calculated as TPmax, and packets are scheduled based on the available throughput of the network. This dynamic scheduling guarantees fair treatment of both real-time and non-real-time traffic. In this paper we propose the Dynamic Scheduler (DPS), which works dynamically according to the number of flows present in the real-time and non-real-time queues.
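
The core mechanism of this abstract, a service ratio recomputed from the occupancy of the two queues, can be sketched as follows. The queue representation, the per-round packet budget, and the fallback ratio of 0.5 for empty queues are assumptions, not details from the paper.

```python
# Sketch of a dynamic scheduling ratio between a real-time and a
# non-real-time queue; the ratio is recomputed from current occupancy.
from collections import deque

rt_queue, nrt_queue = deque(), deque()


def scheduling_ratio():
    """Real-time share of service, derived from current queue occupancy."""
    total = len(rt_queue) + len(nrt_queue)
    return len(rt_queue) / total if total else 0.5  # assumed fallback


def dispatch(budget):
    """Send up to `budget` packets, split by the current dynamic ratio."""
    rt_share = round(budget * scheduling_ratio())
    sent = []
    for _ in range(rt_share):
        if rt_queue:
            sent.append(rt_queue.popleft())
    for _ in range(budget - rt_share):
        if nrt_queue:
            sent.append(nrt_queue.popleft())
    return sent
```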


2019 ◽  
Vol 5 ◽  
pp. e190 ◽  
Author(s):  
Bérenger Bramas

The task-based approach has emerged as a viable way to effectively use modern heterogeneous computing nodes. It allows the development of parallel applications with an abstraction of the hardware by delegating task distribution and load balancing to a dynamic scheduler. In this organization, the scheduler is the most critical component: it solves the DAG scheduling problem in order to select the right processing unit for the computation of each task. In this work, we extend our Heteroprio scheduler, which was originally created to execute the fast multipole method on multi-GPU nodes. We improve Heteroprio by taking into account data locality during task distribution. The main principle is to use different task lists for the different memory nodes and to investigate how locality affinity between the tasks and the different memory nodes can be evaluated without looking at the tasks' dependencies. We evaluate the benefit of our method on two linear algebra applications and a stencil code. We show that simple heuristics can provide significant performance improvements and cut the total memory transfers of an execution by more than half.
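
A hedged sketch of the locality heuristic described above (illustrative logic, not the actual Heteroprio implementation): one ready list per memory node, with a task's affinity scored by the share of its input bytes already resident on that node, so no dependency information is needed.

```python
# Illustrative sketch of locality-aware ready lists (not the real Heteroprio
# code). A task's inputs are modeled as (size_in_bytes, memory_node) pairs.
def affinity(task_inputs, node):
    """Fraction of the task's input bytes already resident on `node`."""
    total = sum(size for size, _ in task_inputs)
    local = sum(size for size, loc in task_inputs if loc == node)
    return local / total if total else 0.0


def push_ready_task(task, task_inputs, ready_lists, nodes):
    """Place a ready task on the list of its best-affinity memory node,
    without inspecting the task's dependencies."""
    best = max(nodes, key=lambda n: affinity(task_inputs, n))
    ready_lists[best].append(task)


def pop_task(node, ready_lists, nodes):
    """A worker tied to `node` prefers its own list, then steals elsewhere."""
    if ready_lists[node]:
        return ready_lists[node].pop()
    for other in nodes:
        if other != node and ready_lists[other]:
            return ready_lists[other].pop()
    return None
```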


2019 ◽  
Author(s):  
Bérenger Bramas

The task-based approach has gained much attention as a way to use modern heterogeneous computing nodes. It allows applications to be parallelized with an abstraction of the hardware by delegating task distribution and load balancing to a dynamic scheduler. In this organization, the scheduler is the most critical component: it solves the DAG scheduling problem in order to select the right processing unit for the computation of each task. In this work, we extend our Heteroprio scheduler, which was originally created to execute the fast multipole method on multi-GPU nodes. We improve Heteroprio by taking into account data locality during task assignment. The main principle is to use different task lists for the different memory nodes and to investigate how locality affinity between the tasks and the different memory nodes can be evaluated without looking at the tasks' dependencies. The benefit of the method was evaluated on two linear algebra applications and a stencil code, showing that simple heuristics can provide significant performance improvements and cut the total memory transfers of an execution by more than half.

