Work-in-Progress Abstract: The impact of the period variation on execution time distributions of programs

Author(s):  
Liliana Cucu-Grosjean ◽  
Avner Bar-Hen ◽  
Yves Sorel ◽  
Hadrien Clarke
Author(s):  
Vianney Kengne Tchendji ◽  
Jean Frederic Myoupo ◽  
Gilles Dequen

In this paper, the authors highlight the close relations between the execution time, efficiency and number of communication rounds in a family of CGM-based parallel algorithms for the optimal binary search tree (OBST) problem. In this case, these three parameters cannot be improved simultaneously. The family of CGM (Coarse Grained Multicomputer) algorithms they derive is based on Knuth's sequential solution running in O(n^2) time and space, where n is the size of the problem. These CGM algorithms use p processors, each with local memory. In general, the authors show that each algorithm runs in with communication rounds. is the granularity of their model, and is a parameter that depends on and . The special case of yields a load-balanced CGM-based parallel algorithm with communication rounds and execution steps. Alternatively, if , they obtain another algorithm with better execution time, say , the absence of any load-balancing and communication rounds, i.e., not better than the first algorithm. The authors show that the granularity plays a crucial role in the different techniques they use to partition the problem, and they study the impact of each scheduling algorithm. To the best of their knowledge, this is the first unified method to derive a set of parameter-dependent CGM-based parallel algorithms for the OBST problem.
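For orientation, the sequential baseline mentioned above can be sketched as follows. This is a minimal illustration of the standard dynamic program with Knuth's root-monotonicity speedup, not the authors' CGM algorithm. It assumes a simplified formulation with access frequencies for successful searches only, with the root counted at depth 1; the function name `optimal_bst` and the sample frequencies are hypothetical.

```python
def optimal_bst(freq):
    """Return (minimum weighted search cost, root table) for sorted keys.

    freq[k] is the access frequency of key k. cost[i][j] covers the
    half-open key range i..j-1, so cost[i][i] is the empty tree (0).
    """
    n = len(freq)
    # Prefix sums give the total frequency of any key range in O(1).
    prefix = [0] * (n + 1)
    for k, f in enumerate(freq):
        prefix[k + 1] = prefix[k] + f

    cost = [[0] * (n + 1) for _ in range(n + 1)]
    root = [[0] * (n + 1) for _ in range(n + 1)]

    # Base case: a single key is its own root.
    for i in range(n):
        cost[i][i + 1] = freq[i]
        root[i][i + 1] = i

    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            weight = prefix[j] - prefix[i]
            best_cost = None
            best_root = None
            # Knuth's observation: an optimal root for i..j-1 lies between
            # the optimal roots of the two ranges that are one key shorter.
            for r in range(root[i][j - 1], root[i + 1][j] + 1):
                candidate = cost[i][r] + cost[r + 1][j]
                if best_cost is None or candidate < best_cost:
                    best_cost = candidate
                    best_root = r
            # Every key in the range sinks one level below the chosen root.
            cost[i][j] = best_cost + weight
            root[i][j] = best_root

    return cost[0][n], root
```

Restricting the inner loop to the window between neighbouring roots is what brings the total work down from cubic to quadratic. The table is filled diagonal by diagonal, and each entry depends only on entries from shorter ranges.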


2020 ◽  
Author(s):  
Alan Cheville ◽  
Atsushi Akera ◽  
Donna Riley ◽  
Jennifer Karlin ◽  
Sarah Appelhans ◽  
...  

2021 ◽  
Author(s):  
Jeanne Alcantara

Apache Spark enables a big data application (one that takes massive data as input and may produce massive data during its execution) to run in parallel on multiple nodes. Performance is therefore a vital issue for such applications. This project analyzes a WordCount application running on Apache Spark and assesses how the executor configuration affects execution time and average node utilization. To this end, the number of executor cores and the size of executor memory are varied across different input data sizes and different numbers of nodes in the cluster. The conclusion is that different (data size, number of nodes) pairs require different numbers of executor cores and different executor memory sizes to obtain the best execution time and average node utilization.
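A parameter sweep of this kind could be scripted along the following lines. This is only a sketch of how the configurations might be enumerated, not the project's actual setup: the helper `executor_sweep`, the application path, the input path and the option values are all hypothetical. The only Spark-specific details assumed are the standard `spark-submit` flags `--executor-cores` and `--executor-memory`.

```python
from itertools import product


def executor_sweep(app_path, input_path, core_options, memory_options):
    """Build one spark-submit argument list per (cores, memory) pair."""
    commands = []
    for cores, memory in product(core_options, memory_options):
        commands.append([
            "spark-submit",
            "--executor-cores", str(cores),   # cores per executor
            "--executor-memory", memory,      # e.g. "2g"
            app_path,                         # hypothetical WordCount script
            input_path,                       # hypothetical input data set
        ])
    return commands
```

Each returned list can be passed to `subprocess.run` and timed. Repeating the sweep for every input size and cluster size gives the grid of measurements the abstract describes.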



