Performance modeling of parallel applications on MPSoCs

2009 International Symposium on System-on-Chip ◽

10.1109/socc.2009.5335675 ◽

2009 ◽

Author(s):

Marco Lattuada ◽

Christian Pilato ◽

Antonino Tumeo ◽

Fabrizio Ferrandi

Keyword(s):

Performance Modeling ◽

Parallel Applications

A mechanism for balancing accuracy and scope in cross-machine black-box GPU performance modeling

The International Journal of High Performance Computing Applications ◽

10.1177/1094342020921340 ◽

2020 ◽

Vol 34 (6) ◽

pp. 589-614

Author(s):

James D Stevens ◽

Andreas Klöckner

Keyword(s):

Performance Optimization ◽

Heterogeneous Computing ◽

Performance Modeling ◽

Matrix Multiplication ◽

Black Box ◽

Ease Of Use ◽

Performance Tuning ◽

Parallel Applications ◽

Accuracy Evaluation

The ability to model, analyze, and predict execution time of computations is an important building block that supports numerous efforts, such as load balancing, benchmarking, job scheduling, developer-guided performance optimization, and the automation of performance tuning for high performance, parallel applications. In today’s increasingly heterogeneous computing environment, this task must be accomplished efficiently across multiple architectures, including massively parallel coprocessors like GPUs, which are increasingly prevalent in the world’s fastest supercomputers. To address this challenge, we present an approach for constructing customizable, cross-machine performance models for GPU kernels, including a mechanism to automatically and symbolically gather performance-relevant kernel operation counts, a tool for formulating mathematical models using these counts, and a customizable parameterized collection of benchmark kernels used to calibrate models to GPUs in a black-box fashion. With this approach, we empower the user to manage trade-offs between model accuracy, evaluation speed, and generalizability. A user can define their own model and customize the calibration process, making it as simple or complex as desired, and as application-targeted or general as desired. As application examples of our approach, we demonstrate both linear and nonlinear models; these examples are designed to predict execution times for multiple variants of a particular computation: two matrix-matrix multiplication variants, four discontinuous Galerkin differentiation operation variants, and two 2D five-point finite difference stencil variants. For each variant, we present accuracy results on GPUs from multiple vendors and hardware generations. 
We view this highly user-customizable approach as a response to a central question arising in GPU performance modeling: how can we model GPU performance in a cost-explanatory fashion while maintaining accuracy, evaluation speed, portability, and ease of use, an attribute we believe precludes approaches requiring manual collection of kernel or hardware statistics?
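
The linear models mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: it assumes a hypothetical two-term model, time ≈ w_flop · (floating-point operation count) + w_mem · (memory access count), and calibrates the two weights by ordinary least squares on made-up benchmark measurements. The function names, the feature choice, and all numbers are assumptions for illustration only.

```python
def fit_linear_model(counts, times):
    """Least-squares fit of times[i] ~ w0*counts[i][0] + w1*counts[i][1].

    Solves the 2x2 normal equations (A^T A) w = A^T t directly.
    """
    a11 = sum(c[0] * c[0] for c in counts)
    a12 = sum(c[0] * c[1] for c in counts)
    a22 = sum(c[1] * c[1] for c in counts)
    b1 = sum(c[0] * t for c, t in zip(counts, times))
    b2 = sum(c[1] * t for c, t in zip(counts, times))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("benchmark kernels do not separate the two features")
    w0 = (b1 * a22 - b2 * a12) / det
    w1 = (a11 * b2 - a12 * b1) / det
    return w0, w1


def predict(weights, count):
    """Predicted execution time for one kernel's operation counts."""
    return weights[0] * count[0] + weights[1] * count[1]


# Hypothetical calibration data: (flop count, memory access count) per
# benchmark kernel, in arbitrary units, with synthetic "measured" times
# generated from per-unit costs of 0.5 and 0.25.
benchmark_counts = [(1.0, 4.0), (2.0, 1.0), (4.0, 2.0), (8.0, 16.0)]
benchmark_times = [0.5 * f + 0.25 * m for f, m in benchmark_counts]

weights = fit_linear_model(benchmark_counts, benchmark_times)

# Predict the time of a kernel that was not part of the calibration set.
predicted = predict(weights, (3.0, 5.0))
print("calibrated weights:", weights)
print("predicted time:", predicted)
```

In this sketch the calibration step never inspects the hardware, only operation counts and timings, which is the sense in which such a model is black-box. Real measurements are noisy, so a practical fit would use many more benchmark kernels than unknown weights.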

Performance Modeling based on Multidimensional Surface Learning for Performance Predictions of Parallel Applications in Non-Dedicated Environments

2006 International Conference on Parallel Processing (ICPP'06) ◽

10.1109/icpp.2006.60 ◽

2006 ◽

Author(s):

J. Yagnik ◽

H.A. Sanjay ◽

S. Vadhiyar

Keyword(s):

Performance Modeling ◽

Parallel Applications ◽

Multidimensional Surface ◽

Performance Predictions ◽

Surface Learning

Teuta: Tool Support for Performance Modeling of Distributed and Parallel Applications

Computational Science - ICCS 2004 - Lecture Notes in Computer Science ◽

10.1007/978-3-540-24688-6_60 ◽

2004 ◽

pp. 456-463

Author(s):

Thomas Fahringer ◽

Sabri Pllana ◽

Johannes Testori

Keyword(s):

Performance Modeling ◽

Parallel Applications

Performance modeling of parallel applications for grid scheduling

Journal of Parallel and Distributed Computing ◽

10.1016/j.jpdc.2008.02.006 ◽

2008 ◽

Vol 68 (8) ◽

pp. 1135-1145

Author(s):

H.A. Sanjay ◽

Sathish Vadhiyar

Keyword(s):

Performance Modeling ◽

Parallel Applications ◽

Grid Scheduling

Methods of inference and learning for performance modeling of parallel applications

Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming - PPoPP '07 ◽

10.1145/1229428.1229479 ◽

2007 ◽

Author(s):

Benjamin C. Lee ◽

David M. Brooks ◽

Bronis R. de Supinski ◽

Martin Schulz ◽

Karan Singh ◽

...

Keyword(s):

Performance Modeling ◽

Parallel Applications

Human performance modeling using the signal detection paradigm

PsycEXTRA Dataset ◽

10.1037/e574122012-015 ◽

1983 ◽

Author(s):

John F. McGrew

Keyword(s):

Signal Detection ◽

Human Performance ◽

Performance Modeling

Human performance modeling for operational Command, Control and Communication

PsycEXTRA Dataset ◽

10.1037/e577362012-018 ◽

2005 ◽

Author(s):

Jeffrey T. Hansberger ◽

Diane Barnette

Keyword(s):

Human Performance ◽

Performance Modeling

Impacting system design with human performance modeling and experiment: Another success story

PsycEXTRA Dataset ◽

10.1037/e577772012-003 ◽

2006 ◽

Author(s):

Diane K. Mitchel ◽

Jessie Y. C. Chen

Keyword(s):

System Design ◽

Human Performance ◽

Performance Modeling

Statistical Performance Modeling and Optimization

10.1561/9781601980571 ◽

2006 ◽

Author(s):

Xin Li ◽

Jiayong Le ◽

Lawrence T Pileggi

Keyword(s):

Performance Modeling ◽

Modeling And Optimization

PERFORMANCE MODELING OF PARALLEL-CONNECTED RANQUE-HILSCH VORTEX TUBES USING A GENERALIZABLE AND ROBUST ANN

Heat Transfer Research ◽

10.1615/heattransres.2020035587 ◽

2020 ◽

Vol 51 (15) ◽

pp. 1399-1415

Author(s):

Hüseyin Kaya ◽

Volkan Kirmaci ◽

Hüseyin Avni Es

Keyword(s):

Performance Modeling
