Productive Parallel Programming: The PCN Approach

1992 ◽  
Vol 1 (1) ◽  
pp. 51-66 ◽  
Author(s):  
Ian Foster ◽  
Robert Olson ◽  
Steven Tuecke

We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.
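PCN's concurrent-composition notation is its own language, so the Python sketch below only illustrates the underlying idea from the abstract: composing reusable sequential kernels (in PCN, typically existing Fortran or C routines) into a concurrent program. All names here are illustrative assumptions, not PCN syntax.

```python
# Illustrative sketch only: PCN's actual notation is not Python. This mimics
# the *idea* of composing reusable sequential kernels concurrently.
from concurrent.futures import ThreadPoolExecutor

def transform(block):            # stand-in for an existing sequential C/Fortran kernel
    return [2 * x for x in block]

def reduce_sum(blocks):          # stand-in for a sequential reduction kernel
    return sum(sum(b) for b in blocks)

def parallel_composition(data, workers=4):
    """Run `transform` over independent blocks concurrently, then reduce."""
    chunk = max(1, len(data) // workers)
    blocks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(transform, blocks))
    return reduce_sum(results)

if __name__ == "__main__":
    print(parallel_composition(list(range(10))))  # -> 90
```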

2006 ◽  
Vol 21 (3) ◽  
pp. 205-219 ◽  
Author(s):  
RICHARD ANTHONY

This paper presents an empirical investigation of policy-based self-management techniques for parallel applications executing in loosely-coupled environments. The dynamic and heterogeneous nature of these environments is discussed and the special considerations for parallel applications are identified. An adaptive strategy for the run-time deployment of tasks of parallel applications is presented. The strategy is based on embedding numerous policies which are informed by contextual and environmental inputs. The policies govern various aspects of behaviour, enhancing flexibility so that the goals of efficiency and performance are achieved despite high levels of environmental variability. A prototype self-managing parallel application is used as a vehicle to explore the feasibility and benefits of the strategy. In particular, several aspects of stability are investigated. The implementation and behaviour of three policies are discussed and sample results examined.
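As a rough illustration of the kind of embedded, context-informed policy the paper investigates, the sketch below shows one hypothetical deployment policy; the `NodeState` fields and the thresholds are assumptions for illustration, not values taken from the paper.

```python
# Hypothetical sketch of a context-informed deployment policy.
from dataclasses import dataclass

@dataclass
class NodeState:
    load: float          # current CPU load, 0.0-1.0 (environmental input)
    reliability: float   # observed fraction of tasks completed without failure

def deployment_policy(nodes, load_limit=0.8, min_reliability=0.9):
    """Select nodes eligible to receive new tasks, given environmental inputs."""
    return [n for n in nodes
            if n.load < load_limit and n.reliability >= min_reliability]

# One policy among several; stability could be improved with hysteresis, e.g.
# only withdrawing a node after it stays over the limit for several periods.
nodes = [NodeState(0.3, 0.99), NodeState(0.95, 0.99), NodeState(0.2, 0.5)]
print(deployment_policy(nodes))  # only the first node qualifies
```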


Author(s):  
A. I. Dordopulo

In this paper, we review and compare two approaches to developing parallel applications: automatic program parallelization for computer systems with shared and distributed memory, and reduction of an information graph's hardware cost and performance for reconfigurable computer systems. As the number of processing units or the problem size grows, the complexity of automatically parallelizing a procedural program increases so sharply that obtaining parallelization results in acceptable time on state-of-the-art computer systems becomes highly problematic. In reconfigurable computer systems, a parallel program is instead created by reducing the fully parallel information graph of a problem, which captures both the parallelization and the pipelining of computations. In addition to the traditionally used reduction of the number of basic subgraphs, reductions of the number of computational operations and of the data bit width can be applied to scale performance or hardware cost. We prove that the reduction methods require considerably fewer steps to adapt a parallel application to reconfigurable computer architectures than automatic parallelization does. We also prove a theorem on the coefficient value of a sequential reduction, a theorem on increasing the reduction coefficient to a custom value, and a theorem on the commutativity of reduction transformations. Together, these theorems help to find a rational sequence of reduction transformations.
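As a rough numerical illustration of the reduction trade-off described above, the sketch below assumes that a reduction with coefficient k divides hardware cost by k and multiplies execution time by k, and that coefficients compose multiplicatively. This simplified model is an assumption for illustration, not the paper's exact formalism, though it does make the reduction transformations commute as the theorems suggest.

```python
# Minimal numerical sketch of hardware/performance reduction (assumed model).
def apply_reductions(hardware_cost, time_per_iteration, coefficients):
    """Each reduction trades hardware for time by its coefficient k:
    hardware is divided by k, execution time is multiplied by k."""
    for k in coefficients:
        hardware_cost /= k
        time_per_iteration *= k
    return hardware_cost, time_per_iteration

# In this simplified model the order of reductions does not change the final
# cost/performance point, only the path taken to reach it:
print(apply_reductions(1024.0, 1.0, [2, 4]))   # (128.0, 8.0)
print(apply_reductions(1024.0, 1.0, [4, 2]))   # (128.0, 8.0)
```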


2002 ◽  
Vol 12 (02) ◽  
pp. 157-174 ◽  
Author(s):  
MARTIN ALT ◽  
HOLGER BISCHOF ◽  
SERGEI GORLATCH

We address the challenging problem of algorithm and program design for the Computational Grid by providing the application user with a set of high-level, parameterised components called skeletons. We describe a Java-based Grid programming system in which algorithms are composed of skeletons and the computational resources for executing individual skeletons are chosen using performance prediction. The advantage of our approach is that skeletons are reusable across different applications and that their implementations can be tuned to particular machines. The focus of this paper is on predicting the performance of Grid applications constructed using skeletons.
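The paper's system is Java-based; the Python sketch below only illustrates the two ideas named in the abstract, reusable parameterised skeletons and resource selection by predicted performance. All names and the toy cost model are assumptions, not the paper's prediction method.

```python
# A classic task-farm skeleton plus a toy performance-prediction step.
def farm(worker, inputs):
    """Reusable skeleton: apply `worker` to every independent input."""
    return [worker(x) for x in inputs]

def predict_runtime(n_tasks, server):
    """Toy prediction: per-task compute time plus per-task network latency."""
    return n_tasks * (server["time_per_task"] + server["latency"])

def choose_server(n_tasks, servers):
    return min(servers, key=lambda s: predict_runtime(n_tasks, s))

servers = [{"name": "local",  "time_per_task": 0.08, "latency": 0.001},
           {"name": "remote", "time_per_task": 0.01, "latency": 0.050}]
best = choose_server(1000, servers)   # remote wins once compute dominates
print(best["name"], farm(lambda x: x * x, range(5)))
```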


2016 ◽  
Vol 188 ◽  
pp. 591-602 ◽  
Author(s):  
Bruce C. Gates

The 2016 Faraday Discussion on the topic "Designing New Heterogeneous Catalysts" brought together a group of scientists and engineers to address forefront topics in catalysis and the challenge of catalyst design, which is daunting because of the intrinsic non-uniformity of the surfaces of catalytic materials. "Catalyst design" has taken on a pragmatic meaning, implying the discovery of new and better catalysts on the basis of a fundamental understanding of catalyst structure and performance. The presentations and discussion at the meeting illustrate the rapid progress in this understanding, linked with improvements in spectroscopy, microscopy, theory, and catalyst performance testing. The following text includes a statement of recurrent themes in the discussion and examples of forefront science that evidence progress toward catalyst design.


2010 ◽  
Vol 19 (5) ◽  
pp. 387-391 ◽  
Author(s):  
Andries de Grip ◽  
Bronwyn H. Hall ◽  
Wendy Smits

Author(s):  
Francesco Cremonesi ◽  
Georg Hager ◽  
Gerhard Wellein ◽  
Felix Schürmann

Big science initiatives are trying to reconstruct and model the brain by attempting to simulate brain tissue at larger scales and with increasingly more biological detail than previously thought possible. The exponential growth of parallel computer performance has been supporting these developments, and at the same time maintainers of neuroscientific simulation code have striven to optimally and efficiently exploit new hardware features. Current state-of-the-art software for the simulation of biological networks has so far been developed using performance engineering practices, but a thorough analysis and modeling of the computational and performance characteristics, especially in the case of morphologically detailed neuron simulations, is lacking. Other computational sciences have successfully used analytic performance engineering, which is based on "white-box," that is, first-principles performance models, to gain insight into the computational properties of simulation kernels, aid developers in performance optimizations, and eventually drive codesign efforts; to our knowledge, however, a model-based performance analysis of neuron simulations has not yet been conducted. We present a detailed study of the shared-memory performance of morphologically detailed neuron simulations based on the Execution-Cache-Memory (ECM) performance model. We demonstrate that this model can deliver accurate predictions of the runtime of almost all the kernels that constitute the neuron models under investigation. The gained insight is used to identify the main mechanisms governing performance bottlenecks in the simulation. The implications of this analysis for the optimization of neural simulation software and eventually the codesign of future hardware architectures are discussed. In this sense, our work represents a valuable conceptual and quantitative contribution to understanding the performance properties of biological network simulations.
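The ECM model's core prediction can be sketched in a few lines. The snippet below is a simplified, single-core illustration of the commonly published ECM formulation (runtime as the maximum of overlapping in-core work and the serialized sum of non-overlapping work plus data transfers); the cycle counts are made-up placeholders, not figures from the paper.

```python
# Simplified single-core ECM-style prediction (times in cycles per unit of work).
# The no-overlap assumption for data transfers follows the common ECM
# formulation for Intel CPUs; all numbers below are placeholders.
def ecm_prediction(t_core_overlap, t_core_nonoverlap, transfer_times):
    """max(overlapping in-core work, non-overlapping work + serialized transfers)."""
    return max(t_core_overlap, t_core_nonoverlap + sum(transfer_times))

# e.g. 40 cycles of overlapping arithmetic vs. 10 cycles of non-overlapping
# core work plus L1<-L2, L2<-L3, L3<-memory transfers of 8, 10 and 14 cycles:
print(ecm_prediction(40, 10, [8, 10, 14]))   # -> 42: data-bound in this toy case
```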


Author(s):  
Alexey Syschikov ◽  
Boris Sedov ◽  
Konstantin Nedovodeev ◽  
Vera Ivanova

The OpenVX standard emerged as the computer vision community's answer to the challenge of accelerating vision applications on embedded heterogeneous platforms. It is designed to leverage the potential of computer vision hardware while providing functional and performance portability. Because VIPE has a powerful model of computation, it can incorporate various other models, which makes it possible to extend a language or framework based on an incorporated model with visual programming support and to give it access to the existing performance analysis and deployment tools. The authors present the integration of OpenVX into the VIPE IDE. VIPE addresses the need to design OpenVX graphs in a natural visual form, with automatic generation of a full-fledged program that shields the programmer from writing a bulk of boilerplate code. To the best of the authors' knowledge, this is the first use of a graphical notation for OpenVX programming. Using VIPE to develop OpenVX programs also enables its performance analysis tools.
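OpenVX itself exposes a C API; the Python sketch below only models the dataflow graph-of-nodes structure that a visual tool such as VIPE lets a user draw, and from which it can generate the surrounding boilerplate. All names and the toy kernels are illustrative assumptions, not OpenVX calls.

```python
# Conceptual model of a vision-processing dataflow graph (not the OpenVX C API).
class Node:
    def __init__(self, name, fn, inputs):
        self.name, self.fn, self.inputs = name, fn, inputs

def run_graph(nodes, source):
    """Execute nodes in the (already topologically sorted) order given."""
    values = {"input": source}
    for n in nodes:
        values[n.name] = n.fn(*(values[i] for i in n.inputs))
    return values

pipeline = [
    Node("blur",   lambda img: [(a + b) / 2
                                for a, b in zip(img, img[1:] + img[:1])],
         ["input"]),
    Node("thresh", lambda img: [1 if p > 0.5 else 0 for p in img], ["blur"]),
]
print(run_graph(pipeline, [0.2, 0.9, 0.4, 0.8])["thresh"])  # -> [1, 1, 1, 0]
```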


2019 ◽  
Vol 238 ◽  
pp. 145-156 ◽  
Author(s):  
Matthias Wiesenberger ◽  
Lukas Einkemmer ◽  
Markus Held ◽  
Albert Gutierrez-Milla ◽  
Xavier Sáez ◽  
...  
