Automatic Library Generation and Performance Tuning for Modular Polynomial Multiplication

2021 ◽  
Author(s):  
Lingchuan Meng
2001 ◽  
pp. 475-487
Author(s):  
Dmitry Petrov ◽  
Serg Shestakov

Author(s):  
Masaki Iwasawa ◽  
Daisuke Namekata ◽  
Keigo Nitadori ◽  
Kentaro Nomura ◽  
Long Wang ◽  
...  

Abstract: We describe algorithms implemented in FDPS (Framework for Developing Particle Simulators) to make efficient use of accelerator hardware such as GPGPUs (general-purpose graphics processing units). We have developed FDPS to make it possible for researchers to develop their own high-performance parallel particle-based simulation programs without spending large amounts of time on parallelization and performance tuning. FDPS provides a high-performance implementation of parallel algorithms for particle-based simulations in a “generic” form, so that researchers can define their own particle data structure and interparticle interaction functions. FDPS compiled with user-supplied data types and interaction functions provides all the necessary functions for parallelization, and researchers can thus write their programs as though they are writing simple non-parallel code. It has previously been possible to use accelerators with FDPS by writing an interaction function that uses the accelerator. However, the efficiency was limited by the latency and bandwidth of communication between the CPU and the accelerator, and also by the mismatch between the degree of parallelism available in the interaction function and that of the hardware. We have modified the interface of the user-provided interaction functions so that accelerators are used more efficiently. We have also implemented new techniques that reduce the amount of work on the CPU side and the amount of communication between the CPU and accelerators. We have measured the performance of N-body simulations on a system with an NVIDIA Volta GPGPU using FDPS, and the achieved performance is around 27% of the theoretical peak. We have constructed a detailed performance model and found that the current implementation can achieve good performance on systems with much smaller memory and communication bandwidth. Thus, our implementation will be applicable to future generations of accelerator systems.


1999 ◽  
Vol 122 (4) ◽  
pp. 803-812 ◽  
Author(s):  
Jonghoon Park ◽  
Wankyun Chung

Industrial manipulators face various limitations on high-quality motion control; for example, both frictional and dynamic disturbances must be handled with a simple PID control structure. A robust linear PID motion controller, called the reference error feedback (REF), is proposed, which solves the nonlinear L2-gain attenuation control problem for robotic manipulators. The stability, robustness, and performance tuning of the proposed controller are analyzed. Making use of the fact that a single parameter, the induced L2-gain γ, controls the performance while stability is maintained, we propose a simple and stable method of performance tuning called “the square law.” The analytical results are verified through experiments on a six-degree-of-freedom industrial manipulator. [S0022-0434(00)00104-0]


2014 ◽  
Vol 22 (4) ◽  
pp. 259-260 ◽  
Author(s):  
Siegfried Benkner ◽  
Franz Franchetti ◽  
Hans Michael Gerndt ◽  
Jeffrey K. Hollingsworth

High Performance Computing architectures have become incredibly complex and exploiting their full potential is becoming more and more challenging. As a consequence, automatic performance tuning (autotuning) of HPC applications is of growing interest and many research groups around the world are currently involved. Autotuning is still a rapidly evolving research field with many different approaches being taken. This special issue features selected papers presented at the Dagstuhl seminar on “Automatic Application Tuning for HPC Architectures” in October 2013, which brought together researchers from the areas of autotuning and performance analysis in order to exchange ideas and steer future collaborations.

