Performance Reduction for Automatic Development of Parallel Applications for Reconfigurable Computer Systems

In this paper, we review and compare two approaches to developing parallel applications: automatic parallelization of programs for computer systems with shared and distributed memory, and reduction of the hardware costs and performance of an information graph for reconfigurable computer systems. As the number of units in a computer system or the dimension of the problem grows, the complexity of automatically parallelizing a procedural program increases significantly. As a result, obtaining parallelization results in acceptable time, even on state-of-the-art computer systems, is very problematic. For reconfigurable computer systems, a parallel program is created by reducing the absolutely parallel information graph of the problem. The information graph represents the parallelization and pipelining of the computations. In addition to the traditional reduction of the number of basic subgraphs, reductions of the number of computational operations and of the data width (digit capacity) can be used to scale performance or hardware costs. We have proved that the methods for reducing the hardware costs and performance of the information graph considerably decrease the number of steps needed to adapt a parallel application to the architectures of reconfigurable computer systems, compared with automatic parallelization. We have also proved a theorem on the value of the coefficient under sequential reduction, a theorem on the increase in the reduction coefficient at a custom value, and a theorem on the commutativity of different reduction transformations. These theorems help to find a rational sequence of reduction transformations.

Poluição de Cache e Thrashing em Aplicações Paralelas de Alto Desempenho (Cache Pollution and Thrashing in High-Performance Parallel Applications)

10.5753/wscad.2019.8683 ◽

2019 ◽

Author(s):

Arthur Krause ◽

Francis Moreira ◽

Valéria Girelli ◽

Philippe Olivier Navaux

Keyword(s):

High Performance ◽

Computer Systems ◽

Memory Access ◽

Replacement Policy ◽

Parallel Applications ◽

Access Time ◽

L2 Cache ◽

Intelligent Management

As processors evolve, the performance of computer systems is increasingly limited by memory access time. Caches are employed to work around this problem, but the data stored in them must be managed intelligently to prevent problems such as pollution and thrashing from degrading their performance. This work presents an analysis of cache pollution and thrashing in high-performance parallel applications. The results show that caches with greater associativity suffer more from these problems. Up to 28% of cache misses in the L1 cache could be avoided with a smarter replacement policy, rising to up to 62% in the L2 cache and 98% in the LLC.
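The idea of misses that a smarter replacement policy could avoid can be illustrated with a toy simulation. The sketch below is not the authors' methodology, which the abstract does not describe; it assumes a single fully associative set of 4 blocks and a hypothetical cyclic access trace, and compares LRU against Belady's offline optimal policy as a stand-in for an ideal "smarter" policy.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Count misses for one fully associative set of `capacity` blocks under LRU."""
    cache = OrderedDict()
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict the least recently used block
            cache[block] = True
    return misses


def belady_misses(trace, capacity):
    """Count misses under Belady's offline optimal policy:
    evict the block whose next use lies farthest in the future."""
    cache = set()
    misses = 0
    for i, block in enumerate(trace):
        if block in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            future = trace[i + 1:]

            def next_use(b):
                # Blocks never used again sort last, so they are evicted first.
                return future.index(b) if b in future else len(future)

            cache.remove(max(cache, key=next_use))
        cache.add(block)
    return misses


# Hypothetical trace: 5 blocks accessed cyclically, one more than the set holds.
trace = [0, 1, 2, 3, 4] * 4
lru = lru_misses(trace, 4)
opt = belady_misses(trace, 4)
avoidable = (lru - opt) / lru
print(lru, opt, avoidable)
```

On this trace LRU thrashes: each block is evicted just before it is reused, so all 20 accesses miss, while the optimal policy misses only 8 times. The gap between the two counts is the share of misses that a better policy could, at best, remove.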

On the problem of automatic development of parallel applications for reconfigurable computer systems

Вычислительные технологии ◽

10.25743/ict.2020.25.1.005 ◽

2020 ◽

Author(s):

И.И. Левин ◽

А.И. Дордопуло

Keyword(s):

Distinctive Feature ◽

Reconfigurable Computing ◽

Computer Systems ◽

Computing System ◽

Computing Resource ◽

Performance Reduction ◽

Reduction Coefficient ◽

Hardware Costs ◽

Computing Structure ◽

Information Graph

To solve applied problems whose hardware costs exceed the available computing resource of FPGA-based computer systems, an original technique was developed for mapping the information graph of an application program onto the architecture of a reconfigurable computing system. The technique is based on performance reduction methods, which lower the performance of an applied task and, with it, the hardware costs of its implementation, so that the problem can be solved on the available computing resource. We demonstrate that the hardware costs of realizing the computing structure decrease only under reduction of the number of basic subgraphs, of the number of computing devices in a basic subgraph, and of the data width. The influence of sequential reduction transformations on the computing structure of a problem is examined.
The proved theorems concern the possibility of representing the reduction coefficient as a product of the coefficients of successive reductions, the impossibility of an additive increase in the reduction coefficient during sequential reductions, and the commutativity of the superposition of different sequential reductions. The proved theorems and the corollaries presented in the article make it possible to formulate the basic principles of a method of reduction transformations that adapts the information graph of a problem to the architecture of a hybrid reconfigurable computing system. A distinctive feature of the technique is the relatively small number of transformations required for a balanced reduction of the information graph and for implementing the task on a reconfigurable computer system. For the developed technique, we estimated the maximal number of transformations and found a decrease in the number of reduction variants that must be analyzed in each class. The proposed technique significantly reduces the time needed to create the computational structure of a parallel program adapted to the architecture and configuration of the reconfigurable computing system. Furthermore, the technique allows this process to be automated with specialized software, providing at least 50 to 75% efficiency in comparison with solutions of the same problems produced by specialists.
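Two of the stated properties can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the article: $r_i$ denotes the coefficient of the $i$-th reduction in a sequence of $n$ reductions, $R$ the resulting overall reduction coefficient, and $T_a$, $T_b$ two different reduction transformations applied to an information graph $G$.

```latex
% Overall coefficient of n successive reductions (product form)
R = \prod_{i=1}^{n} r_i
% Commutativity of the superposition of two different reductions
T_a\bigl(T_b(G)\bigr) = T_b\bigl(T_a(G)\bigr)
```

Under this reading, a reduction with coefficient 2 followed by one with coefficient 3 gives an overall coefficient of 6, whichever of the two is applied first.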

Porting of parallel applications to reconfigurable computer systems with various architectures and configurations

2016 5th International Conference on Informatics, Electronics and Vision (ICIEV) ◽

10.1109/iciev.2016.7760174 ◽

2016 ◽

Author(s):

Alexey Igorevich Dordopulo ◽

Vasiliy Borisovich Kovalenko ◽

Viacheslav Alexandrovich Gudkov ◽

Liubov Mikhailovna Slasten

Keyword(s):

Computer Systems ◽

Parallel Applications

Future directions in HF and computer systems: A meeting report

PsycEXTRA Dataset ◽

10.1037/e574032012-022 ◽

1983 ◽

Author(s):

Paul Green

Keyword(s):

Computer Systems ◽

Meeting Report ◽

Future Directions

Computer Systems and Water Resources

10.1016/s0167-5648(08)x7005-4 ◽

1974 ◽

Keyword(s):

Water Resources ◽

Computer Systems

A Method of Risk Analysis for Computer Systems

IEEJ Transactions on Electronics Information and Systems ◽

10.1541/ieejeiss1987.108.4_260 ◽

1988 ◽

Vol 108 (4) ◽

pp. 260-267

Author(s):

Kazuo Takaragi ◽

Ryoichi Sasaki ◽

Yasuhiko Nagai

Keyword(s):

Risk Analysis ◽

Computer Systems

Use of parallel computer systems for high Reynolds flow simulation

Proceedings of the Sixth International Symposium On Turbulence, Heat and Mass Transfer ◽

10.1615/ichmt.2009.turbulheatmasstransf.900 ◽

2009 ◽

Author(s):

Boris N. Chetverushkin ◽

E. V. Shilnikov

Keyword(s):

Computer Systems ◽

Flow Simulation ◽

Parallel Computer ◽

High Reynolds

Flexible buffer management with thresholds and blocking for congestion control in multi-server computer systems

Theoretical and Applied Informatics ◽

10.2478/v10179-010-0018-9 ◽

2010 ◽

Vol 22 (1) ◽

Cited By ~ 1

Author(s):

Walenty Oniszczuk

Keyword(s):

Congestion Control ◽

Computer Systems ◽

Buffer Management ◽

Multi Server ◽

Server Computer

Matlab and Parallel Computing

Image Processing & Communications ◽

10.2478/v10248-012-0048-5 ◽

2012 ◽

Vol 17 (4) ◽

pp. 207-216 ◽

Cited By ~ 5

Author(s):

Magdalena Szymczyk ◽

Piotr Szymczyk

Keyword(s):

Image Processing ◽

Signal Processing ◽

Parallel Computing ◽

Distributed Computing ◽

Control Systems ◽

High Performance ◽

Parallel Applications ◽

Process Simulations ◽

Key Features ◽

Financial Process

MATLAB is a technical computing language used in a variety of fields, such as control systems, image and signal processing, visualization, and financial process simulations, within an easy-to-use environment. MATLAB offers "toolboxes", which are specialized libraries for a variety of scientific domains, and a simplified interface to high-performance libraries (such as LAPACK, BLAS, and FFTW). MATLAB has now been extended with parallel computing capabilities through the Parallel Computing Toolbox™ and MATLAB Distributed Computing Server™. In this article, we present some key features of MATLAB parallel applications, focusing on the use of GPUs for image processing.
