APPLICATION OF PERFORMANCE REDUCTION METHODS FOR MINIMIZATION OF ANALYZED NUMBER OF PARALLEL PROGRAM VARIANTS

Author(s):  
A. I. Dordopulo

In this paper, we review and compare methods of parallel application development based on automatic program parallelization for computer systems with shared and distributed memory, and on reduction of the hardware costs and performance of the information graph for reconfigurable computer systems. As the number of computational units or the dimension of the problem grows, the complexity of automatically parallelizing a procedural program increases significantly; as a result, obtaining parallelization results in acceptable time on state-of-the-art computer systems is highly problematic. In reconfigurable computer systems, a parallel program is created by reducing the fully parallel information graph of the problem. The information graph describes the parallelization and pipelining of computations. In addition to the traditionally practiced reduction of the number of basic subgraphs, reductions of the number of computational operations and of the data bit width can be used to scale performance or hardware costs. We prove that the methods of reducing the hardware costs and performance of the information graph require considerably fewer steps to adapt a parallel application to reconfigurable computer system architectures than automatic parallelization does. We prove a theorem on the coefficient value under sequential reduction, a theorem on increasing the reduction coefficient to a custom value, and a theorem on the commutativity of different reduction transformations. These theorems help to find a rational sequence of reduction transformations.
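The abstract does not reproduce the theorem statements; a minimal formal sketch, in our own notation rather than the paper's, captures the multiplicative and commutative behavior it describes:

```latex
% Our notation, not the paper's: R_{k} denotes a reduction transformation
% with coefficient k applied to an information graph G.
\[
  (R_{k_n} \circ \dots \circ R_{k_1})(G) = R_{k}(G),
  \qquad k = \prod_{i=1}^{n} k_i ,
\]
% i.e., coefficients of sequential reductions compose multiplicatively,
% never additively, and reductions of different types (by basic subgraphs,
% by computational operations, by bit width) commute:
\[
  R^{\mathrm{sub}} \circ R^{\mathrm{op}} = R^{\mathrm{op}} \circ R^{\mathrm{sub}} .
\]
```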

Author(s):  
И.И. Левин ◽  
А.И. Дордопуло

To solve applied problems whose hardware costs exceed the available computing resource of FPGA-based computer systems, we developed an original technique for mapping the information graph of an application program to the architecture of a reconfigurable computer system. The technique is based on performance reduction methods, which reduce the performance of an applied task and, with it, the hardware costs of its implementation, thereby making the problem solvable on the available computing resource. We demonstrate that the hardware costs of implementing the computing structure decrease only under reduction of the number of basic subgraphs, of the number of computing devices in a basic subgraph, and of the data bit width. The influence of sequential reduction transformations on the computing structure of a problem is examined. We prove theorems on representing the reduction coefficient as a product of the coefficients of successive reductions, on the impossibility of additively increasing the reduction coefficient under sequential reductions, and on the commutativity of superpositions of different reductions. These theorems and their corollaries allow us to formulate the basic principles of a method of reduction transformations of the problem's information graph for adaptation to the architecture of a hybrid reconfigurable computer system. A distinctive feature of the technique is the comparatively small number of transformations required for a balanced reduction of the problem's information graph and for the implementation of calculations on a reconfigurable computer system. For the developed technique, we estimated the maximal number of transformations and found a decrease in the number of analyzed reduction variants from each class. The proposed technique significantly reduces the time needed to create the computational structure of a parallel program adapted to the architecture and configuration of a reconfigurable computer system. Furthermore, it allows this process to be automated with specialized software while providing at least 50-75% of the efficiency of solutions of the same problems developed by specialists.
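As a hedged illustration of how these theorems shape a reduction procedure (the cost model, names, and the fixed coefficient of 2 below are our assumptions, not the authors' software):

```python
# Illustrative sketch (ours, not the authors'): sequential reduction of an
# information graph's hardware cost by the three reduction classes named in
# the abstract: number of basic subgraphs, devices per subgraph, bit width.
from dataclasses import dataclass

@dataclass
class GraphCost:
    subgraphs: int      # number of basic subgraphs
    devices: int        # computing devices per basic subgraph
    bit_width: int      # data bit width

    def hardware_cost(self) -> int:
        # Assumed cost model: cost scales with all three factors.
        return self.subgraphs * self.devices * self.bit_width

def reduce_until_fits(cost: GraphCost, available: int) -> list[str]:
    """Greedily apply coefficient-2 reductions until the graph fits.

    The theorems imply the total reduction coefficient is the product of
    the step coefficients and that different reduction types commute, so
    a greedy order is as good as any other.
    """
    steps = []
    for attr in ("subgraphs", "devices", "bit_width"):
        while cost.hardware_cost() > available and getattr(cost, attr) > 1:
            setattr(cost, attr, getattr(cost, attr) // 2)
            steps.append(f"halve {attr}")
    return steps

# Example: a graph costing 4096 units on a system with 1000 available.
g = GraphCost(subgraphs=8, devices=16, bit_width=32)
print(reduce_until_fits(g, available=1000), g.hardware_cost())
```

Because coefficients multiply and the transformation types commute, such a pass needs only logarithmically many steps, which is consistent with the small number of analyzed variants the technique claims.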


Performance ◽  
1959 ◽  
pp. 1-5:20
Author(s):  
Kenneth J. Lush ◽  
John K. Moakes

2006 ◽  
Vol 21 (3) ◽  
pp. 205-219 ◽  
Author(s):  
RICHARD ANTHONY

This paper presents an empirical investigation of policy-based self-management techniques for parallel applications executing in loosely-coupled environments. The dynamic and heterogeneous nature of these environments is discussed and the special considerations for parallel applications are identified. An adaptive strategy for the run-time deployment of tasks of parallel applications is presented. The strategy is based on embedding numerous policies which are informed by contextual and environmental inputs. The policies govern various aspects of behaviour, enhancing flexibility so that the goals of efficiency and performance are achieved despite high levels of environmental variability. A prototype self-managing parallel application is used as a vehicle to explore the feasibility and benefits of the strategy. In particular, several aspects of stability are investigated. The implementation and behaviour of three policies are discussed and sample results examined.
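The paper gives no code; the sketch below is our own hedged illustration (the node model, thresholds, and names are assumptions) of how one embedded deployment policy might map environmental inputs to behaviour:

```python
# Hedged illustration of the embedded-policy idea: a run-time policy maps
# contextual/environmental inputs to a task-deployment decision for a
# parallel application in a loosely-coupled, variable environment.
from dataclasses import dataclass

@dataclass
class NodeState:
    name: str
    load: float          # normalized CPU load, 0..1
    latency_ms: float    # observed network latency to this node

def deployment_policy(nodes: list[NodeState],
                      max_load: float = 0.8,
                      max_latency_ms: float = 50.0) -> list[NodeState]:
    """Select nodes eligible for new tasks under the current environment.

    A stability-oriented rule: prefer lightly loaded, low-latency nodes,
    and fall back to the least-loaded node if none qualifies, so the
    application keeps making progress despite environmental variability.
    """
    eligible = [n for n in nodes
                if n.load < max_load and n.latency_ms < max_latency_ms]
    if not eligible:  # degraded environment: fall back rather than stall
        eligible = [min(nodes, key=lambda n: n.load)]
    return sorted(eligible, key=lambda n: (n.load, n.latency_ms))

nodes = [NodeState("a", 0.9, 12.0), NodeState("b", 0.3, 40.0),
         NodeState("c", 0.5, 90.0)]
print([n.name for n in deployment_policy(nodes)])  # -> ['b']
```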


Author(s):  
Dominik Strzałka

The problem of modeling different parts of computer systems requires accurate statistical tools. Cache memory is an inherent part of modern computer systems, in which the hierarchical memory structure plays a key role in the behavior and performance of the whole system. In the case of Windows operating systems, the cache is the place in the memory subsystem where the I/O system puts recently used data from disk. This paper presents some preliminary results on the statistical behavior of one selected system counter. The obtained results show that the real phenomena that appear during human-computer interaction can be expressed in terms of non-extensive statistics, related to Tsallis' proposal of a new entropy definition.
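For reference, the standard discrete form of the Tsallis entropy the abstract alludes to can be sketched as follows; the distribution and q values are illustrative only:

```python
# Minimal sketch of the (standard) discrete Tsallis entropy, with the
# Boltzmann constant set to 1 and q the non-extensivity parameter:
#   S_q = (1 - sum_i p_i**q) / (q - 1),  recovering Shannon entropy as q -> 1
import math

def tsallis_entropy(p: list[float], q: float) -> float:
    if abs(q - 1.0) < 1e-12:                 # limit q -> 1: Shannon entropy
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

dist = [0.7, 0.2, 0.1]                 # e.g., a normalized counter histogram
print(tsallis_entropy(dist, q=1.0))    # ~0.802 (Shannon, in nats)
print(tsallis_entropy(dist, q=1.5))    # non-extensive case
```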


Author(s):  
Zhengjing Shen ◽  
Wuli Chu

Sediment erosion is recognized as a serious engineering problem in slurry handling, for example in screw centrifugal pumps, which have a wide efficiency region and non-clogging performance. In the present study, a screw centrifugal pump was simulated based on the Euler-Lagrange method. The McLaury model was adopted for erosion prediction of the flow passage components. By analyzing the correlation factor functions contained in the erosion model and performing preliminary research with a simplified model, particle velocity, particle shape factor, and particle concentration were selected as the influencing factors for analyzing the quantitative relationship among particle parameters, erosion wear, and the performance of the screw centrifugal pump. The results show that the erosion of the volute casing is higher than that of the impeller, and the erosion rate of the suction side is higher than that of the pressure side. Particle velocity is positively correlated with erosion wear and with the rate of pump performance reduction, whereas an increase in particle shape factor shows the opposite trend. The erosion rate increases sharply and then slowly as particle concentration increases, because the adhesion effect of sand particles in the volute casing inhibits the total erosion wear. An increase in erosion rate accelerates the reduction of pump performance, and pump efficiency decreases more significantly once the erosion rate grows beyond a certain extent. The results of this study are of great significance for further optimization of the hydraulic and structural design of screw centrifugal pumps.
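The paper's model constants are not given here; the sketch below shows a McLaury-type erosion correlation in the general form commonly implemented in CFD codes, with placeholder constants that are our assumptions, not the paper's values:

```python
# Hedged sketch of a McLaury-type erosion correlation (constants below are
# placeholders, not the paper's calibrated values):
#   ER = C * BH**(-0.59) * Fs * V**n * f(theta)
# where BH is wall Brinell hardness, Fs a particle shape factor (~1.0 sharp,
# ~0.2 fully rounded), V particle impact velocity, theta impact angle, and
# f(theta) an empirical impact-angle function.
import math

def impact_angle_function(theta_rad: float) -> float:
    # Placeholder piecewise angle function of the typical McLaury form:
    # polynomial at shallow angles, trigonometric at steep angles.
    theta_lim = math.radians(15.0)
    if theta_rad <= theta_lim:
        return 22.7 * theta_rad - 38.4 * theta_rad ** 2
    return (3.147 * math.cos(theta_rad) ** 2 * math.sin(theta_rad)
            + 0.3609 * math.sin(theta_rad) ** 2 + 2.532)

def erosion_ratio(v: float, theta_rad: float, fs: float,
                  bh: float = 120.0, c: float = 2.4e-7,
                  n: float = 1.73) -> float:
    """Mass of wall material removed per mass of impacting particles."""
    return c * bh ** (-0.59) * fs * v ** n * impact_angle_function(theta_rad)

# The velocity exponent n (typically ~1.7-2.4) explains the strong positive
# correlation between particle velocity and wear reported in the abstract.
print(erosion_ratio(v=15.0, theta_rad=math.radians(30.0), fs=0.5))
```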


2013 ◽  
Vol 347-350 ◽  
pp. 2747-2751 ◽  
Author(s):  
Zhi Ming Feng ◽  
Yi Dan Su

Item-item collaborative filtering is widely used in item recommender systems because of its good recommendation quality. However, when facing a large number of items, its performance degrades, because a very large item-comparison dataset must be built to find similar items. K-means clustering classifies well and performs well even when the processed dataset is very large, but cold start is a problem for k-means, and extra work is required to use it for item recommendation. We use simulated annealing to combine the two methods, fixing the problems mentioned above while exploiting their respective advantages, to obtain better recommendation quality and performance. The experimental results show that using simulated annealing to combine clustering and collaborative filtering in an item recommendation system yields more stable recommendation results of better quality.
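The paper's exact algorithm is not reproduced here; a minimal sketch under our own assumptions (the energy function, move set, and constants are ours) shows how simulated annealing can drive the clustering that the item-item similarity search is then restricted to:

```python
# Hedged sketch: simulated annealing over k-means-style cluster assignments,
# so item-item similarities need only be computed within a cluster (shrinking
# the comparison dataset) while annealing escapes poor cold-start initializations.
import math
import random

import numpy as np

def energy(items: np.ndarray, assign: np.ndarray, k: int) -> float:
    # k-means objective: total within-cluster scatter (lower is better).
    return float(sum(
        np.sum((items[assign == c] - items[assign == c].mean(axis=0)) ** 2)
        for c in range(k) if np.any(assign == c)))

def anneal(items: np.ndarray, k: int, t0: float = 1.0, cooling: float = 0.95,
           steps: int = 2000, seed: int = 0) -> np.ndarray:
    rng = random.Random(seed)
    assign = np.array([rng.randrange(k) for _ in range(len(items))])
    e, t = energy(items, assign, k), t0
    for _ in range(steps):
        i, c = rng.randrange(len(items)), rng.randrange(k)  # random move
        old = assign[i]
        assign[i] = c
        e_new = energy(items, assign, k)
        # Metropolis criterion: accept worse moves with probability exp(-dE/T).
        if e_new <= e or rng.random() < math.exp((e - e_new) / max(t, 1e-9)):
            e = e_new
        else:
            assign[i] = old
        t *= cooling
    return assign

# Toy item feature vectors (rows = items); recommendations would then compare
# only items that share a cluster label.
items = np.array([[5, 0], [4, 1], [0, 5], [1, 4], [5, 1], [0, 4]], float)
print(anneal(items, k=2))
```

Restricting the item-item similarity computation to items sharing a cluster label is what removes the large comparison dataset the abstract identifies as the bottleneck.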


1992 ◽  
Vol 1 (1) ◽  
pp. 51-66 ◽  
Author(s):  
Ian Foster ◽  
Robert Olson ◽  
Steven Tuecke

We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

