A Comparison of Problem Partitioning Algorithms for the Intel Paragon

Author(s):  
B. Hendrickson ◽  
R. Leland


VLSI Design ◽  
1999 ◽  
Vol 9 (3) ◽  
pp. 253-270
Author(s):  
Hong K. Kim ◽  
Jack Jean

A partitioning algorithm for parallel discrete event gate-level logic simulations is proposed in this paper. Unlike most other partitioning algorithms, the proposed algorithm preserves computation concurrency by assigning to processors circuit gates that can be evaluated at about the same time. As a result, the improved concurrency preserving partitioning (iCPP) algorithm can provide better load balancing throughout the period of a parallel simulation. This is especially important when the algorithm is used together with a Time Warp simulation, where a high degree of concurrency can lead to fewer rollbacks and better performance. The algorithm consists of three phases so that three conflicting goals can be considered separately, reducing computational complexity. To evaluate the quality of partitioning algorithms in terms of preserving concurrency, a concurrency metric that requires neither sequential nor parallel simulation is proposed. A levelization technique is used in computing the metric to determine the gates that can be evaluated at about the same time. A parallel gate-level logic simulator is implemented on an Intel Paragon and an IBM SP2 to evaluate the performance of the iCPP algorithm. The results are compared with those of several other partitioning algorithms to show that the iCPP algorithm preserves concurrency well and achieves reasonable speedup.
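The levelization idea mentioned in the abstract can be illustrated with a minimal sketch: assuming the circuit is given as a gate-level DAG (a hypothetical `fanins` map from each gate to the gates driving it), a gate's level is one more than the maximum level of its fan-ins, and gates sharing a level are the ones that can, in principle, be evaluated at about the same time. This is a generic illustration of the technique, not the authors' implementation.

```python
def levelize(fanins):
    """Assign each gate a level in a combinational gate-level netlist.

    fanins: dict mapping gate name -> list of gates driving it
            (primary inputs have an empty fan-in list).
    Returns: dict mapping gate name -> level; gates on the same level
             can, in principle, be evaluated concurrently.
    """
    levels = {}

    def level_of(gate):
        if gate not in levels:
            drivers = fanins.get(gate, [])
            levels[gate] = 0 if not drivers else 1 + max(level_of(d) for d in drivers)
        return levels[gate]

    for gate in fanins:
        level_of(gate)
    return levels


# Example: a tiny netlist; gates g3 and g4 sit on the same level, so a
# concurrency-preserving partition would place them on different processors.
netlist = {"a": [], "b": [], "c": [],
           "g3": ["a", "b"], "g4": ["b", "c"], "g5": ["g3", "g4"]}
print(levelize(netlist))  # {'a': 0, 'b': 0, 'c': 0, 'g3': 1, 'g4': 1, 'g5': 2}
```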


1997 ◽  
Vol 6 (1) ◽  
pp. 127-152
Author(s):  
Eric De Sturler ◽  
Volker Strumpen

Recently, the first commercial High Performance Fortran (HPF) subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very high performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77) programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific to the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and present timing results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver. Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler needs substantial support from the programmer.
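To make the array-distribution issue concrete, the sketch below mimics a one-dimensional BLOCK distribution of the kind HPF's DISTRIBUTE directive specifies: each processor owns a contiguous slice of rows, so a loop nest that sweeps locally owned rows in the outer loop touches mostly local data, while the opposite nesting forces remote accesses. The function name and the 4-processor example are illustrative assumptions, not part of xHPF.

```python
def block_ranges(n, p):
    """Contiguous BLOCK-style distribution of n indices over p processors,
    analogous to a one-dimensional HPF DISTRIBUTE (BLOCK) mapping."""
    base, extra = divmod(n, p)
    ranges, start = [], 0
    for rank in range(p):
        size = base + (1 if rank < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# A 10-row matrix distributed row-wise over 4 processors:
# processors 0 and 1 own three rows each, processors 2 and 3 own two rows each.
for rank, rows in enumerate(block_ranges(10, 4)):
    print(f"processor {rank} owns rows {list(rows)}")
```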


VLSI Design ◽  
2002 ◽  
Vol 15 (2) ◽  
pp. 485-489
Author(s):  
Youssef Saab

Partitioning is a fundamental problem in the design of VLSI circuits. In recent years, ratio-cut partitioning has received attention due to its tendency to partition circuits into their natural clusters. Node contraction has also been shown to enhance the performance of iterative partitioning algorithms. This paper describes a new simple ratio-cut partitioning algorithm using node contraction. This new algorithm combines iterative improvement with progressive cluster formation. Under suitably mild assumptions, the new algorithm runs in linear time. It is also shown that the new algorithm compares favorably with previous approaches.
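For reference, the ratio-cut objective divides the number of nets crossing the cut by the product of the two block sizes, which is what biases a partitioner toward a circuit's natural clusters. The sketch below computes that cost for a given bipartition of a hypergraph; the data layout (nets as lists of cell names) is an assumption for illustration and not the paper's data structure.

```python
def ratio_cut_cost(nets, side):
    """Ratio-cut cost of a bipartition.

    nets: iterable of nets, each a list of cell names.
    side: dict mapping each cell to its block, 0 or 1.
    Returns cut(A, B) / (|A| * |B|); smaller is better.
    """
    cells = set(side)
    size_a = sum(1 for c in cells if side[c] == 0)
    size_b = len(cells) - size_a
    if size_a == 0 or size_b == 0:
        return float("inf")  # degenerate partition with an empty block
    cut = sum(1 for net in nets if len({side[c] for c in net}) > 1)
    return cut / (size_a * size_b)


# Example: one net crosses the cut, blocks of size 2 and 3 -> cost 1/6.
nets = [["u", "v"], ["v", "w"], ["x", "y"]]
side = {"u": 0, "v": 0, "w": 1, "x": 1, "y": 1}
print(ratio_cut_cost(nets, side))  # 0.1666...
```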


2019 ◽  
Vol 8 (2) ◽  
pp. 5589-5593

A VLSI integrated circuit is the most significant part of electronic systems such as personal computers and workstations, digital cameras, cell phones and portable computing devices, and automobiles. Progress in electronics therefore depends on the design planning of VLSI integrated circuits. Circuit partitioning is one of the most important steps in the VLSI physical design process, and many heuristic partitioning algorithms have been proposed for this problem. The first heuristic algorithm for hypergraph partitioning in the VLSI domain is the FM algorithm. In this paper, I propose three variations of the FM algorithm that use pair-wise swapping strategies. I have performed a comparative study of FM and the proposed algorithms using two benchmark datasets, ISPD98 and ISPD99. Experimental results demonstrate that the proposed algorithms outperform the FM algorithm.
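As a rough illustration of the FM-style move underlying such variants, the sketch below computes the classic gain of moving one cell across a bipartition of a hypergraph; a pair-wise variant would evaluate two such gains, one per side, and correct for any nets the pair shares before committing the swap. This is a generic sketch under an assumed net/cell representation, not the paper's implementation.

```python
def move_gain(cell, incident_nets, side):
    """Classic FM gain of moving `cell` to the opposite block.

    incident_nets: hyperedges (lists of cell names) containing `cell`.
    side: dict mapping each cell to its block, 0 or 1.
    Gain = nets that leave the cut minus nets that enter it.
    """
    gain = 0
    for net in incident_nets:
        other_sides = [side[c] for c in net if c != cell]
        if not other_sides:
            continue  # single-pin net: the move cannot change the cut
        if all(s != side[cell] for s in other_sides):
            gain += 1   # cell is the only pin in its block: the net becomes uncut
        elif all(s == side[cell] for s in other_sides):
            gain -= 1   # net is currently uncut: the move would cut it
    return gain


# Example: moving v to block 1 uncuts two nets and cuts one, so the gain is +1.
side = {"u": 0, "v": 0, "x": 1, "y": 1}
nets_of_v = [["u", "v"], ["v", "x"], ["v", "y"]]
print(move_gain("v", nets_of_v, side))  # 1
```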

