Parallel Nonlinear Optimization on a Multiprocessor System with Distributed Memory

Author(s):  
Harald Boden ◽  
Regina Gehne ◽  
Manfred Grauer
1993 ◽  
Vol 04 (06) ◽  
pp. 1295-1306 ◽  
Author(s):  
ACHIM BASERMANN

Solving discretized ordinary or partial differential equations requires solving systems of equations whose coefficient matrices have different sparsity patterns, depending on the discretization method; the finite element method (FE) results in largely unstructured systems of equations. A frequently used iterative solver for such systems is the method of conjugate gradients (CG) with different preconditioners. On a multiprocessor system with distributed memory, the data distribution and the communication scheme, both of which depend on the data structure used, are of greatest importance for the efficient execution of this method. Here, a data distribution and a communication scheme are presented which are based on the analysis of the column indices of the non-zero matrix elements. The performance of the developed parallel CG method was measured on the distributed-memory system INTEL iPSC/860 of the Research Centre Jülich with systems of equations from FE models. The parallel CG algorithm has been shown to be well suited for both regular and irregular discretization meshes, i.e. for coefficient matrices of very different sparsity patterns.
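The idea of deriving a communication scheme from the column indices of the non-zero matrix elements can be illustrated with a minimal Python sketch. It is not the author's implementation: it assumes a matrix stored in CSR form (`indptr`, `indices`), a contiguous row-block distribution, and a vector distributed the same way as the rows. The helper names `block_starts`, `owner`, and `receive_lists` are hypothetical.

```python
from bisect import bisect_right


def block_starts(n, p):
    """Start offsets of p contiguous row blocks covering n rows (assumed layout)."""
    base, extra = divmod(n, p)
    starts = [0]
    for rank in range(p):
        starts.append(starts[-1] + base + (1 if rank < extra else 0))
    return starts  # length p + 1; block r is rows starts[r] .. starts[r+1]-1


def owner(index, starts):
    """Rank that owns row (and vector entry) `index` under the block layout."""
    return bisect_right(starts, index) - 1


def receive_lists(indptr, indices, starts):
    """For each rank, map source rank -> sorted vector entries it must receive.

    A rank needs a remote vector entry exactly when one of its local rows has
    a non-zero whose column index is owned by another rank.
    """
    p = len(starts) - 1
    needs = [dict() for _ in range(p)]
    for rank in range(p):
        for row in range(starts[rank], starts[rank + 1]):
            for k in range(indptr[row], indptr[row + 1]):
                col = indices[k]
                src = owner(col, starts)
                if src != rank:
                    needs[rank].setdefault(src, set()).add(col)
    return [{src: sorted(cols) for src, cols in d.items()} for d in needs]


# 4x4 tridiagonal pattern in CSR form, split over 2 ranks (illustrative data).
indptr = [0, 2, 5, 8, 10]
indices = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
starts = block_starts(4, 2)
plan = receive_lists(indptr, indices, starts)
```

In this example each rank needs only one boundary entry from its neighbour before a matrix-vector product inside CG; for an unstructured FE matrix the same analysis would yield irregular, rank-specific receive lists.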


2001 ◽  
Vol 11 (01) ◽  
pp. 169-184 ◽  
Author(s):  
PRASAD KAKULAVARAPU ◽  
OLIVIER C. MAQUELIN ◽  
JOSÉ NELSON AMARAL ◽  
GUANG R. GAO

Designing multi-processor systems that deliver a reasonable price-performance ratio using off-the-shelf processor and compiler technologies is a major challenge. For an important class of applications, it is critical to explore fine-grain parallelism to achieve reasonable performance. In such parallel systems it is essential to efficiently manage communication latencies, bandwidth, and synchronization overheads. In this paper we study load balancing strategies for the runtime system of a multi-threaded system. EARTH (Efficient Architecture for Running Threads) is a multi-threaded programming and execution model that supports fine-grain, non-preemptive threads in a distributed memory environment. We describe the design and implementation of a set of dynamic load balancing algorithms, and study their performance in divide-and-conquer, regular, and irregular applications. Our experimental study on the distributed memory multi-processor IBM SP-2 indicates that a randomized load balancer performs as well as, and often better than, history-based load balancers.
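To make the notion of a randomized load balancer concrete, here is a toy Python simulation. It is a sketch under stated assumptions and does not reproduce the EARTH runtime or the balancers evaluated in the paper: all work starts on node 0, each node executes one task per step, and an idle node tries to take one task from a victim chosen uniformly at random. The function name `simulate_random_stealing` and its parameters are hypothetical.

```python
import random
from collections import deque


def simulate_random_stealing(initial_tasks, num_nodes, seed=0):
    """Toy model: idle nodes pull work from a uniformly random victim.

    Returns (tasks executed per node, number of steps until all work is done).
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    queues = [deque() for _ in range(num_nodes)]
    queues[0].extend(range(initial_tasks))  # assumption: all work starts on node 0
    executed = [0] * num_nodes
    steps = 0
    while any(queues):
        steps += 1
        for node in range(num_nodes):
            if queues[node]:
                # Busy node: run one task (threads are non-preemptive in this model).
                queues[node].popleft()
                executed[node] += 1
            elif num_nodes > 1:
                # Idle node: pick a random victim and take one task if it can spare one.
                victim = rng.choice([v for v in range(num_nodes) if v != node])
                if len(queues[victim]) > 1:
                    queues[node].append(queues[victim].pop())
    return executed, steps


executed, steps = simulate_random_stealing(initial_tasks=100, num_nodes=4)
```

The random choice needs no bookkeeping about past load, which is the property that distinguishes it from a history-based balancer; the simulation says nothing about which performs better on real hardware.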


Optik ◽  
2010 ◽  
Vol 121 (20) ◽  
pp. 1845-1847 ◽  
Author(s):  
Zhihua Yu ◽  
Fengguang Luo ◽  
Bin LI ◽  
Weilin Zhou ◽  
Liangjia Zong ◽  
...  
