Data-Parallel BLAS as a Basis for LAPACK on Massively Parallel Computers

Author(s):  
P. E. Bjørstad ◽  
T. Sørevik
1993 ◽  
Vol 04 (01) ◽  
pp. 85-96
Author(s):  
Nicolas Paris

POMPC is a parallel language dedicated to programming massively parallel computers under a synchronous data-parallel model. It is an extension of the ANSI C language and follows its philosophy. Parallelism is handled explicitly through the definition of collections of parallel variables and of communication primitives. A methodology is presented for porting the language easily to different target architectures. Virtualization is introduced to handle several collections of different sizes and shapes simultaneously, and its management is a key point of the compilation process. The programmer, architecture, compilation, and system points of view together lead to a particular implementation of virtualization that mixes physical and virtual parallel objects. This implementation is well suited both to the development of communication libraries and to enlarging the asynchronous sections of code on SPMD architectures. The portability of POMPC is validated by several implementations: single- and multi-process simulation on UNIX machines, the Connection Machine CM-2, and the MasPar MP-1; a compiler for the iPSC-860 is in preparation.
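
The abstract does not show POMPC's syntax, so the following plain C++ sketch illustrates only the virtualization idea it describes: a virtual collection larger than the machine is mapped onto a fixed number of physical processors, each of which iterates over its assigned virtual elements. The processor count, collection size, and cyclic mapping are illustrative assumptions, not details from the paper.

    // Illustrative sketch only, not POMPC syntax: virtualization maps a
    // virtual collection of N elements onto P physical processors.
    #include <cstdio>

    const int P = 4;   // physical processors (hypothetical count)
    const int N = 10;  // virtual collection size, N > P

    int main() {
        int a[N], b[N];
        for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2 * i; }

        // Each physical processor p sweeps the virtual elements assigned
        // to it; a compiler would emit this virtualization loop itself.
        for (int p = 0; p < P; ++p) {
            for (int i = p; i < N; i += P) {  // cyclic virtual-to-physical map
                a[i] += b[i];                 // one data-parallel operation
            }
        }
        for (int i = 0; i < N; ++i) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }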


1993 ◽  
Vol 2 (4) ◽  
pp. 193-202 ◽  
Author(s):  
Daniel J. Lickly ◽  
Philip J. Hatcher

Our goal is to apply the software engineering advantages of object-oriented programming to the raw power of massively parallel architectures. To do this we have constructed a hierarchy of C++ classes to support the data-parallel paradigm. Feasibility studies and initial coding can be supported by any serial machine that has a C++ compiler. Parallel execution requires an extended Cfront, which understands the data-parallel classes and generates C* code. (C* is a data-parallel superset of ANSI C developed by Thinking Machines Corporation.) This approach provides potential portability across parallel architectures and leverages existing compiler technology for translating data-parallel programs onto both SIMD and MIMD hardware.
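
A minimal sketch of the data-parallel class idea, assuming nothing about the authors' actual hierarchy: an array class whose arithmetic operators apply elementwise, so a serial C++ compiler can run it for feasibility studies while a data-parallel backend could translate the whole-array operations to C*. The class and member names are hypothetical.

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    template <typename T>
    class ParallelArray {
    public:
        explicit ParallelArray(std::size_t n, T init = T()) : data_(n, init) {}

        // Elementwise addition: conceptually one SIMD/data-parallel step.
        ParallelArray operator+(const ParallelArray& rhs) const {
            ParallelArray out(data_.size());
            for (std::size_t i = 0; i < data_.size(); ++i)
                out.data_[i] = data_[i] + rhs.data_[i];
            return out;
        }

        T& operator[](std::size_t i) { return data_[i]; }
        std::size_t size() const { return data_.size(); }

    private:
        std::vector<T> data_;
    };

    int main() {
        ParallelArray<double> x(8, 1.0), y(8, 2.0);
        ParallelArray<double> z = x + y;  // whole-array operation
        printf("z[0] = %g, size = %zu\n", z[0], z.size());
        return 0;
    }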


1992 ◽  
Vol 278 ◽  
Author(s):  
Steven R. Lustig ◽  
J.J. Cristy ◽  
D.A. Pensak

The fast multipole method (FMM) is implemented in canonical-ensemble particle simulations to compute non-bonded interactions efficiently with explicit error control. Multipole and local expansions have been derived to implement the FMM efficiently in Cartesian coordinates for soft-sphere (inverse power law), Lennard-Jones, Morse, and Yukawa potential functions. Significant reductions in execution times have been achieved with respect to the direct method. For a given number N of particles, the execution times of the direct method scale as O(N²). The FMM execution times scale as O(N) on sequential workstations and vector processors and asymptotically as O(log N) on massively parallel computers. Connection Machine CM-2 and WAVETRACER-DTC parallel FMM implementations execute faster than the Cray Y-MP vectorized FMM for ensemble sizes larger than 28k and 35k particles, respectively. For 256k-particle ensembles the CM-2 parallel FMM is 12 times faster than the Cray Y-MP vectorized direct method and 2.2 times faster than the vectorized FMM. For 256k-particle ensembles the WAVETRACER-DTC parallel FMM is 33 times faster than the Cray Y-MP vectorized direct method.
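
For context, a sketch of the direct method the abstract benchmarks against: an O(N²) all-pairs sum of Lennard-Jones energies, which the FMM replaces with O(N) expansion-based evaluation. The particle positions and the epsilon and sigma values below are illustrative, not from the paper.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    struct Vec3 { double x, y, z; };

    // Direct method: every pair is visited once, hence O(N^2) work.
    double lennardJonesEnergy(const std::vector<Vec3>& r,
                              double epsilon, double sigma) {
        double e = 0.0;
        for (std::size_t i = 0; i < r.size(); ++i) {
            for (std::size_t j = i + 1; j < r.size(); ++j) {
                double dx = r[i].x - r[j].x;
                double dy = r[i].y - r[j].y;
                double dz = r[i].z - r[j].z;
                double s2 = sigma * sigma / (dx*dx + dy*dy + dz*dz);
                double s6 = s2 * s2 * s2;
                e += 4.0 * epsilon * (s6 * s6 - s6);  // 4eps[(s/r)^12 - (s/r)^6]
            }
        }
        return e;
    }

    int main() {
        std::vector<Vec3> r = {{0,0,0}, {1.5,0,0}, {0,1.5,0}, {0,0,1.5}};
        printf("E = %f\n", lennardJonesEnergy(r, 1.0, 1.0));
        return 0;
    }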


ChemPhysChem ◽  
2005 ◽  
Vol 6 (9) ◽  
pp. 1788-1793 ◽  
Author(s):  
Jürg Hutter ◽  
Alessandro Curioni

2009 ◽  
Vol E92-D (5) ◽  
pp. 1062-1078 ◽  
Author(s):  
M. M. Hafizur Rahman ◽  
Yasushi Inoguchi ◽  
Yukinori Sato ◽  
Susumu Horiguchi
