Efficient task spawning for shared memory and message passing in many-core architectures

2017 ◽  
Vol 77 ◽  
pp. 72-82 ◽  
Author(s):  
Aurang Zaib ◽  
Thomas Wild ◽  
Andreas Herkersdorf ◽  
Jan Heisswolf ◽  
Jürgen Becker ◽  
...  
2016 ◽  
Vol 26 (03) ◽  
pp. 1650014 ◽  
Author(s):  
Markus Flatz ◽  
Marián Vajteršic

The goal of Nonnegative Matrix Factorization (NMF) is to represent a large nonnegative matrix in an approximate way as a product of two significantly smaller nonnegative matrices. This paper shows in detail how an NMF algorithm based on Newton iteration can be derived using the general Karush-Kuhn-Tucker (KKT) conditions for first-order optimality. This algorithm is suited for parallel execution on systems with shared memory and also with message passing. Both versions were implemented and tested, delivering satisfactory speedup results.


1992 ◽  
Vol 6 (1) ◽  
pp. 98-111 ◽  
Author(s):  
S. K. Kim ◽  
A. T. Chrortopoulos

Main memory accesses for shared-memory systems or global communications (synchronizations) in message passing systems decrease the computation speed. In this paper, the standard Arnoldi algorithm for approximating a small number of eigenvalues, with largest (or smallest) real parts for nonsymmetric large sparse matrices, is restructured so that only one synchronization point is required; that is, one global communication in a message passing distributed-memory machine or one global memory sweep in a shared-memory machine per each iteration is required. We also introduce an s-step Arnoldi method for finding a few eigenvalues of nonsymmetric large sparse matrices. This method generates reduction matrices that are similar to those generated by the standard method. One iteration of the s-step Arnoldi algorithm corresponds to s iterations of the standard Arnoldi algorithm. The s-step method has improved data locality, minimized global communication, and superior parallel properties. These algorithms are implemented on a 64-node NCUBE/7 Hypercube and a CRAY-2, and performance results are presented.


2005 ◽  
Vol 18 (2) ◽  
pp. 219-224
Author(s):  
Emina Milovanovic ◽  
Natalija Stojanovic

Because many universities lack the funds to purchase expensive parallel computers, cost effective alternatives are needed to teach students about parallel processing. Free software is available to support the three major paradigms of parallel computing. Parallaxis is a sophisticated SIMD simulator which runs on a variety of platforms.jBACI shared memory simulator supports the MIMD model of computing with a common shared memory. PVM and MPI allow students to treat a network of workstations as a message passing MIMD multicomputer with distributed memory. Each of this software tools can be used in a variety of courses to give students experience with parallel algorithms.


2002 ◽  
Vol 28 (11) ◽  
pp. 1507-1530
Author(s):  
Ian N Dunn ◽  
Gerard G.L Meyer

Sign in / Sign up

Export Citation Format

Share Document