Integrating Fine-Grained Message Passing in Cache Coherent Shared Memory Multiprocessors

1996 ◽  
Vol 33 (2) ◽  
pp. 172-188 ◽  
Author(s):  
David K. Poulsen ◽  
Pen-Chung Yew
Author(s):  
Vladimir Vlassov ◽  
Oscar Sierra Merino ◽  
Csaba Andras Moritz ◽  
Konstantin Popov

2000 ◽  
Vol 10 (01) ◽  
pp. 111-132 ◽  
Author(s):  
VOON-YEE VEE ◽  
WEN-JING HSU

In the past decade, many synchronous algorithms have been proposed for parallel and discrete simulations. However, the actual performance of these algorithms has been far from ideal, especially when event granularity is small. Barring the case of low parallelism in the given simulation models, one of the main reasons for low speedups is the uneven load distribution among processors. To remedy this, both static and dynamic load balancing approaches have been proposed. Nevertheless, static schemes based on partitioning of logical processes (LPs) are often subject to the dynamic behavior of the specific simulation models and are therefore application dependent; dynamic load balancing schemes, on the other hand, often suffer from loss of locality and hence cache misses, which can severely penalize fine-grained event processing. In this paper, we present several new locality-preserving load balancing mechanisms for synchronous simulations on shared-memory multiprocessors. We focus on the type of synchronous simulations where the number of LPs to be processed within a cycle decreases monotonically. We show both theoretically and empirically that some of these mechanisms incur very low overhead. The mechanisms have been implemented using MIT's Cilk and tested with a number of simulation applications. The results confirm that one of the new mechanisms is indeed more efficient and scalable than common existing approaches.
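
The imbalance the abstract describes can be made concrete with a small sketch. It is an illustration only and not one of the paper's mechanisms, which the abstract does not detail. The function names `static_block_loads` and `contiguous_rebalance` are hypothetical, and the sketch assumes LPs are numbered so that neighboring ids share data. It compares a fixed block partition of LP ids with a per-cycle split of the still-active LPs into contiguous chunks, which keeps neighboring LPs on the same processor.

```python
from typing import List


def static_block_loads(active: List[bool], num_procs: int) -> List[int]:
    """Count active LPs per processor under a fixed block partition of LP ids."""
    n = len(active)
    loads = [0] * num_procs
    for lp, is_active in enumerate(active):
        owner = lp * num_procs // n  # owner is chosen once, before the run
        if is_active:
            loads[owner] += 1
    return loads


def contiguous_rebalance(active: List[bool], num_procs: int) -> List[List[int]]:
    """Split the ordered active LP ids into near-equal contiguous chunks.

    Assumption (not from the paper): contiguity of ids stands in for locality.
    """
    ids = [lp for lp, is_active in enumerate(active) if is_active]
    base, extra = divmod(len(ids), num_procs)
    chunks = []
    start = 0
    for p in range(num_procs):
        size = base + (1 if p < extra else 0)  # first `extra` chunks get one more
        chunks.append(ids[start:start + size])
        start += size
    return chunks


# A late cycle in which only the first 6 of 16 LPs are still active.
active = [True] * 6 + [False] * 10
print(static_block_loads(active, 4))    # [4, 2, 0, 0]
print(contiguous_rebalance(active, 4))  # [[0, 1], [2, 3], [4], [5]]
```

Under the fixed partition two processors sit idle while one handles four LPs; the contiguous split caps the per-processor load at two. Real mechanisms would also have to account for the cost of recomputing the split each cycle, which this sketch ignores.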

