Compilation for Distributed Memory Architectures

Author(s):  
Alok Choudhary ◽  
Mahmut Kandemir

1991 ◽  
Vol 2 (2) ◽  
pp. 45-49 ◽  
Author(s):  
Michele Di Santo ◽  
Giulio Iannello

Algorithms ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 342
Author(s):  
Alessandro Varsi ◽  
Simon Maskell ◽  
Paul G. Spirakis

Resampling is a well-known statistical algorithm that is commonly applied in the context of Particle Filters (PFs) in order to perform state estimation for non-linear, non-Gaussian dynamic models. As the models become more complex and accurate, the run-time of PF applications increases accordingly. Parallel computing can help to address this. However, resampling (and, hence, PFs as well) necessarily involves a bottleneck, the redistribution step, which is notoriously challenging to parallelize using textbook parallel computing techniques. A state-of-the-art redistribution takes O((log₂ N)²) computations on Distributed Memory (DM) architectures, which most supercomputers adopt, whereas redistribution can be performed in O(log₂ N) on Shared Memory (SM) architectures, such as GPUs or mainstream CPUs. In this paper, we propose a novel parallel redistribution for DM that achieves O(log₂ N) time complexity. We also present empirical results indicating that our novel approach outperforms the O((log₂ N)²) approach.
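
For readers unfamiliar with the step the abstract refers to, the following minimal Python/NumPy sketch illustrates the sequential baseline: systematic resampling assigns each particle an offspring count, and redistribution then copies each particle according to its count so the population is again equally weighted. This is only the O(N) single-processor version, not the authors' O(log₂ N) distributed-memory algorithm; the function names and example values are illustrative.

import numpy as np

def systematic_resample_counts(weights, rng=None):
    # Systematic resampling: one stratified draw decides how many
    # offspring (copies) each particle receives. O(N) sequential.
    rng = np.random.default_rng() if rng is None else rng
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    indices = np.searchsorted(cumulative, positions, side='right')
    return np.bincount(indices, minlength=n)

def redistribute(particles, counts):
    # Redistribution: copy particle i exactly counts[i] times so the new
    # population again has N members. This O(N) copy is the step that
    # becomes the bottleneck when parallelized on distributed memory.
    return np.repeat(particles, counts, axis=0)

# Illustrative usage with made-up values
particles = np.array([[0.0], [1.0], [2.0], [3.0]])
weights = np.array([0.1, 0.2, 0.3, 0.4])
counts = systematic_resample_counts(weights)
new_particles = redistribute(particles, counts)
assert new_particles.shape == particles.shape

On a single machine this redistribution is a trivial contiguous copy; the difficulty the paper addresses arises when the particles are spread across the separate address spaces of a distributed-memory machine and the copies require communication.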


2013 ◽  
Vol 29 (2) ◽  
pp. 571-579 ◽  
Author(s):  
Didier Devaurs ◽  
Thierry Simeon ◽  
Juan Cortes
