Aristotle: A Performance Impact Indicator for the OpenCL Kernels Using Local Memory

2014 ◽  
Vol 22 (3) ◽  
pp. 239-257 ◽  
Author(s):  
Jianbin Fang ◽  
Henk Sips ◽  
Ana Lucia Varbanescu

Due to the increasing complexity of multi/many-core architectures (with their mix of caches and scratch-pad memories) and applications (with different memory access patterns), the performance of many workloads becomes increasingly variable. In this work, we address one of the main causes of this performance variability: the efficiency of the memory system. Specifically, based on an empirical evaluation driven by memory access patterns, we qualify and partially quantify the performance impact of using local memory in multi/many-core processors. To do so, we systematically describe memory access patterns (MAPs) in an application-agnostic manner. Next, for each identified MAP, we use OpenCL (for portability reasons) to generate two microbenchmarks: a “naive” version (without local memory) and an “optimized” version (using local memory). We then evaluate both of them on typically used multi-core and many-core platforms, and we log their performance. What we eventually obtain is a local memory performance database, indexed by various MAPs and platforms. Further, we propose a set of composing rules for multiple MAPs. Thus, we can get an indicator of whether using local memory is beneficial in the presence of multiple memory access patterns. This indication can be used either to avoid the hassle of implementing optimizations with too little gain or, alternatively, to give a rough prediction of the performance gain.
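The contrast between the “naive” and “optimized” microbenchmarks can be illustrated with a small host-side sketch. This is not the authors' OpenCL code: it is a plain C++ emulation, assuming a 3-point stencil as the access pattern, a hypothetical tile size `kTile` standing in for a work-group size, and a small `scratch` array standing in for local memory. The function names are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size, standing in for an OpenCL work-group size.
constexpr std::size_t kTile = 4;

// Clamp-to-edge read: out-of-range indices reuse the nearest valid element.
// Must only be called on a non-empty vector.
static float load_clamped(const std::vector<float>& in, std::ptrdiff_t i) {
    const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(in.size()) - 1;
    if (i < 0) i = 0;
    if (i > last) i = last;
    return in[static_cast<std::size_t>(i)];
}

// "Naive" variant: every output reads its three inputs straight from the
// large array (the analogue of reading global memory directly).
std::vector<float> stencil_naive(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::ptrdiff_t s = static_cast<std::ptrdiff_t>(i);
        out[i] = load_clamped(in, s - 1) + load_clamped(in, s) + load_clamped(in, s + 1);
    }
    return out;
}

// "Optimized" variant: each tile first stages its elements plus a one-element
// halo on each side into a small scratch buffer (the analogue of local
// memory), then computes from the scratch buffer only.
std::vector<float> stencil_staged(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    float scratch[kTile + 2];
    for (std::size_t base = 0; base < in.size(); base += kTile) {
        const std::size_t count = std::min(kTile, in.size() - base);
        // Stage phase: scratch[j] holds in[base + j - 1], clamped at the edges.
        for (std::size_t j = 0; j < count + 2; ++j) {
            scratch[j] = load_clamped(in, static_cast<std::ptrdiff_t>(base + j) - 1);
        }
        // Compute phase: neighbours are served from the scratch buffer.
        for (std::size_t j = 0; j < count; ++j) {
            out[base + j] = scratch[j] + scratch[j + 1] + scratch[j + 2];
        }
    }
    return out;
}
```

Both functions produce the same output; what differs is how often the large array is touched. Whether staging pays off on a real device depends on the access pattern and the platform, which is the question the abstract's performance database is meant to answer.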

PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0250306
Author(s):  
Jonas Latt ◽  
Christophe Coreixas ◽  
Joël Beny

We present a novel, hardware-agnostic implementation strategy for lattice Boltzmann (LB) simulations, which yields massive performance on homogeneous and heterogeneous many-core platforms. Based solely on C++17 Parallel Algorithms, our approach does not rely on any language extensions, external libraries, vendor-specific code annotations, or pre-compilation steps. Thanks in particular to a recently proposed GPU back-end to C++17 Parallel Algorithms, it is shown that a single code can compile and reach state-of-the-art performance on both many-core CPU and GPU environments for the solution of a given nontrivial fluid dynamics problem. The proposed strategy is tested with six different, commonly used implementation schemes to assess the performance impact of memory access patterns on different platforms. Nine different LB collision models are included in the tests and exhibit good performance, demonstrating the versatility of our parallel approach. This work shows that it is less necessary than ever to draw a distinction between research and production software, as a concise and generic LB implementation yields performance comparable to that achievable in a hardware-specific programming language. The results also highlight the performance gains achieved by modern many-core CPUs and their apparent capability to narrow the gap with the traditionally massively faster GPU platforms. All code is made available to the community in the form of the open-source project stlbm, which serves both as a stand-alone simulation software and as a collection of reusable patterns for the acceleration of pre-existing LB codes.
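As a rough illustration of what "based solely on C++17 Parallel Algorithms" can look like, the sketch below applies a per-cell update through `std::transform` with a caller-supplied execution policy. It is not taken from stlbm: the function name `relax_cells`, the flat one-value-per-cell layout, and the simple relaxation formula are assumptions made for this example and do not correspond to any of the paper's collision models.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical per-cell update: relax each value toward an equilibrium value
// with rate omega. A stand-in for a collision step, not a real LB model.
template <class ExecutionPolicy>
void relax_cells(ExecutionPolicy&& policy,
                 const std::vector<double>& f,
                 const std::vector<double>& f_eq,
                 std::vector<double>& f_out,
                 double omega) {
    // All three arrays must describe the same set of cells.
    assert(f.size() == f_eq.size() && f.size() == f_out.size());
    // The execution policy is the only parallelism-related element here:
    // the loop body is an ordinary lambda with no vendor-specific annotation.
    std::transform(std::forward<ExecutionPolicy>(policy),
                   f.begin(), f.end(), f_eq.begin(), f_out.begin(),
                   [omega](double fi, double feq) { return fi - omega * (fi - feq); });
}
```

A call such as `relax_cells(std::execution::par_unseq, f, f_eq, f_out, 1.0)` requests parallel execution, while `std::execution::seq` runs the same code serially. Depending on the toolchain, the parallel policies may need an additional backend library at link time.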


2018 ◽  
Vol 78 ◽  
pp. 1-14 ◽  
Author(s):  
Harald Servat ◽  
Jesús Labarta ◽  
Hans-Christian Hoppe ◽  
Judit Giménez ◽  
Antonio J. Peña

2020 ◽  
Vol 117 (39) ◽  
pp. 24590-24598
Author(s):  
Freek van Ede ◽  
Alexander G. Board ◽  
Anna C. Nobre

Adaptive behavior relies on the selection of relevant sensory information from both the external environment and internal memory representations. In understanding external selection, a classic distinction is made between voluntary (goal-directed) and involuntary (stimulus-driven) guidance of attention. We have developed a task—the anti-retrocue task—to separate and examine voluntary and involuntary guidance of attention to internal representations in visual working memory. We show that both voluntary and involuntary factors influence memory performance but do so in distinct ways. Moreover, by tracking gaze biases linked to attentional focusing in memory, we provide direct evidence for an involuntary “retro-capture” effect whereby external stimuli involuntarily trigger the selection of feature-matching internal representations. We show that stimulus-driven and goal-directed influences compete for selection in memory, and that the balance of this competition—as reflected in oculomotor signatures of internal attention—predicts the quality of ensuing memory-guided behavior. Thus, goal-directed and stimulus-driven factors together determine the fate not only of perception, but also of internal representations in working memory.


2019 ◽  
Vol 16 (3) ◽  
pp. 1-24
Author(s):  
Bingchao Li ◽  
Jizeng Wei ◽  
Jizhou Sun ◽  
Murali Annavaram ◽  
Nam Sung Kim
