An enhancement of futures runtime in presence of cache memory hierarchy

Author(s):  
Matko Botincan ◽  
Davor Runje
2009 ◽  
Vol 17 (1-2) ◽  
pp. 77-95 ◽  
Author(s):  
Pieter Bellens ◽  
Josep M. Perez ◽  
Felipe Cabarcas ◽  
Alex Ramirez ◽  
Rosa M. Badia ◽  
...  

The main goal of Cell Superscalar (CellSs) is to provide a simple, flexible programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent task-level concurrency of applications. The CellSs environment is based on a source-to-source compiler that translates annotated C or Fortran code and a runtime library tailored for the Cell/B.E. that takes care of the concurrent execution of the application. The first efforts at task scheduling in CellSs were derived from very simple heuristics. This paper presents new scheduling techniques developed for CellSs to improve application performance. Additionally, the design of a new scheduling algorithm is detailed and the algorithm is evaluated. The CellSs scheduler takes into account an extension of the Cell/B.E. memory hierarchy, with a cache memory shared between the SPEs. All of the new scheduling practices have been evaluated and show improved behavior of the system.


2013 ◽  
Vol 49 (7) ◽  
pp. 4456-4459 ◽  
Author(s):  
Shinobu Fujita ◽  
H. Noguchi ◽  
K. Nomura ◽  
K. Abe ◽  
E. Kitagawa ◽  
...  

10.14311/822 ◽  
2006 ◽  
Vol 46 (2) ◽  
Author(s):  
I. Šimeček

Every modern CPU uses a complex memory hierarchy, which consists of multiple cache memory levels. It is very difficult to predict the behavior of this hierarchy for a given program (for details see [1, 2]). The situation is even worse for systems with shared memory. The most important example is the case of SMP (symmetric multiprocessing) systems [3]. The importance of these systems is growing due to the multi-core feature of the newest CPUs. The Cache Emulator (CE) can simulate the behavior of caches inside an SMP system and compute the number of cache misses during a computation. All measurements are done in the “off-line” mode on a single CPU. The CE uses its own emulated cache memory for an exact simulation. This means that no other CPU activity influences the behavior of the CE. This work extends the Cache Analyzer introduced in [4].
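
The idea of counting cache misses on an emulated cache can be illustrated with a minimal sketch. This is not the Cache Emulator's implementation: it assumes a single direct-mapped cache with no coherence between CPUs, and the function name `count_misses` and the parameters `num_lines` and `line_size` are hypothetical choices made for the illustration.

```python
def count_misses(addresses, num_lines=4, line_size=16):
    """Count misses of a direct-mapped cache for a trace of byte addresses.

    Assumed model (not taken from the abstract): one cache, direct-mapped,
    initially empty, with num_lines lines of line_size bytes each.
    """
    tags = [None] * num_lines  # tag stored in each cache line; None means empty
    misses = 0
    for addr in addresses:
        block = addr // line_size   # memory block that contains this address
        index = block % num_lines   # cache line the block maps to
        tag = block // num_lines    # distinguishes blocks sharing that line
        if tags[index] != tag:      # empty line or a different block: miss
            misses += 1
            tags[index] = tag       # load the block into the line
    return misses
```

Because the emulated cache state lives entirely in the `tags` list, the count depends only on the address trace, which mirrors the off-line property described above. For instance, with the default parameters the addresses 0 and 64 map to the same line, so alternating between them misses on every access.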


2020 ◽  
Vol 25 (6) ◽  
pp. 548-557
Author(s):  
A.V. Garashchenko ◽  
L.G. Gagarina

Because of its large state space, verifying the cache memory hierarchy in a modern SoC requires a huge number of complex tests, which becomes the main problem for functional verification. To cover the entire state space, a graph model of the cache memory hierarchy has been proposed, along with methods for generating test sequences based on this model. The vertices of the graph model are sets of states (tags, values, etc.) of each hierarchy level, and the edges are sets of transitions between states (read and write instructions). A graph model describing all states of the cache memory hierarchy has been developed. Each edge in the graph is a separate check sequence. Non-deterministic situations, such as the choice of a channel (port) in a multichannel cache memory, cannot be resolved at the level of the graph model, since the choice of channel depends on many factors not considered within the model. It has therefore been proposed to create a separate instance of a subgraph for each channel. Applied to the verification of the multiport cache memory hierarchy of a newly developed core with a new VLIW DSP vector architecture, the described approach revealed several architectural and functional errors. This approach can be used to test other processor cores and their blocks.
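
The notion that each edge of the state graph yields its own check sequence can be sketched on a toy model. Everything here is an assumption made for illustration: the three states of a single cache line, the `read` and `write` operations, and the helper names `path_to` and `edge_tests` are hypothetical and far simpler than the multi-level model the abstract describes.

```python
from collections import deque

# Hypothetical toy model: (state, operation) -> next state for one cache line.
TRANSITIONS = {
    ("invalid", "read"): "clean",
    ("invalid", "write"): "dirty",
    ("clean", "read"): "clean",
    ("clean", "write"): "dirty",
    ("dirty", "read"): "dirty",
    ("dirty", "write"): "dirty",
}

def path_to(target, start="invalid"):
    """Return a shortest list of operations from start to target (BFS)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, ops = queue.popleft()
        if state == target:
            return ops
        for (src, op), dst in TRANSITIONS.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, ops + [op]))
    return None  # target is unreachable from start

def edge_tests(start="invalid"):
    """Build one sequence per edge: reach the edge's source state, then take it."""
    tests = {}
    for (src, op) in TRANSITIONS:
        prefix = path_to(src, start)
        if prefix is not None:
            tests[(src, op)] = prefix + [op]
    return tests
```

Calling `edge_tests()` returns one operation sequence for every edge reachable from the initial state, so running all of them exercises every transition in the toy graph at least once. A per-channel variant, in the spirit of the subgraph-per-channel proposal, would keep one such transition table per channel.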


2016 ◽  
Vol E99.C (8) ◽  
pp. 936-946
Author(s):  
Ryotaro KOBAYASHI ◽  
Ikumi KANEKO ◽  
Hajime SHIMADA
