On cache memory hierarchy for Chip-Multiprocessor

2003 ◽  
Vol 31 (1) ◽  
pp. 39-48 ◽  
Author(s):  
Mohamed M. Zahran
2009 ◽  
Vol 17 (1-2) ◽  
pp. 77-95 ◽  
Author(s):  
Pieter Bellens ◽  
Josep M. Perez ◽  
Felipe Cabarcas ◽  
Alex Ramirez ◽  
Rosa M. Badia ◽  
...  

The main goal of Cell Superscalar (CellSs) is to provide a simple, flexible programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent concurrency of applications at the task level. The CellSs environment is based on a source-to-source compiler that translates annotated C or Fortran code, and on a runtime library tailored for the Cell/B.E. that takes care of the concurrent execution of the application. The first efforts at task scheduling in CellSs were derived from very simple heuristics. This paper presents new scheduling techniques developed for CellSs to improve application performance. Additionally, the design of a new scheduling algorithm is detailed and the algorithm is evaluated. The CellSs scheduler takes into account an extension of the Cell/B.E. memory hierarchy, with a cache memory shared between the SPEs. All new scheduling practices have been evaluated and show improved behavior of our system.


2011 ◽  
Vol 21 (01) ◽  
pp. 85-106 ◽  
Author(s):  
Marco A. Z. Alves ◽  
Henrique C. Freitas ◽  
Philippe O. A. Navaux

Several studies point out the benefits of a shared L2 cache, but other properties of shared caches must also be considered to reach a thorough understanding of chip multiprocessor (CMP) bottlenecks. Our paper evaluates and explains shared cache bottlenecks, which are especially important given the rise of many-core processors. The results of our simulations with 32 cores show low performance when the L2 cache memory is shared between 2 or 4 cores. In these two cases, increased L2 cache latency and contention are the main causes of the increase in execution time.

