On cache memory hierarchy for Chip-Multiprocessor

The main goal of Cell Superscalar (CellSs) is to provide a simple, flexible and easy programming approach for the Cell Broadband Engine (Cell/B.E.) that automatically exploits the inherent concurrency of applications at the task level. The CellSs environment is based on a source-to-source compiler that translates annotated C or Fortran code, and on a runtime library tailored for the Cell/B.E. that takes care of the concurrent execution of the application. Early task scheduling in CellSs relied on very simple heuristics. This paper presents new scheduling techniques developed for CellSs to improve application performance. Additionally, the design of a new scheduling algorithm is detailed and the algorithm is evaluated. The CellSs scheduler takes into account an extension of the Cell/B.E. memory hierarchy, with a cache memory shared between the SPEs. All of the new scheduling practices have been evaluated, and the results show improved behavior of our system.


Multi-objective optimization applied to unified second level cache memory hierarchy tuning aiming at energy and performance optimization

Applied Soft Computing ◽

10.1016/j.asoc.2016.09.006 ◽

2016 ◽

Vol 49 ◽

pp. 603-610 ◽

Cited By ~ 3

Author(s):

Filipe Rolim Cordeiro ◽

Abel Guilhermino da Silva-Filho

Keyword(s):

Performance Optimization ◽

Memory Hierarchy ◽

Cache Memory ◽

Multi Objective Optimization ◽

Multi Objective ◽

And Performance


An enhancement of futures runtime in presence of cache memory hierarchy

ITI 2008 - 30th International Conference on Information Technology Interfaces ◽

10.1109/iti.2008.4588535 ◽

2008 ◽

Author(s):

Matko Botincan ◽

Davor Runje

Keyword(s):

Memory Hierarchy ◽

Cache Memory


Coherence Maintenances to realize an efficient parallel processing for a Cache Memory with Synchronization on a Chip-Multiprocessor

8th International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN'05) ◽

10.1109/ispan.2005.27 ◽

2006 ◽

Cited By ~ 2

Author(s):

A. Yamawaki ◽

M. Iwane

Keyword(s):

Parallel Processing ◽

Chip Multiprocessor ◽

Cache Memory


Non-Overlayed Scratchpad Allocation Approaches for Main/Scratchpad + Cache Memory Hierarchy

Advanced Memory Optimization Techniques for Low-Power Embedded Processors ◽

10.1007/978-1-4020-5897-4_5 ◽

2007 ◽

pp. 49-82

Keyword(s):

Memory Hierarchy ◽

Cache Memory


A novel Chip-Multiprocessor Architecture with optically interconnected shared L1 Optical Cache Memory

Optical Fiber Communication Conference ◽

10.1364/ofc.2014.th2a.9 ◽

2014 ◽

Cited By ~ 1

Author(s):

P. Maniotis ◽

S. Gitzenis ◽

L. Tassiulas ◽

N. Pleros

Keyword(s):

Chip Multiprocessor ◽

Cache Memory ◽

Multiprocessor Architecture


An optically-enabled chip–multiprocessor architecture using a single-level shared optical cache memory

Optical Switching and Networking ◽

10.1016/j.osn.2016.05.001 ◽

2016 ◽

Vol 22 ◽

pp. 54-68 ◽

Cited By ~ 3

Author(s):

P. Maniotis ◽

S. Gitzenis ◽

L. Tassiulas ◽

N. Pleros

Keyword(s):

Chip Multiprocessor ◽

Cache Memory ◽

Single Level ◽

Multiprocessor Architecture


High Latency and Contention on Shared L2-Cache for Many-Core Architectures

Parallel Processing Letters ◽

10.1142/s0129626411000096 ◽

2011 ◽

Vol 21 (01) ◽

pp. 85-106 ◽

Cited By ~ 2

Author(s):

Marco A. Z. Alves ◽

Henrique C. Freitas ◽

Philippe O. A. Navaux

Keyword(s):

Execution Time ◽

Chip Multiprocessor ◽

Cache Memory ◽

Shared Cache ◽

Shared Caches ◽

L2 Cache ◽

Low Performance ◽

Many Core

Several studies point out the benefits of a shared L2 cache, but other properties of shared caches must also be considered to reach a thorough understanding of all chip multiprocessor (CMP) bottlenecks. Our paper evaluates and explains shared cache bottlenecks, which are very important considering the rise of many-core processors. The results of our simulations with 32 cores show low performance when the L2 cache memory is shared between 2 or 4 cores. In these two cases, increased L2 cache latency and contention are the main causes of the longer execution time.
