Improvement of Load Balancing in Shared-Memory Multiprocessor Systems

Author(s): Hasan Deeb ◽ Archana Sarangi ◽ Shubhendu Kumar Sarangi
2009 ◽ Vol 20 (01) ◽ pp. 167-183
Author(s): Wolfgang Bein ◽ Lawrence L. Larmore ◽ Rüdiger Reischuk

Multiprocessor systems with a global shared memory provide logically uniform data access. To hide the latency of accessing global memory, each processor makes use of a private cache, so several copies of a data item may exist in the system concurrently. To guarantee consistency when updating an item, a processor must invalidate the copies of that item held in other private caches. To exclude the effect of classical paging faults, one assumes that each processor knows its own data access sequence but does not know the sequence of future invalidations requested by other processors. The performance of a processor under this restriction can be measured against the optimal behavior of a theoretical omniscient processor, using competitive analysis. We present a [Formula: see text]-competitive randomized online algorithm for this problem for a cache size of 2, and we prove a matching lower bound on the competitiveness. The algorithm is derived with the help of a new concept we call knowledge states. Finally, we show a lower bound of [Formula: see text] on the competitiveness for larger cache sizes.
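To make the setting concrete, the following minimal C sketch models only the write-invalidate step described above: a few processors each hold a two-entry private cache, and a write by one processor removes the copies held by the others. The names (Cache, load, write_and_invalidate) and the naive FIFO eviction are illustrative assumptions; the sketch does not reproduce the paper's knowledge-state algorithm or its competitive analysis.

```c
/* Minimal sketch of write-invalidate coherence with private caches.
 * Illustrative only; not the paper's knowledge-state algorithm. */
#include <stdio.h>

#define NPROC      3   /* number of processors */
#define CACHE_SIZE 2   /* private cache size, matching the paper's k = 2 case */

typedef struct {
    int item[CACHE_SIZE];   /* cached item ids, -1 = empty slot */
} Cache;

static Cache caches[NPROC];

/* Load an item into processor p's cache (naive FIFO eviction for the sketch). */
static void load(int p, int id) {
    Cache *c = &caches[p];
    for (int i = 0; i < CACHE_SIZE; i++)
        if (c->item[i] == id) return;          /* already cached: hit */
    c->item[0] = c->item[1];
    c->item[1] = id;
    printf("P%d loads item %d (miss)\n", p, id);
}

/* Processor p updates an item: every copy in another cache must be invalidated. */
static void write_and_invalidate(int p, int id) {
    load(p, id);
    for (int q = 0; q < NPROC; q++) {
        if (q == p) continue;
        for (int i = 0; i < CACHE_SIZE; i++) {
            if (caches[q].item[i] == id) {
                caches[q].item[i] = -1;        /* copy invalidated */
                printf("  invalidation: P%d loses item %d\n", q, id);
            }
        }
    }
}

int main(void) {
    for (int p = 0; p < NPROC; p++)
        caches[p].item[0] = caches[p].item[1] = -1;

    load(0, 7);                 /* P0 reads item 7 */
    load(1, 7);                 /* P1 reads the same item */
    write_and_invalidate(2, 7); /* P2 writes it: P0 and P1 lose their copies */
    return 0;
}
```

The online difficulty studied in the paper is exactly that each processor sees its own loads in advance but cannot predict which of its cached items such writes by other processors will invalidate.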


2003 ◽ Vol 11 (2) ◽ pp. 105-124
Author(s): Vishal Aslot ◽ Rudolf Eigenmann

Modern computer systems have evolved to make multiprocessing readily accessible by supporting multiple processors in a single physical package. As the multiprocessor hardware evolves, new ways of programming it are developed as well; some of these merely adopt and standardize older paradigms. One such evolving standard for programming shared-memory parallel computers is the OpenMP API. The Standard Performance Evaluation Corporation (SPEC) has created a suite of parallel programs called SPEC OMP to compare and evaluate modern shared-memory multiprocessor systems using the OpenMP standard. We have studied these benchmarks in detail to understand their performance on a modern architecture. In this paper, we present detailed measurements of the benchmarks. We organize, summarize, and display our measurements using a Quantitative Model, and we present a detailed discussion and derivation of the model. We also discuss the important loops in the SPEC OMPM2001 benchmarks and the reasons for less-than-ideal speedup on our platform.
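As a point of reference for the programming model the suite exercises, here is a minimal OpenMP work-sharing loop in C with a timing harness built on omp_get_wtime. It is a generic illustration, not code taken from the SPEC OMPM2001 benchmarks; the dot-product kernel and problem size are arbitrary assumptions.

```c
/* Minimal OpenMP work-sharing loop with timing; compile with: cc -fopenmp */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (1 << 22)   /* arbitrary problem size (~4M elements) */

int main(void) {
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double sum = 0.0;

    if (!a || !b) return 1;
    for (long i = 0; i < N; i++) { a[i] = 0.5 * i; b[i] = 2.0; }

    double t0 = omp_get_wtime();

    /* Iterations are divided among the team of threads, and per-thread
     * partial sums are combined by the reduction clause. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < N; i++)
        sum += a[i] * b[i];

    double t1 = omp_get_wtime();

    printf("threads=%d  dot=%g  time=%.3f s\n",
           omp_get_max_threads(), sum, t1 - t0);

    free(a);
    free(b);
    return 0;
}
```

Speedup is the serial run time divided by the parallel run time; memory bandwidth limits, load imbalance, and serial regions typically keep it below the thread count, which is the kind of per-loop effect the paper's quantitative model is used to attribute.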


1992 ◽ Vol 02 (04) ◽ pp. 391-398
Author(s): Theodore Johnson ◽ Timothy A. Davis

Shared-memory multiprocessor systems need efficient dynamic storage allocators, both for system purposes and to support parallel programs. Memory managers are often based on the buddy system, which provides fast allocation and release. Previous parallel buddy memory managers made no attempt to coordinate the allocation, splitting, and release of blocks and, as a result, needlessly fragmented memory. We present a fast and simple parallel buddy memory manager that is also as space-efficient as a serial buddy memory manager. We test our algorithms using memory allocation/deallocation traces collected from a parallel sparse matrix algorithm.
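For readers unfamiliar with the underlying discipline, the compact serial sketch below shows the two operations a buddy system rests on: splitting a larger block on allocation and coalescing a freed block with its buddy. It is a simplified, hypothetical illustration (bitmap free lists, fixed pool size, no locking), not the paper's parallel manager, whose contribution is precisely the coordination of these steps across processors.

```c
/* Compact serial buddy-allocator sketch: split on allocate, coalesce on free.
 * Illustrative only; the paper's parallel coordination is not reproduced. */
#include <stdio.h>
#include <string.h>

#define MIN_ORDER  4    /* smallest block: 2^4 = 16 units; callers must use order >= MIN_ORDER */
#define MAX_ORDER 10    /* whole pool:     2^10 = 1024 units */

/* free_map[o] marks which blocks of order o are free; index = offset >> o. */
static unsigned char free_map[MAX_ORDER + 1][1 << (MAX_ORDER - MIN_ORDER)];

static void init_pool(void) {
    memset(free_map, 0, sizeof free_map);
    free_map[MAX_ORDER][0] = 1;            /* one free block covering the pool */
}

/* Allocate a block of the given order; returns its offset or -1 if none fits. */
static int buddy_alloc(int order) {
    for (int o = order; o <= MAX_ORDER; o++) {      /* smallest free block first */
        for (int i = 0; i < (1 << (MAX_ORDER - o)); i++) {
            if (free_map[o][i]) {
                free_map[o][i] = 0;
                int off = i << o;
                while (o > order) {                 /* split down, freeing right halves */
                    o--;
                    free_map[o][(off >> o) + 1] = 1;
                }
                return off;
            }
        }
    }
    return -1;
}

/* Release a block; merge with its buddy as long as the buddy is also free. */
static void buddy_free(int off, int order) {
    while (order < MAX_ORDER) {
        int idx = off >> order, buddy = idx ^ 1;
        if (!free_map[order][buddy]) break;
        free_map[order][buddy] = 0;                 /* absorb the free buddy */
        off = (idx & ~1) << order;
        order++;
    }
    free_map[order][off >> order] = 1;
}

int main(void) {
    init_pool();
    int a = buddy_alloc(5), b = buddy_alloc(5);
    printf("allocated offsets: %d %d\n", a, b);
    buddy_free(a, 5);
    buddy_free(b, 5);                               /* frees coalesce back into one pool block */
    return 0;
}
```

In a shared-memory setting, several processors may split and release blocks at once; without coordination, matching buddies can be missed and the pool fragments, which is the problem the paper's allocator addresses.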

