Compiler-directed power optimization of high-performance interconnection networks for load-balancing MPI applications

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already include on-chip multiprocessors (such as PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, along with a powerful on-chip multi-level cache hierarchy. Because most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance with low power consumption is a complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, LP-NUCA (Low Power NUCA), which reduces access latency by exploiting the properties of NUCA (Non-Uniform Cache Architectures). Its key points are decoupling the functionality and using three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance while reducing energy consumption. Instruction caches, however, have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. This requires completely re-evaluating our previous design in terms of structure design, interconnection networks (including topologies, flow control, and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


Design and analysis of high performance multistage interconnection networks

IEEE Transactions on Computers; DOI 10.1109/12.559810; 1997; Vol 46 (1); pp. 110-117; Cited By ~ 3

Author(s): S.K. Bhogavilli, H. Abu-Amara

Keyword(s): Interconnection Networks, High Performance, Multistage Interconnection Networks


Trends in High‐Performance Interconnection Networks in the Exascale and Big‐Data Era (HiPINEB 2017)

Concurrency and Computation Practice and Experience; DOI 10.1002/cpe.5041; 2018; Vol 31 (2); pp. e5041

Author(s): Jesús Escudero Sahuquillo, Pedro Javier Garcia

Keyword(s): Big Data, Interconnection Networks, High Performance


Hierarchical Load Balancing Model by Optimal Resource Utilization

Research Anthology on Architectures, Frameworks, and Integration Strategies for Distributed and Cloud Computing; DOI 10.4018/978-1-7998-5339-8.ch007; 2021; pp. 150-164

Author(s): Jagdish Chandra Patni

Keyword(s): Load Balancing, Grid Computing, Resource Utilization, Resource Availability, High Performance, Low Cost, Original Form, Computing Paradigm, Optimal Resource, Performance Computing

Powerful computational capabilities and resource availability at low cost are the foremost demands of high performance computing. The computing resources can be viewed as the edges of an interconnected grid, which can attain the capabilities of grid computing by balancing the load at various levels. Because the resources are heterogeneous and geographically distributed, the grid computing paradigm in its original form cannot meet these requirements, so it can draw on the capabilities of the cloud and other technologies to achieve the goal. Resource heterogeneity makes grid computing more dynamic and challenging. This article therefore discusses the scalability, heterogeneity, and adaptability of grid computing from the perspective of providing high computing power, load balancing, and resource availability.


High-Performance Implementation of Dynamically Configurable Load Balancing Engine on FPGA

IEEE Communications Magazine; DOI 10.1109/mcom.001.1900525; 2020; Vol 58 (1); pp. 62-67

Author(s): Jun Zhao, Zhichuan Guo, Xuewen Zeng, Mangu Song

Keyword(s): Load Balancing, High Performance, Dynamically Configurable


A new approach for load balancing in high performance decision support systems

High-Performance Computing and Networking - Lecture Notes in Computer Science; DOI 10.1007/3-540-61142-8_598; 1996; pp. 571-579; Cited By ~ 1

Author(s): Björn Schiemann, Lothar Borrmann

Keyword(s): Decision Support, Load Balancing, Decision Support Systems, High Performance, Support Systems, New Approach


High performance pattern matching with dynamic load balancing on heterogeneous systems

14th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP'06); DOI 10.1109/pdp.2006.41; 2006; Cited By ~ 2

Author(s): Jin Hwan Park, B.A. Demirdag

Keyword(s): Load Balancing, Pattern Matching, Dynamic Load, High Performance, Heterogeneous Systems, Dynamic Load Balancing, Performance Pattern


Optical interconnection networks for high-performance systems

Optical Fiber Telecommunications VII; DOI 10.1016/b978-0-12-816502-7.00020-8; 2020; pp. 785-825

Author(s): Qixiang Cheng, Madeleine Glick, Keren Bergman

Keyword(s): Interconnection Networks, High Performance, Optical Interconnection, High Performance Systems


Communication Performance Evaluation of the Locally Twisted Cube

International Journal of Foundations of Computer Science; DOI 10.1142/s0129054120500057; 2020; Vol 31 (02); pp. 233-252

Author(s): Yuejuan Han, Lantao You, Cheng-Kuan Lin, Jianxi Fan

Keyword(s): Performance Evaluation, Interconnection Networks, High Performance, Communication Performance, Wide Diameter, Fault Diameter, Twisted Cube, Hypercube Network, High Performance Computers, Important Variant

The topological properties of multiprocessor interconnection networks are important to the performance of high performance computers. The hypercube network [Formula: see text] has proved to be one of the most popular interconnection networks. The [Formula: see text]-dimensional locally twisted cube [Formula: see text] is an important variant of [Formula: see text]. Fault diameter and wide diameter are two parameters for evaluating the communication performance of a network. Let [Formula: see text], [Formula: see text] and [Formula: see text] denote the diameter, the [Formula: see text] fault diameter and the wide diameter of [Formula: see text], respectively. In this paper, we prove that [Formula: see text] if [Formula: see text] is an odd integer with [Formula: see text], and [Formula: see text] if [Formula: see text] is an even integer with [Formula: see text].
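To make the terms diameter and fault diameter in this abstract concrete, the following is a minimal sketch that computes both by brute force on a small ordinary hypercube. It is an illustration only and not the paper's method: it uses the plain hypercube rather than the locally twisted cube, it assumes the common definition of the f-fault diameter as the largest diameter that remains after removing any set of at most f vertices, and it does not cover the wide diameter. The function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def hypercube_neighbors(v, n):
    """Neighbors of vertex v in the n-dimensional hypercube (flip one bit)."""
    return [v ^ (1 << i) for i in range(n)]


def diameter(n, faulty=frozenset()):
    """Largest shortest-path distance among fault-free vertices.

    Returns None if the fault-free part of the hypercube is disconnected.
    """
    alive = [v for v in range(1 << n) if v not in faulty]
    worst = 0
    for src in alive:
        # Breadth-first search from src, skipping faulty vertices.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in hypercube_neighbors(u, n):
                if w not in faulty and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) != len(alive):
            return None  # some fault-free vertex is unreachable
        worst = max(worst, max(dist.values()))
    return worst


def fault_diameter(n, f):
    """Assumed definition: max diameter over all fault sets of size <= f.

    Brute force, so only practical for very small n. Returns None if some
    fault set disconnects the network.
    """
    worst = 0
    for k in range(f + 1):
        for faulty in combinations(range(1 << n), k):
            d = diameter(n, frozenset(faulty))
            if d is None:
                return None
            worst = max(worst, d)
    return worst
```

For the 3-dimensional hypercube, `diameter(3)` returns 3 and `fault_diameter(3, 2)` returns 4, so two vertex faults can lengthen the worst-case route by one hop. With three faults the network can be disconnected, and `fault_diameter(3, 3)` returns None.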
