Communication algorithms in k-ary n-cube interconnection networks

In most distributed-memory MIMD multiprocessors, processors are connected by a point-to-point interconnection network, usually modeled as a graph in which processors are nodes and communication links are edges. Since interprocessor communication is frequently a serious bottleneck, several architectures have been proposed that augment point-to-point topologies with multiple bus systems to improve communication efficiency. In this paper we study parallel architectures in which communication is carried solely by buses. These architectures can exploit the power of bus technologies, providing a simple and efficient way to interconnect many more processors. We present the hyperpath, hypergrid, hyperring, and hypertorus architectures, which are the bus-based versions of widely used point-to-point interconnection networks. Using (hyper)graph-theoretic concepts to model interprocessor communication in such networks, we give optimal algorithms for broadcasting a message from one processor to all the others. To derive high-performance communication patterns we developed a new tool called simplification. The idea is to construct a graph, called the representative graph, from the original hyper-topology in such a way that communication schemes described and performed on the representative graph also fit the hyper-topology; the simplification concept thus also allows us to partially reuse known communication algorithms for ordinary networks.

A discrete event simulator of communication algorithms in interconnection networks

STACS 92 - Lecture Notes in Computer Science ◽

10.1007/3-540-55210-3_219 ◽

1992 ◽

pp. 609-610 ◽

Cited By ~ 1

Author(s):

Miltos D. Grammatikakis ◽

Jung-Sing Jwo

Keyword(s):

Interconnection Networks ◽

Discrete Event ◽

Communication Algorithms ◽

Discrete Event Simulator

Efficient Communication Algorithms in Hexagonal Mesh Interconnection Networks

IEEE Transactions on Parallel and Distributed Systems ◽

10.1109/tpds.2011.112 ◽

2012 ◽

Vol 23 (1) ◽

pp. 69-77 ◽

Cited By ~ 18

Author(s):

B. Albader ◽

B. Bose ◽

M. Flahive

Keyword(s):

Interconnection Networks ◽

Communication Algorithms ◽

Efficient Communication

Optimal all-ports collective communication algorithms for the k-ary n-cube interconnection networks

Journal of Systems Architecture ◽

10.1016/j.sysarc.2003.09.003 ◽

2004 ◽

Vol 50 (4) ◽

pp. 221-231 ◽

Cited By ~ 7

Author(s):

Abderezak Touzene

Keyword(s):

Interconnection Networks ◽

Collective Communication ◽

Communication Algorithms

On Directed Edge-Disjoint Spanning Trees in Product Networks, An Algorithmic Approach

The Journal of Engineering Research [TJER] ◽

10.24200/tjer.vol11iss2pp79-88 ◽

2014 ◽

Vol 11 (2) ◽

pp. 79

Author(s):

A.R. Touzene ◽

K. Day

Keyword(s):

Interconnection Networks ◽

Spanning Trees ◽

Construction Method ◽

Collective Communication ◽

Factor Graphs ◽

Directed Edge ◽

Algorithmic Approach ◽

Edge Disjoint ◽

Communication Algorithms ◽

Store And Forward

In (Ku et al. 2003), the authors proposed a construction of edge-disjoint spanning trees (EDSTs) in undirected product networks. Their construction method focuses on showing the existence of a maximum number (n1+n2-1) of EDSTs in the product network of two graphs whose factor graphs have n1 and n2 EDSTs, respectively. In this paper, we propose a new systematic and algorithmic approach to construct (n1+n2) directed routed EDSTs in product networks. Edge directions are added to support bidirectional links in interconnection networks. Our EDSTs can be used directly to develop efficient collective communication algorithms for both the store-and-forward and wormhole models.
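As a rough illustration of the setting described in this abstract, the sketch below builds a product network from two factor graphs and lists its directed links. It assumes that "product network" means the Cartesian product of the factor graphs and that each undirected edge stands for one bidirectional link. The names `cartesian_product` and `directed_links` are hypothetical helpers, not taken from the paper, and the sketch does not attempt the EDST construction itself.

```python
from itertools import product


def cartesian_product(g1, g2):
    """Return the Cartesian product of two undirected graphs.

    Each graph is an adjacency dict {node: set_of_neighbours}.
    Node (u, v) is adjacent to (u2, v) when u-u2 is an edge of g1,
    and to (u, v2) when v-v2 is an edge of g2.
    """
    prod = {}
    for u, v in product(g1, g2):
        along_g1 = {(u2, v) for u2 in g1[u]}
        along_g2 = {(u, v2) for v2 in g2[v]}
        prod[(u, v)] = along_g1 | along_g2
    return prod


def directed_links(g):
    """List every undirected edge as two directed links (a, b) and (b, a)."""
    return {(a, b) for a in g for b in g[a]}


# Hypothetical factor graphs: a 3-node ring and a 2-node path.
ring3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path2 = {0: {1}, 1: {0}}

net = cartesian_product(ring3, path2)
links = directed_links(net)
print(len(net), "nodes,", len(links), "directed links")
```

Under these assumptions, every node of the product inherits its degree as the sum of its degrees in the two factors, which is why spanning-tree counts of the factors are a natural starting point for counting trees in the product.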

New protocol for multistage interconnection networks

IEE Proceedings E Computers and Digital Techniques ◽

10.1049/ip-e.1991.0035 ◽

1991 ◽

Vol 138 (4) ◽

pp. 269

Author(s):

N.M. Patel

Keyword(s):

Interconnection Networks ◽

Multistage Interconnection Networks

Two-Level FIFO Buffer Design for Routers in On-Chip Interconnection Networks

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1587/transfun.e94.a.2412 ◽

2011 ◽

Vol E94-A (11) ◽

pp. 2412-2424 ◽

Cited By ~ 1

Author(s):

Po-Tsang HUANG ◽

Wei HWANG

Keyword(s):

Interconnection Networks ◽

Buffer Design ◽

On Chip

Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance together with low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architecture) properties. The key points are decoupling the functionality and using three specialized on-chip networks. This structure has been shown to be efficient for data hierarchies, achieving good performance while reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.
