High-Performance Computers and Interconnection Networks

2020 ◽  
Vol 31 (02) ◽  
pp. 233-252
Author(s):  
Yuejuan Han ◽  
Lantao You ◽  
Cheng-Kuan Lin ◽  
Jianxi Fan

The topological properties of multiprocessor interconnection networks are important to the performance of high-performance computers. The hypercube network [Formula: see text] has proved to be one of the most popular interconnection networks. The [Formula: see text]-dimensional locally twisted cube [Formula: see text] is an important variant of [Formula: see text]. Fault diameter and wide diameter are two parameters for evaluating the communication performance of a network. Let [Formula: see text], [Formula: see text] and [Formula: see text] denote the diameter, the [Formula: see text] fault diameter and the wide diameter of [Formula: see text], respectively. In this paper, we prove that [Formula: see text] if [Formula: see text] is an odd integer with [Formula: see text], and [Formula: see text] if [Formula: see text] is an even integer with [Formula: see text].
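To make the diameter and fault-diameter parameters concrete, here is a minimal sketch that computes both by brute force for the ordinary n-dimensional hypercube. It does not model the locally twisted cube or the wide diameter, and it is not the proof technique of the paper. The function names `diameter` and `fault_diameter` are hypothetical, and vertices are assumed to be encoded as n-bit integers.

```python
from collections import deque
from itertools import combinations


def diameter(n, faulty=frozenset()):
    """Diameter of the n-cube after deleting the vertices in `faulty`.

    Runs a breadth-first search from every surviving vertex.
    Returns float('inf') if the surviving graph is disconnected.
    """
    alive = [v for v in range(1 << n) if v not in faulty]
    worst = 0
    for src in alive:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for i in range(n):
                w = u ^ (1 << i)  # neighbor of u: flip bit i
                if w not in faulty and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(alive):
            return float("inf")  # some surviving vertex is unreachable
        worst = max(worst, max(dist.values()))
    return worst


def fault_diameter(n, f):
    """Largest diameter over all sets of at most f faulty vertices.

    Brute force over every fault set, so only practical for small n.
    """
    return max(
        diameter(n, frozenset(faults))
        for k in range(f + 1)
        for faults in combinations(range(1 << n), k)
    )
```

For the 3-dimensional hypercube, `diameter(3)` gives 3, while `fault_diameter(3, 2)` gives 4: deleting two neighbors of a vertex forces a detour through its one remaining neighbor.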


2014 ◽  
Vol 936 ◽  
pp. 2307-2312
Author(s):  
He Li

Because they integrate the positive features of both hypercubes and tori, optical multi-mesh hypercube (OMMH) networks are regarded as a promising class of optical interconnection networks for high-performance computers. This paper first derives that the diagnosability of OMMH under the pessimistic strategy is (2n+6)/(2n+6), which shows that OMMH possesses a strong self-diagnosing ability. Building on the improved cycle decomposition method by Yang in J. Parall. Distrib. Comput. [10], a fast diagnosis algorithm tailored for OMMH is also proposed; it identifies all faulty nodes and runs in O(N log2 N) time, where N is the number of processors in an OMMH.


Author(s):  
A. Ferrerón Labari ◽  
D. Suárez Gracia ◽  
V. Viñals Yúfera

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving both high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). The key points are decoupling the functionality and using three specialized on-chip networks. This structure has proved efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


Author(s):  
Jack Dongarra ◽  
Laura Grigori ◽  
Nicholas J. Higham

A number of features of today’s high-performance computers make it challenging to exploit these machines fully for computational science. These include increasing core counts but stagnant clock frequencies; the high cost of data movement; use of accelerators (GPUs, FPGAs, coprocessors), making architectures increasingly heterogeneous; and multiple precisions of floating-point arithmetic, including half-precision. Moreover, as well as maximizing speed and accuracy, minimizing energy consumption is an important criterion. New generations of algorithms are needed to tackle these challenges. We discuss some approaches that we can take to develop numerical algorithms for high-performance computational science, with a view to exploiting the next generation of supercomputers. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.


1992 ◽  
Vol 10 (6) ◽  
pp. 632-632
Author(s):  
Stuart M. Dambrot

PAMM ◽  
2015 ◽  
Vol 15 (1) ◽  
pp. 495-496 ◽  
Author(s):  
Lennart Schneiders ◽  
Jerry H. Grimmen ◽  
Matthias Meinke ◽  
Wolfgang Schröder

2012 ◽  
Vol 629 ◽  
pp. 704-710
Author(s):  
Xi Ying Liu ◽  
Tong Gui Bai ◽  
Tao Zhang

Numerically analyzing problems represented by partial differential equations on modern high-performance computers has become an important approach in earth science research. In this work, a Sea Ice numerical Model under JASMIN (J parallel Adaptive Structured Mesh applications INfrastructure), SIMJ for brevity, which includes thermodynamic and dynamic processes, is implemented, and a numerical experiment of 20-year integration with SIMJ has been performed. It is found that the model reproduces the seasonal variation of Arctic sea ice well and that the implementation of parallel computing is flexible and easy. The ratio of time consumption is 1:1.16:1.48:2.45 with 8, 4, 2, and 1 core(s), respectively, for a one-year integration on a mobile workstation (ThinkPad W510) with Red Hat Enterprise Linux 5.4 and the Portland Group’s pgf90 9.0-1.
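The reported time ratio can be converted into the more familiar speedup and parallel-efficiency figures. The sketch below uses only the four ratio values stated in the abstract. The dictionary layout and the function name `speedup_and_efficiency` are hypothetical, and the single-core run is assumed to be the baseline.

```python
# Relative run times from the abstract, keyed by core count (8-core run = 1).
RELATIVE_TIME = {8: 1.0, 4: 1.16, 2: 1.48, 1: 2.45}


def speedup_and_efficiency(times):
    """Map core count -> (speedup over 1 core, parallel efficiency).

    speedup    = T(1 core) / T(p cores)
    efficiency = speedup / p
    """
    base = times[1]  # assumption: the single-core run is the baseline
    return {
        cores: (base / t, base / t / cores)
        for cores, t in times.items()
    }


if __name__ == "__main__":
    for cores, (s, e) in sorted(speedup_and_efficiency(RELATIVE_TIME).items()):
        print(f"{cores} core(s): speedup {s:.2f}, efficiency {e:.2f}")
```

Under these assumptions, the 8-core run is 2.45 times faster than the single-core run, which corresponds to a parallel efficiency of roughly 0.31 on this workstation.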

