On Solving the Decycling Problem in a Torus Network

2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Antoine Bossard

Modern supercomputers are massively parallel systems: they comprise thousands of computing nodes, and sometimes several million. The torus topology has proven very popular for the interconnect of these high-performance systems. Notably, this network topology is employed by Fugaku, the supercomputer ranked number one in the world as of November 2020. Given the high number of compute nodes in such systems, efficient parallel processing is critical to maximise computing performance. It is well known that cycles harm the parallel processing capacity of systems: for instance, deadlock and starvation are two notorious issues of parallel computing that are directly linked to the presence of cycles. Hence, network decycling is an important issue, and it has been extensively discussed in the literature. We describe in this paper a decycling algorithm for the 3-dimensional k-ary torus topology and compare it with established results, both theoretically and experimentally. (This paper is a revised version of Antoine Bossard (2020).)
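To make the decycling problem concrete, the following is a minimal sketch in Python. It is not the algorithm of the paper, which the abstract does not detail. It builds a 3-dimensional k-ary torus, checks with union-find whether removing a given node set leaves the network acyclic, and computes a naive greedy decycling set as a baseline. The function names (`torus_edges`, `is_decycling_set`, `greedy_decycling_set`) are hypothetical, and the sketch assumes k >= 3 so that the two neighbours of a node along each dimension are distinct.

```python
from itertools import product


def neighbours(u, k):
    """Return the neighbours of node u in the k-ary torus (assumes k >= 3)."""
    return [u[:d] + ((u[d] + step) % k,) + u[d + 1:]
            for d in range(len(u)) for step in (1, -1)]


def torus_edges(k, n=3):
    """Yield each undirected edge of the n-dimensional k-ary torus once."""
    assert k >= 3, "assumption of this sketch: k >= 3"
    for u in product(range(k), repeat=n):
        for d in range(n):
            yield u, u[:d] + ((u[d] + 1) % k,) + u[d + 1:]


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def is_decycling_set(k, removed, n=3):
    """True if deleting the nodes in `removed` leaves the torus acyclic."""
    parent = {u: u for u in product(range(k), repeat=n) if u not in removed}
    for u, v in torus_edges(k, n):
        if u in removed or v in removed:
            continue
        ru, rv = find(parent, u), find(parent, v)
        if ru == rv:          # u and v already connected: this edge closes a cycle
            return False
        parent[ru] = rv
    return True


def greedy_decycling_set(k, n=3):
    """Naive baseline: keep a node only if the kept nodes still induce a forest."""
    assert k >= 3, "assumption of this sketch: k >= 3"
    parent, removed = {}, set()
    for u in product(range(k), repeat=n):
        roots = [find(parent, v) for v in neighbours(u, k) if v in parent]
        if len(set(roots)) < len(roots):
            removed.add(u)    # two kept neighbours share a tree: keeping u would close a cycle
        else:
            parent[u] = u
            for r in roots:
                parent[r] = u
    return removed


if __name__ == "__main__":
    k = 3
    s = greedy_decycling_set(k)
    print(len(s), "of", k ** 3, "nodes removed; acyclic:", is_decycling_set(k, s))
```

The greedy baseline gives no optimality guarantee; it serves only as a reference point against which a dedicated decycling algorithm could be compared. A simple counting argument gives a lower bound for k = 3: the torus has 81 edges, each removed node deletes at most 6 of them, and a forest on m nodes has at most m - 1 edges, so any decycling set contains at least 11 nodes.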

2014 ◽  
Author(s):  
Mehdi Gilaki ◽  
Ilya Avdeev

In this study, we investigated the feasibility of using the commercial explicit finite element code LS-DYNA on a massively parallel supercomputing cluster for accurate modeling of structural impact on battery cells. Physical and numerical lateral impact tests were conducted on cylindrical cells using a flat rigid drop cart in a custom-built drop test apparatus. The main component of a cylindrical cell, the jellyroll, is a layered spiral structure consisting of thin layers of electrodes and separator. Two numerical approaches were considered: (1) a homogenized model of the cell and (2) a heterogeneous (full) 3-D cell model. In the first approach, the jellyroll was treated as a homogeneous material with an effective stress-strain curve obtained through experiments. In the second, individual layers of anode, cathode and separator were modeled explicitly, leading to an extremely complex and computationally expensive finite element model. To overcome the limitations of desktop computers, high-performance computing (HPC) techniques on an HPC cluster were needed to obtain the results of transient simulations in a reasonable solution time. We compared two HPC methods for this model: shared memory parallel processing (SMP) and massively parallel processing (MPP). Both the homogeneous and the heterogeneous models were run in parallel simulations using different numbers of computational nodes and cores, and their performance was compared. This work brings us one step closer to accurate modeling of structural impact on an entire battery pack consisting of thousands of cells.


Author(s):  
Anil Yuksel ◽  
Vic Mahaney ◽  
Chris Marroquin ◽  
Shurong Tian ◽  
Mark Hoffmeyer ◽  
...  

High performance computing (HPC), artificial intelligence (AI) and cognitive systems have initiated a new era of computing. Efficient thermal management of these systems has become vital due to the increasing power density of the electronic components. In 2018, IBM delivered the world's fastest supercomputer, Summit, with 200 petaflops of computing performance with LINPACK benchmarks. The system is both air and water cooled: water is used to cool the components with high power dissipation, namely the IBM POWER9 processors and NVIDIA GPUs. In this paper, we give an overview of the thermal and mechanical design strategies applied to these systems. For the air cooled systems, we discuss the fan and heat sink designs, as well as the preheating effect on the PCI section. The liquid cooled system has a unique coldplate design which cools the processors and the GPUs with water. We examine the water flow path design for the processors and the GPUs and report the thermal performance of the coldplate. An overview of the cooling assemblies in the servers, such as TIMs and air baffles, is also given. Moreover, unit and rack manifolds are investigated; flow and pressure distributions at the node and rack level are provided.


1993 ◽  
Vol 04 (06) ◽  
pp. 1307-1314
Author(s):  
MICHAEL WEBER

The feasibility and constraints of workstation clusters for parallel processing are investigated. Measurements of latency and bandwidth are presented to position clusters relative to massively parallel systems. This makes it possible to identify the kinds of applications that seem suited to running on a cluster.


Author(s):  
Hugo Eduardo Camacho Cruz ◽  
Jesús Humberto Foullon Peña ◽  
Julio Cesar González Mariño ◽  
Ma. de Lourdes Cantú Gallegos
