High Performance Hierarchical Torus Network Under Adverse Traffic Patterns

2012 ◽  
Vol 7 (3) ◽  
Author(s):  
MM Hafizur Rahman ◽  
Yukinori Sato ◽  
Yasushi Inoguchi
2016 ◽  
Vol 2016 ◽  
pp. 1-13 ◽  
Author(s):  
Jeankyung Kim ◽  
Jinsoo Hwang ◽  
Kichang Kim

As internet traffic rapidly increases, fast and accurate network classification is becoming essential for high quality of service control and early detection of network traffic abnormalities. Machine learning techniques based on statistical features of packet flows have recently become popular for network classification partly because of the limitations of traditional port- and payload-based methods. In this paper, we propose a Markov model-based network classification with a Kullback-Leibler divergence criterion. Our study is mainly focused on hard-to-classify (or overlapping) traffic patterns of network applications, which current techniques have difficulty dealing with. The results of simulations conducted using our proposed method indicate that the overall accuracy reaches around 90% with a reasonable group size of n=100.
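
To make the idea of a Markov model with a Kullback-Leibler criterion concrete, the following is a minimal sketch and not the authors' method. It assumes each flow has already been discretized into a sequence of integer states, uses add-one smoothing, and sums row-wise divergences without weighting; the names `transition_matrix`, `kl_rows`, and `classify` are hypothetical.

```python
import math

def transition_matrix(seq, n_states, alpha=1.0):
    """Estimate a first-order Markov transition matrix from a state sequence.

    alpha is an additive smoothing constant (an assumption of this sketch)
    that keeps every probability strictly positive.
    """
    counts = [[alpha] * n_states for _ in range(n_states)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return [[c / sum(row) for c in row] for row in counts]

def kl_rows(p, q):
    """Sum of KL(p_row || q_row) over all rows (unweighted simplification)."""
    return sum(
        pi * math.log(pi / qi)
        for p_row, q_row in zip(p, q)
        for pi, qi in zip(p_row, q_row)
    )

def classify(seq, models, n_states):
    """Return the name of the model closest to the sequence's empirical matrix."""
    p = transition_matrix(seq, n_states)
    return min(models, key=lambda name: kl_rows(p, models[name]))

# Toy training data: two synthetic "applications" over two states.
models = {
    "alternating": transition_matrix([0, 1] * 8, 2),
    "bursty": transition_matrix([0, 0, 0, 0, 1, 1, 1, 1] * 2, 2),
}
print(classify([0, 1, 0, 1, 0, 1, 0, 1], models, 2))
```

A divergence rate between Markov chains would normally weight each row by the stationary distribution; the unweighted sum is kept here only for brevity.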


2013 ◽  
Vol 39 (3) ◽  
pp. 973-983 ◽  
Author(s):  
M.M. Hafizur Rahman ◽  
Yukinori Sato ◽  
Yasushi Inoguchi

2018 ◽  
Vol 7 (2.7) ◽  
pp. 763
Author(s):  
Venkateswara Rao Musala ◽  
T V Rama Krishna

Routing information within a System on Chip (SoC) requires a great deal of wiring, which increases the resistance and capacitance (RC) component of the system. Network on Chip (NoC) is used as the interface to address these problems in SoCs. On-chip interconnection networks in NoC have gained more consideration than dedicated wiring and buses because of advantages such as lower latency, scalability, and high performance. Present routing algorithms in NoC suffer from poor load balancing in interconnection networks under non-uniform traffic conditions, which worsens the NoC trade-offs (latency and throughput). Adaptive routing is a technique to improve the load balance, but previous adaptive routing techniques used uniform traffic patterns to form the routing decisions. This paper proposes a new approach for non-uniform traffic patterns that is channel-state and path specific: Path Aware Routing (PAR XY-X) uses timeout piggybacking for acknowledgement and load shedding to avoid congestion, and chooses an optimistic path calculation unit to connect to the destination node without glue logic decisions in routing. PAR XY-X outperforms normal XY routing by 20% and 33% with respect to average latency and throughput, respectively.
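
For reference, the baseline that PAR XY-X is compared against can be sketched as follows. This is a minimal illustration of plain dimension-order (XY) routing on a 2D mesh, assuming nodes are addressed by integer `(x, y)` coordinates; it does not model PAR XY-X itself, and the function name `xy_route` is hypothetical.

```python
def xy_route(src, dst):
    """Return the list of mesh nodes visited by deterministic XY routing.

    The packet first travels along the X dimension until its column matches
    the destination, then along the Y dimension.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:              # X phase
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:              # Y phase
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path

print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

Because the path depends only on the source and destination, XY routing cannot steer around congested links, which is the load-balancing limitation the abstract refers to.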


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Antoine Bossard

Modern supercomputers are massively parallel systems: they comprise thousands of computing nodes, and sometimes several million. The torus topology has proven very popular for the interconnect of these high-performance systems. Notably, this network topology is employed by the supercomputer ranked number one in the world as of November 2020, the supercomputer Fugaku. Given the high number of compute nodes in such systems, efficient parallel processing is critical to maximise the computing performance. It is well known that cycles harm the parallel processing capacity of systems: for instance, deadlock and starvation are two notorious issues of parallel computing that are directly linked to the presence of cycles. Hence, network decycling is an important issue, and it has been extensively discussed in the literature. We describe in this paper a decycling algorithm for the 3-dimensional k-ary torus topology and compare it with established results, both theoretically and experimentally. (This paper is a revised version of Antoine Bossard (2020)).
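
The notion of a decycling set can be illustrated with a small checker. This sketch is not the paper's algorithm: it only tests whether removing a given set of nodes from a 3-dimensional k-ary torus leaves an acyclic graph, using union-find. It assumes k >= 3 so that the torus is a simple graph, and the names `torus_neighbors` and `is_decycling_set` are hypothetical.

```python
from itertools import product

def torus_neighbors(node, k):
    """Yield the six neighbours of a node in the 3D k-ary torus (k >= 3)."""
    for dim in range(3):
        for step in (1, -1):
            nb = list(node)
            nb[dim] = (nb[dim] + step) % k
            yield tuple(nb)

def is_decycling_set(removed, k):
    """True if deleting `removed` from the torus leaves a forest."""
    removed = set(removed)
    parent = {n: n for n in product(range(k), repeat=3) if n not in removed}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    seen = set()
    for u in parent:
        for v in torus_neighbors(u, k):
            if v in removed:
                continue
            edge = frozenset((u, v))
            if edge in seen:       # each undirected edge is processed once
                continue
            seen.add(edge)
            root_u, root_v = find(u), find(v)
            if root_u == root_v:   # edge joins two already-connected nodes
                return False
            parent[root_u] = root_v
    return True

all_nodes = set(product(range(3), repeat=3))
ring = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}   # wraps around into a 3-cycle
print(is_decycling_set(all_nodes - ring, 3))
print(is_decycling_set(all_nodes - {(0, 0, 0), (1, 0, 0)}, 3))
```

The first call leaves a wrap-around ring intact, so the remaining graph still contains a cycle; the second leaves a single edge, which is acyclic.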


2020 ◽  
Vol 20 (6) ◽  
pp. 94-104
Author(s):  
Ivan Lirkov

Practical realizations of 3D forward/inverse separable discrete transforms, such as Fourier transform, cosine/sine transform, etc. are frequently the principal limiters that prevent many practical applications from scaling to a large number of processors. Existing approaches, which are based primarily on 1D or 2D data decompositions, prevent the 3D transforms from effectively scaling to the maximum (possible/available) number of computer nodes. A highly scalable approach to realize forward/inverse 3D transforms has been proposed. It is based on a 3D decomposition of data and geared towards a torus network of computer nodes. The proposed algorithm requires compute-and-roll time-steps, where each step consists of an execution of multiple GEMM operations and concurrent movement of cubical data blocks between nearest neighbors. The aim of this paper is to present an experimental performance study of an implementation on high performance computer architecture.
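
As background on why such transforms reduce to matrix multiplications, a separable 3D transform can be written as three successive one-dimensional stages. The notation below is an assumption of this sketch, not taken from the paper: X is the N_1 x N_2 x N_3 input, C_d is the coefficient matrix for dimension d (for the DFT, C_d(n,k) = e^{-2\pi i nk/N_d}), and U, V are intermediate arrays. The block decomposition and data movement of the proposed algorithm are not shown.

```latex
\begin{align*}
Y(k_1,k_2,k_3) &= \sum_{n_1=0}^{N_1-1}\sum_{n_2=0}^{N_2-1}\sum_{n_3=0}^{N_3-1}
  X(n_1,n_2,n_3)\, C_1(n_1,k_1)\, C_2(n_2,k_2)\, C_3(n_3,k_3) \\[4pt]
U(k_1,n_2,n_3) &= \sum_{n_1=0}^{N_1-1} C_1(n_1,k_1)\, X(n_1,n_2,n_3) \\
V(k_1,k_2,n_3) &= \sum_{n_2=0}^{N_2-1} C_2(n_2,k_2)\, U(k_1,n_2,n_3) \\
Y(k_1,k_2,k_3) &= \sum_{n_3=0}^{N_3-1} C_3(n_3,k_3)\, V(k_1,k_2,n_3)
\end{align*}
```

Each stage is a set of independent matrix-matrix products over slices of the data, which is why the work maps naturally onto GEMM operations.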

