Energy characteristic of a processor allocator and a network-on-chip

Energy consumption in a Chip MultiProcessor (CMP) is one of the most important costs. It is related to design aspects such as thermal and power constraints. Besides efficient on-chip processing elements, a well-designed Processor Allocator (PA) and a Network-on-Chip (NoC) are also important factors in the energy budget of novel CMPs. In this paper, the authors propose an energy model for NoCs with 2D-mesh and 2D-torus topologies. All important NoC architectures are described and discussed. Energy estimation is presented for PAs. The estimation is based on synthesis results for PAs targeting FPGA. The PAs are driven by allocation algorithms that are studied as well. The proposed energy model is employed in a simulation environment, where exhaustive experiments are performed. Simulation results show that a PA with an IFF allocation algorithm for mesh systems and a torus-based NoC with express-virtual-channel flow control are very energy efficient. The combination of these two solutions is a clear choice for modern CMPs.


A Transaction Level Modeling of Network-on-Chip Architecture for Energy Estimation

2007 IEEE International Conference on Research, Innovation and Vision for the Future ◽

10.1109/rivf.2007.369136 ◽

2007 ◽

Cited By ~ 1

Author(s):

Anh-Vu Dinh-Duc ◽

Pascal Vivet ◽

Alain Clouard

Keyword(s):

Network On Chip ◽

Energy Estimation ◽

Transaction Level Modeling ◽

Transaction Level ◽

On Chip


An Asynchronous 2D-Torus Network-on-Chip Using Adaptive Routing Algorithm

Big Data Computing and Communications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-42553-5_29 ◽

2016 ◽

pp. 342-351

Author(s):

Zhenni Li ◽

Jingjiao Li ◽

Aiyun Yan ◽

Lan Yao

Keyword(s):

Routing Algorithm ◽

Adaptive Routing ◽

Network On Chip ◽

On Chip ◽

Torus Network ◽

2D Torus


Reimagining the Role of Network-on-Chip Resources Toward Improving Chip Multiprocessor Performance

10.17918/00000254 ◽

2020 ◽

Author(s):

Karthik Sangaiah

Keyword(s):

Network On Chip ◽

Chip Multiprocessor ◽

On Chip


Towards a Scalable Software Defined Network-on-Chip for Next Generation Cloud

Sensors ◽

10.3390/s18072330 ◽

2018 ◽

Vol 18 (7) ◽

pp. 2330 ◽

Cited By ~ 8

Author(s):

Alberto Scionti ◽

Somnath Mazumdar ◽

Antoni Portero

Keyword(s):

Modular Design ◽

Rapid Evolution ◽

Network On Chip ◽

Data Driven ◽

Software Defined Network ◽

Processing Elements ◽

Mesh Topology ◽

On Chip ◽

Execution Models ◽

Scalable Software

The rapid evolution of Cloud-based services and the growing interest in deep learning (DL)-based applications are putting increasing pressure on hyperscalers and general purpose hardware designers to provide more efficient and scalable systems. Cloud-based infrastructures must consist of more energy efficient components. The evolution must take place from the core of the infrastructure (i.e., data centers (DCs)) to the edges (Edge computing) to adequately support new/future applications. Adaptability/elasticity is one of the features required to increase the performance-to-power ratios. Hardware-based mechanisms have been proposed to support system reconfiguration mostly at the processing elements level, while fewer studies have been carried out regarding scalable, modular interconnected sub-systems. In this paper, we propose a scalable Software Defined Network-on-Chip (SDNoC)-based architecture. Our solution can easily be adapted to support devices ranging from low-power computing nodes placed at the edge of the Cloud to high-performance many-core processors in the Cloud DCs, by leveraging a modular design approach. The proposed design merges the benefits of hierarchical network-on-chip (NoC) topologies (via fusing the ring and the 2D-mesh topology) with those brought by dynamic reconfiguration (i.e., adaptation). Our proposed interconnect allows for creating different types of virtualised topologies aiming at serving different communication requirements and thus providing better resource partitioning (virtual tiles) for concurrent tasks. To further allow the software layer to control and monitor the NoC subsystem, a few customised instructions supporting a data-driven program execution model (PXM) are added to the processing element's instruction set architecture (ISA). In general, the data-driven programming and execution models are suitable for supporting DL applications.
We also introduce a mechanism to map a high-level programming language embedding concurrent execution models into the basic functionalities offered by our SDNoC, easing the programming of the proposed system. In the reported experiments, we compared our lightweight reconfigurable architecture to a conventional flattened 2D-mesh interconnection subsystem. Results show that our design provides a 9.5% increase in data traffic throughput and a 2.2× reduction in average packet latency, compared to the flattened 2D-mesh topology connecting the same number of processing elements (PEs) (up to 1024 cores). Similarly, power and resource (on FPGA devices) consumption is also low, confirming good scalability of the proposed architecture.


Migrating single FPGA chip multiprocessor with network on chip to 65nm and 45nm ASIC

ICM 2011 Proceeding ◽

10.1109/icm.2011.6177399 ◽

2011 ◽

Author(s):

O. Hammami ◽

Z. Wang ◽

Dominique Houzet

Keyword(s):

Network On Chip ◽

Chip Multiprocessor ◽

Fpga Chip ◽

On Chip


Evaluation of Over Looped 2d Mesh Topology for Network on Chip

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i7703.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 502-504

Keyword(s):

Network On Chip ◽

The Other ◽

Hop Count ◽

Mesh Topology ◽

Homogeneous Systems ◽

Outer Corner ◽

On Chip ◽

2D Torus ◽

Opposite Corner ◽

Structural Arrangements

The arrangement of nodes in a network determines the complexity of the network. To reduce this complexity, the structural arrangement of the nodes must be chosen with care. The mesh topology is more attractive than other traditional topologies. The Over-Looped 2D Mesh Topology is proposed so that opposite corner nodes communicate in fewer hops and traffic avoids the centre of the network. For homogeneous systems, the proposed topology can be deployed without altering the composition of any switch component. Routing flits through the outer corner nodes with the help of looping nodes shortens the journey from source to destination. For networks smaller than 4x4, looping is less effective. Looping can be applied with an odd or even number of columns and rows, and the number of columns need not equal the number of rows. Any leftover nodes are looped accordingly. Compared to the 2D mesh, the Over-Looped 2D Mesh Topology reduces the hop count by 25%. The wiring segmentation and wiring length of the system are 10% more than in the 2D mesh and 20% less than in the 2D torus.
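
The hop-count comparison in this abstract rests on how far apart two nodes are in each baseline topology. The following is a minimal sketch of the two baselines only (plain 2D mesh and 2D torus), assuming minimal routing and one hop per link; it does not model the Over-Looped topology itself, and the function names are illustrative rather than taken from the paper.

```python
from itertools import product

def mesh_hops(src, dst):
    """Minimal hop count between two nodes of a 2D mesh (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

def torus_hops(src, dst, cols, rows):
    """Minimal hop count in a 2D torus, where every row and column wraps around."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return min(dx, cols - dx) + min(dy, rows - dy)

def average_hops(cols, rows, hops):
    """Average hop count over all ordered pairs of distinct nodes."""
    nodes = list(product(range(cols), range(rows)))
    total = sum(hops(s, d) for s in nodes for d in nodes if s != d)
    return total / (len(nodes) * (len(nodes) - 1))

if __name__ == "__main__":
    cols, rows = 4, 4  # assumed network size for illustration
    corner_a, corner_b = (0, 0), (cols - 1, rows - 1)
    print("mesh, opposite corners: ", mesh_hops(corner_a, corner_b))
    print("torus, opposite corners:", torus_hops(corner_a, corner_b, cols, rows))
    print("mesh, average: ", average_hops(cols, rows, mesh_hops))
    print("torus, average:", average_hops(
        cols, rows, lambda s, d: torus_hops(s, d, cols, rows)))
```

In the plain mesh the opposite-corner pair is the worst case, which is the pair the proposed looping links are meant to shorten.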


PARALLEL GAUSS-SEIDEL ON A TORUS NETWORK-ON-CHIP ARCHITECTURE

Journal of Interconnection Networks ◽

10.1142/s0219265912500016 ◽

2012 ◽

Vol 13 (01n02) ◽

pp. 1250001 ◽

Cited By ~ 1

Author(s):

MOHAMMAD H. AL-TOWAIQ ◽

KHALED DAY

Keyword(s):

Linear Equations ◽

Network On Chip ◽

Multicore Architectures ◽

Processing Elements ◽

Recent Developments ◽

Systems Of Linear Equations ◽

Large Systems ◽

On Chip ◽

Large Linear Systems ◽

Torus Network

Network-on-chip multicore architectures with a large number of processing elements are becoming a reality with the recent developments in technology. In these modern systems the processing elements are interconnected with regular network-on-chip (NoC) topologies such as meshes and trees. In this paper we propose a parallel Gauss-Seidel (GS) iterative algorithm for solving large systems of linear equations on a torus NoC architecture. The proposed parallel algorithm has $O(Nn^2/k^2)$ time complexity for solving a system with a matrix of order $n$ on a $k \times k$ torus NoC architecture with $N$ iterations, assuming $n$ and $N$ are large compared to $k$ (i.e., for large linear systems that require a large number of iterations). We show that under these conditions the proposed parallel GS algorithm has near optimal speedup.
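
For reference, the sequential Gauss-Seidel iteration that the paper parallelizes can be sketched as follows. This is a plain textbook version, not the authors' torus algorithm; the example matrix is an assumption chosen to be diagonally dominant so that the iteration converges. Each sweep over a dense matrix of order n costs O(n^2) operations, so N sweeps cost O(N n^2), which is the baseline against which a k × k array of processing elements would give the stated O(N n^2 / k^2).

```python
def gauss_seidel(A, b, iterations):
    """Run a fixed number of Gauss-Seidel sweeps on the system A x = b.

    A is a dense square matrix (list of rows) with non-zero diagonal entries.
    Each updated x[i] is used immediately within the same sweep.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Sum the off-diagonal terms using the latest available values of x.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

if __name__ == "__main__":
    # Small diagonally dominant system whose exact solution is x = [2, 1].
    A = [[4.0, 1.0],
         [2.0, 5.0]]
    b = [9.0, 9.0]
    print(gauss_seidel(A, b, iterations=25))
```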


Architecture of the On-Chip Processing Elements

Reconfigurable Computing Systems Engineering ◽

10.1201/9781315374697-3 ◽

2017 ◽

pp. 71-95

Author(s):

Lev Kirischian

Keyword(s):

Processing Elements ◽

Chip Processing ◽

On Chip


EFFICIENT SCHEME FOR CONGESTION CONTROL IN NETWORK-ON-CHIP WITH QoS CONSIDERATION

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126614501205 ◽

2014 ◽

Vol 23 (09) ◽

pp. 1450120 ◽

Cited By ~ 1

Author(s):

ADEL SOUDANI ◽

AHMED ALDAMMAS ◽

ABDULLAH AL-DHELAAN

Keyword(s):

Network On Chip ◽

Chip Multiprocessor ◽

Multiprocessor Systems ◽

Signaling Mechanism ◽

Qos Management ◽

Efficient Scheme ◽

High Bandwidth ◽

On Chip ◽

Internal Architecture

Embedded distributed multimedia applications that use on-chip networks for communication and message exchange require specific, enhanced quality of service (QoS) management. To reach the desired performance at the application level, the network-on-chip (NoC) router should implement a per-flit handling strategy with wide granularity. This requires an enhanced internal architecture that, on the one hand, manages flits according to a service classification and, on the other hand, improves the routing process. In this context, this paper proposes a new mechanism for QoS management in NoCs. The mechanism is based on a central memory in which flits are enqueued according to their class of service. This scheme enables an optimal flit scheduling phase and makes it easier to drop low-importance flits when the router shows symptoms of congestion. The paper also presents a protocol structure that fits this architecture and introduces a signaling mechanism that makes QoS management through the proposed architecture efficient. The paper studies the performance of the circuit and its ability to achieve QoS with low power processing and high bandwidth in on-chip multiprocessor systems.
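
The idea of a central memory with per-class queues and congestion-time dropping can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not the paper's design: the class name `CentralFlitMemory`, strict-priority scheduling (class 0 is most important), a fixed capacity as the congestion signal, and dropping the newest flit of the least important class.

```python
from collections import deque

class CentralFlitMemory:
    """Hypothetical shared flit store with one FIFO queue per service class."""

    def __init__(self, num_classes, capacity):
        self.queues = [deque() for _ in range(num_classes)]  # index 0 = most important
        self.capacity = capacity

    def occupancy(self):
        return sum(len(q) for q in self.queues)

    def enqueue(self, flit, service_class):
        """Store a flit; when full, evict a flit from a less important class.

        Returns False if the memory is full and no less important flit exists,
        in which case the incoming flit is rejected.
        """
        if self.occupancy() >= self.capacity:
            for cls in range(len(self.queues) - 1, service_class, -1):
                if self.queues[cls]:
                    self.queues[cls].pop()  # drop the newest low-importance flit
                    break
            else:
                return False
        self.queues[service_class].append(flit)
        return True

    def dequeue(self):
        """Schedule the oldest flit of the most important non-empty class."""
        for q in self.queues:
            if q:
                return q.popleft()
        return None

if __name__ == "__main__":
    mem = CentralFlitMemory(num_classes=3, capacity=2)
    mem.enqueue("best-effort", 2)
    mem.enqueue("video", 1)
    mem.enqueue("control", 0)  # memory is full, so "best-effort" is dropped
    print(mem.dequeue(), mem.dequeue(), mem.dequeue())
```

Strict priority is the simplest policy to show; a real router could equally use weighted scheduling between classes.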


Design and Simulation of Multicast Communication Model Based on 2D Torus Network on Chip

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.630.276 ◽

2012 ◽

Vol 630 ◽

pp. 276-282

Author(s):

Hao Wang ◽

Ling Wu

Keyword(s):

Model Simulation ◽

Network On Chip ◽

Transmission Delay ◽

Communication Model ◽

Multicast Communication ◽

Routing Strategy ◽

Lower Transmission ◽

Formalized Description ◽

On Chip ◽

2D Torus

In order to avoid deadlock and high transmission delay in network-on-chip multicast communication, this paper puts forward a multicast communication model. First, the authors give a formalized description of the multicast communication model. Second, they illustrate the deadlock caused by circular waiting. To solve this problem, a NoC multicast communication model based on the 2D Torus topology is proposed. In addition, the paper presents an example to validate its correctness. Finally, the model is simulated and applied to a NoC with a 2D Torus topology using OPNET Modeler. The test results show that this multicast communication model has lower transmission delay and higher throughput than the unicast routing strategy using XY routing.
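
The baseline named in this abstract, XY routing, is dimension-ordered: a packet first travels along the X dimension until its column matches the destination, then along Y. The sketch below shows this for a plain 2D grid without the torus wrap-around links, which is a simplifying assumption; the `unicast_multicast_hops` helper is likewise hypothetical and only illustrates serving a multicast as one unicast per destination.

```python
def xy_route(src, dst):
    """Return the list of nodes visited from src to dst using XY routing."""
    x, y = src
    path = [(x, y)]
    step = 1 if dst[0] > x else -1
    while x != dst[0]:  # first resolve the X offset
        x += step
        path.append((x, y))
    step = 1 if dst[1] > y else -1
    while y != dst[1]:  # then resolve the Y offset
        y += step
        path.append((x, y))
    return path

def unicast_multicast_hops(src, destinations):
    """Total link traversals when a multicast is sent as separate unicasts."""
    return sum(len(xy_route(src, d)) - 1 for d in destinations)

if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
    print(unicast_multicast_hops((0, 0), [(2, 1), (2, 2), (1, 2)]))
```

Sending one copy per destination repeats the shared part of each path, which is the overhead a dedicated multicast model aims to avoid.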
