Optimized Communication Architecture of MPSoCs with a Hardware Scheduler

Author(s):  
Diandian Zhang ◽  
Han Zhang ◽  
Jeronimo Castrillon ◽  
Torsten Kempf ◽  
Bart Vanthournout ◽  
...  

Efficient runtime resource management in multi-processor systems-on-chip (MPSoCs) for achieving high performance and low energy consumption is one of the key challenges for system designers. OSIP, an operating system application-specific instruction-set processor, together with its well-defined programming model, provides a promising solution. It delivers high computational performance to deal with dynamic task scheduling and mapping. Being programmable, it can easily be adapted to different systems. However, the distributed computation among the different processing elements introduces complexity to the communication architecture, which tends to become the bottleneck of such systems. In this work, the authors highlight the vital importance of the communication architecture for OSIP-based systems and optimize the communication architecture. Furthermore, the effects of OSIP and the communication architecture are investigated jointly from the system point of view, based on a broad case study for a real life application (H.264) and a synthetic benchmark application.

Author(s):  
Diandian Zhang ◽  
Han Zhang ◽  
Jeronimo Castrillon ◽  
Torsten Kempf ◽  
Bart Vanthournout ◽  
...  

Efficient runtime resource management in multi-processor systems-on-chip (MPSoCs) for achieving high performance and low energy consumption is one of the key challenges for system designers. OSIP, an operating system application-specific instruction-set processor, together with its well-defined programming model, provides a promising solution. It delivers high computational performance to deal with dynamic task scheduling and mapping. Being programmable, it can easily be adapted to different systems. However, the distributed computation among the different processing elements introduces complexity to the communication architecture, which tends to become the bottleneck of such systems. In this work, the authors highlight the vital importance of the communication architecture for OSIP-based systems and optimize the communication architecture. Furthermore, the effects of OSIP and the communication architecture are investigated jointly from the system point of view, based on a broad case study for a real life application (H.264) and a synthetic benchmark application.


2015 ◽  
Vol 2015 ◽  
pp. 1-12
Author(s):  
Mahendra Vucha ◽  
Arvind Rajawat

Modern embedded systems are being modeled as Reconfigurable High Speed Computing System (RHSCS) where Reconfigurable Hardware, that is, Field Programmable Gate Array (FPGA), and softcore processors configured on FPGA act as computing elements. As system complexity increases, efficient task distribution methodologies are essential to obtain high performance. A dynamic task distribution methodology based on Minimum Laxity First (MLF) policy (DTD-MLF) distributes the tasks of an application dynamically onto RHSCS and utilizes available RHSCS resources effectively. The DTD-MLF methodology takes the advantage of runtime design parameters of an application represented as DAG and considers the attributes of tasks in DAG and computing resources to distribute the tasks of an application onto RHSCS. In this paper, we have described the DTD-MLF model and verified its effectiveness by distributing some of real life benchmark applications onto RHSCS configured on Virtex-5 FPGA device. Some benchmark applications are represented as DAG and are distributed to the resources of RHSCS based on DTD-MLF model. The performance of the MLF based dynamic task distribution methodology is compared with static task distribution methodology. The comparison shows that the dynamic task distribution model with MLF criteria outperforms the static task distribution techniques in terms of schedule length and effective utilization of available RHSCS resources.


2020 ◽  
Vol 96 (3s) ◽  
pp. 585-588
Author(s):  
С.Е. Фролова ◽  
Е.С. Янакова

Предлагаются методы построения платформ прототипирования высокопроизводительных систем на кристалле для задач искусственного интеллекта. Изложены требования к платформам подобного класса и принципы изменения проекта СнК для имплементации в прототип. Рассматриваются методы отладки проектов на платформе прототипирования. Приведены результаты работ алгоритмов компьютерного зрения с использованием нейросетевых технологий на FPGA-прототипе семантических ядер ELcore. Methods have been proposed for building prototyping platforms for high-performance systems-on-chip for artificial intelligence tasks. The requirements for platforms of this class and the principles for changing the design of the SoC for implementation in the prototype have been described as well as methods of debugging projects on the prototyping platform. The results of the work of computer vision algorithms using neural network technologies on the FPGA prototype of the ELcore semantic cores have been presented.


2020 ◽  
Vol 10 (4) ◽  
pp. 37
Author(s):  
Habiba Lahdhiri ◽  
Jordane Lorandel ◽  
Salvatore Monteleone ◽  
Emmanuelle Bourdel ◽  
Maurizio Palesi

The Network-on-chip (NoC) paradigm has been proposed as a promising solution to enable the handling of a high degree of integration in multi-/many-core architectures. Despite their advantages, wired NoC infrastructures are facing several performance issues regarding multi-hop long-distance communications. RF-NoC is an attractive solution offering high performance and multicast/broadcast capabilities. However, managing RF links is a critical aspect that relies on both application-dependent and architectural parameters. This paper proposes a design space exploration framework for OFDMA-based RF-NoC architecture, which takes advantage of both real application benchmarks simulated using Sniper and RF-NoC architecture modeled using Noxim. We adopted the proposed framework to finely configure a routing algorithm, working with real traffic, achieving up to 45% of delay reduction, compared to a wired NoC setup in similar conditions.


2008 ◽  
Vol 3 (1) ◽  
pp. 23-31
Author(s):  
Everton Carara ◽  
Ney Calazans ◽  
Fernando Moraes

For almost a decade now, Network on Chip (NoC) concepts have evolved to provide an interesting alternative to more traditional intrachip communication architectures (e.g. shared busses) for the design of complex Systems on Chip (SoCs). A considerable number of NoC proposals are available, focusing on different sets of optimization aspects, related to specific classes of applications. Each such application employs a NoC as part of its underlying implementation infrastructure. Many of the mentioned optimization aspects target results such as Quality of Service (QoS) achievement and/or power consumption reduction. On the other hand, the use of NoCs brings about the solution of new design problems, such to the choice of synchronization method to employ between NoC routers and application modules mapping. Although the availability of NoC structures is already rather ample, some design choices are at base of many, if not most, NoC proposals. These include the use of wormhole packet switching and virtual channels. This work pledges against this practice. It discusses trade-offs of using circuit or packet switching, arguing in favor the use of the former with fixed size packets (cells). Quantitative data supports the argumentation. Also, the work proposes and justifies replacing the use of virtual channels by replicated channels, based on the abundance of wires in current and expected deep sub-micron technologies. Finally, the work proposes a transmission method coupling the use of session layer structures to circuit switching to better support application implementation. The main reported result is the availability of a router with reduced latency and area, a communication architecture adapted for high-performance applications.


Author(s):  
Diandian Zhang ◽  
Jeronimo Castrillon ◽  
Stefan Schürmans ◽  
Gerd Ascheid ◽  
Rainer Leupers ◽  
...  

Efficient runtime resource management in heterogeneous Multi-Processor Systems-on-Chip (MPSoCs) for achieving high performance and energy efficiency is one key challenge for system designers. In the past years, several IP blocks have been proposed that implement system-wide runtime task and resource management. As the processor count continues to increase, it is important to analyze the scalability of runtime managers at the system-level for different communication architectures. In this chapter, the authors analyze the scalability of an Application-Specific Instruction-Set Processor (ASIP) for runtime management called OSIP on two platform paradigms: shared and distributed memory. For the former, a generic bus is used as interconnect. For distributed memory, a Network-on-Chip (NoC) is used. The effects of OSIP and the communication architecture are jointly investigated from the system point of view, based on a broad case study with real applications (an H.264 video decoder and a digital receiver for wireless communications) and a synthetic benchmark application.


Author(s):  
K. Suresh Kumar ◽  
S. Anitha ◽  
M. Gayathri

In this model a runtime cache data mapping is discussed for 3-D stacked L2 caches to minimize the overall energy of 3-D chip multiprocessors (CMPs). The suggested method considers both temperature distribution and memory traffic of 3-D CMPs. Experimental result shows energy reduction achieving up to 22.88% compared to an existing solution which considers only the temperature distribution.  New tendencies envisage 3D Multi-Processor System-On-Chip (MPSoC) design as a promising solution to keep increasing the performance of the next-generation high performance computing (HPC) systems. However, as the power density of HPC systems increases with the arrival of 3D MPSoCs with energy reduction achieving up to 19.55% by supplying electrical power to the computing equipment and constantly removing the generated heat is rapidly becoming the dominant cost in any HPC facility.


Sign in / Sign up

Export Citation Format

Share Document