PERFORMANCE ANALYSIS OF A JPEG ENCODER MAPPED ONTO A VIRTUAL MPSoC-NoC ARCHITECTURE USING TLM 2.0.1

Networks on Chip (NoCs) are commonly used to integrate complex embedded systems and multiprocessor platforms due to their scalability and versatility. Functional-level modeling tools use SystemC to perform hardware–software co-design and error correction concurrently, thus reducing time to market. This work analyzes a JPEG encoding algorithm mapped onto a configurable M × N mesh/torus NoC platform described in SystemC with the transaction level modeling (TLM) standard; timing constraints for both the router and the network interface controller are assigned according to a hardware description language (HDL) model written for this purpose. Processing nodes are also described as SystemC threads, and their computation delays are assigned depending on the number and cost of the operations they perform. The programming model employed is message passing. We start by describing and profiling the JPEG algorithm as a task graph; then, four partitioning proposals are mapped onto three NoCs of different sizes. Our analysis comprises changes in topology, virtual channel depth, routing algorithms, network speed and task-node assignments. Through several high-level simulations we evaluate the impact of each parameter and show that, for the proposed model, most improvements come from the algorithm partitioning.
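
As a rough illustration of why topology is one of the parameters swept in such a study, the sketch below compares minimal hop counts in a 2D mesh and a 2D torus. It is not taken from the paper: the function names, the `(x, y)` node coordinates, and the `cols`/`rows` parameters are assumptions made for this example, and it counts hops only, ignoring router and network interface delays.

```python
def mesh_hops(src, dst):
    """Minimal hop count between two (x, y) nodes in a 2D mesh."""
    (sx, sy), (dx, dy) = src, dst
    return abs(sx - dx) + abs(sy - dy)


def torus_hops(src, dst, cols, rows):
    """Minimal hop count in a 2D torus, where each row and column wraps around."""
    (sx, sy), (dx, dy) = src, dst
    x_dist = abs(sx - dx)
    y_dist = abs(sy - dy)
    # In each dimension, take the shorter way around the ring.
    return min(x_dist, cols - x_dist) + min(y_dist, rows - y_dist)


def average_hops(cols, rows, torus=False):
    """Average minimal hop count over all ordered pairs of distinct nodes."""
    nodes = [(x, y) for x in range(cols) for y in range(rows)]
    total = 0
    pairs = 0
    for s in nodes:
        for d in nodes:
            if s == d:
                continue
            total += torus_hops(s, d, cols, rows) if torus else mesh_hops(s, d)
            pairs += 1
    return total / pairs
```

For example, `mesh_hops((0, 0), (3, 3))` is 6, while `torus_hops((0, 0), (3, 3), 4, 4)` is 2 because the wrap-around links shorten both dimensions.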


Communication-centric high level synthesis metrics for low vertical channel density 3-dimensional Networks-on-Chip

7th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC) ◽

10.1109/recosoc.2012.6322897 ◽

2012 ◽

Cited By ~ 2

Author(s):

Haoyuan Ying ◽

Thomas Hollstein ◽

Klaus Hofmann

Keyword(s):

Vertical Channel ◽

High Level Synthesis ◽

Networks On Chip ◽

3 Dimensional ◽

Channel Density ◽

On Chip ◽

High Level


A Cost-Driven, High-Level Optimization of OSV Operations in the Flemish Pass

Volume 1: Offshore Technology ◽

10.1115/omae2018-77640 ◽

2018 ◽

Author(s):

Philippe Gauthier ◽

David Molyneux

Keyword(s):

Programming Model ◽

Oil Extraction ◽

Non Linear Programming ◽

Non Linear ◽

High Level ◽

Non Linear System ◽

Offshore Oil ◽

The Impact ◽

Pareto Frontiers ◽

Made In

This paper presents a cost-driven, high-level optimization of Offshore Supply Vessel (OSV) operations in the Flemish Pass sector. This is an area located in the offshore waters of Newfoundland where significant oil discoveries were made in recent years, but where oil extraction will pose logistical challenges due to the increased distance from shore bases. In the first part of this paper, a simple non-linear programming model is used to minimize the monthly costs of supplying a hypothetical offshore oil installation located in the Flemish Pass and to assess whether hypothetical fast supply vessels make economic sense. The second part of this paper applies Pareto frontiers to the non-linear system, both to evaluate the impact of schedule slack on costs and to examine winter operations in the Flemish Pass area.
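
To make the Pareto frontier idea concrete, here is a minimal sketch that filters a set of candidate schedules down to the non-dominated ones. It is an illustration only, not the authors' model: the `(cost, slack)` tuple layout, the function name, and the choice to minimize cost while maximizing slack are assumptions for this example.

```python
def pareto_front(points):
    """Return the (cost, slack) points that no other point dominates.

    Assumption for this sketch: lower cost is better, higher slack is better.
    A point is dominated if another point is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for cost, slack in points:
        dominated = any(
            (c <= cost and s >= slack) and (c < cost or s > slack)
            for c, s in points
        )
        if not dominated:
            front.append((cost, slack))
    return sorted(front)
```

With hypothetical inputs `[(10, 1), (12, 3), (11, 1), (15, 3), (14, 4)]`, the points `(11, 1)` and `(15, 3)` are dropped because a cheaper schedule offers at least the same slack.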


HLS Based Approach to Develop an Implementable HDR Algorithm

Electronics ◽

10.3390/electronics7110332 ◽

2018 ◽

Vol 7 (11) ◽

pp. 332 ◽

Cited By ~ 1

Author(s):

Rappy Saha ◽

Partha Banik ◽

Ki-Doo Kim

Keyword(s):

Dynamic Range ◽

Signal To Noise Ratio ◽

Simple Algorithm ◽

Structural Similarity ◽

High Dynamic Range ◽

Field Programmable ◽

Hardware Description ◽

On Chip ◽

High Level ◽

Removal Technique

The hardware suitability of an algorithm can only be verified when the algorithm is actually implemented in hardware. By hardware, we mean a system on chip (SoC) in which both a processor and a field-programmable gate array (FPGA) are available. Our goal is to develop a simple algorithm that can be implemented in hardware, where high-level synthesis (HLS) reduces the tedious work of manual hardware description language (HDL) optimization. We propose an algorithm that obtains a high dynamic range (HDR) image from a single low dynamic range (LDR) image, using a highlight removal technique for this purpose. Our target is a simple, parameter-free algorithm that can be easily implemented in hardware; to this end, we use statistical information from the image. While the software implementation is verified against the state of the art, the HLS approach confirms that the proposed algorithm is implementable in hardware. The performance of the algorithm is measured using four no-reference metrics. According to the structural similarity (SSIM) index and the peak signal-to-noise ratio (PSNR), the hardware-simulated output matches the software-simulated output with at least 98.87 percent SSIM and 39.90 dB PSNR. Our approach is novel and effective for developing a hardware-implementable HDR algorithm from a single LDR image using an HLS tool.
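
For readers unfamiliar with the PSNR figure quoted above, the following minimal sketch shows the standard definition, PSNR = 10·log10(MAX² / MSE). It is not the authors' code: the function name, the flat pixel lists, and the default `max_value=255` (8-bit pixels) are assumptions for this example.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    max_value is the largest possible pixel value (255 is assumed here,
    i.e. 8-bit images).
    """
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)
```

Under the 8-bit assumption, a PSNR of 39.90 dB would correspond to a mean squared error of roughly 6.65 between the hardware and software outputs.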


High-Level Programming of Dynamically Reconfigurable NoC-Based Heterogeneous Multicore SoCs

Dynamic Reconfigurable Network-on-Chip Design ◽

10.4018/978-1-61520-807-4.ch008 ◽

2010 ◽

pp. 186-219

Author(s):

Wim Vanderbauwhede

Keyword(s):

Cmos Technology ◽

Design Reuse ◽

Full Potential ◽

Networks On Chip ◽

Heterogeneous Multicore ◽

Dynamically Reconfigurable ◽

Ip Cores ◽

And Performance ◽

On Chip ◽

High Level

With the increase in System-on-Chip (SoC) complexity and CMOS technology capabilities, the SoC design community has recently observed a convergence of a number of critical trends, all of them aimed at addressing the design gap: the advent of heterogeneous multicore SoCs and Networks-on-Chip and the recognition of the need for design reuse through Intellectual Property (IP) cores, for dynamic reconfigurability and for high abstraction-level design. In this chapter, we present a solution for High-level Programming of Dynamically Reconfigurable NoC-based Heterogeneous Multicore SoCs. Our solution, the Gannet framework, allows IP core-based Heterogeneous Multicore SoCs to be programmed using a high-level language whilst preserving the full potential for parallelism and dynamic reconfigurability inherent in such a system. The required hardware infrastructure is small and low-latency, thus adding full dynamic reconfiguration capabilities with a small overhead both in area and performance.


A Design Space Exploration Framework for ANN-Based Fault Detection in Hardware Systems

Journal of Electrical and Computer Engineering ◽

10.1155/2017/9361493 ◽

2017 ◽

Vol 2017 ◽

pp. 1-12

Author(s):

Andreas G. Savva ◽

Theocharis Theocharides ◽

Chrysostomos Nicopoulos

Keyword(s):

Fault Detection ◽

Design Space Exploration ◽

Design Space ◽

Generalization Capability ◽

Networks On Chip ◽

Traffic Models ◽

Artificial Neural Network Ann ◽

On Chip ◽

High Level

This work presents a design exploration framework for developing a high-level Artificial Neural Network (ANN) for fault detection in hardware systems. ANNs can be used for fault detection because they have excellent characteristics such as generalization capability, robustness, and fault tolerance. Designing an ANN for fault detection involves several parameters. In this work, these parameters are presented and analyzed through simulations. Moreover, after the ANN is developed, it is evaluated in a case study scenario based on Networks on Chip, where it is used to detect interrouter link faults. Simulation results with various synthetic traffic models show that the proposed work can detect up to 96–99% of interrouter link faults with a delay of less than 60 cycles. In addition, the ANN is kept relatively small, so it can be implemented in hardware easily. Synthesis results indicate an estimated power consumption of 0.0523 mW per neuron for the implemented ANN when computing a complete cycle.


Evaluating the Impact of Data Encoding Techniques on the Power Consumption in Networks-on-Chip

IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures (ISVLSI'06) ◽

10.1109/isvlsi.2006.42 ◽

2006 ◽

Cited By ~ 1

Author(s):

J.C.S. Palma ◽

R.A.L. Reis ◽

L. Soares Indrusiak ◽

A. Garcia Ortiz ◽

M. Glesner ◽

...

Keyword(s):

Power Consumption ◽

Networks On Chip ◽

Data Encoding ◽

On Chip ◽

The Impact


Effective On-Chip Communication for Message Passing Programs on Multi-Core Processors

Electronics ◽

10.3390/electronics10212681 ◽

2021 ◽

Vol 10 (21) ◽

pp. 2681

Author(s):

Joonmoo Huh ◽

Deokwoo Lee

Keyword(s):

Parallel Programming ◽

Shared Memory ◽

Message Passing ◽

Programming Model ◽

Multicore Architectures ◽

Worst Case ◽

High Performing ◽

Parallel Programming Model ◽

On Chip ◽

Sharing Patterns

Shared memory is the most popular parallel programming model for multi-core processors, while message passing is generally used for large distributed machines. However, as the number of cores on a chip increases, the relative merits of shared memory versus message passing change, and we argue that message passing becomes a viable, high-performing parallel programming model. To test this hypothesis, we compare a shared memory architecture with a new message passing architecture on a suite of applications tuned for each system independently. Perhaps surprisingly, the fundamental behaviors of the applications studied in this work, when optimized for both models, are very similar to each other, and both could execute efficiently on multicore architectures despite many implementations being different from each other. Furthermore, if hardware is tuned to support message passing by supporting bulk message transfer and eliminating unnecessary coherence overheads, and if effective support is available for global operations, then some applications would perform much better on a message passing architecture. Leveraging our insights, we design a message passing architecture that supports both memory-to-memory and cache-to-cache messaging in hardware. With the new architecture, message passing is able to outperform its shared memory counterpart on many of the applications due to the unique advantages of the message passing hardware as compared to cache coherence. In the best case, message passing achieves up to a 34% increase in speed over its shared memory counterpart, and it achieves an average 10% increase in speed. In the worst case, message passing is slowed down in two applications, CG (conjugate gradient) and FT (Fourier transform), because it could not handle their unique data sharing patterns as well as its shared memory counterpart. Overall, our analysis demonstrates the importance of considering message passing as a high-performing and hardware-supported programming model on future multicore architectures.


SystemC Language Usage as the Alternative to the HDL and High-level Modeling for NoC Simulation

International Journal of Embedded and Real-Time Communication Systems ◽

10.4018/ijertcs.2018070102 ◽

2018 ◽

Vol 9 (2) ◽

pp. 18-31 ◽

Cited By ~ 2

Author(s):

Aleksandr Romanov ◽

Alexander Ivannikov

Keyword(s):

Programming Language ◽

Hot Spots ◽

Geometric Shape ◽

Language Usage ◽

Networks On Chip ◽

Speed Increase ◽

Low Level ◽

High Level Modeling ◽

On Chip ◽

High Level

This article reviews current trends in networks-on-chip research and known approaches to NoC modeling. The characteristics of analytic, high-level, and low-level simulation are given. The SystemC programming language is proposed as an alternative for creating models of networks-on-chip, methods for speeding up SystemC models are reviewed, and methods for improving SystemC models are formulated. The article shows how SystemC can reduce the disadvantages and maximize the advantages of the high-level and low-level approaches. To this end, results for high-level, low-level, and SystemC NoC simulation are compared using the examples of “hot spots” and of the effect of the geometric shape of regular NoC topologies on their performance.


On the Impact of Traffic Statistics on Quality of Service for Networks on Chip

2005 IEEE International Symposium on Circuits and Systems ◽

10.1109/iscas.2005.1465096 ◽

2005 ◽

Cited By ~ 3

Author(s):

S. Santi ◽

B. Lin ◽

L. Kocarev ◽

G.M. Maggio ◽

R. Rovatti ◽

...

Keyword(s):

Quality Of Service ◽

Networks On Chip ◽

On Chip ◽

The Impact


Low-Cost Allocator Implementations for Networks-on-Chip Routers

VLSI Design ◽

10.1155/2009/415646 ◽

2009 ◽

Vol 2009 ◽

pp. 1-10

Author(s):

Min Zhang ◽

Chiu-Sing Choy

Keyword(s):

Low Cost ◽

Virtual Channel ◽

Careful Study ◽

Network Simulator ◽

Networks On Chip ◽

Look Ahead ◽

Generic Architecture ◽

On Chip ◽

The Look ◽

The Impact

Cost-effective Network-on-Chip (NoC) routers are important for future SoCs and embedded devices. Implementation results show that the generic virtual channel allocator (VA) and the generic switch allocator (SA) of a router consume a large amount of area and power. In this paper, after a careful study of the working principle of a VA and the utilization statistics of its arbiters, opportunities to simplify the generic VA are identified. Then, the deadlock problem for a combined switch and virtual channel allocator (SVA) is studied. Next, the impact of the VA simplification on the router critical paths is analyzed. Finally, the generic architecture and the two proposed low-cost architectures (the look-ahead and the SVA) are evaluated with a cycle-accurate network simulator and detailed VLSI implementations. Results show that both the look-ahead and the SVA significantly reduce area and power compared to the generic architecture. Furthermore, the cost savings are achieved without a performance penalty.
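
Allocators such as the VA and SA are built from arbiters, each of which grants one of several competing requests per cycle. The behavioral sketch below shows one common arbitration policy, round-robin; the abstract does not say which policy the paper's arbiters use, so the policy, the class name `RoundRobinArbiter`, and its interface are all assumptions made for illustration.

```python
class RoundRobinArbiter:
    """Behavioral model of an n-input round-robin arbiter (illustrative only)."""

    def __init__(self, n):
        self.n = n
        self.next_priority = 0  # input that currently has highest priority

    def grant(self, requests):
        """Grant one active request; return its index, or None if none is active.

        requests is a sequence of n booleans, one per input.
        """
        for offset in range(self.n):
            idx = (self.next_priority + offset) % self.n
            if requests[idx]:
                # The input just served gets lowest priority next time.
                self.next_priority = (idx + 1) % self.n
                return idx
        return None
```

With four inputs and requests held on inputs 0 and 2, successive calls to `grant` alternate between 0 and 2, so neither requester is starved.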
