Sub-nanosecond synchronization implementation in pure Xilinx Kintex-7 FPGA



Implementation of High Accuracy DAC Controller Based on HRPWM in TMS320F2806x

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.722.345 ◽

2013 ◽

Vol 722 ◽

pp. 345-351

Author(s):

Ke Fan ◽

Hui Deng ◽

Feng Wang

Keyword(s):

Experimental Data ◽

High Resolution ◽

Basic Principle ◽

Software Design ◽

Hardware Design ◽

High Accuracy ◽

Digital To Analog Converter ◽

Analog Converter ◽

On Chip ◽

Design Ideas

This paper proposes a method for using the on-chip high-resolution pulse width modulation (HRPWM) module as a digital-to-analog converter (DAC). It discusses the basic principle of the design in detail and analyzes the factors that affect DAC output accuracy. It also describes the hardware design and the software design ideas. The experimental data show the effectiveness of the proposed method.
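As a rough illustration of why a finer PWM edge step matters for a PWM-based DAC, the sketch below computes the number of distinct duty-cycle levels, the equivalent resolution in bits, and the ideal low-pass-filtered output voltage. All names and numbers (`pwm_dac`, the 10 µs period, the 12.5 ns and 150 ps edge steps, the 3.3 V reference) are illustrative assumptions, not values taken from the paper.

```python
import math

def pwm_dac(v_ref: float, period_ps: int, edge_step_ps: int, duty: float):
    """Ideal PWM-DAC model (hypothetical helper, not the paper's design).

    v_ref        -- PWM high level in volts (assumed)
    period_ps    -- PWM period in picoseconds
    edge_step_ps -- smallest edge placement step in picoseconds
    duty         -- requested duty cycle in [0, 1]
    """
    steps = period_ps // edge_step_ps      # distinct duty levels per period
    bits = math.log2(steps)                # equivalent DAC resolution
    code = round(duty * steps)             # nearest realizable duty code
    v_out = v_ref * code / steps           # average voltage after filtering
    return steps, bits, v_out

# Assumed 10 us period: coarse 12.5 ns step vs. an assumed 150 ps fine step.
coarse = pwm_dac(3.3, 10_000_000, 12_500, 0.5)
fine = pwm_dac(3.3, 10_000_000, 150, 0.5)
```

Under these assumptions the coarse step yields fewer than 10 bits of resolution, while the fine step yields a little over 16 bits at the same PWM frequency.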


An Open-Source Platform for High-Performance Non-Coherent On-Chip Communication

IEEE Transactions on Computers ◽

10.1109/tc.2021.3107726 ◽

2021 ◽

pp. 1-1

Author(s):

Andreas Kurth ◽

Wolfgang Ronninger ◽

Thomas Benz ◽

Matheus Cavalcante ◽

Fabian Schuiki ◽

...

Keyword(s):

Open Source ◽

High Performance ◽

On Chip


IOb-Cache: A High-Performance Configurable Open-Source Cache

Algorithms ◽

10.3390/a14080218 ◽

2021 ◽

Vol 14 (8) ◽

pp. 218

Author(s):

João V. Roque ◽

João D. Lopes ◽

Mário P. Véstias ◽

José T. de Sousa

Keyword(s):

Open Source ◽

High Performance ◽

System On Chip ◽

Processing Unit ◽

Central Processing ◽

Front End ◽

Front End Module ◽

Memory Accesses ◽

On Chip ◽

Access Policies

Open-source processors are increasingly being adopted by industry, which requires open-source implementations of peripherals and other system-on-chip modules. Despite the recent advent of open-source hardware, the available open-source caches have low configurability, lack support for single-cycle pipelined memory accesses, and use non-standard hardware interfaces. In this paper, the IObundle cache (IOb-Cache), a high-performance configurable open-source cache, is proposed, developed and deployed. The cache has front-end and back-end modules for fast integration with processors and memory controllers. The front-end module supports the native interface, and the back-end module supports the native interface and the standard Advanced eXtensible Interface (AXI). The cache is highly configurable in structure and access policies. The back-end can be configured to read bursts of multiple words per transfer to take advantage of the available memory bandwidth. To the best of our knowledge, IOb-Cache is currently the only configurable cache that supports both pipelined Central Processing Unit (CPU) interfaces and an AXI memory bus interface. Additionally, it has a write-through buffer and an independent controller for fast writes, 1-cycle most of the time, together with 1-cycle reads, while previous works only support 1-cycle reads. This allows the best clocks-per-instruction (CPI) to be close to one (1.055). IOb-Cache is integrated into the IOb System-on-Chip (IOb-SoC) GitHub repository, which has 29 stars and is already being used in 50 projects (forks).


Congestion aware adaptive routing for network-on-chip communication

10.32920/ryerson.14645025 ◽

2021 ◽

Author(s):

Stephen Chui

Keyword(s):

Embedded Systems ◽

High Performance ◽

Data Transfer ◽

Adaptive Routing ◽

Network On Chip ◽

Message Routing ◽

Data Packets ◽

Novel Approach ◽

Communication Links ◽

On Chip

Network-on-Chip (NoC) has surpassed traditional bus-based on-chip communication in offering better performance for data transfers among the many processing, peripheral and other cores of high-performance embedded systems. Adaptive routing provides an effective means of efficient on-chip communication among NoC cores, and more efficient message routing can further improve the performance of NoC-based embedded systems on a chip. Congestion awareness has been applied to adaptive routing to achieve better data throughput and latency. This thesis presents a novel approach to analyzing congestion that improves NoC throughput by improving packet allocation in NoC routers. The routers learn the traffic conditions around them from the congestion information. We employ header flits to store the congestion information, so no additional communication links between the routers are required. Prioritizing the data packets that are likely to suffer the worst congestion improves overall NoC data transfer latency.
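The idea of carrying congestion information in header flits and favouring the packets facing the worst congestion can be sketched as follows. This is a minimal illustration only: the `HeaderFlit` fields, the saturating `stamp` update and the `arbitrate` tie-break are all hypothetical choices, not the mechanism defined in the thesis.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class HeaderFlit:
    packet_id: int
    congestion: int = 0   # hypothetical field carried inside the header flit

def stamp(flit: HeaderFlit, buffer_occupancy: int, max_level: int = 15) -> HeaderFlit:
    """Add the local buffer occupancy to the header's congestion field.

    The field saturates at max_level, modelling a fixed-width header field.
    """
    flit.congestion = min(max_level, flit.congestion + buffer_occupancy)
    return flit

def arbitrate(candidates: List[HeaderFlit]) -> Optional[HeaderFlit]:
    """Pick the packet with the highest congestion estimate.

    Ties are broken in favour of the lowest packet_id (an arbitrary choice).
    """
    if not candidates:
        return None
    return max(candidates, key=lambda f: (f.congestion, -f.packet_id))

a = stamp(HeaderFlit(packet_id=1), buffer_occupancy=3)
b = stamp(HeaderFlit(packet_id=2), buffer_occupancy=7)
winner = arbitrate([a, b])   # packet 2 has the larger congestion estimate
```

Because the estimate travels in the header flit itself, this sketch needs no side-band wiring between routers, which mirrors the property the abstract highlights.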


A Digitally Programmable Analog Quadrature Sine Oscillator for on-chip Lock-In Measurement Systems

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201701623 ◽

1970 ◽

Vol 4 ◽

pp. 5-6

Author(s):

Alejandro Márquez ◽

Nicolás Medrano ◽

Belén Calvo ◽

Pedro A. Martínez

Keyword(s):

High Performance ◽

Low Cost ◽

High Accuracy ◽

Measurement Systems ◽

Programmable Analog ◽

Actuation System ◽

Digitally Programmable ◽

On Chip ◽

Lock In

This paper presents a CMOS 1.8V-180nm analog quadrature sine oscillator. Thanks to a custom 12-bit bidirectional DAC-based architecture, the frequency can be digitally programmed over two decades with high accuracy, making it suitable as the actuation system in low-cost high-performance embedded lock-in measurement systems.


A NURBS Interpolator Using Multiprocessor on Chip for High Performance Motion Control

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.130-134.1929 ◽

2011 ◽

Vol 130-134 ◽

pp. 1929-1932

Author(s):

Wei Tang ◽

Xiao Dong Zhang ◽

Yong Ding ◽

Jing Jing

Keyword(s):

Motion Control ◽

High Speed ◽

High Performance ◽

Experimental Tests ◽

High Accuracy ◽

Motion Controller ◽

Real Time Control ◽

Time Control ◽

Nurbs Interpolation ◽

On Chip

Modern CNC systems adopt NURBS interpolation to achieve high-speed, high-accuracy performance. However, in conventional control architectures, computing the basis functions of a NURBS curve is very time-consuming because of serial computing constraints. In this paper, a novel multiprocessor-based motion controller on chip is proposed that uses its high-speed parallel computing power to realize NURBS interpolation. The motion control algorithm and I/O control are also embedded in the chip, so that real-time control and NURBS interpolation are carried out simultaneously. Experimental tests using an X-Y table verify the outstanding computation performance of the multiprocessor-based motion controller on chip. The results indicate that a shorter sampling time (0.1 ms) can be achieved for NURBS interpolation and high-accuracy motion control.
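For context on the cost the abstract refers to, the sketch below evaluates a planar NURBS curve point using the standard Cox-de Boor recursion for the basis functions. It is a plain serial reference version, not the paper's multiprocessor implementation; the function names and the quarter-circle example data are assumptions chosen for illustration.

```python
import math
from typing import Sequence, Tuple

def basis(i: int, p: int, u: float, knots: Sequence[float]) -> float:
    """Cox-de Boor recursion for the B-spline basis function N_{i,p}(u)."""
    if p == 0:
        if knots[i] <= u < knots[i + 1]:
            return 1.0
        # Close the last non-empty span so that u == knots[-1] is covered.
        if u == knots[-1] and knots[i] < knots[i + 1] == u:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:  # skip terms with a zero-length knot span (0/0 := 0)
        left = (u - knots[i]) / d1 * basis(i, p - 1, u, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * basis(i + 1, p - 1, u, knots)
    return left + right

def nurbs_point(u: float, degree: int,
                ctrl: Sequence[Tuple[float, float]],
                weights: Sequence[float],
                knots: Sequence[float]) -> Tuple[float, float]:
    """Rational (weighted) combination of the control points at parameter u."""
    num_x = num_y = den = 0.0
    for i, ((x, y), w) in enumerate(zip(ctrl, weights)):
        b = basis(i, degree, u, knots) * w
        num_x += b * x
        num_y += b * y
        den += b
    return num_x / den, num_y / den

# Example data: a quadratic NURBS quarter of the unit circle.
ctrl = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
weights = [1.0, math.sqrt(2) / 2, 1.0]
knots = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
x, y = nurbs_point(0.5, 2, ctrl, weights, knots)
```

Every interpolation step repeats this recursion for each control point, and in this serial form the evaluations run one after another even though they are independent of each other.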


Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving both high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). The key points are decoupling the functionality and using three specialized on-chip networks. This structure has been shown to be efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, which conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


Design and simulation of high performance router on chip based on random routing

JOURNAL OF ELECTRONIC MEASUREMENT AND INSTRUMENT ◽

10.3724/sp.j.1187.2013.00669 ◽

2014 ◽

Vol 27 (7) ◽

pp. 669-675 ◽

Cited By ~ 1

Author(s):

Feng Yue ◽

Runfeng Li ◽

Tian Chen ◽

Jun Liu ◽

Peng Chen ◽

...

Keyword(s):

High Performance ◽

On Chip


Methods for Achieving Maximum Efficiency of a Prototyping Platform for High-Performance Systems-on-Chip on Artificial Intelligence Tasks

Nanoindustry Russia ◽

10.22184/1993-8578.2020.13.3s.585.588 ◽

2020 ◽

Vol 96 (3s) ◽

pp. 585-588

Author(s):

С.Е. Фролова ◽

Е.С. Янакова

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Computer Vision ◽

High Performance ◽

Systems On Chip ◽

High Performance Systems ◽

On Chip ◽

Network Technologies ◽

Neural Network Technologies

Methods are proposed for building prototyping platforms for high-performance systems-on-chip (SoC) targeting artificial intelligence tasks. The requirements for platforms of this class and the principles for modifying an SoC design for implementation in the prototype are described, along with methods for debugging designs on the prototyping platform. Results are presented for computer vision algorithms using neural network technologies running on an FPGA prototype of the ELcore semantic cores.
