Optimal Design of a VLSI Processor with Spatially and Temporally Parallel Structure

1996 ◽  
Vol 8 (6) ◽  
pp. 516-523
Author(s):  
Michitaka Kameyama ◽  
Masayuki Sasaki

In intelligent integrated systems such as robots for autonomous work, it is essential to respond very quickly to changes in the environment. The development of special-purpose VLSI processors with minimum delay time is therefore an important subject. A suitable combination of spatially parallel and temporally parallel processing is key to achieving the minimum delay time. In this article, we present a scheduling algorithm for high-level synthesis, where the input to the scheduler is a behavioral description viewed as a data flow graph. The scheduler minimizes the delay time under constraints on silicon area and I/O pins.
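As a rough illustration of delay-oriented scheduling on a data flow graph, the following minimal sketch applies generic greedy list scheduling. It is not the algorithm from the article: the function name `list_schedule`, the per-operation latencies, and the use of a functional-unit count `max_units` as a stand-in for the silicon-area constraint are all assumptions made for this example, and I/O pin limits are not modeled.

```python
def list_schedule(ops, deps, latency, max_units):
    """Greedy list scheduling of an acyclic data flow graph (illustrative only).

    ops:       iterable of operation names
    deps:      dict mapping an op to the list of ops it depends on
    latency:   dict mapping an op to the number of cycles it occupies a unit
    max_units: number of functional units (hypothetical proxy for an area budget)
    Returns (start_times, total_delay).
    """
    unscheduled = set(ops)
    start, finish = {}, {}
    busy_until = []  # finish times of ops currently holding a unit
    t = 0
    while unscheduled:
        # Release units whose operation has completed by cycle t.
        busy_until = [f for f in busy_until if f > t]
        # An op is ready once every predecessor has finished by cycle t.
        ready = sorted(
            op for op in unscheduled
            if all(p in finish and finish[p] <= t for p in deps.get(op, []))
        )
        for op in ready:
            if len(busy_until) >= max_units:
                break  # no free unit in this cycle
            start[op] = t
            finish[op] = t + latency[op]
            busy_until.append(finish[op])
            unscheduled.remove(op)
        t += 1
    return start, max(finish.values())


# Hypothetical graph: c needs a and b, d needs c.
deps = {"c": ["a", "b"], "d": ["c"]}
latency = {"a": 1, "b": 1, "c": 2, "d": 1}
for units in (1, 2):
    _, delay = list_schedule("abcd", deps, latency, units)
    print(units, "unit(s): delay =", delay)
```

With more units available, independent operations run in the same cycle (spatial parallelism), so the total delay shrinks until the dependency chain becomes the limit.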

1996 ◽  
Vol 8 (6) ◽  
pp. 496-499
Author(s):  
Michitaka Kameyama ◽  
Yoshichika Fujioka

Among next-generation information systems, it is important to construct intelligent integrated systems that respond quickly to dynamically changing environments. It is therefore essential to develop special-purpose VLSI processors based on the philosophy of "great reduction of the delay time." In particular, we use the term robot electronics for the development of special-purpose VLSI processors for intelligent robot control. In this article, we review fundamental technologies such as pipeline architecture, spatially parallel processing, reconfigurable parallel architecture, and high-level synthesis of parallel processors with minimum delay time.


2021 ◽  
Vol 14 (4) ◽  
pp. 1-15
Author(s):  
Zhenghua Gu ◽  
Wenqing Wan ◽  
Jundong Xie ◽  
Chang Wu

Performance optimization is an important goal for High-level Synthesis (HLS). Existing HLS scheduling algorithms are all based on the Control and Data Flow Graph (CDFG) and schedule basic blocks in sequential order. Our study shows that the sequential scheduling order of basic blocks is a major limiting factor for achievable circuit performance. In this article, we propose a Dependency Graph (DG) with two important properties for scheduling. First, DG is a directed acyclic graph, so no loop-breaking heuristic is needed for scheduling. Second, DG can be used to identify the exact instruction parallelism. Our experiment shows that DG can lead to a 76% increase in instruction parallelism over CDFG. Based on DG, we propose a bottom-up scheduling algorithm to achieve much higher instruction parallelism than existing algorithms. A hierarchical state transition graph with guard conditions is proposed for efficient implementation of such high-parallelism scheduling. Our experimental results show that our DG-based HLS algorithm can outperform the CDFG-based LegUp and the state-of-the-art industrial tool Vivado HLS by 2.88× and 1.29× on circuit latency, respectively.
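To make the two DAG properties concrete, here is a minimal sketch of how parallelism can be read off an acyclic dependency graph. It is a generic illustration, not the article's DG construction or its bottom-up scheduler: the helper names `asap_levels` and `average_parallelism`, and the metric itself (operations divided by critical-path length), are assumptions chosen for this example. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+), which raises `CycleError` if the input is not acyclic.

```python
from graphlib import TopologicalSorter


def asap_levels(preds):
    """Earliest level of each node in a DAG.

    preds maps a node to the nodes it depends on. Because the graph is
    acyclic, a single topological pass suffices and no loop breaking is needed.
    """
    level = {}
    for node in TopologicalSorter(preds).static_order():
        level[node] = 1 + max((level[p] for p in preds.get(node, ())), default=0)
    return level


def average_parallelism(preds):
    """Operations per level: node count divided by critical-path length."""
    level = asap_levels(preds)
    return len(level) / max(level.values())


# Hypothetical instruction graph: c uses a and b, d uses c, e is independent.
preds = {"c": ["a", "b"], "d": ["c"], "e": []}
print(asap_levels(preds))
print(average_parallelism(preds))
```

Nodes that share a level have no dependency path between them, so they are candidates to execute in the same control step.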


Author(s):  
Imed Saad Ben Dhaou ◽  
Hannu Tenhunen

This article presents a word-serial retimed architecture for the SHA-256/224 algorithm. The architecture is compliant with dedicated short-range communication requirements for safety message authentication. We elaborate three-operand adder architectures suitable for field-programmable gate array implementation. Several transformation techniques at the data-flow-graph level have been used to derive the architecture. Synthesis results show that the architecture has a high throughput/slice value compared with state-of-the-art SHA-256 implementations. The article also presents a comparison between high-level synthesis and RTL design.


2005 ◽  
Vol 14 (04) ◽  
pp. 735-755
Author(s):  
ASHOK KUMAR ◽  
MAGDY BAYOUMI

In this paper, a fast low-power scheduling algorithm is presented for high-level synthesis with multiple voltages. The resources are assumed to operate at different voltages, and their power consumption and delay at each voltage level are known in advance. The proposed methodology achieves maximal power reduction of functional units by identifying the maximal available parallelism of power-hungry operators in an initial schedule. The methodology is developed in the framework of a modified stochastic evolution mechanism in order to tame the computational complexity. The proposed scheduling technique is extremely fast: its running time is quadratic in the number of nodes in the data flow graph of the design. This is the fastest reported time of scheduling algorithms for resource-and-latency constrained scheduling with resources operating at multiple voltages. The algorithm produces results within 3%–5% of those of the linear programming method.
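The idea of trading slack for lower voltage can be sketched as follows. This is a simple greedy pass under a latency bound only, not the modified stochastic evolution method described above, and it ignores resource constraints. The function names, the `(voltage, delay, power)` tuples, and the sample numbers are all hypothetical.

```python
from graphlib import TopologicalSorter


def critical_path(graph, delay):
    """Length of the longest dependency chain, given a per-op delay."""
    finish = {}
    for node in TopologicalSorter(graph).static_order():
        finish[node] = delay[node] + max(
            (finish[p] for p in graph.get(node, ())), default=0
        )
    return max(finish.values())


def assign_voltages(preds, options, latency_bound):
    """Greedy voltage assignment (illustrative only).

    preds:   dict op -> ops it depends on (all must appear in options)
    options: dict op -> list of (voltage, delay, power) tuples
    Starts every op at its fastest option, then visits the most
    power-hungry ops first and moves each to its lowest-power option
    that keeps the critical path within latency_bound.
    """
    graph = {op: list(preds.get(op, ())) for op in options}
    choice = {op: min(opts, key=lambda o: o[1]) for op, opts in options.items()}
    for op in sorted(options, key=lambda o: -choice[o][2]):
        for cand in sorted(options[op], key=lambda o: o[2]):
            trial = dict(choice)
            trial[op] = cand
            delays = {k: v[1] for k, v in trial.items()}
            if critical_path(graph, delays) <= latency_bound:
                choice = trial
                break
    return choice


# Hypothetical design: b depends on a, c is independent.
preds = {"b": ["a"]}
options = {op: [(5.0, 1, 10), (3.3, 2, 4)] for op in "abc"}
choice = assign_voltages(preds, options, latency_bound=3)
print(choice)
print("total power:", sum(v[2] for v in choice.values()))
```

Operations off the critical path, such as `c` here, can absorb the extra delay of a lower voltage without lengthening the schedule, which is where the power saving comes from.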

