Power Efficient High-Level Synthesis by Centralized and Fine-Grained Clock Gating

With the spread of cyber-physical systems and the Internet of Things, adaptivity and low power consumption have become primary concerns in digital systems design. Reconfigurable heterogeneous platforms are among the most suitable choices for this challenging context. However, their development and power optimization are not trivial, especially where hardware acceleration components are concerned. On the one hand, high-level synthesis can simplify the design of such systems; on the other hand, it can limit the benefits of the adopted power-saving techniques. This work studies the mutual impact of different high-level synthesis tools and the application of the well-known clock gating strategy in the development of reconfigurable accelerators. The aim is to optimize the application of clock gating according to the chosen high-level synthesis engine and target technology (Application Specific Integrated Circuit (ASIC) or Field Programmable Gate Array (FPGA)). Clock gating is evaluated at different levels of application, including a novel multi-level solution. Besides assessing the benefits and drawbacks of clock gating at each level, the work derives hints for future design automation of low-power reconfigurable accelerators through high-level synthesis.


Bus Optimization for Low Power in High-Level Synthesis

Journal of Circuits, Systems and Computers ◽ 10.1142/s0218126603000829 ◽ 2003 ◽ Vol 12 (01) ◽ pp. 1-17

Author(s): Sungpack Hong ◽ Taewhan Kim

Keyword(s): Optimal Solution ◽ Minimum Cost ◽ Maximum Flow ◽ High Level Synthesis ◽ Benchmark Problems ◽ Timing Constraints ◽ Power Efficient ◽ High Level ◽ The Impact ◽ Operation Scheduling

Sub-micron feature sizes have caused a considerable portion of power to be dissipated on the buses, drawing increased attention to power savings at the behavioral and RT levels of design. This paper addresses the problem of minimizing the power dissipated in bus switching during the high-level synthesis of data-dominated behavioral descriptions. Unlike previous approaches, which do not consider bus power minimization until operation scheduling is complete, our approach integrates the bus binding problem into scheduling to exploit the impact of scheduling on bus power reduction more fully and effectively. We accomplish this by formulating the problem as a network flow problem and devising an efficient algorithm that iteratively finds maximum-flow, minimum-cost solutions in the network. Experimental results on a number of benchmark problems show that, given resource and global timing constraints, our designs are 19.8% more power-efficient than the designs produced by a random-move based solution, and 15.5% more power-efficient than the designs produced by a clock-step based optimal solution.
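As a rough illustration of the flow-based idea, the following Python sketch binds the data transfers of a single clock step to buses so that the number of toggled bus lines is minimal. The abstract does not spell out the paper's network construction, so everything here is an assumption: the bipartite network, the Hamming-distance cost model, and the names `MinCostMaxFlow` and `bind_transfers` are hypothetical, and the solver is a textbook successive-shortest-paths routine rather than the authors' algorithm.

```python
from collections import deque


class MinCostMaxFlow:
    """Successive shortest paths on a residual graph (edges stored in pairs)."""

    def __init__(self, n):
        self.adj = [[] for _ in range(n)]
        self.to, self.cap, self.cost = [], [], []

    def add_edge(self, u, v, cap, cost):
        """Add edge u->v plus its zero-capacity reverse; return the forward edge id."""
        eid = len(self.to)
        for a, b, c, w in ((u, v, cap, cost), (v, u, 0, -cost)):
            self.adj[a].append(len(self.to))
            self.to.append(b)
            self.cap.append(c)
            self.cost.append(w)
        return eid

    def solve(self, s, t):
        flow = total = 0
        n = len(self.adj)
        while True:
            # Queue-based Bellman-Ford: residual edges may have negative cost.
            dist = [float("inf")] * n
            via = [-1] * n
            queued = [False] * n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                queued[u] = False
                for e in self.adj[u]:
                    v = self.to[e]
                    if self.cap[e] > 0 and dist[u] + self.cost[e] < dist[v]:
                        dist[v] = dist[u] + self.cost[e]
                        via[v] = e
                        if not queued[v]:
                            queued[v] = True
                            queue.append(v)
            if via[t] == -1:  # no augmenting path left
                return flow, total
            push, v = float("inf"), t
            while v != s:  # bottleneck capacity along the path
                push = min(push, self.cap[via[v]])
                v = self.to[via[v] ^ 1]
            v = t
            while v != s:  # augment along the path
                e = via[v]
                self.cap[e] -= push
                self.cap[e ^ 1] += push
                v = self.to[e ^ 1]
            flow += push
            total += push * dist[t]


def bind_transfers(bus_state, transfers):
    """Bind each transfer of one clock step to a distinct bus, minimizing bit toggles."""
    n_t, n_b = len(transfers), len(bus_state)
    source, sink = 0, 1 + n_t + n_b
    net = MinCostMaxFlow(sink + 1)
    pairs = {}
    for i, value in enumerate(transfers):
        net.add_edge(source, 1 + i, 1, 0)
        for j, old in enumerate(bus_state):
            hamming = bin(old ^ value).count("1")  # lines that would switch
            pairs[net.add_edge(1 + i, 1 + n_t + j, 1, hamming)] = (i, j)
    for j in range(n_b):
        net.add_edge(1 + n_t + j, sink, 1, 0)
    flow, total_toggles = net.solve(source, sink)
    binding = {i: j for e, (i, j) in pairs.items() if net.cap[e] == 0}
    return flow, total_toggles, binding


# Two buses currently holding 1010 and 0101; two new values to transfer.
flow, total_toggles, binding = bind_transfers([0b1010, 0b0101], [0b0100, 0b1011])
print(flow, total_toggles, binding)
```

In this toy instance each transfer lands on the bus whose previous value differs from it in a single bit. Integrating scheduling, as the paper does, would additionally let the network choose the clock step of each operation, which this sketch does not attempt.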


Power Efficient Rapid Design Space Exploration of Integrated Scheduling and Module Selection in High Level Synthesis

10.32920/ryerson.14644968 ◽ 2021

Author(s): Pallabi Sarkar

Keyword(s): Power Consumption ◽ Design Space Exploration ◽ Design Space ◽ Space Exploration ◽ High Level Synthesis ◽ Optimal Solutions ◽ Power Efficient ◽ Pi Method ◽ High Level

High-level synthesis (HLS), or electronic system level (ESL) synthesis, requires scheduling algorithms that can reach optimal or near-optimal solutions quickly and accurately. This thesis presents a novel power-efficient scheduling approach using the ‘PI’ method, which reduces the final power consumption of the solution at the expense of a minimal number of latency clock cycles. The proposed scheduling approach is based on the ‘Priority Indicator (PI)’ metric and the ‘Intersect Matrix’ topology, methods that tend to escape local optima and thereby reach global solutions. Applying the proposed approach results in an even distribution of allocated hardware functional units, thereby yielding power-efficient scheduling solutions. The two main novel aspects of the thesis are: a) the introduction of the ‘Intersect Matrix’ topology and its associated algorithm, which is used to check for precedence violations during scheduling; and b) the introduction of the PI method, which uses the Priority Indicator metric to help choose the highest-priority node during each iteration of the scheduling optimization process. The proposed approach is compared with an existing design space exploration method, for qualitative assessment, using a proposed ‘Quality Cost Factor (Q-metric)’. This Q-metric combines the latency and power consumption values of the solution found and indicates the quality of the final solutions, in terms of cost, for both the proposed and existing approaches. For the DSP benchmarks, the proposed approach achieves an average improvement of approximately 12% in the quality of the final solution and an average reduction of 59% in runtime compared with a current scheduling approach.
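To make the two ingredients mentioned in the abstract more concrete, here is a minimal Python sketch of a precedence check on a candidate schedule and of a cost that combines latency and power. Neither function reproduces the thesis: the abstract gives no definition of the ‘Intersect Matrix’ or of the Q-metric, so `precedence_violations` is a plain dependency check and `weighted_cost` is a hypothetical normalized weighted sum with assumed equal weights.

```python
def precedence_violations(start, delay, deps):
    """Return the (producer, consumer) pairs whose consumer starts too early.

    start: control step at which each operation begins
    delay: number of control steps each operation takes
    deps:  (producer, consumer) data dependencies of the dataflow graph
    """
    return [(p, c) for p, c in deps if start[c] < start[p] + delay[p]]


def weighted_cost(latency, power, max_latency, max_power,
                  w_latency=0.5, w_power=0.5):
    """Hypothetical quality cost: normalized latency and power, weighted and summed."""
    return w_latency * latency / max_latency + w_power * power / max_power


# Toy dataflow graph: c consumes the results of a and b.
deps = [("a", "c"), ("b", "c")]
delay = {"a": 1, "b": 2, "c": 1}

too_early = precedence_violations({"a": 0, "b": 0, "c": 1}, delay, deps)
legal = precedence_violations({"a": 0, "b": 0, "c": 2}, delay, deps)
cost = weighted_cost(latency=3, power=40, max_latency=6, max_power=80)
print(too_early, legal, cost)
```

In the first schedule `c` starts while the two-step operation `b` is still running, so the pair `("b", "c")` is reported; delaying `c` by one step removes the violation. A lower `weighted_cost` would indicate a better trade-off under the assumed weights.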


An Integrated Approach for Fine-Grained Power and Peak Temperature Management During High-Level Synthesis

Journal of Low Power Electronics ◽ 10.1166/jolpe.2013.1262 ◽ 2013 ◽ Vol 9 (3) ◽ pp. 350-362

Author(s): Rajdeep Mukherjee ◽ Priyankar Ghosh ◽ Pallab Dasgupta ◽ Ajit Pal

Keyword(s): Integrated Approach ◽ Peak Temperature ◽ High Level Synthesis ◽ Temperature Management ◽ Fine Grained ◽ High Level


Power Efficient Rapid Design Space Exploration of Integrated Scheduling and Module Selection in High Level Synthesis

10.32920/ryerson.14644968.v1 ◽ 2021

Author(s): Pallabi Sarkar

Keyword(s): Power Consumption ◽ Design Space Exploration ◽ Design Space ◽ Space Exploration ◽ High Level Synthesis ◽ Optimal Solutions ◽ Power Efficient ◽ Pi Method ◽ High Level


Priority function based power efficient rapid Design Space Exploration of scheduling and module selection in high level synthesis

2011 24th Canadian Conference on Electrical and Computer Engineering (CCECE) ◽ 10.1109/ccece.2011.6030509 ◽ 2011

Author(s): Anirban Sengupta ◽ Reza Sedaghat ◽ Pallabi Sarkar ◽ Summit Sehgal

Keyword(s): Design Space Exploration ◽ Design Space ◽ Space Exploration ◽ High Level Synthesis ◽ Power Efficient ◽ Rapid Design ◽ Module Selection ◽ Priority Function ◽ High Level


Network simplex method based Multiple Voltage Scheduling in Power-efficient High-level synthesis

2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC) ◽ 10.1109/aspdac.2013.6509602 ◽ 2013 ◽ Cited By ~ 1

Author(s): Cong Hao ◽ Song Chen ◽ T. Yoshimura

Keyword(s): Simplex Method ◽ High Level Synthesis ◽ Power Efficient ◽ High Level ◽ Network Simplex


Energy-efficient High-level Synthesis for HDR Architectures with Clock Gating Based on Concurrency-oriented Scheduling

IPSJ Transactions on System LSI Design Methodology ◽ 10.2197/ipsjtsldm.6.101 ◽ 2013 ◽ Vol 6 (0) ◽ pp. 101-111 ◽ Cited By ~ 2

Author(s): Hiroyuki Akasaka ◽ Shin-ya Abe ◽ Masao Yanagisawa ◽ Nozomu Togawa

Keyword(s): Energy Efficient ◽ High Level Synthesis ◽ Clock Gating ◽ High Level


Energy-efficient High-level Synthesis for HDR Architecture with Multi-stage Clock Gating

IPSJ Transactions on System LSI Design Methodology ◽ 10.2197/ipsjtsldm.7.74 ◽ 2014 ◽ Vol 7 (0) ◽ pp. 74-80

Author(s): Hiroyuki Akasaka ◽ Shin-ya Abe ◽ Masao Yanagisawa ◽ Nozomu Togawa

Keyword(s): Energy Efficient ◽ High Level Synthesis ◽ Clock Gating ◽ Multi Stage ◽ High Level


Optimized Memory Allocation and Power Minimization for FPGA-Based Image Processing

Journal of Imaging ◽ 10.3390/jimaging5010007 ◽ 2019 ◽ Vol 5 (1) ◽ pp. 7 ◽ Cited By ~ 6

Author(s): Paulo Garcia ◽ Deepayan Bhowmik ◽ Robert Stewart ◽ Greg Michaelson ◽ Andrew Wallace

Keyword(s): Image Processing ◽ Power Consumption ◽ Utilization Efficiency ◽ High Level Synthesis ◽ Total Power ◽ Limiting Factor ◽ Power Efficient ◽ Partitioning Algorithms ◽ On Chip ◽ High Level

Memory is the biggest limiting factor to the widespread use of FPGAs for high-level image processing, which requires complete frame(s) to be stored in situ. Since FPGAs have limited on-chip memory capabilities, efficient use of such resources is essential to meet performance, size and power constraints. In this paper, we investigate the allocation of on-chip memory resources in order to minimize resource usage and power consumption, contributing to the realization of power-efficient high-level image processing fully contained on FPGAs. We propose methods for generating memory architectures, from both Hardware Description Language and High Level Synthesis designs, which minimize memory usage and power consumption. Based on a formalization of on-chip memory configuration options and a power model, we demonstrate how our partitioning algorithms can outperform traditional strategies. Compared to commercial FPGA synthesis and High Level Synthesis tools, our results show that the proposed algorithms can achieve up to 60% higher utilization efficiency, increasing the sizes and/or number of frames that can be accommodated, and reduce frame buffers’ dynamic power consumption by up to approximately 70%. In our experiments using Optical Flow and MeanShift Tracking, two representative high-level algorithms, the data show that the partitioning algorithms can reduce total power by up to 25% and 30%, respectively, without impacting performance.
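The notion of utilization efficiency can be illustrated with a small Python sketch of a uniform mapping, the kind of traditional strategy the paper's partitioning algorithms are compared against. This is not the authors' method: the block size, the list of (depth, width) configurations (modeled on common 18 Kb block RAM aspect ratios) and the function name `uniform_mapping` are assumptions made for illustration.

```python
from math import ceil

BLOCK_BITS = 18 * 1024  # assumed physical size of one block RAM
# Assumed (depth, width) aspect ratios, modeled on common 18 Kb block RAMs.
CONFIGS = [(16384, 1), (8192, 2), (4096, 4), (2048, 9), (1024, 18), (512, 36)]


def uniform_mapping(words, word_bits):
    """Tile one buffer with a single block configuration; return the cheapest choice."""
    def blocks_needed(cfg):
        depth, width = cfg
        return ceil(words / depth) * ceil(word_bits / width)

    best = min(CONFIGS, key=blocks_needed)  # first configuration with the fewest blocks
    blocks = blocks_needed(best)
    utilization = words * word_bits / (blocks * BLOCK_BITS)
    return best, blocks, utilization


# One 320x240 frame of 8-bit pixels.
best, blocks, utilization = uniform_mapping(320 * 240, 8)
print(best, blocks, f"{utilization:.1%}")
```

Because every block in a uniform mapping uses the same aspect ratio, the last row and column of blocks are often only partly filled. A partitioning scheme that mixes configurations within one buffer, as the paper proposes, targets exactly that leftover capacity.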
