A Framework to Compare Estimated and Measured Power Consumption on FPGAs

In this paper, an extensive review of the available publications about comparing estimations versus measurements of power consumption in FPGA technology is carried out. This study reveals that the variety of experimental setups makes it difficult to elaborate solid studies departing from the results of different researchers using meta-analysis techniques. To mitigate this problem, we propose a procedure to standardize the setup of FPGA power estimation experiments. The goal is to make as close as possible power estimations and their corresponding actual on-chip measurements. The main idea is to use a fixed arrangement composed by a parameterized pattern generator block at the input, together with a set of interchangeable IP cores utilized as reference circuits. All the blocks are mapped together inside the FPGA sample, being the clock and reset lines the sole input signals. Thus, both power estimation and actual measurements are performed to the whole system in identical conditions. In order to illustrate the method, the paper includes some examples of the proposed methodology for different cores. A set of 25 circuits have been tested in two FPGA families, obtaining relative errors in power estimation between –61.5% and 9.2%.

Download Full-text

Power analysis approach and its application to IP-based SoC design

COMPEL The International Journal for Computation and Mathematics in Electrical and Electronic Engineering ◽

10.1108/compel-08-2015-0283 ◽

2016 ◽

Vol 35 (3) ◽

Author(s):

Yaseer Arafat Durrani ◽

Teresa Riesgo ◽

Muhammad Imran Khan ◽

Tariq Mahmood

Keyword(s):

Power Consumption ◽

Design Methodology ◽

Average Power ◽

Power Estimation ◽

Statistical Characteristics ◽

Estimation Accuracy ◽

Average Error ◽

Average Power Consumption ◽

On Chip ◽

Practical Implications

Purpose Low-power consumption has become an important issue that cannot be ignored in System-on-Chip (SoC) design. The key challenge encountered by system design is how to maintain balance between the estimation accuracy and speed. This paper aims at demonstrating an accurate and fast power estimation technique. Design/methodology/approach The methodology adopted in the paper is to use input patterns with the predefined statistical characteristics which helps to analyze the average power consumption of the different intellectual-property (IP) cores and the interconnects/buses in SoC design. Similarly the paper has implemented Genetic algorithm (GA) to generate sequences of input signals during the power estimation procedure. Findings The GA concurrently optimizes the input signal characteristics that influence the final solution of the pattern. In addition to that, a Monte-Carlo zero-delay simulation is also performed for individual IP core and bus at high-level. By the simple addition of these cores/buses, power is predicted by a novel macro-model function. In experiments, the average error is estimated at 13.84%. Research limitations/implications To present the research findings with clarity and to avoid complexities, the paper does not consider delay factors like glitches, jitter etc. in the power model. Practical implications The proposed methodology allowed accurate power/energy analysis of practical applications mapped onto Network-on-Chip (NoC) based Multiprocessors SoC platform. It enables the performance analysis of different design alternatives under the load imposed by complex applications. Originality/value This paper is an original contribution and the results demonstrate that our novel technique could be implemented to achieve fast and accurate power estimation in the early stage of any SoC design.

Download Full-text

Improved Simulated Annealing Genetic Algorithm based Low power mapping for 3D NoC

MATEC Web of Conferences ◽

10.1051/matecconf/201823202022 ◽

2018 ◽

Vol 232 ◽

pp. 02022 ◽

Cited By ~ 1

Author(s):

Hanna He ◽

Fang Fang ◽

Wei Wang

Keyword(s):

Genetic Algorithm ◽

Simulated Annealing ◽

Power Consumption ◽

Average Power ◽

Optimal Solution ◽

Initial Population ◽

Good Convergence ◽

Average Power Consumption ◽

Ip Cores ◽

On Chip

Mapping of IP(Intellectual Property) cores onto NoC(Network-on-Chip) architectures is a key step in NoCbased designs. Energy is the key parameter to measure the designs. Therefore, we propose an Improved Simulated Annealing Genetic Alogrithm, abbreviated as ISAGA. The algorithm combines the parallelism of Genetic Algorithm(GA) and the local search ability of Simulated Annealing(SA). We improve the initial population selection of GA to get the lower power consumption mapping scheme. The experimental results show that compared with the GA, ISAGA has good convergence and can search the optimal solution quickly, which can effectively reduce the power consumption of the system. In the case of 124 IP cores, the average power consumption of the ISAGA is reduced by 32.0% compared with the GA.

Download Full-text

A New Simulator Based on Multi Core Processor with Improved Sense Amplifier

Journal of Circuits System and Computers ◽

10.1142/s0218126615501418 ◽

2015 ◽

Vol 24 (09) ◽

pp. 1550141 ◽

Cited By ~ 2

Author(s):

Erulappan Sakthivel ◽

Veluchamy Malathi ◽

Muruganantham Arunraja

Keyword(s):

Power Consumption ◽

Low Power ◽

High Performance ◽

Routing Algorithm ◽

Main Idea ◽

Sense Amplifier ◽

Area Reduction ◽

Sense Amplifiers ◽

On Chip ◽

Multi Core Processor

In recent days, network-on-chip (NoC) researchers focus mainly on the area reduction and low power consumption both in architectural and algorithmic approach. To achieve low power and high performance in NoC architecture, sense amplifiers (SAs) introduced which can consume less power under various traffic conditions. In order to analyze the performance of architectural NoC design before fabrication level, the new simulator is developed based on multi core processor with improved sense amplifier (MCPSA) in this work. The MCPSA simulator provides user, the flexibility of incorporating various traffic configurations and routing algorithm with user reconfigurable option. In addition, the different SA model can be put into the simulation in plug and play manner for evaluation. The NoC case studies are presented to demonstrate the NoC architecture with double tail sense amplifier (DTSA) and modified-DTSA (M-DTSA) design. The performance metric such as delay, data rate and power consumption is evaluated. The main idea of this new simulator is to interface multisim environment (MSE) into a NoC environment for validating any DTSA.

Download Full-text

Area-efficient programmable arbiter for inter-layer communications in 3-D network-on-chip

Open Computer Science ◽

10.2478/s13537-012-0006-8 ◽

2012 ◽

Vol 2 (1) ◽

Cited By ~ 1

Author(s):

Mohammad Khan ◽

Abdul Ansari

Keyword(s):

Power Consumption ◽

Clock Cycle ◽

Network On Chip ◽

Fixed Number ◽

Clock Frequency ◽

Mesh Topology ◽

Communication Technique ◽

Ip Cores ◽

On Chip ◽

Area Efficient

AbstractThe Network-on-Chip (NoC) is an emerging communication technique for System-on-Chip (SoC) communications. The NoC uses multiple processors, usually targeted for embedded applications and other applications [3, 13]. Performance of the bus is degraded by the increasing number of processing elements and transaction oriented model [13]. This has attracted much attention for applying wireless network protocols as CDMA, TDMA, and dTDMA in SoC. The TDMA systems use a fixed number of timeslots. This protocol wastes bandwidth when some timeslots are allocated but not used. The dynamic TDMA (dTDMA) bus arbiter dynamically grows and shrinks the number of timeslots to match the number of active transmitters [14]. In this paper, we present a design of area-efficient switch for inter-layer communications in 3-D NoC. The arbitration logic in the switch is based on a programmable priority encoder. A 640-bit message with uniform random destination data pattern was injected per IP per machine clock cycle. We have obtained the maximum clock frequency of 2.09 GHz for 96(4 × 8 × 3) IP cores connected in a mesh topology. The presented architecture demonstrates their superior functionality in terms of speed, latency, area, and power consumption as compared with the existing implementation [14]. The maximum power consumption of the proposed area-efficient programmable arbiter is 0.625 mW. The design is synthesized using 180nm TSMC Technology.

Download Full-text

Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In the last years, embedded systems have evolved so that they offer capabilities we could only find before in high performance systems. Portable devices already have multiprocessors on-chip (such as PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful multi-level cache memory hierarchy on-chip. As most of these systems are battery-powered, the power consumption becomes a critical issue. Achieving high performance and low power consumption is a high complexity challenge where some proposals have been already made. Suarez et al. proposed a new cache hierarchy on-chip, the LP-NUCA (Low Power NUCA), which is able to reduce the access latency taking advantage of NUCA (Non-Uniform Cache Architectures) properties. The key points are decoupling the functionality, and utilizing three specialized networks on-chip. This structure has been proved to be efficient for data hierarchies, achieving a good performance and reducing the energy consumption. On the other hand, instruction caches have different requirements and characteristics than data caches, contradicting the low-power embedded systems requirements, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to re-evaluate completely our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP environments (chip multiprocessors) with parallel workloads, coherence plays an important role, and must be taken into consideration.

Download Full-text

Ultracompact and low-power-consumption silicon thermo-optic switch for high-speed data

Nanophotonics ◽

10.1515/nanoph-2020-0496 ◽

2020 ◽

Vol 10 (2) ◽

pp. 937-945

Author(s):

Ruihuan Zhang ◽

Yu He ◽

Yong Zhang ◽

Shaohua An ◽

Qingming Zhu ◽

...

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

High Performance ◽

Pulse Amplitude ◽

Telecommunication Networks ◽

Low Power Consumption ◽

Power Efficient ◽

High Speed Data ◽

On Chip

AbstractUltracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit with the highest tuning efficiency and data ever reported.

Download Full-text

Exploring a New Adaptive Routing Based on the Dijkstra Algorithm in Optical Networks-on-Chip

Micromachines ◽

10.3390/mi12010054 ◽

2021 ◽

Vol 12 (1) ◽

pp. 54

Author(s):

Yan-Li Zheng ◽

Ting-Ting Song ◽

Jun-Xiong Chai ◽

Xiao-Ping Yang ◽

Meng-Meng Yu ◽

...

Keyword(s):

Power Consumption ◽

Power Control ◽

Optical Networks ◽

Output Power ◽

Network Performance ◽

Transmission Loss ◽

Adaptive Routing ◽

Dijkstra Algorithm ◽

Networks On Chip ◽

On Chip

The photoelectric hybrid network has been proposed to achieve the ultrahigh bandwidth, lower delay, and less power consumption for chip multiprocessor (CMP) systems. However, a large number of optical elements used in optical networks-on-chip (ONoCs) generate high transmission loss which will influence network performance severely and increase power consumption. In this paper, the Dijkstra algorithm is adopted to realize adaptive routing with minimum transmission loss of link and reduce the output power of the link transmitter in mesh-based ONoCs. The numerical simulation results demonstrate that the transmission loss of a link in optimized power control based on the Dijkstra algorithm could be maximally reduced compared with traditional power control based on the dimensional routing algorithm. Additionally, it has a greater advantage in saving the average output power of optical transmitter compared to the adaptive power control in previous studies, while the network size expands. With the aid of simulation software OPNET, the network performance simulations in an optimized network revealed that the end-to-end (ETE) latency and throughput are not vastly reduced in regard to a traditional network. Hence, the optimized power control proposed in this paper can greatly reduce the power consumption of s network without having a big impact on network performance.

Download Full-text

1.0 V-0.18 µm CMOS Tunable Low Pass Filters with 73 dB DR for On-Chip Sensing Acquisition Systems

Electronics ◽

10.3390/electronics10050563 ◽

2021 ◽

Vol 10 (5) ◽

pp. 563

Author(s):

Jorge Pérez-Bailón ◽

Belén Calvo ◽

Nicolás Medrano

Keyword(s):

Power Consumption ◽

Dynamic Range ◽

Low Voltage ◽

Cutoff Frequency ◽

Cmos Technology ◽

Active Area ◽

Current Steering ◽

Low Pass ◽

On Chip ◽

Low Pass Filters

This paper presents a new approach based on the use of a Current Steering (CS) technique for the design of fully integrated Gm–C Low Pass Filters (LPF) with sub-Hz to kHz tunable cut-off frequencies and an enhanced power-area-dynamic range trade-off. The proposed approach has been experimentally validated by two different first-order single-ended LPFs designed in a 0.18 µm CMOS technology powered by a 1.0 V single supply: a folded-OTA based LPF and a mirrored-OTA based LPF. The first one exhibits a constant power consumption of 180 nW at 100 nA bias current with an active area of 0.00135 mm2 and a tunable cutoff frequency that spans over 4 orders of magnitude (~100 mHz–152 Hz @ CL = 50 pF) preserving dynamic figures greater than 78 dB. The second one exhibits a power consumption of 1.75 µW at 500 nA with an active area of 0.0137 mm2 and a tunable cutoff frequency that spans over 5 orders of magnitude (~80 mHz–~1.2 kHz @ CL = 50 pF) preserving a dynamic range greater than 73 dB. Compared with previously reported filters, this proposal is a competitive solution while satisfying the low-voltage low-power on-chip constraints, becoming a preferable choice for general-purpose reconfigurable front-end sensor interfaces.

Download Full-text

Fuzzy-Based Thermal Management Scheme for 3D Chip Multicores with Stacked Caches

Electronics ◽

10.3390/electronics9020346 ◽

2020 ◽

Vol 9 (2) ◽

pp. 346 ◽

Cited By ~ 1

Author(s):

Lili Shen ◽

Ning Wu ◽

Gaizhen Yan

Keyword(s):

Power Consumption ◽

Thermal Management ◽

System Performance ◽

Control Policy ◽

Three Dimension ◽

Processor Core ◽

Management Scheme ◽

And Performance ◽

On Chip ◽

Silicon Vias

By using through-silicon-vias (TSV), three dimension integration technology can stack large memory on the top of cores as a last-level on-chip cache (LLC) to reduce off-chip memory access and enhance system performance. However, the integration of more on-chip caches increases chip power density, which might lead to temperature-related issues in power consumption, reliability, cooling cost, and performance. An effective thermal management scheme is required to ensure the performance and reliability of the system. In this study, a fuzzy-based thermal management scheme (FBTM) is proposed that simultaneously considers cores and stacked caches. The proposed method combines a dynamic cache reconfiguration scheme with a fuzzy-based control policy in a temperature-aware manner. The dynamic cache reconfiguration scheme determines the size of the cache for the processor core according to the application that reaches a substantial amount of power consumption savings. The fuzzy-based control policy is used to change the frequency level of the processor core based on dynamic cache reconfiguration, a process which can further improve the system performance. Experiments show that, compared with other thermal management schemes, the proposed FBTM can achieve, on average, 3 degrees of reduction in temperature and a 41% reduction of leakage energy.

Download Full-text