Tradeoff exploration between reliability, power consumption, and execution time for embedded systems

Ismail Assayad; Alain Girault; Hamoudi Kalla

doi:10.1007/s10009-012-0263-9

THE METHOD OF CONSTRUCTION OF THE MEANS FOR EXECUTION TIME TESTING THE EMBEDDED SYSTEMS THAT ARE DEVELOPED IN KEIL UVISION

ELECTRICAL AND COMPUTER SYSTEMS

doi:10.15276/eltecs.27.103.2018.24

2018, Vol 27 (103), pp. 213-219

Author(s): R. S. Chopey; D. V. Fedasyuk

Keyword(s): Embedded Systems; Execution Time


Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A

doi:10.26754/jji-i3a.201201788

1970, pp. 9

Author(s): A. Ferrerón Labari; D. Suárez Gracia; V. Viñals Yúfera

Keyword(s): Embedded Systems; Power Consumption; Low Power; Interconnection Networks; High Performance; Critical Issue; Content Management; Structure Design; Portable Devices; On Chip

In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving both high performance and low power consumption is a highly complex challenge, for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). The key points are decoupling the functionality and using three specialized on-chip networks. This structure has been shown to be efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control, and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels

ACM Transactions on Architecture and Code Optimization

doi:10.1145/3431731

2021, Vol 18 (1), pp. 1-25

Author(s): Lorenz Braun; Sotirios Nikas; Chen Song; Vincent Heuveline; Holger Fröning

Keyword(s): Simple Model; Power Consumption; Execution Time; Fast Prediction


Microprocessors KOMDIV for High Performance Embedded Systems

INFORMATION TECHNOLOGY IN INDUSTRY

doi:10.17762/itii.v7i3.71

2021, Vol 7 (3)

Author(s): S.G. Bobkov

Keyword(s): Embedded Systems; Power Consumption; High Performance; Clock Cycle; Embedded Computing; Computing Systems; Processor Performance

The problems of creating high-performance embedded computing systems based on KOMDIV microprocessors are considered. Processor performance depends on three characteristics: clock cycle, clock cycles per instruction, and instruction count. For KOMDIV microprocessors, these characteristics are optimized using the performance/power consumption parameter and the requirements of embedded systems.
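The three characteristics named in this abstract combine in the standard textbook relation: execution time = instruction count × clock cycles per instruction × clock cycle time. The sketch below illustrates that relation only. The function name and the sample figures are hypothetical and are not taken from the KOMDIV paper.

```python
def cpu_time_seconds(instruction_count: int,
                     cycles_per_instruction: float,
                     clock_hz: float) -> float:
    """Estimate execution time from the three performance characteristics.

    time = instruction count * CPI * clock cycle time,
    where clock cycle time = 1 / clock frequency.
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    total_cycles = instruction_count * cycles_per_instruction
    return total_cycles / clock_hz


if __name__ == "__main__":
    # Hypothetical workload: 1 million instructions, CPI of 2, 1 MHz clock.
    print(cpu_time_seconds(1_000_000, 2.0, 1e6))
```

Lowering any one of the three factors shortens execution time, which is why they are usually traded against each other and against power consumption.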


Performance and power consumption evaluation of concurrent queue implementations in embedded systems

Computer Science - Research and Development

doi:10.1007/s00450-014-0261-0

2014, Vol 30 (2), pp. 165-175, Cited By ~ 2

Author(s): Lazaros Papadopoulos; Ivan Walulya; Paul Renaud-Goud; Philippas Tsigas; Dimitrios Soudris; ...

Keyword(s): Embedded Systems; Power Consumption


Tradeoff Exploration between Reliability, Power Consumption, and Execution Time

Lecture Notes in Computer Science - Computer Safety, Reliability, and Security

doi:10.1007/978-3-642-24270-0_32

2011, pp. 437-451, Cited By ~ 16

Author(s): Ismail Assayad; Alain Girault; Hamoudi Kalla

Keyword(s): Power Consumption; Execution Time


Modeling and Simulation of Software Execution Time in Embedded Systems

2020 10th Annual Computing and Communication Workshop and Conference (CCWC)

doi:10.1109/ccwc47524.2020.9031143

2020

Author(s): Stefan Resmerita; Anton Poelzleitner; Stefan Lukesch

Keyword(s): Embedded Systems; Modeling And Simulation; Execution Time; Software Execution


Hardware-Enhanced Protection for the Runtime Data Security in Embedded Systems

Electronics

doi:10.3390/electronics8010052

2019, Vol 8 (1), pp. 52, Cited By ~ 2

Author(s): Weike Wang; Xiaobing Zhang; Qiang Hao; Zhun Zhang; Bin Xu; ...

Keyword(s): Embedded Systems; Power Consumption; Data Security; Memory Access; Low Power Consumption; Replay Attack; Protection Method; Xor Operation; Hardware Overhead; Integrity Protection

At present, embedded systems face various kinds of attacks, especially against data stored in external memories. This paper presents a hardware-enhanced protection method that protects data integrity and confidentiality at runtime, guarding the data against spoofing attacks, splicing attacks, replay attacks, and some malicious analysis. For integrity protection, the signature is calculated by the hardware-implemented Lhash engine before the data is sent off the chip, and the signature of the data block is recalculated and compared with the decrypted one at load time. For confidentiality protection, an AES encryption engine generates the key stream, and plain data and cipher data are converted into each other through a simple XOR operation. The hardware cryptographic engines are optimized to work simultaneously with the memory access operation, which reduces the hardware overhead and the performance overhead. We implement the proposed architecture within the OR1200 processor on a Xilinx Virtex 5 FPGA platform. The experimental results show that the proposed hardware-enhanced protection method can preserve the integrity and confidentiality of runtime data in embedded systems with low power consumption and a marginal area footprint. The performance overhead is less than 2.27% according to the selected benchmarks.
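The confidentiality step described in this abstract, a key stream combined with the data through XOR, can be sketched as follows. This is an illustration only, under stated assumptions: the Python standard library has no AES, so the key stream here comes from SHA-256 over a key, a block address, and a counter as a stand-in for the paper's AES engine. Seeding by address, and the names `keystream` and `xor_with_keystream`, are hypothetical and not taken from the paper.

```python
import hashlib


def keystream(key: bytes, address: int, length: int) -> bytes:
    """Generate `length` key-stream bytes.

    Stand-in generator (assumption): SHA-256 over key, address, and a
    counter. The paper uses an AES engine for this step instead.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        seed = key + address.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(out[:length])


def xor_with_keystream(data: bytes, key: bytes, address: int) -> bytes:
    """XOR data with the key stream; the same call encrypts and decrypts."""
    stream = keystream(key, address, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


if __name__ == "__main__":
    key = b"k" * 16                      # hypothetical key
    plain = b"sensor reading 0042"
    cipher = xor_with_keystream(plain, key, 0x1000)
    # Applying the same operation again recovers the plain data.
    assert xor_with_keystream(cipher, key, 0x1000) == plain
```

Because the key stream depends only on the key and the address, not on the data, it can be generated while the memory access is still in flight, which matches the abstract's point about overlapping the cryptographic work with memory access.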


ViPar: High-Level Design Space Exploration for Parallel Video Processing Architectures

International Journal of Reconfigurable Computing

doi:10.1155/2019/4298013

2019, Vol 2019, pp. 1-19

Author(s): Karim M. A. Ali; Rabie Ben Atitallah; Abdessamad Ait El Cadi; Nizar Fakhfakh; Jean-Luc Dekeyser

Keyword(s): Power Consumption; Video Processing; Execution Time; Design Space Exploration; Hardware Implementation; Space Exploration; Absolute Difference; Architectural Model; Processing Architectures; High Level

Embedded video applications are now involved in sophisticated transportation systems such as autonomous vehicles and driver assistance systems. As silicon capacity increases, the design productivity gap widens for the currently available design tools. Hence, high-level synthesis (HLS) tools emerged to reduce that gap by shifting the design effort to higher abstraction levels. In this paper, we present ViPar, a tool for exploring different video processing architectures at a higher design level. First, we propose a parametrizable parallel architectural model dedicated to video applications. Second, targeting this architectural model, we developed the ViPar tool with two main features: (1) An empirical model is introduced to estimate power consumption based on hardware utilization and operating frequency. In addition, we derive equations for estimating the hardware utilization and execution time of each design point during the space exploration process. (2) Given the main characteristics of the parallel video architecture, such as the parallelism level, the number of input/output ports, and the pixel distribution pattern, the ViPar tool can automatically generate the dedicated architecture for hardware implementation. In the experimental validation, we used the ViPar tool to automatically generate an efficient hardware implementation of a Multiwindow Sum of Absolute Difference stereo matching algorithm on a Xilinx Zynq ZC706 board. We succeeded in increasing design productivity by converging rapidly on the designs that fit our system constraints in terms of power consumption, hardware utilization, and frame execution time.


Software-Controlled Instruction Prefetch Buffering for Low-End Processors

Journal of Circuits System and Computers

doi:10.1142/s0218126615501613

2015, Vol 24 (10), pp. 1550161, Cited By ~ 1

Author(s): Muhammad Yasir Qadri; Nadia N. Qadri; Martin Fleury; Klaus D. McDonald-Maier

Keyword(s): Power Consumption; Execution Time; Embedded Processors; Maximum Reduction; Embedded Applications

This paper proposes a method of buffering instructions by software-based prefetching. The method allows low-end processors to improve their instruction throughput with a minimum of additional logic and power consumption. Low-end embedded processors do not employ caches for two main reasons. The first is that the overhead of a cache implementation in terms of energy and area is considerable. The second is that, because a cache's performance depends primarily on the number of hits, an increasing number of misses could cause a processor to remain in stall mode for longer. As a result, a cache may become more of a liability than an advantage. In contrast, the benchmarked results for the proposed software-based prefetch buffering without a cache show a 5–10% improvement in execution time. They also show a reduction of 4% or more in the energy-delay-square-product (ED2P), with a maximum reduction of 40%. The results additionally demonstrate that the performance and efficiency of the proposed architecture scale with the number of multicycle instructions. The benchmarked routines used to arrive at these results are widely deployed components of embedded applications.
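The energy-delay-square-product quoted in this abstract is conventionally computed as ED2P = E × D², where E is the energy consumed and D is the execution time. The sketch below shows how a percentage reduction in ED2P could be derived from two measurements. The function names and the sample figures are hypothetical and are not the paper's data.

```python
def ed2p(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay-square-product: energy multiplied by delay squared."""
    return energy_joules * delay_seconds ** 2


def reduction_percent(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical measurements, not taken from the paper.
    base = ed2p(energy_joules=2.0, delay_seconds=3.0)
    new = ed2p(energy_joules=2.0, delay_seconds=2.7)
    print(f"ED2P reduction: {reduction_percent(base, new):.1f}%")
```

Because delay enters squared, a modest improvement in execution time moves ED2P more than the same relative saving in energy would.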
