Performance and energy optimization of heterogeneous CPU-GPU systems for embedded applications

10.32920/ryerson.14661414.v1 ◽

2021 ◽

Author(s):

Abdullah Siddiqui

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Optimization Algorithm ◽

Heterogeneous Computing ◽

Energy Optimization ◽

Systems Design ◽

Application Partitioning ◽

Computing Platforms ◽

Embedded Applications ◽

Software Partitioning

One of the most critical steps in embedded systems design is hardware-software partitioning: distributing the components of an application between hardware and software such that the user-defined system constraints are satisfied. Heterogeneous computing platforms consisting of CPUs and GPUs have tremendous potential for enhancing the performance of embedded applications. The challenge of application partitioning for CPU-GPU mapping is much greater on such platforms due to their unique and diverse characteristics. In this thesis, an optimization algorithm is devised and presented for partitioning and mapping computational tasks on CPU-GPU platforms while keeping power consumption in check. The methodology also exploits parallelism in applications and their tasks by utilizing the architectural capabilities of the GPU. The optimization algorithm was tested with an MJPEG decoder, several benchmarks, and synthetic graphs.
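
The partitioning problem described in this abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm, which the abstract does not specify; it is a plain exhaustive search over CPU/GPU assignments for a short chain of tasks, keeping the fastest mapping whose average power stays within a budget. The task names, timings, power figures, and transfer cost are all hypothetical, and transfers are assumed to cost time but no energy.

```python
from itertools import product

# Hypothetical per-task costs: (cpu_ms, gpu_ms, cpu_watts, gpu_watts).
# None of these numbers come from the thesis.
TASKS = {
    "parse":   (4.0, 6.0, 2.0, 5.0),
    "idct":    (20.0, 5.0, 2.0, 6.0),
    "color":   (12.0, 3.0, 2.0, 6.0),
    "display": (3.0, 5.0, 2.0, 5.0),
}
TRANSFER_MS = 1.0  # assumed cost of moving data between CPU and GPU


def evaluate(devices):
    """Return (total_ms, average_watts) for tasks run in order as a chain."""
    total_ms = 0.0
    energy_mj = 0.0  # milliseconds * watts = millijoules
    prev = None
    for (cpu_ms, gpu_ms, cpu_w, gpu_w), device in zip(TASKS.values(), devices):
        ms, watts = (cpu_ms, cpu_w) if device == "cpu" else (gpu_ms, gpu_w)
        if prev is not None and prev != device:
            total_ms += TRANSFER_MS  # pay a transfer when the device changes
        total_ms += ms
        energy_mj += ms * watts
        prev = device
    return total_ms, energy_mj / total_ms


def best_partition(power_budget_w):
    """Try every mapping; keep the fastest one within the power budget."""
    best = None
    for devices in product(("cpu", "gpu"), repeat=len(TASKS)):
        total_ms, avg_w = evaluate(devices)
        if avg_w > power_budget_w:
            continue  # violates the power constraint
        if best is None or total_ms < best[1]:
            best = (dict(zip(TASKS, devices)), total_ms, avg_w)
    return best  # None when no mapping satisfies the budget
```

Exhaustive search grows as 2^n in the number of tasks, so it only suits very small task graphs; a practical partitioner would need a heuristic or a smarter search, and a task-graph model rather than a simple chain.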


Heterogeneous Computing: An Emerging Paradigm of Embedded Systems Design

Computational Frameworks ◽

10.1016/b978-1-78548-256-4.50003-x ◽

2017 ◽

pp. 61-93

Author(s):

Abderazak Ben Abdallah

Keyword(s):

Embedded Systems ◽

Heterogeneous Computing ◽

Systems Design


Energy and Performance Trade-Off Optimization in Heterogeneous Computing via Reinforcement Learning

Electronics ◽

10.3390/electronics9111812 ◽

2020 ◽

Vol 9 (11) ◽

pp. 1812 ◽

Cited By ~ 1

Author(s):

Zheqi Yu ◽

Pedro Machado ◽

Adnan Zahid ◽

Amir M. Abdulghani ◽

Kia Dashtipour ◽

...

Keyword(s):

Reinforcement Learning ◽

Power Consumption ◽

Heterogeneous Computing ◽

Operation Mode ◽

Substantial Reduction ◽

Power Measurement ◽

Resource Utilisation ◽

Improve Energy Efficiency ◽

Computing Platforms ◽

And Performance

This paper proposes an optimisation approach for heterogeneous computing systems that balances power consumption and efficiency. The work introduces a power measurement utility for a reinforcement learning (PMU-RL) algorithm that dynamically adjusts the resource utilisation of heterogeneous platforms in order to minimise power consumption. A reinforcement learning (RL) technique is applied to analyse and optimise the resource utilisation of field programmable gate array (FPGA) control state capabilities, built for a simulation environment with a Xilinx ZYNQ multi-processor systems-on-chip (MPSoC) board. In this study, a balanced operation mode for improving power consumption and performance is established to dynamically change the work state of the programmable logic (PL) end. It is based on an RL algorithm that can quickly discover the optimisation effect of the PL on different workloads to improve energy efficiency. The results demonstrate a substantial reduction of 18% in energy consumption without affecting the application’s performance. Thus, the proposed PMU-RL technique has the potential to be considered for other heterogeneous computing platforms.


Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In recent years, embedded systems have evolved to offer capabilities previously found only in high performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful multi-level on-chip cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architectures) properties. The key points are decoupling the functionality and utilizing three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance and reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. To do so, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.


Massively Parallel Systems Design for Real-Time Embedded Applications

10.21236/ada277256 ◽

1994 ◽

Author(s):

Thomas C. Choinski ◽

Chin-Hwa Lee

Keyword(s):

Real Time ◽

Systems Design ◽

Parallel Systems ◽

Massively Parallel ◽

Massively Parallel Systems ◽

Embedded Applications


Performance Evaluation of Fiber Wireless (FiWi) Access Network using Position Optimization of ONUs

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327910666200304131411 ◽

2020 ◽

Vol 10 ◽

Author(s):

Nitin Chouhan ◽

Uma Rathore Bhatt ◽

Raksha Upadhyay

Keyword(s):

Power Consumption ◽

Optimization Algorithm ◽

Optical Network ◽

Access Network ◽

Whale Optimization Algorithm ◽

Wireless Access ◽

Delivery Ratio ◽

Packet Delivery ◽

Whale Optimization ◽

Wireless Access Network

A Fiber Wireless (FiWi) access network is a blend of a passive optical network and a wireless access network. It provides higher capacity, better flexibility, more stability and improved reliability to users at lower cost. The placement of network components such as Optical Network Units (ONUs) is one of the major research issues, as it affects network design, performance and cost. Considering these concerns, we implement a customized Whale Optimization Algorithm (WOA) for ONU placement. Initially, the whale optimization algorithm is applied to obtain optimized positions for the ONUs; this is followed by a reduction in the number of ONUs in the network, done such that all routers present in the network can still communicate with fewer ONUs. To assess the performance of the network, we compute network parameters such as Packet Delivery Ratio (PDR), Total Time for Delivering the Packets in the Network (TTDPN) and percentage reduction in power consumption for the proposed algorithm. The performance of the proposed work is compared with existing algorithms (deterministic and centrally placed ONUs with predefined hops) and analyzed through extensive simulation. The results show that the proposed algorithm is superior to the other algorithms in terms of the minimum number of required ONUs and reduced power consumption, with almost the same packet delivery ratio and total time for delivering the packets in the network. The present work is therefore suitable for developing a cost-effective FiWi network while maintaining network performance.
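
The ONU reduction step mentioned in this abstract can be illustrated with a minimal sketch. It does not implement the paper's customized WOA, and the abstract does not say how the reduction is carried out; this is a simple greedy pruning pass under the assumption that a router can communicate when it lies within a fixed radius of at least one ONU (single hop). The function names, coordinates, and radius are hypothetical.

```python
from math import dist


def covered(routers, onus, radius):
    """True when every router is within `radius` of at least one ONU."""
    return all(any(dist(r, o) <= radius for o in onus) for r in routers)


def prune_onus(routers, onus, radius):
    """Greedily drop ONUs whose removal keeps every router covered."""
    kept = list(onus)
    for onu in list(onus):
        trial = [o for o in kept if o != onu]
        # Only accept the removal if coverage survives without this ONU.
        if trial and covered(routers, trial, radius):
            kept = trial
    return kept


# Hypothetical layout: the first ONU is redundant with the second.
routers = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
onus = [(0.5, 0.0), (0.0, 0.0), (5.0, 1.0)]
remaining = prune_onus(routers, onus, radius=1.5)
```

A greedy pass like this depends on the order in which ONUs are tried and is not guaranteed to find the smallest possible set; it only shows the coverage constraint that any reduction must respect.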
