Creating Customized CGRAs for Scientific Applications

George Charitopoulos; Ioannis Papaefstathiou; Dionisios N. Pnevmatikatos

doi:10.3390/electronics10040445

Electronics ◽

10.3390/electronics10040445 ◽

2021 ◽

Vol 10 (4) ◽

pp. 445

Author(s):

George Charitopoulos ◽

Ioannis Papaefstathiou ◽

Dionisios N. Pnevmatikatos

Keyword(s):

Energy Consumption ◽

Execution Time ◽

Coarse Grain ◽

Scientific Applications ◽

Cell Array ◽

Analysis Methods ◽

Reconfigurable Arrays ◽

Hardware Implementations ◽

Application Analysis ◽

Software Implementations

Executing complex scientific applications on Coarse-Grain Reconfigurable Arrays (CGRAs) offers improvements in execution time and/or energy consumption when compared to optimized software implementations or even fully customized hardware solutions. In this work, we explore the potential of application analysis methods in such customized hardware solutions. We offer analysis metrics from various scientific applications and tailor the results for use by MC-DeF, a novel Mixed-CGRA Definition Framework targeting a Mixed-CGRA architecture that leverages the advantages of CGRAs and those of FPGAs by utilizing a customized cell array along with a separate LUT array used for adaptability. Additionally, we present implementation results for the VHDL-created hardware implementations of our CGRA cell for various scientific applications.

MC-DeF

ACM Transactions on Architecture and Code Optimization ◽

10.1145/3447970 ◽

2021 ◽

Vol 18 (3) ◽

pp. 1-25

Author(s):

George Charitopoulos ◽

Dionisios N. Pnevmatikatos ◽

Georgi Gaydadjiev

Keyword(s):

Energy Consumption ◽

Design Space Exploration ◽

Cell Structure ◽

Cell Array ◽

Domain Specific ◽

Fine Grain ◽

Reconfigurable Arrays ◽

Field Programmable ◽

Final Design ◽

Novel Algorithms

Executing complex scientific applications on Coarse-Grain Reconfigurable Arrays (CGRAs) promises improvements in execution time and/or energy consumption compared to optimized software implementations or even fully customized hardware solutions. Typical CGRA architectures consist of multiple instances of the same compute module, which in turn consists of simple and general hardware units such as ALUs or simple processors. However, generality in the cell contents, while convenient for serving a wide variety of applications, penalizes performance and energy efficiency. To that end, a few proposed CGRAs use custom logic tailored to a particular application's specific characteristics in the compute module. This approach, while much more efficient, restricts the versatility of the array. To date, versatility at hardware speeds is only supported by Field-Programmable Gate Arrays (FPGAs), which are reconfigurable at a very fine grain. This work proposes MC-DeF, a novel Mixed-CGRA Definition Framework targeting a Mixed-CGRA architecture that leverages the advantages of CGRAs by utilizing a customized cell array, and those of FPGAs by incorporating a separate LUT array used for adaptability. The framework presented aims to develop a complete CGRA architecture. First, a cell structure and functionality definition phase creates highly customized application/domain-specific CGRA cells. Then, mapping and routing phases define the CGRA connectivity and cell-LUT array transactions. Finally, an energy and area estimation phase presents the user with area occupancy and energy consumption estimations of the final design. MC-DeF uses novel algorithms and cost functions driven by user-defined metrics, threshold values, and area/energy restrictions. The benefits of our framework, besides creating fast and efficient CGRA designs, include design space exploration capabilities offered to the user.
The validity of the presented framework is demonstrated by evaluating and creating CGRA designs of nine applications. Additionally, we provide comparisons of MC-DeF with state-of-the-art related works, and show that MC-DeF offers competitive performance (in terms of internal bandwidth and processing throughput) even compared against much larger designs, and requires fewer physical resources to achieve this level of performance. Finally, MC-DeF is able to better utilize the underlying FPGA fabric and achieves the best efficiency (measured in LUT/GOPs).

Joint Optimization Offloading Strategy of Execution Time and Energy Consumption of Mobile Edge Computing

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/5/114 ◽

2021 ◽

Vol 18 (5) ◽

Author(s):

Qingzhu Wang ◽

Xiaoyun Cui

Keyword(s):

Energy Consumption ◽

Mobile Devices ◽

Execution Time ◽

Optimal Solution ◽

Computation Offloading ◽

Joint Optimization ◽

Random Strategy ◽

Fitness Value ◽

Cloud Server ◽

Time And Energy

As mobile devices become more and more powerful, applications generate a large number of computing tasks, and mobile devices alone cannot meet the needs of users. This article proposes a computation offloading model in which the execution units include mobile devices, an edge server, and a cloud server. Previous studies on joint optimization considered only task execution time and the energy consumption of mobile devices, and ignored the energy consumption of the edge and cloud servers. However, edge server and cloud server energy consumption has a significant impact on the final offloading decision. This paper comprehensively considers the execution time and energy consumption of all three execution units, and formulates the task offloading decision as a single-objective optimization problem. A genetic algorithm with elitism preservation and a random strategy is adopted to obtain the optimal solution of the problem. Finally, simulation experiments show that the proposed computation offloading model achieves a lower fitness value than other computation offloading models.
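
The single-objective formulation described in this abstract can be illustrated with a small sketch. It is not the authors' model: the weighted-sum fitness, the weights `w_time` and `w_energy`, the cost table layout, and the use of random mutation as the "random strategy" are all assumptions made for illustration, and total time is simplified to a plain sum over tasks.

```python
import random

# Assumed encoding: one gene per task, 0 = mobile device, 1 = edge server, 2 = cloud server.
# costs[task][unit] is a hypothetical (time, energy) pair for running that task on that unit.

def fitness(assignment, costs, w_time=0.5, w_energy=0.5):
    """Weighted sum of total time and total energy across all three units (lower is better)."""
    total_time = sum(costs[t][u][0] for t, u in enumerate(assignment))
    total_energy = sum(costs[t][u][1] for t, u in enumerate(assignment))
    return w_time * total_time + w_energy * total_energy

def evolve(costs, pop_size=20, generations=50, elite=2, mutation_rate=0.1, seed=0):
    """Genetic search with elitism; returns the best assignment and the per-generation best fitness."""
    rng = random.Random(seed)
    n_tasks = len(costs)
    population = [[rng.randrange(3) for _ in range(n_tasks)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda a: fitness(a, costs))
        history.append(fitness(population[0], costs))
        # Elitism: the best individuals survive unchanged into the next generation.
        next_gen = [list(a) for a in population[:elite]]
        while len(next_gen) < pop_size:
            # Parents are drawn from the better half of the population.
            p1, p2 = rng.sample(population[:pop_size // 2], 2)
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Random mutation: occasionally reassign a task to a random unit.
            child = [rng.randrange(3) if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
    best = min(population, key=lambda a: fitness(a, costs))
    return best, history
```

Because the elite individuals are copied unchanged, the best fitness in `history` can never get worse from one generation to the next, which is the property elitism preservation is meant to provide.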

Experimental Evaluation of Probabilistic Execution-Time Modeling and Analysis Methods for SDF Applications on MPSoCs

Lecture Notes in Computer Science - Embedded Computer Systems: Architectures, Modeling, and Simulation ◽

10.1007/978-3-030-27562-4_17 ◽

2019 ◽

pp. 241-254

Author(s):

Ralf Stemmer ◽

Hai-Dang Vu ◽

Kim Grüttner ◽

Sebastien Le Nours ◽

Wolfgang Nebel ◽

...

Keyword(s):

Execution Time ◽

Experimental Evaluation ◽

Modeling And Analysis ◽

Analysis Methods ◽

Time Modeling

ANALOG HARDWARE IMPLEMENTATIONS OF ARTIFICIAL NEURAL NETWORKS

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126611007347 ◽

2011 ◽

Vol 20 (03) ◽

pp. 349-373 ◽

Cited By ~ 6

Author(s):

NADIA NEDJAH ◽

RODRIGO MARTINS DA SILVA ◽

LUIZA DE MACEDO MOURELLE

Keyword(s):

Computing System ◽

Activation Function ◽

Lookup Table ◽

Training Process ◽

Systems Software ◽

Hardware Implementations ◽

Intrinsic Parallelism ◽

Software Implementations ◽

Analog Implementation ◽

Artificial Neural

There are several possible implementations of artificial neural networks (ANNs), based on either software or hardware systems. Software implementations are rather inefficient because the intrinsic parallelism of the underlying computation is usually not exploited in a mono-processor computing system. Existing hardware implementations of ANNs are efficient, as the dedicated datapath used is optimized and the hardware is usually parallel. Hardware implementations of ANNs may be digital, analog, or even hybrid. Digital implementations of ANNs tend to be of high complexity, and thus of high cost, and somewhat imprecise due to the use of a lookup table for the activation function. On the other hand, analog implementations of ANNs are generally very simple and much more precise. In this paper, we focus on possible analog implementations of ANNs. The neuron is based on a simple operational amplifier. The reviewed implementations allow for the use of both negative and positive synaptic weights. An alternative implementation permits the realization of the training process.

Research on Network-on-Chip Dynamic and Adaptive Algorithm and Choice Strategy

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.539.296 ◽

2014 ◽

Vol 539 ◽

pp. 296-302

Author(s):

Dong Li

Keyword(s):

Energy Consumption ◽

Execution Time ◽

Adaptive Algorithm ◽

Network On Chip ◽

Time Dimension ◽

Choice Strategy ◽

Bus Structure ◽

On Chip ◽

Mapping Scheme ◽

Single Objective

As the number of on-chip devices continues to grow, the bus structure no longer meets the requirements. To improve communication between components, chip designers need to explore a new structure for interconnecting on-chip devices. The paper proposes a network-on-chip (NoC) dynamic and adaptive algorithm that uses a NoC platform with a 2-dimensional mesh as the carrier, incorporates communication energy consumption and delay into a unified cost function, and uses ant colony optimization to perform NoC mapping with respect to energy consumption and delay. The experiments indicate that, compared with random mapping, single-objective optimization saves 30%~47% in communication energy consumption and 20%~39% in execution time, respectively, and that joint-objective optimization can further exploit the potential of the time dimension in a mapping scheme dominated by energy.

Joint Optimization Offloading Strategy of Execution Time and Energy Consumption of Mobile Edge Computing

The International Arab Journal of Information Technology ◽

10.34028/iajit/18/5/11 ◽

2021 ◽

Vol 18 (5) ◽

Author(s):

Qingzhu Wang ◽

Xiaoyun Cui

Keyword(s):

Energy Consumption ◽

Execution Time ◽

Edge Computing ◽

Joint Optimization ◽

Mobile Edge Computing ◽

Time And Energy

Structural Coverage Analysis Methods

Advances in Computer and Electrical Engineering - Code Generation, Analysis Tools, and Testing for Quality ◽

10.4018/978-1-5225-7455-2.ch002 ◽

2019 ◽

pp. 36-63

Author(s):

Parnasi Retasbhai Patel ◽

Chintan M. Bhatt

Keyword(s):

Execution Time ◽

Source Code ◽

Second Phase ◽

Coverage Criteria ◽

Analysis Methods ◽

Coverage Analysis ◽

Structural Coverage ◽

Test Suite

Structural coverage analysis is a very common approach to measuring the quality of a test suite. Structural coverage determines which structures or portions of the software are not exercised. This chapter describes two phases for achieving structural coverage analysis under the DO-178B/C standards. Statement coverage is the most basic coverage criterion; it requires executing every executable statement in the source code at least once. Analysis of structural coverage can be done by capturing the amount of code that is covered by the airborne software. The first phase is the instrumentation procedure, which instruments the source code at execution time, and the second phase generates a report that specifies, as a percentage, which portions of the source code were executed and which were not.
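
The two phases described in this abstract, recording executed statements and then reporting a percentage, can be sketched in a few lines. This is only an illustration and not the chapter's tooling: it uses Python's `sys.settrace` hook as a stand-in for source instrumentation, treats lines that carry bytecode as the executable statements, and the names `run_with_coverage`, `executable_lines`, and `classify` are hypothetical.

```python
import dis
import sys

def executable_lines(func):
    """Line numbers that carry bytecode in func, used here as a proxy for executable statements."""
    return {line for _, line in dis.findlinestarts(func.__code__) if line is not None}

def run_with_coverage(func, *calls):
    """Phase 1: record executed lines while running func on each argument tuple.
    Phase 2: report statement coverage as a percentage plus the missed lines."""
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace frames that belong to the function under test.
        if frame.f_code is code:
            if event in ("call", "line"):
                executed.add(frame.f_lineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        for args in calls:
            func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous trace hook

    executable = executable_lines(func)
    covered = executed & executable
    missed = sorted(executable - covered)
    percent = 100.0 * len(covered) / len(executable)
    return percent, missed

def classify(x):
    # Small example function with two branches.
    if x < 0:
        label = "negative"
    else:
        label = "non-negative"
    return label
```

Calling `run_with_coverage(classify, (5,))` exercises only one branch, so the report lists the untaken assignment as missed; adding a second call with a negative argument brings statement coverage to 100%.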

Hardware Implementations of Image/Video Watermarking Algorithms

Advanced Techniques in Multimedia Watermarking - Advances in Multimedia and Interactive Technologies ◽

10.4018/978-1-61520-903-3.ch017 ◽

2010 ◽

pp. 425-454

Author(s):

Fayez M. Idris

Keyword(s):

Digital Media ◽

Digital Watermarking ◽

Video Watermarking ◽

Patient Records ◽

Electronic Patient Records ◽

Hardware Implementations ◽

Consumer Electronic ◽

Software Implementations ◽

Enhance Efficiency

Digital watermarking is a process in which a secondary pattern or signature, called a watermark, is hidden in digital media (e.g., images and video) such that it can be detected or extracted later for different purposes. Digital watermarking has many applications, including copyright protection, authentication, tamper detection, and embedding of electronic patient records in medical images. Various software implementations of digital watermarking algorithms can be built. While software implementations can address digital watermarking in off-line applications, they cannot meet the requirements of many applications. For example, in consumer electronic devices, a software solution would be very expensive. This has motivated the development of hardware implementations of digital watermarking. In this chapter, the authors present a detailed survey of existing hardware implementations of image and video watermarking algorithms. Fundamental design issues are discussed, and special techniques exploited to enhance efficiency are identified. Future outlooks are also presented to address the challenges of hardware architecture design for image and video watermarking.

A methodology correlating code optimizations with data memory accesses, execution time and energy consumption

The Journal of Supercomputing ◽

10.1007/s11227-019-02880-z ◽

2019 ◽

Vol 75 (10) ◽

pp. 6710-6745 ◽

Cited By ~ 1

Author(s):

Vasilios Kelefouras ◽

Karim Djemame

Keyword(s):

Energy Consumption ◽

Execution Time ◽

Data Memory ◽

Memory Accesses ◽

Time And Energy ◽

Code Optimizations

On the Energy Consumption of Quantum-resistant Cryptographic Software Implementations Suitable for Wireless Sensor Networks

Proceedings of the 16th International Joint Conference on e-Business and Telecommunications ◽

10.5220/0007835600720083 ◽

2019 ◽

Author(s):

Michael Heigl ◽

Laurin Doerr ◽

Martin Schramm ◽

Dalibor Fiala

Keyword(s):

Wireless Sensor Networks ◽

Energy Consumption ◽

Sensor Networks ◽

Wireless Sensor ◽

Software Implementations
