International Journal of Reconfigurable Computing
Latest Publications


TOTAL DOCUMENTS: 265 (FIVE YEARS: 18)

H-INDEX: 16 (FIVE YEARS: 2)

Published by Hindawi Limited

ISSN: 1687-7209, 1687-7195

2021, Vol 2021, pp. 1-29
Author(s): Dimple Sharma, Lev Kirischian

Autonomous mobile systems are one of the growing application areas of embedded systems in robotics, aerospace, the military, and related fields. Usually, such embedded systems have multitask multimodal workloads. These systems must sustain the required performance of their dynamic workloads in the presence of a varying power budget due to rechargeable power sources, varying die temperature due to changing workloads and/or external temperature, and varying hardware resources due to the occurrence of hardware faults. This paper proposes a run-time decision-making method, called Decision Space Explorer, for FPGA-based Systems-on-Chip (SoCs) to support changing workload requirements while simultaneously mitigating unpredictable variations in power budget, die temperature, and hardware resource constraints. It is based on the concept of Run-Time Structural Adaptation (RTSA): whenever there is a change in a system’s set of constraints, Explorer selects a suitable hardware processing circuit for each active task at an appropriate operating frequency such that all the constraints are satisfied. Explorer has been experimentally deployed on the ARM Cortex-A9 core of the Xilinx Zynq XC7Z020 SoC. Its worst-case decision-making time for different scenarios ranges from tens to hundreds of microseconds. Explorer is thus suitable for enabling RTSA in systems where specifications of multiple objectives must be maintained simultaneously, making them self-sustainable.
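The abstract does not give implementation details, but the core decision step can be illustrated with a small sketch. Everything below is hypothetical (the Variant record, pick_configuration, and all numbers are illustrative, not the authors' API): for each active task, one implementation variant is chosen so that total resources, power, and per-task performance all stay within the current budgets.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class Variant:
    """One hardware implementation option for a task (illustrative)."""
    name: str
    slices: int        # FPGA resources used
    freq_mhz: float    # operating frequency
    power_w: float     # estimated power at freq_mhz
    perf: float        # throughput delivered at freq_mhz

def pick_configuration(tasks, slice_budget, power_budget, perf_required):
    """Search one variant per task so that all budgets hold.

    `tasks` maps a task name to its list of Variant options; a real system
    would prune this search, but the brute-force form shows the decision.
    """
    names = list(tasks)
    best = None
    for combo in product(*(tasks[n] for n in names)):
        slices = sum(v.slices for v in combo)
        power = sum(v.power_w for v in combo)
        if slices > slice_budget or power > power_budget:
            continue
        if any(v.perf < perf_required[n] for n, v in zip(names, combo)):
            continue
        # Prefer the feasible configuration with the lowest power draw.
        if best is None or power < best[0]:
            best = (power, dict(zip(names, combo)))
    return None if best is None else best[1]

# Example: two tasks, two variants each, under a tightened power budget.
tasks = {
    "video":  [Variant("fast", 900, 150.0, 1.2, 60), Variant("slow", 400, 75.0, 0.5, 30)],
    "sensor": [Variant("fast", 300, 100.0, 0.4, 20), Variant("slow", 150, 50.0, 0.2, 10)],
}
cfg = pick_configuration(tasks, slice_budget=1200, power_budget=1.0,
                         perf_required={"video": 30, "sensor": 10})
print({t: v.name for t, v in cfg.items()})   # {'video': 'slow', 'sensor': 'slow'}
```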


2021, Vol 2021, pp. 1-31
Author(s): Khaled Allem, El-Bay Bourennane, Youcef Khelfaoui

To deal with the complex design issues of Dynamically Reconfigurable Systems-on-Chip (DRSoCs), it is extremely relevant to raise the abstraction level at which models are expressed. A high abstraction level allows great flexibility and reusability while bypassing low-level implementation details. In this context, model-driven engineering (MDE) provides support to build and transform precise and structured models for a particular purpose at different levels of abstraction. Indeed, high-level models are successively refined into low-level models until reaching executable ones. Thus, this paper presents an MDE-based framework for DRSoC design enabling the transformation of UML/MARTE specifications into a SystemC/TLM implementation. To achieve a high degree of expressiveness for modeling dynamic reconfiguration, we use a suitable software engineering approach based on a service-oriented component architecture. Since MARTE does not cover the common features of the dynamic reconfiguration domain or service-orientation concepts, new stereotypes are created by refinement to add the missing capabilities to the profile. Likewise, SystemC does not provide native support for dynamic reconfiguration, leading us to adopt a design-pattern-based solution for DRSoC implementation in compliance with standards. The proposed framework is validated through a reconfigurable active 3-way crossover case study in which we demonstrate the practicability of the approach through gradual model transformations with reduced implementation effort and a significant design productivity gain.
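The abstract notes that SystemC lacks native support for dynamic reconfiguration, which is why a design-pattern-based solution is adopted. As a language-agnostic illustration only (not the authors' SystemC/TLM pattern, and all names are hypothetical), a wrapper that delegates to whichever implementation is currently "loaded" captures the basic idea of emulating region swapping:

```python
class ReconfigurableRegion:
    """Stand-in for a partially reconfigurable area (illustrative only).

    Calls are delegated to whichever implementation is currently loaded,
    mimicking how a wrapper pattern can emulate module swapping in a
    language with no native reconfiguration support.
    """
    def __init__(self, initial_impl):
        self._impl = initial_impl

    def reconfigure(self, new_impl):
        # A real DRSoC model would also account for reconfiguration latency.
        self._impl = new_impl

    def process(self, sample):
        return self._impl.process(sample)

class LowPassFilter:
    def process(self, sample):
        return 0.5 * sample    # placeholder behaviour

class HighPassFilter:
    def process(self, sample):
        return 0.9 * sample    # placeholder behaviour

region = ReconfigurableRegion(LowPassFilter())
print(region.process(1.0))     # 0.5
region.reconfigure(HighPassFilter())
print(region.process(1.0))     # 0.9, same interface, swapped implementation
```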


2021, Vol 2021, pp. 1-20
Author(s): Dimple Sharma, Lev Kirischian

Autonomous mobile systems nowadays deploy FPGA-based Systems on Programmable Chips (SoPCs) to support their dynamic multitask multimodal workloads. For such field-deployed systems, activation times, execution periods of tasks, and variations in environmental conditions are usually difficult to predict. These dynamic variations result in a new challenge of dynamic thermal cycling stress on the SoPC die, which can cause transient and even permanent hardware faults in the computing system. This paper proposes the approach of run-time structural adaptation (RTSA) to mitigate dynamic thermal cycling stress on SoPC dies. RTSA assumes that tasks have multiple implementation variants, called Application Specific Processing (ASP) circuit variants, which differ in hardware resources, operating frequency, and power consumption. Dynamically reconfiguring appropriate ASP circuit variants of tasks allows systems to keep their die temperature in the desired range while taking into account variations in power budget and modes of operation. The essence of RTSA is therefore a decision-making mechanism that can select at run-time, whenever needed, a suitable system configuration (a set of ASP circuit variants of active tasks) that meets the die temperature constraints. To do so, run-time die temperature prediction for potential system configurations using an estimation model is required. This paper presents a generic method to derive an analytical model for any SoPC that can estimate the die temperature in real time and thus support the decision-making mechanism. To develop this method, the thermal behavior of the SoPC die under different task scenarios is studied, and the relation of die temperature to frequency, resource utilization, and power consumption is analyzed. An RTSA-enabled experimental platform is set up on a Xilinx Zynq XC7Z020 SoPC for this purpose. Experimental results also demonstrate that the proposed method can be used to derive the model at run-time, thus enabling systems to self-derive and dynamically update the model during operation.
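The abstract does not state the model form, but the idea of deriving a die-temperature estimator from measured operating points can be sketched with a first-order fit. The sketch below assumes the common steady-state relation T_die ≈ T_ambient + θ_JA · P with hypothetical calibration data; the authors' generic method and model may differ.

```python
import numpy as np

# Hypothetical calibration measurements: (utilization %, freq MHz, power W, die temp °C)
samples = np.array([
    [20.0,  50.0, 0.6, 38.0],
    [40.0, 100.0, 1.4, 47.0],
    [60.0, 100.0, 1.9, 52.5],
    [80.0, 150.0, 3.1, 66.0],
])
t_ambient = 25.0

power = samples[:, 2]
t_die = samples[:, 3]

# Least-squares fit of an effective junction-to-ambient thermal resistance
# (one parameter, no intercept): T_die - T_ambient ≈ theta_ja * P.
theta_ja, *_ = np.linalg.lstsq(power.reshape(-1, 1), t_die - t_ambient, rcond=None)
theta_ja = float(theta_ja[0])

def estimate_die_temp(power_w, ambient_c=t_ambient):
    """Predict steady-state die temperature for a candidate configuration."""
    return ambient_c + theta_ja * power_w

print(round(theta_ja, 1), round(estimate_die_temp(2.5), 1))   # ~14.0, ~60.1 for this data
```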


2020, Vol 2020, pp. 1-2
Author(s): Wim Vanderbauwhede, Sven-Bodo Scholz, Martin Margala


2020, Vol 2020, pp. 1-11
Author(s): Yuzhi Zhou, Xi Jin, Tianqi Wang

The traditional A∗ algorithm is time-consuming due to the large number of iteration operations required to calculate the evaluation function and sort the OPEN list. To achieve real-time path-planning performance, a hardware accelerator architecture, called the A∗ accelerator, has been designed and implemented on a field-programmable gate array (FPGA). A specially designed 8-port cache and an OPEN list array are introduced to tackle the calculation bottleneck. A system-on-a-chip (SoC) design is implemented on a Xilinx Kintex-7 FPGA to evaluate the A∗ accelerator. Experiments show that the hardware accelerator achieves a 37–75 times performance enhancement relative to a software implementation. It is suitable for real-time path-planning applications.
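As a point of reference for the bottleneck the abstract names, a minimal software A∗ on a 4-connected grid is sketched below (illustrative only, not the accelerator's design): each iteration re-evaluates f = g + h and maintains the ordered OPEN list, which is exactly the work the 8-port cache and OPEN list array are built to parallelize.

```python
import heapq

def astar(grid, start, goal):
    """Minimal software A* on a 4-connected grid (0 = free, 1 = blocked).

    The OPEN list is a binary heap ordered by f = g + h; keeping this
    ordering and re-evaluating f is the per-iteration cost targeted by
    the hardware accelerator.
    """
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_list = [(h(start), 0, start)]          # (f, g, node)
    g_cost = {start: 0}
    closed = set()
    while open_list:
        f, g, node = heapq.heappop(open_list)
        if node == goal:
            return g
        if node in closed:
            continue
        closed.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_list, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))   # 6 steps around the blocked row
```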


2020, Vol 2020, pp. 1-19
Author(s): Jahanzeb Anwer, Sebastian Meisner, Marco Platzner

Radiation tolerance in FPGAs is an important field of research, particularly for reliable computation in electronics used in aerospace and satellite missions. The motivation behind this research is the degradation of reliability in FPGA hardware due to single-event effects caused by radiation particles. Redundancy is a commonly used technique to enhance the fault-tolerance capability of radiation-sensitive applications. However, redundancy comes with an overhead in terms of excessive area consumption, latency, and power dissipation. Moreover, redundant circuit implementations vary in structure and resource usage with the redundancy insertion algorithm as well as with the number of redundant stages used. The radiation environment varies during the operational time span of the mission depending on the orbit and space weather conditions. Therefore, the overheads due to redundancy should also be optimized at run-time with respect to the current radiation level. In this paper, we propose a technique called Dynamic Reliability Management (DRM) that utilizes the radiation data, interprets it, selects a suitable redundancy level, and performs run-time reconfiguration, thus varying the reliability level of the target computation modules. DRM is composed of two parts. The design-time tool flow of DRM generates a library of redundant implementations of the circuit with different performance factors. The run-time tool flow, utilizing the radiation/error-rate data, selects the required redundancy level and reconfigures the computation module with the corresponding redundant implementation. Both parts of DRM have been verified by experimentation on various benchmarks. The most significant finding from this experimentation is that performance can be scaled multiple times by using the partial reconfiguration feature of DRM; e.g., 7.7 and 3.7 times better performance was obtained for our data sorter and matrix multiplier case studies compared with static reliability management techniques. DRM therefore allows a suitable trade-off between computation reliability and performance overhead to be maintained during the run-time of an application.
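The run-time part of DRM can be pictured as a small policy loop. The sketch below is a hypothetical illustration (the thresholds, bitstream names, and the reconfigure() stub are assumptions, not the paper's tool flow): a measured upset rate is mapped to a redundancy level, and the matching pre-generated partial bitstream is loaded only when the level changes.

```python
# Pre-generated at design time, one partial bitstream per redundancy level
# (names are placeholders).
BITSTREAMS = {
    "simplex": "sorter_simplex.bit",
    "dmr":     "sorter_dmr.bit",   # duplication with comparison
    "tmr":     "sorter_tmr.bit",   # triple modular redundancy
}

def select_redundancy(upsets_per_hour):
    """Pick the cheapest redundancy level that covers the current error rate."""
    if upsets_per_hour < 0.1:
        return "simplex"           # benign environment: run at full performance
    if upsets_per_hour < 1.0:
        return "dmr"
    return "tmr"                   # harsh environment: maximum masking

def reconfigure(region, bitstream):
    # Placeholder for the vendor partial-reconfiguration call (e.g. via ICAP/PCAP).
    print(f"loading {bitstream} into {region}")

current = None
for rate in (0.02, 0.5, 3.0):      # radiation/error-rate samples over time
    level = select_redundancy(rate)
    if level != current:           # reconfigure only when the level actually changes
        reconfigure("sorter_region", BITSTREAMS[level])
        current = level
```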


2019, Vol 2019, pp. 1-18
Author(s): Xin Fang, Stratis Ioannidis, Miriam Leeser

Secure Function Evaluation (SFE) has received recent attention due to the massive collection and mining of personal data but remains impractical due to its large computational cost. Garbled Circuits (GC) is a protocol for implementing SFE that can evaluate any function expressible as a Boolean circuit and obtain the result while keeping each party’s input private. Recent advances have led to a surge of garbled circuit implementations in software for a variety of different tasks. However, these implementations are inefficient, and therefore GC is not widely used, especially for large problems. This research investigates, implements, and evaluates secure computation generation using a heterogeneous computing platform featuring FPGAs. We have designed and implemented SIFO: secure computational infrastructure using FPGA overlays. Unlike traditional FPGA design, a coarse-grained overlay architecture is adopted, which supports mapping SFE problems that are too large to fit on a single FPGA. Host tools provided include an SFE problem generator, a parser, and automatic host code generation. Our design allows repurposing an FPGA to evaluate different SFE tasks without the need for reprogramming and fully explores the parallelism for any GC problem. Our system demonstrates an order of magnitude speedup compared with an existing software platform.
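The gate-level operation that GC implementations repeat millions of times can be sketched compactly. The toy Python below garbles and evaluates a single AND gate using a try-all-rows scheme with a validity tag; it is only a conceptual illustration, and real systems (including the overlay described here) rely on optimizations such as point-and-permute, free-XOR, and half-gates instead of this construction.

```python
import os
import random
import hashlib

TAG = b"\x00" * 16                                   # validity tag appended to plaintexts

def H(*labels):
    return hashlib.sha256(b"".join(labels)).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def garble_and_gate():
    # Two random 16-byte labels per wire; index 0 encodes bit 0, index 1 encodes bit 1.
    wa, wb, wo = ([os.urandom(16), os.urandom(16)] for _ in range(3))
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = wo[bit_a & bit_b]
            pad = H(wa[bit_a], wb[bit_b])            # 32-byte one-time pad
            table.append(xor(pad, out_label + TAG))  # encrypt output label plus tag
    random.shuffle(table)                            # hide which row matches which inputs
    return wa, wb, wo, table

def evaluate(label_a, label_b, table):
    """The evaluator holds one label per input wire and tries every row."""
    pad = H(label_a, label_b)
    for row in table:
        plain = xor(pad, row)
        if plain.endswith(TAG):                      # only the matching row decrypts cleanly
            return plain[:16]
    raise ValueError("no row decrypted")

wa, wb, wo, table = garble_and_gate()
# For inputs a=1, b=0 the evaluator recovers the output label for 1 AND 0 = 0,
# without learning the underlying input bits.
assert evaluate(wa[1], wb[0], table) == wo[0]
```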


2019, Vol 2019, pp. 1-17
Author(s): Rym Skhiri, Virginie Fresse, Jean Paul Jamont, Benoit Suffran, Jihene Malek

Field Programmable Gate Arrays (FPGAs) draw significant attention from both industry and academia by accelerating computationally expensive applications while achieving low power consumption. FPGAs are attractive due to their flexibility and reconfigurability. Cloud computing has become a major trend towards the dematerialization of infrastructure and computing resources. It provides “unlimited” storage capacity and a wealth of data and applications that make collaboration easier between multiple (not necessarily domain-specific) designers. Many papers in the literature have surveyed the cloud and FPGAs separately, in particular their services and challenges. FPGA acceleration of applications and the unlimited capacity of the cloud are expected to become more and more pervasive. As more and more FPGAs are deployed in traditional clouds, it is appropriate to clarify what cloud FPGA is and which drawbacks of using FPGAs locally it resolves. We present a survey of the cloud FPGA works that have been proposed to exploit the advantages of using FPGAs in the cloud. We classify these studies into three service categories to highlight their benefits and limitations. This survey aims to motivate further research on cloud FPGA.


2019, Vol 2019, pp. 1-12
Author(s): Syed Waqar Nabi, Wim Vanderbauwhede

There is a large body of legacy scientific code in use today that could benefit from execution on accelerator devices like GPUs and FPGAs. Manually translating such legacy code into device-specific parallel code requires significant effort and is a major obstacle to wider FPGA adoption. We are developing an automated optimizing compiler, TyTra, to overcome this obstacle. The TyTra flow aims to compile legacy Fortran code automatically for FPGA-based acceleration while applying suitable optimizations. We present the flow with a focus on two key optimizations, automatic pipelining and vectorization. Our compiler frontend extracts patterns from legacy Fortran code that can be pipelined and vectorized. The backend first creates fine- and coarse-grained pipelines and then automatically vectorizes both the memory access and the datapath based on a cost model, generating a working OpenCL-HDL hybrid solution for FPGA targets on the Amazon cloud. Our results show up to a 4.2× performance improvement over the baseline OpenCL code.
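The vectorization decision driven by a cost model can be illustrated with a toy example. The sketch below is purely hypothetical (the constants and the model are assumptions, not TyTra's cost model): it picks the widest lane count that both fits the device and can be kept busy by the available memory bandwidth.

```python
def choose_vector_width(lanes_options, lut_per_lane, lut_budget,
                        bytes_per_elem, clock_mhz, mem_bw_gbps):
    """Return the widest lane count whose streaming demand the memory can feed."""
    best = 1
    for lanes in sorted(lanes_options):
        if lanes * lut_per_lane > lut_budget:
            continue                              # does not fit on the device
        # Bytes consumed per second by the lanes, expressed in Gbit/s.
        demand_gbps = lanes * bytes_per_elem * clock_mhz * 1e6 * 8 / 1e9
        if demand_gbps <= mem_bw_gbps:
            best = lanes                          # memory can keep these lanes busy
    return best

# 4: beyond 4 lanes the memory interface becomes the bottleneck in this toy setup.
print(choose_vector_width(lanes_options=(1, 2, 4, 8, 16),
                          lut_per_lane=20_000, lut_budget=200_000,
                          bytes_per_elem=4, clock_mhz=200, mem_bw_gbps=32))
```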


2019, Vol 2019, pp. 1-19
Author(s): Karim M. A. Ali, Rabie Ben Atitallah, Abdessamad Ait El Cadi, Nizar Fakhfakh, Jean-Luc Dekeyser

Embedded video applications are now involved in sophisticated transportation systems like autonomous vehicles and driver assistance systems. As silicon capacity increases, the design productivity gap widens for currently available design tools. Hence, high-level synthesis (HLS) tools have emerged to reduce that gap by shifting design effort to higher abstraction levels. In this paper, we present ViPar, a tool for exploring different video processing architectures at a higher design level. First, we propose a parametrizable parallel architectural model dedicated to video applications. Second, targeting this architectural model, we developed the ViPar tool with two main features: (1) an empirical model was introduced to estimate the power consumption based on hardware utilization and operating frequency; in addition, we derived the equations for estimating the hardware utilization and execution time for each design point during the design space exploration process; (2) by defining the main characteristics of the parallel video architecture, such as the parallelism level, the number of input/output ports, and the pixel distribution pattern, the ViPar tool can automatically generate the dedicated architecture for hardware implementation. In the experimental validation, we used ViPar to automatically generate an efficient hardware implementation of a Multiwindow Sum of Absolute Difference stereo matching algorithm on a Xilinx Zynq ZC706 board. We succeeded in increasing design productivity by converging rapidly on appropriate designs that fit our system constraints in terms of power consumption, hardware utilization, and frame execution time.
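The kind of empirical power model described in feature (1) can be sketched as a linear function of resource utilization scaled by frequency on top of a static floor. The coefficients and resource counts below are hypothetical placeholders, not ViPar's fitted values.

```python
# Illustrative empirical power model: static floor plus utilization- and
# frequency-dependent dynamic power.  All constants are placeholders.
COEFF = {                 # mW per resource unit per MHz (illustrative)
    "lut":  0.0004,
    "ff":   0.0002,
    "bram": 0.02,
    "dsp":  0.01,
}
P_STATIC_MW = 120.0       # static power floor of the device (illustrative)

def estimate_power_mw(utilization, freq_mhz):
    """Estimate total power for a design point from resource counts and clock."""
    dynamic = freq_mhz * sum(COEFF[r] * utilization[r] for r in COEFF)
    return P_STATIC_MW + dynamic

# Example design point: one parallelism level of a stereo-matching pipeline.
design = {"lut": 45_000, "ff": 60_000, "bram": 80, "dsp": 40}
print(round(estimate_power_mw(design, freq_mhz=150.0)))   # ~4920 mW with these placeholders
```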

