Improving Performance-Power-Programmability in Space Avionics with Edge Devices: VBN on Myriad2 SoC

2021 ◽  
Vol 20 (3) ◽  
pp. 1-23
Author(s):  
Vasileios Leon ◽  
George Lentaris ◽  
Evangelos Petrongonas ◽  
Dimitrios Soudris ◽  
Gianluca Furano ◽  
...  

The advent of powerful edge devices and AI algorithms has already revolutionized many terrestrial applications; however, for both technical and historical reasons, the space industry is still striving to adopt these key enabling technologies in new mission concepts. In this context, the current work evaluates an heterogeneous multi-core system-on-chip processor for use on-board future spacecraft to support novel, computationally demanding digital signal processors and AI functionalities. Given the importance of low power consumption in satellites, we consider the Intel Movidius Myriad2 system-on-chip and focus on SW development and performance aspects. We design a methodology and framework to accommodate efficient partitioning, mapping, parallelization, code optimization, and tuning of complex algorithms. Furthermore, we propose an avionics architecture combining this commercial off-the-shelf chip with a field programmable gate array device to facilitate, among others, interfacing with traditional space instruments via SpaceWire transcoding. We prototype our architecture in the lab targeting vision-based navigation tasks. We implement a representative computer vision pipeline to track the 6D pose of ENVISAT using megapixel images during hypothetical spacecraft proximity operations. Overall, we achieve 2.6 to 4.9 FPS with only 0.8 to 1.1 W on Myriad2 , i.e., 10-fold acceleration versus modern rad-hard processors. Based on the results, we assess various benefits of utilizing Myriad2 instead of conventional field programmable gate arrays and CPUs.

2018 ◽  
Vol 7 (2.16) ◽  
pp. 57
Author(s):  
G Prasad Acharya ◽  
M Asha Rani

The increased demand for processor-level parallelism has many-folded the challenges for SoC designers to design, simulate and verify/validate today’s Multi-core System-On-Chip (SoC) due to the increased system complexity. There is also a need to reduce the design cycle time to produce a complex multi-core SOC system thereby the product can be brought into the market within an affordable time. The Computer-Aided Design (CAD) tools and Field Programmable Gate Arrays (FPGAs) provide a solution for rapidly prototyping and validating the system. This paper presents an implementation of multi-core SoC consisting of 6 Xilinx Micro-Blaze soft-core processors integrated to the Zynq Processing System (PS) using IP Integrator and these cores will be communicated through AXI bus. The functionality of the system is verified using Micro-Blaze system debugger. The hardware framework for the implemented system is implemented and verified on FPGA.  


Electronics ◽  
2019 ◽  
Vol 8 (5) ◽  
pp. 528 ◽  
Author(s):  
Julian Viejo ◽  
Jorge Juan-Chico ◽  
Manuel J. Bellido ◽  
Paulino Ruiz-de-Clavijo ◽  
David Guerrero ◽  
...  

This paper presents the complete design and implementation of a low-cost, low-footprint, network time protocol server core for field programmable gate arrays. The core uses a carefully designed modular architecture, which is fully implemented in hardware using digital circuits and systems. Most remarkable novelties introduced are a hardware-optimized timekeeping algorithm implementation, and a full-hardware protocol stack and automatic network configuration. As a result, the core is able to achieve similar accuracy and performance to typical high-performance network time protocol server equipment. The core uses a standard global positioning system receiver as time reference, has a small footprint and can easily fit in a low-range field-programmable chip, greatly scaling down from previous system-on-chip time synchronization systems. Accuracy and performance results show that the core can serve hundreds of thousands of network time clients with negligible accuracy degradation, in contrast to state-of-the-art high-performance time server equipment. Therefore, this core provides a valuable time server solution for a wide range of emerging embedded and distributed network applications such as the Internet of Things and the smart grid, at a fraction of the cost and footprint of current discrete and embedded solutions.


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1587
Author(s):  
Duo Sheng ◽  
Hsueh-Ru Lin ◽  
Li Tai

High performance and complex system-on-chip (SoC) design require a throughput and stable timing monitor to reduce the impacts of uncertain timing and implement the dynamic voltage and frequency scaling (DVFS) scheme for overall power reduction. This paper presents a multi-stage timing monitor, combining three timing-monitoring stages to achieve a high timing-monitoring resolution and a wide timing-monitoring range simultaneously. Additionally, because the proposed timing monitor has high immunity to the process–voltage–temperature (PVT) variation, it provides a more stable time-monitoring results. The time-monitoring resolution and range of the proposed timing monitor are 47 ps and 2.2 µs, respectively, and the maximum measurement error is 0.06%. Therefore, the proposed multi-stage timing monitor provides not only the timing information of the specified signals to maintain the functionality and performance of the SoC, but also makes the operation of the DVFS scheme more efficient and accurate in SoC design.


2021 ◽  
Vol 27 (3) ◽  
pp. 57-70
Author(s):  
Damjan M. Rakanovic ◽  
Vuk Vranjkovic ◽  
Rastislav J. R. Struharik

Paper proposes a two-step Convolutional Neural Network (CNN) pruning algorithm and resource-efficient Field-programmable gate array (FPGA) CNN accelerator named “Argus”. The proposed CNN pruning algorithm first combines similar kernels into clusters, which are then pruned using the same regular pruning pattern. The pruning algorithm is carefully tailored for FPGAs, considering their resource characteristics. Regular sparsity results in high Multiply-accumulate (MAC) efficiency, reducing the amount of logic required to balance workloads among different MAC units. As a result, the Argus accelerator requires about 170 Look-up tables (LUTs) per Digital Signal Processor (DSP) block. This number is close to the average LUT/DPS ratio for various FPGA families, enabling balanced resource utilization when implementing Argus. Benchmarks conducted using Xilinx Zynq Ultrascale + Multi-Processor System-on-Chip (MPSoC) indicate that Argus is achieving up to 25 times higher frames per second than NullHop, 2 and 2.5 times higher than NEURAghe and Snowflake, respectively, and 2 times higher than NVDLA. Argus shows comparable performance to MIT’s Eyeriss v2 and Caffeine, requiring up to 3 times less memory bandwidth and utilizing 4 times fewer DSP blocks, respectively. Besides the absolute performance, Argus has at least 1.3 and 2 times better GOP/s/DSP and GOP/s/Block-RAM (BRAM) ratios, while being competitive in terms of GOP/s/LUT, compared to some of the state-of-the-art solutions.


Cryptography ◽  
2020 ◽  
Vol 4 (1) ◽  
pp. 6 ◽  
Author(s):  
Saleh Mulhem ◽  
Ayoub Mars ◽  
Wael Adi

New large classes of permutations over ℤ 2 n based on T-Functions as Self-Inverting Permutation Functions (SIPFs) are presented. The presented classes exhibit negligible or low complexity when implemented in emerging FPGA technologies. The target use of such functions is in creating the so called Secret Unknown Ciphers (SUC) to serve as resilient Clone-Resistant structures in smart non-volatile Field Programmable Gate Arrays (FPGA) devices. SUCs concepts were proposed a decade ago as digital consistent alternatives to the conventional analog inconsistent Physical Unclonable Functions PUFs. The proposed permutation classes are designed and optimized particularly to use non-consumed Mathblock cores in programmable System-on-Chip (SoC) FPGA devices. Hardware and software complexities for realizing such structures are optimized and evaluated for a sample expected target FPGA technology. The attained security levels of the resulting SUCs are evaluated and shown to be scalable and usable even for post-quantum crypto systems.


2010 ◽  
Vol 34 (2-4) ◽  
pp. 102-116 ◽  
Author(s):  
Hyun-min Kyung ◽  
Gi-ho Park ◽  
Jong Wook Kwak ◽  
Tae-jin Kim ◽  
Sung-Bae Park

Sign in / Sign up

Export Citation Format

Share Document