Fast and Accurate Code Placement of Embedded Software for Hybrid On-Chip Memory Architecture

Author(s):  
Zimeng Zhou ◽  
Lei Ju ◽  
Zhiping Jia ◽  
Xin Li
Author(s):  
Katalin Popovici ◽  
Frédéric Rousseau ◽  
Ahmed A. Jerraya ◽  
Marilyn Wolf

2020 ◽  
Vol 12 (2) ◽  
pp. 116-121
Author(s):  
Rastislav Struharik ◽  
Vuk Vranjković

Data movement between Convolutional Neural Network (CNN) accelerators and off-chip memory is a major contributor to overall power consumption, and minimizing it is particularly important for low-power embedded applications. Specific CNN compute patterns offer the possibility of significant data reuse, leading to the idea of using specialized on-chip cache memories that can substantially reduce power consumption. However, due to the unique caching pattern present within CNNs, standard cache memories are not efficient. In this paper, a novel on-chip cache memory architecture based on the idea of input feature map striping is proposed, which requires significantly fewer on-chip memory resources than previously proposed solutions. Experimental results show that the proposed cache architecture can reduce on-chip memory size by a factor of 16 or more, while increasing power consumption by no more than 15%, compared to some of the previously proposed solutions.
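
To make the size argument concrete, the sketch below estimates the on-chip buffer needed when an input feature map is cached as a horizontal stripe of rows rather than in full. It is only an illustration of the striping idea under assumed layer dimensions and parameter names (stripe_rows, kernel_rows); it is not the cache architecture evaluated in the paper.

    # Illustrative sketch only (not the paper's design): compares the on-chip
    # buffer for a full input feature map with that for a stripe of rows plus
    # the halo rows a convolution window needs. All sizes are example values.

    def buffer_bytes_full(height, width, channels, bytes_per_elem=1):
        """On-chip bytes needed to hold a complete input feature map."""
        return height * width * channels * bytes_per_elem

    def buffer_bytes_striped(stripe_rows, width, channels, kernel_rows=3, bytes_per_elem=1):
        """On-chip bytes for a stripe: the stripe itself plus (kernel_rows - 1)
        halo rows kept so the convolution window stays on-chip."""
        return (stripe_rows + kernel_rows - 1) * width * channels * bytes_per_elem

    if __name__ == "__main__":
        H, W, C = 224, 224, 64                       # example layer size
        full = buffer_bytes_full(H, W, C)
        striped = buffer_bytes_striped(stripe_rows=8, width=W, channels=C)
        print(f"full feature map buffer : {full / 1024:.0f} KiB")
        print(f"striped buffer          : {striped / 1024:.0f} KiB")
        print(f"reduction factor        : {full / striped:.1f}x")

With these example dimensions the stripe buffer comes out roughly 20x smaller than the full-feature-map buffer, which is in the same spirit as the 16x reduction reported above, although the exact factor depends on the layer and stripe sizes chosen.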


Author(s):  
J. Santhi ◽  
L. Srinivas

Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence, specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. We present a hardware-implementable pattern matching algorithm for content filtering applications that is scalable in terms of speed, the number of patterns, and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can match several thousand patterns at more than 10 Gbps using less than 50 KB of embedded memory and a few megabytes of external SRAM.
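
The following is a minimal software sketch of the Bloom-filter pre-check idea described above: an on-chip Bloom filter built from pattern prefixes is queried at each input offset, and the slower exact match (standing in for the off-chip automaton lookup) runs only on a Bloom hit. The hash choice, filter size, and the use of a plain prefix comparison instead of the paper's modified multi-character Aho-Corasick automaton are assumptions made for illustration.

    # Minimal sketch (not the paper's hardware design): a Bloom filter over
    # pattern prefixes filters out most positions before the exact match step.

    import hashlib

    class BloomFilter:
        def __init__(self, size_bits=8192, num_hashes=4):
            self.size = size_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(size_bits // 8)

        def _positions(self, item: bytes):
            for i in range(self.num_hashes):
                h = hashlib.sha256(i.to_bytes(1, "big") + item).digest()
                yield int.from_bytes(h[:4], "big") % self.size

        def add(self, item: bytes):
            for p in self._positions(item):
                self.bits[p // 8] |= 1 << (p % 8)

        def might_contain(self, item: bytes) -> bool:
            return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

    def scan(text: bytes, patterns):
        """Query the 'on-chip' Bloom filter at each offset; only a hit triggers
        the exact ('off-chip') pattern comparison."""
        prefix_len = min(len(p) for p in patterns)
        bloom = BloomFilter()
        for p in patterns:
            bloom.add(p[:prefix_len])

        matches, exact_lookups = [], 0
        for i in range(len(text) - prefix_len + 1):
            if bloom.might_contain(text[i:i + prefix_len]):
                exact_lookups += 1
                for p in patterns:          # exact check stands in for the automaton
                    if text.startswith(p, i):
                        matches.append((i, p.decode()))
        return matches, exact_lookups

    if __name__ == "__main__":
        pats = [b"attack", b"worm", b"virus"]
        print(scan(b"benign traffic ... worm payload ... attack", pats))

Because a Bloom filter has no false negatives, the pre-check never discards a real match; false positives only cost an occasional extra exact lookup, which is why a small on-chip filter can suppress most accesses to the larger external memory.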


2020 ◽  
Vol 25 (3) ◽  
pp. 46 ◽  
Author(s):  
Mario Kovač ◽  
Philippe Notton ◽  
Daniel Hofman ◽  
Josip Knezović

In this paper, we present an overview of the European Processor Initiative (EPI), one of the cornerstones of the EuroHPC Joint Undertaking, a new European Union strategic entity focused on pooling the Union's and national resources on HPC to acquire, build, and deploy the most powerful supercomputers in the world within Europe. EPI started its activities in December 2018. Its first three years brought together processor and platform designers and experts in embedded software, middleware, applications, and usage from 10 EU countries to co-design Europe's first HPC Systems on Chip and accelerators around its unique Common Platform (CP) technology. One of EPI's core activities also takes place in the automotive sector, providing architectural solutions for a novel embedded high-performance computing (eHPC) platform and ensuring the overall economic viability of the initiative.

