Accelerating Deep Learning Inference with Cross-Layer Data Reuse on GPUs

Author(s): Xueying Wang, Guangli Li, Xiao Dong, Jiansong Li, Lei Liu, ...

IEEE Access, 2019, Vol. 7, pp. 157730-157740
Author(s): Shu-Ming Tseng, Yung-Fang Chen, Cheng-Shun Tsai, Wen-Da Tsai

Computing, 2021
Author(s): Feng Wu, Hongwei Lv, Tongrang Fan, Wenbin Zhao, Jiaqi Wang

Electronics, 2021, Vol. 10 (9), pp. 1025
Author(s): Ran Wu, Xinmin Guo, Jian Du, Junbao Li

The breakthrough of deep learning has started a technological revolution in areas such as object identification, image/video recognition, and semantic segmentation. Neural networks, as the representative workload of deep learning, have been widely adopted, and many efficient models have been developed. However, deploying neural network inference at the edge is constrained by the conflict between the high computational and storage complexity of these models and the resource-limited hardware platforms found in application scenarios. In this paper, we study the acceleration of neural network inference on FPGA-based platforms. We analyze, compare, and summarize network architectures and FPGA characteristics, as well as their influence on acceleration tasks. Based on this analysis, we generalize the acceleration strategies into five aspects: computational complexity, computational parallelism, data reuse, pruning, and quantization. Previous work on neural network acceleration is then reviewed under these topics. We summarize how to design a technical route for practical applications based on these strategies, and discuss the challenges along this path to provide guidance for future work.
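To make the last two strategies concrete, the following is a minimal NumPy sketch of magnitude-based weight pruning followed by symmetric per-tensor 8-bit quantization, two compression steps typically applied before mapping a network onto resource-limited hardware. It is an illustrative sketch under common assumptions, not code from the surveyed works, and the helper names (magnitude_prune, quantize_int8) are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so roughly `sparsity` of them are zero."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def quantize_int8(weights: np.ndarray):
    """Symmetric uniform quantization: map floats to int8 with one per-tensor scale."""
    scale = float(np.max(np.abs(weights))) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy example: prune half of a random weight matrix, then quantize to int8.
w = np.random.randn(64, 64).astype(np.float32)
w_pruned = magnitude_prune(w, sparsity=0.5)
q, scale = quantize_int8(w_pruned)
w_restored = q.astype(np.float32) * scale  # dequantized approximation
print("sparsity:", float(np.mean(w_pruned == 0)))
print("max quantization error:", float(np.max(np.abs(w_pruned - w_restored))))
```

Per-tensor symmetric scaling is the simplest quantization scheme; FPGA deployments often prefer per-channel scales or fixed-point formats to recover accuracy after compression.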


Author(s): Paulo Alexandre Regis, Suman Bhunia, Amar Nath Patra, Shamik Sengupta
