RiSA: A Reinforced Systolic Array for Depthwise Convolutions and Embedded Tensor Reshaping

Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computational load and the number of parameters compared to conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have a low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts the PE utilization for depthwise convolutions on a systolic array with minimal overhead. In addition, the PEs in systolic arrays can be used efficiently only if the data items (tensors) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange the data items and manage data movements during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself, which eliminates the need for an additional module for tensor reshaping. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models, while maintaining a high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.
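
The claims about reduced computation, fewer parameters, and low data reuse can be illustrated with a rough counting sketch. The function names and the layer shape below are hypothetical choices for illustration and are not taken from the RiSA paper; the sketch assumes stride 1, "same" padding, no bias, and a depthwise layer with one filter per channel.

```python
def standard_conv_cost(h, w, c_in, c_out, k):
    """Parameters and multiply-accumulates (MACs) of a k x k standard convolution
    with an h x w output map (assumed: stride 1, 'same' padding, no bias)."""
    params = k * k * c_in * c_out
    macs = params * h * w          # every weight is applied at every output position
    return params, macs


def depthwise_conv_cost(h, w, channels, k):
    """Same counts for a k x k depthwise convolution: one filter per channel,
    so there is no summation across input channels."""
    params = k * k * channels
    macs = params * h * w
    return params, macs


def input_reuse(k, c_out=None):
    """Upper bound on how many MACs consume a single input activation.
    Standard conv: k*k window positions for each of c_out filters.
    Depthwise conv (c_out=None): only the k*k positions of its own channel."""
    return k * k * (c_out if c_out is not None else 1)


if __name__ == "__main__":
    h = w = 14          # illustrative feature-map size (assumption)
    c, k = 512, 3       # illustrative channel count and kernel size (assumption)
    std_params, std_macs = standard_conv_cost(h, w, c, c, k)
    dw_params, dw_macs = depthwise_conv_cost(h, w, c, k)
    print("standard :", std_params, "params,", std_macs, "MACs")
    print("depthwise:", dw_params, "params,", dw_macs, "MACs")
    print("reuse per input activation:", input_reuse(k, c), "vs", input_reuse(k))
```

Under these assumptions the depthwise layer needs a factor of `c` fewer parameters and MACs, but each input activation also feeds a factor of `c` fewer MACs, which is the low data-reuse property that leaves systolic-array PEs idle.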

COSY: An Energy-Efficient Hardware Architecture for Deep Convolutional Neural Networks Based on Systolic Array

2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS) ◽

10.1109/icpads.2017.00034 ◽

2017 ◽

Cited By ~ 2

Author(s):

Chen Xin ◽

Qiang Chen ◽

Miren Tian ◽

Mohan Ji ◽

Chenglong Zou ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Systolic Array ◽

Energy Efficient ◽

Hardware Architecture ◽

Deep Convolutional Neural Networks

Adaptive Tiling: Applying Fixed-size Systolic Arrays To Sparse Convolutional Neural Networks

2018 24th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr.2018.8545462 ◽

2018 ◽

Cited By ~ 4

Author(s):

H. T. Kung ◽

Bradley McDanel ◽

Sai Qian Zhang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Systolic Arrays ◽

Fixed Size

Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations

Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '19 ◽

10.1145/3297858.3304028 ◽

2019 ◽

Cited By ~ 23

Author(s):

H. T. Kung ◽

Bradley McDanel ◽

Sai Qian Zhang

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Systolic Array

Flexible Multi-Precision Accelerator Design for Deep Convolutional Neural Networks Considering Both Data Computation and Communication

2020 International Symposium on VLSI Design, Automation and Test (VLSI-DAT) ◽

10.1109/vlsi-dat49148.2020.9196465 ◽

2020 ◽

Author(s):

Shen-Fu Hsiao ◽

Yu-Hong Chen

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks ◽

Accelerator Design

Energy-Efficient Convolutional Neural Networks via Recurrent Data Reuse

2019 Design, Automation & Test in Europe Conference & Exhibition (DATE) ◽

10.23919/date.2019.8714880 ◽

2019 ◽

Cited By ~ 1

Author(s):

Luca Mocerino ◽

Valerio Tenace ◽

Andrea Calimera

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Energy Efficient ◽

Data Reuse

Exploring optimized accelerator design for binarized convolutional neural networks

2017 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2017.7966161 ◽

2017 ◽

Cited By ~ 2

Author(s):

Kodai Ueyoshi ◽

Kota Ando ◽

Kentaro Orimo ◽

Masayuki Ikebe ◽

Tetsuya Asai ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Accelerator Design

Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks

Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '15 ◽

10.1145/2684746.2689060 ◽

2015 ◽

Cited By ~ 576

Author(s):

Chen Zhang ◽

Peng Li ◽

Guangyu Sun ◽

Yijin Guan ◽

Bingjun Xiao ◽

...

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks ◽

Accelerator Design

Image Classification: A Survey

Journal of Informatics Electrical and Electronics Engineering (JIEEE) ◽

10.54060/jieee/001.02.002 ◽

2020 ◽

Vol 1 (2) ◽

pp. 1-9

Author(s):

Ankita Singh ◽

Pawan Singh

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Image Classification ◽

Convolutional Neural Networks ◽

Language Processing ◽

Classification Accuracy ◽

Deep Neural Network ◽

Learning Ability ◽

Final Decision

Image classification is a central topic in artificial vision systems and has drawn considerable interest in recent years. The field aims to classify an input image based on its visual content. Until recently, most approaches relied on hand-crafted features to describe an image, and learnable classifiers such as random forests and decision trees were then applied to the extracted features to reach a final decision. The problem arises when large numbers of photos are involved: finding suitable features for them becomes too difficult. This is one of the reasons deep neural network models were introduced. Deep learning makes it feasible to represent the hierarchical nature of features using multiple layers and their corresponding weights. Existing image classification methods have gradually been applied to real-world problems, but their application still faces various problems, such as unsatisfactory results, extremely low classification accuracy, and weak adaptive ability. Deep learning models have robust learning ability: they combine feature extraction and classification into a single whole that completes the image classification task, which can effectively improve classification accuracy. Convolutional neural networks are a powerful deep neural network technique. These networks preserve the spatial structure of a problem and were built for object recognition tasks such as classifying an image into its respective class. Neural networks are well known because they achieve state-of-the-art results on complex computer vision and natural language processing tasks. Convolutional neural networks have been used extensively.

Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling

10.21203/rs.3.rs-743636/v1 ◽

2021 ◽

Author(s):

Weihao Zhuang ◽

Tristan Hascoet ◽

Xunquan Chen ◽

Ryoichi Takashima ◽

Tetsuya Takiguchi ◽

...

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Convolutional Neural Networks ◽

Language Processing ◽

State Of The Art ◽

Input Image ◽

Memory Consumption ◽

Excellent Performance ◽

Conceptual Approach ◽

Recent Developments

Currently, deep learning plays an indispensable role in many fields, including computer vision, natural language processing, and speech recognition. Convolutional Neural Networks (CNNs) have demonstrated excellent performance in computer vision tasks thanks to their powerful feature extraction capability. However, as larger models have shown higher accuracy, recent developments have led to state-of-the-art CNN models with increasing resource consumption. This paper investigates a conceptual approach to reduce the memory consumption of CNN inference. Our method consists of processing the input image in a sequence of carefully designed tiles within the lower subnetwork of the CNN, so as to minimize its peak memory consumption while keeping the end-to-end computation unchanged. This method introduces a trade-off between memory consumption and computation, which is particularly suitable for high-resolution inputs. Our experimental results show that MobileNetV2 memory consumption can be reduced by up to 5.3 times with our proposed method. For ResNet50, one of the most commonly used CNN models in computer vision tasks, memory can be optimized by up to 2.3 times.
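
The idea of tiling by receptive field can be sketched for a single layer. This is a minimal illustration under stated assumptions, not the authors' method: it uses one single-channel "valid" convolution (cross-correlation) in pure Python, tiles only along rows, and the names `conv2d_valid` and `conv2d_tiled` are hypothetical. The paper applies the idea across several layers of a subnetwork, which this sketch does not attempt.

```python
def conv2d_valid(image, kernel):
    """Plain single-channel 'valid' cross-correlation on nested lists."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(k) for dj in range(k))
            for j in range(w - k + 1)
        ]
        for i in range(h - k + 1)
    ]


def conv2d_tiled(image, kernel, tile_rows):
    """Compute the same output in horizontal tiles of `tile_rows` output rows.
    Returns the output and the largest number of input rows held at once,
    used here as a crude proxy for peak memory."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out, peak_rows = [], 0
    for start in range(0, out_h, tile_rows):
        stop = min(start + tile_rows, out_h)
        # Receptive field of output rows [start, stop) is input rows
        # [start, stop + k - 1): the tile plus a halo of k - 1 rows.
        tile = image[start:stop + k - 1]
        peak_rows = max(peak_rows, len(tile))
        out.extend(conv2d_valid(tile, kernel))
    return out, peak_rows


if __name__ == "__main__":
    image = [[r * 8 + c for c in range(8)] for r in range(8)]
    kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    full = conv2d_valid(image, kernel)
    tiled, peak = conv2d_tiled(image, kernel, tile_rows=2)
    print("outputs identical:", tiled == full)
    print("input rows held at once:", peak, "instead of", len(image))
```

Each tile re-reads the `k - 1` halo rows it shares with its neighbour. In this single-layer case that costs only repeated reads; when several layers are tiled together, the overlapping region presumably has to be recomputed, which would account for the memory versus computation trade-off the abstract mentions.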

A Study of The Convolutional Neural Networks Applications

UKH Journal of Science and Engineering ◽

10.25079/ukhjse.v3n2y2019.pp31-40 ◽

2019 ◽

Vol 3 (2) ◽

pp. 31-40 ◽

Cited By ~ 2

Author(s):

Ahmed Shamsaldin ◽

Polla Fattah ◽

Tarik Rashid ◽

Nawzad Al-Salihi

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Deep Learning ◽

Natural Language Processing ◽

Face Recognition ◽

Convolutional Neural Networks ◽

Language Processing ◽

Text Classification ◽

Scene Labeling ◽

Real World Problems

At present, deep learning is widely used in a broad range of areas. Convolutional neural networks (CNNs) are becoming the star of deep learning, as they give the best and most precise results when tackling real-world problems. This work briefly describes the applications of CNNs in two areas: first, computer vision in general, that is, scene labeling, face recognition, action recognition, and image classification; second, natural language processing, that is, speech recognition and text classification.
