Modular Neural Networks for Low-Power Image Classification on Embedded Devices

Abhinav Goel; Sara Aghajanzadeh; Caleb Tung; Shuo-Han Chen; George K. Thiruvathukal; Yung-Hsiang Lu

doi:10.1145/3408062

Quantization and Deployment of Deep Neural Networks on Microcontrollers

Sensors ◽

10.3390/s21092984 ◽

2021 ◽

Vol 21 (9) ◽

pp. 2984

Author(s):

Pierre-Emmanuel Novac ◽

Ghouthi Boukli Hacene ◽

Alain Pegatoquet ◽

Benoît Miramond ◽

Vincent Gripon

Keyword(s):

Neural Networks ◽

Low Power ◽

Power Efficiency ◽

Deep Neural Networks ◽

Use Cases ◽

Comparison Study ◽

Power Devices ◽

Embedded Devices ◽

Inference Engines ◽

New Framework

Embedding Artificial Intelligence onto low-power devices is a challenging task that has been partly overcome with recent advances in machine learning and hardware design. Presently, deep neural networks can be deployed on embedded targets to perform different tasks such as speech recognition, object detection or Human Activity Recognition. However, there is still room for optimization of deep neural networks onto embedded devices. These optimizations mainly address power consumption, memory and real-time constraints, but also an easier deployment at the edge. Moreover, there is still a need for a better understanding of what can be achieved for different use cases. This work focuses on quantization and deployment of deep neural networks onto low-power 32-bit microcontrollers. The quantization methods, relevant in the context of an embedded execution onto a microcontroller, are first outlined. Then, a new framework for end-to-end deep neural networks training, quantization and deployment is presented. This framework, called MicroAI, is designed as an alternative to existing inference engines (TensorFlow Lite for Microcontrollers and STM32Cube.AI). Our framework can indeed be easily adjusted and/or extended for specific use cases. Execution using single precision 32-bit floating-point as well as fixed-point on 8- and 16 bits integers are supported. The proposed quantization method is evaluated with three different datasets (UCI-HAR, Spoken MNIST and GTSRB). Finally, a comparison study between MicroAI and both existing embedded inference engines is provided in terms of memory and power efficiency. On-device evaluation is done using ARM Cortex-M4F-based microcontrollers (Ambiq Apollo3 and STM32L452RE).

CNN-based Classification of Degraded Images

Electronic Imaging ◽

10.2352/issn.2470-1173.2020.10.ipas-028 ◽

2020 ◽

Vol 2020 (10) ◽

pp. 28-1-28-7 ◽

Cited By ~ 1

Author(s):

Kazuki Endo ◽

Masayuki Tanaka ◽

Masatoshi Okutomi

Keyword(s):

Neural Networks ◽

Image Restoration ◽

Image Classification ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks ◽

Alternative Approach ◽

Degraded Image ◽

Degraded Images ◽

Straightforward Approach

Classification of degraded images is very important in practice because images are usually degraded by compression, noise, blurring, etc. Nevertheless, most of the research in image classification only focuses on clean images without any degradation. Some papers have already proposed deep convolutional neural networks composed of an image restoration network and a classification network to classify degraded images. This paper proposes an alternative approach in which we use a degraded image and an additional degradation parameter for classification. The proposed classification network has two inputs which are the degraded image and the degradation parameter. The estimation network of degradation parameters is also incorporated if degradation parameters of degraded images are unknown. The experimental results showed that the proposed method outperforms a straightforward approach where the classification network is trained with degraded images only.

HEp-2 Cell Image Classification with Convolutional Neural Networks

2014 1st Workshop on Pattern Recognition Techniques for Indirect Immunofluorescence Images ◽

10.1109/i3a.2014.15 ◽

2014 ◽

Cited By ~ 18

Author(s):

Zhimin Gao ◽

Jianjia Zhang ◽

Luping Zhou ◽

Lei Wang

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Cell Image

Graph Neural Networks for the Cross-Domain Histopathological Image Classification

2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI) ◽

10.1109/isbi48211.2021.9434037 ◽

2021 ◽

Author(s):

Chang Cai ◽

Dou Xu ◽

Chaowei Fang ◽

Meng Yang ◽

Zhongyu Li

Keyword(s):

Neural Networks ◽

Image Classification ◽

Cross Domain ◽

Histopathological Image ◽

The Cross ◽

Histopathological Image Classification ◽

Graph Neural Networks

Neural Networks: Low‐Power Self‐Rectifying Memristive Artificial Neural Network for Near Internet‐of‐Things Sensor Computing (Adv. Electron. Mater. 6/2021)

Advanced Electronic Materials ◽

10.1002/aelm.202170017 ◽

2021 ◽

Vol 7 (6) ◽

pp. 2170017

Author(s):

Seok Choi ◽

Yong Kim ◽

Tien Van Nguyen ◽

Won Hee Jeong ◽

Kyeong‐Sik Min ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Network ◽

Internet Of Things ◽

Low Power ◽

Artificial Neural

Binary Precision Neural Network Manycore Accelerator

ACM Journal on Emerging Technologies in Computing Systems ◽

10.1145/3423136 ◽

2021 ◽

Vol 17 (2) ◽

pp. 1-27

Author(s):

Morteza Hosseini ◽

Tinoosh Mohsenin

Keyword(s):

Neural Network ◽

Low Power ◽

Image Classification ◽

Case Studies ◽

Average Power ◽

Total Power ◽

Fabrication Technology ◽

Population Count ◽

Cluster Architecture ◽

Domain Specific

This article presents a low-power, programmable, domain-specific manycore accelerator, Binarized neural Network Manycore Accelerator (BiNMAC), which adopts and efficiently executes binary precision weight/activation neural network models. Such networks have compact models in which weights are constrained to only 1 bit and can be packed several in one memory entry that minimizes memory footprint to its finest. Packing weights also facilitates executing single instruction, multiple data with simple circuitry that allows maximizing performance and efficiency. The proposed BiNMAC has light-weight cores that support domain-specific instructions, and a router-based memory access architecture that helps with efficient implementation of layers in binary precision weight/activation neural networks of proper size. With only 3.73% and 1.98% area and average power overhead, respectively, novel instructions such as Combined Population-Count-XNOR , Patch-Select , and Bit-based Accumulation are added to the instruction set architecture of the BiNMAC, each of which replaces execution cycles of frequently used functions with 1 clock cycle that otherwise would have taken 54, 4, and 3 clock cycles, respectively. Additionally, customized logic is added to every core to transpose 16×16-bit blocks of memory on a bit-level basis, that expedites reshaping intermediate data to be well-aligned for bitwise operations. A 64-cluster architecture of the BiNMAC is fully placed and routed in 65-nm TSMC CMOS technology, where a single cluster occupies an area of 0.53 mm 2 with an average power of 232 mW at 1-GHz clock frequency and 1.1 V. The 64-cluster architecture takes 36.5 mm 2 area and, if fully exploited, consumes a total power of 16.4 W and can perform 1,360 Giga Operations Per Second (GOPS) while providing full programmability. To demonstrate its scalability, four binarized case studies including ResNet-20 and LeNet-5 for high-performance image classification, as well as a ConvNet and a multilayer perceptron for low-power physiological applications were implemented on BiNMAC. The implementation results indicate that the population-count instruction alone can expedite the performance by approximately 5×. When other new instructions are added to a RISC machine with existing population-count instruction, the performance is increased by 58% on average. To compare the performance of the BiNMAC with other commercial-off-the-shelf platforms, the case studies with their double-precision floating-point models are also implemented on the NVIDIA Jetson TX2 SoC (CPU+GPU). The results indicate that, within a margin of ∼2.1%--9.5% accuracy loss, BiNMAC on average outperforms the TX2 GPU by approximately 1.9× (or 7.5× with fabrication technology scaled) in energy consumption for image classification applications. On low power settings and within a margin of ∼3.7%--5.5% accuracy loss compared to ARM Cortex-A57 CPU implementation, BiNMAC is roughly ∼9.7×--17.2× (or 38.8×--68.8× with fabrication technology scaled) more energy efficient for physiological applications while meeting the application deadline.

Data Augmentation Methods Applying Grayscale Images for Convolutional Neural Networks in Machine Vision

Applied Sciences ◽

10.3390/app11156721 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6721

Author(s):

Jinyeong Wang ◽

Sanghwan Lee

Keyword(s):

Neural Networks ◽

Machine Vision ◽

Object Detection ◽

Image Classification ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

Image Data ◽

Manufacturing Productivity ◽

Smart Factories ◽

Grayscale Images

In increasing manufacturing productivity with automated surface inspection in smart factories, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. With that, many machine vision systems adopt CNNs to surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can apply to grayscale industrial images, and we demonstrated outstanding performance in the image classification and the object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.

Semi-supervised training of deep convolutional neural networks with heterogeneous data and few local annotations: an experiment on prostate histopathology image classification

Medical Image Analysis ◽

10.1016/j.media.2021.102165 ◽

2021 ◽

pp. 102165

Author(s):

Niccolò Marini ◽

Sebastian Otálora ◽

Henning Müller ◽

Manfredo Atzori

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks ◽

Heterogeneous Data ◽

Deep Convolutional Neural Networks ◽

Supervised Training

Image Classification for the Automatic Feature Extraction in Human Worn Fashion Data

Mathematics ◽

10.3390/math9060624 ◽

2021 ◽

Vol 9 (6) ◽

pp. 624

Author(s):

Stefan Rohrmanstorfer ◽

Mikhail Komarov ◽

Felix Mödritscher

Keyword(s):

Neural Networks ◽

Feature Extraction ◽

Image Classification ◽

Convolutional Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Image Data ◽

Classification Model ◽

Upper Body ◽

Automatic Feature Extraction

With the always increasing amount of image data, it has become a necessity to automatically look for and process information in these images. As fashion is captured in images, the fashion sector provides the perfect foundation to be supported by the integration of a service or application that is built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on the elaborated knowledge, four different approaches will be implemented to successfully extract features out of fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created, but it was significantly enlarged by the performed image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and it was possible to incrementally improve the validation accuracy of the created dataset from an initial 69% to a final validation accuracy of 84%. More distinct apparel like trousers, shoes and hats were better classified than other upper body clothes.

Classifiers Comparison for Convolutional Neural Networks (CNNs) in Image Classification

2019 IEEE/ACM 23rd International Symposium on Distributed Simulation and Real Time Applications (DS-RT) ◽

10.1109/ds-rt47707.2019.8958662 ◽

2019 ◽

Author(s):

Mauro Tropea ◽

Giuseppe Fedele

Keyword(s):

Neural Networks ◽

Image Classification ◽

Convolutional Neural Networks