WINNER: a high speed high energy efficient Neural Network implementation for image classification

The estimation of human hand pose has become the basis for many vital applications in which the hand pose serves as the main system input. Virtual reality (VR) headsets, the Shadow Dexterous Hand, and in-air signature verification are a few examples of applications that require tracking hand movements in real time. The state-of-the-art 3D hand pose estimation methods are based on Convolutional Neural Networks (CNNs). These methods are implemented on Graphics Processing Units (GPUs), mainly because of their extensive computational requirements. However, GPUs are not suitable for practical application scenarios where low power consumption is crucial. Furthermore, the difficulty of embedding a bulky GPU into a small device prevents such applications from being ported to mobile devices. The goal of this work is to provide an energy-efficient solution for an existing depth-camera-based hand pose estimation algorithm. First, we compress the deep neural network model by applying dynamic quantization techniques to different layers to achieve maximum compression without compromising accuracy. Afterwards, we design a custom hardware architecture. We selected the FPGA as the target platform because FPGAs provide high energy efficiency and can be integrated into portable devices. Our solution, implemented on a Xilinx UltraScale+ MPSoC FPGA, is 4.2× faster and 577.3× more energy efficient than the original implementation of the hand pose estimation algorithm on an NVIDIA GeForce GTX 1070.
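
The per-layer compression step can be illustrated with a small sketch. The abstract does not say which quantization scheme is used, so the code below assumes a dynamic fixed-point format in which each layer picks its own number of fractional bits from its largest weight magnitude. The function names (`choose_frac_bits`, `quantize`, `dequantize`) and the 8-bit word width are illustrative assumptions, not the authors' implementation.

```python
import math


def choose_frac_bits(values, total_bits=8):
    """Pick fractional bits so the largest magnitude fits in a signed word.

    Assumption: one sign bit, the rest split between integer and fraction.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Bits needed to the left of the binary point (may be zero or negative).
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, total_bits=8):
    """Map floats to signed integer codes with a layer-specific scale."""
    frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    # Round to the nearest code and clamp to the representable range.
    codes = [min(qmax, max(qmin, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Recover approximate float values from integer codes."""
    return [c / 2.0 ** frac_bits for c in codes]


# Two hypothetical layers with different dynamic ranges get different formats.
layer_small = [0.5, -0.25, 0.1]
layer_large = [3.2, -1.5, 0.75]
for name, layer in (("small", layer_small), ("large", layer_large)):
    codes, frac_bits = quantize(layer)
    print(name, frac_bits, codes, dequantize(codes, frac_bits))
```

Choosing the format per layer keeps the rounding error small for layers with a narrow value range while still covering layers with larger weights, which is the general motivation for quantizing different layers differently.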

A Review of Switched Inertance Hydraulic Converter Technology

Journal of Dynamic Systems Measurement and Control ◽ 10.1115/1.4046103 ◽ 2020 ◽ Vol 142 (5) ◽ Cited By ~ 2

Author(s): Chenggang Yuan ◽ Min Pan ◽ Andrew Plummer

Keyword(s): Energy Efficient ◽ High Speed ◽ Control Strategies ◽ New Technology ◽ High Energy ◽ Hydraulic Power ◽ Controlled Systems ◽ Hydraulic Machines ◽ And Control ◽ High Speed Switching

Digital hydraulics is a new technology providing an alternative to conventional proportional or servovalve-controlled systems in the area of fluid power. Digital hydraulic applications, such as digital pumps, digital valves and actuators, switched inertance hydraulic converters (SIHCs), and digital hydraulic power management systems, promise high energy efficiency and lower sensitivity to contamination. Research on digital hydraulics is driven by the need for highly energy-efficient hydraulic machines but is relatively immature compared to other energy-saving technologies. This review introduces the development of SIHCs, focusing particularly on work undertaken in the last 15 years, and evaluates the device configurations, performance, and control strategies found in current SIHC research. Various designs for high-speed switching valves are presented, and their advantages and limitations are compared and discussed. The current limitations of SIHCs are discussed, and suggestions for their future development are made.

Implementation of Pruned Backpropagation Neural Network Based on Photonic Integrated Circuits

Photonics ◽ 10.3390/photonics8090363 ◽ 2021 ◽ Vol 8 (9) ◽ pp. 363

Author(s): Qi Zhang ◽ Zhuangzhuang Xing ◽ Duan Huang

Keyword(s): Neural Network ◽ Neural Networks ◽ Integrated Circuits ◽ Energy Efficient ◽ High Speed ◽ Large Scale ◽ Matrix Operation ◽ Optical Neural Networks ◽ Optical Neural Network ◽ Random Initialization

We demonstrate a pruned, high-speed, and energy-efficient optical backpropagation (BP) neural network. The micro-ring resonator (MRR) banks, as the core of the weight matrix operation, are used for large-scale weighted summation. We find that tuning a pruned MRR weight-bank model gives training performance equivalent to that of a randomly initialized model. Results show that the overall accuracy of the optical neural network on the MNIST dataset is 93.49% after pruning six-layer MRR weight banks under the condition of low insertion loss. This work is scalable to much more complex networks, such as convolutional neural networks and recurrent neural networks, and provides a potential guide for truly large-scale optical neural networks.
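
As a rough software analogue of pruning a weight bank and then computing the weighted summation, the sketch below zeroes the smallest-magnitude entries of a weight matrix before a matrix-vector product. The abstract does not state the pruning criterion, so magnitude-based pruning is an assumption here, and `magnitude_prune`, `weighted_sum`, and the sparsity value are hypothetical names and settings chosen only for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of entries to remove (assumed criterion:
    absolute value; the source does not specify one).
    """
    entries = [
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    ]
    entries.sort()  # smallest magnitudes first
    n_prune = int(len(entries) * sparsity)
    pruned = [list(row) for row in weights]
    for _, r, c in entries[:n_prune]:
        pruned[r][c] = 0.0
    return pruned


def weighted_sum(weights, inputs):
    """Matrix-vector product: one weighted summation per output row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


weights = [[0.9, -0.05, 0.4],
           [0.01, -0.7, 0.2]]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
print(weighted_sum(pruned, [1.0, 1.0, 1.0]))
```

In a hardware weight bank, each zeroed entry would correspond to an element that no longer needs to be tuned, which is where the savings from pruning would come from.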

High speed VLSI neural network for high-energy physics

Proceedings of the Fourth International Conference on Microelectronics for Neural Networks and Fuzzy Systems ◽ 10.1109/icmnn.1994.593738 ◽ 2002 ◽ Cited By ~ 3

Author(s): P. Masa ◽ K. Hoen ◽ H. Wallinga

Keyword(s): Neural Network ◽ High Speed ◽ High Energy Physics ◽ High Energy ◽ Energy Physics

The hierarchical high-speed neural network image classification algorithm for video surveillance systems

2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) ◽ 10.1109/eiconrus.2018.8317465 ◽ 2018

Author(s): Andrey A. Belyaev ◽ Vera V. Kuzmina ◽ Andrey A. Bychkov ◽ Elena S. Yanakova ◽ Anatoly V. Khamukhin

Keyword(s): Neural Network ◽ Image Classification ◽ Video Surveillance ◽ High Speed ◽ Classification Algorithm ◽ Surveillance Systems

A Biological Retina Inspired Tone Mapping Processor for High-Speed and Energy-Efficient Image Enhancement

Sensors ◽ 10.3390/s20195600 ◽ 2020 ◽ Vol 20 (19) ◽ pp. 5600

Author(s): Xiaoqiang Xiang ◽ Lili Liu ◽ Luying Que ◽ Conghan Jia ◽ Bo Yan ◽ ...

Keyword(s): Energy Efficiency ◽ Image Enhancement ◽ High Throughput ◽ Energy Efficient ◽ High Speed ◽ High Energy ◽ Tone Mapping ◽ Data Partition ◽ Convolution Filter ◽ Feature Sharing

In this work, a biological-retina-inspired tone mapping processor for high-speed and energy-efficient image enhancement is proposed. To achieve high throughput and high energy efficiency, several hardware design techniques are introduced, including data-partition-based parallel processing with S-shape sliding, adjacent-frame feature sharing, multi-layer convolution pipelining, and convolution filter compression with zero-skipping convolution. Implemented on a Xilinx Virtex-7 FPGA, the proposed design achieves a high throughput of 189 frames per second for 1024 × 768 RGB images while consuming 819 mW. Compared with several state-of-the-art tone mapping processors, the proposed design shows higher throughput and energy efficiency. It is suitable for high-speed and energy-constrained image enhancement applications.
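
The idea behind zero-skipping convolution can be sketched in software: multiplications by zero-valued filter weights contribute nothing, so only the nonzero taps are stored and evaluated. The code below is a minimal illustration under that reading, not the processor's actual datapath; `nonzero_taps`, `conv2d_zero_skip`, and `conv2d_dense` are hypothetical helper names, and the kernel is an arbitrary example.

```python
def nonzero_taps(kernel):
    """List (row, col, weight) for every nonzero filter weight."""
    return [
        (r, c, w)
        for r, row in enumerate(kernel)
        for c, w in enumerate(row)
        if w != 0
    ]


def conv2d_zero_skip(image, kernel):
    """Valid-mode 2D sliding-window sum using only nonzero weights."""
    out_h = len(image) - len(kernel) + 1
    out_w = len(image[0]) - len(kernel[0]) + 1
    taps = nonzero_taps(kernel)  # computed once, reused for every window
    return [
        [sum(w * image[y + r][x + c] for r, c, w in taps) for x in range(out_w)]
        for y in range(out_h)
    ]


def conv2d_dense(image, kernel):
    """Reference version that multiplies by every weight, zeros included."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[r][c] * image[y + r][x + c]
                for r in range(kh) for c in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]  # 4 of the 9 weights are zero
assert conv2d_zero_skip(image, kernel) == conv2d_dense(image, kernel)
print(len(nonzero_taps(kernel)), "multiplications per window instead of 9")
```

The result is identical to the dense computation, while the work per output pixel scales with the number of nonzero weights, which is why this pairs naturally with filter compression.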

High speed neural network chip for trigger purposes in high energy physics

Proceedings Design, Automation and Test in Europe ◽ 10.1109/date.1998.655844 ◽ 2002

Author(s): W. Eppler ◽ T. Fischer ◽ H. Gemmeke ◽ A. Menchikov

Keyword(s): Neural Network ◽ High Speed ◽ High Energy Physics ◽ High Energy ◽ Energy Physics

An Energy-Efficient Embedded Deep Neural Network Processor for High Speed Visual Attention in Mobile Vision Recognition SoC

IEEE Journal of Solid-State Circuits ◽ 10.1109/jssc.2016.2582864 ◽ 2016 ◽ pp. 1-9 ◽ Cited By ~ 5

Author(s): Seongwook Park ◽ Injoon Hong ◽ Junyoung Park ◽ Hoi-Jun Yoo

Keyword(s): Neural Network ◽ Visual Attention ◽ Energy Efficient ◽ High Speed ◽ Deep Neural Network ◽ Network Processor ◽ Mobile Vision

High speed and energy efficient deep neural network for edge computing

Proceedings of the 4th ACM/IEEE Symposium on Edge Computing - SEC '19 ◽ 10.1145/3318216.3363453 ◽ 2019

Author(s): Kangjun Bai ◽ Shiya Liu ◽ Yang Yi

Keyword(s): Neural Network ◽ Energy Efficient ◽ High Speed ◽ Deep Neural Network ◽ Edge Computing

A scalable FPGA based accelerator for Tiny-YOLO-v2 using OpenCL

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽ 10.11591/ijres.v8.i3.pp206-214 ◽ 2019 ◽ Vol 8 (3) ◽ pp. 206

Author(s): Yap June Wai ◽ Zulkanain Mohd Yussof ◽ Sani Irwan Md Salim

Keyword(s): Neural Network ◽ Object Recognition ◽ Object Detection ◽ Image Classification ◽ High Power ◽ Energy Efficient ◽ Detection Algorithm ◽ Small Scale ◽ Design Development ◽ Deep Convolution Neural Network

Deep Convolution Neural Network (CNN) algorithms have recently gained popularity in many applications such as image classification, video analytics, object recognition, and segmentation. Being compute-intensive and memory-expensive, CNN computations are commonly accelerated by GPUs with high power dissipation. Recent studies show that implementing CNNs on FPGAs offers advantages in energy efficiency and flexibility over software-configurable GPUs. The proposed framework is verified by implementing Tiny-YOLO-v2 on a De1SoC board. The design in this project follows an HLS approach, which avoids the effort of writing complex RTL code and provides fast verification through the emulation and profiling tools in the OpenCL SDK. To the best of our knowledge, this is the first implementation of the Tiny-YOLO-v2 CNN-based object detection algorithm on a small-scale De1SoC board using the Intel FPGA SDK for OpenCL.
