HEIF: Highly Efficient Stochastic Computing-Based Inference Framework for Deep Neural Networks

Stochastic computing (SC) reduces the complexity of computation by representing numbers with long streams of independent bits. However, increasing performance in SC comes with either an increase in area or a loss in accuracy. Processing in memory (PIM) computes data in-place while having high memory density and supporting bit-parallel operations with low energy consumption. In this article, we propose COSMO, an architecture for co mputing with s tochastic numbers in me mo ry, which enables SC in memory. The proposed architecture is general and can be used for a wide range of applications. It is a highly dense and parallel architecture that supports most SC encodings and operations in memory. It maximizes the performance and energy efficiency of SC by introducing several innovations: (i) in-memory parallel stochastic number generation, (ii) efficient implication-based logic in memory, (iii) novel memory bit line segmenting, (iv) a new memory-compatible SC addition operation, and (v) enabling flexible block allocation. To show the generality and efficiency of our stochastic architecture, we implement image processing, deep neural networks (DNNs), and hyperdimensional (HD) computing on the proposed hardware. Our evaluations show that running DNN inference on COSMO is 141× faster and 80× more energy efficient as compared to GPU.

Download Full-text

On the inference speed and video-compression robustness of DeepLabCut

10.1101/457242 ◽

2018 ◽

Cited By ~ 9

Author(s):

Alexander Mathis ◽

Richard Warren

Keyword(s):

Neural Networks ◽

Transfer Learning ◽

Video Compression ◽

Pose Estimation ◽

Efficient Method ◽

Deep Neural Networks ◽

Video Data ◽

Frame Size ◽

Highly Efficient

Pose estimation is crucial for many applications in neuroscience, biomechanics, genetics and beyond. We recently presented a highly efficient method for markerless pose estimation based on transfer learning with deep neural networks called DeepLabCut. Current experiments produce vast amounts of video data, which pose challenges for both storage and analysis. Here we improve the inference speed of DeepLabCut by up to tenfold and benchmark these updates on various CPUs and GPUs. In particular, depending on the frame size, poses can be inferred offline at up to 1200 frames per second (FPS). For instance, 278 × 278 images can be processed at 225 FPS on a GTX 1080 Ti graphics card. Furthermore, we show that DeepLabCut is highly robust to standard video compression (ffmpeg). Compression rates of greater than 1,000 only decrease accuracy by about half a pixel (for 640 × 480 frame size). DeepLabCut’s speed and robustness to compression can save both time and hardware expenses.

Download Full-text

An Efficient Hardware Implementation of Activation Functions Using Stochastic Computing for Deep Neural Networks

2018 IEEE 12th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) ◽

10.1109/mcsoc2018.2018.00045 ◽

2018 ◽

Cited By ~ 5

Author(s):

Van-Tinh Nguyen ◽

Tieu-Khanh Luong ◽

Han Le Duc ◽

Van-Phuc Hoang

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Hardware Implementation ◽

Activation Functions ◽

Stochastic Computing

Download Full-text

Runtime Long-Term Reliability Management Using Stochastic Computing in Deep Neural Networks

2021 22nd International Symposium on Quality Electronic Design (ISQED) ◽

10.1109/isqed51717.2021.9424285 ◽

2021 ◽

Author(s):

Yibo Liu ◽

Shuyuan Yu ◽

Shaoyi Peng ◽

Sheldon X.-D. Tan

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Reliability Management ◽

Stochastic Computing

Download Full-text

DPS: Dynamic Precision Scaling for Stochastic Computing-based Deep Neural Networks*

2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC) ◽

10.1109/dac.2018.8465700 ◽

2018 ◽

Cited By ~ 5

Author(s):

Hyeonuk Sim ◽

Saken Kenzhegulov ◽

Jongeun Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Stochastic Computing ◽

Dynamic Precision

Download Full-text

Universal Approximation Property and Equivalence of Stochastic Computing-Based Neural Networks and Binary Neural Networks

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015369 ◽

2019 ◽

Vol 33 ◽

pp. 5369-5376 ◽

Cited By ~ 1

Author(s):

Yanzhi Wang ◽

Zheng Zhan ◽

Liang Zhao ◽

Jian Tang ◽

Siyue Wang ◽

...

Keyword(s):

Neural Networks ◽

Large Scale ◽

Deep Neural Networks ◽

Approximation Property ◽

Network Size ◽

Universal Approximation ◽

Approximate Computing ◽

Stochastic Computing ◽

Hardware Implementations ◽

Pros And Cons

Large-scale deep neural networks are both memory and computation-intensive, thereby posing stringent requirements on the computing platforms. Hardware accelerations of deep neural networks have been extensively investigated. Specific forms of binary neural networks (BNNs) and stochastic computing-based neural networks (SCNNs) are particularly appealing to hardware implementations since they can be implemented almost entirely with binary operations. Despite the obvious advantages in hardware implementation, these approximate computing techniques are questioned by researchers in terms of accuracy and universal applicability. Also it is important to understand the relative pros and cons of SCNNs and BNNs in theory and in actual hardware implementations. In order to address these concerns, in this paper we prove that the “ideal” SCNNs and BNNs satisfy the universal approximation property with probability 1 (due to the stochastic behavior), which is a new angle from the original approximation property. The proof is conducted by first proving the property for SCNNs from the strong law of large numbers, and then using SCNNs as a “bridge” to prove for BNNs. Besides the universal approximation property, we also derive an appropriate bound for bit length M in order to provide insights for the actual neural network implementations. Based on the universal approximation property, we further prove that SCNNs and BNNs exhibit the same energy complexity. In other words, they have the same asymptotic energy consumption with the growth of network size. We also provide a detailed analysis of the pros and cons of SCNNs and BNNs for hardware implementations and conclude that SCNNs are more suitable.

Download Full-text