A Scalable Cluster-based Hierarchical Hardware Accelerator for a Cortically Inspired Algorithm

2021, Vol. 17 (4), pp. 1-29
Author(s): Sumon Dey, Lee Baker, Joshua Schabel, Weifu Li, Paul D. Franzon

This article describes a scalable, configurable, cluster-based hierarchical hardware accelerator, built as a custom hardware architecture, for Sparsey, a cortical learning algorithm. Sparsey is inspired by the operation of the human cortex and uses a Sparse Distributed Representation to perform unsupervised learning and inference within the same algorithm. A distributed on-chip memory organization is designed and implemented in custom hardware to improve memory bandwidth and to accelerate read/write operations on the synaptic weight matrices. Bit-level data are processed directly from the distributed on-chip memory, and custom multiply-accumulate hardware is implemented for binary and fixed-point multiply-accumulate operations. Fixed-point arithmetic and fixed-point storage are also adopted in this implementation. At 16 nm, the custom Sparsey hardware achieves an overall 24.39× speedup, 353.12× better energy efficiency per frame, and a 1.43× reduction in silicon area compared with a state-of-the-art GPU.
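To make the binary/fixed-point multiply-accumulate idea concrete, the following is a minimal software sketch, not the paper's hardware design: it assumes a Q-format with 8 fractional bits, a 32-bit saturating accumulator, and binary (0/1) activations (all assumptions for illustration). With binary inputs each "multiply" collapses to a conditional add, which is what makes bit-level processing attractive in custom hardware.

```python
# Illustrative sketch only: a software model of a fixed-point
# multiply-accumulate (MAC) step like the one the abstract describes.
# The Q-format (8 fractional bits) and accumulator width are assumptions,
# not parameters taken from the Sparsey hardware.

FRAC_BITS = 8          # assumed fractional bits of the fixed-point format
ACC_WIDTH = 32         # assumed signed accumulator width in bits

def to_fixed(x: float) -> int:
    """Quantize a real value to the assumed Qm.8 fixed-point format."""
    return int(round(x * (1 << FRAC_BITS)))

def saturate(acc: int) -> int:
    """Clamp the accumulator to the assumed signed ACC_WIDTH range."""
    lo, hi = -(1 << (ACC_WIDTH - 1)), (1 << (ACC_WIDTH - 1)) - 1
    return max(lo, min(hi, acc))

def binary_fixed_mac(input_bits, weights_fixed) -> int:
    """Accumulate fixed-point weights gated by binary (0/1) inputs.

    With binary activations the multiply reduces to a conditional add.
    """
    acc = 0
    for bit, w in zip(input_bits, weights_fixed):
        if bit:                      # multiply by 0/1 == conditional add
            acc = saturate(acc + w)
    return acc

# Example: three active inputs out of five, weights given as reals.
weights = [to_fixed(w) for w in (0.25, -0.5, 0.75, 0.125, 1.0)]
print(binary_fixed_mac([1, 0, 1, 0, 1], weights) / (1 << FRAC_BITS))  # 2.0
```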

1990, Vol. 2 (3), pp. 363-373
Author(s): Paul W. Hollis, John S. Harper, John J. Paulos

This paper presents a study of the precision constraints imposed by a hybrid chip architecture with analog neurons and digital backpropagation calculations. Conversions between the analog and digital domains, together with weight-storage restrictions, impose precision limits on both the analog and the digital calculations. Simulations show that such a learning system can be implemented despite the limited resolution of the analog circuits and the use of fixed-point arithmetic for the backpropagation algorithm.
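To illustrate the kind of precision limit such simulations examine, here is a minimal sketch, assuming 8-bit fixed-point weight storage with 6 fractional bits (values chosen only for illustration, not taken from the paper): a backpropagation-style update smaller than half a quantization step is lost when the weight is written back to limited-resolution storage.

```python
# Illustrative sketch only: the effect of finite weight resolution on a
# backpropagation weight update. The 8-bit storage and fractional-bit
# count are assumptions for illustration, not values from the paper.

WEIGHT_BITS = 8        # assumed weight storage resolution
FRAC_BITS = 6          # assumed fractional bits in the stored weight

def quantize_weight(w: float) -> float:
    """Round a weight to the assumed fixed-point grid and clip to range."""
    step = 1.0 / (1 << FRAC_BITS)
    max_w = (1 << (WEIGHT_BITS - 1)) * step - step
    q = round(w / step) * step
    return max(-max_w - step, min(max_w, q))

def update_weight(w: float, grad: float, lr: float) -> float:
    """One backprop-style update applied to a quantized stored weight.

    Updates smaller than half a quantization step are silently lost,
    which is the precision constraint limited-resolution training faces.
    """
    return quantize_weight(w - lr * grad)

w = quantize_weight(0.30)
print(update_weight(w, grad=0.02, lr=0.1))   # update (0.002) below half a step: weight unchanged
print(update_weight(w, grad=0.5, lr=0.1))    # larger update survives quantization
```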

