A programmable fuzzifier integrated circuit — synthesis, design, and fabrication
1989 ◽ Vol 20 (5) ◽ pp. 49

Integration ◽ 2021 ◽ Vol 81 ◽ pp. 322-330
Author(s): Gamze İslamoğlu ◽ Tuğberk Oğulcan Çakıcı ◽ Şeyda Nur Güzelhan ◽ Engin Afacan ◽ Günhan Dündar

2021
Author(s): James Garland ◽ David Gregg
Abstract: Low-precision floating-point (FP) can be highly effective for convolutional neural network (CNN) inference. Custom low-precision FP can be implemented in field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) accelerators, but existing microprocessors do not generally support fast, custom-precision FP. We propose hardware optimized bitslice-parallel floating-point operators (HOBFLOPS), a generator of efficient custom-precision emulated bitslice-parallel software (C/C++) FP arithmetic. We generate custom-precision FP routines, optimized using a hardware synthesis design flow, to create circuits. We provide standard cell libraries matching the bitwise operations on the target microprocessor architecture and a code generator to translate the hardware circuits to bitslice software equivalents. We exploit bitslice parallelism to create a novel, very wide (32–512 element) vectorized CNN convolution for inference. On Arm and Intel processors, the multiply-accumulate (MAC) performance in CNN convolution of HOBFLOPS, Flexfloat, and Berkeley's SoftFP is compared. HOBFLOPS outperforms Flexfloat by up to 10× on Intel AVX512. HOBFLOPS offers arbitrary-precision FP with custom range and precision, e.g., HOBFLOPS9, which outperforms Flexfloat 9-bit on Arm Neon by 7×. HOBFLOPS allows researchers to prototype different levels of custom FP precision in the arithmetic of software CNN accelerators. Furthermore, HOBFLOPS fast custom-precision FP CNNs may be valuable in cases where memory bandwidth is limited.
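As a rough illustration of the bitslice-parallel idea the abstract describes (a minimal sketch only, not the authors' generated HOBFLOPS arithmetic), the C example below implements a hypothetical 4-bit bitsliced ripple-carry adder: each uint32_t word holds one bit position of 32 independent lanes, so every bitwise instruction advances 32 additions at once. The names BITS, slice_t, and bitsliced_add are illustrative and not part of HOBFLOPS.

#include <stdint.h>
#include <stdio.h>

#define BITS 4                         /* toy 4-bit operands */
typedef uint32_t slice_t;              /* bit i of 32 independent lanes */

/* Bitsliced ripple-carry adder: one bitwise op advances all 32 lanes. */
static void bitsliced_add(const slice_t a[BITS], const slice_t b[BITS],
                          slice_t sum[BITS])
{
    slice_t carry = 0;
    for (int i = 0; i < BITS; ++i) {
        slice_t axb = a[i] ^ b[i];
        sum[i] = axb ^ carry;                    /* sum bit of every lane   */
        carry  = (a[i] & b[i]) | (axb & carry);  /* carry out of every lane */
    }
}

int main(void)
{
    /* Lane 0 computes 3 + 2, lane 1 computes 1 + 1; other lanes add 0 + 0. */
    slice_t a[BITS] = {0}, b[BITS] = {0}, s[BITS] = {0};
    for (int i = 0; i < BITS; ++i) {
        a[i] = (((3u >> i) & 1u) << 0) | (((1u >> i) & 1u) << 1);
        b[i] = (((2u >> i) & 1u) << 0) | (((1u >> i) & 1u) << 1);
    }
    bitsliced_add(a, b, s);
    for (int lane = 0; lane < 2; ++lane) {
        unsigned v = 0;
        for (int i = 0; i < BITS; ++i)
            v |= ((s[i] >> lane) & 1u) << i;
        printf("lane %d sum = %u\n", lane, v);   /* expect 5 and 2 */
    }
    return 0;
}

Per the abstract, the actual HOBFLOPS flow extends this pattern by synthesizing full custom-precision FP operators into netlists of such gate-level operations and mapping them onto wider SIMD registers, which is how the 32–512 element vectorized convolution is obtained.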

