A Novel Ultra-Low Power 8T SRAM-Based Compute-in-Memory Design for Binary Neural Networks

Electronics ◽  
2021 ◽  
Vol 10 (17) ◽  
pp. 2181
Author(s):  
Youngbae Kim ◽  
Shuai Li ◽  
Nandakishor Yadav ◽  
Kyuwon Ken Choi

We propose a novel ultra-low-power, voltage-based compute-in-memory (CIM) design with a new single-ended 8T SRAM bit cell structure. Since the proposed SRAM bit cell uses a single bitline for CIM calculation with decoupled read and write operations, it achieves much higher energy efficiency. In addition, the stacked structure of the decoupled read unit minimizes leakage power consumption. Moreover, the proposed bit cell structure provides better read and write stability due to the isolated read path, isolated write path, and greater pull-up ratio. Compared to the state-of-the-art SRAM-CIM, our proposed SRAM-CIM does not require extra transistors for CIM vector-matrix multiplication. We implemented a 16 k (128 × 128) bit cell array for the computation of 128 neurons, and used 64 binary inputs (0 or 1) and 64 × 128 binary weights (−1 or +1) for the binary neural network (BNN). Each row of the bit cell array, corresponding to a single neuron, consists of a total of 128 cells: 64 cells for the dot-product and 64 replica cells for the ADC reference. The 64 replica cells, in turn, consist of 32 cells for the ADC reference and 32 cells for offset calibration. We used a row-by-row ADC for the quantized output of each neuron, supporting 1–7 output bits per neuron. The ADC uses a sweeping method based on the 32 duplicate bit cells, and the number of sweep cycles is set to 2^(N−1) + 1, where N is the number of output bits. The simulation is performed at room temperature (27 °C) in 45 nm technology using Synopsys HSPICE, and all transistors in the bit cells use the minimum size, balancing area, power, and speed. The proposed SRAM-CIM reduces power consumption for vector-matrix multiplication by 99.96% compared to the existing state-of-the-art SRAM-CIM. Furthermore, because the read unit is decoupled from the internal node of the latch, there is no feedback from the read unit, yielding read static noise margin (SNM)-free operation.
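
The dot-product and quantization flow described above can be captured in a short behavioral sketch. The Python model below is purely illustrative: it mirrors only the arithmetic the array performs (binary 0/1 inputs, ±1 weights, 64-element row sums, and 2^(N−1) + 1 sweep steps per N-bit output), and the reference ladder it uses is an assumption, not the circuit's actual ADC references.

```python
# Illustrative behavioral model of the BNN dot-product and sweep-based
# quantization described in the abstract; this is NOT the authors' circuit,
# only a functional sketch of the arithmetic it performs.
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 64                # 64 binary inputs per neuron (0 or 1)
n_neurons = 128              # 128 rows, one neuron per row
n_bits = 4                   # ADC output resolution (1-7 bits supported)

x = rng.integers(0, 2, size=n_inputs)                  # inputs in {0, 1}
w = rng.choice([-1, +1], size=(n_neurons, n_inputs))   # weights in {-1, +1}

# Accumulation per row: each neuron sums x_i * w_ij, giving a value in [-64, +64].
dot = w @ x

# Row-by-row sweep quantization: the reference is stepped 2^(N-1) + 1 times and
# the output code is the number of reference levels the accumulated value reaches.
n_steps = 2 ** (n_bits - 1) + 1
refs = np.linspace(dot.min(), dot.max(), n_steps)      # hypothetical reference ladder
codes = np.array([(d >= refs).sum() for d in dot])     # quantized neuron outputs

print(codes[:8])
```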

Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 256
Author(s):  
Youngbae Kim ◽  
Shreyash Patel ◽  
Heekyung Kim ◽  
Nandakishor Yadav ◽  
Kyuwon Ken Choi

The power consumption and data processing speed of integrated circuits (ICs) are an increasing concern in many emerging Artificial Intelligence (AI) applications, such as autonomous vehicles and the Internet of Things (IoT). Existing state-of-the-art SRAM architectures for AI computing are highly accurate and can provide high throughput. However, these SRAMs consume high power and occupy a large area to accommodate complex AI models. Carbon nanotube field-effect transistor (CNFET) devices have been reported as potential candidates for AI devices requiring ultra-low power and high throughput because of their satisfactory carrier mobility and symmetrical, well-behaved subthreshold electrical performance. Based on the electrical performance of the CNFET and FinFET devices, we propose novel ultra-low-power and high-throughput 8T SRAMs to circumvent the power and throughput issues in AI computation for autonomous vehicles. We propose two novel 8T SRAMs, a P-Latch N-Access (PLNA) 8T SRAM structure and a single-ended (SE) 8T SRAM structure, and compare their performance with existing state-of-the-art 8T SRAM architectures in terms of power consumption and speed. In both FinFET and CNFET SRAM circuits, higher fin and tube counts lead to higher operating speed, but a large number of fins and tubes also leads to larger area and higher power consumption. Therefore, we optimize the area by reducing the number of tubes and fins without compromising the memory circuit's speed and power. Most importantly, the decoupled read and write paths of our new SRAM cells offer better low-power operation due to device stacking in the read part, achieve better readability and writability, and are read Static Noise Margin (SNM) free because of the isolated read path, isolated write path, and greater pull-up ratio. In addition, the proposed 8T SRAMs show even better delay and power performance when combined with a collaborative voltage sense amplifier and an independent read component. Compared with the existing state-of-the-art 8T SRAM, the proposed PLNA 8T SRAM saves 96% and the proposed SE 8T SRAM saves around 99% of the write power consumption in the FinFET model, as well as 99% for the write operation in the CNFET model.
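
The fin/tube sizing trade-off described above amounts to picking the smallest device count that still meets a speed target. The sketch below illustrates that selection procedure only; the delay and power models in it are hypothetical placeholders, not figures from the paper.

```python
# Hypothetical sizing sweep illustrating the trade-off described above:
# more fins/tubes -> faster access, but larger area and higher power.
# The delay/power models are made-up placeholders; the point is the
# selection procedure, not the numbers.

def access_delay_ps(n_fins: int) -> float:
    # Assumed model: delay improves roughly with drive strength (fin count).
    return 120.0 / n_fins + 15.0

def dynamic_power_uw(n_fins: int) -> float:
    # Assumed model: switched capacitance (and thus power) grows with fin count.
    return 4.0 * n_fins + 2.0

delay_target_ps = 60.0

# Choose the smallest fin count that still meets the delay target,
# which minimizes area and power without compromising speed.
best = min(
    (n for n in range(1, 9) if access_delay_ps(n) <= delay_target_ps),
    default=None,
)

if best is not None:
    print(f"fins = {best}, delay = {access_delay_ps(best):.1f} ps, "
          f"power = {dynamic_power_uw(best):.1f} uW")
```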


2021 ◽  
Vol 15 ◽  
Author(s):  
Pavan Kumar Chundi ◽  
Dewei Wang ◽  
Sung Justin Kim ◽  
Minhao Yang ◽  
Joao Pedro Cerqueira ◽  
...  

This paper presents a novel spiking neural network (SNN) classifier architecture for enabling always-on artificial intelligence (AI) functions, such as keyword spotting (KWS) and visual wake-up, in ultra-low-power Internet-of-Things (IoT) devices. Such always-on hardware tends to dominate the power efficiency of an IoT device, and it is therefore paramount to minimize its power dissipation. A key observation is that the input signal to always-on hardware is typically sparse in time. This is an opportunity that an SNN classifier can leverage, because the switching activity and the power consumption of SNN hardware can scale with the spike rate. To exploit this scalability, the proposed SNN classifier employs an event-driven architecture, in particular fine-grained clock generation and gating as well as fine-grained power gating, to obtain very low static power dissipation. The prototype is fabricated in 65 nm CMOS and occupies an area of 1.99 mm². At a 0.52 V supply voltage, it consumes 75 nW with no input activity and less than 300 nW at 100% input activity, while maintaining competitive inference accuracy for KWS and other always-on classification workloads. The prototype achieves a power consumption reduction of over three orders of magnitude compared to state-of-the-art SNN hardware and of about 2.3× compared to state-of-the-art KWS hardware.
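
A first-order way to read the reported figures is a power model that scales with input activity between the two measured corners (75 nW at no activity, about 300 nW at 100% activity). The linear interpolation used below is an assumption for illustration, not a measured characteristic of the chip.

```python
# First-order sketch of how SNN power scales with input activity, using the
# reported corner points at 0.52 V. The linear interpolation in between is an
# assumption for illustration only.

P_IDLE_NW = 75.0     # static floor with fine-grained clock and power gating
P_FULL_NW = 300.0    # reported upper bound at 100% input activity

def snn_power_nw(activity: float) -> float:
    """Estimate power (nW) for an input activity factor in [0, 1]."""
    activity = min(max(activity, 0.0), 1.0)
    return P_IDLE_NW + activity * (P_FULL_NW - P_IDLE_NW)

for a in (0.0, 0.1, 0.5, 1.0):
    print(f"activity {a:>4.0%}: ~{snn_power_nw(a):.0f} nW")
```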


2016 ◽  
Vol 136 (11) ◽  
pp. 1555-1566 ◽  
Author(s):  
Jun Fujiwara ◽  
Hiroshi Harada ◽  
Takuya Kawata ◽  
Kentaro Sakamoto ◽  
Sota Tsuchiya ◽  
...  

Nano Letters ◽  
2013 ◽  
Vol 13 (4) ◽  
pp. 1451-1456 ◽  
Author(s):  
T. Barois ◽  
A. Ayari ◽  
P. Vincent ◽  
S. Perisanu ◽  
P. Poncharal ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 889
Author(s):  
Xiaoying Deng ◽  
Peiqi Tan

An ultra-low-power K-band LC-VCO (voltage-controlled oscillator) with a wide tuning range is proposed in this paper. Based on a current-reuse topology, a dynamic back-gate-biasing technique is used to reduce power consumption and increase the tuning range. With this technique, cross-coupled pairs with small dimensions can be used, reducing parasitic capacitance and power consumption. Implemented in the SMIC 55 nm 1P7M CMOS process, the proposed VCO achieves a frequency tuning range of 19.1%, from 22.2 GHz to 26.9 GHz, while consuming only 1.9 mW–2.1 mW from a 1.2 V supply and occupying a core area of 0.043 mm². The phase noise ranges from −107.1 dBc/Hz to −101.9 dBc/Hz at a 1 MHz offset over the whole tuning range, while the total harmonic distortion (THD) and output power reach −40.6 dB and −2.9 dBm, respectively.
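
As a quick sanity check on the quoted tuning range, the conventional definition is the frequency span divided by the band-center frequency; the snippet below reproduces the 19.1% figure from the stated 22.2–26.9 GHz range.

```python
# Check of the quoted 19.1% tuning range using the standard definition:
# (f_max - f_min) / f_center, with f_center the midpoint of the band.
f_min_ghz = 22.2
f_max_ghz = 26.9

f_center = (f_min_ghz + f_max_ghz) / 2           # 24.55 GHz
tuning_range = (f_max_ghz - f_min_ghz) / f_center

print(f"tuning range = {tuning_range:.1%}")       # ~19.1%
```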

