A Novel Ultra-Low Power 8T SRAM-Based Compute-in-Memory Design for Binary Neural Networks

Electronics ◽  
2021 ◽  
Vol 10 (17) ◽  
pp. 2181
Author(s):  
Youngbae Kim ◽  
Shuai Li ◽  
Nandakishor Yadav ◽  
Kyuwon Ken Choi

We propose a novel ultra-low-power, voltage-based compute-in-memory (CIM) design with a new single-ended 8T SRAM bit cell structure. Since the proposed SRAM bit cell uses a single bitline for CIM calculation with decoupled read and write operations, it achieves much higher energy efficiency. In addition, the stacked structure of the decoupled read unit minimizes leakage power consumption. Moreover, the proposed bit cell structure provides better read and write stability due to the isolated read path, isolated write path, and greater pull-up ratio. Compared to the state-of-the-art SRAM-CIM, our proposed SRAM-CIM does not require extra transistors for CIM vector-matrix multiplication. We implemented a 16 k (128 × 128) bit cell array for the computation of 128 neurons, and used 64 binary inputs (0 or 1) and 64 × 128 binary weights (−1 or +1) for the binary neural network (BNN). Each row of the bit cell array, corresponding to a single neuron, consists of a total of 128 cells: 64 cells for the dot-product and 64 replica cells for the ADC reference. The 64 replica cells, in turn, consist of 32 cells for the ADC reference and 32 cells for offset calibration. We used a row-by-row ADC for the quantized output of each neuron, supporting 1–7 output bits per neuron. The ADC uses a sweeping method based on the 32 duplicate bit cells, and the number of sweep cycles is set to 2^(N−1) + 1, where N is the number of output bits. The simulation is performed at room temperature (27 °C) in 45 nm technology using Synopsys HSPICE, and all transistors in the bit cells use the minimum size, balancing area, power, and speed. The proposed SRAM-CIM reduces power consumption for vector-matrix multiplication by 99.96% compared to the existing state-of-the-art SRAM-CIM. Furthermore, because the read unit is decoupled from the internal node of the latch, there is no feedback from the read unit, yielding read static noise margin (SNM)-free operation.
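
The dot-product and quantization flow described above can be captured in a short behavioral sketch. The Python model below is purely illustrative: it mirrors only the arithmetic the array performs (binary 0/1 inputs, ±1 weights, 64-element row sums, and 2^(N−1) + 1 sweep steps per N-bit output), and the reference ladder it uses is an assumption, not the circuit's actual ADC references.

```python
# Illustrative behavioral model of the BNN dot-product and sweep-based
# quantization described in the abstract; this is NOT the authors' circuit,
# only a functional sketch of the arithmetic it performs.
import numpy as np

rng = np.random.default_rng(0)

n_inputs = 64                # 64 binary inputs per neuron (0 or 1)
n_neurons = 128              # 128 rows, one neuron per row
n_bits = 4                   # ADC output resolution (1-7 bits supported)

x = rng.integers(0, 2, size=n_inputs)                  # inputs in {0, 1}
w = rng.choice([-1, +1], size=(n_neurons, n_inputs))   # weights in {-1, +1}

# Accumulation per row: each neuron sums x_i * w_ij, giving a value in [-64, +64].
dot = w @ x

# Row-by-row sweep quantization: the reference is stepped 2^(N-1) + 1 times and
# the output code is the number of reference levels the accumulated value reaches.
n_steps = 2 ** (n_bits - 1) + 1
refs = np.linspace(dot.min(), dot.max(), n_steps)      # hypothetical reference ladder
codes = np.array([(d >= refs).sum() for d in dot])     # quantized neuron outputs

print(codes[:8])
```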

Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 256
Author(s):  
Youngbae Kim ◽  
Shreyash Patel ◽  
Heekyung Kim ◽  
Nandakishor Yadav ◽  
Kyuwon Ken Choi

The power consumption and data processing speed of integrated circuits (ICs) are an increasing concern in many emerging Artificial Intelligence (AI) applications, such as autonomous vehicles and the Internet of Things (IoT). Existing state-of-the-art SRAM architectures for AI computing are highly accurate and can provide high throughput. However, these SRAMs consume high power and occupy a large area to accommodate complex AI models. Carbon nanotube field-effect transistor (CNFET) devices have been reported as potential candidates for AI devices requiring ultra-low power and high throughput because of their satisfactory carrier mobility and symmetrical, well-behaved subthreshold electrical performance. Based on the electrical performance of the CNFET and FinFET devices, we propose novel ultra-low-power and high-throughput 8T SRAMs to circumvent the power and throughput issues in AI computation for autonomous vehicles. We propose two novel 8T SRAMs, a P-Latch N-Access (PLNA) 8T SRAM structure and a single-ended (SE) 8T SRAM structure, and compare their performance with existing state-of-the-art 8T SRAM architectures in terms of power consumption and speed. In both FinFET and CNFET SRAM circuits, higher fin and tube counts lead to higher operating speed, but a large number of fins and tubes also leads to larger area and higher power consumption. Therefore, we optimize the area by reducing the number of tubes and fins without compromising the memory circuit's speed and power. Most importantly, the decoupled read and write paths of our new SRAM cells offer better low-power operation due to device stacking in the read part, achieve better readability and writability, and are read Static Noise Margin (SNM) free because of the isolated read path, isolated write path, and greater pull-up ratio. In addition, the proposed 8T SRAMs show even better delay and power performance when combined with a collaborative voltage sense amplifier and an independent read component. Compared with the existing state-of-the-art 8T SRAM, the proposed PLNA 8T SRAM saves 96% and the proposed SE 8T SRAM saves around 99% of the write power consumption in the FinFET model, as well as 99% for the write operation in the CNFET model.
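
The fin/tube sizing trade-off described above amounts to picking the smallest device count that still meets a speed target. The sketch below illustrates that selection procedure only; the delay and power models in it are hypothetical placeholders, not figures from the paper.

```python
# Hypothetical sizing sweep illustrating the trade-off described above:
# more fins/tubes -> faster access, but larger area and higher power.
# The delay/power models are made-up placeholders; the point is the
# selection procedure, not the numbers.

def access_delay_ps(n_fins: int) -> float:
    # Assumed model: delay improves roughly with drive strength (fin count).
    return 120.0 / n_fins + 15.0

def dynamic_power_uw(n_fins: int) -> float:
    # Assumed model: switched capacitance (and thus power) grows with fin count.
    return 4.0 * n_fins + 2.0

delay_target_ps = 60.0

# Choose the smallest fin count that still meets the delay target,
# which minimizes area and power without compromising speed.
best = min(
    (n for n in range(1, 9) if access_delay_ps(n) <= delay_target_ps),
    default=None,
)

if best is not None:
    print(f"fins = {best}, delay = {access_delay_ps(best):.1f} ps, "
          f"power = {dynamic_power_uw(best):.1f} uW")
```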


2021 ◽  
Vol 15 ◽  
Author(s):  
Pavan Kumar Chundi ◽  
Dewei Wang ◽  
Sung Justin Kim ◽  
Minhao Yang ◽  
Joao Pedro Cerqueira ◽  
...  

This paper presents a novel spiking neural network (SNN) classifier architecture for enabling always-on artificial intelligence (AI) functions, such as keyword spotting (KWS) and visual wake-up, in ultra-low-power Internet-of-Things (IoT) devices. Such always-on hardware tends to dominate the power efficiency of an IoT device, and it is therefore paramount to minimize its power dissipation. A key observation is that the input signal to always-on hardware is typically sparse in time. This is an opportunity that an SNN classifier can leverage, because the switching activity and the power consumption of SNN hardware can scale with the spike rate. To exploit this scalability, the proposed SNN classifier employs an event-driven architecture, in particular fine-grained clock generation and gating as well as fine-grained power gating, to obtain very low static power dissipation. The prototype is fabricated in 65 nm CMOS and occupies an area of 1.99 mm². At a 0.52 V supply voltage, it consumes 75 nW with no input activity and less than 300 nW at 100% input activity, while maintaining competitive inference accuracy for KWS and other always-on classification workloads. The prototype achieves a power consumption reduction of over three orders of magnitude compared to state-of-the-art SNN hardware and of about 2.3× compared to state-of-the-art KWS hardware.
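
A first-order way to read the reported figures is a power model that scales with input activity between the two measured corners (75 nW at no activity, about 300 nW at 100% activity). The linear interpolation used below is an assumption for illustration, not a measured characteristic of the chip.

```python
# First-order sketch of how SNN power scales with input activity, using the
# reported corner points at 0.52 V. The linear interpolation in between is an
# assumption for illustration only.

P_IDLE_NW = 75.0     # static floor with fine-grained clock and power gating
P_FULL_NW = 300.0    # reported upper bound at 100% input activity

def snn_power_nw(activity: float) -> float:
    """Estimate power (nW) for an input activity factor in [0, 1]."""
    activity = min(max(activity, 0.0), 1.0)
    return P_IDLE_NW + activity * (P_FULL_NW - P_IDLE_NW)

for a in (0.0, 0.1, 0.5, 1.0):
    print(f"activity {a:>4.0%}: ~{snn_power_nw(a):.0f} nW")
```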


2016 ◽  
Vol 136 (11) ◽  
pp. 1555-1566 ◽  
Author(s):  
Jun Fujiwara ◽  
Hiroshi Harada ◽  
Takuya Kawata ◽  
Kentaro Sakamoto ◽  
Sota Tsuchiya ◽  
...  

Nano Letters ◽  
2013 ◽  
Vol 13 (4) ◽  
pp. 1451-1456 ◽  
Author(s):  
T. Barois ◽  
A. Ayari ◽  
P. Vincent ◽  
S. Perisanu ◽  
P. Poncharal ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 889
Author(s):  
Xiaoying Deng ◽  
Peiqi Tan

An ultra-low-power K-band LC-VCO (voltage-controlled oscillator) with a wide tuning range is proposed in this paper. Based on a current-reuse topology, a dynamic back-gate-biasing technique is used to reduce power consumption and increase the tuning range. With this technique, cross-coupled pairs with small dimensions can be used, reducing parasitic capacitance and power consumption. Implemented in the SMIC 55 nm 1P7M CMOS process, the proposed VCO achieves a frequency tuning range of 19.1%, from 22.2 GHz to 26.9 GHz, while consuming only 1.9 mW–2.1 mW from a 1.2 V supply and occupying a core area of 0.043 mm². The phase noise ranges from −107.1 dBc/Hz to −101.9 dBc/Hz at a 1 MHz offset over the whole tuning range, while the total harmonic distortion (THD) and output power reach −40.6 dB and −2.9 dBm, respectively.
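
As a quick sanity check on the quoted tuning range, the conventional definition is the frequency span divided by the band-center frequency; the snippet below reproduces the 19.1% figure from the stated 22.2–26.9 GHz range.

```python
# Check of the quoted 19.1% tuning range using the standard definition:
# (f_max - f_min) / f_center, with f_center the midpoint of the band.
f_min_ghz = 22.2
f_max_ghz = 26.9

f_center = (f_min_ghz + f_max_ghz) / 2           # 24.55 GHz
tuning_range = (f_max_ghz - f_min_ghz) / f_center

print(f"tuning range = {tuning_range:.1%}")       # ~19.1%
```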

