scholarly journals An Adaptive STDP Learning Rule for Neuromorphic Systems

2021 ◽  
Vol 15 ◽  
Author(s):  
Ashish Gautam ◽  
Takashi Kohno

The promise of neuromorphic computing to develop ultra-low-power intelligent devices lies in its ability to localize information processing and memory storage in synaptic circuits much like the synapses in the brain. Spiking neural networks modeled using high-resolution synapses and armed with local unsupervised learning rules like spike time-dependent plasticity (STDP) have shown promising results in tasks such as pattern detection and image classification. However, designing and implementing a conventional, multibit STDP circuit becomes complex both in terms of the circuitry and the required silicon area. In this work, we introduce a modified and hardware-friendly STDP learning (named adaptive STDP) implemented using just 4-bit synapses. We demonstrate the capability of this learning rule in a pattern recognition task, in which a neuron learns to recognize a specific spike pattern embedded within noisy inhomogeneous Poisson spikes. Our results demonstrate that the performance of the proposed learning rule (94% using just 4-bit synapses) is similar to the conventional STDP learning (96% using 64-bit floating-point precision). The models used in this study are ideal ones for a CMOS neuromorphic circuit with analog soma and synapse circuits and mixed-signal learning circuits. The learning circuit stores the synaptic weight in a 4-bit digital memory that is updated asynchronously. In circuit simulation with Taiwan Semiconductor Manufacturing Company (TSMC) 250 nm CMOS process design kit (PDK), the static power consumption of a single synapse and the energy per spike (to generate a synaptic current of amplitude 15 pA and time constant 3 ms) are less than 2 pW and 200 fJ, respectively. The static power consumption of the learning circuit is less than 135 pW, and the energy to process a pair of pre- and postsynaptic spikes corresponding to a single learning step is less than 235 pJ. A single 4-bit synapse (capable of being configured as excitatory, inhibitory, or shunting inhibitory) along with its learning circuitry and digital memory occupies around 17,250 μm2 of silicon area.

Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 889
Author(s):  
Xiaoying Deng ◽  
Peiqi Tan

An ultra-low-power K-band LC-VCO (voltage-controlled oscillator) with a wide tuning range is proposed in this paper. Based on the current-reuse topology, a dynamic back-gate-biasing technique is utilized to reduce power consumption and increase tuning range. With this technique, small dimension cross-coupled pairs are allowed, reducing parasitic capacitors and power consumption. Implemented in SMIC 55 nm 1P7M CMOS process, the proposed VCO achieves a frequency tuning range of 19.1% from 22.2 GHz to 26.9 GHz, consuming only 1.9 mW–2.1 mW from 1.2 V supply and occupying a core area of 0.043 mm2. The phase noise ranges from −107.1 dBC/HZ to −101.9 dBc/Hz at 1 MHz offset over the whole tuning range, while the total harmonic distortion (THD) and output power achieve −40.6 dB and −2.9 dBm, respectively.


VLSI technology become one of the most significant and demandable because of the characteristics like device portability, device size, large amount of features, expenditure, consistency, rapidity and many others. Multipliers and Adders place an important role in various digital systems such as computers, process controllers and signal processors in order to achieve high speed and low power. Two input XOR/XNOR gate and 2:1 multiplexer modules are used to design the Hybrid Full adders. The XOR/XNOR gate is the key punter of power included in the Full adder cell. However this circuit increases the delay, area and critical path delay. Hence, the optimum design of the XOR/XNOR is required to reduce the power consumption of the Full adder Cell. So a 6 New Hybrid Full adder circuits are proposed based on the Novel Full-Swing XOR/XNOR gates and a New Gate Diffusion Input (GDI) design of Full adder with high-swing outputs. The speed, power consumption, power delay product and driving capability are the merits of the each proposed circuits. This circuit simulation was carried used cadence virtuoso EDA tool. The simulation results based on the 90nm CMOS process technology model.


2016 ◽  
Vol 26 (02) ◽  
pp. 1750027 ◽  
Author(s):  
Chia-Hung Chang ◽  
Cihun-Siyong Alex Gong ◽  
Jian-Chiun Liou ◽  
Yu-Lin Tsou ◽  
Feng-Lin Shiu ◽  
...  

This paper showcases a low-power demodulator for medical implant communication services (MICS) applications. Complementary shunt resistive feedback, current reuse configuration, and sub-threshold LO driving techniques are proposed to achieve ultra-low power consumption. The chip has been implemented in standard CMOS process and consumes only 260-[Formula: see text]W.


2018 ◽  
Vol 27 (13) ◽  
pp. 1850206 ◽  
Author(s):  
Qingshan Yang ◽  
Peiqing Han ◽  
Niansong Mei ◽  
Zhaofeng Zhang

A 16.4[Formula: see text]nW, sub-1[Formula: see text]V voltage reference for ultra-low power low voltage applications is proposed. This design reduces the operating voltage to 0.8[Formula: see text]V by a BJT voltage divider and decreases the silicon area considerably by eliminating resistors. The PTAT and CTAT are based on SCM structures and a scaled-down [Formula: see text], respectively, to improve the process insensitivity. This work is fabricated in 0.18[Formula: see text][Formula: see text]m CMOS process with a total area of 0.0033[Formula: see text]mm2. Measured results show that it works properly for supply voltage from 0.8[Formula: see text]V to 2[Formula: see text]V. The reference voltage is 467.2[Formula: see text]mV with standard deviation ([Formula: see text]) being 12.2 mV and measured TC at best is 38.7[Formula: see text]ppm/[Formula: see text]C ranging from [Formula: see text]C to 60[Formula: see text]C. The total power consumption is 16.4[Formula: see text]nW under the minimum supply voltage at 27[Formula: see text]C.


2021 ◽  
Author(s):  
Lorenzo De Marinis ◽  
Alessandro Catania ◽  
Piero Castoldi ◽  
Giampiero Contestabile ◽  
Paolo Bruschi ◽  
...  

In the modern era of artificial intelligence, increasingly sophisticated artificial neural networks (ANNs) are implemented, which pose challenges in terms of execution speed and power consumption. To tackle this problem, recent research on reduced-precision ANNs opened the possibility to exploit analog hardware for neuromorphic acceleration. In this scenario, photonic-electronic engines are emerging as a short-medium term solution to exploit the high speed and inherent parallelism of optics for linear computations needed in ANN, while resorting to electronic circuitry for signal conditioning and memory storage. In this paper we introduce a precision-scalable integrated photonic-electronic multiply-accumulate neuron, namely PEMAN. The proposed device relies on (i) an analog photonic engine to perform reduced-precision multiplications at high speed and low power, and (ii) an electronic front-end for accumulation and application of the nonlinear activation function by means of a nonlinear encoding in the analog-to-digital converter (ADC). The device, based on the iSiPP50G SOI process for the photonic engine and a commercial 28 nm CMOS process for the electronic front end, has been numerically validated through cosimulations to perform multiply-accumulate operations (MAC). PEMAN exhibits a multiplication accuracy of 6.1 ENOB up to 10 GMAC/s, while it can perform computations up to 56 GMAC/s with a reduced accuracy down to 2.1 ENOB. The device can trade off speed with resolution and power consumption, it outperforms its analog electronics counterparts both in terms of speed and power consumption, and brings substantial improvements also compared to a leading GPU.


Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5706
Author(s):  
Sanghoon Lee ◽  
Dongkyu Lee ◽  
Pyung Choi ◽  
Daejin Park

Light detection and ranging (LiDAR) sensors help autonomous vehicles detect the surrounding environment and the exact distance to an object’s position. Conventional LiDAR sensors require a certain amount of power consumption because they detect objects by transmitting lasers at a regular interval according to a horizontal angular resolution (HAR). However, because the LiDAR sensors, which continuously consume power inefficiently, have a fatal effect on autonomous and electric vehicles using battery power, power consumption efficiency needs to be improved. In this paper, we propose algorithms to improve the inefficient power consumption of conventional LiDAR sensors, and efficiently reduce power consumption in two ways: (a) controlling the HAR to vary the laser transmission period (TP) of a laser diode (LD) depending on the vehicle’s speed and (b) reducing the static power consumption using a sleep mode, depending on the surrounding environment. The proposed LiDAR sensor with the HAR control algorithm reduces the power consumption of the LD by 6.92% to 32.43% depending on the vehicle’s speed, compared to the maximum number of laser transmissions (Nx.max). The sleep mode with a surrounding environment-sensing algorithm reduces the power consumption by 61.09%. The algorithm of the proposed LiDAR sensor was tested on a commercial processor chip, and the integrated processor was designed as an IC using the Global Foundries 55 nm CMOS process.


2013 ◽  
Vol 22 (04) ◽  
pp. 1350026 ◽  
Author(s):  
ZHANGMING ZHU ◽  
YU XIAO ◽  
LIANG LIANG ◽  
LIANXI LIU ◽  
YINTANG YANG

Based on TSMC 0.18 μm 1.8 V CMOS process, a low power 10-bit 200 KS/s successive approximation register (SAR) analog-to-digital (ADC) is realized. This paper mainly considers the improvement of linearity and the optimization of power consumption. And a novel switching sequence is proposed which allows both to achieve a better compromise. Moreover, the fully dynamic comparator, which consumes no static power, and the optimization of SAR control logic, further reduce power consumption. The simulation results show that at 1.0 V supply and 200 KS/s, the ADC achieves an signal-to-noise and distortion-ration (SNDR) of 59.78 dB and consumes 3.03 μW, resulting in a figure-of-merit (FOM) of 19.0 fJ/conversion-step. The ADC core occupies an active area of only 260 × 220 μm2.


2019 ◽  
Vol 3 (3) ◽  
pp. 19-27
Author(s):  
Mohsen Sadeghi ◽  
Mahya Zahedi ◽  
Maaruf Ali

This article presents a low power consumption, high speed multiplier, based on a lowest transistor count novel structure when compared with other traditional multipliers. The proposed structure utilizes 4×4-bit adder units, since it is the base structure of digital multipliers. The main merits of this multiplier design are that: it has the least adder unit count; ultra-low power consumption and the fastest propagation delay in comparison with other gate implementations. The figures demonstrate that the proposed structure consumes 32% less power than using the bypassing Ripple Carry Array (RCA) implementation. Moreover, its propagation delay and adder units count are respectively about 31% and 8.5% lower than the implementation using the bypassing RCA multiplier. All of these simulations were carried out using the HSPICE circuit simulation software in 0.18 μm technology at 1.8 V supply voltage. The proposed design is thus highly suitable in low power drain and high-speed arithmetic electronic circuit applications.


Author(s):  
Yue Lu ◽  
Tom Kazmierski

In this paper, a new approach is proposed for designing ultra-low-power FFT (Fast Fourier Transform) system suitable for use in energy harvesting powered sensors. Bit-serial architecture is adopted to reduce the power consumption of butterfly operation. Simulation results show that, compared with state-of-the-art bit-serial and conventional parallel processors, the proposed technique is superior in terms of silicon area, power consumption, dynamic energy use due to variable precision arithmetic. A sample design of a 64-point FFT shows that the implementation can save about 40% area and 36% leakage power compared with a conventional parallel counterpart, accordingly achieving significant power benefits at a low sample rate and low voltage domain. The dynamic variation of the arithmetic precision can be achieved through a simple modification of the controller with hardware area overhead of 10% gate count.


2012 ◽  
Vol 10 ◽  
pp. 207-213 ◽  
Author(s):  
U. Vishnoi ◽  
T. G. Noll

Abstract. The COordinate Rotate DIgital Computer (CORDIC) algorithm is a well known versatile approach and is widely applied in today's SoCs for especially but not restricted to digital communications. Dedicated CORDIC blocks can be implemented in deep sub-micron CMOS technologies at very low area and energy costs and are attractive to be used as hardware accelerators for Application Specific Instruction Processors (ASIPs). Thereby, overcoming the well known energy vs. flexibility conflict. Optimizing Global Navigation Satellite System (GNSS) receivers to reduce the hardware complexity is an important research topic at present. In such receivers CORDIC accelerators can be used for digital baseband processing (fixed-point) and in Position-Velocity-Time estimation (floating-point). A micro architecture well suited to such applications is presented. This architecture is parameterized according to the wordlengths as well as the number of iterations and can be easily extended for floating point data format. Moreover, area can be traded for throughput by partially or even fully unrolling the iterations, whereby the degree of pipelining is organized with one CORDIC iteration per cycle. From the architectural description, the macro layout can be generated fully automatically using an in-house datapath generator tool. Since the adders and shifters play an important role in optimizing the CORDIC block, they must be carefully optimized for high area and energy efficiency in the underlying technology. So, for this purpose carry-select adders and logarithmic shifters have been chosen. Device dimensioning was automatically optimized with respect to dynamic and static power, area and performance using the in-house tool. The fully sequential CORDIC block for fixed-point digital baseband processing features a wordlength of 16 bits, requires 5232 transistors, which is implemented in a 40-nm CMOS technology and occupies a silicon area of 1560 μm2 only. Maximum clock frequency from circuit simulation of extracted netlist is 768 MHz under typical, and 463 MHz under worst case technology and application corner conditions, respectively. Simulated dynamic power dissipation is 0.24 uW MHz−1 at 0.9 V; static power is 38 uW in slow corner, 65 uW in typical corner and 518 uW in fast corner, respectively. The latter can be reduced by 43% in a 40-nm CMOS technology using 0.5 V reverse-backbias. These features are compared with the results from different design styles as well as with an implementation in 28-nm CMOS technology. It is interesting that in the latter case area scales as expected, but worst case performance and energy do not scale well anymore.


Sign in / Sign up

Export Citation Format

Share Document