Analog memristive synapse based on topotactic phase transition for high-performance neuromorphic computing and neural network pruning

2021 ◽  
Vol 7 (29) ◽  
pp. eabh0648
Author(s):  
Xing Mou ◽  
Jianshi Tang ◽  
Yingjie Lyu ◽  
Qingtian Zhang ◽  
Siyao Yang ◽  
...  

Inspired by the human brain, neuromorphic computing based on nonvolatile memories (NVMs) has emerged as a promising paradigm for building power-efficient computing hardware for artificial intelligence. However, existing NVMs still suffer from physically imperfect device characteristics. In this work, a topotactic phase transition random-access memory (TPT-RAM) with a unique diffusive nonvolatile dual mode based on SrCoOx is demonstrated. The reversible phase transition of SrCoOx is well controlled by oxygen ion migration along the highly ordered oxygen vacancy channels, enabling reproducible analog switching characteristics with reduced variability. The orientation-dependent switching mechanism of TPT-RAM is investigated by combining density functional theory and kinetic Monte Carlo simulations. Furthermore, the dual-mode TPT-RAM is used to mimic the selective stabilization of developing synapses and to implement neural network pruning, removing ~84.2% of redundant synapses while improving the image classification accuracy to 99%. Our work points to a new direction for designing bioplausible memristive synapses for neuromorphic computing.
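The abstract above does not say how the pruning is carried out in the device. As a rough software analogy only, the sketch below shows generic magnitude-based pruning, a swapped-in technique and not the authors' TPT-RAM mechanism. The function name `prune_by_magnitude` and the example weights are hypothetical; the 0.842 sparsity in the usage note simply echoes the ~84.2% figure quoted above.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Generic magnitude pruning (an assumption for illustration), not the
    device-level mechanism described in the abstract.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(round(len(weights) * sparsity))
    # Indices ordered from smallest to largest absolute weight.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


if __name__ == "__main__":
    example = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]  # hypothetical weights
    print(prune_by_magnitude(example, 0.5))
```

Calling `prune_by_magnitude(weights, 0.842)` on a list of 1000 weights would zero 842 of them, keeping only the largest-magnitude connections.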

2019 ◽  
Author(s):  
Rulin Shao ◽  
Hongyu He ◽  
Hui Liu ◽  
Dianbo Liu

BACKGROUND: Artificial neural networks have achieved unprecedented success in a wide variety of domains such as classifying, predicting, and recognizing objects. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to retain control over their sensitive information during both training and use.
OBJECTIVE: To address this problem, we propose a privacy-preserving method for distributed systems. The proposed method, Stochastic Channel-Based Federated Learning (SCBF), enables the participants to train a high-performance model cooperatively without sharing their inputs.
METHODS: Specifically, we design, implement, and evaluate a channel-based update algorithm for the central server in a distributed system. The update algorithm selects the channels corresponding to the most active features in a training loop and uploads them as the information learned from local datasets. A pruning process, which serves as a model accelerator, is applied to the algorithm based on the validation set.
RESULTS: We construct a distributed system consisting of 5 clients and 1 server. Our trials show that the Stochastic Channel-Based Federated Learning method can achieve an AUCROC of 0.9776 and an AUCPR of 0.9695 with 10% of the channels shared with the server. Compared with the Federated Averaging algorithm, the proposed method is 0.05388 higher in AUCROC and 0.09695 higher in AUCPR. In addition, our experiment shows that the pruning process saves 57% of the time, with a reduction of only 0.0047 in AUCROC and 0.0068 in AUCPR.
CONCLUSIONS: In the experiment, our model shows better performance and a higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server. We also demonstrate that the saturating rate of performance can be increased by introducing a pruning process, and that further improvement can be achieved by tuning the pruning rate.
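The channel-selection step described above can be illustrated with a minimal sketch. The abstract does not define how channel activity is measured or how the stochastic element enters, so this version assumes, purely for illustration, that activity is the L2 norm of each channel's weight change and that selection is a deterministic top-k. The function name `select_active_channels` and its arguments are hypothetical.

```python
import math


def select_active_channels(channel_updates, fraction):
    """Return the indices of the most active channels, in ascending order.

    channel_updates: one list of weight deltas per channel (hypothetical layout).
    fraction: share of channels to upload, e.g. 0.1 for the 10% quoted above.
    Activity is assumed to be the L2 norm of each channel's deltas.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    scores = [math.sqrt(sum(d * d for d in deltas)) for deltas in channel_updates]
    k = max(1, math.ceil(len(scores) * fraction))
    # Rank channels from most to least active and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    updates = [[0.1, 0.1], [2.0, -1.0], [0.0, 0.0], [0.5, 0.5]]  # hypothetical
    print(select_active_channels(updates, 0.5))
```

Under these assumptions, a client would upload only the deltas of the returned channels, so the server never sees the full set of local parameters.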


2019 ◽  
Vol 6 (4) ◽  
pp. 041407 ◽  
Author(s):  
Hao Luo ◽  
Bolun Wang ◽  
Enze Wang ◽  
Xuewen Wang ◽  
Yufei Sun ◽  
...  

Author(s):  
Wanting Zhao ◽  
Hong Qi ◽  
Yu Jiang ◽  
Chong Wang ◽  
Fenglin Wei

In underwater image recognition, a chip with a small footprint and low energy consumption must be embedded in an autonomous intelligent underwater vehicle so that it can respond to surrounding objects in real time. To this end, an accelerator with high performance and low energy consumption is designed that exploits the characteristics of convolutional neural networks. Sharing weights between neurons reduces the memory requirement. With all convolutional neural network data stored in on-chip static random-access memory, the need for memory access is drastically decreased. In addition, several small processing elements are combined into a neural functional unit, which considerably reduces the bandwidth requirement through data transmission between processing elements. By sending control signals to the autonomous underwater vehicle, the accelerator enables it to avoid hazards such as rocks and algae in time. The results suggest that the proposed accelerator achieves a higher processing speed than a CPU or GPU, with a footprint of only 6.09 mm² and a power consumption of 327.3 mW at 1 GHz.


2020 ◽  
Vol 6 (27) ◽  
pp. eabb2958 ◽  
Author(s):  
A. Melianas ◽  
T. J. Quill ◽  
G. LeCroy ◽  
Y. Tuchman ◽  
H. v. Loo ◽  
...  

Devices with tunable resistance are highly sought after for neuromorphic computing. Conventional resistive memories, however, suffer from nonlinear and asymmetric resistance tuning and excessive write noise, degrading artificial neural network (ANN) accelerator performance. Emerging electrochemical random-access memories (ECRAMs) display write linearity, which enables substantially faster ANN training by array programming in parallel. However, state-of-the-art ECRAMs have not yet demonstrated stable and efficient operation at temperatures required for packaged electronic devices (~90°C). Here, we show that (semi)conducting polymers combined with ion gel electrolyte films enable solid-state ECRAMs with stable and nearly temperature-independent operation up to 90°C. These ECRAMs show linear resistance tuning over a >2× dynamic range, 20-nanosecond switching, submicrosecond write-read cycling, low noise, and low-voltage (±1 volt) and low-energy (~80 femtojoules per write) operation combined with excellent endurance (>10⁹ write-read operations at 90°C). Demonstration of these high-performance ECRAMs is a fundamental step toward their implementation in hardware ANNs.


2020 ◽  
Vol 96 (3s) ◽  
pp. 585-588
Author(s):  
С.Е. Фролова ◽  
Е.С. Янакова

Methods are proposed for building prototyping platforms for high-performance systems-on-chip (SoCs) for artificial intelligence tasks. The requirements for platforms of this class and the principles for modifying the SoC design for implementation in the prototype are described, along with methods for debugging designs on the prototyping platform. Results are presented for computer vision algorithms using neural network technologies running on an FPGA prototype of the ELcore semantic cores.


Nanomaterials ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1401
Author(s):  
Te Jui Yen ◽  
Albert Chin ◽  
Vladimir Gritsenko

Large device variation is a fundamental challenge for resistive random access memory (RRAM) array circuits. Improved device-to-device distributions of set and reset voltages in a SiNx RRAM device are realized via arsenic ion (As+) implantation. In addition, the As+-implanted SiNx RRAM device exhibits a much tighter cycle-to-cycle distribution than the nonimplanted device. The As+-implanted SiNx device further exhibits excellent performance, with high stability, a large resistance window of 1.73 × 10³ after 10⁴ s of retention at 85 °C, and a large resistance window of 10³ after 10⁵ cycles of the pulsed endurance test. The current–voltage characteristics of both the high- and low-resistance states were analyzed and attributed to a space-charge-limited conduction mechanism. From the simulated defect distribution in the SiNx layer, a microscopic model was established, and the formation and rupture of defect-conductive paths were proposed to explain the resistance switching behavior. The high device performance can therefore be attributed to the sufficient defects created by As+ implantation, which lead to low forming and operation power.


Nanophotonics ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. 937-945
Author(s):  
Ruihuan Zhang ◽  
Yu He ◽  
Yong Zhang ◽  
Shaohua An ◽  
Qingming Zhu ◽  
...  

Ultracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit, with the highest tuning efficiency and data rate ever reported.


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Batyrbek Alimkhanuly ◽  
Joon Sohn ◽  
Ik-Joon Chang ◽  
Seunghyun Lee

Recent studies on neural network quantization have demonstrated a beneficial compromise between accuracy, computation rate, and architecture size. Implementing a 3D Vertical RRAM (VRRAM) array accompanied by device scaling may further improve such networks’ density and energy consumption. Individual device design, optimized interconnects, and careful material selection are key factors determining the overall computation performance. In this work, the impact of replacing conventional devices with microfabricated, graphene-based VRRAM is investigated at the circuit and algorithmic levels. By exploiting a sub-nm thin 2D material, the VRRAM array demonstrates improved read/write margins and read inaccuracy level for the weighted-sum procedure. Moreover, energy consumption is significantly reduced in array programming operations. Finally, an XNOR logic-inspired architecture designed to integrate 1-bit ternary precision synaptic weights into graphene-based VRRAM is introduced. Simulations on VRRAM with metal and graphene word-planes demonstrate 83.5 and 94.1% recognition accuracy, respectively, highlighting the importance of material innovation in neuromorphic computing.
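The abstract above mentions an XNOR logic-inspired architecture without describing it. For orientation, the sketch below shows the XNOR-popcount identity commonly used in binarized neural networks to compute a dot product of +1/-1 vectors with bitwise operations. It is a generic software illustration and not the authors' circuit: it covers only two-level (+1/-1) values, not the ternary weights mentioned above, and the helper names `pack_bits` and `xnor_dot` are hypothetical.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v not in (1, -1):
            raise ValueError("values must be +1 or -1")
        if v == 1:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks positions where the signs agree; each agreement contributes
    +1 and each disagreement -1, so dot = 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask
    return 2 * bin(agree).count("1") - n


if __name__ == "__main__":
    a = [1, -1, 1, 1]   # hypothetical activations
    b = [1, 1, -1, 1]   # hypothetical weights
    print(xnor_dot(pack_bits(a), pack_bits(b), len(a)))
```

The result matches the ordinary sum of elementwise products, which is why the weighted-sum step can be mapped onto simple logic operations.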

