Analog memristive synapse based on topotactic phase transition for high-performance neuromorphic computing and neural network pruning

Xing Mou; Jianshi Tang; Yingjie Lyu; Qingtian Zhang; Siyao Yang; Feng Xu; Wei Liu; Minghong Xu; Yu Zhou; Wen Sun; Yanan Zhong; Bin Gao; Pu Yu; He Qian; Huaqiang Wu

doi:10.1126/sciadv.abh0648

Science Advances ◽

10.1126/sciadv.abh0648 ◽

2021 ◽

Vol 7 (29) ◽

pp. eabh0648

Author(s):

Xing Mou ◽

Jianshi Tang ◽

Yingjie Lyu ◽

Qingtian Zhang ◽

Siyao Yang ◽

...

Keyword(s):

Neural Network ◽

Phase Transition ◽

High Performance ◽

Density Functional ◽

Kinetic Monte Carlo ◽

Random Access ◽

Dual Mode ◽

Neuromorphic Computing ◽

Power Efficient ◽

Network Pruning

Inspired by the human brain, nonvolatile memories (NVMs)–based neuromorphic computing emerges as a promising paradigm to build power-efficient computing hardware for artificial intelligence. However, existing NVMs still suffer from physically imperfect device characteristics. In this work, a topotactic phase transition random-access memory (TPT-RAM) with a unique diffusive nonvolatile dual mode based on SrCoOx is demonstrated. The reversible phase transition of SrCoOx is well controlled by oxygen ion migrations along the highly ordered oxygen vacancy channels, enabling reproducible analog switching characteristics with reduced variability. Combining density functional theory and kinetic Monte Carlo simulations, the orientation-dependent switching mechanism of TPT-RAM is investigated synergistically. Furthermore, the dual-mode TPT-RAM is used to mimic the selective stabilization of developing synapses and implement neural network pruning, reducing ~84.2% of redundant synapses while improving the image classification accuracy to 99%. Our work points out a new direction to design bioplausible memristive synapses for neuromorphic computing.
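
The abstract above reports that pruning removes about 84.2% of redundant synapses. As a rough software analogy only, the sketch below zeroes out a chosen fraction of the smallest-magnitude weights. The magnitude criterion and the names `prune_by_magnitude` and `sparsity` are assumptions made for illustration; the paper performs pruning through the device's dual-mode behavior, which this code does not model.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitude.

    Hypothetical helper: the magnitude criterion is an assumption,
    not the device-level mechanism described in the abstract.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    # Indices sorted by ascending absolute value; the first n_prune are pruned.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)
```

Calling `prune_by_magnitude(weights, 0.842)` on a trained weight list would approximate the reported pruning ratio; whether accuracy holds afterward depends on the network and on retraining.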

One Transistor One Electrolyte‐Gated Transistor Based Spiking Neural Network for Power‐Efficient Neuromorphic Computing System

Advanced Functional Materials ◽

10.1002/adfm.202100042 ◽

2021 ◽

pp. 2100042

Author(s):

Yue Li ◽

Zihao Xuan ◽

Jikai Lu ◽

Zhongrui Wang ◽

Xumeng Zhang ◽

...

Keyword(s):

Neural Network ◽

Computing System ◽

Spiking Neural Network ◽

Neuromorphic Computing ◽

Power Efficient

Design and Verification of Dual Mode Logic (DML) for Power Efficient and High Performance

International Journal of Advance Engineering and Research Development ◽

10.21090/ijaerd.01121 ◽

2014 ◽

Vol 1 (12) ◽

Keyword(s):

High Performance ◽

Dual Mode ◽

Power Efficient

Privacy Preserving: Stochastic Channel-Based Federated Learning with Neural Network Pruning (Preprint)

10.2196/preprints.17111 ◽

2019 ◽

Author(s):

Rulin Shao ◽

Hongyu He ◽

Hui Liu ◽

Dianbo Liu

Keyword(s):

Neural Network ◽

Distributed System ◽

High Performance ◽

Averaging Method ◽

Privacy Preserving ◽

Performance Model ◽

Sensitive Information ◽

Privacy Concerns ◽

Network Pruning ◽

Validation Set

BACKGROUND Artificial neural networks have achieved unprecedented success in a wide variety of domains, such as classifying, predicting, and recognizing objects. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to keep control over their sensitive information during both training and use. OBJECTIVE To address this problem, we propose a privacy-preserving method for distributed systems. The proposed method, Stochastic Channel-Based Federated Learning (SCBF), enables participants to train a high-performance model cooperatively without sharing their inputs. METHODS Specifically, we design, implement, and evaluate a channel-based update algorithm for the central server in a distributed system. The update algorithm selects the channels corresponding to the most active features in a training loop and uploads them as learned information from local datasets. A pruning process, which serves as a model accelerator, is applied to the algorithm based on the validation set. RESULTS We construct a distributed system consisting of 5 clients and 1 server. Our trials show that SCBF can achieve an AUCROC of 0.9776 and an AUCPR of 0.9695 with 10% of channels shared with the server. Compared with the Federated Averaging algorithm, the proposed method achieves an AUCROC 0.05388 higher and an AUCPR 0.09695 higher. In addition, our experiment shows that the pruning process saves 57% of the time with a reduction of only 0.0047 in AUCROC and 0.0068 in AUCPR. CONCLUSIONS In the experiment, our model shows better performance and a higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server. We also demonstrate that the saturating rate of performance can be improved by introducing a pruning process, and that further improvement can be achieved by tuning the pruning rate.
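
The channel-selection step described in the METHODS part of this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the activity score (L2 norm of each channel's update) and the name `select_active_channels` are hypothetical, since the abstract does not specify how "most active" is measured.

```python
import math


def select_active_channels(channel_updates, share_ratio):
    """Return the indices of the most active channels, in ascending order.

    channel_updates: list of per-channel update vectors (lists of floats).
    share_ratio: fraction of channels to share, e.g. 0.1 for 10%.
    """
    # Assumed activity score: L2 norm of each channel's update vector.
    scores = [math.sqrt(sum(u * u for u in upd)) for upd in channel_updates]
    # Always share at least one channel.
    n_share = max(1, math.ceil(len(scores) * share_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_share])
```

In a federated setting, each client would upload only the updates for the returned indices, so the server never sees the full set of local parameters.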

Phase-transition modulated, high-performance dual-mode photodetectors based on WSe2/VO2 heterojunctions

Applied Physics Reviews ◽

10.1063/1.5124672 ◽

2019 ◽

Vol 6 (4) ◽

pp. 041407 ◽

Cited By ~ 7

Author(s):

Hao Luo ◽

Bolun Wang ◽

Enze Wang ◽

Xuewen Wang ◽

Yufei Sun ◽

...

Keyword(s):

Phase Transition ◽

High Performance ◽

Dual Mode

A convolutional neural network accelerator for real-time underwater image recognition of autonomous underwater vehicle

Proceedings of the Institution of Mechanical Engineers Part I Journal of Systems and Control Engineering ◽

10.1177/0959651820958208 ◽

2020 ◽

pp. 095965182095820

Author(s):

Wanting Zhao ◽

Hong Qi ◽

Yu Jiang ◽

Chong Wang ◽

Fenglin Wei

Keyword(s):

Neural Network ◽

Energy Consumption ◽

Convolutional Neural Network ◽

Real Time ◽

Image Recognition ◽

High Performance ◽

Autonomous Underwater Vehicle ◽

Random Access ◽

Underwater Vehicle ◽

Underwater Image

In underwater image recognition, a chip with a small footprint and low energy consumption must be embedded in an autonomous intelligent underwater vehicle so that it can respond in real time to surrounding objects. Therefore, an accelerator with high performance and low energy consumption is designed that exploits the features of convolutional neural networks. Weight sharing between neurons reduces the memory requirement. With all convolutional neural network data stored in on-chip static random-access memory, the need for memory access is drastically reduced. In addition, several small processing elements are combined to form a neural functional unit, which considerably reduces the bandwidth requirement through inter-processing-element data transmission. By sending control signals to the autonomous underwater vehicle, the accelerator enables it to avoid hazards such as rocks and algae in time. The results suggest that the proposed accelerator achieves a higher processing speed than a CPU or GPU, with a footprint of only 6.09 mm² and a power consumption of 327.3 mW at 1 GHz.

Temperature-resilient solid-state organic artificial synapses for neuromorphic computing

Science Advances ◽

10.1126/sciadv.abb2958 ◽

2020 ◽

Vol 6 (27) ◽

pp. eabb2958 ◽

Cited By ~ 4

Author(s):

A. Melianas ◽

T. J. Quill ◽

G. LeCroy ◽

Y. Tuchman ◽

H. v. Loo ◽

...

Keyword(s):

Solid State ◽

High Performance ◽

Dynamic Range ◽

Low Voltage ◽

Random Access ◽

Low Noise ◽

Neuromorphic Computing ◽

Efficient Operation ◽

Linear Resistance ◽

Artificial Neural Network (ANN)

Devices with tunable resistance are highly sought after for neuromorphic computing. Conventional resistive memories, however, suffer from nonlinear and asymmetric resistance tuning and excessive write noise, degrading artificial neural network (ANN) accelerator performance. Emerging electrochemical random-access memories (ECRAMs) display write linearity, which enables substantially faster ANN training by array programming in parallel. However, state-of-the-art ECRAMs have not yet demonstrated stable and efficient operation at temperatures required for packaged electronic devices (~90°C). Here, we show that (semi)conducting polymers combined with ion gel electrolyte films enable solid-state ECRAMs with stable and nearly temperature-independent operation up to 90°C. These ECRAMs show linear resistance tuning over a >2× dynamic range, 20-nanosecond switching, submicrosecond write-read cycling, low noise, and low-voltage (±1 volt) and low-energy (~80 femtojoules per write) operation combined with excellent endurance (>10⁹ write-read operations at 90°C). Demonstration of these high-performance ECRAMs is a fundamental step toward their implementation in hardware ANNs.

Methods for Achieving Maximum Efficiency of a Prototyping Platform for High-Performance Systems-on-Chip on Artificial Intelligence Tasks

Nanoindustry Russia ◽

10.22184/1993-8578.2020.13.3s.585.588 ◽

2020 ◽

Vol 96 (3s) ◽

pp. 585-588

Author(s):

С.Е. Фролова ◽

Е.С. Янакова

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Computer Vision ◽

High Performance ◽

Systems On Chip ◽

High Performance Systems ◽

On Chip ◽

Network Technologies ◽

Neural Network Technologies

Methods are proposed for building prototyping platforms for high-performance systems-on-chip (SoC) for artificial intelligence tasks. The requirements for platforms of this class and the principles for modifying the SoC design for implementation in the prototype are described, along with methods for debugging projects on the prototyping platform. Results of running computer vision algorithms based on neural network technologies on the FPGA prototype of the ELcore semantic cores are presented.

Improved Device Distribution in High-Performance SiNx Resistive Random Access Memory via Arsenic Ion Implantation

Nanomaterials ◽

10.3390/nano11061401 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1401

Author(s):

Te Jui Yen ◽

Albert Chin ◽

Vladimir Gritsenko

Keyword(s):

High Performance ◽

Conduction Mechanism ◽

Random Access ◽

Random Access Memory ◽

Resistive Random Access Memory ◽

Switching Behavior ◽

Access Memory ◽

Current Voltage ◽

Current Voltage Characteristics ◽

Limited Conduction

Large device variation is a fundamental challenge for resistive random access memory (RRAM) array circuits. Improved device-to-device distributions of set and reset voltages in a SiNx RRAM device are realized via arsenic ion (As⁺) implantation. In addition, the As⁺-implanted SiNx RRAM device exhibits a much tighter cycle-to-cycle distribution than the nonimplanted device. The As⁺-implanted SiNx device also shows high stability, with a large resistance window of 1.73 × 10³ after 10⁴ s of retention at 85 °C and a large resistance window of 10³ after 10⁵ cycles of the pulsed endurance test. The current–voltage characteristics of both the high- and low-resistance states were found to follow a space-charge-limited conduction mechanism. From the simulated defect distribution in the SiNx layer, a microscopic model was established, and the formation and rupture of defect-conductive paths were proposed to explain the resistance switching behavior. The high device performance can therefore be attributed to the sufficient defects created by As⁺ implantation, which lead to low forming and operation power.

Ultracompact and low-power-consumption silicon thermo-optic switch for high-speed data

Nanophotonics ◽

10.1515/nanoph-2020-0496 ◽

2020 ◽

Vol 10 (2) ◽

pp. 937-945

Author(s):

Ruihuan Zhang ◽

Yu He ◽

Yong Zhang ◽

Shaohua An ◽

Qingming Zhu ◽

...

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

High Performance ◽

Pulse Amplitude ◽

Telecommunication Networks ◽

Low Power Consumption ◽

Power Efficient ◽

High Speed Data ◽

On Chip

Ultracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit with the highest tuning efficiency and data rate ever reported.

Graphene-based 3D XNOR-VRRAM with ternary precision for neuromorphic computing

npj 2D Materials and Applications ◽

10.1038/s41699-021-00236-x ◽

2021 ◽

Vol 5 (1) ◽

Author(s):

Batyrbek Alimkhanuly ◽

Joon Sohn ◽

Ik-Joon Chang ◽

Seunghyun Lee

Keyword(s):

Neural Network ◽

Energy Consumption ◽

Recognition Accuracy ◽

Material Selection ◽

Weighted Sum ◽

Device Design ◽

Key Factors ◽

Neuromorphic Computing ◽

Device Scaling ◽

The Impact

Recent studies on neural network quantization have demonstrated a beneficial compromise between accuracy, computation rate, and architecture size. Implementing a 3D Vertical RRAM (VRRAM) array accompanied by device scaling may further improve such networks’ density and energy consumption. Individual device design, optimized interconnects, and careful material selection are key factors determining the overall computation performance. In this work, the impact of replacing conventional devices with microfabricated, graphene-based VRRAM is investigated at the circuit and algorithmic levels. By exploiting a sub-nm thin 2D material, the VRRAM array demonstrates improved read/write margins and a lower read inaccuracy level for the weighted-sum procedure. Moreover, energy consumption is significantly reduced in array programming operations. Finally, an XNOR logic-inspired architecture designed to integrate 1-bit ternary precision synaptic weights into graphene-based VRRAM is introduced. Simulations on VRRAM with metal and graphene word-planes demonstrate 83.5 and 94.1% recognition accuracy, respectively, denoting the importance of material innovation in neuromorphic computing.
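
To make the idea of ternary synaptic weights and the weighted-sum procedure concrete, here is a minimal software sketch. It assumes a simple symmetric threshold rule for mapping real-valued weights to {-1, 0, +1}; the functions `ternarize` and `weighted_sum` are hypothetical and do not reproduce the XNOR-based VRRAM architecture of the paper.

```python
def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1.

    Assumed rule: weights with magnitude at or below the threshold become 0;
    the rest keep only their sign.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def weighted_sum(inputs, ternary_weights):
    """Weighted sum of inputs with ternary weights (one neuron's pre-activation)."""
    if len(inputs) != len(ternary_weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, ternary_weights))
```

With ternary weights, each multiplication reduces to adding, subtracting, or skipping an input, which is what makes such schemes attractive for in-memory hardware.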
