Mitigating State-Drift in Memristor Crossbar Arrays for Vector Matrix Multiplication

2021 ◽  
Author(s):  
Amirali Amirsoleimani ◽  
Tony Liu ◽  
Fabien Alibart ◽  
Serge Eccofey ◽  
Yao-Feng Chang ◽  
...  

In this chapter, we review recent progress on resistance-drift mitigation techniques for resistive switching memory devices (specifically memristors) and their impact on accuracy in deep neural network applications. In the first section of the chapter, we investigate the importance of soft errors and their detrimental impact on the performance of memristor-based vector-matrix multiplication (VMM) platforms, especially the memristance state-drift induced by long-term recurring inference operations with sub-threshold stress voltage. We also briefly review several recently developed state-drift mitigation methods. In the next section of the chapter, we discuss an adaptive inference technique with low hardware overhead that mitigates memristance drift in memristive VMM platforms by using optimization techniques to adjust the inference voltage characteristics associated with different network layers. We also present simulation results and the performance improvements achieved by applying the proposed inference technique, taking non-idealities into account, for various deep network applications on memristor crossbar arrays. This chapter suggests that a simple, low-overhead inference technique can revive the functionality and enhance the performance of memristor-based VMM arrays and significantly increase their lifetime, which can be an important factor in making this technology a mainstream player in future in-memory computing platforms.
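The adaptive idea described above can be illustrated with a minimal sketch. The drift model, parameter values, and function names below are hypothetical, not taken from the chapter: conductance decays slightly with each sub-threshold read, and the inference (read) voltage is rescaled so the crossbar output current tracks its calibrated value.

```python
# Hypothetical illustration: memristance state-drift under recurring
# inference reads, compensated by adapting the read voltage.

def drifted_conductance(g0, n_reads, drift_per_read=1e-4):
    """Conductance after n_reads sub-threshold read operations
    (simple multiplicative drift model, illustrative only)."""
    return g0 * (1.0 - drift_per_read) ** n_reads

def adaptive_read_voltage(v0, g0, g_now):
    """Scale the inference voltage so the output current I = G * V
    is restored to its calibrated value g0 * v0."""
    return v0 * g0 / g_now

g0, v0 = 1e-3, 0.2              # calibrated conductance (S) and read voltage (V)
g = drifted_conductance(g0, n_reads=10_000)

i_drifted = g * v0              # current without compensation (too low)
i_adapted = g * adaptive_read_voltage(v0, g0, g)
target = g0 * v0                # current the network was trained against
```

In a real crossbar the adjustment would be per-layer and constrained by device limits, as the chapter's optimization formulation suggests; this sketch only shows the compensating principle for a single cell.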

2010 ◽  
Vol 20 (02) ◽  
pp. 103-121 ◽  
Author(s):  
MOSTAFA I. SOLIMAN ◽  
ABDULMAJID F. Al-JUNAID

Technological advances in IC manufacturing provide the capability to integrate more and more functionality into a single chip. Today's modern processors have nearly one billion transistors on a single chip. With the increasing complexity of today's systems, designs have to be modeled at a high level of abstraction before partitioning into hardware and software components for final implementation. This paper explains in detail the implementation and performance evaluation of a matrix processor called Mat-Core with SystemC (a system-level modeling language). Mat-Core is a research processor aiming to exploit the increasing number of transistors per IC to improve the performance of a wide range of applications. It extends a general-purpose scalar processor with a matrix unit. To hide memory latency, the extended matrix unit is decoupled into two components, address generation and data computation, which communicate through data queues. Like vector architectures, the data computation unit is organized in parallel lanes. However, on its parallel lanes, Mat-Core can execute matrix-scalar, matrix-vector, and matrix-matrix instructions in addition to vector-scalar and vector-vector instructions. To control the execution of vector/matrix instructions on the matrix core, this paper extends the well-known scoreboard technique. Furthermore, the performance of Mat-Core is evaluated on vector and matrix kernels. Our results show that a four-lane Mat-Core with 4 × 4 (16-element) matrix registers, a queue size of 10, a startup time of 6 clock cycles, and a memory latency of 10 clock cycles achieves about 0.94, 1.3, 2.3, 1.6, 2.3, and 5.5 FLOPs per clock cycle on scalar-vector multiplication, SAXPY, Givens rotation, rank-1 update, vector-matrix multiplication, and matrix-matrix multiplication, respectively.
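The lane organization described above can be sketched functionally. This is not the Mat-Core ISA or its SystemC model; it is a hypothetical illustration of how a vector-matrix multiply might be partitioned across parallel lanes, with each lane owning an interleaved slice of the output columns.

```python
# Illustrative sketch (not the actual Mat-Core design): vector-matrix
# multiply split across parallel lanes; lane p computes every p-th column.

def vector_matrix_lanes(v, M, n_lanes=4):
    rows, cols = len(M), len(M[0])
    assert len(v) == rows
    out = [0.0] * cols
    for lane in range(n_lanes):
        # Each lane handles the interleaved column set {lane, lane+n_lanes, ...};
        # in hardware these loops would run concurrently.
        for j in range(lane, cols, n_lanes):
            out[j] = sum(v[i] * M[i][j] for i in range(rows))
    return out

v = [1.0, 2.0]
M = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]
result = vector_matrix_lanes(v, M)   # [1.0, 2.0, 4.0, 7.0]
```

Interleaving columns across lanes keeps the per-lane work balanced regardless of matrix width, which is one plausible reason for the lane organization borrowed from vector architectures.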


MRS Bulletin ◽  
1997 ◽  
Vol 22 (10) ◽  
pp. 49-54 ◽  
Author(s):  
E. Todd Ryan ◽  
Andrew J. McKerrow ◽  
Jihperng Leu ◽  
Paul S. Ho

Continuing improvement in device density and performance has significantly affected the dimensions and complexity of the wiring structure for on-chip interconnects. These enhancements have led to a reduction in the wiring pitch and an increase in the number of wiring levels to meet demands for density and performance. As device dimensions shrink below 0.25 μm, the propagation delay, crosstalk noise, and power dissipation due to resistance-capacitance (RC) coupling become significant. Accordingly, interconnect delay now constitutes a major fraction of the total delay limiting overall chip performance. Equally important is the processing complexity due to the increase in the number of wiring levels, which inevitably drives up cost by lowering manufacturing yield through added defects and processing steps. To address these problems, new materials for use as metal lines and interlayer dielectrics (ILDs), as well as alternative architectures, have surfaced to replace the current Al(Cu)/SiO2 interconnect technology. These alternative architectures will require the introduction of low-dielectric-constant (low-k) materials as the interlayer dielectrics and/or low-resistivity conductors such as copper. The electrical and thermomechanical properties of SiO2 are ideal for ILD applications, and a change to a material with different properties has important process-integration implications. To facilitate the choice of an alternative ILD, it is necessary to establish general criteria for evaluating the thin-film properties of candidate low-k materials, which can later be correlated with process-integration problems.
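A back-of-the-envelope calculation shows why lowering k (or resistivity) directly reduces interconnect delay: in a first-order model the RC product scales linearly with the dielectric constant. The geometry, parallel-plate capacitance formula, and k = 2.7 low-k value below are illustrative assumptions, not figures from the article.

```python
# First-order RC delay of a wire: R = rho * L / (w * t), and a single
# parallel-plate sidewall capacitance C = k * eps0 * L * t / s.
# Illustrative numbers only.

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def rc_delay(rho, k, length, width, thickness, spacing):
    r = rho * length / (width * thickness)          # wire resistance, ohm
    c = k * EPS0 * length * thickness / spacing     # sidewall capacitance, F
    return r * c                                    # delay, s

# Same Al wire geometry, SiO2 (k ~ 3.9) vs. a hypothetical k = 2.7 low-k ILD
geo = dict(length=1e-3, width=0.25e-6, thickness=0.5e-6, spacing=0.25e-6)
delay_sio2 = rc_delay(rho=2.7e-8, k=3.9, **geo)
delay_lowk = rc_delay(rho=2.7e-8, k=2.7, **geo)
ratio = delay_lowk / delay_sio2   # = 2.7 / 3.9, about a 31% reduction
```

Note that R and C scale with length and length-squared effects compound in long global wires, which is why the delay penalty grows as chips scale down while die sizes do not.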


Author(s):  
Xiaomo Jiang ◽  
Craig Foster

Gas turbine simple- and combined-cycle plants are built and operated for high availability, reliability, and performance in order to provide the customer with sufficient operating revenues and reduced fuel costs while enhancing dispatch competitiveness. A tremendous amount of operational data is collected from the everyday operation of a power plant, and turning this data into knowledge, and further into solutions, through advanced state-of-the-art analytics has become an increasingly important but challenging task. This paper presents an integrated system and methodology for this purpose, automating multi-level, multi-paradigm, multi-facet performance monitoring and anomaly detection for heavy-duty gas turbines. The system provides an intelligent platform to drive site-specific performance improvements, mitigate outage risk, rationalize operational patterns, and enhance maintenance schedules and service offerings by taking appropriate proactive actions. The paper also describes the components of the system, including data sensing, hardware and operational anomaly detection, proactive expert actions, site-specific degradation assessment, and water-wash effectiveness monitoring and analytics. As demonstrated in two examples, this remote performance monitoring aims to improve equipment efficiency by converting data into knowledge and solutions that drive value for customers, including lower operating fuel costs and increased power sales and life-cycle value.
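One simple paradigm of the monitoring described above is statistical anomaly detection on a sensor channel. The sketch below is a generic rolling z-score detector, not the paper's method; the temperature trace, window size, and threshold are invented for illustration.

```python
# Hypothetical sketch: flag sensor readings that deviate more than
# z_thresh standard deviations from the preceding window of readings.

from statistics import mean, stdev

def detect_anomalies(readings, window=5, z_thresh=3.0):
    flagged = []
    for i in range(window, len(readings)):
        hist = readings[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(readings[i] - mu) / sigma > z_thresh:
            flagged.append(i)
    return flagged

# Steady exhaust-temperature trace (degrees C) with one step anomaly at index 8
temps = [520.0, 521.0, 519.5, 520.5, 520.0, 519.8, 520.2, 520.1, 560.0, 520.3]
anomalies = detect_anomalies(temps)   # [8]
```

A production system would combine many channels, physics-based degradation models, and expert rules, as the paper's multi-paradigm framing implies; a single-channel z-score is only the simplest building block.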


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 253
Author(s):  
Yosang Jeong ◽  
Hoon Ryu

The non-equilibrium Green’s function (NEGF) formalism is utilized in the field of nanoscience to predict transport behaviors of electronic devices. This work explores how much performance improvement can be obtained for quantum transport simulations with the aid of manycore computing, where the core numerical operation involves a recursive process of matrix multiplication. The major techniques adopted for performance enhancement are data restructuring, matrix tiling, thread scheduling, and offload computing, and we present technical details on how they are applied to optimize the performance of simulations on computing hardware including Intel Xeon Phi Knights Landing (KNL) systems and NVIDIA general-purpose graphics processing unit (GPU) devices. With a target structure of a silicon nanowire that consists of 100,000 atoms and is described with an atomistic tight-binding model, the effects of the optimization techniques on simulation performance are rigorously tested in a KNL node equipped with two Quadro GV100 GPU devices, and we observe that computation is accelerated by a factor of up to ∼20 against the unoptimized case. The feasibility of handling large-scale workloads in a huge computing environment is also examined with nanowire simulations over a wide energy range, where good scalability is achieved up to 2048 KNL nodes.
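The "recursive process of matrix multiplication" at the heart of such NEGF solvers can be illustrated in its simplest form. The sketch below reduces the recursive Green's function forward sweep to scalar (1×1) blocks for a 1D tight-binding chain; the energy, hopping, and chain length are illustrative, and real solvers like the one in this work recurse over large dense blocks, which is what the tiling and offloading optimizations target.

```python
# Scalar sketch of the NEGF forward recursion for a 1D tight-binding chain:
#   g_i = [ (E + i*eta) - h_i - t^2 * g_{i-1} ]^-1
# In production codes h_i and g_i are dense matrix blocks and the t^2 * g
# term is a pair of matrix multiplications per site.

def left_connected_greens(E, onsite, hopping, eta=1e-6):
    z = complex(E, eta)           # small imaginary part for stability
    g = []
    for i, h in enumerate(onsite):
        sigma = hopping ** 2 * g[-1] if i > 0 else 0.0
        g.append(1.0 / (z - h - sigma))
    return g

# Outside the band (|E| > 2t) the recursion converges to the semi-infinite
# lead surface value (E - sqrt(E^2 - 4 t^2)) / (2 t^2).
g_chain = left_connected_greens(E=3.0, onsite=[0.0] * 100, hopping=1.0)
```

Because each site's block depends on the previous site's result, the recursion itself is sequential; the parallelism exploited on KNL and GPUs comes from within each block-level matrix multiplication and from running many independent energy points concurrently.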


AIHA Journal ◽  
2003 ◽  
Vol 64 (5) ◽  
pp. 660-667 ◽  
Author(s):  
Katharyn A. Grant ◽  
John G. Garland ◽  
Todd C. Joachim ◽  
Andrew Wallen ◽  
Twyla Vital

Author(s):  
K. Boddenberg ◽  
B. Kock ◽  
M. Dorfman ◽  
L. Russo ◽  
M. Nestler

Air separation plants use centrifugal compressors in which air and electrical energy are the only raw materials used in the production process. Energy costs therefore play a crucial role, and the compressors are heavily penalized when guaranteed performance levels are not achieved. To improve performance, abradable coatings, previously used in the gas turbine industry, have been designed into turbocompressors. This paper shows the optimization of, and the performance improvements from, a new aluminium silicon-boron nitride material.

