Accelerating On-Chip Training with Ferroelectric-Based Hybrid Precision Synapse

2022 ◽  
Vol 18 (2) ◽  
pp. 1-20
Author(s):  
Yandong Luo ◽  
Panni Wang ◽  
Shimeng Yu

In this article, we propose a hardware accelerator design using ferroelectric transistor (FeFET)-based hybrid precision synapse (HPS) for deep neural network (DNN) on-chip training. The drain erase scheme for FeFET programming is incorporated for both FeFET HPS design and FeFET buffer design. By using drain erase, high-density FeFET buffers can be integrated onchip to store the intermediate input-output activations and gradients, which reduces the energy consuming off-chip DRAM access. Architectural evaluation results show that the energy efficiency could be improved by 1.2× ∼ 2.1×, 3.9× ∼ 6.0× compared to the other HPS-based designs and emerging non-volatile memory baselines, respectively. The chip area is reduced by 19% ∼ 36% compared with designs using SRAM on-chip buffer even though the capacity of FeFET buffer is increased. Besides, by utilizing drain erase scheme for FeFET programming, the chip area is reduced by 11% ∼ 28.5% compared with the designs using body erase scheme.

2007 ◽  
Vol 49 (3) ◽  
Author(s):  
Christopher Claus ◽  
Walter Stechele ◽  
Andreas Herkersdorf

In this article the Autovision architecture is presented, a new Multi Processor System-on-Chip (MPSoC) architecture for future video-based driver assistance systems, using run-time reconfigurable hardware accelerator engines for video processing. According to various driving conditions (highway, city, sunlight, rain, tunnel entrance) different algorithms have to be used for video processing. These different algorithms require different hardware accelerator engines, which are loaded into the Autovision chip at run-time of the system, triggered by changing driving conditions. It was investigated how to use dynamic partial reconfiguration to load and operate the correct hardware accelerator engines in time, while removing unused engines in order to save precious chip area.


2015 ◽  
Vol 24 (06) ◽  
pp. 1550079 ◽  
Author(s):  
Tiefei Zhang ◽  
Jixiang Zhu ◽  
Jun Fu ◽  
Tianzhou Chen

Due to its large leakage power and low density, the conventional SARM becomes less appealing to implement the large on-chip cache due to energy issue. Emerging non-volatile memory technologies, such as phase change memory (PCM) and spin-transfer torque RAM (STT-RAM), have advantages of low leakage power and high density, which makes them good candidates for on-chip cache. In particular, STT-RAM has longer endurance and shorter access latency over PCM. There are two kinds of STT-RAM so far: single-level cell (SLC) STT-RAM and multi-level cell (MLC) STT-RAM. Compared to the SLC STT-RAM, the MLC STT-RAM has higher density and lower leakage power, which makes it a even more promising candidate for future on-chip cache. However, MLC STT-RAM improves density at the cost of almost doubled write latency and energy compared to the SLC STT-RAM. These drawbacks degrade the system performance and diminish the energy benefits. To alleviate these problems, we propose a novel cache organization, companion write cache (CWC), which is a small fully associative SRAM cache, working with the main MLC STT-RAM cache in a master-and-servant way. The key function of CWC is to absorb the energy-consuming write updates from the MLC STT-RAM cache. The experimental results are promising that CWC can greatly reduce the write energy and dynamic energy, improve the performance and endurance of MLC STT-RAM cache compared to a baseline.


Author(s):  
Yunfei Fu ◽  
Hongchuan Yu ◽  
Chih-Kuo Yeh ◽  
Tong-Yee Lee ◽  
Jian J. Zhang

Brushstrokes are viewed as the artist’s “handwriting” in a painting. In many applications such as style learning and transfer, mimicking painting, and painting authentication, it is highly desired to quantitatively and accurately identify brushstroke characteristics from old masters’ pieces using computer programs. However, due to the nature of hundreds or thousands of intermingling brushstrokes in the painting, it still remains challenging. This article proposes an efficient algorithm for brush Stroke extraction based on a Deep neural network, i.e., DStroke. Compared to the state-of-the-art research, the main merit of the proposed DStroke is to automatically and rapidly extract brushstrokes from a painting without manual annotation, while accurately approximating the real brushstrokes with high reliability. Herein, recovering the faithful soft transitions between brushstrokes is often ignored by the other methods. In fact, the details of brushstrokes in a master piece of painting (e.g., shapes, colors, texture, overlaps) are highly desired by artists since they hold promise to enhance and extend the artists’ powers, just like microscopes extend biologists’ powers. To demonstrate the high efficiency of the proposed DStroke, we perform it on a set of real scans of paintings and a set of synthetic paintings, respectively. Experiments show that the proposed DStroke is noticeably faster and more accurate at identifying and extracting brushstrokes, outperforming the other methods.


2020 ◽  
Author(s):  
Chiou-Jye Huang ◽  
Yamin Shen ◽  
Ping-Huan Kuo ◽  
Yung-Hsiang Chen

AbstractThe coronavirus disease 2019 pandemic continues as of March 26 and spread to Europe on approximately February 24. A report from April 29 revealed 1.26 million confirmed cases and 125 928 deaths in Europe. This study proposed a novel deep neural network framework, COVID-19Net, which parallelly combines a convolutional neural network (CNN) and bidirectional gated recurrent units (GRUs). Three European countries with severe outbreaks were studied—Germany, Italy, and Spain—to extract spatiotemporal feature and predict the number of confirmed cases. The prediction results acquired from COVID-19Net were compared to those obtained using a CNN, GRU, and CNN-GRU. The mean absolute error, mean absolute percentage error, and root mean square error, which are commonly used model assessment indices, were used to compare the accuracy of the models. The results verified that COVID-19Net was notably more accurate than the other models. The mean absolute percentage error generated by COVID-19Net was 1.447 for Germany, 1.801 for Italy, and 2.828 for Spain, which were considerably lower than those of the other models. This indicated that the proposed framework can accurately predict the accumulated number of confirmed cases in the three countries and serve as a crucial reference for devising public health strategies.


2008 ◽  
Vol 17 (06) ◽  
pp. 1161-1172 ◽  
Author(s):  
HUA-PIN CHEN ◽  
KUO-HSIUNG WU

A new voltage-mode biquad with four inputs and four outputs using only two differential difference current conveyors (DDCCs), two grounded capacitors, and two resistors is proposed. The proposed circuit can act as a multifunction voltage-mode filter with one or three inputs and four outputs and can perform simultaneous realization of voltage-mode notch, highpass, bandpass, and lowpass filter signals from the four output terminals, respectively, without any component choice conditions. On the other hand, it also can act as a universal voltage-mode filter with four inputs and a single output and can realize five generic voltage-mode filter signals from the same configuration without any component-matching conditions. Finally, to verify our architecture, we have designed this analog filter chip with TSMC 0.35 μm 2P4M CMOS technology. This chip operates to 1.125 MHz and consumes 30.95 mW. The chip area of the analog filter is about 0.822 mm2.


2010 ◽  
Vol 7 (1) ◽  
pp. 35-43 ◽  
Author(s):  
John H. Lau

Moore's law has been the most powerful driver for the development of the microelectronic industry. This law is grounded in lithography scaling and integration (in 2D) of all functions on a single chip, perhaps through system-on-chip (SoC). On the other hand, the integration of all these functions can be achieved through system-in-package (SiP) or, ultimately, 3D IC integration. However, there are many critical issues for 3D IC integration. In this study, some of the critical issues will be discussed and some potential solutions or research problems will be proposed.


Sign in / Sign up

Export Citation Format

Share Document