In-Memory-Computing Realization with a Photodiode/Memristor Based Vision Sensor

State-of-the-art IoT technologies demand novel design solutions in edge computing, leading to more portable and energy-efficient hardware for in-the-field processing tasks. Vision sensors, processors, and hardware accelerators are among the most demanding IoT applications. Resistance switching (RS) two-terminal devices are suitable for resistive RAMs (RRAMs), a promising technology for realizing storage-class memories. Furthermore, due to their memristive nature, RRAMs are appropriate candidates for in-memory computing architectures. Recently, we demonstrated a CMOS-compatible silicon nitride (SiNx) MIS RS device with memristive properties. In this paper, we report a new photodiode-based vision sensor architecture with in-memory computing capability that relies on a memristive device. In this context, the resistance switching dynamics of our memristive device were measured and a data-fitted behavioral model was extracted. SPICE simulations highlight the in-memory computing capabilities of the proposed photodiode-one memristor pixel vision sensor. Finally, an integration and manufacturing perspective is discussed.

Event Camera Simulator Improvements via Characterized Parameters

Frontiers in Neuroscience ◽

10.3389/fnins.2021.702765 ◽

2021 ◽

Vol 15 ◽

Author(s):

Damien Joubert ◽

Alexandre Marcireau ◽

Nic Ralph ◽

Andrew Jolley ◽

André van Schaik ◽

...

Keyword(s):

Processing Speed ◽

Energy Efficient ◽

State Of The Art ◽

Wide Spectrum ◽

Vision Sensor ◽

Dynamic Vision ◽

Complex Scenes ◽

Efficient Data ◽

Event Camera ◽

Noise Models

It has been more than two decades since the first neuromorphic Dynamic Vision Sensor (DVS) was invented, and many subsequent prototypes have been built with a wide spectrum of applications in mind. Competing against state-of-the-art neural networks in terms of accuracy is difficult, although there are clear opportunities to outperform conventional approaches in terms of power consumption and processing speed. As neuromorphic sensors generate sparse data at the focal plane itself, they are inherently energy-efficient, data-driven, and fast. In this work, we present an extended DVS pixel simulator for neuromorphic benchmarks which simplifies the latency and the noise models. In addition, to more closely model the behaviour of a real pixel, the readout circuitry is modelled, as this can strongly affect the time precision of events in complex scenes. Using a dynamic variant of the MNIST dataset as a benchmarking task, we use this simulator to explore how the latency of the sensor allows it to outperform conventional sensors in terms of sensing speed.
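As a rough illustration of what a DVS pixel simulator computes at its core, the following minimal sketch models a single idealized pixel that emits ON/OFF events when the log intensity changes by more than a contrast threshold. It is not the simulator described in the abstract: the latency, noise, and readout models are deliberately omitted, and the function name `dvs_events` and the default `threshold` value are assumptions made for this example.

```python
import math


def dvs_events(intensities, timestamps, threshold=0.2):
    """Idealized single-pixel DVS model (no latency, noise, or readout).

    Returns a list of (timestamp, polarity) pairs. An event is emitted each
    time the log intensity moves at least `threshold` away from the level
    memorized at the previous event. `threshold` is a hypothetical value.
    """
    events = []
    reference = math.log(intensities[0])  # level memorized at the last event
    for t, intensity in zip(timestamps[1:], intensities[1:]):
        delta = math.log(intensity) - reference
        # A large change can cross the threshold several times in one step.
        while abs(delta) >= threshold:
            polarity = 1 if delta > 0 else -1
            events.append((t, polarity))
            reference += polarity * threshold
            delta = math.log(intensity) - reference
    return events
```

For example, a pixel whose intensity rises and then falls again produces a burst of ON events followed by a burst of OFF events, while a constant intensity produces no events at all, which is why the output of such a sensor is sparse.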

Translucent Answer Predictions in Multi-Hop Reading Comprehension

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6272 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7700-7707

Author(s):

G P Shrivatsa Bhargav ◽

Michael Glass ◽

Dinesh Garg ◽

Shirish Shevade ◽

Saswati Dana ◽

...

Keyword(s):

Reading Comprehension ◽

Question Answering ◽

State Of The Art ◽

Local Context ◽

The Novel ◽

Loose Coupling ◽

Loosely Coupled ◽

Neural Architecture ◽

Coupled Networks ◽

Novel Design

Research on the task of Reading Comprehension style Question Answering (RCQA) has gained momentum in recent years due to the emergence of human-annotated datasets and associated leaderboards, for example CoQA, HotpotQA, SQuAD, and TriviaQA. While the state of the art has advanced considerably, there is still ample opportunity to advance it further on some important variants of the RCQA task. In this paper, we propose a novel deep neural architecture, called TAP (Translucent Answer Prediction), to identify answers and evidence (in the form of supporting facts) in an RCQA task requiring multi-hop reasoning. TAP comprises two loosely coupled networks: Local and Global Interaction eXtractor (LoGIX) and Answer Predictor (AP). LoGIX predicts supporting facts, whereas AP consumes these predicted supporting facts to predict the answer span. The novel design of LoGIX is inspired by two key design desiderata, local context and global interaction, that we identified by analyzing examples of the multi-hop RCQA task. The loose coupling between LoGIX and the AP reveals the set of sentences used by the AP in predicting an answer. Therefore, answer predictions of TAP can be interpreted in a translucent manner. TAP offers state-of-the-art performance on the HotpotQA (Yang et al. 2018) dataset, an apt dataset for the multi-hop RCQA task, as it occupies Rank-1 on its leaderboard (https://hotpotqa.github.io/) at the time of submission.

Spin Wave Based Approximate 4:2 Compressor

10.36227/techrxiv.16645417 ◽

2021 ◽

Author(s):

Abdulqader Mahmoud ◽

Frederic Vanderveken ◽

Florin Ciubotaru ◽

Christoph Adelmann ◽

Said Hamdioui ◽

...

Keyword(s):

Energy Consumption ◽

Error Rate ◽

Energy Efficient ◽

Directional Coupler ◽

State Of The Art ◽

The Other ◽

Average Error ◽

Majority Gate ◽

Micromagnetic Simulations ◽

Average Error Rate

In this paper, we propose an energy-efficient Spin Wave (SW) based approximate 4:2 compressor comprising a 3-input and a 5-input Majority gate. We validate our proposal by means of micromagnetic simulations, and assess and compare its performance with state-of-the-art SW, 45nm CMOS, and Spin-CMOS counterparts. The evaluation results indicate that the proposed compressor consumes 31.5% less energy than its accurate SW design version. Furthermore, it has the same energy consumption and error rate as the approximate compressor with Directional Coupler (DC), but it exhibits 3x lower delay. In addition, it consumes 14% less energy, while having a 17% lower average error rate, than the approximate 45nm CMOS counterpart. When compared with the other emerging technologies, the proposed compressor outperforms the approximate Spin-CMOS based compressor by 3 orders of magnitude in terms of energy consumption while providing the same error rate. Finally, the proposed compressor requires the smallest chip real-estate measured in terms of devices.

Spin Wave Based Full Adder

10.36227/techrxiv.16616134 ◽

2021 ◽

Author(s):

Abdulqader Mahmoud ◽

Frederic Vanderveken ◽

Florin Ciubotaru ◽

Christoph Adelmann ◽

Sorin Cotofana ◽

...

Keyword(s):

Energy Efficient ◽

Wall Motion ◽

State Of The Art ◽

Magnetic Tunnel Junction ◽

Domain Wall Motion ◽

Full Adder ◽

Low Energy ◽

Phase Detection ◽

Micromagnetic Simulations ◽

The Road

Spin Waves (SWs) propagate through magnetic waveguides and interfere with each other without consuming noticeable energy, which opens the road to new ultra-low energy circuit designs. In this paper, we build upon SW features and propose a novel energy-efficient Full Adder (FA) design consisting of 1 Majority and 2 XOR gates, whose outputs, Sum and Carry-out, are generated by means of threshold and phase detection, respectively. We validate our proposal by means of MuMax3 micromagnetic simulations, and we evaluate and compare its performance with state-of-the-art SW, 22nm CMOS, Magnetic Tunnel Junction (MTJ), Spin Hall Effect (SHE), Domain Wall Motion (DWM), and Spin-CMOS implementations. Our evaluation indicates that the proposed SW FA consumes 22.5% and 43% less energy than the direct SW gate based and 22nm CMOS counterparts, respectively. Moreover, it exhibits an energy consumption more than 3 orders of magnitude smaller than that of state-of-the-art MTJ, SHE, DWM, and Spin-CMOS based FAs, and outperforms its contenders in terms of area by requiring at least 22% less chip real-estate.
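The gate-level structure named in the abstract (1 Majority gate and 2 XOR gates) can be checked at the Boolean level with a short sketch. It models only the logic function, using the standard identities Carry-out = MAJ(a, b, c) and Sum = a XOR b XOR c; the spin-wave physics and the threshold and phase detection are not modelled, and the function names are assumptions made for this example.

```python
from itertools import product


def majority(a, b, c):
    """3-input Majority gate: 1 when at least two inputs are 1."""
    return 1 if a + b + c >= 2 else 0


def full_adder(a, b, carry_in):
    """Boolean full adder built from two XOR gates and one Majority gate."""
    total = (a ^ b) ^ carry_in             # two cascaded XOR gates
    carry_out = majority(a, b, carry_in)   # one Majority gate
    return total, carry_out


def matches_binary_addition():
    """Exhaustively compare the gate-level adder with integer addition."""
    for a, b, c in product((0, 1), repeat=3):
        total, carry_out = full_adder(a, b, c)
        if 2 * carry_out + total != a + b + c:
            return False
    return True
```

Calling `matches_binary_addition()` walks all eight input combinations, so it serves as a complete truth-table check of the logic function.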

Spin Wave Based Approximate Computing

10.36227/techrxiv.14273261 ◽

2021 ◽

Author(s):

Abdulqader Mahmoud ◽

Frederic Vanderveken ◽

Florin Ciubotaru ◽

Christoph Adelmann ◽

Said Hamdioui ◽

...

Keyword(s):

Real Estate ◽

Error Rate ◽

Energy Efficient ◽

State Of The Art ◽

Magnetic Tunnel Junction ◽

Domain Wall Motion ◽

Full Adder ◽

Average Error ◽

Approximate Computing ◽

Average Error Rate

By their very nature, Spin Waves (SWs) enable the realization of energy-efficient circuits, as they propagate and interfere within waveguides without consuming noticeable energy. However, SW computing can be made even more energy efficient by taking advantage of the approximate computing paradigm, as many applications, such as multimedia and social media, are error-tolerant. In this paper, we propose a novel ultra-low energy Approximate Full Adder (AFA) and a 2-bit input Approximate Multiplier (AMUL). The AFA consists of one Majority gate, while the AMUL is built by means of 3 AND gates. We validate the correct functionality of our proposal by means of micromagnetic simulations and evaluate the AFA figure of merit against state-of-the-art accurate SW, 7nm CMOS, Spin Hall Effect (SHE), Domain Wall Motion (DWM), accurate and approximate 45nm CMOS, Magnetic Tunnel Junction (MTJ), and Spin-CMOS FA implementations. Our results indicate that the AFA consumes 43% and 33% less energy than the state-of-the-art accurate SW and 7nm CMOS FAs, respectively, saves 69% and 44% when compared with accurate and approximate 45nm CMOS, respectively, and provides a 2 orders of magnitude energy reduction when compared with accurate SHE, accurate and approximate DWM, MTJ, and Spin-CMOS counterparts. In addition, it achieves the same error rate as the approximate 45nm CMOS and Spin-CMOS FAs, whereas it exhibits a 50% lower error rate than the approximate DWM FA. Furthermore, it outperforms its contenders in terms of area by saving at least 29% chip real-estate. The AMUL is evaluated and compared with state-of-the-art accurate SW designs and with accurate and approximate 16nm CMOS designs. The evaluation results indicate that it saves at least 2x and 5x energy in comparison with the state-of-the-art SW designs and the 16nm CMOS accurate and approximate designs, respectively, has an average error rate of 10%, whereas the approximate CMOS MUL has an average error rate of 12.5%, and requires at least 64% less chip real-estate.
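To make the idea of a one-gate approximate full adder concrete, the sketch below uses a single Majority gate for Carry-out and, as an assumption not stated in the abstract, approximates Sum as the inverted Carry-out (a common choice in the approximate-adder literature). It then lists the input combinations on which this approximation differs from exact addition. The function names are hypothetical, and the result describes only this assumed variant, not the figures reported by the authors.

```python
from itertools import product


def majority(a, b, c):
    """3-input Majority gate: 1 when at least two inputs are 1."""
    return 1 if a + b + c >= 2 else 0


def approximate_full_adder(a, b, carry_in):
    """One-Majority-gate adder; Sum = NOT Carry-out is an ASSUMPTION."""
    carry_out = majority(a, b, carry_in)
    approx_sum = 1 - carry_out  # assumed approximation of the Sum output
    return approx_sum, carry_out


def error_cases():
    """Return the input triples where the approximate result is wrong."""
    wrong = []
    for bits in product((0, 1), repeat=3):
        approx_sum, carry_out = approximate_full_adder(*bits)
        if 2 * carry_out + approx_sum != sum(bits):
            wrong.append(bits)
    return wrong
```

Under this assumption, Carry-out is always exact and only the all-zeros and all-ones inputs give a wrong Sum, which shows how a large saving in gate count can be traded for a bounded number of erroneous cases.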

Robust and energy-efficient carbon nanotube FET-based MVL gates: A novel design approach

Microelectronics Journal ◽

10.1016/j.mejo.2015.09.018 ◽

2015 ◽

Vol 46 (12) ◽

pp. 1333-1342 ◽

Cited By ~ 18

Author(s):

Fazel Sharifi ◽

Mohammad Hossein Moaiyeri ◽

Keivan Navi ◽

Nader Bagherzadeh

Keyword(s):

Carbon Nanotube ◽

Energy Efficient ◽

Design Approach ◽

Novel Design ◽

Carbon Nanotube Fet

Approximate Full Adders for Energy Efficient Image Processing Applications

Journal of Circuits System and Computers ◽

10.1142/s0218126621502352 ◽

2021 ◽

pp. 2150235

Author(s):

M. C. Parameshwara

Keyword(s):

Energy Efficient ◽

State Of The Art ◽

Signal To Noise Ratio ◽

Full Adder ◽

Signal To Noise ◽

Design Metrics ◽

Standard Cell Library ◽

Cell Library ◽

Fair Comparison ◽

Power Delay Product

This paper proposes six novel approximate 1-bit full adders (AFAs) for inexact computing. The six novel AFAs, namely AFA1, AFA2, AFA3, AFA4, AFA5, and AFA6, are derived from state-of-the-art exact 1-bit full adder (EFA) architectures. The performance of these AFAs is compared with reported AFAs (RAAs) in terms of design metrics (DMs) and peak signal-to-noise ratio (PSNR). The DMs under consideration are power, delay, power-delay product (PDP), energy-delay product (EDP), and area. For a fair comparison, the EFAs and proposed AFAs, along with the RAAs, are described in Verilog, simulated, and synthesized using Cadence's RC tool with a generic 180 nm standard cell library. The unconstrained synthesis results show that, among all the proposed AFAs, AFA1 and AFA2 are energy-efficient adders with high PSNR. AFA1 has a total power of [Formula: see text] W, delay of [Formula: see text] ps, PDP of [Formula: see text] fJ, EDP of [Formula: see text] Js, area of [Formula: see text] m2, and PSNR of [Formula: see text] dB. AFA2 has a total power of [Formula: see text] W, delay of [Formula: see text] ps, PDP of [Formula: see text] fJ, EDP of [Formula: see text] Js, area of [Formula: see text] m2, and PSNR of [Formula: see text] dB.
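The design metrics listed in the abstract follow standard definitions, sketched below: PDP is power multiplied by delay, EDP is PDP multiplied by delay again, and PSNR is computed from the mean squared error between a reference and an approximate image. The function names, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for this example; the numerical results from the paper are not reproduced here.

```python
import math


def power_delay_product(power_w, delay_s):
    """PDP in joules: average power (W) times propagation delay (s)."""
    return power_w * delay_s


def energy_delay_product(power_w, delay_s):
    """EDP in joule-seconds: PDP times propagation delay."""
    return power_delay_product(power_w, delay_s) * delay_s


def psnr_db(reference, approximate, peak=255):
    """PSNR in dB between two equal-length pixel sequences.

    `peak` is the maximum pixel value (255 assumes 8-bit images).
    """
    squared_errors = [(r - a) ** 2 for r, a in zip(reference, approximate)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

With purely hypothetical inputs of 2 µW and 500 ps, `power_delay_product(2e-6, 5e-10)` evaluates to about 1 fJ, which matches the order of magnitude implied by the fJ unit used for PDP in the abstract.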

Exoskeletons

Human Performance Optimization ◽

10.1093/oso/9780190455132.003.0011 ◽

2019 ◽

pp. 234-259

Author(s):

Priyanshu Agarwal ◽

Ashish D. Deshpande

Keyword(s):

Case Studies ◽

Energy Efficient ◽

Laboratory Testing ◽

Human Performance ◽

State Of The Art ◽

Upper Body ◽

The Past ◽

The Future ◽

Human Use ◽

Exoskeleton Design

The past few decades have witnessed an explosion in research surrounding robotic exoskeletons due to their promising applications in medicine and human performance augmentation. Several advances in technology have led to the development of more energy-efficient and viable prototypes of these devices. However, despite this rapid advancement in exoskeleton technology, most of the developed devices are limited to laboratory testing and very few of them are commercially available for human use. This chapter discusses the advances in the various constituent technologies, including actuation, sensing, materials, and controls, that made exoskeleton research feasible. Also presented are case studies on two state-of-the-art robotic exoskeletons, Harmony and Maestro, developed for rehabilitation of the upper body. The chapter concludes with a discussion of the ongoing challenges in exoskeleton design; the ethical, social, and legal considerations related to the use of these devices; and the future of exoskeletons.

Automatic Tool for Fast Generation of Custom Convolutional Neural Networks Accelerators for FPGA

Electronics ◽

10.3390/electronics8060641 ◽

2019 ◽

Vol 8 (6) ◽

pp. 641 ◽

Cited By ~ 7

Author(s):

Miguel Rivera-Acosta ◽

Susana Ortega-Cisneros ◽

Jorge Rivera

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

State Of The Art ◽

Third Party ◽

Hardware Accelerators ◽

Field Programmable ◽

Custom Hardware ◽

Architecture Description ◽

Automatic Tool

This paper presents a platform that automatically generates custom hardware accelerators for convolutional neural networks (CNNs) implemented in field-programmable gate array (FPGA) devices. It includes a user interface for configuring and managing these accelerators. The platform can perform all the processes necessary to design and test CNN accelerators, starting from the CNN architecture description at both the layer and internal parameter levels: training the desired architecture with any dataset and generating the configuration files required by the platform. With these files, it can synthesize the register-transfer level (RTL) description and program the customized CNN accelerator into the FPGA device for testing, making it possible to generate custom CNN accelerators quickly and easily. All processes except the CNN architecture description are fully automated and carried out by the platform, which manages third-party software to train the CNN and to synthesize and program the generated RTL. The platform has been tested by implementing some of the CNN architectures found in the state of the art for freely available datasets such as MNIST, CIFAR-10, and STL-10.

Research on a Lane Compensation Method Based on Multi-Sensor Fusion

Sensors ◽

10.3390/s19071584 ◽

2019 ◽

Vol 19 (7) ◽

pp. 1584 ◽

Cited By ~ 3

Author(s):

Yushan Li ◽

Wenbo Zhang ◽

Xuewu Ji ◽

Chuanxiang Ren ◽

Jian Wu

Keyword(s):

Real Time ◽

Sensor Fusion ◽

Compensation Method ◽

Measurement Unit ◽

Vision Sensor ◽

Vision Sensors ◽

Cubic Polynomial ◽

Vehicle Test ◽

Yaw Angle ◽

Geometric Relationship

The lane curvature output by a vision sensor can jump over a period of time because of shadows, changes in lighting, and broken lane lines, which leads to serious problems for unmanned driving control. It is therefore particularly important to predict or compensate the real lane in real time during sensor jumps. This paper presents a lane compensation method based on multi-sensor fusion of global positioning system (GPS), inertial measurement unit (IMU), and vision sensors. To compensate the lane, a cubic polynomial function of the longitudinal distance is selected as the lane model. In this method, a Kalman filter is used to estimate vehicle velocity and yaw angle from GPS and IMU measurements, and a vehicle kinematics model is established to describe vehicle motion. The method uses the geometric relationship between the vehicle and the relative lane motion at the current moment to solve for the coefficients of the lane polynomial at the next moment. The simulation and vehicle test results show that the predicted information can compensate for the failure of the vision sensor, and that the method offers good real-time performance, robustness, and accuracy.
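The idea of carrying a cubic lane polynomial forward in time from the vehicle's own motion can be sketched as follows. This is not the authors' derivation: it assumes a simple constant-speed, constant-yaw-rate motion over one time step, moves four sample points of the current lane into the next vehicle frame, and refits a cubic through them. The Kalman filter is omitted (speed and yaw rate are taken as given), and all function names, the sample distances, and the motion approximation are assumptions made for this example.

```python
import math


def lane_y(coeffs, x):
    """Lateral offset of the lane at longitudinal distance x."""
    c0, c1, c2, c3 = coeffs
    return c0 + c1 * x + c2 * x ** 2 + c3 * x ** 3


def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def predict_lane(coeffs, speed, yaw_rate, dt, sample_x=(0.0, 10.0, 20.0, 30.0)):
    """Predict lane coefficients in the vehicle frame one step ahead.

    Assumed motion model: the vehicle travels speed*dt along the mean
    heading and rotates by yaw_rate*dt during the step.
    """
    dpsi = yaw_rate * dt
    dx = speed * dt * math.cos(dpsi / 2)
    dy = speed * dt * math.sin(dpsi / 2)
    xs, ys = [], []
    for x in sample_x:
        # Shift to the new vehicle origin, then rotate into its heading.
        px, py = x - dx, lane_y(coeffs, x) - dy
        xs.append(math.cos(dpsi) * px + math.sin(dpsi) * py)
        ys.append(-math.sin(dpsi) * px + math.cos(dpsi) * py)
    # Fit the cubic that passes exactly through the four moved points.
    vandermonde = [[x ** power for power in range(4)] for x in xs]
    return solve_linear(vandermonde, ys)
```

When the vehicle drives straight, the predicted polynomial is simply the old one shifted along the longitudinal axis; with a non-zero yaw rate, the refit is only an approximation, because a rotated cubic is not exactly a cubic in the new frame.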
