EE-TCAM: An Energy-Efficient SRAM-Based TCAM on FPGA

Inayat Ullah; Zahid Ullah; Jeong-A Lee

doi:10.3390/electronics7090186

Electronics ◽

10.3390/electronics7090186 ◽

2018 ◽

Vol 7 (9) ◽

pp. 186 ◽

Cited By ~ 4

Author(s):

Inayat Ullah ◽

Zahid Ullah ◽

Jeong-A Lee

Keyword(s):

Power Consumption ◽

Integrated Circuit ◽

Energy Efficient ◽

High Speed ◽

Random Access ◽

Content Addressable Memories ◽

Field Programmable ◽

Application Specific Integrated Circuit ◽

High Power Consumption ◽

SRAM Memory

Ternary content-addressable memories (TCAMs) are used to design high-speed search engines. TCAMs are implemented on application-specific integrated circuit (ASIC) platforms (native TCAMs) and on field-programmable gate array (FPGA) platforms (static random-access memory (SRAM)-based TCAMs), but both have the drawback of high power consumption. This paper presents a pre-classifier-based architecture for an energy-efficient SRAM-based TCAM. The first stage, classification, divides the TCAM table into several sub-tables of balanced size. The second stage, the SRAM-based implementation, maps each of the resulting TCAM sub-tables to a separate row of configured SRAM blocks in the architecture. The proposed architecture selectively activates at most one row of SRAM blocks for each incoming TCAM word. Compared with existing SRAM-based TCAM designs on FPGAs, the proposed design consumes significantly less energy because it activates only part of the SRAM memory used for lookup rather than the entire SRAM memory, as previous schemes do. We implemented sample designs of the proposed approach, of size 512 × 36, on a Xilinx Virtex-6 FPGA. The experimental results showed that the proposed design achieved at least three times lower power consumption per performance than other SRAM-based TCAM architectures.
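The idea of activating only one sub-table per lookup can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the authors' hardware design: the class name `PreClassifiedTcam`, the choice of the leading `class_bits` bits as the classifier, and the rule format (strings over `0`, `1`, `x`) are all hypothetical, and the sketch does not attempt to balance sub-table sizes as the paper's classifier does.

```python
from collections import defaultdict


def matches(pattern: str, key: str) -> bool:
    """Ternary match: an 'x' in the pattern matches either bit of the key."""
    return len(pattern) == len(key) and all(
        p in ("x", k) for p, k in zip(pattern, key)
    )


class PreClassifiedTcam:
    """Software model: rules are grouped into sub-tables by their leading bits.

    Assumption of this sketch: the classifier bits of every rule are fully
    specified (no 'x'), so each rule lives in exactly one sub-table.
    """

    def __init__(self, rules, class_bits=2):
        self.class_bits = class_bits
        self.sub_tables = defaultdict(list)
        for priority, pattern in enumerate(rules):
            prefix = pattern[:class_bits]
            if "x" in prefix:
                raise ValueError("classifier bits must not be don't-care")
            self.sub_tables[prefix].append((priority, pattern))
        self.last_rows_searched = 0  # entries examined by the latest lookup

    def lookup(self, key: str):
        """Return the index of the first matching rule, or None."""
        # Only the sub-table selected by the classifier bits is searched.
        row = self.sub_tables.get(key[: self.class_bits], [])
        self.last_rows_searched = len(row)
        for priority, pattern in row:
            if matches(pattern, key):
                return priority
        return None


rules = ["00x1", "01xx", "10x0", "1011", "11xx"]
tcam = PreClassifiedTcam(rules)
hit = tcam.lookup("1011")
```

In this model the lookup of `1011` examines only the two rules whose leading bits are `10`, which mirrors, in software terms, why activating a single row of SRAM blocks can cost less than searching the whole table.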


Performance analysis of number theoretic transform-based convolution using field programmable gate array

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.8.10409 ◽

2018 ◽

Vol 7 (2.8) ◽

pp. 204

Author(s):

G A.E Satish Kumar

Keyword(s):

Performance Analysis ◽

Integrated Circuit ◽

Field Programmable Gate Array ◽

High Speed ◽

Convolution Operation ◽

The Matrix ◽

Field Programmable ◽

High Power Consumption ◽

Hardware Description ◽

Very High

This paper presents the convolution operation based on the Number Theoretic Transform for two n=8 input sequences. The convolution of two n-point sequences using the Fast Fourier Transform exhibits design complexity leading to high power consumption. The Number Theoretic Transform utilizes the matrix of modulus values to evaluate the convolution. The Number Theoretic Transform is an integer transform, which makes the design comparatively simple. The Number Theoretic Transform-based convolution is developed using the Very High Speed Integrated Circuit Hardware Description Language. The real-time implementation of the proposed method is also validated on Xilinx Spartan FPGA family devices. The power, speed, and area performance is evaluated and compared with 3A DSP FPGA and Virtex 6 FPGA devices.
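As an illustration of how an integer transform can evaluate a convolution, the following is a minimal sketch of an 8-point cyclic convolution via the Number Theoretic Transform. The modulus 17 and the root 2 are assumptions chosen for this example (2 has multiplicative order 8 modulo 17); the abstract does not state which modulus the paper uses. The transform is written in direct matrix form rather than as a fast algorithm, and results are exact only while the true convolution values stay below the modulus.

```python
P = 17      # prime modulus (assumed for this sketch)
N = 8       # transform length, matching the n=8 sequences in the abstract
OMEGA = 2   # primitive 8th root of unity mod 17: 2**8 % 17 == 1, 2**4 % 17 == 16


def ntt(seq, root):
    """Direct (matrix-form) transform: X[k] = sum_i x[i] * root**(i*k) mod P."""
    return [
        sum(x * pow(root, i * k, P) for i, x in enumerate(seq)) % P
        for k in range(N)
    ]


def cyclic_convolve(a, b):
    """Cyclic convolution of two length-N integer sequences, modulo P."""
    fa, fb = ntt(a, OMEGA), ntt(b, OMEGA)
    product = [(x * y) % P for x, y in zip(fa, fb)]
    inv_root = pow(OMEGA, -1, P)  # modular inverse (Python 3.8+)
    inv_n = pow(N, -1, P)
    # Inverse transform: same matrix form with the inverse root, scaled by 1/N.
    return [(v * inv_n) % P for v in ntt(product, inv_root)]


result = cyclic_convolve([1, 2, 3, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 0])
```

All arithmetic here is on small integers modulo P, so no floating-point rounding is involved, which is the property the abstract points to when it calls the design comparatively simple.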


High speed multi-channel data acquisition technique for efficient hardware utilization using quad data rate approach

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.16061 ◽

2018 ◽

Vol 7 (4) ◽

pp. 2569

Author(s):

Priyanka Chauhan ◽

Dippal Israni ◽

Karan Jasani ◽

Ashwin Makwana

Keyword(s):

Power Consumption ◽

Data Acquisition ◽

Resource Utilization ◽

Field Programmable Gate Array ◽

High Speed ◽

State Of The Art ◽

Data Rate ◽

Field Programmable ◽

Sensor Signals ◽

Acquisition Technique

Data acquisition is the most demanding application for the acquisition and monitoring of various sensor signals. The data received are processed in a real-time environment. This paper proposes a novel Data Acquisition (DAQ) technique for better resource utilization with less power consumption. The present work designs an advanced Quad Data Rate (QDR) technique and compares it with the traditional Dual Data Rate (DDR) technique in terms of resource utilization and power consumption of Field Programmable Gate Array (FPGA) hardware. Xilinx ISE is used to verify the FPGA resource utilization results of QDR against the state-of-the-art DDR approach. The paper concludes that the QDR technique outperforms the traditional DDR technique in terms of FPGA resource utilization.


On the Use of Magnetic RAMs in Field-Programmable Gate Arrays

International Journal of Reconfigurable Computing ◽

10.1155/2008/723950 ◽

2008 ◽

Vol 2008 ◽

pp. 1-9 ◽

Cited By ~ 16

Author(s):

Y. Guillemenet ◽

L. Torres ◽

G. Sassatelli ◽

N. Bruchon

Keyword(s):

Power Consumption ◽

Random Access ◽

Switching Scheme ◽

FPGA Design

Gate Arrays ◽

Magnetic Switching ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Time Required ◽

Magnetic Tunneling

This paper describes the integration of field-induced magnetic switching (FIMS) and thermally assisted switching (TAS) magnetic random access memories in FPGA design. The nonvolatility of the latter is achieved through the use of magnetic tunneling junctions (MTJs) in the MRAM cell. A thermally assisted switching scheme helps to reduce power consumption during the write operation in comparison to the writing scheme in the FIMS-MTJ device. Moreover, the nonvolatility of such a design based on either an FIMS or a TAS writing scheme should reduce both the power consumption and the configuration time required at each power-up of the circuit in comparison to classical SRAM-based FPGAs. A real-time reconfigurable (RTR) micro-FPGA using FIMS-MRAM or TAS-MRAM allows dynamic reconfiguration mechanisms, while featuring a simple design architecture.


SYNTHESIS AND FPGA–IMPLEMENTATION BASED NEURAL TECHNIQUE OF A NONLINEAR ADC MODEL

International Journal of Computing ◽

10.47839/ijc.4.1.321 ◽

2014 ◽

pp. 27-33

Author(s):

Mounir Bouhedda ◽

Mokhtar Attari

Keyword(s):

Integrated Circuit ◽

Field Programmable Gate Array ◽

High Speed ◽

Hardware Description Language ◽

Description Language ◽

Analog To Digital ◽

Field Programmable ◽

Hardware Description ◽

Nonlinear Analog ◽

Very High

The aim of this paper is to introduce a new architecture using Artificial Neural Networks (ANN) in designing a 6-bit nonlinear Analog to Digital Converter (ADC). A study was conducted to synthesise an optimal ANN with a view to FPGA (Field Programmable Gate Array) implementation using Very High-speed Integrated Circuit Hardware Description Language (VHDL). Simulations and tests are carried out to show the efficiency of the designed ANN.


High-speed energy-efficient InP photonic integrated circuit transceivers

Optical Interconnects XIX ◽

10.1117/12.2515218 ◽

2019 ◽

Author(s):

Kevin A. Williams ◽

Marija Trajkovic ◽

Valeria Rustichelli ◽

Florian Lemaître ◽

Huub P. Ambrosius ◽

...

Keyword(s):

Integrated Circuit ◽

Energy Efficient ◽

High Speed ◽

Photonic Integrated Circuit


A P4-Enabled RINA Interior Router for Software-Defined Data Centers

Computers ◽

10.3390/computers9030070 ◽

2020 ◽

Vol 9 (3) ◽

pp. 70

Author(s):

Carolina Fernández ◽

Sergio Giménez ◽

Eduard Grasa ◽

Steve Bunch

Keyword(s):

Integrated Circuit ◽

High Performance ◽

Data Transfer ◽

Great Promise ◽

Gate Arrays ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Application Specific Integrated Circuit ◽

Networking Technologies ◽

Application Specific

The lack of high-performance RINA (Recursive InterNetwork Architecture) implementations to date makes it hard to experiment with RINA as an underlay networking fabric solution for different types of networks, and to assess RINA’s benefits in practice in scenarios with high traffic loads. High-performance router implementations typically require dedicated hardware support, such as FPGAs (Field Programmable Gate Arrays) or specialized ASICs (Application Specific Integrated Circuits). With the advance of hardware programmability in recent years, new possibilities unfold to prototype novel networking technologies. In particular, the use of the P4 programming language for programmable ASICs holds great promise for developing a RINA router. This paper details the design and part of the implementation of the first P4-based RINA interior router, which reuses the layer management components of the IRATI Linux-based RINA implementation and implements the data-transfer components using a P4 program. We also describe the configuration and testing of our initial deployment scenarios, using ancillary open-source tools such as the P4 reference test software switch (BMv2) and the P4Runtime API.


BPR-TCAM—Block and Partial Reconfiguration based TCAM on Xilinx FPGAs

Electronics ◽

10.3390/electronics9020353 ◽

2020 ◽

Vol 9 (2) ◽

pp. 353 ◽

Cited By ~ 1

Author(s):

Anees Ullah ◽

Ali Zahir ◽

Noaman A. Khan ◽

Waleed Ahmad ◽

Alexis Ramos ◽

...

Keyword(s):

Resource Utilization ◽

High Speed ◽

State Of The Art ◽

Field Programmable Gate Arrays ◽

Partial Reconfiguration ◽

Gate Arrays ◽

Content Addressable Memories ◽

Field Programmable ◽

Programmable Gate Arrays

Field Programmable Gate Array (FPGA)-based Ternary Content Addressable Memories (TCAMs) are widely used in high-speed networking applications. However, TCAMs are not present on state-of-the-art FPGAs and need to be emulated on SRAM-based memories (i.e., LUTRAMs and Block RAMs), which requires a large amount of FPGA resources. In this paper, we present an efficient methodology to implement FPGA-based TCAMs with significant resource savings compared to existing schemes. The proposed methodology exploits the fracturable nature of Look Up Tables (LUTs) and the built-in slice carry-chains for simultaneous mapping of two rules and their matching logic to a single FPGA slice. Multiple slices can be stacked together to build deeper and wider TCAMs in a modular way. The combination of all these techniques results in significant savings in resource utilization compared to existing approaches.


An Energy efficient application specific integrated circuit for electrocardiogram feature detection and its potential for ambulatory cardiovascular disease detection

Healthcare Technology Letters ◽

10.1049/htl.2015.0030 ◽

2016 ◽

Vol 3 (1) ◽

pp. 77-84 ◽

Cited By ~ 6

Author(s):

Sanjeev Kumar Jain ◽

Basabi Bhaumik

Keyword(s):

Cardiovascular Disease ◽

Integrated Circuit ◽

Energy Efficient ◽

Feature Detection ◽

Disease Detection ◽

Application Specific Integrated Circuit ◽

Application Specific


Analysis of the Quantization Noise in Discrete Wavelet Transform Filters for Image Processing

Electronics ◽

10.3390/electronics7080135 ◽

2018 ◽

Vol 7 (8) ◽

pp. 135 ◽

Cited By ~ 9

Author(s):

Nikolay Chervyakov ◽

Pavel Lyakhov ◽

Dmitry Kaplun ◽

Denis Butusov ◽

Nikolay Nagornov

Keyword(s):

Image Processing ◽

Wavelet Transform ◽

Discrete Wavelet Transform ◽

Integrated Circuit ◽

Filter Banks ◽

Signal To Noise Ratio ◽

Quantization Noise ◽

Discrete Wavelet ◽

Field Programmable ◽

Application Specific Integrated Circuit

In this paper, we analyze the effects of quantization noise in the coefficients of discrete wavelet transform (DWT) filter banks for image processing. We propose an implementation of the DWT method that makes it possible to determine the effective bit-width of the filter bank coefficients at which the quantization noise does not significantly affect the image processing results according to the peak signal-to-noise ratio (PSNR). The dependence of the PSNR of the DWT-processed image on the wavelet and on the bit-width of the wavelet filter coefficients is analyzed. The formulas for determining the minimal bit-width of the filter coefficients at which the processed image achieves high quality (PSNR ≥ 40 dB) are given. The obtained theoretical results were confirmed through the simulation of DWT for a test image using the calculated bit-width values. All considered algorithms operate with fixed-point numbers, which simplifies their hardware implementation on modern devices: field-programmable gate array (FPGA), application-specific integrated circuit (ASIC), etc.
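For readers unfamiliar with the quality criterion, the following minimal sketch computes PSNR using its standard definition, 10·log10(MAX² / MSE), and checks it against the 40 dB threshold quoted above. The function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions of this example; the paper's own bit-width formulas are not reproduced here.

```python
import math


def psnr(reference, processed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(processed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the processed samples.
    mse = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no noise at all
    return 10 * math.log10(max_value ** 2 / mse)


reference = [100, 150, 200, 250]
processed = [101, 149, 201, 249]   # every sample off by one grey level
quality_db = psnr(reference, processed)
is_high_quality = quality_db >= 40  # threshold quoted in the abstract
```

With every sample off by exactly one grey level the mean squared error is 1, so the PSNR is simply 20·log10(255), comfortably above the 40 dB threshold.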


Sparse Cholesky Factorization on FPGA Using Parameterized Model

Mathematical Problems in Engineering ◽

10.1155/2017/3021591 ◽

2017 ◽

Vol 2017 ◽

pp. 1-11

Author(s):

Yichun Sun ◽

Hengzhu Liu ◽

Tong Zhou

Keyword(s):

Integrated Circuit ◽

Sparse Matrix ◽

Fundamental Problem ◽

Performance Model ◽

Cholesky Factorization ◽

Field Programmable ◽

Programmable Gate Arrays ◽

Application Specific Integrated Circuit ◽

Parameterized Model ◽

And Performance

Cholesky factorization is a fundamental problem in most engineering and science computation applications. When dealing with a large sparse matrix, numerical decomposition consumes the most time. We present a vector architecture to parallelize the numerical decomposition of Cholesky factorization. We construct an integrated analytical parameterized performance model to accurately predict the execution times of typical matrices under varying parameters. Our proposed approach is general for accelerators and is limited to neither field-programmable gate arrays (FPGAs) nor application-specific integrated circuits. We implement a simplified module in FPGAs to prove the accuracy of the model. The experiments show that, for most cases, the differences between the predicted and measured execution times are less than 10%. Based on the performance model, we optimize parameters and obtain a balance of resources and performance after analyzing the performance of varied parameter settings. Compared with the state-of-the-art implementations on CPU and GPU, we find that the performance of the optimal parameters is 2x that of CPU. Our model offers several advantages, particularly in power consumption. It provides guidance for the design of future acceleration components.
