hardware efficiency Latest Research Papers

Multi-ported memories perform multiple read and write operations simultaneously and are used in advanced digital design applications on reprogrammable field-programmable gate arrays (FPGAs) to achieve higher bandwidth. The memory modules are configured from block RAMs (BRAMs), which use more area and power on the FPGA. In this manuscript, techniques to increase the number of read ports of multi-ported memory modules are designed using the bank division with XOR (BDX) approach. The read-port techniques, namely two-read one-write (2R1W) memory, a hybrid-mode approach offering either 2R1W or 4R memory, and a hierarchical BDX (HBDX) approach using 2R1W/4R memory, are designed on an FPGA platform. The proposed work uses only slices and look-up tables (LUTs) rather than BRAMs when building the memory modules on the FPGA, which reduces the computational complexity and improves system performance. The experimental results are analyzed on an Artix-7 FPGA. Performance parameters such as slice and LUT utilization, maximum frequency (Fmax), and hardware efficiency are analyzed for different memory depths. The 4R1W memory design using the HBDX approach uses 4% of the slices and operates at 449.697 MHz on the Artix-7 FPGA. The proposed work provides a platform for choosing the proper read-port technique when designing an efficient modular multi-port memory architecture.
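The idea of serving a second read port from an XOR bank can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the paper's RTL: the class name `XorBankedMemory`, the address-to-bank mapping, and the read-modify-write update of the XOR bank are all hypothetical choices made for illustration.

```python
class XorBankedMemory:
    """Behavioural 2R1W model: several data banks plus one XOR bank.

    Hypothetical illustration only; a real FPGA design would have to
    respect the port limits of each physical bank in every cycle.
    """

    def __init__(self, depth, banks=2):
        assert depth % banks == 0, "depth must divide evenly into banks"
        self.banks = banks
        self.rows = depth // banks
        self.data = [[0] * self.rows for _ in range(banks)]
        # Invariant: xor_bank[r] == XOR of data[b][r] over all banks b.
        self.xor_bank = [0] * self.rows

    def _split(self, addr):
        # Assumed mapping: high part selects the bank, low part the row.
        return divmod(addr, self.rows)

    def write(self, addr, value):
        bank, row = self._split(addr)
        # Remove the old word from the XOR bank and fold in the new one.
        self.xor_bank[row] ^= self.data[bank][row] ^ value
        self.data[bank][row] = value

    def read2(self, addr_a, addr_b):
        bank_a, row_a = self._split(addr_a)
        bank_b, row_b = self._split(addr_b)
        val_a = self.data[bank_a][row_a]      # port A reads its bank directly
        if bank_b != bank_a:
            val_b = self.data[bank_b][row_b]  # no conflict: direct read
        else:
            # Bank conflict: rebuild the word from the XOR bank and the
            # other data banks, leaving the busy bank to port A.
            val_b = self.xor_bank[row_b]
            for b in range(self.banks):
                if b != bank_a:
                    val_b ^= self.data[b][row_b]
        return val_a, val_b
```

In this model a conflicting second read never touches the bank that port A is using, which is the property that lets the two reads proceed in the same cycle.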


Design of a Power-Efficient Low Complexity Non Maximally Coefficient Symmetry Multi Rate Filter Bank for Wideband Channelization

International Journal of Circuits, Systems and Signal Processing ◽ 10.46300/9106.2021.15.95 ◽ 2021 ◽ Vol 15 ◽ pp. 883-894

Author(s): Kirti Samir Vaidya ◽ C. G. Dethe ◽ S. G. Akojwar

Keyword(s): Wireless Communication ◽ Software Defined Radio ◽ Filter Bank ◽ Radio Channel ◽ Software Radio ◽ Low Complexity ◽ Power Efficient ◽ Complex Part ◽ Computing Complexity ◽ Hardware Efficiency

A software-defined radio (SDR), which extracts the desired radio channel, is a solution for existing and upcoming wireless communication standards. The channelizer is considered the most computationally complex part of an SDR. In multi-standard wireless communication, the software radio channelizer is often used to extract individual channels from a wideband input signal. Even with effective channelizer designs that reduce computing complexity, delay and power consumption remain a problem. To improve the effectiveness of the channelizer, we present the Non-Maximally Coefficient Symmetry Multirate Filter Bank. In this paper, polyphase decomposition and coefficient symmetry are incorporated into the Non-Maximally Coefficient Symmetry Multirate Filter Bank to improve hardware efficiency, power efficiency, flexibility, and functionality, and to reduce hardware size. The proposed methods are suitable for sharp wideband channelizers. To demonstrate the complexity improvement of the proposed system, the design is evaluated against a communication standard for complexity comparison.
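Coefficient symmetry saves hardware because a symmetric (linear-phase) FIR filter can add the two input samples that share a coefficient before multiplying, so roughly half the multipliers are needed. The sketch below shows this general folding technique in plain Python; the function names are hypothetical and it is not the filter bank proposed in the paper.

```python
def fir_direct(h, x):
    """Reference FIR: y[n] = sum_k h[k] * x[n - k], zero initial state."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_folded(h, x):
    """Symmetric FIR using pre-addition: about len(h)/2 multiplies per output."""
    n_taps = len(h)
    assert all(h[k] == h[n_taps - 1 - k] for k in range(n_taps)), \
        "coefficients must be symmetric"
    half = n_taps // 2

    def sample(i):
        # Samples before the start of the signal are treated as zero.
        return x[i] if 0 <= i < len(x) else 0

    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(half):
            # One multiply serves the mirrored tap pair (k, n_taps - 1 - k).
            acc += h[k] * (sample(n - k) + sample(n - (n_taps - 1 - k)))
        if n_taps % 2:
            # Odd length: the centre tap has no partner.
            acc += h[half] * sample(n - half)
        y.append(acc)
    return y
```

For a 7-tap symmetric filter this uses 4 multiplications per output sample instead of 7, at the cost of 3 extra additions.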


Low Complexity Non Maximally Coefficient Symmetry Multi Rate Filter Bank for Wideband Channelization

WSEAS TRANSACTIONS ON CIRCUITS AND SYSTEMS ◽ 10.37394/23201.2021.20.7 ◽ 2021 ◽ Vol 20 ◽ pp. 57-65

Author(s): Kirti Samir Vaidya ◽ C. G. Dethe ◽ S. G. Akojwar

Keyword(s): Wireless Communication ◽ Power Consumption ◽ Input Signal ◽ Filter Bank ◽ Software Radio ◽ Low Complexity ◽ Polyphase Filter ◽ The Individual ◽ Hardware Efficiency ◽ Communication Standard

The software radio channelizer is often used in multi-standard wireless communication to extract individual channels from a wideband input signal. Even with effective channelizer designs that reduce computational complexity, delay and power consumption remain challenging. To improve the effectiveness of the channelizer, we present the Non-Maximally Coefficient Symmetry Multirate Filter Bank. For this, a sharp wideband channelizer is designed using the latest class of masking responses with a non-maximally decimated polyphase filter. Moreover, coefficient symmetry is incorporated into the Non-Maximally Coefficient Symmetry Multirate Filter Bank to improve the hardware efficiency and functionality of the proposed schemes. To demonstrate the complexity improvement of the proposed system, the design is analyzed for a communication standard and compared with existing methods.
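The basic polyphase idea can be sketched as follows: instead of filtering at the input rate and discarding samples, the filter is split into M branches that each run at the output rate. The example is a plain, maximally decimated polyphase decimator with hypothetical function names; the non-maximally decimated structure and masking responses described in the abstract are not modelled here.

```python
def decimate_direct(h, x, m):
    """Filter at the full rate, then keep every m-th output sample."""
    full = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]
    return full[::m]


def decimate_polyphase(h, x, m):
    """Same result, computed only at the low (output) rate."""
    # Branch p holds coefficients h[p], h[p + m], h[p + 2m], ...
    branches = [h[p::m] for p in range(m)]

    def sample(i):
        # Samples before the start of the signal are treated as zero.
        return x[i] if 0 <= i < len(x) else 0

    n_out = (len(x) + m - 1) // m
    y = []
    for n in range(n_out):
        acc = 0
        for p, h_p in enumerate(branches):
            for j, coeff in enumerate(h_p):
                # Branch p sees the input delayed by p and decimated by m.
                acc += coeff * sample((n - j) * m - p)
        y.append(acc)
    return y
```

Both functions produce identical outputs, but the polyphase form never computes the samples that decimation would throw away, which is where the complexity saving comes from.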


Real-time multi-task diffractive deep neural networks via hardware-software co-design

Scientific Reports ◽ 10.1038/s41598-021-90221-7 ◽ 2021 ◽ Vol 11 (1)

Author(s): Yingjie Li ◽ Ruiyang Chen ◽ Berardi Sensale-Rodriguez ◽ Weilu Gao ◽ Cunxi Yu

Keyword(s): Neural Networks ◽ Real Time ◽ Power Efficiency ◽ Deep Neural Networks ◽ Design Method ◽ Optical Computing ◽ Domain Specific ◽ Constrained Environments ◽ Task Architecture ◽ Hardware Efficiency

Deep neural networks (DNNs) have substantial computational requirements, which greatly limit their performance in resource-constrained environments. Recently, there have been increasing efforts on optical neural networks and optical-computing-based DNN hardware, which bring significant advantages for deep learning systems in terms of power efficiency, parallelism, and computational speed. Among them, free-space diffractive deep neural networks (D2NNs), based on light diffraction, feature millions of neurons in each layer interconnected with neurons in neighboring layers. However, due to the challenge of implementing reconfigurability, deploying different DNN algorithms requires rebuilding and duplicating the physical diffractive systems, which significantly degrades hardware efficiency in practical application scenarios. Thus, this work proposes a novel hardware-software co-design method that enables first-of-its-kind real-time multi-task learning in D2NNs that automatically recognizes which task is being deployed in real time. Our experimental results demonstrate significant improvements in versatility and hardware efficiency, and also demonstrate and quantify the robustness of the proposed multi-task D2NN architecture under wide noise ranges of all system components. In addition, we propose a domain-specific regularization algorithm for training the proposed multi-task architecture, which can be used to flexibly adjust the desired performance for each task.


Low Complexity Non Maximally Coefficient Symmetry Multi Rate Filter Bank for Wideband Channelization

10.21203/rs.3.rs-236504/v1 ◽ 2021

Author(s): Kirti Samir Vaidya ◽ Dethe C.G ◽ S. G. Akojwar

Keyword(s): Wireless Communication ◽ Power Consumption ◽ Input Signal ◽ Filter Bank ◽ Software Radio ◽ Low Complexity ◽ Polyphase Filter ◽ The Individual ◽ Hardware Efficiency ◽ Communication Standard

The software radio channelizer is often used in multi-standard wireless communication to extract individual channels from a wideband input signal. Even with effective channelizer designs that reduce computational complexity, delay and power consumption remain challenging. To improve the effectiveness of the channelizer, we present the Non-Maximally Coefficient Symmetry Multirate Filter Bank. For this, a sharp wideband channelizer is designed using the latest class of masking responses with a non-maximally decimated polyphase filter. Moreover, coefficient symmetry is incorporated into the Non-Maximally Coefficient Symmetry Multirate Filter Bank to improve the hardware efficiency and functionality of the proposed schemes. To demonstrate the complexity improvement of the proposed system, the design is analyzed for a communication standard and compared with existing methods.


Melting SNOW-V: improved lightweight architectures

Journal of Cryptographic Engineering ◽ 10.1007/s13389-020-00251-6 ◽ 2020

Author(s): Andrea Caforio ◽ Fatih Balli ◽ Subhadeep Banik

Keyword(s): Theoretical Analysis ◽ Stream Cipher ◽ Software Performance ◽ Square Root ◽ Standard Cell ◽ Instruction Set ◽ Silicon Area ◽ Throughput Rate ◽ 5G Systems ◽ Hardware Efficiency

SNOW-V is a stream cipher proposed by Ekdahl et al. at IACR ToSC 2019(3) with the objective of being deployed as the encryption primitive in 5G systems. The stream cipher offers 256-bit security and is ready for deployment in the post-quantum era, in which, as a rule of thumb (due to Grover’s algorithm), quantum security will vary as the square root of the classical security parameters. The authors further report good software performance figures in systems supporting the instruction set. However, they only provide a theoretical analysis of the cipher’s hardware efficiency. In this paper, we aim to fill this gap. We look at the three most important metrics of hardware efficiency: area, speed and power/energy, and propose circuits that optimize each of these metrics and validate our results using three different standard cell libraries. The smallest circuit we propose occupies only around 4776 gate equivalents of silicon area. Furthermore, we also report implementations which consume as little as 12.7 pJ per 128 bits of keystream and operate at a throughput rate of more than 1 Tbps.
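Two of the figures above can be unpacked with simple arithmetic. The helper names below are hypothetical; the sketch only applies the square-root rule of thumb stated in the abstract and divides the reported energy by the block size.

```python
def quantum_security_bits(classical_bits):
    """Rule of thumb from Grover's algorithm: searching 2**k keys takes
    about 2**(k/2) quantum evaluations, so the security level is halved."""
    return classical_bits // 2


def energy_per_bit_pj(energy_pj, bits):
    """Average energy per keystream bit, in picojoules."""
    return energy_pj / bits


# A 256-bit key keeps roughly 128-bit security against a quantum adversary.
print(quantum_security_bits(256))
# 12.7 pJ per 128-bit keystream block is just under 0.1 pJ per bit.
print(energy_per_bit_pj(12.7, 128))
```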


A comparative analysis of LFSR cascading for hardware efficiency and high fault coverage in BIST applications

2020 IEEE 29th Asian Test Symposium (ATS) ◽ 10.1109/ats49688.2020.9301561 ◽ 2020

Author(s): Arbab Alamgir ◽ Abu Khari Bin A'ain ◽ Norlina Paraman ◽ Usman ullah Sheikh ◽ Ian Grout

Keyword(s): Comparative Analysis ◽ Fault Coverage ◽ Hardware Efficiency


Design and VLSI Implementation of a Reduced-Complexity Sorted QR Decomposition for High-Speed MIMO Systems

Electronics ◽ 10.3390/electronics9101657 ◽ 2020 ◽ Vol 9 (10) ◽ pp. 1657

Author(s): Lu Sun ◽ Bin Wu ◽ Tianchun Ye

Keyword(s): Communication Systems ◽ High Speed ◽ MIMO Systems ◽ Multiple Input Multiple Output ◽ CMOS Technology ◽ QR Decomposition ◽ Time Sharing ◽ Hardware Complexity ◽ Givens Rotation ◽ Hardware Efficiency

In this article, a low-complexity and high-throughput sorted QR decomposition (SQRD) for multiple-input multiple-output (MIMO) detectors is presented. To reduce the heavy hardware overhead of SQRD, we propose an efficient SQRD algorithm based on a novel modified real-value decomposition (RVD). Compared to the latest study, the proposed SQRD algorithm reduces the computational complexity by more than 44.7% with similar bit error rate (BER) performance. Furthermore, a corresponding deeply pipelined hardware architecture implemented with coordinate rotation digital computer (CORDIC)-based Givens rotation (GR) is designed. In the design, we propose a time-sharing Givens rotation structure that uses idle CORDIC modules to share the concurrent GR operations of other CORDIC modules, which further reduces hardware complexity and improves hardware efficiency. The proposed SQRD processor is implemented in SMIC 55-nm CMOS technology and processes 62.5 M SQRD per second at a 250-MHz operating frequency with only 176.5 kilo-gates. Compared to related studies, the proposed design has the best normalized hardware efficiency and achieves a 6-Gbps MIMO data rate, which can support current high-speed wireless communication systems such as IEEE 802.11ax.
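A CORDIC-based Givens rotation zeroes a matrix element using only shifts and adds: the sign of the element being eliminated picks the direction of each micro-rotation, and the same directions are applied to the remaining columns of the two rows. The floating-point sketch below illustrates that general technique; the function name, the iteration count, and the final gain division are assumptions made for illustration, and the time-sharing structure of the paper is not modelled.

```python
import math


def cordic_givens(row_a, row_b, iters=24):
    """Rotate two matrix rows so that the first entry of row_b becomes ~0.

    Hypothetical floating-point model; hardware would use fixed-point
    shifts for the 2**-i factors and a constant for the gain correction.
    """
    a, b = list(row_a), list(row_b)
    if a[0] < 0:
        # Rotate by 180 degrees first so the angle is within CORDIC's range.
        a = [-v for v in a]
        b = [-v for v in b]
    gain = 1.0
    for i in range(iters):
        s = 2.0 ** -i
        # Vectoring mode: the leading pair decides the rotation direction.
        d = 1 if b[0] >= 0 else -1
        # Both new rows are computed from the old rows (tuple assignment).
        a, b = ([ai + d * s * bi for ai, bi in zip(a, b)],
                [bi - d * s * ai for ai, bi in zip(a, b)])
        gain *= math.sqrt(1.0 + s * s)
    # Undo the accumulated CORDIC gain so the rotation is orthogonal.
    return [v / gain for v in a], [v / gain for v in b]


# Rows (3, 1) and (4, 2): the leading column (3, 4) is rotated onto (5, 0).
top, bottom = cordic_givens([3.0, 1.0], [4.0, 2.0])
print(top, bottom)
```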


Algorithm and VLSI Architecture Designs of a Lossless Embedded Compression Encoder for HD Video Coding Systems

Journal of Circuits System and Computers ◽ 10.1142/s021812662130004x ◽ 2020 ◽ pp. 2130004

Author(s): Yu-Hsuan Lee ◽ Cheng-Hung Kuei ◽ Yue-Zhan Kao ◽ Shih-Song Fan Jiang

Keyword(s): Video Coding ◽ VLSI Architecture ◽ Visual Quality ◽ Memory Bandwidth ◽ Coding System ◽ Maximum Throughput ◽ Coding Systems ◽ Binary Coding ◽ Embedded Compression ◽ Hardware Efficiency

The demand for visual quality has been advanced by high display resolutions and frame rates. Nevertheless, these two factors demand tremendous memory bandwidth in a video coding system. In this study, an efficient lossless embedded compression (EC) algorithm is proposed to save memory bandwidth while preserving visual quality. The proposed lossless EC algorithm incorporates three core techniques: tree partition, half-pixel prediction and group-based binary coding. Tree partition classifies a [Formula: see text] block into Trunk, Branch and Leaf. With tree partition, half-pixel prediction produces individual residues for Trunk, Branch and Leaf. Group-based binary coding converts these residues to efficient codewords. The lossless compression ratio (CR) of the proposed EC is as high as 2.24 on average, saving memory bandwidth by 55.4%. This EC algorithm is implemented using 0.18 µm CMOS technology. The maximum throughput can reach 6.4 Gpixels/s, which can accommodate [Formula: see text]@60fps. The experimental results demonstrate that this study achieves better hardware efficiency of 337 Gpixels/J and 83.5 Kpixels/s/gate.
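The reported bandwidth saving follows directly from the compression ratio: compressed data occupies 1/CR of the original size, so the fraction saved is 1 - 1/CR. A small check, with a hypothetical helper name:

```python
def bandwidth_saving(compression_ratio):
    """Fraction of memory traffic removed by lossless compression."""
    return 1.0 - 1.0 / compression_ratio


# CR = 2.24 gives a saving of about 55.4%, matching the abstract.
print(round(bandwidth_saving(2.24) * 100, 1))
```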


Exploiting error resilience of iterative and accumulation based algorithms for hardware efficiency

10.3990/1.9789036550116 ◽ 2020

Author(s): S.G.A. Gillani

Keyword(s): Error Resilience ◽ Hardware Efficiency



Design and analysis of multiple read port techniques using bank division with XOR method for multi-ported-memory on FPGA platform
