hardware efficiency
Recently Published Documents


Author(s):  
Druva Kumar S. ◽  
Roopa M.

Multi-ported memories perform multiple read and write operations simultaneously and are used in advanced digital designs on reprogrammable field-programmable gate arrays (FPGAs) to achieve higher bandwidth. Memory modules are usually built from block RAMs (BRAMs), which consume more area and power on the FPGA. In this manuscript, techniques for increasing the number of read ports in multi-ported memory modules are designed using the bank division with XOR (BDX) approach. The read-port techniques, namely two-read one-write (2R1W) memory, a hybrid-mode approach offering either 2R1W or 4R memory, and a hierarchical BDX (HBDX) approach using 2R1W/4R memory, are designed on an FPGA platform. The proposed work uses only slices and look-up tables (LUTs) rather than BRAMs when designing the memory modules, which reduces the computational complexity and improves system performance. The experimental results are analyzed on an Artix-7 FPGA. Performance parameters such as slice and LUT utilization, maximum frequency (Fmax), and hardware efficiency are analyzed for different memory depths. The 4R1W memory design using the HBDX approach utilizes 4% of the slices and operates at 449.697 MHz on the Artix-7 FPGA. The proposed work provides a basis for choosing the appropriate read-port technique when designing an efficient modular multi-port memory architecture.
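
The bank-division-with-XOR idea can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' RTL: the class name `Xor2R1WMemory`, the two-bank split on the upper address half, and the method names are all hypothetical. It shows how a second read that collides with the first on the same bank could be served from the other data bank and an XOR bank instead.

```python
class Xor2R1WMemory:
    """Behavioral model of a 2-read/1-write memory made of two data banks
    plus one XOR bank. Names and layout are illustrative assumptions."""

    def __init__(self, depth):
        if depth <= 0 or depth % 2:
            raise ValueError("depth must be a positive even number")
        self.depth = depth
        self.half = depth // 2
        self.banks = [[0] * self.half, [0] * self.half]
        # Invariant: xor_bank[i] == banks[0][i] ^ banks[1][i]
        self.xor_bank = [0] * self.half

    def _locate(self, addr):
        if not 0 <= addr < self.depth:
            raise IndexError("address out of range")
        # Lower half of the address space is bank 0, upper half is bank 1.
        return divmod(addr, self.half)  # (bank, offset)

    def write(self, addr, value):
        bank, off = self._locate(addr)
        self.banks[bank][off] = value
        # Keep the XOR bank consistent with both data banks.
        self.xor_bank[off] = value ^ self.banks[1 - bank][off]

    def read2(self, addr_a, addr_b):
        """Serve two reads; each physical bank is accessed at most once."""
        bank_a, off_a = self._locate(addr_a)
        bank_b, off_b = self._locate(addr_b)
        data_a = self.banks[bank_a][off_a]
        if bank_a != bank_b:
            # No conflict: the second read uses its own bank directly.
            data_b = self.banks[bank_b][off_b]
        else:
            # Conflict: rebuild the word from the other bank and the XOR bank.
            data_b = self.banks[1 - bank_b][off_b] ^ self.xor_bank[off_b]
        return data_a, data_b


mem = Xor2R1WMemory(8)
for address in range(8):
    mem.write(address, address * 3 + 1)
conflicting = mem.read2(1, 2)      # both addresses fall in bank 0
non_conflicting = mem.read2(1, 6)  # bank 0 and bank 1
```

In real hardware the write path also has to read the other bank to refresh the XOR word, which this software model hides; how the paper handles that step is not stated in the abstract.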


Author(s):  
Kirti Samir Vaidya ◽  
C. G. Dethe ◽  
S. G. Akojwar

Software-defined radio (SDR), which extracts the desired radio channel, is a solution for existing and upcoming wireless communication standards. The channelizer is regarded as the most computationally complex part of an SDR. In multi-standard wireless communication, the software radio channelizer is often used to extract individual channels from a wideband input signal. Even with effective channelizer designs that reduce computational complexity, delay and power consumption remain a problem. To improve the effectiveness of the channelizer, we present the Non-Maximally Coefficient Symmetry Multirate Filter Bank. In this paper, polyphase decomposition and coefficient symmetry are incorporated into the Non-Maximally Coefficient Symmetry Multirate Filter Bank to improve hardware efficiency, power efficiency, flexibility, and functionality, and to reduce hardware size. The proposed methods are suitable for sharp wideband channelizers. To demonstrate the complexity improvement of the proposed system, the design is evaluated on a communication standard for complexity comparison.
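
As a rough illustration of why polyphase decomposition helps hardware efficiency, the sketch below compares a full-rate FIR filter followed by downsampling with an equivalent polyphase form whose sub-filters run only at the low output rate. It is a simplified, plain decimating case, not the non-maximally decimated filter bank of the paper; the function names and the integer test signal are assumptions made for the example.

```python
def fir_then_decimate(x, h, m):
    """Reference: filter at the full input rate, keep every m-th output."""
    y = []
    for n in range(0, len(x), m):
        acc = 0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y


def polyphase_decimate(x, h, m):
    """Same outputs, computed with m sub-filters at the low output rate."""
    # Branch p holds the coefficients h[p], h[p + m], h[p + 2m], ...
    branches = [h[p::m] for p in range(m)]
    n_out = (len(x) + m - 1) // m
    y = []
    for r in range(n_out):
        acc = 0
        for p, branch in enumerate(branches):
            for q, coeff in enumerate(branch):
                idx = (r - q) * m - p  # input sample for branch p, tap q
                if 0 <= idx < len(x):
                    acc += coeff * x[idx]
        y.append(acc)
    return y


signal = list(range(1, 17))
taps = [1, 2, 3, 4, 3, 2, 1]
reference = fir_then_decimate(signal, taps, 4)
polyphase = polyphase_decimate(signal, taps, 4)
```

Both functions produce the same samples; the polyphase form simply never computes the outputs that downsampling would discard, which is where the saving in multipliers or clock rate would come from.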


2021 ◽  
Vol 20 ◽  
pp. 57-65
Author(s):  
Kirti Samir Vaidya ◽  
C. G. Dethe ◽  
S. G. Akojwar

In multi-standard wireless communication, the software radio channelizer is often used to extract individual channels from a wideband input signal. Even with effective channelizer designs that reduce computational complexity, delay and power consumption remain challenging. To improve the effectiveness of the channelizer, we present the Non-Maximally Coefficient Symmetry Multirate Filter Bank. A sharp wideband channelizer is designed using the latest class of masking responses with a non-maximally decimated polyphase filter. Moreover, coefficient symmetry is incorporated into the Non-Maximally Coefficient Symmetry Multirate Filter Bank to improve the hardware efficiency and functionality of the proposed schemes. To demonstrate the complexity improvement of the proposed system, the design is analyzed on a communication standard and compared with existing methods.
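
Coefficient symmetry can be sketched in a few lines. The example assumes a linear-phase FIR filter with `h[k] == h[N-1-k]` and uses hypothetical function names; it is not the paper's filter bank. It shows how adding the two samples that share a coefficient before multiplying roughly halves the multiplier count.

```python
def fir_direct(x, h):
    """Direct-form FIR: one multiplication per tap per output sample."""
    n_taps = len(h)
    padded = [0] * (n_taps - 1) + list(x)
    out = []
    for n in range(len(x)):
        base = n + n_taps - 1  # position of x[n] inside padded
        out.append(sum(h[k] * padded[base - k] for k in range(n_taps)))
    return out


def fir_symmetric(x, h):
    """Symmetric FIR: pre-add mirrored samples, then multiply once per pair."""
    n_taps = len(h)
    if any(h[k] != h[n_taps - 1 - k] for k in range(n_taps // 2)):
        raise ValueError("coefficients are not symmetric")
    padded = [0] * (n_taps - 1) + list(x)
    out = []
    for n in range(len(x)):
        base = n + n_taps - 1
        acc = 0
        for k in range(n_taps // 2):
            # Two input samples share coefficient h[k]: add first, multiply once.
            acc += h[k] * (padded[base - k] + padded[base - (n_taps - 1 - k)])
        if n_taps % 2:
            mid = n_taps // 2
            acc += h[mid] * padded[base - mid]  # unpaired centre tap
        out.append(acc)
    return out


samples = [5, -3, 2, 7, 1, 0, -4, 6]
coeffs = [1, 3, 5, 3, 1]
direct_out = fir_direct(samples, coeffs)
symmetric_out = fir_symmetric(samples, coeffs)
```

For the five-tap filter above, the symmetric form needs three multiplications per output instead of five, at the cost of two extra additions.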


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yingjie Li ◽  
Ruiyang Chen ◽  
Berardi Sensale-Rodriguez ◽  
Weilu Gao ◽  
Cunxi Yu

Abstract: Deep neural networks (DNNs) have substantial computational requirements, which greatly limit their performance in resource-constrained environments. Recently, there have been increasing efforts on optical neural networks and optical-computing-based DNN hardware, which bring significant advantages for deep learning systems in terms of power efficiency, parallelism, and computational speed. Among them, free-space diffractive deep neural networks (D2NNs), which are based on light diffraction, feature millions of neurons in each layer interconnected with neurons in neighboring layers. However, because reconfigurability is difficult to implement, deploying different DNN algorithms requires rebuilding and duplicating the physical diffractive systems, which significantly degrades hardware efficiency in practical application scenarios. This work therefore proposes a novel hardware-software co-design method that enables first-of-its-kind real-time multi-task learning in D2NNs and automatically recognizes which task is being deployed in real time. Our experimental results demonstrate significant improvements in versatility and hardware efficiency, and also demonstrate and quantify the robustness of the proposed multi-task D2NN architecture under wide noise ranges in all system components. In addition, we propose a domain-specific regularization algorithm for training the proposed multi-task architecture, which can be used to flexibly adjust the desired performance for each task.


Author(s):  
Andrea Caforio ◽  
Fatih Balli ◽  
Subhadeep Banik

Abstract is a stream cipher proposed by Ekdahl et al. at IACR ToSC 2019(3) with an objective to be deployed as the encryption primitive in 5G systems. The stream cipher offers 256-bit security and is ready for deployment in the post-quantum era, in which as a rule of thumb (due to Grover’s algorithm), quantum security will vary as the square root of the classical security parameters. The authors further report good software performance figures in systems supporting the instruction set. However, they only provide a theoretical analysis of the cipher’s hardware efficiency. In this paper, we aim to fill this gap. We look at the three most important metrics of hardware efficiency: area, speed and power/energy, and propose circuits that optimize each of these metrics and validate our results using three different standard cell libraries. The smallest circuit we propose occupies only around 4776 gate equivalents of silicon area. Furthermore, we also report implementations which consume as little as 12.7 pJ per 128 bits of keystream and operate at a throughput rate of more than 1 Tbps.
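
Two of the figures quoted above can be unpacked with simple arithmetic. The following is a worked check, assuming the square-root rule of thumb is applied directly to a 256-bit key space and that the energy figure is divided evenly over the 128 keystream bits:

```latex
\sqrt{2^{256}} = 2^{128}
\quad\Rightarrow\quad \text{roughly 128-bit security against a Grover-style search},
\qquad
\frac{12.7\ \text{pJ}}{128\ \text{bits}} \approx 0.099\ \text{pJ/bit}.
```

The abstract does not state whether the lowest-energy and highest-throughput figures belong to the same circuit, so they are not combined here.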


Electronics ◽  
2020 ◽  
Vol 9 (10) ◽  
pp. 1657
Author(s):  
Lu Sun ◽  
Bin Wu ◽  
Tianchun Ye

In this article, a low-complexity and high-throughput sorted QR decomposition (SQRD) for multiple-input multiple-output (MIMO) detectors is presented. To reduce the heavy hardware overhead of SQRD, we propose an efficient SQRD algorithm based on a novel modified real-value decomposition (RVD). Compared to the latest study, the proposed SQRD algorithm reduces the computational complexity by more than 44.7% with similar bit error rate (BER) performance. Furthermore, a corresponding deeply pipelined hardware architecture is designed, in which the Givens rotation (GR) is implemented with the coordinate rotation digital computer (CORDIC) algorithm. In the design, we propose a time-sharing Givens rotation structure that uses idle CORDIC modules to take over concurrent GR operations from other CORDIC modules, which further reduces hardware complexity and improves hardware efficiency. The proposed SQRD processor is implemented in SMIC 55-nm CMOS technology and processes 62.5 M SQRDs per second at a 250-MHz operating frequency with only 176.5 kilo-gates. Compared to related studies, the proposed design has the best normalized hardware efficiency and achieves a 6-Gbps MIMO data rate, which can support current high-speed wireless communication systems such as IEEE 802.11ax.
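
A CORDIC-based Givens rotation can be sketched as follows. This is a floating-point illustration with hypothetical function names, not the paper's pipelined, time-shared fixed-point architecture. Vectoring mode drives one element to zero using shift-and-add style micro-rotations, and the recorded rotation directions are then replayed on the remaining elements of the same two rows.

```python
import math


def cordic_gain(iterations):
    """Accumulated scaling factor of the micro-rotations."""
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def cordic_vectoring(x, y, iterations=16):
    """Rotate (x, y) onto the positive x-axis; requires x >= 0.
    Returns (magnitude, residual_y, directions)."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    directions = []
    for i in range(iterations):
        d = -1 if y > 0 else 1  # always rotate towards y == 0
        shift = 2.0 ** -i       # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        directions.append(d)
    gain = cordic_gain(iterations)
    return x / gain, y / gain, directions


def cordic_replay(x, y, directions):
    """Apply the same rotation to another element pair of the two rows."""
    for i, d in enumerate(directions):
        shift = 2.0 ** -i
        x, y = x - d * y * shift, y + d * x * shift
    gain = cordic_gain(len(directions))
    return x / gain, y / gain


# Zero the second entry of the column (3, 4), then rotate the pair (1, 2).
radius, residual, dirs = cordic_vectoring(3.0, 4.0)
rotated_a, rotated_b = cordic_replay(1.0, 2.0, dirs)
```

With 16 iterations the result agrees with the exact Givens rotation (cosine 0.6, sine 0.8 for this column) to roughly four decimal places; a hardware design would trade iteration count against word length and latency.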


Author(s):  
Yu-Hsuan Lee ◽  
Cheng-Hung Kuei ◽  
Yue-Zhan Kao ◽  
Shih-Song Fan Jiang

The demand for visual quality has been driven up by high display resolutions and frame rates. However, these two factors place tremendous demands on memory bandwidth in a video coding system. In this study, an efficient lossless embedded compression (EC) algorithm is proposed to save memory bandwidth while preserving visual quality. The proposed lossless EC algorithm incorporates three core techniques: tree partition, half-pixel prediction, and group-based binary coding. Tree partition classifies a [Formula: see text] block into Trunk, Branch, and Leaf. With tree partition, half-pixel prediction produces individual residues for Trunk, Branch, and Leaf. Group-based binary coding converts these residues to efficient codewords. The lossless compression ratio (CR) of the proposed EC is as high as 2.24 on average, saving memory bandwidth by 55.4%. This EC algorithm is implemented using 0.18 μm CMOS technology. The maximum throughput can reach 6.4 Gpixels/s, which can accommodate [Formula: see text]@60fps. The experimental results demonstrate that this study achieves better hardware efficiency, at 337 Gpixels/J and 83.5 Kpixels/s/gate.
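
The quoted bandwidth saving is consistent with the quoted compression ratio. As a worked check, assuming the saving is defined as the fraction of memory traffic removed by compression:

```latex
\text{bandwidth saving} \;=\; 1 - \frac{1}{\mathrm{CR}} \;=\; 1 - \frac{1}{2.24} \;\approx\; 0.554 \;=\; 55.4\%.
```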

