scholarly journals VLSI Architecture of S-box with High Area Efficiency Based on Composite Field Arithmetic

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
You-Tun Teng ◽  
Wen-Long Chin ◽  
Deng-Kai Chang ◽  
Pei-Yin Chen ◽  
Pin-Wei Chen
Symmetry ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 700
Author(s):  
Yufei Zhu ◽  
Zuocheng Xing ◽  
Zerun Li ◽  
Yang Zhang ◽  
Yifan Hu

This paper presents a novel parallel quasi-cyclic low-density parity-check (QC-LDPC) encoding algorithm with low complexity, which is compatible with the 5th generation (5G) new radio (NR). Basing on the algorithm, we propose a high area-efficient parallel encoder with compatible architecture. The proposed encoder has the advantages of parallel encoding and pipelined operations. Furthermore, it is designed as a configurable encoding structure, which is fully compatible with different base graphs of 5G LDPC. Thus, the encoder architecture has flexible adaptability for various 5G LDPC codes. The proposed encoder was synthesized in a 65 nm CMOS technology. According to the encoder architecture, we implemented nine encoders for distributed lifting sizes of two base graphs. The eperimental results show that the encoder has high performance and significant area-efficiency, which is better than related prior art. This work includes a whole set of encoding algorithm and the compatible encoders, which are fully compatible with different base graphs of 5G LDPC codes. Therefore, it has more flexible adaptability for various 5G application scenarios.


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Hyungmin Cho

Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computation loads and the number of parameters compared to the conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts the PE utilization for depthwise convolutions on a systolic array with minimal overheads. In addition, the PEs in systolic arrays can be efficiently used only if the data items ( tensors ) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange the data items and manage data movements during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself that eliminates the need for an additional module for tensor reshaping tasks. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models while maintaining a high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.


2018 ◽  
Vol 232 ◽  
pp. 01046
Author(s):  
Wan Qiao ◽  
Dake Liu

In this paper, we propose a flexible scalable BP Polar decoding application-specific instruction set processor (PASIP) that supports multiple code lengths (64 to 4096) and any code rates. High throughputs and sufficient programmability are achieved by the single-instruction-multiple-data (SIMD) based architecture and specially designed Polar decoding acceleration instructions. The synthesis result using 65 nm CMOS technology shows that the total area of PASIP is 2.71 mm2. PASIP provides the maximum throughput of 1563 Mbps (for N = 1024) at the work frequency of 400MHz. The comparison with state-of-art Polar decoders reveals PASIP’s high area efficiency.


Through generations, in an endeavor to pioneer innovative circuit designs, an adder is having the greatest importance as it is a basic building block to decide the system’s overall performance. Wide varieties of adders are used for a plethora of applications in the field of Signal Processing and VLSI systems. Most predominantly used Speed efficient architecture for performing n-bit addition in VLSI applications is Square Root Carry Select Adder (SQRT-CSLA) as it pre-computes the carry and sum by assuming input carry as ‘zero’ and ‘one’. But the overall area usage is high as it uses more number of full adders when compared to Ripple Carry Adder. Though, the existing adder designing techniques are area efficient, there is still scope to achieve area efficiency as area decides the cost of the VLSI Systems. Not only area-efficient but also power potent architectures are required to accelerate the overall performance of the VLSI systems. To meet these objectives, this paper proposes an efficient VLSI architecture for carry select adder by using logic optimization technique addressing performance constraints. The proposed architecture is designed and implemented using cadence encounter tool for different data widths ranging from 16 bits to 128 bits. The performance of the proposed 128-bit architecture achieves an area improvement of 63.43% and a power improvement of 71.00923% when compared to 128-bit SQRT-CSLA architecture


2021 ◽  
Vol 21 (4) ◽  
pp. 270-278
Author(s):  
Kyoung-Il Do ◽  
Byung-Seok Lee ◽  
Seung-Hoo Jin ◽  
Yong-Seo Koo

The Design And Realization Of Efficient Multiplication And Accumulation Unit (MAC) Of A Digital FIR Filter Has Substantial Influence In Designing A Well-Organized Finite Impulse Response Filter As It Is Used To Compute The Filter Response. Area Efficiency In An FIR Filter Can Be Achieved By Reducing The Gate Count Of Either Multiplier Unit Or An Adder Unit Or Both The Units Since They Are The Basic Building Blocks Of FIR Filter. This Paper Presents A VLSI Architecture For A 4-Tap FIR Filter Which Is Designed By Using Efficient Adder And A Multiplier Employing Logic Optimization Technique. Area For MAC Based FIR Filter Employing Vedic-CSLALOT Is Improved By 11.959% When Compared To Hierarchy-SQRT-CSLA. Total Power For MAC Based FIR Filter Employing Vedic-CSLALOT Is Improved By 13.15% As Against To Hierarchy-SQRT-CSLA.


Sign in / Sign up

Export Citation Format

Share Document