VLSI design of inner-product computers using distributed arithmetic

Author(s):  
W.P. Burleson ◽  
L.L. Scharf
Author(s):  
P. Hemanthkumar ◽  
Y. Sai Kiran ◽  
V. Nava Teja

<p>We present the design optimization of one- and two-dimensional fully pipelined computing structures for area-delay-power-efficient implementation of finite impulse response (FIR) filters by systolic decomposition of distributed arithmetic (DA)-based inner-product computation. The scheme offers a flexible choice of the address length of the look-up tables (LUTs) used for DA-based computation, so that a suitable area-time trade-off can be selected. Smaller address lengths for the DA-based computing units reduce the memory size, but they increase the adder complexity and the latency. For efficient DA-based realization of FIR filters of different orders, the flexible linear systolic design is implemented on a Xilinx Virtex-E XCV2000E FPGA using a hybrid combination of Handel-C and parameterizable VHDL cores. Key performance metrics such as number of slices, maximum usable frequency, dynamic power consumption, energy density, and energy throughput are estimated for different filter orders and address lengths. The analysis shows that the performance metrics of the proposed implementation are broadly in line with theoretical expectations. The address length M=4 gives the best area-delay-power-efficient realizations of the FIR filter across the filter orders considered. Moreover, the proposed FPGA implementation involves significantly less area-delay complexity than existing DA-based implementations of FIR filters.</p>
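
The address-length trade-off described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' systolic architecture: it assumes integer coefficients and two's-complement input samples, and the names `build_luts`, `da_inner_product`, `m`, and `bits` are hypothetical. The coefficients are split into groups of `m`, each group gets its own look-up table of 2^m partial sums, and the table outputs are combined by adders, one bit plane of the inputs at a time.

```python
def build_luts(coeffs, m):
    """Split the coefficients into groups of m (address length, assumed name)
    and precompute, for every m-bit address, the sum of the selected coefficients."""
    luts = []
    for start in range(0, len(coeffs), m):
        group = coeffs[start:start + m]
        luts.append([
            sum(c for i, c in enumerate(group) if (addr >> i) & 1)
            for addr in range(1 << len(group))
        ])
    return luts


def da_inner_product(coeffs, samples, m, bits):
    """Bit-serial distributed-arithmetic inner product.

    samples are signed integers that fit in `bits` bits (two's complement).
    """
    luts = build_luts(coeffs, m)
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for j in range(bits):
        # Bit j of each sample in a group forms that group's LUT address.
        partial = 0
        for g, table in enumerate(luts):
            addr = 0
            for i, w in enumerate(words[g * m:(g + 1) * m]):
                addr |= ((w >> j) & 1) << i
            partial += table[addr]  # adders combine the LUT outputs
        # The sign bit carries negative weight in two's complement.
        weight = -(1 << j) if j == bits - 1 else (1 << j)
        acc += weight * partial
    return acc


def lut_cost(n_taps, m):
    """Return (total LUT words, adders needed to combine LUT outputs)."""
    luts = build_luts([0] * n_taps, m)
    return sum(len(t) for t in luts), len(luts) - 1


coeffs = [3, -1, 4, 2, -5, 7, 1, -6]
samples = [10, -20, 30, -40, 50, -60, 70, -80]
for m in (2, 4, 8):
    print(m, da_inner_product(coeffs, samples, m, bits=8), lut_cost(len(coeffs), m))
```

In this model an 8-tap filter needs 256 LUT words and no combining adders at `m = 8`, 32 words and one adder at `m = 4`, and 16 words and three adders at `m = 2`, which mirrors the memory-versus-adder trade-off stated in the abstract. The counts say nothing about latency or FPGA slice usage.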


This paper presents an area-efficient, low-power, high-throughput LMS adaptive filter based on a distributed arithmetic (DA) architecture. Throughput is increased by updating the filter coefficients and computing the inner product in parallel. We propose a memory-less distributed arithmetic (MLDA) unit that uses 2:1 multiplexers in place of the LUT of conventional DA, reducing the overall area of the filter. An enhanced compressor adder accumulates the partial products, which further reduces the area. Running partial-product generation and accumulation in parallel enhances the throughput of the design. The proposed architecture requires less than half the area of the existing LUT-based inner-product block. The design is implemented in Synopsys Design Compiler; the results show that the area decreases by 52.7% and that the MUX-based DA for the adaptive filter consumes 69.25% less power for filter lengths N=16, 32 and 64. The proposed design also provides a 36.50% lower area-delay product (ADP).
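
The idea of replacing the LUT with 2:1 multiplexers can be sketched in software as follows. The abstract does not give the MLDA circuit, so this is an assumed reading rather than the authors' design: each input bit drives one 2:1 multiplexer that passes either zero or the corresponding coefficient, and the multiplexer outputs are summed where a compressor adder tree would sit in hardware. The names `mux2` and `mlda_inner_product` are hypothetical, and inputs are assumed to be two's-complement integers.

```python
def mux2(sel, when_zero, when_one):
    """Model of a 2:1 multiplexer: pick one of two inputs by a select bit."""
    return when_one if sel else when_zero


def mlda_inner_product(coeffs, samples, bits):
    """Bit-serial inner product with no precomputed table (assumed MLDA model).

    samples are signed integers that fit in `bits` bits (two's complement).
    """
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for j in range(bits):
        # One multiplexer per tap: bit j of the sample selects 0 or the coefficient.
        # The sum stands in for the adder tree that accumulates partial products.
        partial = sum(mux2((w >> j) & 1, 0, c) for c, w in zip(coeffs, words))
        # The sign bit carries negative weight in two's complement.
        weight = -(1 << j) if j == bits - 1 else (1 << j)
        acc += weight * partial
    return acc


weights = [3, -1, 4, 2]
x = [5, -6, 7, -8]
print(mlda_inner_product(weights, x, bits=4))

# An adaptive filter changes its weights every iteration; in this model the
# new weights are used directly, with no table to rebuild.
weights = [2, 0, 4, 1]
print(mlda_inner_product(weights, x, bits=4))
```

Because the coefficients feed the multiplexers directly, a weight update takes effect on the next evaluation without regenerating any stored partial sums, which is one plausible reason a table-free unit suits an adaptive filter.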

