Reduction of partial product matrix for high-speed single or multiple constant multiplication

Author(s):  
Mathias Faust ◽  
Chip-Hong Chang
Author(s):  
Sachin B. Jadhav ◽  
Jayamala K. Patil ◽  
Ramesh T. Patil

This paper presents the hardware implementation of a modified partial product reduction tree using 4:2 and 5:2 compressors. The major bottleneck to faster multiplication within the computational unit is the partial product reduction network of the multiplication block, because this stage requires the addition of large operands, which involves long carry-propagation paths. Tree architectures for the partial product reduction network are an attractive and frequently applied way to speed up multiplication. The proposed architecture is a binary tree constructed from modified 4:2 and 5:2 compressor circuits, and speed is increased by using the higher-order modified compressors in the critical path. The objective of the work is to increase multiplication speed by minimizing the number of combinational gates using higher-order n:2 compressors. The proposed modified compressor was tested experimentally on a Spartan-3 FPGA device (XC3S400 PQ-208). Simulation results for the 4:2 and 5:2 compressor outputs were obtained using the Mentor Graphics Questa Sim 6.4c tool.
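As a rough illustration of what a 4:2 compressor computes, the sketch below models the conventional textbook cell built from two cascaded full adders. It is not the modified circuit proposed in the paper, and the function names `full_adder`, `compressor_4_2` and `check_all` are assumptions introduced only for this example.

```python
from itertools import product


def full_adder(a, b, c):
    """Single-bit full adder: returns (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Conventional 4:2 compressor made of two cascaded full adders.

    Returns (sum, carry, cout). The outputs 'carry' and 'cout' both
    have weight 2, so x1 + x2 + x3 + x4 + cin == sum + 2 * (carry + cout).
    Note that cout depends only on x1..x3, not on cin, so carries do
    not ripple along a row of compressors.
    """
    s1, cout = full_adder(x1, x2, x3)
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def check_all():
    """Exhaustively verify the weight-preserving property."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True
```

Because `cout` is independent of `cin`, a row of such cells reduces four partial product rows to two in a fixed number of gate delays regardless of operand width, which is why compressors are attractive in the critical path.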


2016 ◽  
Vol 25 (04) ◽  
pp. 1650027 ◽  
Author(s):  
Kore Sagar Dattatraya ◽  
Belgudri Ritesh Appasaheb ◽  
Ramdas Bhanudas Khaladkar ◽  
V. S. Kanchana Bhaaskaran

The multiplier forms the core building block of any processor, such as a digital signal processor (DSP) or a general-purpose microprocessor. As the word length increases, the number of adders or compressors required for the partial product addition also increases. The addition of the derived partial products determines the circuit latency, area and speed performance of wider word-length multipliers. This paper proposes the binary count multiplier (BCM), which aims to reduce the number of adders and compressors through the use of a uniquely structured binary counter and by suitably altering the logical flow of partial product addition using binary adders. The binary counters for varying bit count values are derived by modifying the basic 4:2 compressor circuit. A [Formula: see text] bit multiplier has been developed to validate the proposed computation method. This logic structure demonstrates lower power operation, reduced device count and lower delay compared with the conventional Wallace tree multiplier structure found in the literature. The BCM implementation achieves a 29.17% reduction in device count, a 66% reduction in delay and a 70% reduction in power dissipation. Furthermore, it achieves a 90% reduction in the power delay product (PDP) compared with the Wallace tree structure. The multiplier circuits have been implemented and the results validated using the Cadence[Formula: see text] EDA tool. Forty-five nanometer technology files have been employed for the designs and for exhaustive SPICE simulations.
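The general idea of summing partial products by counting the ones in each bit column can be sketched in software as follows. This is only a behavioral illustration under assumed names (`column_count_multiply` is hypothetical); it does not reproduce the BCM counter circuit or the adder arrangement described in the paper.

```python
def column_count_multiply(a, b, width):
    """Multiply two unsigned width-bit integers by column counting.

    Each partial product bit a_i AND b_j lands in column i + j.
    A counter tallies the ones in every column, and the count plus
    the carry from lower columns is resolved into one result bit
    and a carry into the next column.
    """
    columns = [0] * (2 * width)
    for i in range(width):
        for j in range(width):
            columns[i + j] += (a >> i) & (b >> j) & 1

    result = 0
    carry = 0
    for k, count in enumerate(columns):
        total = count + carry
        result |= (total & 1) << k   # low bit stays in this column
        carry = total >> 1           # remaining bits move up one column
    return result
```

The product of two `width`-bit operands always fits in `2 * width` bits, so no carry is left over after the last column.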


In recent years, the filter has been one of the key elements in signal processing applications for removing unwanted information. However, traditional FIR filters consume substantial resources because of their complex multiplier design, which largely dominates the complexity of the filter. Conventional multipliers can be realized by Single Constant Multiplication (SCM) and Multiple Constant Multiplication (MCM) algorithms using shift and add/subtract operations. In this paper, a hybrid state decision tree algorithm is introduced to reduce hardware utilization (area) and increase speed in the filter tap cells of the FIR filter. The proposed scheme generates a decision tree to perform shift-and-add and accumulation operations based on the combined SCM/MCM approach. The proposed FIR filter was implemented on a Xilinx Field Programmable Gate Array (FPGA) platform using the Verilog language. Experimental results show that the DTG-FIR filter reduced, on average, LUTs by 48.259%, flip-flops by 51.567% and slices by 44.497% at an operating frequency of 183.122 MHz on the Virtex-5, compared with the existing VP-FIR.
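To make the shift and add/subtract idea behind SCM concrete, the sketch below recodes a constant into canonical signed digit (CSD) form and multiplies using only shifts, additions and subtractions. CSD is a standard technique chosen here for illustration; it is not the decision tree algorithm from the paper, and the names `csd_digits` and `multiply_by_constant` are assumptions.

```python
def csd_digits(c):
    """Recode a non-negative integer into signed digits (-1, 0, 1).

    Digits are returned least significant first, and no two adjacent
    digits are both non-zero, which minimizes the number of
    add/subtract operations for a single constant.
    """
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)   # +1 if c mod 4 == 1, -1 if c mod 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def multiply_by_constant(x, c):
    """Compute x * c using only shifts, additions and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc
```

For example, the constant 7 is recoded as 8 - 1, so `x * 7` costs one subtraction (`(x << 3) - x`) instead of the two additions needed for the plain binary form 4 + 2 + 1.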


2019 ◽  
Vol 15 (3) ◽  
pp. 302-308
Author(s):  
Ganesh Kumar Ganjikunta ◽  
Sibghatullah I. Khan ◽  
M. Mahaboob Basha

A high-speed N × N bit multiplier architecture that supports signed and unsigned multiplication is proposed in this paper. The architecture incorporates modified two's complement circuits and an N × N bit unsigned multiplier circuit. The unsigned multiplier is based on decomposing the multiplier circuit into smaller-precision independent multipliers using Vedic Mathematics. These individual multipliers generate the partial products in parallel for high-speed operation, and the partial products are combined using high-speed adders and a parallel adder to generate the product output. The proposed architecture has a regular shape for the partial product tree, which makes it easy to implement. Finally, the multiplier architecture is implemented in UMC 65 nm technology for N = 8, 16 and 32 bits. The synthesis results show that the proposed multiplier architecture improves speed and reduces the power-delay product (PDP) compared with the architectures in the literature.
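The decomposition described above can be sketched behaviorally: an N × N unsigned product is assembled from four N/2 × N/2 products, and signed operands are handled by multiplying magnitudes and negating the result when the signs differ. This is a minimal sketch assuming N is a power of two; the names `unsigned_mul` and `signed_mul` are hypothetical, and the recursion stands in for the parallel hardware blocks and adders of the paper rather than modeling them.

```python
def unsigned_mul(a, b, n):
    """Multiply two unsigned n-bit integers (n a power of two).

    The operands are split into high and low halves, four
    independent half-width products are formed, and the results
    are combined with shifts and additions.
    """
    if n == 1:
        return a & b
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    ll = unsigned_mul(a_lo, b_lo, h)
    lh = unsigned_mul(a_lo, b_hi, h)
    hl = unsigned_mul(a_hi, b_lo, h)
    hh = unsigned_mul(a_hi, b_hi, h)
    return ll + ((lh + hl) << h) + (hh << (2 * h))


def signed_mul(a, b, n):
    """Multiply signed values in the n-bit two's complement range.

    Magnitudes go through the unsigned multiplier; the product is
    negated when exactly one operand is negative.
    """
    negative = (a < 0) != (b < 0)
    p = unsigned_mul(abs(a), abs(b), n)
    return -p if negative else p
```

The four half-width products have no data dependence on one another, which is what allows a hardware implementation to generate them in parallel.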


2015 ◽  
Vol 25 (02) ◽  
pp. 1650004
Author(s):  
Pouya Asadi

In this paper, a new multiplier based on an array architecture and a fast carry network tree is presented, implemented in dynamic CMOS technology. Several modifications are made to the multiplier architecture. In the first stage of the multiplier, a novel radix-16 modified Booth encoder is presented, which efficiently reduces the number of partial products. A new algorithm for partial product reduction in multiplication operations is also presented. The algorithm is based on implementing compressor elements by means of a carry network. Arranging these compressors into reduction trees takes advantage of the modified Wallace tree for integrating adder cells and provides an alternative to conventional methods. We show several reduction techniques that illustrate the proposed method and describe carry-skip examples that combine dynamic CMOS with conventional compressors in order to modify each scheme. In the multiplier network, a novel low-power, high-speed adder cell using 14 transistors is presented. The critical path is minimized to reduce latency across the whole architecture. The final adder of the multiplier is an optimized hybrid carry adder implemented in dynamic CMOS technology. It sums the two final operands very efficiently, which has a significant effect on the overall structure. Compared with other structures, the presented multiplier reduces latency by 12%, decreases transistor count by 8% and efficiently mitigates the noise problem.
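The reduction in partial products from radix-16 Booth recoding can be illustrated with the standard recoding rule, which turns an n-bit two's complement multiplier into n/4 signed digits in the range [-8, 8]. This sketch shows the textbook recoding only, not the novel encoder circuit of the paper; `booth_radix16_digits` and `booth_multiply` are assumed names.

```python
def booth_radix16_digits(y, n):
    """Recode a signed n-bit multiplier into radix-16 Booth digits.

    y must lie in the n-bit two's complement range and n must be a
    multiple of 4. Returns n // 4 digits, least significant first,
    each in the range [-8, 8].
    """
    if n % 4 != 0:
        raise ValueError("n must be a multiple of 4")
    u = y & ((1 << n) - 1)   # two's complement bit pattern of y
    digits = []
    prev = 0                 # bit just below the current group
    for i in range(0, n, 4):
        b = [(u >> (i + k)) & 1 for k in range(4)]
        digits.append(-8 * b[3] + 4 * b[2] + 2 * b[1] + b[0] + prev)
        prev = b[3]
    return digits


def booth_multiply(x, y, n):
    """Multiply x by the n-bit signed value y, one partial product per digit."""
    return sum((d * x) << (4 * j)
               for j, d in enumerate(booth_radix16_digits(y, n)))
```

An 8-bit multiplier therefore yields two partial products instead of eight, at the cost of needing multiples of the multiplicand up to 8x, including the odd multiples 3x, 5x and 7x that cannot be formed by shifting alone.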

