A highly regular parallel multiplier architecture is presented, along with novel low-power,
high-performance CMOS implementation circuits. These gains are achieved
through a unique scheme for recursive decomposition of partial product
matrices, a recently proposed non-binary arithmetic logic, and
complementary shift switch logic circuits.
The proposed 64×64-b parallel multiplier has the following distinct features:
(1) generating 64 8×8-b partial product matrices instead of a single large one; (2)
comprising only four stages of bit reduction: the first by small 8×8-b parallel multipliers,
and each of the remaining three by small parallel counters; and (3) using a simple final adder.
A family of shift switch parallel counters, including non-binary (6, 3)∗ and complementary
(k, 2) for 2 ≤ k ≤ 8, is proposed for efficient bit reduction.
The non-binary logic operates on 4-bit state signals (representing integers ranging from
0 to 3), in which no more than half of the signal bits can change value at any
logic stage. This property, together with minimal transistor counts, fewer inverters, and
a low-leakage logic structure, significantly reduces circuit power dissipation.
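
The decomposition in feature (1) can be illustrated arithmetically. The following is a minimal sketch, assuming each 64-bit operand is split into eight 8-bit slices, so that the 64 slice pairs yield 64 small 8×8-b products. The function names are hypothetical, and the plain software sum stands in for the counter stages and final adder. It models only the arithmetic, not the shift switch circuits.

```python
def split_bytes(x, n_bytes=8):
    """Split a non-negative integer into n_bytes 8-bit slices, least significant first."""
    return [(x >> (8 * i)) & 0xFF for i in range(n_bytes)]


def multiply_64x64_by_blocks(a, b):
    """Multiply two 64-bit unsigned integers via 64 small 8x8-bit products.

    Returns (product, number_of_small_products).
    """
    a_slices = split_bytes(a)
    b_slices = split_bytes(b)

    # Each pair (i, j) gives one 8x8-bit product, weighted by 2^(8*(i+j)).
    blocks = []
    for i, a_i in enumerate(a_slices):
        for j, b_j in enumerate(b_slices):
            blocks.append((a_i * b_j, 8 * (i + j)))

    # A plain sum replaces the hardware bit-reduction stages (assumption:
    # only the numerical result is modeled here, not the reduction scheme).
    product = sum(p << shift for p, shift in blocks)
    return product, len(blocks)


if __name__ == "__main__":
    a, b = 0xFEDCBA9876543210, 0x0123456789ABCDEF
    product, count = multiply_64x64_by_blocks(a, b)
    print(count, product == a * b)
```

Each small product is at most 16 bits wide, which is what makes the later reduction stages operate on many narrow matrices rather than one large one.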