scholarly journals Hardware Optimized and Error Reduced Approximate Adder

Electronics ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. 1212 ◽  
Author(s):  
Padmanabhan Balasubramanian ◽  
Douglas L. Maskell

This paper presents a new hardware optimized and error reduced approximate adder (HOERAA), which is suitable for field programmable gate array (FPGA)- and application specific integrated circuit (ASIC)-based implementations. In this work, we consider a FPGA-based implementation using Xilinx Vivado 2018.3, targeting an Artix-7 FPGA. The ASIC-based realizations are based on a 32/28nm complementary metal oxide semiconductor (CMOS) process. Based on FPGA implementations, we note the following: (i) For 32-bit addition involving a 8-bit least significant inaccurate sub-adder, HOERAA requires 22% fewer look-up tables (LUTs) and 18.6% fewer registers while reducing the minimum clock period by 7.1% and reducing the power-delay product (PDP) by 14.7%, compared to the native accurate FPGA adder, and (ii) for 64-bit addition involving a 8-bit least significant inaccurate sub-adder, HOERAA requires 11% fewer LUTs and 9.3% fewer registers while reducing the minimum clock period by 8.3% and reducing the PDP by 9.3%, compared to the native accurate FPGA adder. Based on ASIC-style implementations, HOERAA is found to achieve the following reductions in design metrics compared to an optimum accurate carry-lookahead adder: (i) A 15.7% reduction in critical path delay, a 21.4% reduction in area, and a 35% reduction in PDP for 32-bit addition involving a 8-bit least significant inaccurate sub-adder, and (ii) a 15.3% reduction in critical path delay, a 10.7% reduction in area, and a 20% reduction in PDP for 64-bit addition involving a 8-bit least significant inaccurate sub-adder. Moreover, comparisons with other approximate adders show that HOERAA has a significantly reduced average error, mean average error, and root mean square error, while reporting near optimum design metrics.

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 630
Author(s):  
Padmanabhan Balasubramanian ◽  
Raunaq Nayar ◽  
Douglas L. Maskell

This article describes the design of approximate array multipliers by making vertical or horizontal cuts in an accurate array multiplier followed by different input and output assignments within the multiplier. We consider a digital image denoising application and show how different combinations of input and output assignments in an approximate array multiplier affect the quality of the denoised images. We consider the accurate array multiplier and several approximate array multipliers for synthesis. The multipliers were described in Verilog hardware description language and synthesized by Synopsys Design Compiler using a 32/28-nm complementary metal-oxide-semiconductor technology. The results show that compared to the accurate array multiplier, one of the proposed approximate array multipliers viz. PAAM01-V7 achieves a 28% reduction in critical path delay, 75.8% reduction in power, and 64.6% reduction in area while enabling the production of a denoised image that is comparable in quality to the image denoised using the accurate array multiplier. The standard design metrics such as critical path delay, total power dissipation, and area of the accurate and approximate multipliers are given, the error parameters of the approximate array multipliers are provided, and the original image, the noisy image, and the denoised images are also depicted for comparison.


Electronics ◽  
2018 ◽  
Vol 7 (11) ◽  
pp. 272 ◽  
Author(s):  
Padmanabhan Balasubramanian ◽  
Douglas Maskell ◽  
Nikos Mastorakis

In the era of nanoelectronics, multiple faults or failures of function blocks are likely to occur. To withstand these, higher levels of redundancy are suggested to be employed in at least the sensitive portions of a circuit or system. In this context, the N-modular redundancy (NMR) scheme may be used to guard against the multiple faults or failures of function blocks. However, the NMR scheme would exacerbate the weight, cost, and design metrics to implement higher-order redundancy. Hence, as an alternative to the NMR, the majority and minority voted redundancy (MMR) scheme was proposed recently. However, the proposal was restricted to the basic implementation with no provision for indicating the correct or the incorrect operation of the MMR. Hence in this work, we present the MMR scheme with the error/no-error signaling logic (ESL). Example NMR circuits without and with the ESL (NMRESL), and example MMR circuits without and with the proposed ESL (MMRESL) were implemented to achieve similar degrees of fault tolerance using a 32/28-nm CMOS technology. The results show that, on average, the proposed MMRESL circuits have 18.9% less critical path delay, dissipate 64.8% less power, and require 49.5% less silicon area compared to their counterpart NMRESL circuits.


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 231
Author(s):  
Chester Sungchung Park ◽  
Sunwoo Kim ◽  
Jooho Wang ◽  
Sungkyung Park

A digital front-end decimation chain based on both Farrow interpolator for fractional sample-rate conversion and a digital mixer is proposed in order to comply with the long-term evolution standards in radio receivers with ten frequency modes. Design requirement specifications with adjacent channel selectivity, inband blockers, and narrowband blockers are all satisfied so that the proposed digital front-end is 3GPP-compliant. Furthermore, the proposed digital front-end addresses carrier aggregation in the standards via appropriate frequency translations. The digital front-end has a cascaded integrator comb filter prior to Farrow interpolator and also has a per-carrier carrier aggregation filter and channel selection filter following the digital mixer. A Farrow interpolator with an integrate-and-dump circuitry controlled by a condition signal is proposed and also a digital mixer with periodic reset to prevent phase error accumulation is proposed. From the standpoint of design methodology, three models are all developed for the overall digital front-end, namely, functional models, cycle-accurate models, and bit-accurate models. Performance is verified by means of the cycle-accurate model and subsequently, by means of a special C++ class, the bitwidths are minimized in a methodic manner for area minimization. For system-level performance verification, the orthogonal frequency division multiplexing receiver is also modeled. The critical path delay of each building block is analyzed and the spectral-domain view is obtained for each building block of the digital front-end circuitry. The proposed digital front-end circuitry is simulated, designed, and both synthesized in a 180 nm CMOS application-specific integrated circuit technology and implemented in the Xilinx XC6VLX550T field-programmable gate array (Xilinx, San Jose, CA, USA).


Micromachines ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 284
Author(s):  
Yihsiang Chiu ◽  
Chen Wang ◽  
Dan Gong ◽  
Nan Li ◽  
Shenglin Ma ◽  
...  

This paper presents a high-accuracy complementary metal oxide semiconductor (CMOS) driven ultrasonic ranging system based on air coupled aluminum nitride (AlN) based piezoelectric micromachined ultrasonic transducers (PMUTs) using time of flight (TOF). The mode shape and the time-frequency characteristics of PMUTs are simulated and analyzed. Two pieces of PMUTs with a frequency of 97 kHz and 96 kHz are applied. One is used to transmit and the other is used to receive ultrasonic waves. The Time to Digital Converter circuit (TDC), correlating the clock frequency with sound velocity, is utilized for range finding via TOF calculated from the system clock cycle. An application specific integrated circuit (ASIC) chip is designed and fabricated on a 0.18 μm CMOS process to acquire data from the PMUT. Compared to state of the art, the developed ranging system features a wide range and high accuracy, which allows to measure the range of 50 cm with an average error of 0.63 mm. AlN based PMUT is a promising candidate for an integrated portable ranging system.


Electronics ◽  
2018 ◽  
Vol 7 (12) ◽  
pp. 369 ◽  
Author(s):  
Padmanabhan Balasubramanian ◽  
Nikos Mastorakis

Addition is a fundamental operation in microprocessing and digital signal processing hardware, which is physically realized using an adder. The carry-lookahead adder (CLA) and the carry-select adder (CSLA) are two popular high-speed, low-power adder architectures. The speed performance of a CLA architecture can be improved by adopting a hybrid CLA architecture which employs a small-size ripple-carry adder (RCA) to replace a sub-CLA in the least significant bit positions. On the other hand, the power dissipation of a CSLA employing full adders and 2:1 multiplexers can be reduced by utilizing binary-to-excess-1 code (BEC) converters. In the literature, the designs of many CLAs and CSLAs were described separately. It would be useful to have a direct comparison of their performances based on the design metrics. Hence, we implemented homogeneous and hybrid CLAs, and CSLAs with and without the BEC converters by considering 32-bit accurate and approximate additions to facilitate a comparison. For the gate-level implementations, we considered a 32/28 nm complementary metal-oxide-semiconductor (CMOS) process targeting a typical-case process–voltage–temperature (PVT) specification. The results show that the hybrid CLA/RCA architecture is preferable among the CLA and CSLA architectures from the speed and power perspectives to perform accurate and approximate additions.


Author(s):  
Kenta Shirane ◽  
Takahiro Yamamoto ◽  
Hiroyuki Tomiyama

In this paper, we present a case study on approximate multipliers for MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-width to the convolution layer in MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between approximate multiplier’s area, critical path delay and the accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of some partial products, which are carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers with keeping high accuracy of the MNIST classification.


VLSI technology become one of the most significant and demandable because of the characteristics like device portability, device size, large amount of features, expenditure, consistency, rapidity and many others. Multipliers and Adders place an important role in various digital systems such as computers, process controllers and signal processors in order to achieve high speed and low power. Two input XOR/XNOR gate and 2:1 multiplexer modules are used to design the Hybrid Full adders. The XOR/XNOR gate is the key punter of power included in the Full adder cell. However this circuit increases the delay, area and critical path delay. Hence, the optimum design of the XOR/XNOR is required to reduce the power consumption of the Full adder Cell. So a 6 New Hybrid Full adder circuits are proposed based on the Novel Full-Swing XOR/XNOR gates and a New Gate Diffusion Input (GDI) design of Full adder with high-swing outputs. The speed, power consumption, power delay product and driving capability are the merits of the each proposed circuits. This circuit simulation was carried used cadence virtuoso EDA tool. The simulation results based on the 90nm CMOS process technology model.


2018 ◽  
Vol 7 (2.16) ◽  
pp. 94
Author(s):  
Abhishek Choubey ◽  
SPV Subbarao ◽  
Shruti B. Choubey

Multiplication is one of the most an essential arithmetic operation used in numerous applications in digital signal processing and communications. These applications need transformations, convolutions and dot products that involve an enormous amount of multiplications of an operand with a constant. Typical examples include wavelet, digital filters, such as FIR or IIR. However, multiplier structures have relatively large area-delay product, long latency and significantly high power consumption compared to other the arithmetic structure. Therefore, low power multiplier design has been always a significant part of DSP structure for VLSI design. The Booth multiplier is promising as the most efficient amongst the others multiplier as it reduces the complexity of considerably than others. In this paper, we have proposed Booth-multiplier using seamless pipelining. Theoretical comparison results show that the proposed Booth multiplier requires less critical path delay compared to traditional Booth multiplier. ASIC simulation results show proposed radix-16 Booth multiplier 13% less critical path delay for word width n=16 and 17% less critical path delay compared for bit width n=32 to best existing radix-16 Booth multiplier. 


Sign in / Sign up

Export Citation Format

Share Document