Hardware Optimized and Error Reduced Approximate Adder

Padmanabhan Balasubramanian; Douglas L. Maskell

doi:10.3390/electronics8111212

Electronics ◽

10.3390/electronics8111212 ◽

2019 ◽

Vol 8 (11) ◽

pp. 1212 ◽

Cited By ~ 6

Author(s):

Padmanabhan Balasubramanian ◽

Douglas L. Maskell

Keyword(s):

Integrated Circuit ◽

Critical Path ◽

Oxide Semiconductor ◽

Average Error ◽

Cmos Process ◽

Path Delay ◽

Clock Period ◽

Design Metrics ◽

Critical Path Delay ◽

Reduction In Area

This paper presents a new hardware optimized and error reduced approximate adder (HOERAA), which is suitable for field programmable gate array (FPGA)- and application specific integrated circuit (ASIC)-based implementations. In this work, we consider an FPGA-based implementation using Xilinx Vivado 2018.3, targeting an Artix-7 FPGA. The ASIC-based realizations are based on a 32/28 nm complementary metal oxide semiconductor (CMOS) process. Based on FPGA implementations, we note the following: (i) for 32-bit addition involving an 8-bit least significant inaccurate sub-adder, HOERAA requires 22% fewer look-up tables (LUTs) and 18.6% fewer registers while reducing the minimum clock period by 7.1% and reducing the power-delay product (PDP) by 14.7%, compared to the native accurate FPGA adder, and (ii) for 64-bit addition involving an 8-bit least significant inaccurate sub-adder, HOERAA requires 11% fewer LUTs and 9.3% fewer registers while reducing the minimum clock period by 8.3% and reducing the PDP by 9.3%, compared to the native accurate FPGA adder. Based on ASIC-style implementations, HOERAA is found to achieve the following reductions in design metrics compared to an optimum accurate carry-lookahead adder: (i) a 15.7% reduction in critical path delay, a 21.4% reduction in area, and a 35% reduction in PDP for 32-bit addition involving an 8-bit least significant inaccurate sub-adder, and (ii) a 15.3% reduction in critical path delay, a 10.7% reduction in area, and a 20% reduction in PDP for 64-bit addition involving an 8-bit least significant inaccurate sub-adder. Moreover, comparisons with other approximate adders show that HOERAA has a significantly reduced average error, mean average error, and root mean square error, while reporting near optimum design metrics.
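To make the idea of an accurate upper part combined with an inaccurate least significant sub-adder concrete, the following minimal sketch models a generic lower-part OR adder and measures its error exhaustively. It is not the HOERAA logic, which the abstract does not specify; the function names `loa_add` and `error_metrics`, the small default widths, and the reading of "mean average error" as the mean of absolute errors are all assumptions made for illustration.

```python
import math


def loa_add(a: int, b: int, k: int = 3) -> int:
    """Generic lower-part OR adder (illustrative, not HOERAA).

    The k least significant bits are approximated by a bitwise OR,
    and the remaining upper bits are added exactly.
    """
    mask = (1 << k) - 1
    low = (a & mask) | (b & mask)  # approximate low part, no carry chain
    # Carry into the exact part: AND of the top bits of the two low parts.
    carry_in = ((a >> (k - 1)) & (b >> (k - 1)) & 1) if k > 0 else 0
    high = (a >> k) + (b >> k) + carry_in  # exact upper part
    return (high << k) | low


def error_metrics(n: int = 8, k: int = 3):
    """Exhaustively compare loa_add with exact addition for n-bit operands.

    Returns (average error, mean absolute error, root mean square error).
    """
    errors = [
        loa_add(a, b, k) - (a + b)
        for a in range(1 << n)
        for b in range(1 << n)
    ]
    count = len(errors)
    average = sum(errors) / count
    mean_abs = sum(abs(e) for e in errors) / count
    rmse = math.sqrt(sum(e * e for e in errors) / count)
    return average, mean_abs, rmse


if __name__ == "__main__":
    avg, mae, rmse = error_metrics(n=8, k=3)
    print(f"average error={avg:.4f}, mean absolute error={mae:.4f}, RMSE={rmse:.4f}")
```

Running the exhaustive loop for 32-bit operands is not practical, so a study at that width would have to sample random operand pairs instead.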

Approximate Array Multipliers

Electronics ◽

10.3390/electronics10050630 ◽

2021 ◽

Vol 10 (5) ◽

pp. 630

Author(s):

Padmanabhan Balasubramanian ◽

Raunaq Nayar ◽

Douglas L. Maskell

Keyword(s):

Critical Path ◽

Complementary Metal Oxide Semiconductor ◽

Oxide Semiconductor ◽

Total Power ◽

Path Delay ◽

Input And Output ◽

Critical Path Delay ◽

Standard Design ◽

Array Multiplier ◽

Array Multipliers

This article describes the design of approximate array multipliers by making vertical or horizontal cuts in an accurate array multiplier followed by different input and output assignments within the multiplier. We consider a digital image denoising application and show how different combinations of input and output assignments in an approximate array multiplier affect the quality of the denoised images. We consider the accurate array multiplier and several approximate array multipliers for synthesis. The multipliers were described in Verilog hardware description language and synthesized by Synopsys Design Compiler using a 32/28-nm complementary metal-oxide-semiconductor technology. The results show that compared to the accurate array multiplier, one of the proposed approximate array multipliers viz. PAAM01-V7 achieves a 28% reduction in critical path delay, 75.8% reduction in power, and 64.6% reduction in area while enabling the production of a denoised image that is comparable in quality to the image denoised using the accurate array multiplier. The standard design metrics such as critical path delay, total power dissipation, and area of the accurate and approximate multipliers are given, the error parameters of the approximate array multipliers are provided, and the original image, the noisy image, and the denoised images are also depicted for comparison.

Majority and Minority Voted Redundancy Scheme for Safety-Critical Applications with Error/No-Error Signaling Logic

Electronics ◽

10.3390/electronics7110272 ◽

2018 ◽

Vol 7 (11) ◽

pp. 272 ◽

Cited By ~ 2

Author(s):

Padmanabhan Balasubramanian ◽

Douglas Maskell ◽

Nikos Mastorakis

Keyword(s):

Critical Path ◽

Cmos Technology ◽

Path Delay ◽

Silicon Area ◽

Multiple Faults ◽

Design Metrics ◽

Safety Critical ◽

Critical Path Delay ◽

Function Blocks ◽

Modular Redundancy

In the era of nanoelectronics, multiple faults or failures of function blocks are likely to occur. To withstand these, it has been suggested that higher levels of redundancy be employed in at least the sensitive portions of a circuit or system. In this context, the N-modular redundancy (NMR) scheme may be used to guard against the multiple faults or failures of function blocks. However, implementing higher-order redundancy with the NMR scheme would worsen the weight, cost, and design metrics. Hence, as an alternative to the NMR, the majority and minority voted redundancy (MMR) scheme was proposed recently. However, the proposal was restricted to the basic implementation with no provision for indicating the correct or the incorrect operation of the MMR. Hence, in this work, we present the MMR scheme with the error/no-error signaling logic (ESL). Example NMR circuits without and with the ESL (NMRESL), and example MMR circuits without and with the proposed ESL (MMRESL) were implemented to achieve similar degrees of fault tolerance using a 32/28-nm CMOS technology. The results show that, on average, the proposed MMRESL circuits have 18.9% less critical path delay, dissipate 64.8% less power, and require 49.5% less silicon area compared to their counterpart NMRESL circuits.
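As background for the NMR scheme mentioned above, the sketch below shows plain majority voting over the 1-bit outputs of N redundant function blocks. It does not reproduce the MMR scheme or the paper's ESL, neither of which is described in the abstract; the name `nmr_vote` and the simple disagreement flag are hypothetical stand-ins.

```python
def nmr_vote(bits):
    """Majority vote over the 1-bit outputs of N redundant modules.

    Returns (voted_bit, disagreement). The disagreement flag is a
    hypothetical indicator that is True whenever the modules do not all
    agree; it is not the error/no-error signaling logic of the paper.
    """
    n = len(bits)
    if n < 3 or n % 2 == 0:
        raise ValueError("N must be odd and at least 3")
    ones = sum(bits)
    voted = 1 if ones > n // 2 else 0
    disagreement = 0 < ones < n
    return voted, disagreement


if __name__ == "__main__":
    # Triple modular redundancy (N = 3) masks a single faulty module.
    print(nmr_vote([1, 1, 0]))
    # N = 5 masks up to two faulty modules.
    print(nmr_vote([1, 0, 1, 0, 1]))
```

With an odd N, the voter masks up to (N - 1) / 2 faulty modules, which is why tolerating more faults requires more copies and drives up the area and power that the abstract compares.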

Design and Implementation of a Farrow-Interpolator-Based Digital Front-End in LTE Receivers for Carrier Aggregation

Electronics ◽

10.3390/electronics10030231 ◽

2021 ◽

Vol 10 (3) ◽

pp. 231

Author(s):

Chester Sungchung Park ◽

Sunwoo Kim ◽

Jooho Wang ◽

Sungkyung Park

Keyword(s):

Integrated Circuit ◽

Building Block ◽

Orthogonal Frequency Division Multiplexing ◽

Critical Path ◽

Phase Error ◽

System Level ◽

Comb Filter ◽

Carrier Aggregation ◽

Path Delay ◽

Front End

A digital front-end decimation chain based on both a Farrow interpolator for fractional sample-rate conversion and a digital mixer is proposed in order to comply with the long-term evolution standards in radio receivers with ten frequency modes. Design requirement specifications with adjacent channel selectivity, inband blockers, and narrowband blockers are all satisfied so that the proposed digital front-end is 3GPP-compliant. Furthermore, the proposed digital front-end addresses carrier aggregation in the standards via appropriate frequency translations. The digital front-end has a cascaded integrator comb filter prior to the Farrow interpolator and also has a per-carrier carrier aggregation filter and channel selection filter following the digital mixer. A Farrow interpolator with integrate-and-dump circuitry controlled by a condition signal is proposed, as is a digital mixer with periodic reset to prevent phase error accumulation. From the standpoint of design methodology, three models are developed for the overall digital front-end, namely, functional models, cycle-accurate models, and bit-accurate models. Performance is verified by means of the cycle-accurate model and subsequently, by means of a special C++ class, the bitwidths are minimized in a methodical manner for area minimization. For system-level performance verification, the orthogonal frequency division multiplexing receiver is also modeled. The critical path delay of each building block is analyzed and the spectral-domain view is obtained for each building block of the digital front-end circuitry. The proposed digital front-end circuitry is simulated, designed, synthesized in a 180 nm CMOS application-specific integrated circuit technology, and implemented in the Xilinx XC6VLX550T field-programmable gate array (Xilinx, San Jose, CA, USA).

A Novel Ultrasonic TOF Ranging System Using AlN Based PMUTs

Micromachines ◽

10.3390/mi12030284 ◽

2021 ◽

Vol 12 (3) ◽

pp. 284

Author(s):

Yihsiang Chiu ◽

Chen Wang ◽

Dan Gong ◽

Nan Li ◽

Shenglin Ma ◽

...

Keyword(s):

Clock Cycle ◽

High Accuracy ◽

Ultrasonic Waves ◽

Oxide Semiconductor ◽

Average Error ◽

Cmos Process ◽

Clock Frequency ◽

Time Frequency ◽

Range Finding ◽

Wide Range

This paper presents a high-accuracy complementary metal oxide semiconductor (CMOS) driven ultrasonic ranging system based on air-coupled aluminum nitride (AlN) piezoelectric micromachined ultrasonic transducers (PMUTs) using time of flight (TOF). The mode shape and the time-frequency characteristics of the PMUTs are simulated and analyzed. Two PMUTs with frequencies of 97 kHz and 96 kHz are used: one transmits and the other receives ultrasonic waves. The time-to-digital converter (TDC) circuit, correlating the clock frequency with sound velocity, is utilized for range finding via the TOF calculated from the system clock cycle. An application specific integrated circuit (ASIC) chip is designed and fabricated in a 0.18 μm CMOS process to acquire data from the PMUT. Compared to the state of the art, the developed ranging system features a wide range and high accuracy, measuring a range of 50 cm with an average error of 0.63 mm. The AlN-based PMUT is a promising candidate for an integrated portable ranging system.
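The relationship between a clock-cycle count, the clock frequency, and the measured range can be sketched as follows. This is only an illustration under stated assumptions: a one-way path from transmitter to receiver, a speed of sound of about 343 m/s in air at room temperature, and a hypothetical 1 MHz counting clock. None of these values or function names come from the paper.

```python
SOUND_SPEED_M_PER_S = 343.0  # assumed: air at roughly 20 degrees Celsius


def tof_distance_m(clock_cycles: int, clock_hz: float,
                   sound_speed: float = SOUND_SPEED_M_PER_S) -> float:
    """Distance covered by the ultrasonic wave during the counted cycles.

    Assumes a one-way path between a separate transmitter and receiver.
    For an echo (round-trip) setup the result would have to be halved.
    """
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    time_of_flight_s = clock_cycles / clock_hz
    return sound_speed * time_of_flight_s


def range_resolution_m(clock_hz: float,
                       sound_speed: float = SOUND_SPEED_M_PER_S) -> float:
    """Distance corresponding to a single clock cycle of the counter."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return sound_speed / clock_hz


if __name__ == "__main__":
    clock_hz = 1_000_000  # hypothetical 1 MHz counting clock
    print(tof_distance_m(1000, clock_hz))  # distance for 1000 counted cycles
    print(range_resolution_m(clock_hz))    # distance per counted cycle
```

The sketch shows why a faster counting clock gives finer range resolution: each counted cycle corresponds to a shorter distance travelled by the sound wave.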

Layout-Aware Critical Path Delay Test Under Maximum Power Supply Noise Effects

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2011.2163159 ◽

2011 ◽

Vol 30 (12) ◽

pp. 1923-1934 ◽

Cited By ~ 18

Author(s):

Junxia Ma ◽

Mohammad Tehranipoor

Keyword(s):

Power Supply ◽

Critical Path ◽

Maximum Power ◽

Path Delay ◽

Power Supply Noise ◽

Delay Test ◽

Noise Effects ◽

Critical Path Delay ◽

Supply Noise ◽

Path Delay Test

Performance Comparison of Carry-Lookahead and Carry-Select Adders Based on Accurate and Approximate Additions

Electronics ◽

10.3390/electronics7120369 ◽

2018 ◽

Vol 7 (12) ◽

pp. 369 ◽

Cited By ~ 4

Author(s):

Padmanabhan Balasubramanian ◽

Nikos Mastorakis

Keyword(s):

High Speed ◽

Digital Signal ◽

Complementary Metal Oxide Semiconductor ◽

Performance Comparison ◽

Oxide Semiconductor ◽

Cmos Process ◽

Least Significant Bit ◽

Design Metrics ◽

Ripple Carry Adder ◽

Speed Performance

Addition is a fundamental operation in microprocessing and digital signal processing hardware, which is physically realized using an adder. The carry-lookahead adder (CLA) and the carry-select adder (CSLA) are two popular high-speed, low-power adder architectures. The speed performance of a CLA architecture can be improved by adopting a hybrid CLA architecture which employs a small-size ripple-carry adder (RCA) to replace a sub-CLA in the least significant bit positions. On the other hand, the power dissipation of a CSLA employing full adders and 2:1 multiplexers can be reduced by utilizing binary-to-excess-1 code (BEC) converters. In the literature, the designs of many CLAs and CSLAs were described separately. It would be useful to have a direct comparison of their performances based on the design metrics. Hence, we implemented homogeneous and hybrid CLAs, and CSLAs with and without the BEC converters by considering 32-bit accurate and approximate additions to facilitate a comparison. For the gate-level implementations, we considered a 32/28 nm complementary metal-oxide-semiconductor (CMOS) process targeting a typical-case process–voltage–temperature (PVT) specification. The results show that the hybrid CLA/RCA architecture is preferable among the CLA and CSLA architectures from the speed and power perspectives to perform accurate and approximate additions.
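For readers unfamiliar with the carry-lookahead principle, the following sketch computes each carry directly from generate and propagate signals using the flattened lookahead expression rather than a bit-by-bit ripple. It is a behavioral model only and does not represent the homogeneous or hybrid CLA circuits evaluated in the paper; the function names and the 4-bit default width are assumptions.

```python
def lookahead_carry(g, p, c0, i):
    """Carry into bit position i from the flattened lookahead expression:

    c_i = g[i-1] | p[i-1]g[i-2] | ... | p[i-1]...p[1]g[0] | p[i-1]...p[0]c0
    """
    carry = 0
    for j in range(i):
        term = g[j]
        for m in range(j + 1, i):
            term &= p[m]  # g[j] must propagate through bits j+1 .. i-1
        carry |= term
    term = c0
    for m in range(i):
        term &= p[m]      # carry-in must propagate through bits 0 .. i-1
    return carry | term


def cla_add(a: int, b: int, width: int = 4, carry_in: int = 0):
    """Add two width-bit operands; returns (sum, carry_out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate
    total = 0
    for i in range(width):
        total |= (p[i] ^ lookahead_carry(g, p, carry_in, i)) << i
    return total, lookahead_carry(g, p, carry_in, width)


if __name__ == "__main__":
    print(cla_add(0b1011, 0b0110))
```

The Python loops run sequentially, whereas in hardware the product terms of each carry are evaluated in parallel, which is the source of the speed advantage over a ripple-carry adder.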

Exploring Linear Structures of Critical Path Delay Faults to Reduce Test Efforts

2006 IEEE/ACM International Conference on Computer Aided Design ◽

10.1109/iccad.2006.320072 ◽

2006 ◽

Author(s):

Shun-yen Lu ◽

Pei-ying Hsieh ◽

Jing-jia Liou

Keyword(s):

Critical Path ◽

Delay Faults ◽

Path Delay ◽

Path Delay Faults ◽

Linear Structures ◽

Critical Path Delay

A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽

10.11591/ijres.v10.i1.pp1-10 ◽

2021 ◽

Vol 10 (1) ◽

pp. 1

Author(s):

Kenta Shirane ◽

Takahiro Yamamoto ◽

Hiroyuki Tomiyama

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Design Methodology ◽

Critical Path ◽

High Accuracy ◽

Path Delay ◽

Trade Off ◽

Critical Path Delay

In this paper, we present a case study on approximate multipliers for an MNIST convolutional neural network (CNN). We apply approximate multipliers with different bit-widths to the convolution layer in the MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between the approximate multiplier's area, critical path delay, and accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of some partial products, which are carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers while keeping high accuracy of the MNIST classification.

Novel Design of Low-Power High-Speed Hybrid Full Adder Design using Gate Diffusion Input (GDI) Technique

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.l7992.1091220 ◽

2020 ◽

Vol 9 (12) ◽

pp. 323-328

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

Critical Path ◽

Circuit Simulation ◽

Full Adder ◽

Cmos Process ◽

Path Delay ◽

Process Technology ◽

Xnor Gate

VLSI technology has become one of the most significant and in-demand technologies because of characteristics such as device portability, device size, large number of features, cost, consistency, speed, and many others. Multipliers and adders play an important role in various digital systems such as computers, process controllers, and signal processors in order to achieve high speed and low power. Two-input XOR/XNOR gate and 2:1 multiplexer modules are used to design the hybrid full adders. The XOR/XNOR gate is the key contributor to the power consumed in the full adder cell. However, this circuit increases the delay, area, and critical path delay. Hence, an optimum design of the XOR/XNOR gate is required to reduce the power consumption of the full adder cell. Six new hybrid full adder circuits are therefore proposed, based on novel full-swing XOR/XNOR gates and a new gate diffusion input (GDI) design of the full adder with high-swing outputs. The speed, power consumption, power-delay product, and driving capability are the merits of each proposed circuit. The circuits were simulated using the Cadence Virtuoso EDA tool. The simulation results are based on a 90 nm CMOS process technology model.

Design of delay efficient Booth multiplier using pipelining

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.16.11423 ◽

2018 ◽

Vol 7 (2.16) ◽

pp. 94

Author(s):

Abhishek Choubey ◽

SPV Subbarao ◽

Shruti B. Choubey

Keyword(s):

Critical Path ◽

Arithmetic Operation ◽

Vlsi Design ◽

Digital Signal ◽

Path Delay ◽

Large Area ◽

Booth Multiplier ◽

Critical Path Delay ◽

Long Latency ◽

Comparison Results

Multiplication is one of the most essential arithmetic operations used in numerous applications in digital signal processing and communications. These applications need transformations, convolutions, and dot products that involve an enormous number of multiplications of an operand with a constant. Typical examples include wavelets and digital filters such as FIR or IIR. However, multiplier structures have a relatively large area-delay product, long latency, and significantly high power consumption compared to other arithmetic structures. Therefore, low-power multiplier design has always been a significant part of DSP structures for VLSI design. The Booth multiplier is promising as the most efficient among multipliers, as it reduces complexity considerably compared with the others. In this paper, we propose a Booth multiplier using seamless pipelining. Theoretical comparison results show that the proposed Booth multiplier has less critical path delay than the traditional Booth multiplier. ASIC simulation results show that the proposed radix-16 Booth multiplier has 13% less critical path delay for word width n = 16 and 17% less critical path delay for word width n = 32, compared to the best existing radix-16 Booth multiplier.
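The radix-16 Booth recoding that underlies such multipliers can be sketched in software as follows. This models only the standard recoding of a two's complement multiplier into signed digits and the summation of the resulting partial products; it does not model the pipelining proposed in the paper, and the function names and default widths are assumptions.

```python
def booth_digits(y: int, n: int = 16, k: int = 4):
    """Recode an n-bit two's complement multiplier into radix-2**k Booth digits.

    With k = 4 (radix 16), each digit lies in the range -8 .. 8, so an
    n-bit multiplier yields n / 4 partial products.
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of k")
    if not -(1 << (n - 1)) <= y < (1 << (n - 1)):
        raise ValueError("y does not fit in n bits")
    pattern = y & ((1 << n) - 1)  # two's complement bit pattern of y

    def bit(pos: int) -> int:
        return (pattern >> pos) & 1 if pos >= 0 else 0  # y[-1] is 0

    digits = []
    for i in range(n // k):
        base = k * i
        digit = bit(base - 1)                         # overlap bit from the group below
        for j in range(k - 1):
            digit += bit(base + j) << j               # positively weighted bits
        digit -= bit(base + k - 1) << (k - 1)         # top bit of the group is negative
        digits.append(digit)
    return digits


def booth_multiply(x: int, y: int, n: int = 16, k: int = 4) -> int:
    """Multiply x by y by summing one partial product per Booth digit."""
    return sum((d * x) * (1 << (k * i)) for i, d in enumerate(booth_digits(y, n, k)))


if __name__ == "__main__":
    print(booth_digits(-4567))
    print(booth_multiply(123, -4567))
```

A higher radix reduces the number of partial products at the cost of more complex digit multiples, which is one reason the critical path of such multipliers is a design concern.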
