Hardware-efficient approximate multiplier architectures for media processing applications

Purpose: Multipliers, which form the basic building blocks of most error-resilient media processing applications, are computationally intensive and power-hungry modules. Improving the multiplier's performance in terms of area, critical path delay and power has therefore become an important research area. This paper proposes two improved multiplier designs based on a new approximate compressor circuit that reduces hardware complexity at the partial product reduction stage. The proposed approximate 4:2 compressor design significantly reduces the overall hardware cost of the multiplier. The error introduced by the approximate compressor is reduced using a new technique of assigning inputs to the compressors in the partial product reduction structure.
Design/methodology/approach: The multiplier designs implemented using the proposed approximate 4:2 compressor are targeted at error-resilient applications. For fair comparison, various multiplier designs, including the proposed ones, are implemented in MATLAB. The quality analysis is carried out using standard images, and metrics such as the structural similarity index are computed to compare the results of the proposed designs with those of existing architectures. Next, Verilog gate-level designs are synthesized to compute area, delay and power and so demonstrate the efficacy of the proposed designs.
Findings: Exhaustive error and hardware analyses have been carried out for the existing and proposed multiplier architectures. Error analysis carried out using MATLAB shows that the proposed designs achieve better quality metrics than existing designs. Hardware results show that area, power consumption and critical path delay are reduced by up to 39.8%, 51.7% and 15.9%, respectively, compared to the existing designs. Finally, the impact of the proposed designs is quantified and compared with existing designs on real-time image sharpening and image multiplication applications.
Originality/value: The area, delay and power metrics of the multiplier can be improved using an approximate compressor in an error-resilient application. Accordingly, in this work, a new compressor is proposed that reduces the hardware complexity of the multiplier architecture. However, the proposed approximate compressor, while reducing the computational complexity, tends to introduce error in the multiplier. This error is reduced using a new technique of assigning inputs to the compressors in the partial product reduction structure. With the help of the approximate compressor and this input realignment technique, hardware-efficient and highly accurate multiplier designs are achieved.
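
The abstract does not give the logic of the proposed compressor, so the following Python sketch only illustrates the general idea of trading accuracy for hardware in a 4:2 compressor. `exact_compressor` is the standard exact cell built from two chained full adders. `approx_compressor` is a hypothetical simplification (not the paper's design) that drops the carry-in and carry-out and saturates the output at 3. `error_stats` is an assumed helper that measures how often the simplification is wrong.

```python
from itertools import product


def exact_compressor(x1, x2, x3, x4, cin):
    """Exact 4:2 compressor: x1+x2+x3+x4+cin == total + 2*(carry + cout)."""
    s1 = x1 ^ x2 ^ x3                              # first full adder, sum bit
    cout = (x1 & x2) | (x2 & x3) | (x1 & x3)       # first full adder, carry bit
    total = s1 ^ x4 ^ cin                          # second full adder, sum bit
    carry = (s1 & x4) | (x4 & cin) | (s1 & cin)    # second full adder, carry bit
    return total, carry, cout


def approx_compressor(x1, x2, x3, x4):
    """Hypothetical approximate cell: no cin/cout, output saturates at 3."""
    value = min(x1 + x2 + x3 + x4, 3)
    return value & 1, value >> 1                   # (sum bit, carry bit)


def error_stats():
    """Return (error rate, mean error distance) over all 16 input patterns."""
    errors = []
    for bits in product((0, 1), repeat=4):
        s, c = approx_compressor(*bits)
        errors.append(abs(sum(bits) - (s + 2 * c)))
    error_rate = sum(e != 0 for e in errors) / len(errors)
    mean_error = sum(errors) / len(errors)
    return error_rate, mean_error
```

In this illustrative cell only the all-ones input is misrepresented (4 is encoded as 3), so `error_stats()` returns an error rate and a mean error distance of 1/16 each. How the inputs are assigned to such cells determines which partial product bits are exposed to that error, which is the kind of effect an input realignment technique targets.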

Efficient Realization of BCD Multipliers Using FPGAs

International Journal of Reconfigurable Computing ◽

10.1155/2017/2410408 ◽

2017 ◽

Vol 2017 ◽

pp. 1-12 ◽

Cited By ~ 2

Author(s):

Shuli Gao ◽

Dhamin Al-Khalili ◽

J. M. Pierre Langlois ◽

Noureddine Chabini

Keyword(s):

Critical Path ◽

Partial Product ◽

Resource Usage ◽

Path Delay ◽

Delay Reduction ◽

Binary Operations ◽

Critical Path Delay ◽

Product Generation ◽

Improved Performance ◽

Efficient Realization

In this paper, a novel BCD multiplier approach is proposed. The main highlight of the proposed architecture is the generation of the partial products and parallel binary operations based on 2-digit columns. 1 × 1-digit multipliers used for the partial product generation are implemented directly by 4-bit binary multipliers without any code conversion. The binary results of the 1 × 1-digit multiplications are organized according to their two-digit positions to generate the 2-digit column-based partial products. A binary-decimal compressor structure is developed and used for partial product reduction. These reduced partial products are added in optimized 6-LUT BCD adders. The parallel binary operations and the improved BCD addition result in improved performance and reduced resource usage. The proposed approach was implemented on Xilinx Virtex-5 and Virtex-6 FPGAs with emphasis on the critical path delay reduction. Pipelined BCD multipliers were implemented for 4 × 4, 8 × 8, and 16 × 16-digit multipliers. Our realizations achieve an increase in speed by up to 22% and a reduction of LUT count by up to 14% over previously reported results.
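
As a rough software model of the column-based scheme described above, the Python sketch below multiplies two decimal numbers digit by digit, keeps each 1 × 1-digit product as a plain binary integer (at most 81), and accumulates the products into 2-digit (base-100) columns before resolving carries. The function names and the exact grouping rule are assumptions made for illustration. The paper's binary-decimal compressor and 6-LUT BCD adders are not modeled.

```python
def to_digits(n):
    """Decimal digits of a non-negative integer, least significant first."""
    digits = []
    while True:
        digits.append(n % 10)
        n //= 10
        if n == 0:
            break
    return digits


def bcd_multiply(a, b):
    """Multiply via 1x1-digit binary products grouped into 2-digit columns."""
    da, db = to_digits(a), to_digits(b)
    n_cols = (len(da) + len(db) + 1) // 2
    columns = [0] * n_cols
    for i, x in enumerate(da):
        for j, y in enumerate(db):
            p = x * y                       # 4-bit x 4-bit binary product, <= 81
            k, odd = divmod(i + j, 2)       # column index and position inside it
            columns[k] += p * 10 if odd else p
    # Resolve carries between columns in base 100 (two decimal digits each).
    pairs = []
    carry = 0
    for col in columns:
        carry, two_digits = divmod(col + carry, 100)
        pairs.append(two_digits)
    while carry:
        carry, two_digits = divmod(carry, 100)
        pairs.append(two_digits)
    return sum(d * 100 ** k for k, d in enumerate(pairs))
```

The point of the sketch is that every column sum is computed with ordinary binary additions that are independent of the other columns, and decimal correction is deferred to the final carry-resolution step.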

En bloc petrosectomy using a Gigli saw for petroclival lesions

Journal of Neurosurgery ◽

10.3171/jns.1995.83.3.0559 ◽

1995 ◽

Vol 83 (3) ◽

pp. 559-560 ◽

Cited By ~ 11

Author(s):

Tomio Sasaki ◽

Makoto Taniguchi ◽

Ichiro Suzuki ◽

Takaaki Kirino

Keyword(s):

Facial Nerve ◽

Semicircular Canals ◽

Petrous Bone ◽

New Technique ◽

En Bloc ◽

Content Type ◽

A New Technique ◽

Transtentorial Approach ◽

Fine Print

✓ The authors report a new technique for en bloc petrosectomy using a Gigli saw as an alternative to drilling the petrous bone in the combined supra- and infratentorial approach or the transpetrosal—transtentorial approach. It is simple and easy and avoids postoperative cosmetic deformity. This technique has been performed in 11 petroclival lesions without injuring the semicircular canals, the cochlea, or the facial nerve.

A new technique for exposure of the cervical spine laminae

Journal of Neurosurgery Spine ◽

10.3171/spi.2002.96.1.0122 ◽

2002 ◽

Vol 96 (1) ◽

pp. 122-126 ◽

Cited By ~ 20

Author(s):

Tateru Shiraishi

Keyword(s):

Cervical Spine ◽

New Technique ◽

Diverse Range ◽

Content Type ◽

Spinous Processes ◽

Technical Details ◽

A New Technique ◽

Fine Print

✓ The author describes a new technique for exposure of the cervical spine laminae in which the attachments of the semispinalis cervicis and multifidus muscles to the spinous processes are left untouched. It provides a conservative exposure through which a diverse range of posterior cervical surgeries can be performed. In contrast to conventional cervical approaches, none of the muscular attachments to the spinous processes is compromised. In this paper the author describes the technical details and discusses the applications of the procedure.

Layout-Aware Critical Path Delay Test Under Maximum Power Supply Noise Effects

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2011.2163159 ◽

2011 ◽

Vol 30 (12) ◽

pp. 1923-1934 ◽

Cited By ~ 18

Author(s):

Junxia Ma ◽

Mohammad Tehranipoor

Keyword(s):

Power Supply ◽

Critical Path ◽

Maximum Power ◽

Path Delay ◽

Power Supply Noise ◽

Delay Test ◽

Noise Effects ◽

Critical Path Delay ◽

Supply Noise ◽

Path Delay Test

Exploring Linear Structures of Critical Path Delay Faults to Reduce Test Efforts

2006 IEEE/ACM International Conference on Computer Aided Design ◽

10.1109/iccad.2006.320072 ◽

2006 ◽

Author(s):

Shun-yen Lu ◽

Pei-ying Hsieh ◽

Jing-jia Liou

Keyword(s):

Critical Path ◽

Delay Faults ◽

Path Delay ◽

Path Delay Faults ◽

Linear Structures ◽

Critical Path Delay

A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽

10.11591/ijres.v10.i1.pp1-10 ◽

2021 ◽

Vol 10 (1) ◽

pp. 1

Author(s):

Kenta Shirane ◽

Takahiro Yamamoto ◽

Hiroyuki Tomiyama

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Design Methodology ◽

Critical Path ◽

High Accuracy ◽

Path Delay ◽

Trade Off ◽

Critical Path Delay

In this paper, we present a case study on approximate multipliers for an MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-widths to the convolution layer of the MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between the approximate multiplier's area, its critical path delay and the accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of a subset of the partial products, which are carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers while keeping the accuracy of the MNIST classification high.
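
To make the idea of a multiplier built from selected partial products concrete, the Python sketch below models an unsigned multiplier that only sums the partial product rows listed in `keep_rows`. The function name, the `keep_rows` parameter and the default 8-bit width are assumptions for illustration. The abstract does not state how the paper selects partial products beyond saying the choice depends on the CNN input.

```python
def approx_multiply(a, b, keep_rows, width=8):
    """Unsigned multiply that sums only the selected partial product rows.

    Row i is the partial product a * b[i] * 2**i. Rows not listed in
    `keep_rows` are dropped, which is what removes adder hardware.
    """
    result = 0
    for i in range(width):
        if i in keep_rows and (b >> i) & 1:
            result += a << i
    return result
```

Keeping every row reproduces the exact product, while keeping only rows 4 to 7 is equivalent to clearing the low four bits of `b` before multiplying. Sweeping `keep_rows` against classification accuracy is one way to explore the area, delay and accuracy trade-off the abstract describes.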

Design of delay efficient Booth multiplier using pipelining

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.16.11423 ◽

2018 ◽

Vol 7 (2.16) ◽

pp. 94

Author(s):

Abhishek Choubey ◽

SPV Subbarao ◽

Shruti B. Choubey

Keyword(s):

Critical Path ◽

Arithmetic Operation ◽

Vlsi Design ◽

Digital Signal ◽

Path Delay ◽

Large Area ◽

Booth Multiplier ◽

Critical Path Delay ◽

Long Latency ◽

Comparison Results

Multiplication is one of the most essential arithmetic operations, used in numerous applications in digital signal processing and communications. These applications need transformations, convolutions and dot products that involve an enormous number of multiplications of an operand with a constant. Typical examples include wavelet transforms and digital filters such as FIR or IIR. However, multiplier structures have a relatively large area-delay product, long latency and significantly higher power consumption than other arithmetic structures. Therefore, low-power multiplier design has always been a significant part of DSP structures for VLSI design. The Booth multiplier is promising as the most efficient of the multiplier types because it reduces complexity considerably compared with the others. In this paper, we propose a Booth multiplier using seamless pipelining. Theoretical comparison results show that the proposed Booth multiplier has a shorter critical path delay than the traditional Booth multiplier. ASIC simulation results show that the proposed radix-16 Booth multiplier has 13% less critical path delay for word width n=16 and 17% less critical path delay for bit width n=32 than the best existing radix-16 Booth multiplier.
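
For readers unfamiliar with Booth recoding, the Python sketch below shows the arithmetic behind a radix-2^k Booth multiplier, with k=4 giving the radix-16 case mentioned above. It is a behavioral model only, written under the assumption of two's-complement operands. The function names are illustrative, and the pipelining that the paper contributes is not represented.

```python
def booth_digits(y, width, k=4):
    """Recode a two's-complement `width`-bit integer into radix-2**k Booth digits.

    Each digit lies in [-2**(k-1), 2**(k-1)], least significant digit first.
    """
    assert width % k == 0, "width must be a multiple of k"
    y_bits = y & ((1 << width) - 1)           # raw two's-complement bit pattern
    digits = []
    prev = 0                                  # implicit bit to the right of bit 0
    for i in range(0, width, k):
        group = (y_bits >> i) & ((1 << k) - 1)
        msb = group >> (k - 1)
        # Signed value of the k-bit group plus the top bit of the previous group.
        digits.append(group - (msb << k) + prev)
        prev = msb
    return digits


def booth_multiply(x, y, width, k=4):
    """Multiply x by y using one partial product per Booth digit of y."""
    return sum(d * x * (1 << (k * i)) for i, d in enumerate(booth_digits(y, width, k)))
```

With k=4, a 16-bit multiplier operand yields 4 partial products instead of 16, at the cost of needing multiples of `x` up to 8x. That is the usual trade-off a radix-16 design has to manage.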

Modified RS encoder architecture with reduced critical path delay for high speed data communication

2017 International Conference on Intelligent Sustainable Systems (ICISS) ◽

10.1109/iss1.2017.8389244 ◽

2017 ◽

Author(s):

A. Deepa ◽

C. N. Marimuthu

Keyword(s):

High Speed ◽

Critical Path ◽

Data Communication ◽

Path Delay ◽

Critical Path Delay ◽

High Speed Data

Delay Modeling and Critical-Path Delay Calculation for MTCMOS Circuits

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1093/ietfec/e89-a.12.3482 ◽

2006 ◽

Vol E89-A (12) ◽

pp. 3482-3490

Author(s):

N. OHKUBO ◽

K. USAMI

Keyword(s):

Critical Path ◽

Path Delay ◽

Critical Path Delay

Approximate Array Multipliers

Electronics ◽

10.3390/electronics10050630 ◽

2021 ◽

Vol 10 (5) ◽

pp. 630

Author(s):

Padmanabhan Balasubramanian ◽

Raunaq Nayar ◽

Douglas L. Maskell

Keyword(s):

Critical Path ◽

Complementary Metal Oxide Semiconductor ◽

Oxide Semiconductor ◽

Total Power ◽

Path Delay ◽

Input And Output ◽

Critical Path Delay ◽

Standard Design ◽

Array Multiplier ◽

Array Multipliers

This article describes the design of approximate array multipliers by making vertical or horizontal cuts in an accurate array multiplier followed by different input and output assignments within the multiplier. We consider a digital image denoising application and show how different combinations of input and output assignments in an approximate array multiplier affect the quality of the denoised images. We consider the accurate array multiplier and several approximate array multipliers for synthesis. The multipliers were described in Verilog hardware description language and synthesized by Synopsys Design Compiler using a 32/28-nm complementary metal-oxide-semiconductor technology. The results show that compared to the accurate array multiplier, one of the proposed approximate array multipliers viz. PAAM01-V7 achieves a 28% reduction in critical path delay, 75.8% reduction in power, and 64.6% reduction in area while enabling the production of a denoised image that is comparable in quality to the image denoised using the accurate array multiplier. The standard design metrics such as critical path delay, total power dissipation, and area of the accurate and approximate multipliers are given, the error parameters of the approximate array multipliers are provided, and the original image, the noisy image, and the denoised images are also depicted for comparison.
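
A vertical cut can be pictured as discarding the low-order columns of the partial product array. The Python sketch below models that for an unsigned multiplier: every partial product bit whose column index is below `cut` is ignored. The function name, the `cut` parameter and the 8-bit default width are assumptions for illustration. The paper's specific input and output assignments (such as the PAAM01-V7 variant) are not reproduced here.

```python
def vertical_cut_multiply(a, b, cut, width=8):
    """Unsigned array multiply that ignores partial product bits in columns < cut.

    Bit a[i] AND b[j] sits in column i + j of the array. Dropping a column
    removes the adder cells that would have processed it.
    """
    result = 0
    for i in range(width):
        for j in range(width):
            if i + j >= cut:
                bit = ((a >> i) & 1) & ((b >> j) & 1)
                result += bit << (i + j)
    return result
```

With `cut=0` the model is exact. With `cut=4` on 8-bit operands the result never exceeds the true product and the shortfall is at most 49, the total weight of the ten discarded bit positions. This is why such cuts can leave image quality largely intact while removing hardware.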
