Wire-Length and Run-Time Optimization in FPGA Placement Using Hybrid Iterative Algorithms

Journal of Circuits, Systems and Computers ◽

10.1142/s021812662150081x ◽

2020 ◽

pp. 2150081

Author(s):

P. Sudhanya ◽

S. P. Joy Vasantha Rani

Keyword(s):

Critical Path ◽

Iterative Algorithms ◽

Path Delay ◽

Local Minima ◽

Wire Length ◽

Inertia Weight ◽

Critical Path Delay ◽

Field Programmable ◽

Fpga Placement ◽

Run Time

This paper introduces hybrid iterative algorithms that combine Particle Swarm Optimization (PSO) and Simulated Annealing (SA) for Field Programmable Gate Array (FPGA) placement, taking into account adaptive inertia weight and local minima avoidance. The algorithms aim to optimize the wire-length of the nets, the run time, and the critical path delay in the placement of logic blocks. Using the adaptive inertia weight parameter and local minima avoidance, the hybrid PSO-SA algorithm is modified into the Time-varying PSO-SA (TPSO-SA) and Modified PSO-SA (MPSO-SA) algorithms, respectively. The efficiency of these hybrid PSO-SA algorithms is checked by comparing them with the Versatile Place and Route (VPR) algorithm of the Verilog to Routing (VTR) tool on Microelectronics Centre of North Carolina (MCNC) benchmark circuits. The hybrid PSO-SA algorithms give 5–37% better wire-length cost and a 20–58% reduction in run time compared with the VPR placement algorithm on different benchmark circuits. Critical path delay is also taken into consideration.
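The three ingredients named in this abstract (a wire-length cost, a time-varying inertia weight, and an SA-style escape from local minima) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the half-perimeter wire-length estimate, the linearly decreasing weight schedule with endpoints 0.9 and 0.4, and the function names are common textbook choices, not details taken from the paper.

```python
import math
import random


def hpwl(placement, nets):
    """Half-perimeter wire length: sum of bounding-box half-perimeters of all nets.

    Assumed cost model for illustration; the paper's exact cost may differ.
    """
    total = 0
    for net in nets:
        xs = [placement[block][0] for block in net]
        ys = [placement[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def inertia_weight(step, max_steps, w_start=0.9, w_end=0.4):
    """Linearly decreasing PSO inertia weight (hypothetical schedule and endpoints)."""
    return w_start - (w_start - w_end) * step / max_steps


def sa_accept(delta_cost, temperature, rng=random):
    """Metropolis criterion: always accept improvements, sometimes accept worse moves.

    Accepting some cost increases is what lets the search leave a local minimum.
    The temperature must be positive.
    """
    if delta_cost <= 0:
        return True
    return rng.random() < math.exp(-delta_cost / temperature)


# Toy placement: block name -> (x, y) grid location, plus two nets.
placement = {"a": (0, 0), "b": (3, 1), "c": (1, 4)}
nets = [("a", "b"), ("a", "b", "c")]
print(hpwl(placement, nets))
```

A large inertia weight early on favours exploration of the placement grid, while a small weight late in the run favours refinement around the best positions found so far.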


ParaLarPD: Parallel FPGA Router Using Primal-Dual Sub-Gradient Method

Electronics ◽

10.3390/electronics8121439 ◽

2019 ◽

Vol 8 (12) ◽

pp. 1439 ◽

Cited By ~ 1

Author(s):

Rohit Agrawal ◽

Kapil Ahuja ◽

Chin Hau Hoo ◽

Tuan Duy Anh Nguyen ◽

Akash Kumar

Keyword(s):

Gradient Method ◽

Critical Path ◽

Design Flow ◽

Channel Width ◽

Path Delay ◽

Routing Metric ◽

Fpga Design ◽

Critical Path Delay ◽

Field Programmable ◽

Primal Dual

In the field programmable gate array (FPGA) design flow, one of the most time-consuming steps is the routing of nets, so there is a need to accelerate it. In recent work, Hoo et al. developed a linear programming (LP)-based framework that parallelizes this routing process to achieve significant speed-ups (the resulting algorithm is termed ParaLaR). However, this approach has two weaknesses: the solution violates constraints, and a standard routing metric could be improved. We address these two issues here. In this paper, we use the LP framework of ParaLaR and solve it using the primal–dual sub-gradient method, which better exploits the problem properties. We also propose a better way to update the size of the step taken by this iterative algorithm. We call our algorithm ParaLarPD. We perform experiments on a set of standard benchmarks with two different configurations, and show that our algorithm outperforms not just ParaLaR but also the standard existing algorithm VPR. We achieve a 20% average improvement in the constraints violation and in the standard metric of minimum channel width (the two are related) when compared with ParaLaR. When compared with VPR, we get an average improvement of 28% in the minimum channel width (there is no constraints violation in VPR). We obtain the same total wire length as ParaLaR, which is 49% better on average than that obtained by VPR. This is the original metric that ParaLaR was proposed to minimize. Next, we look at a third, easily measurable metric, the critical path delay. On average, ParaLarPD gives a 2% larger critical path delay than ParaLaR and a 3% better one than VPR. We achieve maximum relative speed-ups of up to seven times when running a parallel version of our algorithm using eight threads, as compared with the sequential implementation. These speed-ups are similar to those obtained by ParaLaR.
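To make the idea of a sub-gradient iteration on routing constraints concrete, here is a small Python sketch of a generic projected sub-gradient update on edge prices. It is not the ParaLarPD algorithm: the toy routing graph, the unit edge cost, the function names, and the diminishing step size 1/k are all assumptions chosen for illustration, and the abstract does not state the authors' actual step-size rule.

```python
def update_multipliers(multipliers, usage, capacity, step):
    """Projected sub-gradient step on the edge prices.

    lambda_e <- max(0, lambda_e + step * (usage_e - capacity_e))
    The sub-gradient for edge e is its constraint violation.
    """
    return {
        edge: max(0.0, lam + step * (usage.get(edge, 0) - capacity[edge]))
        for edge, lam in multipliers.items()
    }


def route(nets, multipliers):
    """Each net independently picks its cheapest candidate path under current prices."""
    return {
        net: min(paths, key=lambda path: sum(1.0 + multipliers[e] for e in path))
        for net, paths in nets.items()
    }


def edge_usage(choice):
    """Count how many nets use each edge."""
    usage = {}
    for path in choice.values():
        for edge in path:
            usage[edge] = usage.get(edge, 0) + 1
    return usage


# Hypothetical toy instance: three unit-capacity edges, two nets with candidate paths.
capacity = {"e1": 1, "e2": 1, "e3": 1}
nets = {
    "n1": [("e1",), ("e2", "e3")],
    "n2": [("e1",), ("e2",)],
}

multipliers = {edge: 0.0 for edge in capacity}
for k in range(1, 21):
    choice = route(nets, multipliers)
    usage = edge_usage(choice)
    # Diminishing step size 1/k: an assumed textbook rule, not the paper's update.
    multipliers = update_multipliers(multipliers, usage, capacity, step=1.0 / k)

print(choice, multipliers)
```

Given fixed prices, every net is routed independently of the others, which is the property that makes this style of formulation amenable to parallel routing.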


Layout-Aware Critical Path Delay Test Under Maximum Power Supply Noise Effects

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2011.2163159 ◽

2011 ◽

Vol 30 (12) ◽

pp. 1923-1934 ◽

Cited By ~ 18

Author(s):

Junxia Ma ◽

Mohammad Tehranipoor

Keyword(s):

Power Supply ◽

Critical Path ◽

Maximum Power ◽

Path Delay ◽

Power Supply Noise ◽

Delay Test ◽

Noise Effects ◽

Critical Path Delay ◽

Supply Noise ◽

Path Delay Test


Exploring Linear Structures of Critical Path Delay Faults to Reduce Test Efforts

2006 IEEE/ACM International Conference on Computer Aided Design ◽

10.1109/iccad.2006.320072 ◽

2006 ◽

Author(s):

Shun-yen Lu ◽

Pei-ying Hsieh ◽

Jing-jia Liou

Keyword(s):

Critical Path ◽

Delay Faults ◽

Path Delay ◽

Path Delay Faults ◽

Linear Structures ◽

Critical Path Delay


A design methodology for approximate multipliers in convolutional neural networks: A case of MNIST

International Journal of Reconfigurable and Embedded Systems (IJRES) ◽

10.11591/ijres.v10.i1.pp1-10 ◽

2021 ◽

Vol 10 (1) ◽

pp. 1

Author(s):

Kenta Shirane ◽

Takahiro Yamamoto ◽

Hiroyuki Tomiyama

Keyword(s):

Neural Network ◽

Neural Networks ◽

Convolutional Neural Network ◽

Design Methodology ◽

Critical Path ◽

High Accuracy ◽

Path Delay ◽

Trade Off ◽

Critical Path Delay

In this paper, we present a case study on approximate multipliers for an MNIST Convolutional Neural Network (CNN). We apply approximate multipliers with different bit-widths to the convolution layer of the MNIST CNN, evaluate the accuracy of MNIST classification, and analyze the trade-off between the approximate multiplier's area, its critical path delay, and the accuracy. Based on the results of the evaluation and analysis, we propose a design methodology for approximate multipliers. The approximate multipliers consist of a subset of the partial products, carefully selected according to the CNN input. With this methodology, we further reduce the area and the delay of the multipliers while maintaining high accuracy of the MNIST classification.
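The notion of a multiplier built from only some partial products can be sketched in a few lines of Python. This is a behavioural model for unsigned operands under an assumed selection rule; the function name `approx_multiply` and the `keep` predicate are hypothetical, and the rule used in the example (drop the two lowest-weight columns) is only a stand-in for the input-driven selection the abstract mentions.

```python
def approx_multiply(a, b, bits, keep):
    """Unsigned multiply that sums only the partial products selected by keep(i, j).

    Partial product (i, j) is bit i of a AND bit j of b, weighted by 2**(i + j).
    Dropping a partial product removes one AND gate and one adder input in hardware.
    """
    result = 0
    for i in range(bits):
        for j in range(bits):
            if keep(i, j) and (a >> i) & 1 and (b >> j) & 1:
                result += 1 << (i + j)
    return result


def keep_all(i, j):
    """Exact multiplier: every partial product is kept."""
    return True


def drop_low_columns(i, j):
    """Hypothetical selection rule: discard partial products of weight below 2**2."""
    return i + j >= 2


exact = approx_multiply(7, 5, bits=4, keep=keep_all)
approx = approx_multiply(7, 5, bits=4, keep=drop_low_columns)
print(exact, approx, exact - approx)
```

Sweeping such a predicate over candidate selections and measuring classification accuracy for each is one way to explore the area, delay and accuracy trade-off described above.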


Design of delay efficient Booth multiplier using pipelining

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.16.11423 ◽

2018 ◽

Vol 7 (2.16) ◽

pp. 94

Author(s):

Abhishek Choubey ◽

SPV Subbarao ◽

Shruti B. Choubey

Keyword(s):

Critical Path ◽

Arithmetic Operation ◽

Vlsi Design ◽

Digital Signal ◽

Path Delay ◽

Large Area ◽

Booth Multiplier ◽

Critical Path Delay ◽

Long Latency ◽

Comparison Results

Multiplication is one of the most essential arithmetic operations used in numerous applications in digital signal processing and communications. These applications need transformations, convolutions and dot products that involve an enormous number of multiplications of an operand by a constant. Typical examples include wavelets and digital filters such as FIR or IIR. However, multiplier structures have a relatively large area-delay product, long latency and significantly higher power consumption compared with other arithmetic structures. Therefore, low-power multiplier design has always been a significant part of DSP structures for VLSI design. The Booth multiplier is promising as the most efficient among multipliers, as it reduces complexity considerably compared with the others. In this paper, we propose a Booth multiplier using seamless pipelining. Theoretical comparison results show that the proposed Booth multiplier has a shorter critical path delay than the traditional Booth multiplier. ASIC simulation results show that the proposed radix-16 Booth multiplier has 13% less critical path delay for word width n=16 and 17% less critical path delay for bit width n=32 compared with the best existing radix-16 Booth multiplier.
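For readers unfamiliar with radix-16 Booth multiplication, the following Python sketch shows the textbook recoding step: an n-bit two's-complement multiplier is rewritten as n/4 signed digits in the range -8 to 8, so only n/4 partial products need to be summed. This is a behavioural model of standard Booth recoding, not the authors' pipelined architecture; the function names are hypothetical.

```python
def booth_digits(y, n, k=4):
    """Radix-2**k Booth recoding of an n-bit two's-complement integer y.

    Returns digits d_i (least significant first), each in [-2**(k-1), 2**(k-1)],
    such that y == sum(d_i * 2**(k * i)). With k=4 this is radix-16.
    """
    assert n % k == 0, "word width must be a multiple of k"
    assert -(1 << (n - 1)) <= y < (1 << (n - 1)), "y must fit in n bits"
    pattern = y & ((1 << n) - 1)  # two's-complement bit pattern of y

    def bit(position):
        # The bit below the least significant bit is defined as 0.
        return 0 if position < 0 else (pattern >> position) & 1

    digits = []
    for i in range(n // k):
        base = k * i
        # Overlap bit from the group below, minus the weighted top bit of this group.
        digit = bit(base - 1) - (bit(base + k - 1) << (k - 1))
        for j in range(k - 1):
            digit += bit(base + j) << j
        digits.append(digit)
    return digits


def booth_multiply(x, y, n, k=4):
    """Multiply x by y by summing one shifted multiple of x per Booth digit."""
    return sum(d * x << (k * i) for i, d in enumerate(booth_digits(y, n, k)))


print(booth_digits(100, 16), booth_multiply(-37, 100, 16))
```

In hardware, each digit selects a precomputed multiple of the multiplicand, so a higher radix trades fewer partial-product rows for more complex multiple generation.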


Modified RS encoder architecture with reduced critical path delay for high speed data communication

2017 International Conference on Intelligent Sustainable Systems (ICISS) ◽

10.1109/iss1.2017.8389244 ◽

2017 ◽

Author(s):

A. Deepa ◽

C. N. Marimuthu

Keyword(s):

High Speed ◽

Critical Path ◽

Data Communication ◽

Path Delay ◽

Critical Path Delay ◽

High Speed Data


Delay Modeling and Critical-Path Delay Calculation for MTCMOS Circuits

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1093/ietfec/e89-a.12.3482 ◽

2006 ◽

Vol E89-A (12) ◽

pp. 3482-3490

Author(s):

N. OHKUBO ◽

K. USAMI

Keyword(s):

Critical Path ◽

Path Delay ◽

Critical Path Delay


Approximate Array Multipliers

Electronics ◽

10.3390/electronics10050630 ◽

2021 ◽

Vol 10 (5) ◽

pp. 630

Author(s):

Padmanabhan Balasubramanian ◽

Raunaq Nayar ◽

Douglas L. Maskell

Keyword(s):

Critical Path ◽

Complementary Metal Oxide Semiconductor ◽

Oxide Semiconductor ◽

Total Power ◽

Path Delay ◽

Input And Output ◽

Critical Path Delay ◽

Standard Design ◽

Array Multiplier ◽

Array Multipliers

This article describes the design of approximate array multipliers by making vertical or horizontal cuts in an accurate array multiplier followed by different input and output assignments within the multiplier. We consider a digital image denoising application and show how different combinations of input and output assignments in an approximate array multiplier affect the quality of the denoised images. We consider the accurate array multiplier and several approximate array multipliers for synthesis. The multipliers were described in Verilog hardware description language and synthesized by Synopsys Design Compiler using a 32/28-nm complementary metal-oxide-semiconductor technology. The results show that compared to the accurate array multiplier, one of the proposed approximate array multipliers viz. PAAM01-V7 achieves a 28% reduction in critical path delay, 75.8% reduction in power, and 64.6% reduction in area while enabling the production of a denoised image that is comparable in quality to the image denoised using the accurate array multiplier. The standard design metrics such as critical path delay, total power dissipation, and area of the accurate and approximate multipliers are given, the error parameters of the approximate array multipliers are provided, and the original image, the noisy image, and the denoised images are also depicted for comparison.


Exploring Shared SRAM Tables in FPGAs for Larger LUTs and Higher Degree of Sharing

International Journal of Reconfigurable Computing ◽

10.1155/2017/7021056 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Ali Asghar ◽

Muhammad Mazher Iqbal ◽

Waqar Ahmed ◽

Mujahid Ali ◽

Husain Parvez ◽

...

Keyword(s):

High Performance ◽

Critical Path ◽

Path Delay ◽

Gate Arrays ◽

Area Reduction ◽

Area Overhead ◽

Logic Block ◽

Field Programmable ◽

Boolean Matching ◽

Programmable Gate Arrays

In modern SRAM-based Field Programmable Gate Arrays, a Look-Up Table (LUT) is the principal constituent logic element and can realize every possible Boolean function of its inputs. However, this flexibility of LUTs comes with a heavy area penalty. Part of this area overhead comes from the configuration memory, which grows exponentially as the LUT size increases. In this paper, we first present a detailed analysis of a previously proposed FPGA architecture which allows sharing of LUT memory (SRAM) tables among NPN-equivalent functions, to reduce the area as well as the number of configuration bits. We then propose several methods to improve the existing architecture. A new clustering technique is proposed which packs NPN-equivalent functions together inside a Configurable Logic Block (CLB). We also make use of a recently proposed high-performance Boolean matching algorithm to perform NPN classification. To enhance area savings further, we evaluate the feasibility of more than two LUTs sharing the same SRAM table. Consequently, this work explores the SRAM table sharing approach for a range of LUT sizes (4–7), while varying the cluster sizes (4–16). Experimental results on the MCNC benchmark circuit set show an overall area reduction of ~7% while maintaining the same critical path delay.
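NPN equivalence, which decides whether two LUT functions could share one SRAM table, can be demonstrated with a brute-force Python sketch. Two functions are NPN-equivalent if one can be obtained from the other by negating inputs, permuting inputs, and optionally negating the output. The code below enumerates all such transforms and takes the smallest resulting truth table as a canonical representative; this exhaustive search is an assumption made for clarity and is practical only for small LUTs, unlike the high-performance Boolean matching algorithm the abstract refers to. Function names are hypothetical.

```python
from itertools import permutations


def transform(tt, n, perm, neg_mask, neg_out):
    """Apply input negations, an input permutation and optional output negation.

    tt is a truth table packed into an integer: bit x holds f(x) for minterm x.
    The result g satisfies g(x) = f(P(x XOR neg_mask)) XOR neg_out.
    """
    out = 0
    for x in range(1 << n):
        source = 0
        for i in range(n):
            bit = ((x >> i) & 1) ^ ((neg_mask >> i) & 1)
            source |= bit << perm[i]
        value = ((tt >> source) & 1) ^ neg_out
        out |= value << x
    return out


def npn_canonical(tt, n):
    """Smallest truth table in the NPN class of tt (exhaustive search)."""
    return min(
        transform(tt, n, perm, neg_mask, neg_out)
        for perm in permutations(range(n))
        for neg_mask in range(1 << n)
        for neg_out in (0, 1)
    )


AND2, OR2, XOR2 = 0b1000, 0b1110, 0b0110
print(npn_canonical(AND2, 2) == npn_canonical(OR2, 2))   # same class (De Morgan)
print(npn_canonical(AND2, 2) == npn_canonical(XOR2, 2))  # different classes
```

A k-input LUT stores 2**k configuration bits, so grouping functions by canonical form indicates which LUTs in a cluster are candidates for sharing a single table.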
