Research on IEEE 754 Standard Single Precision Floating Point Multipliers Designed using Urdhva Triyagbhyam Sutra of Vedic Mathematics

2019 ◽  
Vol 8 (2S11) ◽  
pp. 2990-2993

Multiplication of floating-point numbers is a major operation in digital signal processing, so the performance of floating-point multipliers plays a primary role in any digital design. Floating-point numbers are represented using the IEEE 754 standard in single precision (32 bits), double precision (64 bits) and quadruple precision (128 bits) formats. Multiplication of these floating-point numbers can be carried out using Vedic mathematics. Vedic mathematics comprises sixteen distinct algorithms, or Sutras. The Urdhva Triyagbhyam Sutra is the one most commonly applied to the multiplication of binary numbers. This paper presents a comparison of the work done by various researchers on the design of IEEE 754 standard single-precision floating-point multipliers using Vedic mathematics.
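
The Urdhva Triyagbhyam ("vertically and crosswise") scheme named in the abstract above can be illustrated in software. The following is a minimal sketch, not the design of any of the surveyed papers: the function name `urdhva_multiply` and the `width` parameter are hypothetical, and a hardware version would compute the column sums in parallel rather than in a loop.

```python
def urdhva_multiply(a: int, b: int, width: int = 24) -> int:
    """Multiply two unsigned integers of `width` bits column by column.

    Column k collects every crosswise partial product a_i * b_j with
    i + j == k, adds the carry from the previous column, keeps the low
    bit as result bit k, and passes the rest on as the next carry.
    """
    if a < 0 or b < 0 or a >> width or b >> width:
        raise ValueError("operands must fit in `width` unsigned bits")

    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    result = 0
    carry = 0
    for k in range(2 * width - 1):
        lo = max(0, k - width + 1)   # smallest valid index i in this column
        hi = min(k, width - 1)       # largest valid index i in this column
        column = carry + sum(a_bits[i] & b_bits[k - i] for i in range(lo, hi + 1))
        result |= (column & 1) << k
        carry = column >> 1

    # Whatever carry remains forms the top bit(s) of the product.
    result |= carry << (2 * width - 1)
    return result


if __name__ == "__main__":
    # 24-bit operands match the significand width of IEEE 754 single precision.
    print(urdhva_multiply(0xC00000, 0xA00000))
```

The default width of 24 is chosen because a single-precision significand has 23 stored bits plus one implicit leading bit.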

2021 ◽  
Author(s):  
Sam Hatfield ◽  
Kristian Mogensen ◽  
Peter Dueben ◽  
Nils Wedi ◽  
Michail Diamantakis

Earth-System models traditionally use double-precision, 64 bit floating-point numbers to perform arithmetic. According to orthodoxy, we must use such a relatively high level of precision in order to minimise the potential impact of rounding errors on the physical fidelity of the model. However, given the inherently imperfect formulation of our models, and the computational benefits of lower precision arithmetic, we must question this orthodoxy. At ECMWF, a single-precision, 32 bit variant of the atmospheric model IFS has been undergoing rigorous testing in preparation for operations for around 5 years. The single-precision simulations have been found to have effectively the same forecast skill as the double-precision simulations while finishing in 40% less time, thanks to the memory and cache benefits of single-precision numbers. Following these positive results, other modelling groups are now also considering single-precision as a way to accelerate their simulations.

In this presentation I will present the rationale behind the move to lower-precision floating-point arithmetic and up-to-date results from the single-precision atmospheric model at ECMWF, which will be operational imminently. I will then provide an update on the development of the single-precision ocean component at ECMWF, based on the NEMO ocean model, including a verification of quarter-degree simulations. I will also present new results from running ECMWF's coupled atmosphere-ocean-sea-ice-wave forecasting system entirely with single-precision. Finally I will discuss the feasibility of even lower levels of precision, like half-precision, which are now becoming available through GPU- and ARM-based systems such as Summit and Fugaku, respectively. The use of reduced-precision floating-point arithmetic will be an essential consideration for developing high-resolution, storm-resolving Earth-System models.


2016 ◽  
Author(s):  
Charles S. Zender

Abstract. Lossy compression schemes can help reduce the space required to store the false precision (i.e., scientifically meaningless data bits) that geoscientific models and measurements generate. We introduce, implement, and characterize a new lossy compression scheme suitable for IEEE floating-point data. Our new Bit Grooming algorithm alternately shaves (to zero) and sets (to one) the least significant bits of consecutive values to preserve a desired precision. This is a symmetric, two-sided variant of an algorithm sometimes called Bit Shaving which quantizes values solely by zeroing bits. Our variation eliminates the artificial low-bias produced by always zeroing bits, and makes Bit Grooming more suitable for arrays and multi-dimensional fields whose mean statistics are important. Bit Grooming relies on standard lossless compression schemes to achieve the actual reduction in storage space, so we tested Bit Grooming by applying the DEFLATE compression algorithm to bit-groomed and full-precision climate data stored in netCDF3, netCDF4, HDF4, and HDF5 formats. Bit Grooming reduces the storage space required by uncompressed and compressed climate data by up to 50 % and 20 %, respectively, for single-precision data (the most common case for climate data). When used aggressively (i.e., preserving only 1–3 decimal digits of precision), Bit Grooming produces storage reductions comparable to other quantization techniques such as linear packing. Unlike linear packing, Bit Grooming works on the full representable range of floating-point data. Bit Grooming reduces the volume of single-precision compressed data by roughly 10 % per decimal digit quantized (or "groomed") after the third such digit, up to a maximum reduction of about 50 %. The potential reduction is greater for double-precision datasets. Data quantization by Bit Grooming is irreversible (i.e., lossy) yet transparent, meaning that no extra processing is required by data users/readers. Hence Bit Grooming can easily reduce data storage volume without sacrificing scientific precision or imposing extra burdens on users.


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Nikolaos Alachiotis ◽  
Alexandros Stamatakis

The use of reconfigurable computing for accelerating floating-point intensive codes is becoming common due to the availability of DSPs in new-generation FPGAs. We present the design of an efficient, pipelined floating-point datapath for calculating the logarithm function on reconfigurable devices. We integrate the datapath into a stand-alone LUT-based (Lookup Table) component, the LAU (Logarithm Approximation Unit). We extended the LAU, by integrating two architecturally independent, LAU-based datapaths into a larger component, the VLAU (vector-like LAU). The VLAU produces 2 results/cycle, while occupying the same amount of memory as the LAU. Under single precision, one LAU is 12 and 1.7 times faster than the GNU and Intel Math Kernel Library (MKL) implementations, respectively. The LAU is also 1.6 times faster than the FloPoCo reconfigurable logarithm architecture. Under double precision, one LAU is 20 and 2.6 times faster than the respective GNU and MKL functions and 1.4 times faster than the FloPoCo logarithm. The VLAU is approximately twice as fast as the LAU, both under single and double precision.
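
To illustrate the general idea of a lookup-table-based logarithm, the sketch below splits the input into exponent and significand and reads the significand's logarithm from a small table. It is a generic software illustration and makes no claim about the internal structure of the LAU or VLAU: the table size `LUT_BITS`, the linear interpolation between entries, and the function name `lut_log` are all assumptions of this example.

```python
import math

LUT_BITS = 8  # hypothetical table size: 2**8 intervals over [1, 2)
LUT = [math.log(1.0 + i / (1 << LUT_BITS)) for i in range((1 << LUT_BITS) + 1)]
LN2 = math.log(2.0)


def lut_log(x: float) -> float:
    """Approximate the natural logarithm of a positive finite x.

    Writes x = m * 2**e with 1 <= m < 2, so that
    ln(x) = e * ln(2) + ln(m), and takes ln(m) from the table
    with linear interpolation between neighbouring entries.
    """
    if not (x > 0.0 and math.isfinite(x)):
        raise ValueError("x must be positive and finite")

    m, e = math.frexp(x)   # x = m * 2**e with 0.5 <= m < 1
    m *= 2.0               # rescale so that 1 <= m < 2
    e -= 1

    position = (m - 1.0) * (1 << LUT_BITS)
    index = int(position)          # table interval containing m
    fraction = position - index    # offset inside that interval
    log_m = LUT[index] + fraction * (LUT[index + 1] - LUT[index])
    return e * LN2 + log_m


if __name__ == "__main__":
    for value in (0.5, 1.0, 3.0, 1e6):
        print(value, lut_log(value), math.log(value))
```

With 256 intervals and linear interpolation the absolute error stays on the order of 1e-6; a larger table or a higher-order correction would be needed to approach full single or double precision.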


Currently, each CPU has one or more Floating Point Units (FPUs) integrated inside it. FPUs are typically used in math-intensive applications, such as digital signal processing. They are found in engineering, medical and military fields, as well as in other fields requiring audio, image or video processing. A high-speed and energy-efficient floating point unit is naturally needed in the electronics industry as an arithmetic unit in microprocessors. Multiplication and addition account for 95% of the operations of a conventional FPU. Many applications need fast execution of arithmetic operations. In the existing system, the FPM (Floating Point Multiplication) and FPA (Floating Point Addition) have more delay, lower speed and lower throughput. To meet the demand for high speed and throughput, the multiplier and adder blocks within the FPM and FPA are designed for single-precision and double-precision floating point operation and are internally pipelined to achieve high throughput; both formats are supported by the IEEE 754 standard floating point representations. The design is written in Verilog, and the Xilinx ISE 14.5 software tool is used to code and verify the resulting waveforms.


2014 ◽  
Vol 142 (10) ◽  
pp. 3809-3829 ◽  
Author(s):  
Peter D. Düben ◽  
T. N. Palmer

Abstract A reduction of computational cost would allow higher resolution in numerical weather predictions within the same budget for computation. This paper investigates two approaches that promise significant savings in computational cost: the use of reduced precision hardware, which reduces floating point precision beyond the standard double- and single-precision arithmetic, and the use of stochastic processors, which allow hardware faults in a trade-off between reduced precision and savings in power consumption and computing time. Reduced precision is emulated within simulations of a spectral dynamical core of a global atmosphere model and a detailed study of the sensitivity of different parts of the model to inexact hardware is performed. Afterward, benchmark simulations were performed for which as many parts of the model as possible were put onto inexact hardware. Results show that large parts of the model could be integrated with inexact hardware at error rates that are surprisingly high or with reduced precision to only a couple of bits in the significand of floating point numbers. However, the sensitivities to inexact hardware of different parts of the model need to be respected, for example, via scale separation. In the last part of the paper, simulations with a full operational weather forecast model in single precision are presented. It is shown that differences in accuracy between the single- and double-precision forecasts are smaller than differences between ensemble members of the ensemble forecast at the resolution of the standard ensemble forecasting system. The simulations prove that the trade-off between precision and performance is a worthwhile effort, already on existing hardware.
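
Emulating reduced precision in software, as this abstract describes, can be approximated by rounding each double-precision result to a smaller number of significand bits. The sketch below is a simplified stand-in and not the emulator used in the paper: the function name `reduce_precision` and the `sig_bits` parameter are hypothetical, and the exponent range is left at that of double precision.

```python
import math


def reduce_precision(x: float, sig_bits: int) -> float:
    """Round x to `sig_bits` significant binary digits.

    sig_bits counts the leading bit as well, so sig_bits=24 mimics the
    significand of single precision and sig_bits=53 leaves a double
    unchanged. Ties are rounded to even, as Python's round() does.
    """
    if not 1 <= sig_bits <= 53:
        raise ValueError("sig_bits must be between 1 and 53")
    if x == 0.0 or not math.isfinite(x):
        return x

    mantissa, exponent = math.frexp(x)   # x = mantissa * 2**exponent, 0.5 <= |mantissa| < 1
    scale = 1 << sig_bits
    rounded = round(mantissa * scale) / scale
    return math.ldexp(rounded, exponent)


def logistic_map(x0: float, steps: int, sig_bits: int) -> float:
    """Iterate x -> 4x(1-x), rounding after every arithmetic operation."""
    x = reduce_precision(x0, sig_bits)
    for _ in range(steps):
        one_minus_x = reduce_precision(1.0 - x, sig_bits)
        x = reduce_precision(4.0 * reduce_precision(x * one_minus_x, sig_bits), sig_bits)
    return x


if __name__ == "__main__":
    # Compare a short chaotic iteration at several emulated precisions.
    for bits in (53, 24, 11, 6):
        print(bits, logistic_map(0.2, 10, bits))
```

Rounding after each operation is the important design choice: rounding only the final result would hide the accumulation of errors that such sensitivity studies are meant to expose.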


2019 ◽  
Vol 8 (2S3) ◽  
pp. 1064-1067

Multiplication of floating point (FP) numbers is of great importance in many DSP applications. The performance of DSPs is substantially determined by the speed of the multipliers used. This paper proposes the design and implementation of an IEEE 754 standard single precision FP multiplier using Verilog, synthesized and simulated in Xilinx ISE 10.1. The Urdhva Triyagbhyam Sutra of Vedic mathematics is used for the unsigned mantissa calculation. The design implements floating point multiplication with sign bit and exponent calculations. The proposed design achieves high speed with a minimum delay of 3.997 ns.
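
The three parts of the multiplier named in this abstract (sign bit, exponent and unsigned mantissa) can be traced in a short software model. This is a sketch under stated assumptions, not the paper's Verilog design: it handles normal operands and normal results only, it uses round-to-nearest-even (the abstract does not state a rounding mode), and the plain `*` on the mantissas stands in for the Urdhva Triyagbhyam multiplier. All function names are hypothetical.

```python
import struct


def f32_bits(x: float) -> int:
    """Round x to float32 and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(u: int) -> float:
    """Interpret a 32-bit pattern as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


def fp32_multiply(a_bits: int, b_bits: int) -> int:
    """Multiply two normal float32 bit patterns and return the result pattern."""
    sign = ((a_bits ^ b_bits) >> 31) & 1          # sign: XOR of the sign bits
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a in (0, 255) or exp_b in (0, 255):
        raise ValueError("sketch handles normal operands only")

    # Restore the implicit leading 1 to obtain 24-bit unsigned mantissas.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000

    exponent = exp_a + exp_b - 127                # add exponents, remove one bias
    product = man_a * man_b                       # 48-bit mantissa product

    # Normalize: the product of two values in [1, 2) lies in [1, 4).
    if product >> 47:
        shift = 24
        exponent += 1
    else:
        shift = 23
    mantissa = product >> shift
    remainder = product & ((1 << shift) - 1)

    # Round to nearest, ties to even (an assumption of this sketch).
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and mantissa & 1):
        mantissa += 1
        if mantissa >> 24:                        # rounding overflowed to 2.0
            mantissa >>= 1
            exponent += 1

    if not 1 <= exponent <= 254:
        raise OverflowError("result is outside the normal float32 range")
    return (sign << 31) | (exponent << 23) | (mantissa & 0x7FFFFF)


if __name__ == "__main__":
    result = fp32_multiply(f32_bits(1.5), f32_bits(-2.5))
    print(hex(result), bits_f32(result))
```

Zeros, subnormals, infinities and NaNs are rejected rather than handled, which keeps the data path visible; a complete multiplier would add those special cases around the same core.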


2019 ◽  
Vol 8 (4) ◽  
pp. 8533-8538

Nowadays every application calls for a process that is rapid, efficient and simple. The Fast Fourier Transform (FFT) is an efficient algorithm for computing the N-point DFT. It has important applications in communication, signal and image processing, and instrumentation. One of the challenges in implementing the FFT is the complex multiplications, so to make the process rapid and simple the multiplier must be fast and power efficient. To tackle this problem, the Karatsuba algorithm and the Nikhilam sutra of Vedic Mathematics offer efficient methods of multiplication. This paper presents a design methodology for a Double Precision Floating Point Fast Fourier Transform (FFT) Processor. The execution time and complexity can be reduced by using Vedic algorithms. The main aim is to make the FFT processor rapid and simple by designing a fast and power-efficient multiplier using double precision floating point and Vedic Mathematics concepts.
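
The Karatsuba method named in this abstract replaces one wide multiplication by three narrower ones. The sketch below shows the recursion on unsigned integers, such as the 53-bit significands of double precision. It is an illustration of the textbook algorithm, not the paper's processor design, and the name `karatsuba` and the `threshold_bits` cut-off are hypothetical.

```python
def karatsuba(x: int, y: int, threshold_bits: int = 16) -> int:
    """Multiply two non-negative integers with Karatsuba's three-product split."""
    if x < 0 or y < 0:
        raise ValueError("operands must be non-negative")
    if threshold_bits < 4:
        raise ValueError("threshold_bits must be at least 4")

    # Small operands: fall back to a direct multiplication.
    if x.bit_length() <= threshold_bits or y.bit_length() <= threshold_bits:
        return x * y

    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    high = karatsuba(x_hi, y_hi, threshold_bits)
    low = karatsuba(x_lo, y_lo, threshold_bits)
    # One extra product yields both cross terms: x_hi*y_lo + x_lo*y_hi.
    cross = karatsuba(x_hi + x_lo, y_hi + y_lo, threshold_bits) - high - low

    return (high << (2 * half)) + (cross << half) + low


if __name__ == "__main__":
    a = (1 << 52) | 0x123456789ABCD   # a 53-bit significand-sized operand
    b = (1 << 52) | 0xFEDCBA9876543
    print(karatsuba(a, b) == a * b)
```

The saving comes from computing the two cross terms with a single multiplication and two subtractions, at the cost of a few extra additions.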

