IP Core Based on the Kalman Filter Algorithm in the FPGA Implementation

This article discusses the process of implementing a one-dimensional Kalman filter algorithm as an FPGA hardware IP core. First, the matrix operations are programmed on the FPGA in double-precision floating point. Then the Kalman filter algorithm is programmed in MATLAB to verify the correctness of the algorithm. Finally, the MATLAB implementation is converted into VHDL. Using 64-bit double-precision floating-point data, the design realizes a 1-D Kalman filter IP core that lets the Kalman filter deliver both the high precision and the high speed needed to execute a complex algorithm.
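
As an illustration of the kind of algorithm the abstract describes, the following is a minimal Python sketch of a textbook one-dimensional (scalar) Kalman filter with a constant-state model. It is not the article's MATLAB or VHDL code; the class name `ScalarKalman`, the helper `run_filter`, and the noise variances `q` and `r` are assumptions chosen for illustration. Python floats are normally 64-bit doubles, which matches the precision the abstract targets.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Textbook 1-D Kalman filter (illustrative names, not from the article)."""
    x: float  # current state estimate
    p: float  # variance of the estimate
    q: float  # process noise variance (assumed value)
    r: float  # measurement noise variance (assumed value)

    def step(self, z: float) -> float:
        # Predict: the state is modeled as constant, so only the variance grows.
        p_pred = self.p + self.q
        # Update: the Kalman gain weights the measurement against the prediction.
        k = p_pred / (p_pred + self.r)
        self.x = self.x + k * (z - self.x)
        self.p = (1.0 - k) * p_pred
        return self.x


def run_filter(measurements, x0=0.0, p0=1.0, q=1e-5, r=0.01):
    """Filter a sequence of scalar measurements and return all estimates."""
    kf = ScalarKalman(x0, p0, q, r)
    return [kf.step(z) for z in measurements]
```

Calling `run_filter([1.0] * 50)` starts from an estimate of 0 and moves toward the constant measurement. A hardware version would replace each arithmetic operation with a floating-point unit, which this sketch does not model.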


FPC: A High-Speed Compressor for Double-Precision Floating-Point Data

IEEE Transactions on Computers ◽ 10.1109/tc.2008.131 ◽ 2009 ◽ Vol 58 (1) ◽ pp. 18-31 ◽ Cited By ~ 107

Author(s): Martin Burtscher ◽ Paruj Ratanaworabhan

Keyword(s): High Speed ◽ Floating Point ◽ Double Precision ◽ Point Data


Lossless Compression of Double-Precision Floating-Point Data for Numerical Simulations: Highly Parallelizable Algorithms for GPU Computing

IEICE Transactions on Information and Systems ◽ 10.1587/transinf.e95.d.2778 ◽ 2012 ◽ Vol E95.D (12) ◽ pp. 2778-2786

Author(s): Mamoru OHARA ◽ Takashi YAMAGUCHI

Keyword(s): Numerical Simulations ◽ GPU Computing ◽ Lossless Compression ◽ Floating Point ◽ Double Precision ◽ Point Data


Implementation of Embedded Floating Point Arithmetic Units on FPGA

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.550.126 ◽ 2014 ◽ Vol 550 ◽ pp. 126-136

Author(s): N. Ramya Rani

Keyword(s): High Speed ◽ High Performance ◽ Floating Point ◽ Double Precision ◽ Embedded Computing ◽ Floating Point Arithmetic ◽ Field Programmable ◽ Programmable Gate Arrays ◽ Arithmetic Units ◽ Point Arithmetic

Floating point arithmetic plays a major role in scientific and embedded computing applications. However, the performance of field programmable gate arrays (FPGAs) in floating point applications is poor because of the complexity of floating point arithmetic. Implementing floating point units on FPGAs consumes a large amount of resources, which has led to the development of embedded floating point units in FPGAs. Embedded applications such as multimedia, communication, and DSP algorithms use floating point arithmetic for graphics processing, Fourier transforms, coding, and so on. In this paper, methodologies are presented for the implementation of embedded floating point units on FPGA. The work aims to achieve high computation speed and to reduce the power needed to evaluate expressions. An application that demands high-performance floating point computation can achieve better speed and density by incorporating embedded floating point units. Additionally, this paper describes a comparative study of the design of single-precision and double-precision pipelined floating point arithmetic units for evaluating expressions. The modules are designed and simulated in VHDL using Xilinx software and implemented on VIRTEX and SPARTAN FPGAs.


Design of a Reconfigurable Coprocessor for Double Precision Floating Point Matrix Algorithms

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.58-60.1037 ◽ 2011 ◽ Vol 58-60 ◽ pp. 1037-1042

Author(s): Sheng Long Li ◽ Zhao Lin Li ◽ Qing Wei Zheng

Keyword(s): High Performance ◽ CMOS Technology ◽ General Purpose ◽ Floating Point ◽ Double Precision ◽ Synthesis Time ◽ Matrix Algorithms ◽ Matrix Operations ◽ On Chip ◽ Software Execution

Double-precision floating point matrix operations are widely used in a variety of engineering and scientific computing applications. However, performing these operations in software on general-purpose processors is inefficient. To reduce the processing time and satisfy real-time demands, this paper proposes a reconfigurable coprocessor for double-precision floating point matrix algorithms. The coprocessor is embedded in a Multi-Processor System on Chip (MPSoC), where it cooperates with an ARM core and a DSP core for high-performance control and calculation. An algorithm from GPS applications is used as an example to illustrate the efficiency of the proposed coprocessor. The experimental results show that, for the quaternion algorithm of attitude solution in an inertial navigation application, the coprocessor achieves a speedup of a factor of 50 compared with the software execution time on a TI C6713 DSP. The coprocessor is implemented in SMIC 0.13 μm CMOS technology; the synthesis time delay is 9.75 ns, and the power consumption is 63.69 mW at 100 MHz.


Bit Grooming: Statistically accurate precision-preserving quantization with compression, evaluated in the netCDF Operators (NCO, v4.4.8+)

10.5194/gmd-2016-63 ◽ 2016

Author(s): Charles S. Zender

Keyword(s): Data Storage ◽ Lossy Compression ◽ Floating Point ◽ Decimal Digit ◽ Double Precision ◽ Storage Space ◽ Climate Data ◽ Single Precision ◽ Precision Data ◽ Point Data

Abstract. Lossy compression schemes can help reduce the space required to store the false precision (i.e., scientifically meaningless data bits) that geoscientific models and measurements generate. We introduce, implement, and characterize a new lossy compression scheme suitable for IEEE floating-point data. Our new Bit Grooming algorithm alternately shaves (to zero) and sets (to one) the least significant bits of consecutive values to preserve a desired precision. This is a symmetric, two-sided variant of an algorithm sometimes called Bit Shaving, which quantizes values solely by zeroing bits. Our variation eliminates the artificial low bias produced by always zeroing bits, and makes Bit Grooming more suitable for arrays and multi-dimensional fields whose mean statistics are important. Bit Grooming relies on standard lossless compression schemes to achieve the actual reduction in storage space, so we tested Bit Grooming by applying the DEFLATE compression algorithm to bit-groomed and full-precision climate data stored in netCDF3, netCDF4, HDF4, and HDF5 formats. Bit Grooming reduces the storage space required by uncompressed and compressed climate data by up to 50 % and 20 %, respectively, for single-precision data (the most common case for climate data). When used aggressively (i.e., preserving only 1–3 decimal digits of precision), Bit Grooming produces storage reductions comparable to other quantization techniques such as linear packing. Unlike linear packing, Bit Grooming works on the full representable range of floating-point data. Bit Grooming reduces the volume of single-precision compressed data by roughly 10 % per decimal digit quantized (or "groomed") after the third such digit, up to a maximum reduction of about 50 %. The potential reduction is greater for double-precision datasets. Data quantization by Bit Grooming is irreversible (i.e., lossy) yet transparent, meaning that no extra processing is required by data users/readers. Hence Bit Grooming can easily reduce data storage volume without sacrificing scientific precision or imposing extra burdens on users.
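
The alternating shave-and-set step described in the abstract can be sketched in a few lines of Python for single-precision values. This is an illustration only, not the NCO implementation: the function name `bit_groom` and the `keep_bits` parameter (number of explicit mantissa bits to preserve) are assumptions, the mapping from decimal digits to bits is not reproduced, and leaving zeros and non-finite values untouched is a choice made here for safety.

```python
import math
import struct

MANTISSA_BITS = 23  # explicit mantissa bits in IEEE 754 single precision


def bit_groom(values, keep_bits):
    """Alternately shave and set the low mantissa bits of float32 values.

    `keep_bits` is a hypothetical parameter: how many of the 23 explicit
    mantissa bits stay untouched. Inputs must fit in single precision.
    """
    if not 0 <= keep_bits <= MANTISSA_BITS:
        raise ValueError("keep_bits must be between 0 and 23")
    low_mask = (1 << (MANTISSA_BITS - keep_bits)) - 1
    groomed = []
    for i, v in enumerate(values):
        # Assumption of this sketch: leave zeros, infinities and NaNs alone,
        # since setting bits would turn them into different values.
        if v == 0.0 or not math.isfinite(v):
            groomed.append(v)
            continue
        (bits,) = struct.unpack("<I", struct.pack("<f", v))
        if i % 2 == 0:
            bits &= ~low_mask & 0xFFFFFFFF  # shave: low bits to zero
        else:
            bits |= low_mask                # set: low bits to one
        (g,) = struct.unpack("<f", struct.pack("<I", bits))
        groomed.append(g)
    return groomed
```

Shaved values move toward zero and set values move away from it, so the two errors tend to offset each other in the mean of an array. The groomed array only becomes smaller on disk once a lossless compressor such as DEFLATE is applied to it.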


A design of high speed double precision floating point adder using macro modules

Proceedings of the ASP-DAC 2005. Asia and South Pacific Design Automation Conference, 2005. ◽ 10.1109/aspdac.2005.1466603 ◽ 2005

Author(s): Chi Huang ◽ Xinyu Wu ◽ Jinmei Lai ◽ Chengshou Sun ◽ Gang Li

Keyword(s): High Speed ◽ Floating Point ◽ Double Precision


Modeling of Dual-Spinning Projectile with Canard and Trajectory Filtering

International Journal of Aerospace Engineering ◽ 10.1155/2018/1795158 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-7 ◽ Cited By ~ 1

Author(s): Jun Guan ◽ Wenjun Yi

Keyword(s): Kalman Filter ◽ High Speed ◽ Unscented Kalman Filter ◽ Measurement Data ◽ Degree Of Freedom ◽ Guidance System ◽ Measurement Information ◽ Reliable Measurement ◽ Kalman Filter Algorithm ◽ Spinning Projectile

The article establishes a seven-degree-of-freedom projectile trajectory model for a new type of spinning projectile. Based on this model, a numerical analysis is performed on the ballistic characteristics of the projectile, and the trajectory of the dual-spinning projectile is filtered with the unscented Kalman filter algorithm, so that the measurement information from the projectile's onboard equipment is more accurate and more reliable measurement data are provided to the guidance system. The numerical simulation indicates that the dual-spinning projectile differs from the traditional spinning projectile mainly in that a degree of freedom is added in the direction of the projectile axis. The forebody spins at a low speed or even holds still to improve the control precision of the projectile control system, while the afterbody spins at a high speed to maintain the gyroscopic stability of the projectile. Trajectory filtering with the unscented Kalman filter algorithm effectively reduces the measurement error and so yields more accurate and reliable measurement data.


Design and Implementation of FPU for Optimised Speed

International Journal of Engineering and Advanced Technology - Regular Issue ◽ 10.35940/ijeat.c6444.029320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 3922-3933

Keyword(s): Energy Efficient ◽ High Speed ◽ Software Tool ◽ Digital Signal ◽ Floating Point ◽ Double Precision ◽ Arithmetic Unit ◽ Single Precision ◽ Point Multiplication ◽ Floating Point Unit

Currently, each CPU has one or more floating point units (FPUs) integrated inside it. FPUs are typically used in math-intensive applications such as digital signal processing, and are found in engineering, medical, and military fields as well as in other fields requiring audio, image, or video handling. A high-speed, energy-efficient floating point unit is needed in the electronics industry as an arithmetic unit in microprocessors. Multiplication and addition account for 95% of the operations of a conventional FPU, and many applications need fast execution of arithmetic operations. In the existing system, floating point multiplication (FPM) and floating point addition (FPA) have more delay, lower speed, and lower throughput. To meet the demand for high speed and throughput, the multiplier and adder blocks within the FPM and FPA are designed for single-precision and double-precision floating point operation and are internally pipelined to achieve high throughput; both follow the IEEE 754 standard floating point representations. The design is written in Verilog, and the Xilinx ISE 14.5 software tool is used to code it and verify the resulting waveforms.
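
To make the IEEE 754 double-precision format mentioned in the abstract concrete, the following Python sketch splits a double into its sign, exponent, and fraction fields, and models the data flow of a floating point multiplier: XOR the signs, add the exponents, multiply the significands, then round. It is a software illustration, not the paper's Verilog design. The names `decode_double` and `multiply_model` are hypothetical, only normal (non-zero, finite, non-subnormal) operands are handled, results are assumed to stay in the normal range, and rounding is delegated to Python's integer-to-float conversion.

```python
import math
import struct

# IEEE 754 binary64 layout: 1 sign bit, 11 exponent bits, 52 fraction bits.
EXP_BITS, FRAC_BITS, BIAS = 11, 52, 1023


def decode_double(x):
    """Return the (sign, biased exponent, fraction) fields of a double."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63
    exponent = (bits >> FRAC_BITS) & ((1 << EXP_BITS) - 1)
    fraction = bits & ((1 << FRAC_BITS) - 1)
    return sign, exponent, fraction


def multiply_model(a, b):
    """Model a multiplier data path for normal operands (illustrative only)."""
    sa, ea, fa = decode_double(a)
    sb, eb, fb = decode_double(b)
    max_exp = (1 << EXP_BITS) - 1
    if not (0 < ea < max_exp and 0 < eb < max_exp):
        raise ValueError("this sketch handles normal operands only")
    sign = sa ^ sb                      # sign of the product
    ma = (1 << FRAC_BITS) | fa          # restore the hidden leading 1
    mb = (1 << FRAC_BITS) | fb
    product = ma * mb                   # exact product, up to 106 bits
    exp = (ea - BIAS) + (eb - BIAS) - 2 * FRAC_BITS
    # float() rounds the exact integer to 53 significant bits;
    # ldexp then applies the power-of-two scaling.
    magnitude = math.ldexp(float(product), exp)
    return -magnitude if sign else magnitude
```

For example, `decode_double(1.0)` returns `(0, 1023, 0)`. A pipelined hardware unit would place registers between the stages that this sketch runs in sequence.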


Modeling and Execution of Floating Point Parallel Processing Operation for RISC Processor

International Journal of Engineering and Advanced Technology - Regular Issue ◽ 10.35940/ijeat.c6203.029320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 3783-3789

Keyword(s): Video Processing ◽ High Speed ◽ High Performance ◽ Floating Point ◽ Double Precision ◽ RISC Processor ◽ Small Set ◽ Reduced Instruction Set Computer ◽ Definition Of ◽ Instruction Format

Various suggestions have been made regarding a precise definition of RISC, but the general concept is that such a computer has a small set of simple, general instructions instead of a large set of complex, specialized instructions. This project proposes the design of a high-speed 64-bit RISC processor that consumes less power and operates at high speed. The processor comprises an Instruction Fetch section, an Instruction Decode section, and an Execution section. The ALU within the execution section includes a double-precision floating-point multiplier designed in a parallel architecture, improving the speed and accuracy of execution. All sections are designed in Verilog. Other features are a uniform instruction format, identical general-purpose registers, and simple addressing modes. RISC stands for Reduced Instruction Set Computer and is considered the foundation for designing high-performance processors. A RISC processor has a reduced number of instructions, a fixed instruction length, more general-purpose registers organized into a register file, a load-store architecture, and simplified addressing modes, which make individual instructions execute faster and achieve a net gain in performance. The main aim of this paper is therefore to achieve accuracy while consuming less power and area with minimal delay, by replacing the single-precision section of the floating-point ALU with a double-precision floating-point section. Video processing, telecommunications, and image processing are the high-end applications targeted by the architecture.
