Implementation of floating point MAC using Residue Number System

Author(s):  
Dhanabal R ◽  
Barathi V ◽  
Sarat Kumar Sahoo ◽  
Naamatheertham R Samhitha ◽  
Neethu Acha Cherian ◽  
...  
2017 ◽  
Vol 27 (01) ◽  
pp. 1850004 ◽  
Author(s):  
Konstantin Isupov ◽  
Vladimir Knyazkov

The residue number system (RNS), owing to its carry-free nature, is popular in many applications of high-speed computer arithmetic, especially in digital signal processing and cryptography. However, the main limiting factor of RNS is the high complexity of operations such as magnitude comparison, sign determination and overflow detection. For many years, these operations have been a major obstacle to more widespread use of parallel residue arithmetic. This paper presents a new efficient method to perform these operations, based on computing and analyzing an interval estimate of the relative value of an RNS number. The estimate, called the interval floating-point characteristic (IFC), is represented by two bounds with directed rounding that are fixed-precision numbers. In general, the time complexities of serial and parallel computation of the IFC are linear and logarithmic functions of the size of the moduli set, respectively. The new method requires only small-integer and fixed-precision floating-point operations and targets arbitrary moduli sets with large dynamic ranges ([Formula: see text]). Experiments indicate that the performance of the proposed method is significantly higher than that of methods based on mixed-radix conversion.
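The idea of bounding the relative value X/M can be illustrated with a minimal Python sketch. It is not the authors' IFC algorithm: the moduli set, the function names and the one-ulp widening via `math.nextafter` (Python 3.9+) are assumptions made for illustration. It relies only on the Chinese Remainder Theorem identity that X/M equals the fractional part of the sum of |x_i w_i|_{m_i} / m_i, where w_i is the inverse of M/m_i modulo m_i.

```python
import math

# Hypothetical small moduli set (pairwise coprime), chosen only for illustration.
MODULI = (7, 11, 13, 15)
M = math.prod(MODULI)
# CRT weights w_i = inverse of (M / m_i) modulo m_i.
WEIGHTS = tuple(pow(M // m, -1, m) for m in MODULI)


def to_rns(x):
    return tuple(x % m for m in MODULI)


def from_rns(residues):
    """Exact CRT reconstruction, used here only as a slow fallback."""
    return sum((M // m) * ((r * w) % m)
               for r, w, m in zip(residues, WEIGHTS, MODULI)) % M


def relative_interval(residues):
    """Return floats (lo, hi) with lo <= X/M <= hi, or None if inconclusive."""
    lo = hi = 0.0
    for r, w, m in zip(residues, WEIGHTS, MODULI):
        t = (r * w) % m
        if t == 0:
            continue
        q = t / m  # rounded to nearest, so the true quotient is within one ulp
        # Widen outward by one ulp after every rounded operation.
        lo = math.nextafter(lo + math.nextafter(q, -math.inf), -math.inf)
        hi = math.nextafter(hi + math.nextafter(q, math.inf), math.inf)
    if math.floor(lo) != math.floor(hi):
        return None  # bounds straddle an integer: X/M is too close to 0 or 1
    return lo - math.floor(lo), hi - math.floor(hi)


def compare(xr, yr):
    """Return -1, 0 or 1; use the intervals first, exact CRT only if needed."""
    ix, iy = relative_interval(xr), relative_interval(yr)
    if ix is not None and iy is not None:
        if ix[1] < iy[0]:
            return -1
        if iy[1] < ix[0]:
            return 1
    x, y = from_rns(xr), from_rns(yr)
    return (x > y) - (x < y)
```

When the two intervals are disjoint, the comparison is decided with floating-point work only; overlapping or inconclusive intervals fall back to exact reconstruction, which in this sketch happens for equal or nearly equal operands.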


Computation ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 9
Author(s):  
Konstantin Isupov

Residue number system (RNS) is known for its parallel arithmetic and has been used in recent decades in various important applications, from digital signal processing and deep neural networks to cryptography and high-precision computation. However, comparison, sign identification, overflow detection, and division are still hard to implement in RNS. For such operations, most of the methods proposed in the literature only support small dynamic ranges (up to several tens of bits), so they are only suitable for low-precision applications. We recently proposed a method that supports arbitrary moduli sets with cryptographically sized dynamic ranges, up to several thousands of bits. The practical interest of our method compared to existing methods is that it relies only on very fast standard floating-point operations, so it is suitable for multiple-precision applications and can be efficiently implemented on many general-purpose platforms that support IEEE 754 arithmetic. In this paper, we make further improvements to this method and demonstrate that it can successfully be applied to implement efficient data-parallel primitives operating in the RNS domain, namely finding the maximum element of an array of RNS numbers on graphics processing units. Our experimental results on an NVIDIA RTX 2080 GPU show that for random residues and a 128-moduli set with 2048-bit dynamic range, the proposed implementation reduces the running time by a factor of 39 and the memory consumption by a factor of 13 compared to an implementation based on mixed-radix conversion.
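For context, the mixed-radix conversion baseline mentioned in the abstract can be sketched as follows. This is a sequential CPU illustration in Python, not the GPU implementation evaluated in the paper; the moduli set and the names `mixed_radix_digits`, `magnitude_key` and `rns_max` are hypothetical.

```python
import math

# Hypothetical small moduli set (pairwise coprime), chosen only for illustration.
MODULI = (7, 11, 13, 15)


def to_rns(x):
    return tuple(x % m for m in MODULI)


def mixed_radix_digits(residues):
    """Digits a_1..a_k with X = a_1 + a_2*m_1 + a_3*m_1*m_2 + ..."""
    r = list(residues)
    digits = []
    for i, m_i in enumerate(MODULI):
        a = r[i]
        digits.append(a)
        # Subtract the digit and divide by m_i in every remaining channel.
        for j in range(i + 1, len(MODULI)):
            m_j = MODULI[j]
            r[j] = ((r[j] - a) * pow(m_i, -1, m_j)) % m_j
    return digits


def magnitude_key(residues):
    # Most significant digit first, so tuple comparison orders by magnitude.
    return tuple(reversed(mixed_radix_digits(residues)))


def rns_max(array):
    """Return the largest element of a list of RNS numbers."""
    return max(array, key=magnitude_key)
```

The inner loop makes each conversion quadratic in the number of moduli, which is the cost that a floating-point interval method aims to avoid.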


Author(s):  
Anastasia S. Korzhavina ◽  
Vladimir S. Knyazkov

Introduction. Simulation problems that are sensitive to rounding errors, including problems in computational mathematics, mathematical physics, optimal control, biochemistry, quantum mechanics, mathematical programming and cryptography, require an accuracy of 100 to 1,000 decimal digits or more. The main drawback of high-precision software libraries is a significant loss of speed, unacceptable for practical problems, in particular when performing multiplication. One way to speed up computation over very long numbers is to use the residue number system. In this work, we discuss a new fast multiplication method with scaling of the result, based on an original hybrid residue-positional interval logarithmic floating-point number representation. Materials and Methods. To increase the speed of calculations, a new way of organizing numerical information is developed: a residue-positional interval logarithmic number representation in which the mantissa is represented in the residue number system and the information on the absolute value (the characteristic) is represented in the interval logarithmic number system, which makes it possible to accelerate comparison and scaling. To compare modular numbers, the provisions of interval analysis are used; to scale modular numbers, the properties of the logarithmic number system and approximate interval calculations using the Chinese remainder theorem are used. Results. A new fast multiplication method for floating-point numbers represented in residue form is developed and patented; the authors evaluated the speed of the developed method and compared it with classical and pipelined multiplication methods for long numbers. Discussion and Conclusion. The developed method is 2.4–4.0 times faster than the pipelined multiplication method and 6.4–12.9 times faster than classical multiplication methods.
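As a rough illustration of why an RNS mantissa speeds up multiplication, the following Python sketch multiplies two numbers stored as (RNS mantissa, binary exponent). It is a simplification under stated assumptions and not the patented method: the moduli set and the names `encode`, `multiply` and `decode` are hypothetical, the exponent is a plain integer rather than an interval logarithmic characteristic, and no scaling is performed, so the result is only valid while the product of the mantissas stays below M.

```python
import math
from fractions import Fraction

# Hypothetical moduli set (pairwise coprime), chosen only for illustration.
MODULI = (251, 253, 255, 256)
M = math.prod(MODULI)


def encode(mantissa, exponent):
    """Represent mantissa * 2**exponent with the mantissa held in RNS."""
    return tuple(mantissa % m for m in MODULI), exponent


def multiply(a, b):
    """Channel-wise product: no carries propagate between residues."""
    (ra, ea), (rb, eb) = a, b
    product = tuple((x * y) % m for x, y, m in zip(ra, rb, MODULI))
    return product, ea + eb


def decode(number):
    """Exact value via the Chinese Remainder Theorem (for checking only)."""
    residues, exponent = number
    mantissa = sum((M // m) * ((r * pow(M // m, -1, m)) % m)
                   for r, m in zip(residues, MODULI)) % M
    return Fraction(mantissa) * Fraction(2) ** exponent
```

Each residue channel is independent, so the per-channel products could run in parallel; keeping the mantissa within the dynamic range is the job of the scaling step that this sketch omits.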


2017 ◽  
Vol 8 (3) ◽  
pp. 189-200 ◽  
Author(s):  
Jean-Claude Bajard ◽  
Julien Eynard ◽  
Nabil Merkiche

Author(s):  
Mikhail Selianinau

Abstract. In this paper, we deal with the critical problem of performing non-modular operations in the Residue Number System (RNS). The Chinese Remainder Theorem (CRT) is widely used in many modern computer applications. Throughout the article, an efficient approach for implementing the CRT algorithm is described. The structure of the rank of an RNS number, a principal positional characteristic of the residue code, is investigated. It is shown that the rank of a number can be represented as the sum of an inexact rank and a two-valued correction to it. We propose a new variant of minimally redundant RNS, which provides low computational complexity for the rank calculation, and analyze its effectiveness relative to conventional non-redundant RNS. Owing to the extension of the residue code by adding the excess residue modulo 2, the complexity of the rank calculation goes down from $O(k^{2})$ to $O(k)$ with respect to the required modular addition operations and lookup tables, where $k$ is the number of non-redundant RNS moduli.
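To make the notion of rank concrete, here is a minimal Python sketch assuming the usual CRT form X = Σ M_i ξ_i − ρM, where M_i = M/m_i, ξ_i = |x_i M_i^{-1}|_{m_i} and ρ is the rank. It computes the exact rank with big-integer arithmetic and does not implement the minimally redundant scheme or the inexact-rank-plus-correction decomposition described in the abstract; the moduli set and the name `rank_and_value` are hypothetical.

```python
import math

# Hypothetical small moduli set (pairwise coprime), chosen only for illustration.
MODULI = (7, 11, 13, 15)
M = math.prod(MODULI)


def to_rns(x):
    return tuple(x % m for m in MODULI)


def rank_and_value(residues):
    """Return (rho, X) such that X = sum(M_i * xi_i) - rho * M."""
    total = 0
    for r, m in zip(residues, MODULI):
        M_i = M // m
        xi = (r * pow(M_i, -1, m)) % m  # normalized residue, 0 <= xi < m
        total += M_i * xi
    rho = total // M  # number of times the CRT sum wraps around M
    return rho, total - rho * M
```

Because each term M_i ξ_i is smaller than M, the rank always lies between 0 and k − 1 for k moduli.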

