Floating Point Arithmetic
Recently Published Documents

TOTAL DOCUMENTS: 651 (last five years: 118)
H-INDEX: 35 (last five years: 4)

2022 · Vol 15 (1) · pp. 1-21
Author(s): Chen Wu, Mingyu Wang, Xinyuan Chu, Kun Wang, Lei He

Low-precision data representation is important for reducing storage size and memory access in convolutional neural networks (CNNs). Yet, existing methods have two major limitations: (1) they require re-training to maintain accuracy for deep CNNs, and (2) they need 16-bit floating point or 8-bit fixed point for good accuracy. In this article, we propose a low-precision (8-bit) floating-point (LPFP) quantization method for FPGA-based acceleration that overcomes the above limitations. Without any re-training, LPFP finds an optimal 8-bit data representation with negligible top-1/top-5 accuracy loss (within 0.5%/0.3% in our experiments, respectively, and significantly better than existing methods for deep CNNs). Furthermore, we implement one 8-bit LPFP multiplication with one 4-bit multiply-adder and one 3-bit adder, and can therefore implement four 8-bit LPFP multiplications using a single DSP48E1 of the Xilinx Kintex-7 family or DSP48E2 of the Xilinx UltraScale/UltraScale+ family, whereas one DSP can implement only two 8-bit fixed-point multiplications. Experiments on six typical CNNs for inference show that, on average, we improve throughput over existing FPGA accelerators. In particular, for VGG16 and YOLO, compared to six recent FPGA accelerators, we improve average throughput by 3.5× and 27.5× and average throughput per DSP by 4.1× and 5×, respectively.
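
To make the data layout concrete, here is a minimal sketch of decoding such an 8-bit floating-point value, assuming a 1-4-3 sign/exponent/mantissa split; the article chooses its exponent/mantissa split per network, so this layout and the decode_lpfp8 helper name are only illustrative.

```python
# A minimal decoder for a hypothetical 8-bit float with 1 sign bit,
# 4 exponent bits, and 3 mantissa bits (the LPFP split is selected per
# network in the article; this 1-4-3 layout is only an assumption).
def decode_lpfp8(byte: int, exp_bias: int = 7) -> float:
    sign = -1.0 if (byte >> 7) & 0x1 else 1.0
    exp = (byte >> 3) & 0xF          # 4-bit exponent field
    mant = byte & 0x7                # 3-bit mantissa field
    if exp == 0:                     # subnormal: no implicit leading 1
        return sign * (mant / 8.0) * 2.0 ** (1 - exp_bias)
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - exp_bias)

print(decode_lpfp8(0b0_0111_100))    # 1.5 = (1 + 4/8) * 2**(7 - 7)
```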


Mathematics · 2021 · Vol 9 (24) · pp. 3213
Author(s): Masato Shinjo, Tan Wang, Masashi Iwasaki, Yoshimasa Nakamura

The block cyclic reduction method is a finite-step direct method for solving linear systems with block tridiagonal coefficient matrices. It iteratively applies transformations that reduce the number of non-zero blocks in the coefficient matrix. Under repeated block cyclic reductions, the non-zero off-diagonal blocks incrementally move away from the diagonal blocks and eventually vanish after a finite number of reductions. In this paper, we focus on the roots of the characteristic polynomials of coefficient matrices that are repeatedly transformed by block cyclic reductions. We regard each block cyclic reduction as a composition of two types of matrix transformations, and then examine how the existence range of the roots changes. This is a block extension of the idea presented in our previous papers on simple cyclic reductions. The property that the roots are not widely scattered is key to accurately solving linear systems in floating-point arithmetic. We clarify that block cyclic reductions do not disperse the roots, but rather narrow their distribution, if the original coefficient matrix is symmetric positive or negative definite.
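
As a concrete illustration of a single reduction step, the following NumPy sketch (our own illustration, not the authors' code) eliminates the odd-indexed block unknowns of a block tridiagonal system with block rows A[i] x[i-1] + B[i] x[i] + C[i] x[i+1] = f[i]:

```python
import numpy as np

# One block cyclic reduction step: substitute the odd-indexed rows into the
# even-indexed ones, leaving a smaller block tridiagonal system in the
# even-indexed unknowns. Boundary blocks A[0] and C[n-1] are assumed zero,
# and the range checks below handle the first and last rows.
def cyclic_reduction_step(A, B, C, f):
    n = len(B)
    Ar, Br, Cr, fr = [], [], [], []
    for i in range(0, n, 2):
        Bi, fi = B[i].copy(), f[i].copy()
        Ai = np.zeros_like(B[i])
        Ci = np.zeros_like(B[i])
        if i - 1 >= 0:                          # eliminate x[i-1]
            W = A[i] @ np.linalg.inv(B[i - 1])
            Ai = -W @ A[i - 1]                  # now couples to x[i-2]
            Bi -= W @ C[i - 1]
            fi -= W @ f[i - 1]
        if i + 1 < n:                           # eliminate x[i+1]
            W = C[i] @ np.linalg.inv(B[i + 1])
            Ci = -W @ C[i + 1]                  # now couples to x[i+2]
            Bi -= W @ A[i + 1]
            fi -= W @ f[i + 1]
        Ar.append(Ai); Br.append(Bi); Cr.append(Ci); fr.append(fi)
    return Ar, Br, Cr, fr
```

Repeating this step until a single block row remains, then solving it directly, yields the finite-step direct method described in the abstract.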


Author(s): Tobias Jawecki

Abstract: Recent prior work on polynomial Krylov techniques for approximating the action of the matrix exponential e^(tA)v is extended to the associated φ-functions (which occur within the class of exponential integrators). In particular, a posteriori error bounds and estimates based on the notion of the defect (residual) of the Krylov approximation are considered. Computable error bounds and estimates are discussed and analyzed, including a new error bound that compares favorably to existing bounds in specific cases. The accuracy of various error bounds is characterized in relation to the corresponding Ritz values of A. Ritz values reflect properties of the spectrum of A (specific properties are known a priori, e.g., for Hermitian or skew-Hermitian matrices) in relation to the actual starting vector v, and they can be computed. This yields theoretical results together with criteria for quantifying the achieved accuracy on the fly. For other existing error estimates, reliability and performance are studied by similar techniques. Effects of finite precision (floating-point arithmetic) are also taken into account.
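
For context, here is a minimal NumPy/SciPy sketch of the underlying polynomial Krylov (Arnoldi) approximation of e^(tA)v; the paper's defect-based error bounds are not reproduced, and krylov_expm is an illustrative name.

```python
import numpy as np
from scipy.linalg import expm

# Standard Arnoldi approximation: exp(tA)v ≈ beta * V_m exp(t H_m) e_1,
# where (V_m, H_m) come from m steps of Arnoldi applied to A and v.
def krylov_expm(A, v, t=1.0, m=30):
    n = len(v)
    beta = np.linalg.norm(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / beta
    for j in range(m):                       # Arnoldi iteration
        w = A @ V[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:              # lucky breakdown
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # exp of the small m-by-m Hessenberg matrix, first column = exp(tH) e_1
    return beta * V[:, :m] @ expm(t * H[:m, :m])[:, 0]
```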


2021
Author(s): Penglai Cui, Heng Pan, Zhenyu Li, Jiaoren Wu, Shengzhuo Zhang, ...

Author(s): Katsuhisa Ozaki, Takeshi Ogita

Abstract: This paper concerns test matrices for numerical linear algebra using an error-free transformation of floating-point arithmetic. For eigenvalues specified by a user, we propose methods of generating a matrix whose eigenvalues are exactly known, based on, for example, the Schur or Jordan normal form and a block diagonal form. It is also possible to produce a real matrix with specified complex eigenvalues. Such test matrices with exactly known eigenvalues are useful for checking the accuracy of results computed by numerical algorithms; in particular, the exact errors of eigenvalues can be monitored. To generate test matrices, we first propose an error-free transformation for the product of three matrices YSX. We approximate S by S′ to compute YS′X without a rounding error. Next, the error-free transformation is applied to the generation of test matrices with exactly known eigenvalues. Note that the exactly known eigenvalues of the constructed matrix may differ from the anticipated given eigenvalues. Finally, numerical examples are presented for checking the accuracy of numerical computations for symmetric and unsymmetric eigenvalue problems.
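
As a toy illustration of the idea of exactly known eigenvalues (not the paper's YS′X error-free transformation), one can keep every floating-point operation exact by working with small integers, since float64 represents integers below 2**53 exactly:

```python
import numpy as np

# Build A = L D L^{-1} from integer factors: a unit lower-triangular integer
# matrix L has an integer inverse, so all products below are exact in float64
# and the eigenvalues of A are exactly the prescribed diagonal of D.
rng = np.random.default_rng(0)
n = 6
eigs = np.arange(1.0, n + 1)                              # eigenvalues 1..6
L = np.tril(rng.integers(-3, 4, (n, n)).astype(float), -1) + np.eye(n)
Linv = np.round(np.linalg.inv(L))                         # exact integer inverse
A = L @ np.diag(eigs) @ Linv                              # exact similarity
print(np.sort(np.linalg.eigvals(A).real))                 # ≈ 1, 2, ..., 6
```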


Computing · 2021
Author(s): Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, Jose I. Mestre

Abstract: In this work, we build a general piece-wise model to analyze the data-parallel (DP) training costs of convolutional neural networks (CNNs) on clusters of GPUs. This general model is based on (i) multi-layer perceptrons (MLPs) that model the NVIDIA cuDNN/cuBLAS library kernels involved in training some of the state-of-the-art CNNs, and (ii) an analytical model of the NVIDIA NCCL Allreduce collective primitive using the Ring algorithm. The CNN training scalability study performed using this model in combination with the Roofline technique, varying batch size, node (floating-point) arithmetic performance, node memory bandwidth, network link bandwidth, and cluster dimension, unveils some crucial bottlenecks at both the GPU and cluster levels. To provide evidence for this analysis, we validate the accuracy of the proposed model against a Python library for distributed deep-learning training.
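
For illustration, the widely used latency-bandwidth (α-β) cost model of the Ring Allreduce fits in a few lines; the paper's exact analytical model may differ, and the parameter names here are assumptions:

```python
# Standard alpha-beta model of Ring Allreduce over p ranks:
# reduce-scatter + allgather take 2(p-1) steps, each moving msg_bytes/p,
# so t = 2(p-1)*alpha + 2(p-1)/p * msg_bytes/beta.
def ring_allreduce_time(msg_bytes: float, p: int,
                        alpha: float, beta: float) -> float:
    """alpha: per-message latency (s); beta: link bandwidth (bytes/s)."""
    return 2 * (p - 1) * alpha + 2 * (p - 1) / p * msg_bytes / beta

# e.g., a 100 MB gradient buffer on 8 GPUs over a 10 GB/s link:
print(ring_allreduce_time(100e6, p=8, alpha=5e-6, beta=10e9))
```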


Author(s): T. Bartels, V. Fisikopoulos

Abstract: Geometric predicates are used in many GIS algorithms, such as the construction of Delaunay triangulations for Triangulated Irregular Networks (TIN) or geospatial predicates. With floating-point arithmetic, these computations can incur round-off errors that may lead to incorrect results and inconsistencies, causing computations to fail. This issue has been addressed using a combination of exact arithmetic for robustness and floating-point filters to mitigate the computational cost of exact computations. Implementing exact computations and floating-point filters can be a difficult task, and code-generation tools have been proposed to address this. We present a new C++ meta-programming framework for the generation of fast, robust implementations of arbitrary geometric predicates based on polynomial expressions. We show examples of how this approach produces correct results on GIS data sets for which naive implementations would yield incorrect predicate results. We also show benchmark results demonstrating that our implementation can compete with state-of-the-art solutions.
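
As a language-agnostic illustration of the filter-plus-exact-fallback pattern (the paper's framework generates C++), here is a minimal 2D orientation predicate in Python; the error-bound constant is a coarse stand-in for the sharper bounds such frameworks derive:

```python
from fractions import Fraction

# Filtered orientation test: evaluate the 2x2 determinant in floating point
# and fall back to exact rational arithmetic only when the result is too
# close to zero to be trusted. Returns +1 (counterclockwise), -1 (clockwise),
# or 0 (collinear).
def orient2d(ax, ay, bx, by, cx, cy):
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    # coarse forward-error bound; Shewchuk-style filters use a sharper constant
    errbound = 3.33e-16 * (abs((bx - ax) * (cy - ay))
                           + abs((by - ay) * (cx - ax)))
    if abs(det) > errbound:
        return (det > 0) - (det < 0)
    # exact fallback: Fraction(float) is exact, so this determinant is exact
    F = Fraction
    det = (F(bx) - F(ax)) * (F(cy) - F(ay)) - (F(by) - F(ay)) * (F(cx) - F(ax))
    return (det > 0) - (det < 0)
```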

