Floating Point Arithmetic
Recently Published Documents

TOTAL DOCUMENTS: 651 (last five years: 118)
H-INDEX: 35 (last five years: 4)

2022 · Vol 15 (1) · pp. 1-21
Author(s): Chen Wu, Mingyu Wang, Xinyuan Chu, Kun Wang, Lei He

Low-precision data representation is important for reducing storage size and memory access in convolutional neural networks (CNNs). Yet, existing methods have two major limitations: (1) they require re-training to maintain accuracy for deep CNNs, and (2) they need 16-bit floating point or 8-bit fixed point for good accuracy. In this article, we propose a low-precision (8-bit) floating-point (LPFP) quantization method for FPGA-based acceleration that overcomes the above limitations. Without any re-training, LPFP finds an optimal 8-bit data representation with negligible top-1/top-5 accuracy loss (within 0.5%/0.3% in our experiments, respectively, and significantly better than existing methods for deep CNNs). Furthermore, we implement one 8-bit LPFP multiplication with one 4-bit multiply-adder and one 3-bit adder, and can therefore implement four 8-bit LPFP multiplications using a single DSP48E1 of the Xilinx Kintex-7 family or DSP48E2 of the Xilinx UltraScale/UltraScale+ family, whereas one DSP can implement only two 8-bit fixed-point multiplications. Experiments on six typical CNNs for inference show that, on average, we improve throughput over existing FPGA accelerators. In particular, for VGG16 and YOLO, compared to six recent FPGA accelerators, we improve average throughput by 3.5× and 27.5× and average throughput per DSP by 4.1× and 5×, respectively.
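
To make the data layout concrete, here is a minimal sketch of decoding such an 8-bit floating-point value, assuming a 1-4-3 sign/exponent/mantissa split; the article chooses its exponent/mantissa split per network, so this layout and the decode_lpfp8 helper name are only illustrative.

```python
# A minimal decoder for a hypothetical 8-bit float with 1 sign bit,
# 4 exponent bits, and 3 mantissa bits (the LPFP split is selected per
# network in the article; this 1-4-3 layout is only an assumption).
def decode_lpfp8(byte: int, exp_bias: int = 7) -> float:
    sign = -1.0 if (byte >> 7) & 0x1 else 1.0
    exp = (byte >> 3) & 0xF          # 4-bit exponent field
    mant = byte & 0x7                # 3-bit mantissa field
    if exp == 0:                     # subnormal: no implicit leading 1
        return sign * (mant / 8.0) * 2.0 ** (1 - exp_bias)
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - exp_bias)

print(decode_lpfp8(0b0_0111_100))    # 1.5 = (1 + 4/8) * 2**(7 - 7)
```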


Mathematics · 2021 · Vol 9 (24) · pp. 3213
Author(s): Masato Shinjo, Tan Wang, Masashi Iwasaki, Yoshimasa Nakamura

The block cyclic reduction method is a finite-step direct method for solving linear systems with block tridiagonal coefficient matrices. It iteratively applies transformations that reduce the number of non-zero blocks in the coefficient matrix. Under repeated block cyclic reductions, the non-zero off-diagonal blocks incrementally move away from the diagonal blocks and eventually vanish after a finite number of reductions. In this paper, we focus on the roots of the characteristic polynomials of coefficient matrices that are repeatedly transformed by block cyclic reductions. We regard each block cyclic reduction as a composition of two types of matrix transformations, and then examine how the existence range of the roots changes. This is a block extension of the idea presented in our previous papers on simple cyclic reductions. The property that the roots are not widely scattered is key to accurately solving linear systems in floating-point arithmetic. We clarify that block cyclic reductions do not disperse the roots, but rather narrow their distribution, if the original coefficient matrix is symmetric positive or negative definite.
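
As a concrete illustration of a single reduction step, the following NumPy sketch (our own illustration, not the authors' code) eliminates the odd-indexed block unknowns of a block tridiagonal system with block rows A[i] x[i-1] + B[i] x[i] + C[i] x[i+1] = f[i]:

```python
import numpy as np

# One block cyclic reduction step: substitute the odd-indexed rows into the
# even-indexed ones, leaving a smaller block tridiagonal system in the
# even-indexed unknowns. Boundary blocks A[0] and C[n-1] are assumed zero,
# and the range checks below handle the first and last rows.
def cyclic_reduction_step(A, B, C, f):
    n = len(B)
    Ar, Br, Cr, fr = [], [], [], []
    for i in range(0, n, 2):
        Bi, fi = B[i].copy(), f[i].copy()
        Ai = np.zeros_like(B[i])
        Ci = np.zeros_like(B[i])
        if i - 1 >= 0:                          # eliminate x[i-1]
            W = A[i] @ np.linalg.inv(B[i - 1])
            Ai = -W @ A[i - 1]                  # now couples to x[i-2]
            Bi -= W @ C[i - 1]
            fi -= W @ f[i - 1]
        if i + 1 < n:                           # eliminate x[i+1]
            W = C[i] @ np.linalg.inv(B[i + 1])
            Ci = -W @ C[i + 1]                  # now couples to x[i+2]
            Bi -= W @ A[i + 1]
            fi -= W @ f[i + 1]
        Ar.append(Ai); Br.append(Bi); Cr.append(Ci); fr.append(fi)
    return Ar, Br, Cr, fr
```

Repeating this step until a single block row remains, then solving it directly, yields the finite-step direct method described in the abstract.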


Author(s): Tobias Jawecki

Abstract: Recent prior work on polynomial Krylov techniques for approximating the action of the matrix exponential e^(tA)v is extended to the associated φ-functions (which occur within the class of exponential integrators). In particular, a posteriori error bounds and estimates based on the notion of the defect (residual) of the Krylov approximation are considered. Computable error bounds and estimates are discussed and analyzed, including a new error bound that compares favorably to existing bounds in specific cases. The accuracy of various error bounds is characterized in relation to the corresponding Ritz values of A. Ritz values reflect properties of the spectrum of A (specific properties are known a priori, e.g., for Hermitian or skew-Hermitian matrices) in relation to the actual starting vector v, and they can be computed. This yields theoretical results together with criteria for quantifying the achieved accuracy on the fly. For other existing error estimates, reliability and performance are studied by similar techniques. Effects of finite precision (floating-point arithmetic) are also taken into account.
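
For context, here is a minimal NumPy/SciPy sketch of the underlying polynomial Krylov (Arnoldi) approximation of e^(tA)v; the paper's defect-based error bounds are not reproduced, and krylov_expm is an illustrative name.

```python
import numpy as np
from scipy.linalg import expm

# Standard Arnoldi approximation: exp(tA)v ≈ beta * V_m exp(t H_m) e_1,
# where (V_m, H_m) come from m steps of Arnoldi applied to A and v.
def krylov_expm(A, v, t=1.0, m=30):
    n = len(v)
    beta = np.linalg.norm(v)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v / beta
    for j in range(m):                       # Arnoldi iteration
        w = A @ V[:, j]
        for i in range(j + 1):               # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:              # lucky breakdown
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    # exp of the small m-by-m Hessenberg matrix, first column = exp(tH) e_1
    return beta * V[:, :m] @ expm(t * H[:m, :m])[:, 0]
```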


2021
Author(s): Penglai Cui, Heng Pan, Zhenyu Li, Jiaoren Wu, Shengzhuo Zhang, ...

Author(s): Katsuhisa Ozaki, Takeshi Ogita

Abstract: This paper concerns test matrices for numerical linear algebra using an error-free transformation of floating-point arithmetic. For eigenvalues specified by a user, we propose methods of generating a matrix whose eigenvalues are exactly known, based on, for example, the Schur or Jordan normal form and a block diagonal form. It is also possible to produce a real matrix with specified complex eigenvalues. Such test matrices with exactly known eigenvalues are useful for checking the accuracy of results computed by numerical algorithms; in particular, the exact errors of eigenvalues can be monitored. To generate test matrices, we first propose an error-free transformation for the product of three matrices YSX. We approximate S by S′ to compute YS′X without a rounding error. Next, the error-free transformation is applied to the generation of test matrices with exactly known eigenvalues. Note that the exactly known eigenvalues of the constructed matrix may differ from the anticipated given eigenvalues. Finally, numerical examples are presented for checking the accuracy of numerical computations for symmetric and unsymmetric eigenvalue problems.
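
As a toy illustration of the idea of exactly known eigenvalues (not the paper's YS′X error-free transformation), one can keep every floating-point operation exact by working with small integers, since float64 represents integers below 2**53 exactly:

```python
import numpy as np

# Build A = L D L^{-1} from integer factors: a unit lower-triangular integer
# matrix L has an integer inverse, so all products below are exact in float64
# and the eigenvalues of A are exactly the prescribed diagonal of D.
rng = np.random.default_rng(0)
n = 6
eigs = np.arange(1.0, n + 1)                              # eigenvalues 1..6
L = np.tril(rng.integers(-3, 4, (n, n)).astype(float), -1) + np.eye(n)
Linv = np.round(np.linalg.inv(L))                         # exact integer inverse
A = L @ np.diag(eigs) @ Linv                              # exact similarity
print(np.sort(np.linalg.eigvals(A).real))                 # ≈ 1, 2, ..., 6
```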


Computing · 2021
Author(s): Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, Jose I. Mestre

Abstract: In this work, we build a general piece-wise model to analyze the data-parallel (DP) training costs of convolutional neural networks (CNNs) on clusters of GPUs. This general model is based on (i) multi-layer perceptrons (MLPs) that model the NVIDIA cuDNN/cuBLAS library kernels involved in training some of the state-of-the-art CNNs, and (ii) an analytical model of the NVIDIA NCCL Allreduce collective primitive using the Ring algorithm. The CNN training scalability study performed using this model in combination with the Roofline technique, varying batch size, node (floating-point) arithmetic performance, node memory bandwidth, network link bandwidth, and cluster dimension, unveils some crucial bottlenecks at both the GPU and cluster levels. To provide evidence for this analysis, we validate the accuracy of the proposed model against a Python library for distributed deep-learning training.
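
For illustration, the widely used latency-bandwidth (α-β) cost model of the Ring Allreduce fits in a few lines; the paper's exact analytical model may differ, and the parameter names here are assumptions:

```python
# Standard alpha-beta model of Ring Allreduce over p ranks:
# reduce-scatter + allgather take 2(p-1) steps, each moving msg_bytes/p,
# so t = 2(p-1)*alpha + 2(p-1)/p * msg_bytes/beta.
def ring_allreduce_time(msg_bytes: float, p: int,
                        alpha: float, beta: float) -> float:
    """alpha: per-message latency (s); beta: link bandwidth (bytes/s)."""
    return 2 * (p - 1) * alpha + 2 * (p - 1) / p * msg_bytes / beta

# e.g., a 100 MB gradient buffer on 8 GPUs over a 10 GB/s link:
print(ring_allreduce_time(100e6, p=8, alpha=5e-6, beta=10e9))
```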


Author(s): T. Bartels, V. Fisikopoulos

Abstract: Geometric predicates are used in many GIS algorithms, such as the construction of Delaunay triangulations for Triangulated Irregular Networks (TIN) or geospatial predicates. With floating-point arithmetic, these computations can incur round-off errors that may lead to incorrect results and inconsistencies, causing computations to fail. This issue has been addressed using a combination of exact arithmetic for robustness and floating-point filters to mitigate the computational cost of exact computations. Implementing exact computations and floating-point filters can be a difficult task, and code-generation tools have been proposed to address this. We present a new C++ meta-programming framework for the generation of fast, robust implementations of arbitrary geometric predicates based on polynomial expressions. We show examples of how this approach produces correct results on GIS data sets for which naive implementations would yield incorrect predicate results. We also show benchmark results demonstrating that our implementation can compete with state-of-the-art solutions.
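
As a language-agnostic illustration of the filter-plus-exact-fallback pattern (the paper's framework generates C++), here is a minimal 2D orientation predicate in Python; the error-bound constant is a coarse stand-in for the sharper bounds such frameworks derive:

```python
from fractions import Fraction

# Filtered orientation test: evaluate the 2x2 determinant in floating point
# and fall back to exact rational arithmetic only when the result is too
# close to zero to be trusted. Returns +1 (counterclockwise), -1 (clockwise),
# or 0 (collinear).
def orient2d(ax, ay, bx, by, cx, cy):
    det = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    # coarse forward-error bound; Shewchuk-style filters use a sharper constant
    errbound = 3.33e-16 * (abs((bx - ax) * (cy - ay))
                           + abs((by - ay) * (cx - ax)))
    if abs(det) > errbound:
        return (det > 0) - (det < 0)
    # exact fallback: Fraction(float) is exact, so this determinant is exact
    F = Fraction
    det = (F(bx) - F(ax)) * (F(cy) - F(ay)) - (F(by) - F(ay)) * (F(cx) - F(ax))
    return (det > 0) - (det < 0)
```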

