Mixed-precision block Gram Schmidt orthogonalization

Author(s):  
Ichitaro Yamazaki ◽  
Stanimire Tomov ◽  
Jakub Kurzak ◽  
Jack Dongarra ◽  
Jesse Barlow
2021 ◽  
Vol 47 (2) ◽  
pp. 1-28
Author(s):  
Goran Flegar ◽  
Hartwig Anzt ◽  
Terry Cojean ◽  
Enrique S. Quintana-Ortí

The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts to carefully reduce the working precision in order to speed up computations. For algorithms whose performance is bound by memory bandwidth, the idea of compressing their data before (and after) memory accesses has received considerable attention. One idea is to store an approximate operator, such as a preconditioner, in lower than working precision, ideally without affecting the algorithm output. We realize the first high-performance implementation of an adaptive precision block-Jacobi preconditioner, which selects the precision format used to store the preconditioner data on the fly, taking into account the numerical properties of the individual preconditioner blocks. We implement the adaptive block-Jacobi preconditioner as production-ready functionality in the Ginkgo linear algebra library, considering not only the precision formats that are part of the IEEE standard, but also customized formats that tailor the length of the exponent and significand to the characteristics of the preconditioner blocks. Experiments run on a state-of-the-art GPU accelerator show that our implementation offers attractive runtime savings.
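The per-block precision selection described in this abstract can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the criterion below (store a block in the cheapest format whose unit roundoff times the block's 1-norm condition number stays under a tolerance) is an assumption, and the names `choose_format`, `store`, and the `tol` parameter are hypothetical. None of this reflects Ginkgo's actual API.

```python
import struct

# Unit roundoff of each candidate IEEE storage format (half, single, double).
UNIT_ROUNDOFF = {"fp16": 2.0 ** -11, "fp32": 2.0 ** -24, "fp64": 2.0 ** -53}
# struct format codes used to emulate rounding to each format.
PACK_CODE = {"fp16": "e", "fp32": "f", "fp64": "d"}


def invert(block):
    """Invert a small dense block by Gauss-Jordan elimination with partial pivoting."""
    n = len(block)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(block)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def norm1(m):
    """Matrix 1-norm: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in m) for j in range(len(m[0])))


def choose_format(block, tol=1e-2):
    """Assumed rule (not from the source): cheapest format with cond_1 * u < tol."""
    cond = norm1(block) * norm1(invert(block))
    for name in ("fp16", "fp32", "fp64"):
        if cond * UNIT_ROUNDOFF[name] < tol:
            return name
    return "fp64"  # fall back to full precision for very ill-conditioned blocks


def store(block, name):
    """Round every entry to the chosen format, emulating reduced-precision storage."""
    code = PACK_CODE[name]
    return [[struct.unpack(code, struct.pack(code, v))[0] for v in row] for row in block]


blocks = [[[4.0, 1.0], [1.0, 3.0]], [[1.0, 1.0], [1.0, 1.0001]]]
formats = [choose_format(b) for b in blocks]
stored = [store(b, f) for b, f in zip(blocks, formats)]
```

Under this assumed rule, a well-conditioned block is stored in a short format and a nearly singular one is kept in a longer format, which is the trade-off the abstract refers to when it mentions the numerical properties of individual blocks.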


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Mohammed Al-Smadi ◽  
Nadir Djeddi ◽  
Shaher Momani ◽  
Shrideh Al-Omari ◽  
Serkan Araci

Abstract: Our aim in this paper is to present an attractive numerical approach that gives an accurate solution to the nonlinear fractional Abel differential equation, based on a reproducing kernel algorithm with a model endowed with a Caputo–Fabrizio fractional derivative. With this approach, we use the Gram–Schmidt orthogonalization process to create an orthonormal set of bases that leads to an appropriate solution in the Hilbert space $\mathcal{H}^{2}[a,b]$. We investigate and discuss the stability and convergence of the proposed method. The n-term series solution converges uniformly to the analytic solution. We present several numerical examples of potential interest to illustrate the reliability, efficacy, and performance of the method under the influence of the Caputo–Fabrizio derivative. The results obtained show the superiority of the reproducing kernel algorithm and its infinite accuracy with the least time and effort in solving the fractional Abel-type model. Therefore, in this direction, the proposed algorithm is an alternative and systematic tool for analyzing the behavior of many nonlinear temporal fractional differential equations emerging in the fields of engineering, physics, and the sciences.
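The Gram–Schmidt step mentioned in this abstract can be sketched in a simplified setting. The paper orthonormalizes functions in a Hilbert space; the sketch below assumes plain vectors with the Euclidean inner product instead, and uses the modified variant of the process. The function names and the `tol` parameter are hypothetical.

```python
import math


def dot(u, v):
    """Euclidean inner product (a stand-in for the Hilbert-space inner product)."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors` (modified Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Project against the running remainder w, not the original v.
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that depend linearly on earlier ones
            basis.append([wi / norm for wi in w])
    return basis


Q = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

Each vector in `Q` has unit length and is orthogonal to the others up to rounding error. Replacing `dot` with a different inner product would adapt the same loop to other spaces.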


Author(s):  
Wei-Fan Chiang ◽  
Mark Baranowski ◽  
Ian Briggs ◽  
Alexey Solovyev ◽  
Ganesh Gopalakrishnan ◽  
...  

2021 ◽  
Vol 18 (2) ◽  
pp. 1-24
Author(s):  
Nhut-Minh Ho ◽  
Himeshi De silva ◽  
Weng-Fai Wong

This article presents GRAM (GPU-based Runtime Adaption for Mixed-precision), a framework for the effective use of mixed-precision arithmetic in CUDA programs. Our method provides a fine-grained tradeoff between output error and performance. It can create many variants that satisfy different accuracy requirements by assigning different groups of threads to different precision levels adaptively at runtime. To widen the range of applications that can benefit from its approximation, GRAM comes with an optional half-precision approximate math library. Using GRAM, we can trade off precision for performance improvements of up to 540%, depending on the application and accuracy requirement.


2014 ◽  
Vol 36 (2) ◽  
pp. C240-C263 ◽  
Author(s):  
M. Petschow ◽  
E. S. Quintana-Ortí ◽  
P. Bientinesi
