Multiple-Precision Residue-Based Arithmetic Library for Parallel CPU-GPU Architectures: Data Types and Features

Author(s):  
Konstantin Isupov ◽  
Alexander Kuvaev ◽  
Mikhail Popov ◽  
Anton Zaviyalov
2020 ◽  
Author(s):  
Konstantin Isupov ◽  
Vladimir Knyazkov

The binary32 and binary64 floating-point formats provide good performance on current hardware, but also introduce a rounding error in almost every arithmetic operation. Consequently, the accumulation of rounding errors in large computations can cause accuracy issues. One way to prevent these issues is to use multiple-precision floating-point arithmetic. This preprint, submitted to Russian Supercomputing Days 2020, presents a new library of basic linear algebra operations with multiple precision for graphics processing units. The library is written in CUDA C/C++ and uses the residue number system to represent multiple-precision significands of floating-point numbers. The supported data types, memory layout, and main features of the library are considered. Experimental results are presented showing the performance of the library.
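As a rough illustration of the residue number system approach the abstract describes, a value can be held as its residues modulo a set of pairwise coprime moduli, which makes multiplication carry-free across channels. The moduli and functions below are a minimal sketch for exposition only, not the library's actual CUDA implementation, moduli sets, or memory layout:

```python
# Illustrative moduli only; a real multiple-precision library uses
# larger, carefully chosen moduli sets.
MODULI = (7, 11, 13, 17)  # pairwise coprime; dynamic range 7*11*13*17 = 17017

def to_rns(x):
    """Represent x by its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    """Multiply channel-wise: each residue channel is independent,
    so no carries propagate between channels."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    """Recover the integer via the Chinese remainder theorem."""
    M = 1
    for m in MODULI:
        M *= m
    return sum(ri * (M // m) * pow(M // m, -1, m)
               for ri, m in zip(r, MODULI)) % M

# 123 * 45 = 5535, well inside the dynamic range:
assert from_rns(rns_mul(to_rns(123), to_rns(45))) == 5535
```

The carry-free, channel-independent arithmetic is the general motivation for using RNS on parallel hardware: each residue channel of a significand can, in principle, be processed by a separate thread.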


Author(s):  
Kristina Mihajlenko ◽  
Mikhail Lukin ◽  
Andrey Stankevich

Introduction: Decompilers are useful tools for software analysis and support in the absence of source code. They are available for many hardware architectures and programming languages. However, none of the existing decompilers support modern AMD GPU architectures such as AMD GCN and RDNA. Purpose: We aim to develop the first assembly decompiler tool for a modern AMD GPU architecture that generates code in the OpenCL language, which is widely used for programming GPGPUs. Results: We developed algorithms for the following operations: preprocessing assembly code, searching for data accesses, extracting system values, decompiling arithmetic operations, and recovering data types. We also developed templates for the decompilation of branching operations. Practical relevance: We implemented the presented algorithms in Python as a tool called OpenCLDecompiler, which supports a large subset of AMD GCN instructions. The tool automatically converts disassembled GPGPU code into equivalent OpenCL code, which reduces the effort required to analyze assembly code.
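The template-driven decompilation of arithmetic instructions can be illustrated with a toy sketch. The mnemonics, register names, and templates below are simplified stand-ins for exposition, not OpenCLDecompiler's actual instruction tables:

```python
# Hypothetical per-mnemonic templates mapping a three-operand
# vector instruction to a C/OpenCL-style assignment.
TEMPLATES = {
    "v_add_f32": "{d} = {a} + {b};",
    "v_sub_f32": "{d} = {a} - {b};",
    "v_mul_f32": "{d} = {a} * {b};",
}

def decompile_line(asm_line):
    """Translate one disassembled instruction into OpenCL-like C."""
    mnemonic, operands = asm_line.split(maxsplit=1)
    d, a, b = (op.strip() for op in operands.split(","))
    return TEMPLATES[mnemonic].format(d=d, a=a, b=b)

print(decompile_line("v_mul_f32 v2, v0, v1"))  # v2 = v0 * v1;
```

A real decompiler layers much more on top of this, e.g. register-to-variable mapping, type recovery, and control-flow reconstruction, but the per-instruction translation step follows this pattern.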


2021 ◽  
Author(s):  
Alexssandro Ferreira Cordeiro ◽  
Pedro Luiz de Paula Filho ◽  
Hamilton Pereira Silva ◽  
Arnaldo Candido Junior ◽  
Edresson Casanova ◽  
...  

Purpose: to analyze the processing time and the similarity of the images generated on CPU and GPU architectures under sequential and parallel programming methodologies. Material and methods: image processing was performed on a computer with an AMD FX-8350 processor and an Nvidia GTX 960 (Maxwell) GPU, using the CUDAFY library and the C# programming language in the Visual Studio IDE. Results: the comparisons indicate that sequential programming on the CPU generates reliable images at a high cost in time compared with the parallel CPU and GPU approaches, while parallel programming produces results faster but with increased noise in the reconstructed image. For the float data type, the GPU obtained the best result, with an average time of about one third of the processor's; for the double data type, the parallel CPU approach performed best. Conclusion: for the float data type, the GPU had the best average time performance, while for the double data type the best average time performance belonged to the parallel CPU approach. Regarding image quality, the sequential approach produced similar outputs, while the parallel approaches generated noise in their outputs.


2021 ◽  
Author(s):  
Dylan Jude ◽  
Jay Sitaraman ◽  
Andrew M. Wissink

2018 ◽  
Author(s):  
Prathiba Natesan ◽  
Smita Mehta

Single case experimental designs (SCEDs) have become an indispensable methodology where randomized controlled trials may be impossible or even inappropriate. However, the nature of SCED data presents challenges for both visual and statistical analyses. Small sample sizes, autocorrelations, data types, and design types render many parametric statistical analyses and maximum likelihood approaches ineffective. The presence of autocorrelation decreases interrater reliability in visual analysis. The purpose of the present study is to demonstrate a newly developed model called the Bayesian unknown change-point (BUCP) model, which overcomes all the above-mentioned data analytic challenges. This is the first study to formulate and demonstrate a rate ratio effect size for autocorrelated data, which has remained an open question in SCED research until now. This expository study also compares and contrasts the results from the BUCP model with visual analysis, and the rate ratio effect size with the nonoverlap of all pairs (NAP) effect size. Data from a comprehensive behavioral intervention are used for the demonstration.
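For a rough sense of the two ideas the abstract combines, an unknown change point and a rate ratio effect size, here is a deliberately simplified, non-Bayesian sketch with hypothetical session counts. The BUCP model itself is Bayesian and accounts for autocorrelation, none of which is reproduced here:

```python
# Toy change-point estimate: pick the split that minimizes the pooled
# squared error around each segment's mean (a crude least-squares stand-in
# for the Bayesian posterior over change points).
def estimate_change_point(y):
    def sse(seg):
        mu = sum(seg) / len(seg)
        return sum((v - mu) ** 2 for v in seg)
    return min(range(1, len(y)), key=lambda k: sse(y[:k]) + sse(y[k:]))

def rate_ratio(y, k):
    """Ratio of the mean response rate after the change point to before."""
    return (sum(y[k:]) / len(y[k:])) / (sum(y[:k]) / len(y[:k]))

counts = [2, 3, 2, 3, 9, 10, 8, 11]   # hypothetical per-session counts
k = estimate_change_point(counts)      # split between sessions 4 and 5
print(k, rate_ratio(counts, k))        # 4 3.8
```

In this toy series the post-change mean rate (9.5) is 3.8 times the pre-change mean rate (2.5); the BUCP model additionally yields uncertainty about both the change point and the effect size.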


2012 ◽  
Vol 10 (4) ◽  
pp. 202-215
Author(s):  
Manoel Agamemnon Lopes ◽  
Roberta Vilhena Vieira Lopes
