Compute Unified Device Architecture: Latest Research Papers

Particle systems present challenges that have attracted a large amount of attention in both usage and optimization. Their use has pushed simulations toward ever larger data sizes and higher accuracy, so optimization has become a moving target for researchers. Studies show that multithreading has the potential to make simulation efficient when optimizing complex, data-intensive particle systems. CUDA (Compute Unified Device Architecture) works with programming languages such as C/C++ and Python to make multithreaded parallel programming easier. This work analyzes particle systems using CUDA and provides an understanding of how parameters such as the particle count and grid size influence simulation performance. We improve the CUDA particles demo by NVIDIA using our Python scripts and study the impact of particles and grids on execution time and throughput. Experimental results indicate that a required level of performance can be achieved by varying the number of particles, the grid size, and the orientation of the grids as needed.
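
To see why particle count and grid size interact, the sketch below bins random particles into a uniform grid and reports how crowded the cells become. It assumes a uniform spatial grid of the kind commonly used for neighbor search in particle simulations; the abstract does not describe the demo's internals, and the names `cell_index`, `occupancy`, and `world_size` are hypothetical. It is not the authors' script.

```python
import random
from collections import Counter


def cell_index(pos, cell_size, grid_dim):
    """Map a 3D position to a flat cell index in a grid_dim^3 uniform grid."""
    # Clamp so positions on the upper boundary fall into the last cell.
    ix, iy, iz = (min(int(c / cell_size), grid_dim - 1) for c in pos)
    return (iz * grid_dim + iy) * grid_dim + ix


def occupancy(num_particles, grid_dim, world_size=1.0, seed=0):
    """Return (mean, max) particles per occupied cell for random positions."""
    rng = random.Random(seed)
    cell_size = world_size / grid_dim
    counts = Counter(
        cell_index([rng.random() * world_size for _ in range(3)], cell_size, grid_dim)
        for _ in range(num_particles)
    )
    return num_particles / len(counts), max(counts.values())


if __name__ == "__main__":
    # Assumed parameter values, chosen only for illustration.
    for grid_dim in (4, 8, 16):
        mean, peak = occupancy(num_particles=4096, grid_dim=grid_dim)
        print(f"grid {grid_dim}^3: mean {mean:.1f}, max {peak} per occupied cell")
```

A coarser grid puts more particles in each cell, so each particle has more candidate neighbors to test; a finer grid reduces that work but increases the number of cells to manage.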

A Three-Dimensional Visualization Framework for Underground Geohazard Recognition on Urban Road-Facing GPR Data

ISPRS International Journal of Geo-Information, 2020, Vol 9 (11), pp. 668, DOI 10.3390/ijgi9110668

Author(s): Zhenwu Wang, Benting Wan, Mengjie Han

Keyword(s): Three Dimensional, PSO Algorithm, Sampling Point, Compute Unified Device Architecture, Point Selection, Triangulated Irregular Network, Device Architecture, Ground Penetrating, Geometry Method, Difficult Issue

The identification of underground geohazards is always a difficult issue in the field of underground public safety. This study proposes an interactive visualization framework for underground geohazard recognition on urban roads, which constructs a complete recognition workflow covering data collection, preprocessing, modeling, rendering, and analysis. In this framework, two proposed sampling point selection methods are adopted to improve the interpolation accuracy of the Kriging algorithm based on ground penetrating radar (GPR) technology. An improved Kriging algorithm is put forward, which applies a particle swarm optimization (PSO) algorithm to optimize the Kriging parameters and uses the Compute Unified Device Architecture (CUDA) to run the PSO algorithm in parallel on the GPU in order to raise interpolation efficiency. Furthermore, a layer-constrained triangulated irregular network algorithm is proposed to construct the 3D geohazard bodies, and the space geometry method is used to compute their volume information. The study also presents an implementation system to demonstrate the application of the framework and its related algorithms. This system makes a significant contribution to the demonstration and understanding of underground geohazard recognition in a three-dimensional environment.
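
As a rough illustration of the PSO step mentioned above, here is a minimal global-best particle swarm optimizer. It is a sketch under stated assumptions, not the paper's algorithm: the objective `toy_misfit` is a hypothetical stand-in for a Kriging parameter fit, the coefficients `w`, `c1`, and `c2` are common textbook values, and everything runs serially on the CPU.

```python
import random


def pso(objective, bounds, num_particles=30, iterations=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(num_particles)]
    vel = [[0.0] * dim for _ in range(num_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(num_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iterations):
        # The per-particle work below is the part a GPU version would
        # spread across threads.
        for i in range(num_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_misfit(params):
    """Hypothetical stand-in objective with its minimum at (1.0, 2.0)."""
    a, b = params
    return (a - 1.0) ** 2 + (b - 2.0) ** 2


if __name__ == "__main__":
    print(pso(toy_misfit, bounds=[(0.0, 5.0), (0.0, 5.0)]))
```

Each particle evaluates the objective independently within an iteration, which is what makes the swarm a natural fit for one-thread-per-particle execution.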

GPU Accelerated Nonlinear Electronic Circuits Solver for Transient Simulation of Systems with Large Number of Components

Electronics, 2020, Vol 9 (11), pp. 1819, DOI 10.3390/electronics9111819

Author(s): David Černý, Josef Dobeš

Keyword(s): Krylov Subspace, Direct Method, Iterative Algorithms, Transient Simulation, Compute Unified Device Architecture, Circuit Simulator, Simulation Speed, Device Architecture, Highly Nonlinear, Improved Accuracy

GPU cards have been used for scientific calculations for many years. Despite their ever-increasing performance, there are cases where they may still have problems. This article addresses performance and memory issues that may occur during GPU calculations of iterative algorithms, along with their solutions. Specifically, the article focuses on optimizing the transient simulation of extra-large, highly nonlinear, time-dependent circuits in a SPICE-like electronic circuit simulator core enhanced with an NVIDIA CUDA (Compute Unified Device Architecture) interface and iterative Krylov subspace methods, with emphasis on improved accuracy. The article presents procedures for solving problems that may occur during this integration and negatively affect either the simulation speed or the accuracy of the calculation. Finally, it compares an iterative calculation procedure using GPU cards, calculation by the direct method, and calculation on the CPU only.
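
For readers unfamiliar with Krylov subspace solvers, the sketch below shows the conjugate gradient method, one of the simplest members of that family. This is an assumption for illustration only: the abstract does not say which Krylov method the simulator uses, and conjugate gradient applies only to symmetric positive-definite systems, which circuit matrices generally are not.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A given as a list of rows."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


if __name__ == "__main__":
    print(conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

The dominant cost per iteration is the matrix-vector product, which is why such methods are attractive on GPUs: that product parallelizes across rows, unlike the sequential elimination steps of a direct solver.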

Parallel Genetic Algorithm using Compute Unified Device Architecture (CUDA) for Designing Energy Saving Glass Coating Structure

2020 2nd International Conference on Computer and Information Sciences (ICCIS), 2020, DOI 10.1109/iccis49240.2020.9257606

Author(s): Khoo Wen Xin, Abdul Samad Shibghatullah, Rohana Sham

Keyword(s): Genetic Algorithm, Energy Saving, Glass Coating, Compute Unified Device Architecture, Parallel Genetic Algorithm, Coating Structure, Device Architecture

Fingerprint Matching Using Bozorth3 Algorithm and Parallel Computation on NVIDIA Compute Unified Device Architecture

IOP Conference Series Materials Science and Engineering, 2020, Vol 879, pp. 012109, DOI 10.1088/1757-899x/879/1/012109

Author(s): S Supatmi, I D Sumitra

Keyword(s): Parallel Computation, Fingerprint Matching, Compute Unified Device Architecture, Device Architecture

Performance gains with Compute Unified Device Architecture-enabled eddy current correction for diffusion MRI.

Neuroreport, 2020, Vol 31 (10), pp. 746-753, DOI 10.1097/wnr.0000000000001475

Author(s): Jerome J. Maller, Stuart M. Grieve, Simon J. Vogrin, Thomas Welton

Keyword(s): Eddy Current, Diffusion MRI, Compute Unified Device Architecture, Device Architecture, Performance Gains, Eddy Current Correction

A multilevel index optimization method for fast kinematic calibration configuration of serial manipulators based on Compute Unified Device Architecture parallel computing

Proceedings of the Institution of Mechanical Engineers Part C Journal of Mechanical Engineering Science, 2020, Vol 234 (23), pp. 4708-4724, DOI 10.1177/0954406220925843

Author(s): Xuejie Jiang, Lijin Fang, Yue Gao

Keyword(s): Parallel Computing, Optimization Method, Evaluation Criterion, Kinematic Calibration, Compute Unified Device Architecture, Multilevel Optimization, Computing Technique, Serial Manipulators, Device Architecture, Calibration Accuracy

The kinematic calibration accuracy of serial manipulators is affected by the error expression ability of the selected measurement configurations and by non-geometric errors such as joint disturbance and measurement noise. Based on the observability of configurations, the deviation of identifiable parameters, and calibration robustness, this paper proposes a multilevel evaluation criterion for measurement configuration optimization. In addition, based on the Compute Unified Device Architecture (CUDA) parallel computing technique, the most time-consuming Jacobian matrix calculation program in the algorithm is modified, and an efficient optimization algorithm for measurement configurations is established to guarantee the feasibility of the evaluation criterion. Combined with the CUDA algorithm, multilevel optimization achieves fast calibration with fewer measurement points and relatively higher accuracy. The results illustrate the effectiveness and the universality of the proposed multilevel evaluation criterion. The criterion can be applied in calibration experiments of multi-degree-of-freedom (DOF) serial manipulators with complex structures.
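
To illustrate why a Jacobian calculation lends itself to parallel execution, the sketch below computes a numerical Jacobian column by column for a hypothetical planar two-link arm. This is an assumption for illustration: the paper's Jacobian, manipulator model, and parameters are not given in the abstract, and the link lengths and function names here are made up.

```python
import math


def forward_kinematics(q, link_lengths=(1.0, 0.8)):
    """End-effector (x, y) of a planar serial arm with joint angles q."""
    x = y = angle = 0.0
    for qi, li in zip(q, link_lengths):
        angle += qi
        x += li * math.cos(angle)
        y += li * math.sin(angle)
    return [x, y]


def numeric_jacobian(f, q, eps=1e-6):
    """Central-difference Jacobian with J[i][j] = d f_i / d q_j."""
    columns = []
    for j in range(len(q)):
        # Each column needs only two evaluations of f and no data from
        # other columns, so columns can be computed concurrently.
        hi = list(q)
        lo = list(q)
        hi[j] += eps
        lo[j] -= eps
        columns.append([(a - b) / (2 * eps) for a, b in zip(f(hi), f(lo))])
    # Transpose the list of columns into row-major form.
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    for row in numeric_jacobian(forward_kinematics, [0.3, 0.5]):
        print(row)
```

In a configuration search, this calculation is repeated for many candidate poses, so both the columns and the poses offer independent work items.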

Performance analysis and GPU parallelisation of ECO object tracking algorithm

New Trends and Issues Proceedings on Advances Pure and Applied Sciences, 2020, pp. 109-118, DOI 10.18844/gjpaas.v0i12.4991

Author(s): Ugur Taygan, Adnan Ozsoy

Keyword(s): Performance Analysis, Object Tracking, Parallel Programming, Real Time, Object Classification, Tracking Algorithm, Compute Unified Device Architecture, Convolution Operators, Device Architecture, Programming Tools

The classification and tracking of objects has gained popularity in recent years due to the variety and importance of their application areas. Although object classification does not necessarily have to be real time, object tracking is often intended to be carried out in real time. When an object tracking algorithm focuses mainly on robustness and accuracy, its speed may degrade significantly. Because tracking algorithms are parallelisable by nature, the use of GPUs and other parallel programming tools is increasing in object tracking applications. In this paper, we run experiments on the Efficient Convolution Operators (ECO) object tracking algorithm to detect its time-consuming parts, which are the bottlenecks of the algorithm, and investigate the possibility of GPU parallelisation of these bottlenecks to improve the speed of the algorithm. Finally, the candidate methods are implemented and parallelised using the Compute Unified Device Architecture. Keywords: Object tracking, parallel programming.
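
The reasoning behind profiling before parallelising can be made concrete with Amdahl's law, which bounds the overall speedup when only part of the runtime is accelerated. The sketch below is illustrative only: the stage names and timings are made up, not measurements from the paper, and `accel_factor` is an assumed speedup for the accelerated stage.

```python
def amdahl_speedup(parallel_fraction, accel_factor):
    """Overall speedup when a fraction of runtime is sped up by accel_factor."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / accel_factor)


def bottleneck_report(stage_seconds, accel_factor):
    """Pick the slowest stage and bound the gain from accelerating only it."""
    total = sum(stage_seconds.values())
    name = max(stage_seconds, key=stage_seconds.get)
    fraction = stage_seconds[name] / total
    return name, fraction, amdahl_speedup(fraction, accel_factor)


if __name__ == "__main__":
    # Made-up per-frame timings in seconds, for illustration only.
    timings = {
        "feature_extraction": 0.030,
        "filter_update": 0.050,
        "localisation": 0.020,
    }
    name, fraction, speedup = bottleneck_report(timings, accel_factor=10.0)
    print(f"{name}: {fraction:.0%} of runtime, at most {speedup:.2f}x overall")
```

With these assumed numbers, a tenfold speedup of a stage that takes half the runtime yields less than a twofold gain overall, which is why identifying every significant bottleneck matters before porting code to the GPU.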

Parallel genetic algorithm for N‐Queens problem based on message passing interface‐compute unified device architecture

Computational Intelligence, 2020, Vol 36 (4), pp. 1621-1637, DOI 10.1111/coin.12300

Author(s): Cao Jianli, Chen Zhikui, Wang Yuxin, Guo He

Keyword(s): Genetic Algorithm, Message Passing, Message Passing Interface, Compute Unified Device Architecture, Parallel Genetic Algorithm, Device Architecture

compute unified device architecture
Recently Published Documents

GPU-accelerated deformable registration of cone-beam CT images: design and implementation in the compute unified device architecture

Optimizing Particle Systems through CUDA-Assisted Multithreading

A Three-Dimensional Visualization Framework for Underground Geohazard Recognition on Urban Road-Facing GPR Data

GPU Accelerated Nonlinear Electronic Circuits Solver for Transient Simulation of Systems with Large Number of Components

Parallel Genetic Algorithm using Compute Unified Device Architecture (CUDA) for Designing Energy Saving Glass Coating Structure

Fingerprint Matching Using Bozorth3 Algorithm and Parallel Computation on NVIDIA Compute Unified Device Architecture

Performance gains with Compute Unified Device Architecture-enabled eddy current correction for diffusion MRI.

A multilevel index optimization method for fast kinematic calibration configuration of serial manipulators based on Compute Unified Device Architecture parallel computing

Performance analysis and GPU parallelisation of ECO object tracking algorithm

Parallel genetic algorithm for N‐Queens problem based on message passing interface‐compute unified device architecture
