compute unified device architecture
Recently Published Documents


Total documents: 100 (five years: 11)
H-index: 10 (five years: 1)

Particle systems present challenges that have attracted a large amount of attention in both usage and optimization. Their use has driven simulation complexity toward ever greater demands on data size and accuracy, so optimization has become a moving target for researchers. Studies show that multithreading has the potential to make simulation efficient when optimizing complex, data-intensive particle systems. CUDA (Compute Unified Device Architecture) works with programming languages such as C/C++ and Python to make multithreaded parallel programming easier. This work analyzes particle systems using CUDA and provides an understanding of how parameters such as the particle count and grid size influence simulation performance. We improve the CUDA particles demo by Nvidia using our Python scripts and study the impact of particles and grids on execution time and throughput. Experimental results indicate that a required level of performance can be achieved by varying the number of particles, the size of the grids, and the orientation of the grids as needed.
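
To make the interplay between particle count and grid size concrete, the following is a minimal CPU-side sketch in plain Python, not the authors' scripts and not the Nvidia demo code. It assumes a uniform grid over the unit cube; the helper names `cell_index`, `bin_particles`, and `same_cell_pairs` are hypothetical, and the same-cell pair count is only a rough proxy for collision work.

```python
import random
from collections import Counter

def cell_index(pos, cell_size, grid_dim):
    """Map a position in [0, 1)^3 to a flattened grid-cell index."""
    # Clamp to the last cell to guard against floating-point round-up.
    ix, iy, iz = (min(int(c / cell_size), grid_dim - 1) for c in pos)
    return (iz * grid_dim + iy) * grid_dim + ix

def bin_particles(positions, grid_dim):
    """Count how many particles fall into each cell of a grid_dim^3 grid."""
    cell_size = 1.0 / grid_dim
    return Counter(cell_index(p, cell_size, grid_dim) for p in positions)

def same_cell_pairs(counts):
    """Number of particle pairs sharing a cell (rough proxy for collision work)."""
    return sum(n * (n - 1) // 2 for n in counts.values())

# Hypothetical workload: 2000 uniformly random particles in the unit cube.
random.seed(0)
particles = [(random.random(), random.random(), random.random()) for _ in range(2000)]

for grid_dim in (4, 8, 16):
    counts = bin_particles(particles, grid_dim)
    # Columns: grid dimension, occupied cells, same-cell pairs to test.
    print(grid_dim, len(counts), same_cell_pairs(counts))
```

A finer grid leaves fewer particles per cell and therefore fewer candidate pairs, at the cost of more cells to manage. This is one plausible reason why grid size would affect execution time in a simulation of this kind.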


2020 ◽  
Vol 9 (11) ◽  
pp. 668
Author(s):  
Zhenwu Wang ◽  
Benting Wan ◽  
Mengjie Han

The identification of underground geohazards has long been a difficult issue in the field of underground public safety. This study proposes an interactive visualization framework for underground geohazard recognition on urban roads, which constructs a complete recognition workflow by incorporating data collection, preprocessing, modeling, rendering and analysis. In this framework, two proposed sampling point selection methods are adopted to enhance the interpolation accuracy of the Kriging algorithm based on ground penetrating radar (GPR) technology. An improved Kriging algorithm is put forward, which applies a particle swarm optimization (PSO) algorithm to optimize the Kriging parameters and uses the Compute Unified Device Architecture (CUDA) to run the PSO algorithm in parallel on the GPU in order to raise interpolation efficiency. Furthermore, a layer-constrained triangulated irregular network algorithm is proposed to construct the 3D geohazard bodies, and the space geometry method is used to compute their volume information. The study also presents an implementation system to demonstrate the application of the framework and its related algorithms. This system makes a significant contribution to the demonstration and understanding of underground geohazard recognition in a three-dimensional environment.
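
The PSO step mentioned above can be illustrated with a small serial sketch in plain Python. This is a generic textbook-style PSO, not the authors' improved Kriging implementation: the function `pso`, its default coefficients, and `toy_objective` (a stand-in for a Kriging fitting error over two parameters) are all assumptions made for illustration.

```python
import random

def pso(objective, bounds, n_particles=30, n_iters=200,
        w=0.72, c1=1.49, c2=1.49, seed=0):
    """Minimise `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # per-particle best positions
    best_val = [objective(p) for p in pos]       # per-particle best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and keep it inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_objective(params):
    """Hypothetical stand-in for a fitting error; minimum at (2, -1)."""
    a, b = params
    return (a - 2.0) ** 2 + (b + 1.0) ** 2

best_params, best_error = pso(toy_objective, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_error)
```

Each particle's objective evaluation is independent of the others within an iteration, which is the property that makes a GPU mapping (for example, one thread per particle) natural.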


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1819
Author(s):  
David Černý ◽  
Josef Dobeš

GPU cards have been used for scientific calculations for many years. Despite their ever-increasing performance, there are cases where they may still have problems. This article addresses performance and memory issues that may occur during GPU calculations of iterative algorithms, together with their solutions. Specifically, the article focuses on optimizing the transient simulation of extra-large, highly nonlinear, time-dependent circuits in a SPICE-like electronic circuit simulator core enhanced with an NVIDIA CUDA (Compute Unified Device Architecture) interface and iterative Krylov subspace methods, with emphasis on improved accuracy. The article presents procedures for solving problems that may occur during this integration and negatively affect either the simulation speed or the accuracy of the calculation. Finally, the iterative calculation procedure running on GPU cards is compared with calculation by the direct method and with calculation on the CPU only.
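
As a reminder of what an iterative Krylov subspace solver looks like, here is a minimal conjugate gradient sketch in plain Python. The abstract does not say which Krylov method the simulator uses, and conjugate gradient applies only to symmetric positive-definite systems, whereas circuit matrices are generally nonsymmetric. The solver, the helper names, and the 2x2 test system are therefore illustrative assumptions only.

```python
def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(a, b, tol=1e-10, max_iters=100):
    """Solve A x = b for symmetric positive-definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = b[:]                 # residual b - A x for the zero initial guess
    p = r[:]                 # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iters):
        if rs_old ** 0.5 < tol:      # residual norm small enough: stop
            break
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction, conjugate to the previous ones.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Small hypothetical SPD system; the exact solution is (1/11, 7/11).
a = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(a, b)
print(x)
```

The dominant costs per iteration are the matrix-vector product and the inner products. These are the operations that benefit most from GPU execution, and the accumulated rounding in the inner products is one place where accuracy concerns can arise.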


Author(s):  
Xuejie Jiang ◽  
Lijin Fang ◽  
Yue Gao

The kinematic calibration accuracy of serial manipulators is affected by the error expression ability of the selected measurement configurations and by non-geometric errors such as joint disturbance and measurement noise. Based on the observability of configurations, the deviation of identifiable parameters, and calibration robustness, this paper proposes a multilevel evaluation criterion for measurement configuration optimization. In addition, the most time-consuming part of the algorithm, the Jacobian matrix calculation, is reimplemented using the Compute Unified Device Architecture (CUDA) parallel computing technique, and an efficient optimization algorithm for measurement configurations is established to guarantee the feasibility of the evaluation criterion. Combined with the CUDA algorithm, fast calibration is implemented with fewer measurement points and relatively higher accuracy by means of multilevel optimization. The results illustrate the effectiveness and the universality of the proposed multilevel evaluation criterion. The criterion can be applied in calibration experiments of multi-degree-of-freedom (DOF) serial manipulators with complex structures.
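
To illustrate why a Jacobian calculation lends itself to parallelisation, the sketch below computes a Jacobian column by column with central differences in plain Python. It is not the paper's algorithm: the two-link planar arm in `forward_kinematics`, the parameter ordering, and the finite-difference approach are all assumptions chosen to keep the example short.

```python
import math

def forward_kinematics(params):
    """End-effector (x, y) of a hypothetical planar 2-link arm.

    params = [l1, l2, q1, q2]: two link lengths and two joint angles.
    """
    l1, l2, q1, q2 = params
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return [x, y]

def jacobian_column(f, params, j, h=1e-6):
    """Central-difference derivative of f with respect to parameter j."""
    plus, minus = list(params), list(params)
    plus[j] += h
    minus[j] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(plus), f(minus))]

def numerical_jacobian(f, params):
    """Assemble the Jacobian; each column is computed independently."""
    cols = [jacobian_column(f, params, j) for j in range(len(params))]
    return [list(row) for row in zip(*cols)]  # transpose columns into rows

params = [1.0, 0.5, 0.3, -0.2]
jac = numerical_jacobian(forward_kinematics, params)
for row in jac:
    print(row)
```

No column depends on any other, and the same holds across measurement configurations, so the work could be distributed over GPU threads without synchronisation between them.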


Author(s):  
Ugur Taygan ◽  
Adnan Ozsoy

The classification and tracking of objects have gained popularity in recent years due to the variety and importance of their application areas. Although object classification does not necessarily have to be real time, object tracking is often intended to be carried out in real time. When an object tracking algorithm focuses mainly on robustness and accuracy, its speed may degrade significantly. Because these algorithms are parallelisable by nature, the use of GPUs and other parallel programming tools is increasing in object tracking applications. In this paper, we run experiments on the Efficient Convolution Operators object tracking algorithm in order to detect its time-consuming parts, which are the bottlenecks of the algorithm, and investigate the possibility of GPU parallelisation of the bottlenecks to improve the speed of the algorithm. Finally, the candidate methods are implemented and parallelised using the Compute Unified Device Architecture. Keywords: Object tracking, parallel programming.

