GPU-ACCELERATED INTERACTIVE VISUALIZATION OF 3D VOLUMETRIC DATA USING CUDA

2013 ◽  
Vol 13 (02) ◽  
pp. 1340003 ◽  
Author(s):  
PIYUSH KUMAR ◽  
ANUPAM AGRAWAL

Improving image quality and rendering speed has always been a challenge for programmers involved in large-scale volume rendering, especially in the field of medical image processing. This paper aims to perform volume rendering on the graphics processing unit (GPU), whose massively parallel capability has the potential to revolutionize this field. This work is improved with the help of a GPU-accelerated system. The final results would allow doctors to diagnose and analyze 2D computed tomography (CT) scan data using three-dimensional visualization techniques. The system has been used on multiple types of datasets, ranging from 10 MB to 350 MB of medical volume data. Further, the use of the compute unified device architecture (CUDA) framework, a technology with a low learning curve, for this purpose would greatly reduce the cost involved in CT scan analysis and hence bring it to the masses. The volume rendering was performed on an Nvidia Tesla C1060 card, whose 240 CUDA cores execute data in parallel, and its performance has also been benchmarked.

Author(s):  
Hui Huang ◽  
Jian Chen ◽  
Blair Carlson ◽  
Hui-Ping Wang ◽  
Paul Crooker ◽  
...  

Due to the enormous computational cost, current residual stress simulations of multipass girth welds are mostly performed using two-dimensional (2D) axisymmetric models. A 2D model can provide only a limited estimate of the residual stresses because it assumes an axisymmetric distribution. In this study, a highly efficient thermal-mechanical finite element code for three-dimensional (3D) models has been developed based on high-performance graphics processing unit (GPU) computers. Our code is further accelerated by considering the unique physics associated with welding processes, which are characterized by steep temperature gradients and a moving arc heat source. It is capable of modeling large-scale welding problems that cannot be easily handled by existing commercial simulation tools. To demonstrate its accuracy and efficiency, our code was compared with commercial software by simulating a 3D multipass girth weld model with over 1 million elements. Our code achieved solution accuracy comparable to that of the commercial software, with a more than 100-fold saving in computational cost. Moreover, the three-dimensional analysis demonstrated a more realistic stress distribution that is not axisymmetric in the hoop direction.


2019 ◽  
Vol 9 (24) ◽  
pp. 5437
Author(s):  
Lei Xiao ◽  
Guoxiang Yang ◽  
Kunyang Zhao ◽  
Gang Mei

In numerical modeling, mesh quality is one of the decisive factors that strongly affect the accuracy of calculations and the convergence of iterations. To improve mesh quality, the Laplacian mesh smoothing method, which repositions nodes to the barycenter of adjacent nodes without changing the mesh topology, has been widely used. However, smoothing a large-scale three-dimensional mesh is computationally expensive, and few studies have focused on accelerating the Laplacian mesh smoothing method by utilizing the graphics processing unit (GPU). This paper presents a GPU-accelerated parallel algorithm for Laplacian smoothing in three dimensions by considering the influence of different data layouts and iteration forms. To evaluate the efficiency of the GPU implementation, the parallel solution is compared with the original serial solution. Experimental results show that our parallel implementation is up to 46 times faster than the serial version.
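The barycenter update described in this abstract can be illustrated with a minimal serial sketch. This is not the authors' GPU implementation: the function name `laplacian_smooth`, the adjacency-list layout, and the `fixed` set of boundary nodes are assumptions made for illustration, and the update shown is the Jacobi-style form, in which every node reads positions from the previous iteration. Because each node's new position then depends only on old positions, all nodes could in principle be updated in parallel.

```python
def laplacian_smooth(points, neighbors, fixed=frozenset(), iterations=1):
    """Jacobi-style Laplacian smoothing (illustrative sketch).

    points    : list of (x, y, z) node coordinates
    neighbors : neighbors[i] lists the indices of nodes adjacent to node i
    fixed     : indices of nodes that must not move (e.g. boundary nodes)
    The mesh topology (the neighbors lists) is never modified.
    """
    pts = [tuple(p) for p in points]
    for _ in range(iterations):
        new_pts = list(pts)  # write into a copy so reads see old positions only
        for i, adj in enumerate(neighbors):
            if i in fixed or not adj:
                continue
            n = len(adj)
            # Move node i to the barycenter of its adjacent nodes.
            new_pts[i] = tuple(sum(pts[j][k] for j in adj) / n for k in range(3))
        pts = new_pts
    return pts
```

For example, an off-center interior node surrounded by the four fixed corners of a square is moved to the center of that square after one iteration.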


2012 ◽  
Vol 433-440 ◽  
pp. 5448-5452 ◽  
Author(s):  
Li Ping Zhao ◽  
Mei Fang ◽  
Yuan Wang Wei

An efficient compressed volume rendering algorithm is presented. First, the original volume data is compressed by a content-based classified hierarchical vector quantization algorithm. Second, the compressed volume data is transferred to the graphics processing unit (GPU) and decompressed in real time. The decompressed data is then rendered by a three-dimensional texture mapping method to accelerate rendering. Experimental results show that, in addition to reasonable fidelity and faster rendering speed, the presented algorithm can obtain multiple levels of detail on off-the-shelf graphics hardware.
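As a rough illustration of why vector-quantized data is cheap to decompress, the sketch below shows plain, flat vector quantization on small blocks of voxel values. It is a simplification, not the content-based classified hierarchical scheme of the paper: the function names `vq_encode` and `vq_decode`, the fixed codebook, and the block layout are all assumptions. Encoding searches for the nearest codeword, while decoding is a single table lookup per block, which is the kind of operation that maps well onto graphics hardware.

```python
def vq_encode(blocks, codebook):
    """Replace each block by the index of its nearest codeword.

    Nearest is measured by squared Euclidean distance.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda c: dist2(block, codebook[c]))
        for block in blocks
    ]


def vq_decode(indices, codebook):
    """Decompress by looking up each index in the codebook."""
    return [codebook[i] for i in indices]
```

With a codebook of three 4-value codewords, each block is stored as one small index instead of four voxel values, at the cost of an approximation error.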


2018 ◽  
Vol 7 (12) ◽  
pp. 472 ◽  
Author(s):  
Bo Wan ◽  
Lin Yang ◽  
Shunping Zhou ◽  
Run Wang ◽  
Dezhi Wang ◽  
...  

The road-network matching method is an effective tool for map integration, fusion, and update. Due to the complexity of road networks in the real world, matching methods often contain a series of complicated processes to identify homonymous roads and deal with their intricate relationships. However, traditional road-network matching algorithms, which are mainly central processing unit (CPU)-based approaches, may run into performance bottlenecks when facing big data. We developed a particle-swarm optimization (PSO)-based parallel road-network matching method on a graphics processing unit (GPU). Based on the characteristics of the two main stages (similarity computation and matching-relationship identification), data-partition and task-partition strategies were utilized, respectively, to fully use GPU threads. Experiments were conducted on datasets with 14 different scales. Results indicate that the parallel PSO-based matching algorithm (PSOM) could correctly identify most matching relationships with an average accuracy of 84.44%, which was at the same level as the accuracy of a benchmark, the probability-relaxation-matching (PRM) method. The PSOM approach significantly reduced the road-network matching time in dealing with large amounts of data in comparison with the PRM method. This paper provides a common parallel algorithm framework for road-network matching algorithms and contributes to the integration and update of large-scale road networks.


Author(s):  
Alan Gray ◽  
Kevin Stratford

Leading high-performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data-parallel hardware in a platform-agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus a separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large-scale supercomputers.
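The Roofline model mentioned in this abstract bounds attainable performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. A minimal sketch of that bound follows; the function name `roofline_bound` and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
def roofline_bound(peak_gflops, bandwidth_gb_s, intensity_flop_per_byte):
    """Upper bound on attainable performance (GFLOP/s) under the Roofline model.

    peak_gflops             : peak floating-point rate of the device
    bandwidth_gb_s          : peak memory bandwidth in GB/s
    intensity_flop_per_byte : floating-point operations per byte moved
    """
    memory_bound = bandwidth_gb_s * intensity_flop_per_byte
    return min(peak_gflops, memory_bound)
```

On a hypothetical device with a 1000 GFLOP/s peak and 100 GB/s of bandwidth, a kernel performing 2 FLOP per byte is limited by memory to 200 GFLOP/s, whereas a kernel performing 20 FLOP per byte reaches the compute ceiling.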


2021 ◽  
Vol 87 (5) ◽  
pp. 363-373
Author(s):  
Long Chen ◽  
Bo Wu ◽  
Yao Zhao ◽  
Yuan Li

Real-time acquisition and analysis of three-dimensional (3D) human body kinematics are essential in many applications. In this paper, we present a real-time photogrammetric system consisting of a stereo pair of red-green-blue (RGB) cameras. The system incorporates a multi-threaded and graphics processing unit (GPU)-accelerated solution for real-time extraction of 3D human kinematics. A deep learning approach is adopted to automatically extract two-dimensional (2D) human body features, which are then converted to 3D features based on photogrammetric processing, including dense image matching and triangulation. The multi-threading scheme and GPU-acceleration enable real-time acquisition and monitoring of 3D human body kinematics. Experimental analysis verified that the system processing rate reached ∼18 frames per second. The effective detection distance reached 15 m, with a geometric accuracy of better than 1% of the distance within a range of 12 m. The real-time measurement accuracy for human body kinematics ranged from 0.8% to 7.5%. The results suggest that the proposed system is capable of real-time acquisition and monitoring of 3D human kinematics with favorable performance, showing great potential for various applications.


2011 ◽  
Vol 110-116 ◽  
pp. 2740-2745
Author(s):  
Kirana Kumara P. ◽  
Ashitava Ghosal

Real-time simulation of deformable solids is essential for some applications, such as biological organ simulations for surgical simulators. In this work, deformable solids are approximated as linear elastic, and an easy and straightforward numerical technique, the Finite Point Method (FPM), is used to model three-dimensional linear elastostatics. A graphics processing unit (GPU) is used to accelerate the computations. Results show that the Finite Point Method, together with a GPU, can compute three-dimensional linear elastostatic responses of solids at rates suitable for real-time graphics, for solids represented by a reasonable number of points.

