A Distributed GPU-Based Framework for Real-Time 3D Volume Rendering of Large Astronomical Data Cubes

2012 ◽  
Vol 29 (3) ◽  
pp. 340-351 ◽  
Author(s):  
A. H. Hassan ◽  
C. J. Fluke ◽  
D. G. Barnes

We present a framework to volume-render three-dimensional data cubes interactively using distributed ray-casting and volume-bricking over a cluster of workstations powered by one or more graphics processing units (GPUs) and a multi-core central processing unit (CPU). The main design target for this framework is an in-core visualization solution capable of delivering three-dimensional interactive views of terabyte-sized data cubes. We tested the presented framework using a computing cluster comprising 64 nodes with a total of 128 GPUs. The framework proved scalable, rendering a 204 GB data cube at an average of 30 frames per second. Our performance analyses also compare the NVIDIA Tesla 1060 and 2050 GPU architectures and the effect of increasing the visualization output resolution on rendering performance. Although our initial focus, as shown in the examples presented in this work, is volume rendering of spectral data cubes from radio astronomy, we contend that our approach has applicability to other disciplines where close to real-time volume rendering of terabyte-order three-dimensional data sets is a requirement.
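
The abstract does not reproduce the ray-casting kernel itself, but the per-brick compositing it describes can be sketched in a few lines. Below is a minimal, single-node NumPy sketch of front-to-back alpha compositing along one axis of a density brick; the linear transfer function and the step_opacity parameter are illustrative assumptions, not the paper's actual settings, and the distributed version would composite per-brick images in depth order across GPUs.

```python
import numpy as np

def raycast_brick(volume, step_opacity=0.02):
    """Front-to-back alpha compositing along the z axis of one brick.

    volume : 3D array of scalar samples in [0, 1] (e.g. one data-cube brick).
    Returns a 2D image; in the distributed setting each GPU renders its
    brick like this and the partial images are blended in depth order.
    """
    h, w, depth = volume.shape
    color = np.zeros((h, w))          # accumulated intensity per ray
    alpha = np.zeros((h, w))          # accumulated opacity per ray
    for z in range(depth):            # march every ray one slice at a time
        sample = volume[:, :, z]
        a = np.clip(sample * step_opacity, 0.0, 1.0)  # simple linear transfer
        color += (1.0 - alpha) * a * sample            # front-to-back blend
        alpha += (1.0 - alpha) * a
    return color

# Toy brick: a bright Gaussian blob standing in for an emission-line source.
z, y, x = np.mgrid[-1:1:64j, -1:1:64j, -1:1:64j]
brick = np.exp(-8 * (x**2 + y**2 + z**2))
image = raycast_brick(brick)
print(image.shape, image.max())
```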

2010 ◽  
Vol 133 (2) ◽  
Author(s):  
Tobias Brandvik ◽  
Graham Pullan

A new three-dimensional Navier–Stokes solver for flows in turbomachines has been developed. The new solver is based on the latest version of the Denton codes but has been implemented to run on graphics processing units (GPUs) instead of the traditional central processing unit (CPU). The change in processor enables an order-of-magnitude reduction in run-time due to the higher performance of the GPU. Scaling results for a 16-node GPU cluster are also presented, showing almost linear scaling for typical turbomachinery cases. For validation purposes, a test case consisting of a three-stage turbine with complete hub and casing leakage paths is described. Good agreement is obtained with previously published experimental results. The simulation runs in less than 10 min on a cluster with four GPUs.


2021 ◽  
Vol 7 (2) ◽  
pp. 35
Author(s):  
Boris Shirokikh ◽  
Alexey Shevtsov ◽  
Alexandra Dalechina ◽  
Egor Krivov ◽  
Valery Kostjuchenko ◽  
...  

The prevailing approach to three-dimensional (3D) medical image segmentation is to use convolutional networks. Recently, deep learning methods have achieved human-level performance in several important applied problems, such as volumetry for lung-cancer diagnosis or delineation for radiation therapy planning. However, state-of-the-art architectures, such as U-Net and DeepMedic, are computationally heavy and require workstations accelerated with graphics processing units for fast inference, and scarce research has been conducted on enabling fast central processing unit computations for such networks. Our paper fills this gap. We propose a new segmentation method with a human-like technique for segmenting a 3D study: first, we analyze the image at a small scale to identify areas of interest, and then we process only the relevant feature-map patches. Our method not only reduces the inference time from 10 min to 15 s but also preserves state-of-the-art segmentation quality, as we illustrate in experiments with two large datasets.
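
The coarse-to-fine idea described above (a cheap low-resolution pass to find candidate regions, followed by the expensive network only on the selected patches) can be sketched as follows. The coarse_score and fine_segment callables are hypothetical stand-ins for the paper's actual networks, which are not reproduced here:

```python
import numpy as np

def two_stage_segment(volume, coarse_score, fine_segment,
                      patch=32, threshold=0.5):
    """Coarse-to-fine 3D segmentation in the spirit of the paper.

    coarse_score(volume)  -> low-cost per-voxel interest map in [0, 1]
    fine_segment(patch3d) -> expensive per-voxel mask for one patch
    Only patches whose coarse score exceeds `threshold` are sent to the
    expensive model, which is where the CPU inference time is saved.
    """
    mask = np.zeros(volume.shape, dtype=np.uint8)
    interest = coarse_score(volume)
    for i in range(0, volume.shape[0], patch):
        for j in range(0, volume.shape[1], patch):
            for k in range(0, volume.shape[2], patch):
                sl = (slice(i, i + patch),
                      slice(j, j + patch),
                      slice(k, k + patch))
                if interest[sl].max() >= threshold:   # skip empty regions
                    mask[sl] = fine_segment(volume[sl])
    return mask

# Stand-in models: intensity as "interest", simple thresholding as "network".
vol = np.random.rand(64, 64, 64)
out = two_stage_segment(vol,
                        coarse_score=lambda v: v,
                        fine_segment=lambda p: (p > 0.9).astype(np.uint8))
print(out.sum(), "voxels flagged")
```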


Owing to extensive growth across sectors such as software, telecom, healthcare and defence, there has been a marked increase in both the number and the duration of meetings, conference calls, reconnaissance stakeouts and financial reviews. The reports obtained from these play a significant role in defining plans of action. The proposed model converts real-time speech to the corresponding text, then to a summary using Natural Language Grammar (NLG) and Abstract Meaning Representation (AMR) graphs, and finally converts the obtained summary back to speech. The proposed model achieves this task using two major algorithms: 1) Deep Speech 2 and 2) AMR graphs. The recommended speech-recognition model achieves a 4x speedup when the algorithm runs on a central processing unit (CPU), and running the deep learning algorithms on dedicated graphics processing units (GPUs) can give a 21x speedup. The performance of the summarizer is close to that of the Lead-3-AMR-Baseline model, a strong baseline for the CNN/DailyMail dataset. The summarizer achieves a ROUGE score close to the Lead-3-AMR-Baseline model, with an accuracy of 99.37%.
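
A hedged sketch of the three-stage pipeline the abstract outlines (speech recognition, then summarization, then text-to-speech). The stage functions below are placeholders: the paper uses Deep Speech 2 for recognition and AMR-graph summarization, neither of which is reproduced here.

```python
from typing import Callable

def speech_summary_pipeline(audio: bytes,
                            asr: Callable[[bytes], str],
                            summarize: Callable[[str], str],
                            tts: Callable[[str], bytes]) -> bytes:
    """Wire the three stages described in the abstract.

    asr       : speech recognizer (the paper uses Deep Speech 2)
    summarize : text summarizer (the paper builds AMR graphs of the text)
    tts       : text-to-speech engine for the final spoken summary
    Each stage is a plain function, so GPU-accelerated models can be
    swapped in without changing the pipeline itself.
    """
    transcript = asr(audio)
    summary = summarize(transcript)
    return tts(summary)

# Trivial stand-ins so the pipeline runs end to end.
spoken = speech_summary_pipeline(
    b"raw-pcm-audio",
    asr=lambda a: "the quarterly review covered budget and hiring",
    summarize=lambda t: t.split(" covered ")[0] + " summary",
    tts=lambda s: s.encode("utf-8"),
)
print(spoken)
```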


2010 ◽  
Vol 43 (6) ◽  
pp. 1535-1539 ◽  
Author(s):  
Filipe R. N. C. Maia ◽  
Tomas Ekeberg ◽  
David van der Spoel ◽  
Janos Hajdu

The past few years have seen a tremendous growth in the field of coherent X-ray diffractive imaging, in large part due to X-ray free-electron lasers, which provide a peak brilliance billions of times higher than that of synchrotrons. However, this rapid development in terms of hardware has not been matched on the software side. The release of Hawk is intended to close this gap. To the authors' knowledge, Hawk is the first publicly available and fully open-source software program for reconstructing images from continuous diffraction patterns. The software handles all steps leading from a raw diffraction pattern to a reconstructed two-dimensional image, including geometry determination, background correction, masking and phasing. It also includes preliminary three-dimensional support and support for graphics processing units using the Compute Unified Device Architecture, which speeds up processing by orders of magnitude compared to a single central processing unit. Hawk implements numerous algorithms and is easily extended. This, in combination with its open-source licence, provides a platform for other groups to test, develop and distribute their own algorithms. Hawk is available under the GNU General Public License from http://xray.bmc.uu.se/hawk.
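
Hawk's own phasing code is not shown in the abstract; as an illustration of the reconstruction step it automates, here is a minimal error-reduction loop (the classic Gerchberg-Saxton-style alternation between measured Fourier amplitudes and a real-space support constraint). This is one of the standard algorithms in this family, not Hawk's specific implementation:

```python
import numpy as np

def error_reduction(magnitudes, support, iters=200, seed=0):
    """Minimal error-reduction phasing loop.

    magnitudes : measured Fourier amplitudes (sqrt of the diffraction pattern)
    support    : boolean mask where the object is allowed to be nonzero
    Alternates between enforcing the measured amplitudes in Fourier space
    and the support constraint in real space.
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0, 2 * np.pi, magnitudes.shape)
    g = np.fft.ifft2(magnitudes * np.exp(1j * phase)).real
    for _ in range(iters):
        G = np.fft.fft2(g)
        G = magnitudes * np.exp(1j * np.angle(G))  # keep measured amplitudes
        g = np.fft.ifft2(G).real
        g[~support] = 0.0                          # enforce the support
        g[g < 0] = 0.0                             # and non-negativity
    return g

# Toy object: recover a square from its noise-free diffraction amplitudes.
obj = np.zeros((64, 64))
obj[24:40, 24:40] = 1.0
mags = np.abs(np.fft.fft2(obj))
sup = np.zeros_like(obj, dtype=bool)
sup[16:48, 16:48] = True
rec = error_reduction(mags, sup)
print(float(np.abs(rec - obj).mean()))  # residual; twin-image ambiguity may remain
```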


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Marwan Abdellah ◽  
Ayman Eldeib ◽  
Amr Sharawi

Fourier volume rendering (FVR) is a significant visualization technique that has been used widely in digital radiography. As a result of its O(N² log N) time complexity, it provides a faster alternative to spatial-domain volume rendering algorithms, which are O(N³) in computational complexity. Relying on the Fourier projection-slice theorem, this technique operates on the spectral representation of a 3D volume instead of processing its spatial representation, generating attenuation-only projections that look like X-ray radiographs. Due to the rapid evolution of its underlying architecture, the graphics processing unit (GPU) has become an attractive, capable platform that can deliver enormous raw computational power compared to the central processing unit (CPU) on a per-dollar basis. The introduction of the compute unified device architecture (CUDA) enables embarrassingly parallel algorithms to run efficiently on CUDA-capable GPU architectures. In this work, a high-performance GPU-accelerated implementation of the FVR pipeline on CUDA-enabled GPUs is presented. The proposed implementation achieves a speed-up of 117x compared to a single-threaded hybrid implementation that uses the CPU and GPU together, by executing the rendering pipeline entirely on recent GPU architectures.
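
The projection-slice theorem the abstract relies on is easy to demonstrate on a CPU with NumPy: after one 3D FFT of the volume, each axis-aligned projection costs only a 2D slice extraction and an inverse 2D FFT, which is where the O(N² log N) per-view complexity comes from. This sketch uses the k_z = 0 plane for simplicity; arbitrary view angles would require resampling the spectrum, and the paper's CUDA acceleration is not shown.

```python
import numpy as np

def fourier_projection(volume):
    """X-ray-style projection via the Fourier projection-slice theorem.

    A 2D slice through the centre of the 3D spectrum, inverse-transformed,
    equals the line integral of the volume along the orthogonal axis.
    """
    spectrum = np.fft.fftn(volume)                 # O(N^3 log N), done once
    central_slice = spectrum[:, :, 0]              # k_z = 0 plane
    projection = np.fft.ifft2(central_slice).real  # O(N^2 log N) per view
    return projection

vol = np.zeros((32, 32, 32))
vol[8:24, 8:24, 8:24] = 1.0
ps = fourier_projection(vol)
direct = vol.sum(axis=2)              # brute-force line integral for comparison
print(np.allclose(ps, direct))        # True up to FFT rounding
```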


2021 ◽  
Vol 13 (5) ◽  
pp. 2950
Author(s):  
Su-Kyung Sung ◽  
Eun-Seok Lee ◽  
Byeong-Seok Shin

Climate change increases the frequency of localized heavy rain and typhoons. As a result, mountain disasters, such as landslides and earthworks, continue to occur, causing damage to roads and residential areas downstream. Moreover, large-scale civil engineering works, including dam construction, cause rapid changes in the terrain, which harm the stability of residential areas. Disasters such as landslides and earthworks occur extensively, and field investigation has its limitations; thus, many studies are being conducted to model terrain geometrically and to observe changes in terrain according to external factors. However, conventional topographic methods present results in a form that only people with specialized knowledge can interpret, with little consideration for the three-dimensional visualization that helps non-experts understand. We need a way to express changes in terrain in real time that is intuitive for non-experts. In conventional height-based terrain modeling and simulation, some of the sampled data are irregularly distorted and do not show the exact terrain shape. The proposed method utilizes a hierarchical vertex cohesion map to correct terrain modeled inaccurately owing to uniform height sampling, and it compensates for geometric errors using Hausdorff distances rather than considering only the elevation differences of the terrain. Mesh reconstruction, which triangulates the three vertices placed at each location into the smallest unit of 3D model data, can be done at high speed on graphics processing units (GPUs). Our experiments confirm that changes in terrain can be expressed accurately and quickly compared with existing methods. These functions can improve the sustainability of residential spaces by predicting the damage caused by mountain disasters or civil engineering works around the city, and they make the results easy for non-experts to understand.
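
As one concrete piece of the pipeline described above, the geometric-error test based on Hausdorff distances can be sketched directly. The point sets, the tolerance value and the needs_refinement helper below are illustrative assumptions, not the paper's actual data structures:

```python
import numpy as np

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two 3D point sets.

    Used here the way the abstract suggests: measure how far a uniformly
    height-sampled terrain mesh strays from a reference point cloud, so
    badly distorted regions can be flagged for refinement.
    """
    d = np.linalg.norm(points_a[:, None, :] - points_b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def needs_refinement(mesh_vertices, reference, tolerance=0.5):
    """Flag a terrain patch whose geometric error exceeds `tolerance`."""
    return hausdorff(mesh_vertices, reference) > tolerance

# Reference terrain vs. a coarse sampling of it with one distorted vertex.
ref = np.array([[x, y, np.sin(x) * np.cos(y)]
                for x in np.linspace(0, 3, 20)
                for y in np.linspace(0, 3, 20)])
coarse = ref[::4].copy()
coarse[10, 2] += 2.0                  # simulated sampling distortion
print(needs_refinement(coarse, ref))  # True: this patch gets rebuilt
```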


2021 ◽  
Vol 87 (5) ◽  
pp. 363-373
Author(s):  
Long Chen ◽  
Bo Wu ◽  
Yao Zhao ◽  
Yuan Li

Real-time acquisition and analysis of three-dimensional (3D) human body kinematics are essential in many applications. In this paper, we present a real-time photogrammetric system consisting of a stereo pair of red-green-blue (RGB) cameras. The system incorporates a multi-threaded and graphics processing unit (GPU)-accelerated solution for real-time extraction of 3D human kinematics. A deep learning approach is adopted to automatically extract two-dimensional (2D) human body features, which are then converted to 3D features based on photogrammetric processing, including dense image matching and triangulation. The multi-threading scheme and GPU-acceleration enable real-time acquisition and monitoring of 3D human body kinematics. Experimental analysis verified that the system processing rate reached ∼18 frames per second. The effective detection distance reached 15 m, with a geometric accuracy of better than 1% of the distance within a range of 12 m. The real-time measurement accuracy for human body kinematics ranged from 0.8% to 7.5%. The results suggest that the proposed system is capable of real-time acquisition and monitoring of 3D human kinematics with favorable performance, showing great potential for various applications.
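
The conversion from matched 2D features to 3D points rests on standard stereo triangulation; for a rectified pair, depth follows from Z = f * B / disparity. The sketch below uses hypothetical calibration values (focal length, baseline, principal point) rather than the system's actual parameters, and omits the dense-matching and deep-learning stages:

```python
import numpy as np

def triangulate_rectified(xl, xr, y, focal_px, baseline_m, cx, cy):
    """Depth from a matched feature in a rectified stereo pair.

    xl, xr : horizontal pixel coordinates of the same body joint in the
             left and right images (e.g. from a 2D pose network)
    Returns the 3D point in the left-camera frame, using the standard
    rectified-stereo result Z = f * B / disparity.
    """
    disparity = xl - xr
    z = focal_px * baseline_m / disparity
    x3 = (xl - cx) * z / focal_px
    y3 = (y - cy) * z / focal_px
    return np.array([x3, y3, z])

# Hypothetical calibration: 1200 px focal length, 0.3 m baseline, 640x480.
joint = triangulate_rectified(xl=352.0, xr=322.0, y=210.0,
                              focal_px=1200.0, baseline_m=0.3,
                              cx=320.0, cy=240.0)
print(joint)   # metres in the left-camera frame; Z = 1200*0.3/30 = 12 m
```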


Author(s):  
Baptiste Ristagno ◽  
Dominique Giraud ◽  
Julien Fontchastagner ◽  
Denis Netter ◽  
Noureddine Takorabet ◽  
...  

Purpose: Optimization processes and movement modeling usually require a high number of simulations. The purpose of this paper is to reduce global central processing unit (CPU) time by decreasing the cost of each evaluation.
Design/methodology/approach: The proposed method avoids remeshing the geometry at each iteration. The idea is to use a fixed mesh onto which functions are projected to represent the geometry and the supply.
Findings: Results are very promising. CPU time is reduced for three-dimensional problems by almost a factor of two, while keeping a low relative deviation from usual methods. The CPU time saving comes from avoiding the meshing step and from a better initialization of the iterative resolution. Optimization, movement modeling and transient-state simulation are very efficient and give the same results as the usual finite element method.
Research limitations/implications: The method is restricted to simple geometries owing to the difficulty of finding spatial mathematical functions that describe the geometry. Moreover, a compromise must be found between the imprecision caused by the boundary evaluation and the time saving.
Originality/value: The method can be applied to optimize rotating machine designs. Moreover, movement modeling is performed by shifting the functions corresponding to the moving parts.
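
A minimal sketch of the fixed-mesh idea: the moving part is described by a mathematical indicator function, so a movement step only shifts or rotates that function and re-evaluates it on the same nodes, with no remeshing. The disc-shaped "rotor" and the permeability values below are made-up stand-ins, not the paper's geometry:

```python
import numpy as np

def assign_materials(grid_x, grid_y, rotor_angle, mu_iron=1000.0, mu_air=1.0):
    """Project a moving geometry onto a fixed mesh.

    Instead of remeshing when the rotor turns, the rotor is described by a
    mathematical indicator function; moving it means rotating the function
    and re-evaluating it on the same fixed nodes.
    """
    # Rotate the evaluation points instead of the mesh.
    c, s = np.cos(-rotor_angle), np.sin(-rotor_angle)
    xr = c * grid_x - s * grid_y
    yr = s * grid_x + c * grid_y
    inside = (xr - 0.3) ** 2 + yr ** 2 < 0.1 ** 2   # indicator of the part
    return np.where(inside, mu_iron, mu_air)        # per-node permeability

# Fixed mesh built once; only the material map changes with the angle.
x, y = np.meshgrid(np.linspace(-1, 1, 200), np.linspace(-1, 1, 200))
for angle in (0.0, np.pi / 6, np.pi / 3):
    mu = assign_materials(x, y, angle)
    print(f"angle={angle:.2f} rad, iron nodes={int((mu > 1).sum())}")
```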


2021 ◽  
Vol 119 ◽  
pp. 07002
Author(s):  
Youness Rtal ◽  
Abdelkader Hadjoudja

Graphics processing units (GPUs) are microprocessors attached to graphics cards and dedicated to displaying and manipulating graphics data. Such microprocessors now power all modern graphics cards, and within a few years they have become potent tools for massively parallel computing. They are practical instruments in several fields, such as image processing, video and audio encoding and decoding, and the solution of physical systems with one or more unknowns. Their advantages are faster processing and lower energy consumption than the central processing unit (CPU). In this paper, we define and implement the Lagrange polynomial interpolation method on the GPU and the CPU to calculate the sodium density at different temperatures Ti, using the NVIDIA CUDA C parallel programming model, which can increase computational performance by harnessing the power of the GPU. The objective of this study is to compare the performance of the Lagrange interpolation method implemented on CPU and GPU processors and to assess the efficiency of GPUs for parallel computing.
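
For reference, the method under comparison is the classic Lagrange form P(x) = sum_i y_i * prod_{j != i} (x - x_j) / (x_i - x_j). A vectorized NumPy version (a stand-in for the paper's CPU baseline; the sodium-density table below is illustrative, not the paper's data) makes the structure clear: every evaluation point is independent, which is why the method maps naturally onto one GPU thread per point.

```python
import numpy as np

def lagrange_interpolate(x_nodes, y_nodes, x_eval):
    """Evaluate the Lagrange interpolating polynomial at many points.

    P(x) = sum_i y_i * prod_{j != i} (x - x_j) / (x_i - x_j)
    Every x_eval point is independent of the others, which is exactly the
    structure that makes the method map well onto one GPU thread per point.
    """
    n = len(x_nodes)
    result = np.zeros_like(x_eval, dtype=float)
    for i in range(n):
        basis = np.ones_like(x_eval, dtype=float)
        for j in range(n):
            if j != i:
                basis *= (x_eval - x_nodes[j]) / (x_nodes[i] - x_nodes[j])
        result += y_nodes[i] * basis
    return result

# Hypothetical sodium-density table: density at a few temperatures.
temps = np.array([100.0, 200.0, 300.0, 400.0, 500.0])    # degrees C
density = np.array([927.0, 903.0, 880.0, 856.0, 832.0])  # kg/m^3 (illustrative)
query = np.linspace(120.0, 480.0, 8)
print(lagrange_interpolate(temps, density, query))
```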

