Benchmarking the Nvidia GPU Lineage

Pavement inspection, which mainly comprises crack and garbage detection, is essential and must be carried out frequently. Inspection, whether performed by humans or by a dedicated system, can readily be integrated with pavement sweeping machines. This work proposes a deep learning-based pavement inspection framework for a self-reconfigurable robot named Panthera. The semantic segmentation framework SegNet is adopted to segment the pavement region from other objects. Deep Convolutional Neural Network (DCNN) based object detection is used to detect and localize pavement defects and garbage. Furthermore, a Mobile Mapping System (MMS) is adopted for geotagging the defects. The proposed system was implemented and tested on the Panthera robot equipped with NVIDIA GPU cards. The experimental results on crack and garbage detection show that the proposed technique identifies pavement defects and litter with high accuracy. The proposed technique is found to be suitable for real-time deployment for garbage detection and, eventually, sweeping or cleaning tasks.

Acceleration of CNN-Based Facial Emotion Detection Using NVIDIA GPU

Intelligent Computing and Information and Communication - Advances in Intelligent Systems and Computing ◽

10.1007/978-981-10-7245-1_26 ◽

2018 ◽

pp. 257-264 ◽

Cited By ~ 1

Author(s):

Bhakti Sonawane ◽

Priyanka Sharma

Keyword(s):

Facial Emotion ◽

Emotion Detection ◽

Nvidia GPU

Parallel approach to tomographic reconstruction algorithm using a Nvidia GPU

10.1063/1.5095930 ◽

2019 ◽

Author(s):

Tomás Antonio Valencia Pérez ◽

Javier Miguel Hernández López ◽

Eduardo Moreno Barbosa ◽

Mario Iván Martínez Hernández ◽

Guillermo Tejeda Muñoz ◽

...

Keyword(s):

Tomographic Reconstruction ◽

Reconstruction Algorithm ◽

Nvidia GPU

Implementation of Parallel Image Processing Using NVIDIA GPU Framework

Communications in Computer and Information Science - Advances in Computing, Communication and Control ◽

10.1007/978-3-642-18440-6_58 ◽

2011 ◽

pp. 457-464 ◽

Cited By ~ 12

Author(s):

Brijmohan Daga ◽

Avinash Bhute ◽

Ashok Ghatol

Keyword(s):

Image Processing ◽

Parallel Image Processing ◽

Parallel Image ◽

Nvidia GPU

GPU-Centric Communication on NVIDIA GPU Clusters with InfiniBand: A Case Study with OpenSHMEM

2017 IEEE 24th International Conference on High Performance Computing (HiPC) ◽

10.1109/hipc.2017.00037 ◽

2017 ◽

Cited By ~ 6

Author(s):

Sreeram Potluri ◽

Anshuman Goswami ◽

Davide Rossetti ◽

C.J. Newburn ◽

Manjunath Gorentla Venkata ◽

...

Keyword(s):

GPU Clusters ◽

Nvidia GPU

An intelligent road traffic management system using NVIDIA GPU

2016 19th International Conference on Computer and Information Technology (ICCIT) ◽

10.1109/iccitechn.2016.7860235 ◽

2016 ◽

Cited By ~ 2

Author(s):

Tahmid Tanzi Alam ◽

Ahmad Naquib Chowdhury ◽

Mohammad Zahidur Rahman

Keyword(s):

Traffic Management ◽

Management System ◽

Road Traffic ◽

Traffic Management System ◽

Nvidia GPU

Using Data Compression for Increasing Efficiency of Data Transfer Between Main Memory and Intel Xeon Phi Coprocessor or NVidia GPU in Parallel DBMS

Procedia Computer Science ◽

10.1016/j.procs.2015.11.072 ◽

2015 ◽

Vol 66 ◽

pp. 635-641 ◽

Cited By ~ 1

Author(s):

Konstantin Y. Besedin ◽

Pavel S. Kostenetskiy ◽

Stepan O. Prikazchikov

Keyword(s):

Data Compression ◽

Data Transfer ◽

Main Memory ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Using Data ◽

Nvidia GPU ◽

Parallel DBMS ◽

Intel Xeon

Efficient spherical harmonic transforms on GPU and its use in planetary core dynamics simulations

10.5194/egusphere-egu21-13680 ◽

2021 ◽

Author(s):

Nathanael Schaeffer

Keyword(s):

Fourier Transform ◽

Spherical Harmonic ◽

High Performance ◽

Peak Performance ◽

Legendre Transform ◽

Core Dynamics ◽

Program Flow ◽

Nvidia GPU ◽

The Fourier Transform ◽

Dynamics Simulations

Most new supercomputers now use acceleration technology such as GPUs. They promise much higher performance than traditional CPU-only servers, both in floating-point throughput and in memory bandwidth. Furthermore, electricity consumption is significantly reduced, resulting in lower carbon emissions. However, such high computation speeds can only be achieved if a set of more or less stringent rules is followed with respect to memory access and program flow. As a consequence, some algorithms approach peak performance more easily than others. Here, we present the results of an effort to achieve high performance on recent NVIDIA GPU accelerators for the spherical harmonic transform. The spherical harmonic transform can be split into a Legendre transform (which is compute bound) and a Fourier transform (which is memory bound). By taking advantage of recent algorithmic improvements and by tuning the Fourier transform, a full forward or backward spherical harmonic transform up to degree 8191 can now be computed on a single 16 GB Volta GPU in less than 0.35 seconds. For lower resolution (up to degree 1023), a single Volta GPU performs a full transform more than 3 times faster than a 48-core dual-socket Skylake Xeon Platinum server. We also present results of an ongoing effort to port a code for the simulation of planetary core fluid and magnetic field dynamics to GPU-accelerated computers.
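
The two-stage structure described in this abstract (a Legendre step followed by a Fourier step) can be illustrated with a small CPU-only sketch. This is not the author's GPU implementation: the function names `legendre_table` and `synthesize`, the `coeffs[l, m]` layout, the use of unnormalized associated Legendre functions, and the restriction to non-negative orders m are all assumptions made for illustration.

```python
import numpy as np

def legendre_table(lmax, m, x):
    """Unnormalized associated Legendre functions P_l^m(x), rows l = 0..lmax.

    Rows with l < m are left at zero. Hypothetical helper for illustration.
    """
    p = np.zeros((lmax + 1, x.size))
    # Seed: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    pmm = np.ones_like(x)
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * np.sqrt(1.0 - x * x)
    p[m] = pmm
    if m < lmax:
        p[m + 1] = x * (2 * m + 1) * pmm
    # Three-term recurrence in the degree l
    for l in range(m + 2, lmax + 1):
        p[l] = ((2 * l - 1) * x * p[l - 1] - (l + m - 1) * p[l - 2]) / (l - m)
    return p

def synthesize(coeffs, theta, nphi):
    """Backward transform: coeffs[l, m] -> f[j, k] at (theta[j], 2*pi*k/nphi).

    Assumes nphi > lmax and only orders m >= 0 (a complex-valued field).
    """
    lmax = coeffs.shape[0] - 1
    x = np.cos(theta)
    g = np.zeros((theta.size, nphi), dtype=complex)
    # Stage 1 (Legendre): for each order m, sum over degrees l >= m.
    # This is dense arithmetic, the part the abstract calls compute bound.
    for m in range(lmax + 1):
        p = legendre_table(lmax, m, x)
        g[:, m] = coeffs[m:, m] @ p[m:]
    # Stage 2 (Fourier): sum over m of g_m(theta) * exp(i*m*phi) per latitude.
    # numpy's ifft divides by nphi, so multiply it back.
    return nphi * np.fft.ifft(g, axis=1)

if __name__ == "__main__":
    theta = np.linspace(0.1, 3.0, 5)
    c = np.zeros((3, 3), dtype=complex)
    c[1, 0] = 1.0  # a single (l=1, m=0) mode, i.e. cos(theta)
    field = synthesize(c, theta, 8)
    print(np.allclose(field, np.cos(theta)[:, None]))
```

In this sketch the Legendre stage is a sequence of small matrix-vector products and the Fourier stage is a batch of short FFTs, which mirrors the compute-bound versus memory-bound split named in the abstract.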

NVIDIA GPU

Encyclopedia of Parallel Computing ◽

10.1007/978-0-387-09766-4_276 ◽

2011 ◽

pp. 1339-1345

Author(s):

Laxmikant V. Kalé ◽

Abhinav Bhatele ◽

Eric J. Bohm ◽

James C. Phillips ◽

David H. Bailey ◽

...

Keyword(s):

Nvidia GPU

Performance Evaluation of an OpenCL Implementation of the Lattice Boltzmann Method on the Intel Xeon Phi

Parallel Processing Letters ◽

10.1142/s0129626415410017 ◽

2015 ◽

Vol 25 (03) ◽

pp. 1541001 ◽

Cited By ~ 1

Author(s):

Christian Obrecht ◽

Bernard Tourancheau ◽

Frédéric Kuznik

Keyword(s):

Lattice Boltzmann Method ◽

Lattice Boltzmann ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Hardware Architectures ◽

Nvidia GPU ◽

Many Core ◽

Hardware Platforms ◽

Boltzmann Method ◽

Intel Xeon

A portable OpenCL implementation of the lattice Boltzmann method targeting emerging many-core architectures is described. The main purpose of this work is to evaluate and compare the performance of this code on three mainstream hardware architectures available today, namely an Intel CPU, an Nvidia GPU, and the Intel Xeon Phi. Because of the similarities between OpenCL and CUDA, we chose to follow some of the strategies devised to implement efficient lattice Boltzmann solvers on Nvidia GPUs, while remaining as generic as possible. Being fairly configurable, this program makes it possible to ascertain the best options for each hardware platform. The achieved performance is quite satisfactory for both the CPU and the GPU. For the Xeon Phi, however, the results are below expectations. Nevertheless, comparison with data from the literature shows that on this architecture the code seems memory-bound.