A parallel implementation of Viterbi training for acoustic models using graphics processing units

Phase change materials have a wide range of applications, including thermal energy storage in building structures, solar air collectors, heat storage units and heat exchangers. Such applications often use a commercially produced phase change material enclosed in a thin panel (container) made of aluminum. A parallel 1D heat transfer model of a container with phase change material was developed by means of the control volume and effective heat capacity methods. The parallel implementation in the CUDA computing architecture allows the model to run on graphics processing units, which makes it very fast in comparison to traditional models computed on a single CPU. The paper presents the model implementation and the results of computational benchmarking carried out on high-end and low-end NVIDIA GPUs.
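The abstract does not give the discretization, so the following is only a minimal sketch of how an explicit 1D control-volume scheme with an effective heat capacity might look. All material values, the uniform spreading of latent heat over a melting range, the boundary conditions, and the names `effective_heat_capacity` and `step` are assumptions made for illustration, not the paper's model.

```python
def effective_heat_capacity(T, c_base=2000.0, latent=200e3,
                            T_melt=28.0, half_width=1.0):
    """Effective heat capacity in J/(kg K).

    Assumption: the latent heat is spread uniformly over the melting
    range [T_melt - half_width, T_melt + half_width].
    """
    if abs(T - T_melt) <= half_width:
        return c_base + latent / (2.0 * half_width)
    return c_base


def step(T, dt, dx, k=0.2, rho=800.0, T_left=40.0):
    """Advance the control-volume temperatures T by one explicit time step.

    Assumed boundaries: a ghost cell held at T_left on the left and an
    insulated (zero-flux) wall on the right.
    """
    n = len(T)
    new_T = [0.0] * n
    for i in range(n):
        left = T_left if i == 0 else T[i - 1]
        right = T[i] if i == n - 1 else T[i + 1]
        # Net conductive heat gain per unit volume from both faces.
        net = k * (left - 2.0 * T[i] + right) / (dx * dx)
        # The phase change enters only through the temperature-dependent
        # heat capacity evaluated at the old temperature.
        new_T[i] = T[i] + dt * net / (rho * effective_heat_capacity(T[i]))
    return new_T
```

Each new value depends only on temperatures from the previous time step, so the loop body is independent per control volume. That independence is what makes a one-thread-per-volume GPU mapping plausible. The time step must satisfy the usual explicit stability limit, dt <= dx^2 * rho * c / (2 k), using the smallest heat capacity.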

Parallel implementation of the discrete wavelet transform on graphics processing units

2014 1st International Conference on Advanced Technologies for Signal and Image Processing (ATSIP) ◽

10.1109/atsip.2014.6834587 ◽

2014 ◽

Author(s):

Randa Khemiri ◽

Fatma Sayadi ◽

Taoufik Saidani ◽

Marwa Chouchene ◽

Haythem Bahri ◽

...

Keyword(s):

Wavelet Transform ◽

Discrete Wavelet Transform ◽

Graphics Processing Units ◽

Parallel Implementation ◽

Discrete Wavelet ◽

Graphics Processing

PARALLEL IMPLEMENTATION OF MORPHOLOGICAL PROFILE BASED SPECTRAL-SPATIAL CLASSIFICATION SCHEME FOR HYPERSPECTRAL IMAGERY

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprsarchives-xli-b7-263-2016 ◽

2016 ◽

Vol XLI-B7 ◽

pp. 263-267

Author(s):

B. Kumar ◽

O. Dikshit

Keyword(s):

Classification Accuracy ◽

Graphics Processing Units ◽

Spatial Information ◽

Parallel Implementation ◽

Hyperspectral Imagery ◽

Hyperspectral Images ◽

Important Concern ◽

Speed Up ◽

Graphics Processing ◽

The Impact

The extended morphological profile (EMP) is an effective technique for extracting spectral-spatial information from images, but the large size of hyperspectral images is an important concern when creating EMPs. However, with the availability of modern multi-core processors and commodity parallel processing systems such as graphics processing units (GPUs) at the desktop level, parallel computing provides a viable option to significantly accelerate such computations. In this paper, a parallel implementation of an EMP-based spectral-spatial classification method for hyperspectral imagery is presented. The parallel implementation is done on both a multi-core CPU and a GPU. The impact of parallelization on speedup and classification accuracy is analyzed. For the GPU, the implementation is done in Compute Unified Device Architecture (CUDA) C. The experiments are carried out on two well-known hyperspectral images. The experimental results show that the GPU implementation provides a speedup of about 7 times, while the parallel implementation on the multi-core CPU yields a speedup of about 3 times. The parallel implementation has no adverse impact on classification accuracy.
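To make the idea of a morphological profile concrete, here is a simplified sketch for a single image band. It uses plain grayscale opening and closing with square windows of growing radius. This is a deliberate simplification and not the paper's method: the abstract does not say which operators, structuring elements, or bands are used. All function names and the `radii` default are hypothetical.

```python
def _window_reduce(img, radius, reduce_fn):
    """Apply reduce_fn (min or max) over a square window clipped at the border."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = reduce_fn(
                img[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            )
    return out


def erode(img, radius):
    return _window_reduce(img, radius, min)


def dilate(img, radius):
    return _window_reduce(img, radius, max)


def opening(img, radius):
    """Erosion followed by dilation: removes bright details smaller than the window."""
    return dilate(erode(img, radius), radius)


def closing(img, radius):
    """Dilation followed by erosion: removes dark details smaller than the window."""
    return erode(dilate(img, radius), radius)


def morphological_profile(band, radii=(1, 2)):
    """Stack closings (largest radius first), the band itself, then openings."""
    profile = [closing(band, r) for r in reversed(radii)]
    profile.append(band)
    profile.extend(opening(band, r) for r in radii)
    return profile
```

Every output pixel is computed from a fixed neighborhood of the input, independently of the other output pixels, which is why operators of this kind map well onto multi-core CPUs and GPUs. An extended profile would repeat this stacking for several bands or components and concatenate the results.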

GPUDePiCt: A Parallel Implementation of a Clustering Algorithm for Computing Degenerate Primers on Graphics Processing Units

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2014.2355231 ◽

2015 ◽

Vol 12 (2) ◽

pp. 445-454 ◽

Cited By ~ 1

Author(s):

Trevor Cickovski ◽

Tiffany Flor ◽

Galen Irving-Sachs ◽

Philip Novikov ◽

James Parda ◽

...

Keyword(s):

Graphics Processing Units ◽

Clustering Algorithm ◽

Parallel Implementation ◽

Degenerate Primers ◽

Graphics Processing

Developing Extensible Lattice-Boltzmann Simulators for General-Purpose Graphics-Processing Units

Communications in Computational Physics ◽

10.4208/cicp.351011.260112s ◽

2013 ◽

Vol 13 (3) ◽

pp. 867-879 ◽

Cited By ~ 6

Author(s):

Stuart D. C. Walsh ◽

Martin O. Saar

Keyword(s):

Code Generation ◽

Lattice Boltzmann ◽

Graphics Processing Units ◽

Parallel Implementation ◽

General Purpose ◽

Lattice Boltzmann Simulation ◽

Lattice Boltzmann Simulations ◽

Gpu Architectures ◽

Automatic Code ◽

Graphics Processing

Lattice-Boltzmann methods are versatile numerical modeling techniques capable of reproducing a wide variety of fluid-mechanical behavior. These methods are well suited to parallel implementation, particularly on the single-instruction multiple data (SIMD) parallel processing environments found in computer graphics processing units (GPUs). Although recent programming tools dramatically improve the ease with which GPU-based applications can be written, the programming environment still lacks the flexibility available to more traditional CPU programs. In particular, it may be difficult to develop modular and extensible programs that require variable on-device functionality with current GPU architectures. This paper describes a process of automatic code generation that overcomes these difficulties for lattice-Boltzmann simulations. It details the development of GPU-based modules for an extensible lattice-Boltzmann simulation package, LBHydra. The performance of the automatically generated code is compared to equivalent purpose-written codes for single-phase, multiphase, and multicomponent flows. The flexibility of the new method is demonstrated by simulating a rising, dissolving droplet moving through a porous medium with user-generated lattice-Boltzmann models and subroutines.

Optimized Parallel Implementation of Gillespie's First Reaction Method on Graphics Processing Units

2009 International Conference on Computer Modeling and Simulation ◽

10.1109/iccms.2009.42 ◽

2009 ◽

Cited By ~ 14

Author(s):

Cristian Dittamo ◽

Davide Cangelosi

Keyword(s):

Graphics Processing Units ◽

Parallel Implementation ◽

Reaction Method ◽

Graphics Processing

Enhancing the performance of the aggregated bit vector algorithm in network packet classification using GPU

PeerJ Computer Science ◽

10.7717/peerj-cs.185 ◽

2019 ◽

Vol 5 ◽

pp. e185 ◽

Cited By ~ 2

Author(s):

Mahdi Abbasi ◽

Razieh Tahouri ◽

Milad Rafiee

Keyword(s):

Graphics Processing Units ◽

High Speed ◽

Parallel Implementation ◽

Packet Classification ◽

Experimental Results ◽

Analysis Method ◽

Network Systems ◽

Computationally Intensive ◽

Bit Vector ◽

Graphics Processing

Packet classification is a computationally intensive, highly parallelizable task in many advanced network systems like high-speed routers and firewalls that enable different functionalities through discriminating incoming traffic. Recently, graphics processing units (GPUs) have been exploited as efficient accelerators for parallel implementation of software classifiers. The aggregated bit vector is a highly parallelizable packet classification algorithm. In this work, first we present a parallel kernel for running this algorithm on GPUs. Next, we adapt an asymptotic analysis method which predicts any empirical result of the proposed kernel. Experimental results not only confirm the efficiency of the proposed parallel kernel but also reveal the accuracy of the analysis method in predicting important trends in experimental results.
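For readers unfamiliar with the aggregated bit vector idea, the sketch below shows the core lookup in sequential form. Each field lookup yields a bit vector of the rules that match that field, and a short aggregate vector records which fixed-size chunks of it contain any set bit, so the full intersection only inspects chunks that every field flags. The rule format, the linear per-field scan (a real classifier would use a trie or interval structure), the chunk size, and all names are assumptions for illustration, not the kernel proposed in the paper.

```python
CHUNK = 4  # assumed number of rule bits summarized by one aggregate bit


def field_bitvector(rules, field, value):
    """Bit i is set when rule i matches `value` on `field` (inclusive ranges)."""
    bits = 0
    for idx, rule in enumerate(rules):
        lo, hi = rule[field]
        if lo <= value <= hi:
            bits |= 1 << idx
    return bits


def aggregate(bits, n_rules, chunk=CHUNK):
    """Bit c is set when chunk c of `bits` contains at least one set bit."""
    agg = 0
    mask = (1 << chunk) - 1
    for c in range((n_rules + chunk - 1) // chunk):
        if (bits >> (c * chunk)) & mask:
            agg |= 1 << c
    return agg


def classify(rules, packet, chunk=CHUNK):
    """Return the index of the first (highest-priority) matching rule, or None."""
    n = len(rules)
    n_chunks = (n + chunk - 1) // chunk
    vectors = [field_bitvector(rules, f, v) for f, v in enumerate(packet)]
    # Only chunks flagged by every field's aggregate can hold a full match.
    common = (1 << n_chunks) - 1
    for bits in vectors:
        common &= aggregate(bits, n, chunk)
    mask = (1 << chunk) - 1
    for c in range(n_chunks):
        if (common >> c) & 1:
            word = mask
            for bits in vectors:
                word &= bits >> (c * chunk)
            if word:  # an aggregate hit can be a false positive, so re-check
                lowest = (word & -word).bit_length() - 1
                return c * chunk + lowest
    return None
```

Rules are assumed to be ordered by priority, with index 0 the highest. Classifying one packet does not depend on any other packet, so one natural GPU mapping is to assign packets to threads, though the abstract does not state how the proposed kernel divides the work.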
