On the Effect of Exploiting GPUs for a More Eco-Sustainable Lease of Life

Author(s):  
Giuseppe Scanniello ◽  
Ugo Erra ◽  
Giuseppe Caggianese ◽  
Carmine Gravino

It has been estimated that about 2% of global carbon dioxide emissions can be attributed to IT systems. Green (or sustainable) computing refers to supporting business-critical computing needs with the least possible amount of power. This phenomenon changes the priorities in the design of new software systems and in the way companies handle existing ones. In this paper, we present the results of a research project aimed at developing a migration strategy that gives an existing software system a new and more eco-sustainable lease of life. We applied the strategy to migrate a subject system that performs intensive and massive computation to a target architecture based on a Graphics Processing Unit (GPU). We validated our solution on a system for path-finding robot simulations. An analysis of execution time and energy consumption indicated that: (i) the execution time of the migrated system is less than that of the original system; and (ii) the migrated system reduces energy waste, thus suggesting that it is more eco-sustainable than its original version. Our findings extend the body of knowledge on the effect of using the GPU in green computing.
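
As a hedged illustration only (names and workload below are hypothetical, not the project's actual migration code), the following CUDA sketch shows the offloading pattern such a migration targets: a compute-intensive per-element loop moved into a GPU kernel, with execution time measured via CUDA events so that the original and the migrated versions can be compared.

    #include <cuda_runtime.h>
    #include <cstdio>

    // Hypothetical compute-intensive step: one simulation cell per thread.
    __global__ void simulateStep(float *cells, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = cells[i];
            for (int k = 0; k < 1000; ++k)   // stand-in for heavy per-cell work
                v = v * 0.999f + 0.001f;
            cells[i] = v;
        }
    }

    int main() {
        const int n = 1 << 20;
        float *d_cells;
        cudaMalloc(&d_cells, n * sizeof(float));
        cudaMemset(d_cells, 0, n * sizeof(float));

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        simulateStep<<<(n + 255) / 256, 256>>>(d_cells, n);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);   // wall time of the GPU version
        printf("GPU step: %.3f ms\n", ms);

        cudaEventDestroy(start);
        cudaEventDestroy(stop);
        cudaFree(d_cells);
        return 0;
    }

Timing alone does not capture energy; the paper pairs execution-time measurements with separate energy-consumption measurements.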

Author(s):  
Yohsuke Tanaka ◽  
Hiroki Matsushi ◽  
Shigeru Murata

Abstract We introduce graphics processing unit (GPU) acceleration of hologram reconstruction for phase retrieval holography, drastically reducing the execution time. GPU acceleration was implemented with the FFT library CUFFT on a GeForce GTX 1050 chip (GDDR5 2 GB, NVIDIA). To compare GPU and CPU, we also used an Intel Xeon CPU (E5-2690, 2.90 GHz, Intel) with 24 GB of memory, running Ubuntu 16.04. Reconstructed volumes ranged from 256² × 128 voxels to 2048² × 1024 voxels to compare execution times. The speed-up of the GPU over the CPU is consistently greater than 100×, except for the smallest volumes. We also demonstrated the reduction on an observation of falling particles from a particle feeder, recorded in 40 frames: GPU acceleration cut the execution time from 13 hours to 30 minutes.
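
As a sketch of the CUFFT usage pattern at the core of such a pipeline (array sizes and variable names below are illustrative assumptions, not taken from the paper), a 2-D complex-to-complex transform of one hologram plane looks like this; compile with nvcc and link against -lcufft.

    #include <cufft.h>
    #include <cuda_runtime.h>

    int main() {
        const int NX = 256, NY = 256;           // one 256 x 256 hologram plane
        cufftComplex *d_field;
        cudaMalloc(&d_field, sizeof(cufftComplex) * NX * NY);
        cudaMemset(d_field, 0, sizeof(cufftComplex) * NX * NY);

        cufftHandle plan;
        cufftPlan2d(&plan, NX, NY, CUFFT_C2C);  // 2-D FFT plan via CUFFT

        // Forward transform in place; the inverse direction (CUFFT_INVERSE)
        // would propagate the field back during reconstruction.
        cufftExecC2C(plan, d_field, d_field, CUFFT_FORWARD);
        cudaDeviceSynchronize();

        cufftDestroy(plan);
        cudaFree(d_field);
        return 0;
    }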


2021 ◽  
Author(s):  
Randa Khemiri ◽  
Soulef Bouaafia ◽  
Asma Bahba ◽  
Maha Nasr ◽  
Fatma Ezahra Sayadi

In motion estimation (ME), block-matching algorithms have great potential for parallelism. The search for the best match is performed by computing, for each block position inside the search area, a similarity metric such as the Sum of Absolute Differences (SAD), which is used in the various steps of motion estimation algorithms. Moreover, this computation can be parallelized on a Graphics Processing Unit (GPU), since the computation over each block's pixels is identical, thus offering better results. In this work, a fixed OpenCL code was first run on several architectures (CPU and GPU); then a parallel GPU implementation of the SAD process was proposed in both CUDA and OpenCL, for block sizes from 4×4 to 64×64. A comparative study of the GPU execution times was carried out on the same video sequence. The experimental results indicated that the OpenCL execution time on the GPU was better than the CUDA time, with a performance ratio reaching 2×.
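
A minimal CUDA sketch of the parallel SAD idea (memory layout, sizes, and names below are assumptions, not the paper's code): each thread evaluates the SAD between the current block and one candidate position in the search window, so all candidates are scored concurrently.

    #include <cuda_runtime.h>
    #include <cstdlib>

    #define B 4   // block size (the paper sweeps 4x4 up to 64x64)

    // One thread per candidate displacement (dx, dy) in the search window.
    __global__ void sadKernel(const unsigned char *cur, const unsigned char *ref,
                              int width, int bx, int by, int search,
                              unsigned int *sads) {
        int dx = blockIdx.x * blockDim.x + threadIdx.x;
        int dy = blockIdx.y * blockDim.y + threadIdx.y;
        if (dx >= search || dy >= search) return;

        unsigned int sad = 0;
        for (int y = 0; y < B; ++y)
            for (int x = 0; x < B; ++x) {
                int c = cur[(by + y) * width + (bx + x)];
                int r = ref[(by + dy + y) * width + (bx + dx + x)];
                sad += abs(c - r);
            }
        sads[dy * search + dx] = sad;   // a reduction then picks the minimum
    }

    int main() {
        const int width = 64, search = 16;
        unsigned char *d_cur, *d_ref;
        unsigned int *d_sads;
        cudaMalloc(&d_cur, width * width);
        cudaMalloc(&d_ref, width * width);
        cudaMalloc(&d_sads, search * search * sizeof(unsigned int));
        cudaMemset(d_cur, 0, width * width);
        cudaMemset(d_ref, 0, width * width);

        dim3 threads(16, 16), blocks((search + 15) / 16, (search + 15) / 16);
        sadKernel<<<blocks, threads>>>(d_cur, d_ref, width, 0, 0, search, d_sads);
        cudaDeviceSynchronize();

        cudaFree(d_cur); cudaFree(d_ref); cudaFree(d_sads);
        return 0;
    }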


Author(s):  
Luca Mussi ◽  
Spela Ivekovic ◽  
Youssef S.G. Nashed ◽  
Stefano Cagnoni

The authors formulate body pose estimation as a multi-dimensional nonlinear optimization problem, suitable to be approximately solved by a meta-heuristic, specifically particle swarm optimization (PSO). Starting from multi-view video sequences acquired in a studio environment, a full skeletal configuration of the human body is retrieved. They use a generic subdivision-surface body model in 3-D to generate solutions for the optimization problem. PSO then looks for the best match between the silhouettes generated by projecting the model in a candidate pose and the silhouettes extracted from the original video sequence. The optimization method, in this case PSO, is run in parallel on the Graphics Processing Unit (GPU) and is implemented in CUDA-C™ on the NVIDIA CUDA™ architecture. The authors compare the results obtained with different configurations of the camera setup, fitness function, and PSO neighborhood topologies.
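
A minimal CUDA sketch of the parallel-evaluation idea (the fitness below is a placeholder, not the authors' silhouette-matching measure; all names are hypothetical): each thread scores one PSO particle, i.e. one candidate pose, so the whole swarm is evaluated in a single kernel launch.

    #include <cuda_runtime.h>

    #define DIM 32   // pose parameters per particle (illustrative)

    // Placeholder fitness: sum of squares. The real fitness would compare
    // model-projected silhouettes against silhouettes extracted from video.
    __global__ void evaluateSwarm(const float *poses, float *fitness,
                                  int nParticles) {
        int p = blockIdx.x * blockDim.x + threadIdx.x;
        if (p >= nParticles) return;
        float f = 0.0f;
        for (int d = 0; d < DIM; ++d) {
            float x = poses[p * DIM + d];
            f += x * x;
        }
        fitness[p] = f;   // the PSO update step then consumes these scores
    }

    int main() {
        const int n = 1024;   // swarm size
        float *d_poses, *d_fit;
        cudaMalloc(&d_poses, n * DIM * sizeof(float));
        cudaMalloc(&d_fit, n * sizeof(float));
        cudaMemset(d_poses, 0, n * DIM * sizeof(float));

        evaluateSwarm<<<(n + 255) / 256, 256>>>(d_poses, d_fit, n);
        cudaDeviceSynchronize();

        cudaFree(d_poses);
        cudaFree(d_fit);
        return 0;
    }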


2018 ◽  
Vol 19 (12) ◽  
pp. 802-807
Author(s):  
Łukasz Nozdrzykowski ◽  
Magdalena Nozdrzykowska

The authors present models for estimating the execution time of program loops compliant with the FAN model, with no data dependencies or with data dependencies only within the loop body, which can be executed either by CPUs or by the stream multiprocessors referred to as GPU cores. The presented models make it possible to determine whether it would be more efficient to execute a computation in the existing environment using the CPU (Central Processing Unit) or a state-of-the-art graphics card with a high-performance GPU (Graphics Processing Unit) and the super-fast memory often implemented in modern graphics cards. Validity checks confirming the developed time-estimation model for the GPU are presented. The purpose of these models is to provide methods for accelerating applications performing various tasks, including transport tasks such as accelerated solution searching, path searching in graphs, or accelerating image processing algorithms in the vision systems of autonomous and semi-autonomous vehicles; the models allow building an automatic task-distribution system between the CPU and the GPU under varying computing resources.
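
A minimal host-side sketch of the kind of automatic dispatch such models enable (the cost formulas below are illustrative placeholders, not the authors' estimators): estimate both execution times from loop parameters, then run the loop on whichever processor the model favors.

    #include <cstdio>
    #include <initializer_list>

    // Hypothetical per-iteration costs and transfer overhead, in microseconds.
    // The authors' models would derive these from the FAN loop description.
    struct LoopModel {
        double cpuPerIter;     // time per iteration on the CPU
        double gpuPerIter;     // time per batch of iterations on the GPU
        double gpuTransfer;    // fixed host<->device copy overhead
        int    gpuParallelism; // iterations executed concurrently on the GPU
    };

    // Estimated times for n independent (FAN-compliant) iterations.
    double estimateCpu(const LoopModel &m, long n) { return m.cpuPerIter * n; }
    double estimateGpu(const LoopModel &m, long n) {
        long batches = (n + m.gpuParallelism - 1) / m.gpuParallelism;
        return m.gpuTransfer + m.gpuPerIter * batches;
    }

    int main() {
        LoopModel m = {0.5, 0.4, 2000.0, 1024};   // illustrative numbers only
        for (long n : {1000L, 100000L, 10000000L}) {
            double tc = estimateCpu(m, n), tg = estimateGpu(m, n);
            printf("n=%8ld  CPU=%12.1f us  GPU=%12.1f us  -> run on %s\n",
                   n, tc, tg, tc <= tg ? "CPU" : "GPU");
        }
        return 0;
    }

The fixed transfer overhead is what makes the CPU win for small loops and the GPU win once the iteration count amortizes it, which is exactly the trade-off the estimation models are meant to capture.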


2007 ◽  
Author(s):  
Fredrick H. Rothganger ◽  
Kurt W. Larson ◽  
Antonio Ignacio Gonzales ◽  
Daniel S. Myers

2021 ◽  
Vol 22 (10) ◽  
pp. 5212
Author(s):  
Andrzej Bak

A key question confronting computational chemists concerns the preferable ligand geometry that fits complementarily into the receptor pocket. Typically, the postulated 'bioactive' 3D ligand conformation is constructed as a 'sophisticated guess' (not necessarily geometry-optimized) mirroring the pharmacophore hypothesis, sometimes based on an erroneous prerequisite. Hence, the 4D-QSAR scheme and its 'dialects' have been practically implemented as a higher level of model abstraction that allows the examination of multiple molecular conformations, orientations and protonation states, respectively. Nearly a quarter of a century has passed since the eminent work of Hopfinger appeared on the stage; therefore, the natural question arises of whether the 4D-QSAR approach is still appealing to the scientific community. With no intention to be comprehensive, a review of the current state of the art in the field of receptor-independent (RI) and receptor-dependent (RD) 4D-QSAR methodology is provided, with a brief examination of the 'mainstream' algorithms. In fact, a myriad of 4D-QSAR methods have been implemented and applied practically to a diverse range of molecules. It seems that the 4D-QSAR approach is experiencing a promising renaissance of interest, which might be fuelled by the rising power of graphics processing unit (GPU) clusters applied to full-atom, MD-based simulations of protein–ligand complexes.

