HYPERDOCK: Improving virtual screening through parallel hyperheuristics

Author(s):  
Baldomero Imbernón ◽  
Antonio Llanes ◽  
José-Matías Cutillas-Lozano ◽  
Domingo Giménez

Virtual screening (VS) methods aid clinical research by predicting the interaction of ligands with pharmacological targets. The computational requirements of VS, along with the size of the databases, favor the use of high-performance computing. METADOCK is a tool for applying metaheuristics to VS on heterogeneous clusters of computers that combine central processing units (CPUs) and graphics processing units (GPUs). HYPERDOCK represents a step forward: the search for satisfactory metaheuristics is approached systematically by means of hyperheuristics working on top of the metaheuristic schema of METADOCK. Multiple metaheuristics are explored, so the process is computationally demanding. HYPERDOCK exploits the parallelism of METADOCK and adds parallelism at its own level. Together, these levels of parallelism can exploit computational systems composed of multicore CPUs and multiple GPUs. The efficient exploitation of these systems enables HYPERDOCK to improve ligand–receptor binding.

Author(s):  
Timothy Dykes ◽  
Claudio Gheller ◽  
Marzia Rivi ◽  
Mel Krokos

With the increasing size and complexity of data produced by large-scale numerical simulations, it is of primary importance for scientists to be able to exploit all available hardware in heterogeneous high-performance computing environments for increased throughput and efficiency. We focus on the porting and optimization of Splotch, a scalable visualization algorithm, to utilize the Xeon Phi, Intel’s coprocessor based upon the new many integrated core architecture. We discuss steps taken to offload data to the coprocessor and algorithmic modifications to aid faster processing on the many-core architecture and make use of the uniquely wide vector capabilities of the device, with accompanying performance results using multiple Xeon Phi coprocessors. Finally, we compare performance against results achieved with the Graphics Processing Unit (GPU) based implementation of Splotch.


Author(s):  
Ana Moreton–Fernandez ◽  
Hector Ortega–Arranz ◽  
Arturo Gonzalez–Escribano

Nowadays, the use of hardware accelerators, such as graphics processing units or Xeon Phi coprocessors, is key to solving computationally costly problems that require high performance computing. However, programming efficient solutions for these kinds of devices is a very complex task that relies on the manual management of memory transfers and configuration parameters. The programmer has to carry out a deep study of the particular data that needs to be computed at each moment, across different computing platforms, while also considering architectural details. We introduce the controller concept as an abstract entity that allows the programmer to easily manage the communications and kernel launching details on hardware accelerators in a transparent way. This model also provides the possibility of defining and launching central processing unit kernels in multi-core processors with the same abstraction and methodology used for the accelerators. It internally combines different native programming models and technologies to exploit the potential of each kind of device. Additionally, the model simplifies the selection of values for several configuration parameters that can be set when a kernel is launched. This is done through a qualitative characterization of the kernel code to be executed. Finally, we present the implementation of the controller model in a prototype library, together with its application in several case studies. Its use has led to reductions in the development and porting costs, with significantly low overheads in the execution times when compared to manually programmed and optimized solutions which directly use CUDA and OpenMP.


2020 ◽  
Vol 22 (5) ◽  
pp. 1217-1235 ◽  
Author(s):  
M. Morales-Hernández ◽  
M. B. Sharif ◽  
S. Gangrade ◽  
T. T. Dullo ◽  
S.-C. Kao ◽  
...  

Abstract This work presents a vision of future water resources hydrodynamics codes that can fully utilize the strengths of modern high-performance computing (HPC). The advances in computing power, formerly driven by the improvement of central processing unit processors, now focus on parallel computing and, in particular, the use of graphics processing units (GPUs). However, this shift to a parallel framework requires refactoring the code to make efficient use of the data and even changing the nature of the algorithm that solves the system of equations. These concepts, along with other features such as the precision of the computations, dry regions management, and input/output data, are analyzed in this paper. A 2D multi-GPU flood code applied to a large-scale test case is used to corroborate our statements and ascertain the new challenges for the next-generation parallel water resources codes.


Author(s):  
Sterling Ramroach ◽  
Jonathan Herbert ◽  
Ajay Joshi

Identifying important features from high dimensional data is usually done using one-dimensional filtering techniques. These techniques discard noisy attributes and those that are constant throughout the data. This is a time-consuming task that has scope for acceleration via high performance computing techniques involving the graphics processing unit (GPU). The proposed algorithm involves acceleration via the Compute Unified Device Architecture (CUDA) framework developed by Nvidia. This framework facilitates the seamless scaling of computation on any CUDA-enabled GPU. Thus, the Pearson Correlation Coefficient can be applied in parallel on each feature with respect to the response variable. The ranks obtained for each feature can be used to determine the most relevant features to select. Using data from the UCI Machine Learning Repository, our results show an increase in efficiency for multi-dimensional analysis with a more reliable feature importance ranking. When tested on a high-dimensional dataset of 1000 samples and 10,000 features, we achieved a 1,230-fold speedup using CUDA. This acceleration grows exponentially, as with any embarrassingly parallel task.
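The per-feature filtering described in this abstract can be illustrated with a small CPU-only sketch. It is not the authors' CUDA implementation: the function names `pearson` and `rank_features`, the thread pool standing in for GPU threads, and the choice to score constant columns as 0.0 are all assumptions made for illustration. The point it shows is that each feature's correlation with the response is computed independently, which is what makes the task embarrassingly parallel.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant column carries no information, so score it 0.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, response, workers=4):
    """Score every feature column against the response, one task per feature.

    Returns (order, scores): feature indices sorted by decreasing |r|,
    and the raw correlation of each feature.
    """
    # Each feature is independent of the others, so the map needs no
    # communication between tasks (a thread pool stands in for GPU threads).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda col: pearson(col, response), columns))
    order = sorted(range(len(columns)), key=lambda j: abs(scores[j]), reverse=True)
    return order, scores
```

Calling `rank_features(columns, response)` with `columns` given as a list of per-feature value lists returns the feature indices from most to least correlated with the response, so the top entries of `order` are the candidates to keep.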


Author(s):  
Stefan Boodoo ◽  
Ajay Joshi

Oil and gas companies keep exploring every possible new method to increase the likelihood of finding a commercial hydrocarbon-bearing prospect. Well logging generates gigabytes of data from various probes and sensors. After processing, a prospective reservoir will indicate areas of oil, gas, water and reservoir rock. Incorporating High Performance Computing (HPC) methodologies will allow thousands of potential wells to be assessed for their hydrocarbon-bearing potential. This study presents the use of Graphics Processing Unit (GPU) computing as another method of analyzing probable reserves. Raw well log data from the Kansas Geological Society (1999-2018) forms the basis of the data analysis. Parallel algorithms are developed and make use of Nvidia’s Compute Unified Device Architecture (CUDA). The results gathered highlight a 5-times speedup using an Nvidia GeForce GT 330M GPU as compared to an Intel Core i7 740QM Central Processing Unit (CPU). The processed results display depth-wise areas of shale and rock formations as well as water, oil and/or gas reserves.


2017 ◽  
Vol 14 (1) ◽  
pp. 789-795
Author(s):  
V Saveetha ◽  
S Sophia

Parallel data clustering aims at using algorithms and methods to extract knowledge from large databases in reasonable time using high performance architectures. The computational challenge that growing data volumes pose for cluster analysis can be overcome by exploiting the power of these architectures. Recent developments in the parallel power of the Graphics Processing Unit enable low-cost, high-performance solutions for general purpose applications. The Compute Unified Device Architecture programming model provides application programming interface methods to handle data efficiently on the Graphics Processing Unit for iterative clustering algorithms like K-Means. Existing Graphics Processing Unit based K-Means algorithms focus heavily on improving the speedup of the algorithms and fall short in handling the high time spent on transferring data between the Central Processing Unit and the Graphics Processing Unit. An efficient K-Means algorithm is proposed in this paper to lessen the transfer time by introducing a novel approach to check the convergence of the algorithm and by utilizing pinned memory for direct access. This algorithm outperforms the other algorithms by maximizing parallelism and utilizing the memory features. The relative speedups and the validity measure for the proposed algorithm are higher when compared with K-Means on Graphics Processing Unit and K-Means using Flag on Graphics Processing Unit. Thus the proposed approach shows that communication overhead can be reduced in K-Means clustering.
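The abstract does not say how its convergence check works, so the following is only a generic sketch of one common idea for reducing transfer: track convergence with a single boolean flag ("did any point change cluster?") rather than comparing full label or centroid arrays each iteration. It is a plain CPU version of Lloyd's K-Means, not the authors' GPU algorithm, and it does not model pinned memory. The function name `kmeans_flag` and its parameters are hypothetical.

```python
def kmeans_flag(points, initial_centroids, max_iter=100):
    """Lloyd's K-Means with a one-bit convergence flag.

    points and initial_centroids are sequences of equal-length numeric tuples.
    Returns (labels, centroids, iterations).
    """
    centroids = [tuple(c) for c in initial_centroids]  # do not mutate the input
    labels = [-1] * len(points)
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        changed = False  # the only value a host would need to read back
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            best = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # no point moved, so the centroids cannot move either
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids, iterations
```

In a GPU setting the assignment loop would run as a kernel and only `changed` would cross the bus per iteration, which is why a flag of this kind can cut communication; whether that matches the paper's method is an assumption.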


Computation ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Håvard H. Holm ◽  
André R. Brodtkorb ◽  
Martin L. Sætra

In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy efficiency between Compute Unified Device Architecture (CUDA) and Open Compute Language (OpenCL); between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings showed that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments showed that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency.


2012 ◽  
Vol 10 (H16) ◽  
pp. 679-680
Author(s):  
Christopher J. Fluke

Abstract As we move ever closer to the Square Kilometre Array era, support for real-time, interactive visualisation and analysis of tera-scale (and beyond) data cubes will be crucial for on-going knowledge discovery. However, the data-on-the-desktop approach to analysis and visualisation that most astronomers are comfortable with will no longer be feasible: tera-scale data volumes exceed the memory and processing capabilities of standard desktop computing environments. Instead, there will be an increasing need for astronomers to utilise remote high performance computing (HPC) resources. In recent years, the graphics processing unit (GPU) has emerged as a credible, low cost option for HPC. A growing number of supercomputing centres are now investing heavily in GPU technologies to provide O(100) Teraflop/s processing. I describe how a GPU-powered computing cluster allows us to overcome the analysis and visualisation challenges of tera-scale data. With a GPU-based architecture, we have moved the bottleneck from processing-limited to bandwidth-limited, achieving exceptional real-time performance for common visualisation and data analysis tasks.

