Parallelization of Molecular Docking: A Review

Molecular docking, one of the widely used virtual screening methods, aims to predict the binding conformations of small-molecule ligands at the appropriate target binding site. Because of its computational complexity and the arrival of the big data era, molecular docking requires High-Performance Computing (HPC) to improve its performance and accuracy. We discuss in detail the advances in accelerating molecular docking software in parallel on each of the common HPC platforms. Existing programs have been optimized and ported to HPC platforms, and many novel parallel algorithms have also been designed and implemented. This review focuses on the techniques and methods adopted in parallelizing docking software. Where appropriate, we refer readers to exemplary case studies.

METADOCK: A parallel metaheuristic schema for virtual screening methods

The International Journal of High Performance Computing Applications ◽

10.1177/1094342017697471 ◽

2017 ◽

Vol 32 (6) ◽

pp. 789-803 ◽

Cited By ~ 14

Author(s):

Baldomero Imbernón ◽

José M Cecilia ◽

Horacio Pérez-Sánchez ◽

Domingo Giménez

Keyword(s):

Molecular Docking ◽

Virtual Screening ◽

High Performance ◽

Optimization Technique ◽

Screening Methods ◽

Scoring Functions ◽

Dynamic Assignment ◽

Parallel Metaheuristics ◽

Heterogeneous Architectures ◽

Metaheuristic Methods

Virtual screening through molecular docking can be translated into an optimization problem, which can be tackled with metaheuristic methods. The interaction between two chemical compounds (typically a protein, enzyme or receptor, and a small molecule, or ligand) is calculated using highly computationally demanding scoring functions that are computed at several binding spots located throughout the protein surface. This paper introduces METADOCK, a novel molecular docking methodology based on parameterized and parallel metaheuristics and designed to leverage computers based on heterogeneous architectures. The application decides the optimization technique at run time by setting a configuration schema. Our proposed solution finds a good workload balance via dynamic assignment of jobs to heterogeneous resources, which perform independent metaheuristic executions when computing the different molecular interactions required by the scoring functions in use. A cooperative scheduling of jobs optimizes the quality of the solution and the overall performance of the simulation, thus opening a new path for further developments of virtual screening methods on high-performance contemporary heterogeneous platforms.
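The dynamic assignment of jobs described in this abstract can be illustrated with a generic shared work queue. The sketch below is not METADOCK's implementation: `score`, `worker`, and `run_dynamic` are hypothetical names, the scoring function is a cheap stand-in, and plain Python threads play the role of heterogeneous devices. It only shows the general idea that each resource pulls the next job when it becomes free, so faster resources naturally process more jobs.

```python
import queue
import threading


def score(ligand_id: int) -> float:
    """Hypothetical stand-in for an expensive docking scoring function."""
    return sum((ligand_id * k) % 7 for k in range(1, 50)) / 10.0


def worker(name, jobs, results, lock):
    """Pull jobs until the queue is empty; each job is scored independently."""
    while True:
        try:
            ligand_id = jobs.get_nowait()
        except queue.Empty:
            return  # no work left for this resource
        value = score(ligand_id)
        with lock:
            results[ligand_id] = (name, value)


def run_dynamic(ligand_ids, worker_names):
    """Assign jobs dynamically: whichever worker is free takes the next job."""
    jobs = queue.Queue()
    for ligand_id in ligand_ids:
        jobs.put(ligand_id)
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(name, jobs, results, lock))
        for name in worker_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_dynamic(range(20), ["cpu0", "gpu0"])` returns one entry per ligand, each tagged with the worker that happened to process it. In contrast to a static split, no worker is given a fixed share of the jobs in advance.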

Perspectives on High-Performance Computing in a Big Data World

Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing - HPDC '19 ◽

10.1145/3307681.3325410 ◽

2019 ◽

Author(s):

Geoffrey C. Fox

Keyword(s):

Big Data ◽

High Performance Computing ◽

High Performance ◽

Performance Computing

High-Performance Computing for Big Data Processing

Future Generation Computer Systems ◽

10.1016/j.future.2018.07.054 ◽

2018 ◽

Vol 88 ◽

pp. 693-695 ◽

Cited By ~ 1

Author(s):

Yulei Wu ◽

Yang Xiang ◽

Jingguo Ge ◽

Peter Muller

Keyword(s):

Big Data ◽

Data Processing ◽

High Performance Computing ◽

High Performance ◽

Big Data Processing ◽

Performance Computing

High Performance Computing and Big Data

Studies in Big Data - Guide to Big Data Applications ◽

10.1007/978-3-319-53817-4_6 ◽

2017 ◽

pp. 125-147 ◽

Cited By ~ 1

Author(s):

Rishi Divate ◽

Sankalp Sah ◽

Manish Singh

Keyword(s):

Big Data ◽

High Performance Computing ◽

High Performance ◽

Performance Computing

Big Data and IT Network Data Visualization

International Journal of Mathematical Engineering and Management Sciences ◽

10.33889/ijmems.2018.3.1-002 ◽

2018 ◽

Vol 3 (1) ◽

pp. 9-16 ◽

Cited By ~ 3

Author(s):

Lidong Wang

Keyword(s):

Big Data ◽

Network Analysis ◽

Graphics Processing Units ◽

Data Analytics ◽

High Performance ◽

Big Data Analytics ◽

Network Visualization ◽

Network Data ◽

Graphics Processing ◽

Performance Computing

Visualization with graphs is popular in the data analysis of Information Technology (IT) networks, or computer networks. An IT network is often modelled as a graph, with hosts as nodes and traffic as flows on many edges. General visualization methods are introduced in this paper. Applications and technological progress of visualization in IT network analysis, and of big data in IT network visualization, are presented. The challenges of visualization and Big Data analytics in IT network visualization are also discussed. Big Data analytics with High Performance Computing (HPC) techniques, especially Graphics Processing Units (GPUs), helps accelerate IT network analysis and visualization.

High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

Advances in High Energy Physics ◽

10.1155/2014/507690 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Florin Pop

Keyword(s):

Monte Carlo ◽

Big Data ◽

Numerical Methods ◽

High Performance ◽

Data Science ◽

Experimental Validation ◽

High Energy Physics ◽

High Energy ◽

Performance Computing ◽

Energy Physics

Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers the possibility of running simulations at large scale, in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.

Parallel Flexible Molecular Docking in Computational Chemistry on High Performance Computing Clusters

Computational Collective Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-319-24306-1_41 ◽

2015 ◽

pp. 418-427 ◽

Cited By ~ 8

Author(s):

Rafael Dolezal ◽

Teodorico C. Ramalho ◽

Tanos C.C. França ◽

Kamil Kuca

Keyword(s):

Molecular Docking ◽

Computational Chemistry ◽

High Performance Computing ◽

High Performance ◽

Performance Computing

Bringing High Performance Computing to Big Data Algorithms

Handbook of Big Data Technologies ◽

10.1007/978-3-319-49340-4_23 ◽

2017 ◽

pp. 777-806 ◽

Cited By ~ 2

Author(s):

H. Anzt ◽

J. Dongarra ◽

M. Gates ◽

J. Kurzak ◽

P. Luszczek ◽

...

Keyword(s):

Big Data ◽

High Performance Computing ◽

High Performance ◽

Performance Computing

Synchronizing Execution of Big Data in Distributed and Parallelized Environments

Big Data ◽

10.4018/978-1-4666-9840-6.ch071 ◽

2016 ◽

pp. 1555-1581

Author(s):

Gueyoung Jung ◽

Tridib Mukherjee

Keyword(s):

Big Data ◽

Distributed System ◽

Data Analytics ◽

High Performance ◽

Large Scale ◽

Big Data Analytics ◽

Loosely Coupled ◽

Current Trends ◽

Distributed Computing Infrastructures ◽

Performance Computing

In the modern information era, the amount of data has exploded. Current trends further indicate exponential growth of data in the future. This enormous amount of data, referred to as big data, has given rise to the problem of finding the “needle in the haystack” (i.e., extracting meaningful information from big data). Many researchers and practitioners are focusing on big data analytics to address the problem. One of the major issues in this regard is the computation requirement of big data analytics. In recent years, the proliferation of many loosely coupled distributed computing infrastructures (e.g., modern public, private, and hybrid clouds, high performance computing clusters, and grids) has enabled high computing capability to be offered for large-scale computation. This has allowed the execution of big data analytics to gather pace across organizations and enterprises. However, even with the high computing capability, it is a big challenge to efficiently extract valuable information from vast astronomical data. Hence, we require unprecedented scalability of performance to deal with the execution of big data analytics. A big question in this regard is how to maximally leverage the high computing capabilities of the aforementioned loosely coupled distributed infrastructure to ensure fast and accurate execution of big data analytics. This chapter therefore focuses on synchronous parallelization of big data analytics over a distributed system environment to optimize performance.
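As a rough illustration of what synchronous parallelization can mean in practice, the sketch below partitions a data set, processes the partitions in parallel, and waits for every partition to finish before combining the partial results. It is not taken from the chapter: the function names (`partition`, `local_count`, `synchronous_count`) and the word-count task are assumptions chosen for brevity, and a local thread pool stands in for a distributed infrastructure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into at most n_parts contiguous chunks of similar size."""
    size = max(1, -(-len(data) // n_parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def local_count(chunk):
    """Per-partition work: count item occurrences within one chunk."""
    counts = {}
    for item in chunk:
        counts[item] = counts.get(item, 0) + 1
    return counts


def synchronous_count(data, n_workers=4):
    """Run local_count on all chunks in parallel, then merge the results.

    Collecting pool.map into a list acts as the synchronization point:
    merging starts only after every partition has finished.
    """
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_count, chunks))
    total = {}
    for partial in partials:
        for key, value in partial.items():
            total[key] = total.get(key, 0) + value
    return total
```

The slowest partition determines when the merge can begin, which is why balancing the partition sizes matters in a synchronous scheme.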

Big data, high performance computing and Pharmaceutical innovations

Journal of Pharmaceutics & Drug Delivery Research ◽

10.4172/2325-9604.c1.001 ◽

2016 ◽

Vol 05 (03) ◽

Author(s):

Jun Xu

Keyword(s):

Big Data ◽

High Performance Computing ◽

High Performance ◽

Performance Computing
