Efficient parallelization of perturbative Monte Carlo QM/MM simulations in heterogeneous platforms

Author(s):  
Sebastião Miranda ◽  
Jonas Feldt ◽  
Frederico Pratas ◽  
Ricardo A Mata ◽  
Nuno Roma ◽  
...  

A novel perturbative Monte Carlo mixed quantum mechanics (QM)/molecular mechanics (MM) approach has recently been developed to simulate molecular systems in complex environments. However, the accuracy required to simulate such complex molecular systems is usually attained at the cost of long execution times. To alleviate this problem, a new parallelization strategy for multi-level Monte Carlo molecular simulations on heterogeneous systems is herein proposed. It simultaneously exploits fine-grained (data-level), coarse-grained (Markov chain-level) and task-grained (pure QM, pure MM and QM/MM procedures) parallelism to ensure efficient execution on heterogeneous systems composed of central processing units and multiple, possibly different, graphics processing units. This is achieved by making use of the OpenCL library, together with appropriate dynamic load-balancing schemes. In the conducted evaluation with real benchmarking data, a speed-up of 56x was observed in the computational bottleneck, resulting in a global speed-up of 38x for the whole simulation and reducing the time of a typical simulation from 80 hours to only 2 hours.
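As an illustration of the chain-level (coarse-grained) parallelism described above, the following is a minimal Python sketch that distributes independent Markov chains over worker processes with dynamic load balancing. The `run_chain` energy model is a hypothetical stand-in for the paper's QM/MM evaluation, not the authors' OpenCL implementation.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor, as_completed

def run_chain(seed, n_steps=10_000):
    """One independent Metropolis Markov chain on a toy 1-D potential.

    Stand-in for the QM/MM energy evaluation in the paper; here the
    'energy' is simply x**2 so the sketch stays self-contained.
    """
    rng = random.Random(seed)
    x, energy, accepted = 0.0, 0.0, 0
    for _ in range(n_steps):
        trial = x + rng.uniform(-0.5, 0.5)
        trial_energy = trial * trial  # hypothetical energy model
        if trial_energy <= energy or rng.random() < math.exp(energy - trial_energy):
            x, energy, accepted = trial, trial_energy, accepted + 1
    return seed, accepted / n_steps

if __name__ == "__main__":
    # Dynamic load balancing: chains are handed to whichever worker
    # (CPU core, or in the paper's setting a CPU/GPU device) frees up first.
    with ProcessPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run_chain, seed) for seed in range(16)]
        for fut in as_completed(futures):
            seed, acc = fut.result()
            print(f"chain {seed}: acceptance ratio {acc:.2f}")
```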

Author(s):  
Zhuliang Yao ◽  
Shijie Cao ◽  
Wencong Xiao ◽  
Chen Zhang ◽  
Lanshun Nie

In trained deep neural networks, unstructured pruning can remove redundant weights to lower storage cost, but it requires customized hardware to speed up practical inference. Another line of work accelerates sparse model inference on general-purpose hardware by adopting coarse-grained sparsity, pruning or regularizing consecutive weights for efficient computation; however, this often sacrifices model accuracy. In this paper, we propose a novel fine-grained sparsity approach, Balanced Sparsity, to achieve high model accuracy efficiently on commodity hardware. Our approach adapts to the high-parallelism properties of GPUs, showing strong potential for sparsity in widely deployed deep learning services. Experimental results show that Balanced Sparsity achieves up to a 3.1x practical speedup for model inference on GPUs, while retaining the same high model accuracy as fine-grained sparsity.
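To make the idea concrete, here is a minimal numpy sketch of balanced pruning as the abstract describes it: each weight row is split into equal-sized blocks and the same number of smallest-magnitude weights is zeroed in every block, so the surviving non-zeros stay evenly distributed for GPU-friendly computation. The function name and block size are illustrative, not taken from the paper.

```python
import numpy as np

def balanced_prune(weights, block_size=4, keep=2):
    """Zero all but the `keep` largest-magnitude weights in each block.

    weights: 2-D array (rows x cols); cols must be divisible by block_size.
    Every block keeps exactly `keep` non-zeros, so sparsity is balanced
    across blocks and rows (the property exploited for GPU parallelism),
    unlike global magnitude pruning.
    """
    rows, cols = weights.shape
    blocks = weights.reshape(rows, cols // block_size, block_size)
    # Rank weights inside each block by magnitude; drop the smallest ones.
    order = np.argsort(np.abs(blocks), axis=-1)
    mask = np.ones_like(blocks, dtype=bool)
    np.put_along_axis(mask, order[..., : block_size - keep], False, axis=-1)
    return (blocks * mask).reshape(rows, cols)

w = np.random.randn(2, 8)
print(balanced_prune(w, block_size=4, keep=2))  # 50% sparsity, 2 non-zeros/block
```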


Author(s):  
Jianyong Chen ◽  
Qiuzhen Lin ◽  
Qingbin Hu

In this paper, a novel clonal algorithm for multiobjective optimization (NCMO) is presented, built on improved search operators: a dynamic mutation probability, a dynamic simulated binary crossover (D-SBX) operator, and a hybrid mutation operator combining Gaussian and polynomial mutations (GP-HM). The main idea behind these operators is to perform a more coarse-grained search at the initial stage in order to speed up convergence toward the Pareto-optimal front. Once the solutions get close to the Pareto-optimal front, a more fine-grained search is performed in order to reduce the gaps between the solutions and the Pareto-optimal front. To this end, a cooling schedule is adopted that gradually reduces the operator parameters to a minimal threshold, keeping a desirable balance between coarse-grained and fine-grained search. By this means, the exploratory capabilities of NCMO are enhanced. Simulation results show that NCMO performs remarkably well when compared with various recently developed state-of-the-art multiobjective optimization algorithms.
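The cooling schedule described above can be pictured with a minimal sketch: an operator parameter (here a mutation probability) starts high for coarse-grained exploration and decays toward a floor as generations progress. The geometric decay form and the constants are illustrative assumptions, not taken from the paper.

```python
def cooled_parameter(generation, max_generations, p_start=0.3, p_min=0.01):
    """Geometric cooling of a search-operator parameter.

    Early generations get a large value (coarse-grained search);
    later generations approach p_min (fine-grained search).
    Illustrative constants; the paper's actual schedule may differ.
    """
    ratio = generation / max_generations  # progress in [0, 1]
    return max(p_min, p_start * (p_min / p_start) ** ratio)

for g in (0, 50, 100):
    print(g, round(cooled_parameter(g, 100), 4))  # 0.3 -> ~0.055 -> 0.01
```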


Author(s):  
K. Liagkouras ◽  
K. Metaxiotis

In this paper, we present a novel Interval-Based Mutation (IBMU) operator. The proposed mutation operator performs a coarse-grained search at the initial stage in order to speed up convergence toward more promising regions of the search landscape; then a more fine-grained search is performed in order to guide the solutions towards the Pareto front. Computational experiments indicate that the proposed mutation operator performs better than conventional approaches on several well-known benchmark problems.
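A minimal sketch of the interval idea, under the assumption (mine, not the paper's) that the mutation step is drawn from an interval whose width shrinks as the run progresses:

```python
import random

def interval_mutation(x, generation, max_generations,
                      lower=0.0, upper=1.0, w_start=0.5, w_min=0.01):
    """Mutate x within an interval that narrows over the run.

    Early on the interval is wide (coarse-grained search); later it is
    narrow (fine-grained search near the Pareto front). The linear
    shrinkage and all constants are illustrative assumptions.
    """
    width = w_min + (w_start - w_min) * (1 - generation / max_generations)
    span = (upper - lower) * width
    return min(upper, max(lower, x + random.uniform(-span, span)))

print(interval_mutation(0.5, generation=0, max_generations=100))   # large steps
print(interval_mutation(0.5, generation=95, max_generations=100))  # small steps
```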


2017 ◽  
Vol 91 (7/8) ◽  
pp. 224-235
Author(s):  
Yvonne Krabbe-Alkemade ◽  
Tom Groot

This paper explores the question of how much detail a cost system needs in order to provide reliable cost information at a reasonable price. In general, fine-grained cost systems with a lot of detail (in product definitions, cost drivers and cost pools) are expected to provide more reliable cost information than coarse-grained cost systems with less detail. This paper takes as an example the DBC cost system that has been developed for the Dutch hospital sector. The fine-grained DBC system, with over 40,000 health care products, appears to outperform lower-grained DRG systems with “only” 15,000 and 6,000 health care products on cost homogeneity and predictive validity. It does so, however, at the cost of a high number of products with measurement and specification errors, caused by a large number of outliers and by a low number of observations in product groups. The cost-effectiveness of the DBC system is not very high: only 3% of all DBC codes explain 80% of total costs, whereas the lower-grained DRG system uses 14% of its codes to explain 80% of total costs. Combined with the high administration cost of the DBC system, it was, from an economic perspective, a sensible idea to replace the fine-grained DBC system with the coarse-grained DOT system.
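The 3%-versus-14% concentration figures quoted above come from a simple Pareto-style calculation; a sketch of it on made-up cost data (the distribution is illustrative, not the DBC data set):

```python
import numpy as np

def codes_needed_for_share(costs, share=0.80):
    """Fraction of product codes needed to explain `share` of total cost.

    costs: 1-D array with one total-cost figure per product code.
    Sorts codes by cost (largest first) and counts how many are needed
    before the running sum reaches the requested share of the total.
    """
    ordered = np.sort(costs)[::-1]
    cumulative = np.cumsum(ordered) / ordered.sum()
    n_needed = int(np.searchsorted(cumulative, share) + 1)
    return n_needed / len(costs)

# Made-up, heavily skewed cost distribution standing in for DBC data.
rng = np.random.default_rng(0)
costs = rng.pareto(1.2, size=40_000)
print(f"{codes_needed_for_share(costs):.1%} of codes explain 80% of costs")
```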


2020 ◽  
Vol 644 ◽  
pp. A151
Author(s):  
Mika Juvela

Context. Radiative transfer (RT) modelling is part of many astrophysical simulations. It is used to make synthetic observations and to assist in the analysis of observations. We concentrate on modelling the radio lines emitted by the interstellar medium, which, in connection with high-resolution models, can be a significant computational challenge. Aims. Our aim is to provide a line RT program that makes good use of multi-core central processing units (CPUs) and graphics processing units (GPUs). Parallelisation is essential to speed up computations and to enable large modelling tasks on personal computers. Methods. The program LOC is based on ray-tracing (i.e. not Monte Carlo) and uses standard accelerated lambda iteration methods for faster convergence. The program works on 1D and 3D grids. The 1D version makes use of symmetries to speed up the RT calculations. The 3D version works with octree grids and, to enable calculations with large models, is optimised for low memory usage. Results. Tests show that LOC results agree with other RT codes to within ∼2%. This is typical of code-to-code differences, which are often related to different interpretations of the model set-up. LOC run times compare favourably, especially with those of Monte Carlo codes. In 1D tests, LOC runs were faster by up to a factor of ∼20 on a GPU than on a single CPU core. In spite of the complex path calculations, a speed-up of up to ∼10 was also observed for 3D models using octree discretisation. GPUs enable calculations on models with hundreds of millions of cells, as are encountered in the context of large-scale simulations of interstellar clouds. Conclusions. LOC shows good performance and accuracy and is able to handle many RT modelling tasks on personal computers. It is written in Python, with only the computing-intensive parts implemented as compiled OpenCL kernels. It can therefore also serve as a platform for further experimentation with alternative RT implementation details.
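The Python-plus-OpenCL structure mentioned in the conclusions typically looks like the following minimal PyOpenCL sketch: the host handles set-up and I/O while a compiled kernel does the per-cell arithmetic on the device. This is a generic example of the pattern, not code from LOC, and the attenuation kernel is a toy stand-in for real RT work.

```python
import numpy as np
import pyopencl as cl

KERNEL_SRC = """
__kernel void attenuate(__global float *intensity,
                        __global const float *tau) {
    // Per-cell work item: apply an optical-depth attenuation factor.
    int i = get_global_id(0);
    intensity[i] *= exp(-tau[i]);
}
"""

ctx = cl.create_some_context()   # picks a CPU or GPU device
queue = cl.CommandQueue(ctx)
program = cl.Program(ctx, KERNEL_SRC).build()

n = 1 << 20
intensity = np.ones(n, dtype=np.float32)
tau = np.random.rand(n).astype(np.float32)

mf = cl.mem_flags
i_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=intensity)
t_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=tau)

program.attenuate(queue, (n,), None, i_buf, t_buf)  # one work item per cell
cl.enqueue_copy(queue, intensity, i_buf)            # results back to host
```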


2021 ◽  
Vol 8 (3) ◽  
pp. 1-18
Author(s):  
James Edwards ◽  
Uzi Vishkin

Boolean satisfiability (SAT) is an important performance-hungry problem with applications in many problem domains. However, most work on parallelizing SAT solvers has focused on coarse-grained, mostly embarrassing, parallelism. Here, we study fine-grained parallelism that can speed up existing sequential SAT solvers, which all happen to be of the so-called Conflict-Directed Clause Learning variety. We show the potential for speedups of up to 382× across a variety of problem instances. We hope that these results will stimulate future research, particularly with respect to a computer architecture open problem we present.
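As a flavour of what fine-grained parallelism means inside a CDCL solver, here is a small numpy sketch that checks the status of every clause at once under a partial assignment. It is a data-parallel CPU analogue for illustration only; the paper's platform and algorithms differ.

```python
import numpy as np

def clause_status(clauses, assignment):
    """Data-parallel status check of all clauses under a partial assignment.

    clauses: (n_clauses, max_len) int array of DIMACS-style literals,
             zero-padded. assignment: (n_vars + 1,) int array with
             1 = true, 0 = false, -1 = unassigned (index 0 unused).
    Returns boolean arrays: satisfied, conflicting, unit.
    """
    real = clauses != 0
    value = assignment[np.abs(clauses)]        # truth value of each variable
    negated = clauses < 0
    assigned = real & (value >= 0)
    literal_true = assigned & (np.where(negated, 1 - value, value) == 1)
    satisfied = literal_true.any(axis=1)
    open_lits = (real & ~assigned).sum(axis=1)  # unassigned literals left
    conflicting = ~satisfied & (open_lits == 0)
    unit = ~satisfied & (open_lits == 1)
    return satisfied, conflicting, unit

clauses = np.array([[1, 2, 0], [-1, 3, 0], [-1, -2, 0]])
assignment = np.array([-1, 1, -1, -1])          # x1 = true, rest open
print(clause_status(clauses, assignment))       # clause 0 satisfied, 1 and 2 unit
```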


2020 ◽  
Author(s):  
Jun Zhang ◽  
Yaokun Lei ◽  
Yi Isaac Yang ◽  
Yi Qin Gao

Molecular simulations are widely applied in the study of chemical and bio-physical systems. However, the accessible timescales of atomistic simulations are limited, and extracting equilibrium properties of systems containing rare events remains challenging. Two distinct strategies are usually adopted in this regard: either sticking to the atomistic level and performing enhanced sampling, or trading details for speed by leveraging coarse-grained models. Although both strategies are promising, either of them, if adopted individually, exhibits severe limitations. In this paper we propose a machine-learning approach to ally both strategies so that simulations on different scales can benefit mutually from their cross-talks: accurate coarse-grained (CG) models can be inferred from the fine-grained (FG) simulations through deep generative learning; in turn, FG simulations can be boosted by the guidance of CG models via deep reinforcement learning. Our method defines a variational and adaptive training objective which allows end-to-end training of parametric molecular models using deep neural networks. Through multiple experiments, we show that our method is efficient and flexible, and performs well on challenging chemical and bio-molecular systems.
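For readers unfamiliar with the FG-to-CG relationship these two scales imply, here is a minimal sketch of the standard first step: mapping fine-grained atom coordinates to coarse-grained beads via mass-weighted centres of mass. This is a textbook operation, not the paper's learned model.

```python
import numpy as np

def coarse_grain(fg_coords, masses, bead_members):
    """Map fine-grained coordinates to coarse-grained bead positions.

    fg_coords: (n_atoms, 3) array of atom positions.
    masses: (n_atoms,) array of atomic masses.
    bead_members: list of index arrays, one per CG bead, giving the
                  atoms that each bead represents (centre of mass).
    """
    beads = np.empty((len(bead_members), 3))
    for b, idx in enumerate(bead_members):
        m = masses[idx]
        beads[b] = (m[:, None] * fg_coords[idx]).sum(axis=0) / m.sum()
    return beads

# Toy example: 4 atoms mapped onto 2 beads of 2 atoms each.
coords = np.random.rand(4, 3)
masses = np.array([12.0, 1.0, 12.0, 1.0])
print(coarse_grain(coords, masses, [np.array([0, 1]), np.array([2, 3])]))
```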


2020 ◽  
Author(s):  
Javier Caceres-Delpiano ◽  
Lee-Ping Wang ◽  
Jonathan W. Essex

Atomistic models provide a detailed representation of molecular systems, but are sometimes inadequate for simulations of large systems over long timescales. Coarse-grained models enable accelerated simulations by reducing the number of degrees of freedom, at the cost of reduced accuracy. New optimisation processes to parameterise these models could improve their quality and range of applicability. We present an automated approach for the optimisation of coarse-grained force fields by reproducing free energy data derived from atomistic molecular simulations. To illustrate the approach, we implemented hydration free energy gradients as a new target for force field optimisation in ForceBalance and applied it successfully to optimise the uncharged side-chains and the protein backbone in the SIRAH coarse-grained protein force field. The optimised parameters closely reproduced the hydration free energies of atomistic models and gave improved agreement with experiment.
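The optimisation target described above amounts to minimising the mismatch between coarse-grained and atomistic hydration free energies over the force-field parameters. A minimal numpy sketch of such a least-squares objective with gradient descent follows; the linear surrogate model and all names are illustrative assumptions, not ForceBalance internals.

```python
import numpy as np

def objective(theta, predict_dG, reference_dG):
    """Sum of squared deviations between model and reference free energies."""
    return np.sum((predict_dG(theta) - reference_dG) ** 2)

def optimise(theta, predict_dG, grad_dG, reference_dG, lr=0.01, steps=500):
    """Gradient descent on the free-energy mismatch.

    grad_dG(theta) returns d(dG)/d(theta) per compound, the analogue of
    the hydration free energy gradients used as optimisation targets.
    """
    for _ in range(steps):
        residual = predict_dG(theta) - reference_dG   # (n_compounds,)
        theta = theta - lr * 2.0 * grad_dG(theta).T @ residual
    return theta

# Toy linear surrogate: dG_i(theta) = F_i . theta, with F fixed.
rng = np.random.default_rng(1)
F = rng.normal(size=(8, 3))              # 8 compounds, 3 parameters
true_theta = np.array([0.5, -1.0, 2.0])
ref = F @ true_theta                     # 'atomistic' reference data
theta = optimise(np.zeros(3), lambda t: F @ t, lambda t: F, ref)
print(np.round(theta, 3))                # recovers ~[0.5, -1.0, 2.0]
```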

