Efficient parallelization of short-range molecular dynamics simulations on many-core systems

2013 ◽  
Vol 88 (5) ◽  
Author(s):  
R. Meyer
Author(s):  
Hasan Metin Aktulga ◽  
Chris Knight ◽  
Paul Coffman ◽  
Kurt A O’Hearn ◽  
Tzu-Ray Shan ◽  
...  

Reactive molecular dynamics simulations are computationally demanding. Reaching spatial and temporal scales where interesting scientific phenomena can be observed requires efficient and scalable implementations on modern hardware. In this article, we focus on optimizing the performance of the widely used LAMMPS/ReaxC package for many-core architectures. As hybrid parallelism allows better leverage of the increasing on-node parallelism, we adopt thread parallelism in the construction of bonded and nonbonded lists and in the computation of complex ReaxFF interactions. To mitigate the I/O overheads due to the large volumes of trajectory data produced and to save users the burden of post-processing, we also develop a novel in situ tool for molecular species analysis. We analyze the performance of the resulting ReaxC-OMP package on two different architectures: (i) Mira, an IBM Blue Gene/Q system, and (ii) Cori-II, a Cray XC-40 system with Knights Landing processors. For pentaerythritol tetranitrate (PETN) systems of sizes ranging from 32 thousand to 16.6 million particles, we observe speedups in the range of 1.5–4.5×. We observe sustained performance improvements for up to 262,144 cores (1,048,576 processes) of Mira and a weak scaling efficiency of 91.5% in large simulations containing 16.6 million particles. The in situ molecular species analysis tool incurs only insignificant overheads across various system sizes and run configurations.


2012 ◽  
Vol 136 (15) ◽  
pp. 154702 ◽  
Author(s):  
Minerva González-Melchor ◽  
Gregorio Hernández-Cocoletzi ◽  
Jorge López-Lemus ◽  
Alejandro Ortega-Rodríguez ◽  
Pedro Orea

2015 ◽  
Vol 1753 ◽  
Author(s):  
Ralf Meyer ◽  
Chris M. Mangiardi

This article discusses novel algorithms for molecular-dynamics (MD) simulations with short-ranged forces on modern multi- and many-core processors like the Intel Xeon Phi. A task-based approach to the parallelization of MD on shared-memory computers and a tiling scheme to facilitate the SIMD vectorization of the force calculations are described. The algorithms have been tested with three different potentials, and the resulting speed-ups on Intel Xeon Phi coprocessors are shown.


Langmuir ◽  
2006 ◽  
Vol 22 (24) ◽  
pp. 9994-10002 ◽  
Author(s):  
Pritesh A. Patel ◽  
Junhwan Jeon ◽  
Patrick T. Mather ◽  
Andrey V. Dobrynin
