Practical Implementation of Lattice QCD Simulation on Intel Xeon Phi Knights Landing

With recent developments in parallel supercomputing architecture, many-core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, deeper memory hierarchies, and greater programming complexity. It has been necessary to adapt the MILC code to these new processors, starting with NVIDIA GPUs and, more recently, Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider the performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.

Performance Evaluation of Scientific Applications on Intel Xeon Phi Knights Landing Clusters

2018 International Conference on High Performance Computing & Simulation (HPCS) ◽

10.1109/hpcs.2018.00063 ◽

2018 ◽

Cited By ~ 4

Author(s):

Ji-Hoon Kang ◽

Oh-Kyoung Kwon ◽

Hoon Ryu ◽

Jinwoo Jeong ◽

Kyunghun Lim

Keyword(s):

Performance Evaluation ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Scientific Applications ◽

Knights Landing ◽

Intel Xeon

Simulating Multiphase Flows in Porous Media Using OpenFOAM on Intel Xeon Phi Knights Landing Processors

Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact - PEARC17 ◽

10.1145/3093338.3093350 ◽

2017 ◽

Cited By ~ 1

Author(s):

Zhi Shang ◽

Honggao Liu

Keyword(s):

Porous Media ◽

Multiphase Flows ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Flows In Porous Media ◽

Knights Landing ◽

Intel Xeon

Long-time simulations with complex code using multiple nodes of Intel Xeon Phi Knights Landing

Journal of Computational and Applied Mathematics ◽

10.1016/j.cam.2017.12.050 ◽

2018 ◽

Vol 337 ◽

pp. 18-36 ◽

Cited By ~ 1

Author(s):

Jonathan S. Graf ◽

Matthias K. Gobbert ◽

Samuel Khuvis

Keyword(s):

Xeon Phi ◽

Intel Xeon Phi ◽

Long Time ◽

Knights Landing ◽

Intel Xeon

Accelerating Seismic Simulations Using the Intel Xeon Phi Knights Landing Processor

Lecture Notes in Computer Science - High Performance Computing ◽

10.1007/978-3-319-58667-0_8 ◽

2017 ◽

pp. 139-157 ◽

Cited By ~ 5

Author(s):

Josh Tobin ◽

Alexander Breuer ◽

Alexander Heinecke ◽

Charles Yount ◽

Yifeng Cui

Keyword(s):

Xeon Phi ◽

Intel Xeon Phi ◽

Knights Landing ◽

Intel Xeon

Exploiting Very-Wide Vectors on Intel Xeon Phi with Lattice-QCD Kernels

2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP) ◽

10.1109/pdp.2016.116 ◽

2016 ◽

Cited By ~ 1

Author(s):

Andreas Diavastos ◽

Giannos Stylianou ◽

Giannis Koutsou

Keyword(s):

Lattice QCD ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Intel Xeon

Optimization of Lattice QCD with CG and multi-shift CG on Intel Xeon Phi Coprocessor

10.22323/1.251.0029 ◽

2016 ◽

Cited By ~ 1

Author(s):

Hirokazu Kobayashi ◽

Yoshifumi Nakamura ◽

Shinji Takeda ◽

Yoshinobu Kuramashi

Keyword(s):

Lattice QCD ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Intel Xeon

Performance Comparison of Intel Xeon Phi Knights Landing

SIAM Undergraduate Research Online ◽

10.1137/17s015896 ◽

2017 ◽

Vol 10 ◽

Cited By ~ 2

Author(s):

Ishmail Jabbie

Keyword(s):

Performance Comparison ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Knights Landing ◽

Intel Xeon

A parallel algorithm of Euclidean distance matrix computation for the Intel Xeon Phi Knights Landing many-core processor

Bulletin of the South Ural State University Series Computational Mathematics and Software Engineering ◽

10.14529/cmse180305 ◽

2018 ◽

Vol 7 (3) ◽

Keyword(s):

Parallel Algorithm ◽

Euclidean Distance ◽

Distance Matrix ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Euclidean Distance Matrix ◽

Matrix Computation ◽

Knights Landing ◽

Many Core ◽

Intel Xeon

DD-αAMG on QPACE 3

EPJ Web of Conferences ◽

10.1051/epjconf/201817502007 ◽

2018 ◽

Vol 175 ◽

pp. 02007 ◽

Cited By ~ 4

Author(s):

Peter Georg ◽

Daniel Richtmann ◽

Tilo Wettig

Keyword(s):

First Generation ◽

Xeon Phi ◽

Intel Xeon Phi ◽

Knights Landing ◽

Speedup Factor ◽

Single Processor ◽

Intel Xeon

We describe our experience porting the Regensburg implementation of the DD-αAMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications to the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as its scaling on many nodes; in both cases the speedup factor is close to theoretical expectations.
