Efficient spherical harmonic transforms on GPU and their use in planetary core dynamics simulations
<p>Most new supercomputers now rely on accelerators such as GPUs. These promise much higher performance than traditional CPU-only servers, both in floating-point throughput and in memory bandwidth. Furthermore, electricity consumption is significantly reduced, resulting in lower carbon emissions.<br>However, such high computation speeds can only be achieved if a set of more or less stringent rules is followed with respect to memory access and program flow. As a consequence, some algorithms approach peak performance more easily than others.</p><p>Here, we present the results of an effort to achieve high performance on recent NVIDIA GPU accelerators for the spherical harmonic transform. The spherical harmonic transform can be split into a Legendre transform (which is compute bound) and a Fourier transform (which is memory bound).<br>By taking advantage of recent algorithmic improvements as well as by tuning the Fourier transform, we can now compute a full forward or backward spherical harmonic transform up to degree 8191 on a single 16 GB Volta GPU in less than 0.35 seconds.<br>For lower resolutions (up to degree 1023), a single Volta GPU performs a full transform more than 3 times faster than a 48-core dual-socket Skylake Xeon Platinum server.</p><p>We also present results of an ongoing effort to port a code for the simulation of planetary core fluid and magnetic field dynamics to GPU-accelerated computers.</p>
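<p>The split into a Legendre step and a Fourier step can be illustrated with a minimal CPU-only sketch of a backward (synthesis) transform. This is not the GPU implementation described above. It assumes orthonormalized spherical harmonics with the Condon-Shortley phase, complex coefficients stored for m ≥ 0 only, and a naive Fourier sum where a production code would use an FFT. The function names <code>legendre_step</code>, <code>fourier_step</code> and <code>synthesize</code> are hypothetical.</p>

```python
import cmath
import math


def legendre_step(coeffs, lmax, theta):
    """Legendre step at one colatitude theta.

    Returns g[m] = sum over l of coeffs[(l, m)] * P_l^m(cos theta), using the
    standard three-term recurrence for orthonormalized associated Legendre
    functions (assumed normalization: integral of |Y_l^m|^2 over the sphere = 1).
    """
    x, s = math.cos(theta), math.sin(theta)
    g = [0j] * (lmax + 1)
    pmm = math.sqrt(1.0 / (4.0 * math.pi))  # P_0^0
    for m in range(lmax + 1):
        if m > 0:
            # P_m^m from P_{m-1}^{m-1}; the minus sign is the Condon-Shortley phase.
            pmm *= -math.sqrt((2 * m + 1) / (2 * m)) * s
        p_prev, p_cur = 0.0, pmm
        for l in range(m, lmax + 1):
            if l > m:
                a = math.sqrt((4 * l * l - 1) / (l * l - m * m))
                b = 0.0
                if l > m + 1:
                    b = math.sqrt(((l - 1) ** 2 - m * m) / (4 * (l - 1) ** 2 - 1))
                p_prev, p_cur = p_cur, a * (x * p_cur - b * p_prev)
            g[m] += coeffs.get((l, m), 0.0) * p_cur
    return g


def fourier_step(g, nphi):
    """Fourier step along longitude: f(phi_j) = sum over m of g[m] * exp(i m phi_j).

    Written as a naive sum for clarity; an FFT would be used in practice.
    """
    return [
        sum(gm * cmath.exp(1j * m * 2.0 * math.pi * j / nphi) for m, gm in enumerate(g))
        for j in range(nphi)
    ]


def synthesize(coeffs, lmax, thetas, nphi):
    """Backward transform: Legendre step, then Fourier step, for each latitude."""
    return [fourier_step(legendre_step(coeffs, lmax, t), nphi) for t in thetas]


# Usage: a field containing only the (l=1, m=0) harmonic on a small grid.
field = synthesize({(1, 0): 1.0}, lmax=2, thetas=[0.3, 1.1, 2.5], nphi=8)
```

<p>In this sketch the Legendre step loops over every (l, m) pair at every latitude, so its cost grows roughly as the cube of the maximum degree, while the Fourier step performs little arithmetic per value moved. This is consistent with the compute-bound and memory-bound characterization given above.</p>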