GPU acceleration in Nek5000 (with SESSI)

Owing to its high performance and throughput, GPU-accelerated computing is becoming popular in scientific computing, in particular through programming models such as CUDA and OpenACC. The main advantage of OpenACC is that codes can be ported to GPU systems largely in their "original" form by adding compiler directives, which allows an incremental approach. Here, an OpenACC implementation is applied to Nek5000, a CFD code for simulating incompressible flows based on the spectral-element method. The work builds on previous implementations and now focuses on the $P_N - P_{N-2}$ method for the spatial discretization of the Navier-Stokes equations. Performance results for the ported code show a speed-up of up to 3.1 on multiple GPUs for polynomial orders N > 11.
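To illustrate what directive-based porting looks like, the following is a minimal sketch in C of a batched small matrix-matrix multiplication annotated with OpenACC. It is not the Nek5000 implementation (Nek5000 itself is written largely in Fortran). The function name local_mxm, the row-major array layout, and the mapping n = N + 1 and nel = E are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Nek5000 routine): apply one n-by-n operator
 * matrix D to every element's n-by-m data block, out_e = D * u_e for
 * e = 0..nel-1. All arrays are stored row-major. In a spectral-element
 * setting n would correspond to N + 1 and nel to E (assumed here). */
void local_mxm(const double *D, const double *u, double *out,
               int n, int m, int nel)
{
    assert(n > 0 && m > 0 && nel > 0);
    const size_t blk = (size_t)n * (size_t)m;  /* entries per element block */

    /* The loop nest below is ordinary CPU code; only the directives are
     * added. A compiler without OpenACC support ignores the pragmas. */
    #pragma acc parallel loop collapse(3) \
        copyin(D[0:n*n], u[0:nel*blk]) copyout(out[0:nel*blk])
    for (int e = 0; e < nel; ++e) {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < m; ++j) {
                double sum = 0.0;
                /* Short inner product, kept sequential within each thread. */
                #pragma acc loop seq
                for (int k = 0; k < n; ++k) {
                    sum += D[i * n + k] * u[e * blk + (size_t)k * m + j];
                }
                out[e * blk + (size_t)i * m + j] = sum;
            }
        }
    }
}
```

Because the directives are plain pragmas, the same source builds as an ordinary CPU routine when OpenACC is not enabled. This is what makes it possible to port and validate a code one loop at a time.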

Performance of the matrix-matrix multiplication on a single P100 GPU, with respect to the polynomial order N and the number of elements E.

Published in Otero et al. https://doi.org/10.1016/j.jpdc.2019.05.010.