Nek5000

We will focus on two aspects in the further development of the Nek5000 code within SESSI. First, we will port Nek5000 to accelerators using OpenACC and CUDA. We will continue the effort to program Nek5000 so that accelerators perform the batched small matrix-matrix multiplications, which are the main computational kernel determining Nek5000 performance. This work will also include optimization, possibly using CUDA in combination with OpenACC, and improving the efficiency of data movement between host and GPU memories in the gather-scatter (GS) operator code. It will be carried out in collaboration with the EC-funded exascale project EPiGRAM-HS, which is led by PDC, and with researchers at Argonne National Laboratory. Second, we will consider new formulations of the compute- and communication-intensive kernels of Nek5000, including the main communication library, gslib. We will continue our work on introducing one-sided communication primitives into this library via UPC (Unified Parallel C), a PGAS (partitioned global address space) programming system that takes advantage of modern network hardware support for efficient one-sided communication. This work draws on the expertise of Niclas Jansson, who developed an initial proof of concept of such software. Features of these newer languages will also be used to overlap computation and communication by reorganising the communication flow.
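
To make the first aspect concrete, the following is a minimal sketch of a batched small matrix-matrix multiplication in C with an optional OpenACC directive. It is not taken from Nek5000: the function name batched_mxm, the row-major layout, and the assumption that a single n x n operator A is applied to every element's n x n block are illustrative choices only. The directive is guarded by _OPENACC, so the same code also builds and runs on the host with an ordinary C compiler.

```c
#include <stddef.h>

/* Hypothetical helper (not Nek5000 code): for each of nbatch elements,
 * compute C_e = A * B_e, where A is one shared n x n matrix and
 * B_e, C_e are the e-th n x n blocks of B and C. All matrices are
 * stored row-major; the layout is an assumption of this sketch. */
void batched_mxm(int nbatch, int n,
                 const double *A, const double *B, double *C)
{
    /* With an OpenACC compiler, the three outer loops are collapsed and
     * offloaded; the data clauses copy A and B to the device and C back.
     * Without OpenACC the directive is skipped and the loops run serially. */
#ifdef _OPENACC
#pragma acc parallel loop collapse(3) \
    copyin(A[0:n*n], B[0:nbatch*n*n]) copyout(C[0:nbatch*n*n])
#endif
    for (int e = 0; e < nbatch; ++e) {
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < n; ++j) {
                const size_t off = (size_t)e * n * n; /* start of block e */
                double sum = 0.0;
                for (int k = 0; k < n; ++k) {
                    sum += A[i * n + k] * B[off + (size_t)k * n + j];
                }
                C[off + (size_t)i * n + j] = sum;
            }
        }
    }
}
```

In this sketch the copyin and copyout clauses move all data on every call. Keeping the arrays resident on the device across calls (for example with an enclosing OpenACC data region) is the kind of host-GPU data-movement optimization referred to above.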