Designing the Next-Generation FFT Library

FFT algorithms and software libraries are workhorses of scientific computing and data analysis. Most large-scale scientific applications rely on FFT libraries such as FFTW and FFTPACK, whose designs date back to the 1980s and 1990s. These libraries were not originally designed for today’s hardware features, such as accelerators and vector instructions, or for emerging computational approaches, such as mixed-precision and machine-learning-assisted methods. For this reason, there is a need to rethink the foundations of FFT libraries and develop next-generation FFT solvers. This project aims to design and develop a portable, optimized multi-dimensional FFT library for emerging and future architectures, such as Nvidia and AMD GPUs, FPGAs, and upcoming RISC-V systems. As FFTW does, this project relies on automatic code generation, here through compiler technologies such as DaCE and MLIR, to generate optimized and portable code that can reuse FFTW’s optimized codelets and port them to new technologies. The figure on this page shows the development of a portable FFT using the DaCE framework: the DaCE graph, comprising tasklets that calculate a 1D FFT and map operations that parallelize the algorithm, is easily translated and optimized for different architectures (Nvidia and AMD GPUs, and FPGAs) by the DaCE source-to-source compiler. The use of the new FFT library is integrated and demonstrated in the GROMACS and Nek5000 SESSI codes.
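
The structure described above, a 1D FFT kernel applied through map operations, can be illustrated with a minimal pure-Python sketch. It is not the project's code and does not use the DaCE API: the names `fft1d` and `fft2d` are hypothetical, the 1D kernel is a textbook radix-2 Cooley-Tukey FFT standing in for a tasklet or codelet, and the list comprehensions stand in for maps whose iterations are independent and could therefore run in parallel on a GPU or FPGA.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT (illustrative 1D kernel).

    Assumption: len(x) is a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft1d(x[0::2])  # FFT of the even-indexed samples
    odd = fft1d(x[1::2])   # FFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half (butterfly step)
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d(a):
    """2D FFT built from independent 1D FFTs (row-column decomposition)."""
    # "Map" over rows: each row transform is independent of the others.
    rows = [fft1d(row) for row in a]
    # "Map" over columns of the row-transformed data.
    cols = [fft1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(row) for row in zip(*cols)]
```

Because every row (and then every column) is transformed independently, the same decomposition extends to higher dimensions and maps naturally onto parallel hardware; only the 1D kernel needs architecture-specific tuning.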