Sessi MCP: Low-Overhead Thread-Parallelization Library

Over the past decade, OpenMP has become the most popular threading model for HPC applications, and the GROMACS molecular simulation package currently uses it for thread parallelization. For task parallelization, however, OpenMP has both performance and feature limitations. Many other frameworks provide better tasking functionality, but they carry a large overhead, usually because they support dynamic dependency graphs. GROMACS needs a simple, very low-overhead threading framework that takes hardware (cache) locality into account and requires only static scheduling and a few synchronization primitives. Writing such a framework should not be a large effort and can be done independently of GROMACS. The resulting library would be useful for any code that needs statically scheduled tasks of microsecond-to-millisecond duration.
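
To make "static scheduling and a few synchronization primitives" concrete, here is a minimal C++ sketch. It is not the project's design or API. The names SpinBarrier, Task and runStaticSchedule are hypothetical, and the round-robin task-to-thread mapping is only an assumption chosen for brevity. Every task is assigned to a fixed thread before execution starts, and the only synchronization primitive is a spinning barrier between phases. Cache locality (for example, thread pinning) is not addressed here.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sense-reversing spin barrier. Threads spin (with yield)
// instead of sleeping, which suits waits that are expected to be short.
class SpinBarrier {
public:
    explicit SpinBarrier(int numThreads)
        : numThreads_(numThreads), waiting_(0), sense_(false) {}

    void wait() {
        // All threads read the same sense value on entry, because it is
        // only flipped after every thread has arrived.
        const bool mySense = !sense_.load(std::memory_order_relaxed);
        if (waiting_.fetch_add(1, std::memory_order_acq_rel) == numThreads_ - 1) {
            // Last thread to arrive: reset the counter, then release the others.
            waiting_.store(0, std::memory_order_relaxed);
            sense_.store(mySense, std::memory_order_release);
        } else {
            while (sense_.load(std::memory_order_acquire) != mySense) {
                std::this_thread::yield();
            }
        }
    }

private:
    const int         numThreads_;
    std::atomic<int>  waiting_;
    std::atomic<bool> sense_;
};

// std::function is used for brevity only; a real low-overhead library
// would likely avoid its indirection and possible allocation.
using Task = std::function<void()>;

// Runs a sequence of phases. Within a phase, task i always runs on thread
// i % numThreads (a static assignment, no work queue or dependency graph).
// A barrier separates consecutive phases.
void runStaticSchedule(const std::vector<std::vector<Task>>& phases, int numThreads) {
    assert(numThreads > 0);
    SpinBarrier barrier(numThreads);

    auto worker = [&](int tid) {
        for (const auto& phase : phases) {
            for (std::size_t i = static_cast<std::size_t>(tid); i < phase.size();
                 i += static_cast<std::size_t>(numThreads)) {
                phase[i]();
            }
            barrier.wait();
        }
    };

    std::vector<std::thread> threads;
    for (int t = 1; t < numThreads; ++t) {
        threads.emplace_back(worker, t);
    }
    worker(0); // the calling thread acts as thread 0
    for (auto& th : threads) {
        th.join();
    }
}
```

A caller would build one task list per phase (for instance, one phase that fills an array and a second phase that reduces it) and pass the lists to runStaticSchedule together with the thread count. Because the mapping from tasks to threads is fixed, the same thread touches the same data in every step, and that property is what a locality-aware library could exploit.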

People involved:
Szilard Pall (PDC/MolSim)
Berk Hess (MolSim)
John Eblen (ORNL)
Roland Schulz (Intel)
Stefano Markidis (PDC)

M1: Decision on which frameworks to try and compare (March 2016)
M2: Implement and benchmark a basic task-parallelized MD loop (July 2016)
M3: Convert main components of the MD loop to task parallelization (December 2016)