In recent years, the former EU project POP CoE (Performance Optimisation and Productivity Centre of Excellence in HPC) conducted hundreds of performance analyses of HPC applications. The results showed that load imbalances, even more than MPI communication, pose the greatest challenge to the scaling of codes. Increasingly complex simulation projects with coupled requirements, mixed discretization methods, and dynamic adaptivity combine to make a perfect static load distribution impossible. On top of this, variability in hardware performance makes predictive load distribution even more difficult.
Task-based programming models are inherently adaptive, dynamic, and reactive: they continuously adjust to the current load distribution and respond to load imbalances by migrating tasks. On distributed-memory architectures, however, these three advantages are substantially limited and incur high costs for communication and data transfer: adaptive actions require knowledge of the load distribution that is not available locally, reactive actions must exchange task input and output data between distant memory locations, and dynamic actions are delayed by additional communication time, especially communication latency. Whereas tasks residing in the same memory location can be started simply by dispatching them to nearby compute cores, triggering a task migration only after a load imbalance between MPI ranks has been detected is often too late because of these additional communication costs.
In the BMBF-funded project Chameleon, the OpenMP "target" construct was extended so that accordingly marked OpenMP tasks could be migrated reactively to other MPI processes, executed there, and their results returned. targetDART builds on these features and develops the "omp target" approach further, so that asynchronous task-based programming with reactive load balancing can be used on exascale architectures, through close interaction between the application and the runtime system of the programming model.
targetDART plans to use tasking concepts from OpenMP as a foundation for supporting heterogeneity and migration, in order to develop new load balancing strategies for applications, particularly with respect to exascale architectures. Building on established, standardized concepts will make it possible to feed proven approaches back into the MPI and OpenMP programming standards. In this way, the success story of Chameleon will be continued while the connection between research and standardization activities is strengthened. This effort constitutes an important contribution to the German and European HPC community.
01 October 2022 – 30 September 2025
Optimization & Scalability
Scalable Programming Models and Tools
BMBF - SCALEXA call
Head, Scalable Programming Models and Tools
High-Performance Computing Center Stuttgart
Nobelstraße 19, 70569 Stuttgart, Germany
+49 711 685-87209