Introduction to Hybrid Programming in HPC and Porting and Optimization Workshop



The agenda for Wednesday - Friday has been updated with information on the videoconference on Wednesday.

Last modification on Jan 24, 2020.

Monday-Tuesday: Introduction to Hybrid Programming in HPC (open to everybody)

Most HPC systems are clusters of shared memory nodes. To use such systems efficiently, both memory consumption and communication time have to be optimized. Hybrid programming therefore combines distributed memory parallelization on the node interconnect (e.g., with MPI) with shared memory parallelization inside each node (e.g., with OpenMP or MPI-3.0 shared memory). This course analyzes the strengths and weaknesses of several parallel programming models on clusters of SMP nodes. Multi-socket multi-core systems in highly parallel environments are given special consideration. MPI-3.0 has introduced a new shared memory programming interface, which can be combined with inter-node MPI communication. It can be used for direct neighbor accesses similar to OpenMP or for direct halo copies, and it enables new hybrid programming models. These models are compared with various hybrid MPI+OpenMP approaches and with pure MPI. Numerous case studies and micro-benchmarks demonstrate the performance-related aspects of hybrid programming.
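To make the memory argument concrete, the following is a minimal sketch of a simplified model, not material from the course: a cubic block of cells per node is split over a 3D grid of MPI processes, and each process stores a halo of width 1 around its subdomain. The function names, the problem size of 512 cells per edge, and the process grids are all assumptions chosen for illustration.

```python
def halo_cells(nx, ny, nz, width=1):
    """Halo (ghost) cells wrapped around an nx x ny x nz block of owned cells."""
    padded = (nx + 2 * width) * (ny + 2 * width) * (nz + 2 * width)
    return padded - nx * ny * nz


def node_halo_cells(n, grid, width=1):
    """Total halo cells on one node when an n^3 block is split over a process grid."""
    px, py, pz = grid
    if n % px or n % py or n % pz:
        raise ValueError("n must be divisible by every grid dimension")
    return px * py * pz * halo_cells(n // px, n // py, n // pz, width)


# Hypothetical per-node problem size and process grids (assumptions, not course data).
N = 512
layouts = {
    "pure MPI, 128 ranks": (8, 4, 4),
    "hybrid, 8 ranks x 16 threads": (2, 2, 2),
    "hybrid, 1 rank x 128 threads": (1, 1, 1),
}
for name, grid in layouts.items():
    print(f"{name:30s} {node_halo_cells(N, grid):>10,d} halo cells per node")
```

In this model the pure MPI layout stores more than twice as many halo cells per node as the layout with 8 processes, because threads inside a process share one subdomain and need no halos between them. Real applications differ in halo width, data per cell, and decomposition, so the numbers only indicate the trend.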

Hands-on sessions are included on both days. Tools for hybrid programming, such as thread/process placement support and performance analysis, are presented in a "how-to" section. This course provides scientific training in Computational Science and, in addition, fosters scientific exchange among the participants. This course is organized by HLRS, in cooperation with RRZE and VSC (Vienna Scientific Cluster).

Target audience: Scientists developing HPC applications with MPI.

Although these two days are independent of any specific hardware, the "hybrid days" may be good preparation for the challenges of porting your application to Hawk, the new AMD Rome based HPE cluster at HLRS with 128 cores per node.
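One practical consequence of 128 cores per node is the number of ways they can be divided between MPI processes and threads. The sketch below simply enumerates the combinations that occupy every core exactly once. It assumes one thread per core, and the function name `rank_thread_layouts` is hypothetical.

```python
def rank_thread_layouts(cores_per_node):
    """Return all (ranks_per_node, threads_per_rank) pairs that fill every core once."""
    return [
        (ranks, cores_per_node // ranks)
        for ranks in range(1, cores_per_node + 1)
        if cores_per_node % ranks == 0
    ]


# 128 cores per node is the figure given for Hawk above.
for ranks, threads in rank_thread_layouts(128):
    print(f"{ranks:3d} MPI ranks x {threads:3d} threads per rank")
```

Which of these layouts performs best depends on the application and on the memory hierarchy of the node, so the list is a starting point for experiments rather than a recommendation.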

Wednesday-Friday: Porting and Optimization Workshop (only for customers of HLRS)

Wednesday morning is dedicated to introductions to the new system (hardware characteristics, batch system, compilers, MPI, ...).

In this workshop, users can port their applications to HLRS' new AMD-based HPE Apollo 9000 supercomputer "Hawk" (to be installed in early 2020), assisted by HLRS and HPE staff. Hawk itself will not be available during the course, but you can start porting your application on our small Hawk test system.

In order to achieve usable efficiency enhancements, it is important to discuss pros and cons of potential solutions. This, however, requires application as well as machine expertise. Hence, this workshop brings together our users (with their application expertise) and support staff (with their machine expertise).

Target audience: Groups holding a compute time budget to be used on Hawk.

We combined both course parts because the hybrid days may be good preparation for the challenges of porting your application to Hawk with its 128 cores per node.

Agenda & Content (preliminary)


1st day (Hybrid programming, part 1)

08:45   Registration
09:00      Welcome
09:05      Motivation
09:15      Introduction
09:45      Programming Models
09:50       - MPI + OpenMP
10:30   Coffee Break
10:50       - continue: MPI + OpenMP
11:40         Practical (how to compile and start)
12:30         Practical (hybrid through OpenMP parallelization)
13:00   Lunch
14:00         Practical (continued)
15:00   Coffee Break
15:20       - Overlapping Communication and Computation
15:40         Practical (taskloops)
16:20       - MPI + OpenMP Conclusions
16:30       - MPI + Accelerators
16:45      Tools
17:00   End of first day

2nd day (Hybrid programming, part 2)

09:00      Programming Models (continued)
09:05       - MPI + MPI-3.0 Shared Memory
09:45         Practical (replicated data)
10:30   Coffee Break
10:50         continue: Practical (replicated data)
11:50       - MPI Memory Models and Synchronization
12:30   Lunch
13:30       - Pure MPI
13:50       - Topology Optimization
14:30   Coffee Break
14:50         Practical (application aware Cartesian topology)
15:45       - Topology Optimization (Wrap up)
16:00      Conclusions
16:15      Q & A
16:30   End of second day (course)


Supported by HLRS and HPE specialists, you will learn how to port your application to the new supercomputer system “Hawk”. All categories of bottlenecks (CPU, memory subsystem, communication and I/O) will be addressed, according to the respective requirements.

The talks on Wednesday morning will also be provided via a video conference system.

Please join the videoconference with your favourite browser on Windows, macOS, or GNU/Linux:


  • In order to interact (e.g. ask questions), please use the text-chat system provided in the videoconference room. The participants' audio is otherwise muted.
  • The recording of the lectures will be made available online.

3rd day (Porting workshop):

08:45   Registration (for new participants)
09:00      Module-Environment
                - including important libraries
09:15      Processor
                - AMD-specific topics
                - Compiler
09:45      MPI
                - usage of MPI on Hawk
10:15      Batch System
10:30   Coffee Break
on demand:       Performance Tools
13:00   Lunch
14:00      Porting of own applications to Hawk (on the Hawk test system)
17:30   End of third day

4th day (Porting workshop, continued):

09:00-17:30 Porting of own applications to Hawk (on the Hawk test system) - continued

5th day (Porting workshop, continued):

09:00-15:30 Porting of own applications to Hawk (on the Hawk test system) - continued

We recommend booking only cancellable tickets and hotels, so that you can leave as soon as your porting work is done.


Hybrid programming days:
Basic MPI and OpenMP knowledge as presented, e.g., in our Training Courses on MPI and OpenMP.
For the hands-on sessions, you should know Unix/Linux and either C/C++ or Fortran.

Porting and optimization workshop:
Your group holds a compute time budget to be used on Hawk.


The course language is English.

Course material

Hybrid programming days: See


Hybrid programming days: Dr. habil. Georg Hager (RRZE/HPC, Uni. Erlangen), Dr. Rolf Rabenseifner (HLRS, Uni. Stuttgart), Dr. Claudia Blaas-Schenner and Dr. Irene Reichl (VSC Team, TU Wien)

Porting and optimization days: HLRS, HPE and AMD staff


The course is full. Therefore, registration is closed.
To receive early information about future courses, you may subscribe to our e-mail list via the online registration form.

Extended Deadline

The extended deadline for registration is Jan. 19, 2020.

Late registrations after the deadline are still possible, but the handouts may be of reduced quality.


Members of universities and public research institutes within EU or PRACE member countries, registering only for Wednesday - Friday: 0 EUR
Students without Diploma/Master: 25 EUR
Students with Diploma/Master (PhD students) at German universities: 45 EUR
Members of German universities and public research institutes: 45 EUR
Members of universities and public research institutes within EU or PRACE member countries: 90 EUR
Members of other universities and public research institutes: 180 EUR
Others: 420 EUR

(includes coffee breaks)


Travel Information and Accommodation

See our How to find us page.


HLRS is part of the Gauss Centre for Supercomputing (GCS), which is one of the six PRACE Advanced Training Centres (PATCs) that started in Feb. 2012.
HLRS is also a member of the Baden-Württemberg initiative bwHPC-C5.
This course is provided within the framework of the bwHPC-C5 user support. This course is not part of the PATC curriculum and is not sponsored by the PATC program.

Local Organizer

Rolf Rabenseifner, phone 0711 685 65530
Lorenzo Zanon, phone 0711 685 63824
Björn Dick, phone 0711 685 87189

Shortcut-URL & Course Number