Höchstleistungsrechenzentrum Stuttgart

Introduction to OpenMP Offloading with AMD GPUs

Learn how to use the APUs and GPUs in HLRS's systems Hunter and Herder. OpenMP is our recommended portable way to do so.

OpenMP is one of the major options for using GPUs to accelerate or offload computations on today's heterogeneous computer systems. This course gives an introduction to the AMD Instinct™ GPU and Accelerated Processing Unit (APU) architectures to lay the foundations of how GPUs work and how they can be used for offloading in OpenMP. New features of recent OpenMP versions and GPUs, such as the unified memory programming model, will be introduced; these make writing HPC applications much easier across a wide range of GPU programming models. In addition, tools for performance analysis and optimization will be presented.

This course targets beginners in GPU programming who have basic knowledge of parallelization with OpenMP on CPUs. After this course you will have learned the basics to confidently start porting your application from a CPU-only system to systems with discrete GPU accelerators or APUs.

Venue

Online course
Organizer: HLRS, University of Stuttgart, Germany

Course start

October 27, 2026
09:00

Course end

October 28, 2026
13:00

Language

English

Entry level

Basic

Subject areas

Hardware accelerators

Parallel programming

Topics

Code optimization

GPU programming

OpenMP


Prerequisites and content levels

Prerequisites

Basic experience in OpenMP programming, e.g., from attending the Parallel Programming Workshop. Participants should have an application developer's general knowledge of computer hardware and operating systems and be familiar with C/C++ or Fortran.

See also the suggested prereading below (resources and public videos).

Content levels

Basic: 2 hours
Intermediate: 2.5 hours
Advanced: 1 hour

Learn more about course curricula and content levels

Instructors

Presenters: Michael Klemm, Luka Stanisic, Johanna Potyka
Additional HLRS and AMD staff to support exercises (tbd)

Learning outcomes

In this course, participants will

  • Gain foundational knowledge about GPUs and APUs and their roles in high-performance computing.
  • Learn how to utilize OpenMP offloading with unified shared memory to simplify data management and improve performance.
  • Explore techniques for explicit data management in OpenMP offloading, enabling more control over data movement and optimization for discrete GPUs.
  • Understand the principles and benefits of asynchronous offloading to enhance computational efficiency and overlap computation with data transfer.
  • Discover various tools and methodologies for analyzing and optimizing the performance of their applications.
  • Apply their knowledge in a practical session where they will port a small application, reinforcing the concepts learned throughout the course.

Agenda

Preliminary - All times are CET.

Day 1:

  • 08:45 - 09:00 Drop-in to Webex
  • 09:00 - 13:00 Introduction to OpenMP offload with and without unified shared memory (with exercises)
  • 14:00 - 17:00 Participants can continue working on the exercises or run their own experiments; limited support is available via chat.

Day 2: 

  • 08:45 - 09:00 Drop-in to Webex
  • 09:00 - 13:00 Real-world OpenMP porting: application porting examples and tools (with exercises)
  • 14:00 - 17:00 Participants can continue working on the exercises or run their own experiments; limited support is available via chat.

Lectures and exercises will cover the following topics:

  • Introduction to GPUs and APUs
  • OpenMP offload using unified shared memory
  • OpenMP offload with explicit data management
  • Asynchronous offloading
  • Real-world porting and optimization examples
  • Tools for performance analysis and optimization
  • Hands-on porting of a small application

Registration information

Apply for this course via the button at the top of this page (will be available soon).

Please be aware that the talks and Q&amp;A sessions will be recorded. By registering, you declare that you are aware of and consent to the recording.

Registration closes on October 13, 2026.

Fees

This course is free of charge.

Resources for additional reading


Contact

Tobias Haas, phone: 0711 685 87223, training(at)hlrs.de

HLRS Training Collaborations in HPC and AI

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. SIDE is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.
Since 2025, HLRS has been coordinating one of the AI Factories of the EuroHPC JU: HammerHAI.

Further courses

See the training overview and the Supercomputing Academy pages.
