Höchstleistungsrechenzentrum Stuttgart

AMD Instinct™ GPU Training

Learn how to use and program the APUs in HLRS's system Hunter.

This course gives a deep dive into the AMD Instinct™ GPU architecture and its ROCm™ ecosystem, including the tools to develop or port HPC or AI applications to AMD GPUs. Participants will be introduced to the programming models for discrete GPUs and APUs. The course provides crucial insights into how to use and program HLRS's system Hunter, which is equipped with MI300A APUs, as well as the next system, Herder.

The new unified memory programming model makes writing HPC applications much easier for a wide range of GPU programming models. We will cover how to use pragma-based languages such as OpenMP, the basic GPU programming language HIP, and performance-portable languages such as Kokkos and RAJA. In addition, there will be presentations on other important topics such as GPU-aware MPI and affinity. The AMD tool suite, including the debugger and the profiling tools, will also be covered.

A short introduction will also be given to using GPUs with Python and to container workflows, in particular for AI.

Location

Online course
Organizer: HLRS, University of Stuttgart, Germany

Start

April 21, 2026
13:00

End

April 24, 2026
17:00

Language

English

Entry level

Intermediate

Topic areas

Hardware accelerators

Performance optimization & debugging

Topics

Code optimization

GPU programming

MPI+OpenMP

OpenMP

Prerequisites and content levels

Prerequisites

Some knowledge of GPU and/or HPC programming is required. Participants should have an application developer's general knowledge of computer hardware, operating systems, and at least one HPC programming language.

See also the suggested prereading below (resources and public videos).

Content levels

Basic: 1 hour
Intermediate: 7 hours
Advanced: 6 hours

Learn more about course curricula and content levels.

Resources
  • Book on HIP programming - Porting CUDA
    • Accelerated Computing with HIP, Yifan Sun, Trinayan Baruah, David R. Kaeli,
      ISBN-13: 979-8218107444
  • Book on OpenMP GPU programming
    • Programming Your GPU with OpenMP, Tom Deakin and Tim Mattson,
      ISBN-13: 978-0262547536
  • Book on parallel and high performance computing topics
    • Parallel and High Performance Computing, Robert Robey and Yuliana Zamora, Manning Publications,
      ISBN-13: 978-1617296468
  • ENCCS resources
  • AMD Lab Notes series on GPUOpen.com
    • Finite difference method - Laplacian part 1
    • Finite difference method - Laplacian part 2
    • Finite difference method - Laplacian part 3
    • Finite difference method - Laplacian part 4
    • AMD matrix cores
    • Introduction to profiling tools for AMD hardware
    • AMD ROCm™ installation
    • AMD Instinct™ MI200 GPU memory space overview
    • Register pressure in AMD CDNA2™ GPUs
    • GPU-Aware MPI with ROCm
    • Creating a PyTorch/TensorFlow Code Environment on AMD GPUs
    • Jacobi Solver with HIP and OpenMP offloading
    • Sparse matrix vector multiplication - part 1
  • Quick start guides at Oak Ridge National Laboratory

Instructors

Bob Robey, AMD Global Training Lead HPC and Sovereign AI Enablement, and additional AMD staff.

Learning outcomes

After this course, participants will

  • have learned about the many GPU programming languages for AMD GPUs
  • understand how to achieve performance scaling
  • have gained knowledge about the AMD programming tools
  • have received an introduction to the AMD machine learning software
  • know about profiling and debugging.

Agenda (subject to change)

All times are CEST.

Day 1 - AMD Programming Model, OpenMP and MPI

12:45 - 13:00 Drop in to call

  • 13:00 HLRS Intro
  • 13:10 AMD Presentation Roadmap and Introduction to the System for Exercises
  • 13:20 Programming Model Instinct GPUs
  • 13:45 Programming Model Exercises
  • 14:00 Break
  • 14:10 OpenMP® Offloading with unified shared memory on the APU
  • 14:55 Break
  • 15:10 OpenMP® Exercises
  • 15:40 OpenMP® Offloading on discrete GPUs
  • 16:10 OpenMP® Exercises Part II
  • 16:55 Wrap-up

Day 2 - MPI and HIP and interoperability

12:45 - 13:00 Drop in to call

  • 13:00 HIP and ROCm
  • 14:00 HIP Exercises
  • 14:15 Break
  • 14:30 Porting code to HIP
  • 14:50 Porting Exercises
  • 15:00 OpenMP® and HIP Interoperability
  • 15:25 Interoperability Exercises
  • 15:40 Break
  • 15:55 Optimizing HIP Code
  • 16:35 HIP Optimization Exercises
  • 16:55 Wrap-up

Day 3 - Performance Portable languages, C++ Std Par, MPI and Python+GPU

12:45 - 13:00 Drop in to call

  • 13:00 Communication on AMD GPUs: GPU-Aware MPI and RCCL
  • 13:30 MPI Exercises
  • 13:50 Break
  • 14:00 Python for HPC: CuPy, HIP-Python and mpi4py
  • 15:00 AI workflows and containers
  • 15:30 Container Exercises
  • 15:50 Break
  • 16:00 Performance Portability Frameworks; Intro to Kokkos
  • 16:30 C++ Standard Parallelism
  • 16:55 Wrap-up

Day 4 - AMD Debuggers and Profiling Tools

12:45 - 13:00 Drop in to call

  • 13:00 Debugging with ROCgdb
  • 13:40 ROCgdb Exercises
  • 14:00 Break
  • 14:15 GPU Profiling - Performance Timelines
  • 14:55 Timeline Profiling Exercises
  • 15:15 Break
  • 15:30 Kernel Profiling with rocprof-compute
  • 16:15 Kernel Profiling Exercises
  • 16:45 Additional Training Resources
  • 16:55 Wrap-up

Registration information

Apply for this course via the button at the top of this page (will be available soon).

Please be aware that the talks and Q&A sessions will be recorded. By registering, you declare that you are aware of and consent to the recording.

Registration closes on April 11, 2026.

Fees

This course is free of charge.

Contact

Tobias Haas, phone 0711 685 87223, training(at)hlrs.de

HLRS Training Collaborations in HPC and AI

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC. Since 2025, HLRS has coordinated HammerHAI.

Further courses

See the training overview and the Supercomputing Academy pages.
