AMD Instinct™ GPU Training

All communication will be done through Zoom, Slack and email.

This course offers a deep dive into the AMD Instinct™ GPU architecture and its ROCm™ ecosystem, including the tools for developing or porting HPC or AI applications to AMD GPUs. Participants will be introduced to the programming models for the MI200-series GPUs and the MI300A APU. The new unified memory programming model makes writing HPC applications substantially easier across a wide range of GPU programming models. We will cover pragma-based languages such as OpenMP, the core GPU programming language HIP, and performance-portable frameworks such as Kokkos and RAJA. In addition, there will be presentations on other important topics such as GPU-aware MPI and affinity. The AMD tool suite, including the debugger rocgdb and the profiling tools rocprof, omnitrace, and omniperf, will also be covered. A short introduction will be given to the AMD machine learning software stack, including PyTorch and TensorFlow, and how they have been used in HPC.

After this course, participants will

  • have learned about the many GPU programming languages available for AMD GPUs
  • understand how to achieve performance and scaling
  • have gained knowledge of the AMD programming tools
  • have received an introduction to the AMD machine learning software
  • know the basics of profiling and debugging on AMD GPUs.

Venue

Online course
Organizer: HLRS, University of Stuttgart, Germany

Course start

22 Apr 2024
13:00

Course end

25 Apr 2024
17:00

Language

English

Entry level

Intermediate

Topic areas

Data in HPC / Deep Learning / Machine Learning

Parallel Programming

Performance Optimization & Debugging

Topics

Accelerators

Code Optimization

GPU Programming

Machine Learning

MPI+OpenMP

OpenMP


Prerequisites and content levels

Prerequisites

Some knowledge of GPU and/or HPC programming is required. Participants should have an application developer's general knowledge of computer hardware, operating systems, and at least one HPC programming language.

See also the suggested prereading below (resources and public videos).

Content levels

Basic: 1 hour
Intermediate: 7 hours
Advanced: 6 hours

Learn more about course curricula and content levels.

Resources
  • Book on HIP programming and porting from CUDA
    • Accelerated Computing with HIP, Yifan Sun, Trinayan Baruah, David R. Kaeli,
      ISBN-13: 979-8218107444
  • Book on OpenMP GPU programming
    • Programming Your GPU with OpenMP, Tom Deakin and Tim Mattson,
      ISBN-13: 978-0262547536
  • Book on parallel and high-performance computing topics
    • Parallel and High Performance Computing, Robert Robey and Yuliana Zamora, Manning Publications,
      ISBN-13: 978-1617296468
  • ENCCS resources
  • AMD Lab Notes series on GPUOpen.com

    • Finite difference method - Laplacian part 1
    • Finite difference method - Laplacian part 2
    • Finite difference method - Laplacian part 3
    • Finite difference method - Laplacian part 4
    • AMD matrix cores
    • Introduction to profiling tools for AMD hardware
    • AMD ROCm™ installation
    • AMD Instinct™ MI200 GPU memory space overview 
    • Register pressure in AMD CDNA2™ GPUs
    • GPU-Aware MPI with ROCm
    • Creating a PyTorch/TensorFlow Code Environment on AMD GPUs
    • Jacobi Solver with HIP and OpenMP offloading
    • Sparse matrix vector multiplication - part 1
  • Quick start guides at Oak Ridge National Laboratory

Instructors

Bob Robey, AMD Global Training Lead for Data Center GPUs

Additional AMD Staff Presenters:
Gina Sitaraman
Shelby Lockhart
Samuel Antao
Paul Bauer
Suyash Tandon
Alessandro Fanfarillo
Mahdieh Ghazimirsaeed
Joanna Morgan
Ian Bogle

Assisting AMD Staff
Cathal McCabe

Agenda (subject to change)

All times are CEST.
Day 1 (Mon) - AMD Programming Model, OpenMP and MPI

12:45 - 13:00 Drop in to Zoom

  • 13:00 HLRS Intro
  • 13:10 AMD Presentation Roadmap and Introduction to the System for Exercises – Bob Robey
  • 13:20 Programming Model for MI200 and MI300 series – Gina Sitaraman
  • 13:45 Programming Model Exercises – Gina Sitaraman
  • 14:00 Break
  • 14:10 Introduction to OpenMP® Offloading – Shelby Lockhart
  • 14:40 OpenMP® Exercises – Shelby Lockhart
  • 14:55 Break
  • 15:10 Real-World OpenMP® Language Constructs – Shelby Lockhart
  • 15:45 OpenMP® Language Constructs Exercises – Shelby Lockhart
  • 16:00 Advanced OpenMP® - zero-copy, debugging and optimization – Samuel Antao
  • 16:30 Advanced OpenMP® Exercises – Samuel Antao
  • 16:50 Wrapup – Bob Robey
Day 2 (Tue) - HIP, Porting, and OpenMP Interoperability

12:45 - 13:00 Drop in to Zoom

  • 13:00 HIP and ROCm – Bob Robey
  • 14:00 HIP Exercises – Bob Robey
  • 14:15 Break
  • 14:30 Porting code to HIP – Paul Bauer
  • 14:50 Porting Exercises – Paul Bauer
  • 15:00 Optimizing HIP Code – Gina Sitaraman
  • 15:40 HIP Optimization Exercises – Gina Sitaraman
  • 16:00 Break
  • 16:15 OpenMP® and HIP Interoperability – Bob Robey
  • 16:40 Interoperability Exercises – Bob Robey
  • 16:55 Wrapup – Bob Robey
Day 3 (Wed) - Performance-Portable Languages, C++ Std Par, MPI, and Machine Learning

12:45 - 13:00 Drop in to Zoom

  • 13:00 Performance Portability Frameworks; Intro to Kokkos – Suyash Tandon
  • 13:30 Kokkos Exercises – Suyash Tandon
  • 13:50 Break
  • 14:00 C++ Standard Parallelism – Alessandro Fanfarillo
  • 14:30 C++ Std Par Exercises – Alessandro Fanfarillo
  • 14:50 Break
  • 15:00 GPU-Aware MPI on AMD GPUs – Mahdieh Ghazimirsaeed and Joanna Morgan
  • 15:30 MPI Exercises – Mahdieh Ghazimirsaeed and Joanna Morgan
  • 15:50 Break
  • 16:00 ML/AI on AMD GPUs – Samuel Antao
  • 16:30 ML/AI Exercises – Samuel Antao
  • 16:55 Wrapup
Day 4 (Thu) - AMD Debuggers and Profiling Tools

12:45 - 13:00 Drop in to Zoom

  • 13:00 Debugging with Rocgdb – Paul Bauer
  • 13:40 Rocgdb Exercises – Paul Bauer
  • 14:00 Break
  • 14:15 GPU Profiling - Performance Timelines – Gina Sitaraman
  • 14:55 Timeline Profiling Exercises – Gina Sitaraman
  • 15:15 Break
  • 15:30 Kernel Profiling with Omniperf – Ian Bogle
  • 16:15 Kernel Profiling Exercises – Ian Bogle
  • 16:45 Additional Training Resources – Bob Robey
  • 16:55 Wrapup

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full; places may become available.

Fees

This course is free of charge.

Contact

Khatuna Kakhiani phone 0711 685 65796, kakhiani(at)hlrs.de
Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

Further courses

See the training overview and the Supercomputing Academy pages.
