AMD Instinct™ GPU Training

All communication will be done through Zoom, Slack and email.

This course offers a deep dive into the AMD Instinct™ GPU architecture and its ROCm™ ecosystem, including the tools for developing or porting HPC or AI applications to AMD GPUs. Participants will be introduced to the programming models for the MI200-series GPUs and the MI300A APU. The new unified memory programming model makes writing HPC applications substantially easier across a wide range of GPU programming models. We will cover pragma-based languages such as OpenMP, the core GPU programming language HIP, and performance-portable frameworks such as Kokkos and RAJA. In addition, there will be presentations on other important topics such as GPU-aware MPI and affinity. The AMD tool suite, including the debugger rocgdb and the profiling tools rocprof, omnitrace, and omniperf, will also be covered. A short introduction will be given to the AMD machine learning software stack, including PyTorch and TensorFlow, and how they have been used in HPC.

After this course, participants will

  • have learned about the many GPU programming languages available for AMD GPUs
  • understand how to achieve performance and scaling
  • have gained knowledge of the AMD programming tools
  • have received an introduction to the AMD machine learning software
  • know the basics of profiling and debugging on AMD GPUs.

Venue

Online course
Organizer: HLRS, University of Stuttgart, Germany

Course start

22 Apr 2024
13:00

Course end

25 Apr 2024
17:00

Language

English

Entry level

Intermediate

Topic areas

Data in HPC / Deep Learning / Machine Learning

Parallel Programming

Performance Optimization & Debugging

Topics

Accelerators

Code Optimization

GPU Programming

Machine Learning

MPI+OpenMP

OpenMP


Prerequisites and content levels

Prerequisites

Some knowledge of GPU and/or HPC programming is required. Participants should have an application developer's general knowledge of computer hardware, operating systems, and at least one HPC programming language.

See also the suggested prereading below (resources and public videos).

Content levels

Basic: 1 hour
Intermediate: 7 hours
Advanced: 6 hours

Learn more about course curricula and content levels.

Resources
  • Book on HIP programming and porting from CUDA
    • Accelerated Computing with HIP, Yifan Sun, Trinayan Baruah, David R. Kaeli,
      ISBN-13: 979-8218107444
  • Book on OpenMP GPU programming
    • Programming Your GPU with OpenMP, Tom Deakin and Tim Mattson,
      ISBN-13: 978-0262547536
  • Book on parallel and high-performance computing topics
    • Parallel and High Performance Computing, Robert Robey and Yuliana Zamora, Manning Publications,
      ISBN-13: 978-1617296468
  • ENCCS resources
  • AMD Lab Notes series on GPUOpen.com

    • Finite difference method - Laplacian part 1
    • Finite difference method - Laplacian part 2
    • Finite difference method - Laplacian part 3
    • Finite difference method - Laplacian part 4
    • AMD matrix cores
    • Introduction to profiling tools for AMD hardware
    • AMD ROCm™ installation
    • AMD Instinct™ MI200 GPU memory space overview 
    • Register pressure in AMD CDNA2™ GPUs
    • GPU-Aware MPI with ROCm
    • Creating a PyTorch/TensorFlow Code Environment on AMD GPUs
    • Jacobi Solver with HIP and OpenMP offloading
    • Sparse matrix vector multiplication - part 1
  • Quick start guides at Oak Ridge National Laboratory

Instructors

Bob Robey, AMD Global Training Lead for Data Center GPUs

Additional AMD Staff Presenters:
Gina Sitaraman
Shelby Lockhart
Samuel Antao
Paul Bauer
Suyash Tandon
Alessandro Fanfarillo
Mahdieh Ghazimirsaeed
Joanna Morgan
Ian Bogle

Assisting AMD Staff
Cathal McCabe

Agenda (subject to change)

All times are CEST.
Day 1 (Mon) - AMD Programming Model, OpenMP and MPI

12:45 - 13:00 Drop in to Zoom

  • 13:00 HLRS Intro
  • 13:10 AMD Presentation Roadmap and Introduction to the System for Exercises – Bob Robey
  • 13:20 Programming Model for MI200 and MI300 series – Gina Sitaraman
  • 13:45 Programming Model Exercises – Gina Sitaraman
  • 14:00 Break
  • 14:10 Introduction to OpenMP® Offloading – Shelby Lockhart
  • 14:40 OpenMP® Exercises – Shelby Lockhart
  • 14:55 Break
  • 15:10 Real-World OpenMP® Language Constructs – Shelby Lockhart
  • 15:45 OpenMP® Language Constructs Exercises – Shelby Lockhart
  • 16:00 Advanced OpenMP® - zero-copy, debugging and optimization – Samuel Antao
  • 16:30 Advanced OpenMP® Exercises – Samuel Antao
  • 16:50 Wrapup – Bob Robey
Day 2 (Tue) - HIP, Porting, and OpenMP Interoperability

12:45 - 13:00 Drop in to Zoom

  • 13:00 HIP and ROCm – Bob Robey
  • 14:00 HIP Exercises – Bob Robey
  • 14:15 Break
  • 14:30 Porting code to HIP – Paul Bauer
  • 14:50 Porting Exercises – Paul Bauer
  • 15:00 Optimizing HIP Code – Gina Sitaraman
  • 15:40 HIP Optimization Exercises – Gina Sitaraman
  • 16:00 Break
  • 16:15 OpenMP® and HIP Interoperability – Bob Robey
  • 16:40 Interoperability Exercises – Bob Robey
  • 16:55 Wrapup – Bob Robey
Day 3 (Wed) - Performance-Portable Languages, C++ Std Par, MPI, and Machine Learning

12:45 - 13:00 Drop in to Zoom

  • 13:00 Performance Portability Frameworks; Intro to Kokkos – Suyash Tandon
  • 13:30 Kokkos Exercises – Suyash Tandon
  • 13:50 Break
  • 14:00 C++ Standard Parallelism – Alessandro Fanfarillo
  • 14:30 C++ Std Par Exercises – Alessandro Fanfarillo
  • 14:50 Break
  • 15:00 GPU-Aware MPI on AMD GPUs – Mahdieh Ghazimirsaeed and Joanna Morgan
  • 15:30 MPI Exercises – Mahdieh Ghazimirsaeed and Joanna Morgan
  • 15:50 Break
  • 16:00 ML/AI on AMD GPUs – Samuel Antao
  • 16:30 ML/AI Exercises – Samuel Antao
  • 16:55 Wrapup
Day 4 (Thu) - AMD Debuggers and Profiling Tools

12:45 - 13:00 Drop in to Zoom

  • 13:00 Debugging with Rocgdb – Paul Bauer
  • 13:40 Rocgdb Exercises – Paul Bauer
  • 14:00 Break
  • 14:15 GPU Profiling - Performance Timelines – Gina Sitaraman
  • 14:55 Timeline Profiling Exercises – Gina Sitaraman
  • 15:15 Break
  • 15:30 Kernel Profiling with Omniperf – Ian Bogle
  • 16:15 Kernel Profiling Exercises – Ian Bogle
  • 16:45 Additional Training Resources – Bob Robey
  • 16:55 Wrapup

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full; places may become available.

Fees

This course is free of charge.

Contact

Khatuna Kakhiani phone 0711 685 65796, kakhiani(at)hlrs.de
Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

Further courses

See the training overview and the Supercomputing Academy pages.
