AMD Instinct™ GPU Training

This course will be provided ONLINE via Zoom.

This course takes a deep dive into the AMD Instinct™ GPU architecture and its ROCm™ ecosystem, including the tools for porting or migrating HPC and AI applications to AMD GPUs through HIP (Heterogeneous-computing Interface for Portability) and running them in a multi-node environment. Attendees will also receive an introduction to RocOps, the AMD software stack for AI, including an optimized execution engine for deep learning neural networks.

After this course, participants will

  • have gained knowledge about software enablement on AMD GPUs using HIP
  • be able to port simple code from CUDA to HIP
  • have gained knowledge about the optimized execution engine for deep learning neural networks.
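
To give a flavor of what porting simple code from CUDA to HIP involves, here is a minimal sketch in Python of the renaming step. It is not the course material and not the real HIPIFY tooling: the function toy_hipify and the small rename table are hypothetical and cover only a handful of runtime API names, whereas the actual tools handle far more cases. Kernel code (__global__, threadIdx, blockIdx) and the <<<...>>> launch syntax are left untouched, since HIP accepts them as written.

```python
import re

# Illustrative subset of CUDA-to-HIP renames (assumption: just enough for a
# simple "square" example; the real HIPIFY tools know many more).
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Try longer names first so cudaMemcpyHostToDevice wins over cudaMemcpy.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Replace the known CUDA names in `source` with their HIP equivalents."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], source)


# A small CUDA program held as text; only the host-side API calls change.
CUDA_SRC = """#include <cuda_runtime.h>

__global__ void square(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * x[i];
}

void run(float *h, int n) {
    float *d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
    square<<<(n + 255) / 256, 256>>>(d, n);
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);
}
"""

if __name__ == "__main__":
    print(toy_hipify(CUDA_SRC))
```

Running the script prints the same program with the header and the memory-management calls renamed to their hip* counterparts, while the kernel body and launch line stay as they were.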


Online course
Organizer: HLRS, University of Stuttgart, Germany

Start date

Sep 29, 2022

End date

Sep 30, 2022



Entry level



Code Optimization

Deep Learning

GPU Programming

Machine Learning


Prerequisites and content levels


Some knowledge of GPU programming (e.g., CUDA) and of AI/deep learning.

See also the suggested prereading below (resources and public videos).

Content levels

Basic: 1 hour
Intermediate: 4 hours
Advanced: 6 hours

Learn more about course curricula and content levels.

Public videos


Essam Morsi, Adil Lashab and Philipp Samfass (AMD)


All times are CEST.

Day 1

08:45 - 9:00 Drop in to Zoom

  • 9:00 - 10:30 Introduction to AMD GPUs
    • GCN/CDNA Overview
    • Memory Hierarchy
    • HIP GPU Compute Terminology
    • Compute Units
  • 10:45 - 12:45 HIP
    • Introduction to HIP & Core HIP API
    • Memory Management in HIP
    • Asynchronous Computing with HIP
    • Tips & Tricks
  • 12:45 - 13:45 Lunch Break
  • 13:45 - 15:15 ROCm
    • Introduction to ROCm
    • Multi-GPU RCCL/MPI with ROCm
    • Debugging
    • Profiling
  • 15:30 - 16:15 AI
    • RocOps introduction (AMD AI SW Stack)
    • Training
      • Single-GPU training with TensorFlow (TF) and PyTorch (PY) for all models
      • Multi-GPU (mGPU) training with TF and PY
      • Distributed multi-node (mNode) training with TF and PY
    • Inference
      • MLPerf MIGraphX & TVM Backend

Day 2 (Hands-on)

08:45 - 9:00 Drop in to Zoom

  • 9:00 - 9:45 ROCm setup
  • 10:00 - 12:00 Hipification: CUDA To HIP
    • Square
    • RTM TTI
  • 12:00 - 13:00 Lunch Break
  • 13:00 - 14:30 HIP
    • Profiling HW counters
    • Example with bank conflicts on shared memory: how to profile and resolve them
    • RocProfiler, RocTracer, ISA code
  • 14:45 - 16:00 AI
    • Distributed Training Demonstration
  • 16:00 - 16:15 Q&A

Registration information

This course is already fully booked.


This course is free of charge.


Khatuna Kakhiani phone 0711 685 65796, kakhiani(at)
Lorenzo Zanon phone 0711 685 63824, zanon(at)


HLRS is part of the Gauss Centre for Supercomputing (GCS), which is one of the six PRACE Advanced Training Centres (PATCs) that started in Feb. 2012.

HLRS is also a member of the Baden-Württemberg initiative bwHPC.

This course is also provided within the framework of the bwHPC training program. This course is not part of the PATC curriculum and is not sponsored by the PATC program.

Further courses

See the training overview and the Supercomputing Academy pages.
