Introduction to oneAPI, SYCL2020 and OpenMP offloading

Most current HPC systems are heterogeneous and use accelerators. oneAPI is a standardized and portable programming model designed for heterogeneous computing. This course provides an introduction to Intel's oneAPI implementation, which supports two portable methods of heterogeneous computing: Data Parallel C++ (DPC++) with SYCL, and OpenMP for C, C++, and Fortran. Both are portable across Intel CPUs and Intel-based accelerators, and also run on other GPUs. The course introduces these two programming methods, Intel libraries such as oneMKL, and tools for performance analysis, profiling, and debugging. An introduction to Intel's DPC++ Compatibility Tool, which facilitates code migration from CUDA to SYCL, and to the GPU-aware support in Intel's MPI implementation completes the program.

Location

Online course
Organizer: HLRS, University of Stuttgart, Germany

Start date

Sep 23, 2024
08:45

End date

Sep 25, 2024
12:40

Language

English

Entry level

Basic

Course subject areas

Parallel Programming

Hardware Accelerators

Topics

Code Optimization

GPU Programming

MPI

OpenMP

Prerequisites and content levels

Good knowledge of any of C/C++/Fortran and familiarity with standard OpenMP programming are sufficient for the OpenMP part. For Data Parallel C++/SYCL, knowledge of C++11 or later is recommended (C++17 greatly facilitates SYCL 2020 programming).

Content levels
  • Basic: 6:35 hours
  • Intermediate: 4:25 hours

Learn more about course curricula and content levels.

Instructors

Intel staff.

Learning outcomes

After this course, participants will:

  • be familiar with the oneAPI programming model,
  • have an overview of DPC++/SYCL programming,
  • have gained knowledge of fundamental OpenMP offloading,
  • have an overview of oneAPI libraries (oneMKL, ...),
  • have basic knowledge of profiling and performance analysis,
  • have basic knowledge of (dynamic) debugging of programs that use the oneAPI programming model,
  • be aware of how Intel's open-source compatibility tool can help migrate CUDA code to SYCL.

Agenda

The preliminary agenda is as follows. All times are CEST.

Day 1
Start End
08:45 09:00 Drop in to Zoom
09:00 09:10 Welcome and Introduction to Day 1
09:10 09:20 oneAPI – Introduction to a mixed Architecture Development Environment
- Motivation and oneAPI Standardization
- Intel’s oneAPI Toolkits Portfolio and Components
- Intel oneAPI plug-ins for Nvidia and AMD hardware (CPU and GPUs)
09:20 10:10 Direct programming with oneAPI Compilers (Part 1) – with Demos
- Intro to heterogeneous programming model with SYCL 2020
- SYCL features and examples
   o  “Hello World” Example
   o  Device Selection
   o  Execution Model
10:10 10:25 Break
10:25 11:15 Direct programming with oneAPI Compilers (Part 2) – with Demos
   o  Compilation and Execution Flow
   o  Memory Model; Buffers, Unified Shared Memory (USM)
   o  Performance optimizations with SYCL features
11:15 11:30 Break
11:30 12:30 oneAPI Case Study – GROMACS
12:30 12:40 Introduction to the Intel Tiber Developer Cloud
12:40 13:00 Instructions on lab exercises (direct programming with SYCL using Intel oneAPI compilers)
13:00 14:00 Lunch break
14:00 14:50 Programming for AI workloads with SW dev tools powered by oneAPI - Part 1
  • AI Frameworks (IDP, PyTorch, TensorFlow)
  • oneDNN
  • oneCCL
14:50 15:00 Break
15:00 16:00 Programming for AI workloads with SW dev tools powered by oneAPI - Part 2
  • SYCLomatic Code Migration Tool (30 minutes)
  • How to Compile SYCL Kernels Using PyTorch
  • Migrating AI Workload from CUDA to SYCL
  • Demo: llama.cpp code migration

Day 2

Start End
08:45 09:00 Drop in to Zoom
09:00 09:05 Welcome and Introduction to Day 2
09:05 09:55 Intel OpenMP for Offloading for Fortran – with Demos
- Parallelizing heterogeneous applications with OpenMP 5.2
09:55 10:00 Break
10:00 10:35 Intel oneAPI libraries (oneMKL) for HPC - with demos
- Performance optimized libraries for numerical simulations and other purposes
10:35 11:15 Target NVIDIA and AMD with oneAPI and SYCL
- Using SYCL based NVIDIA and AMD plugins with Demos
11:15 11:30 Break
11:30 12:00 Intel Debugging Tools for heterogeneous programming (CPU, GPU) - with demos
12:00 12:30 Programming for Distributed HPC Systems using Intel MPI
12:30 13:30 Lunch break
13:30 14:45 Self-paced hands-on with Intel technical consultancy and support via Slack
14:45 15:15 Break
15:15 16:00 Self-paced hands-on with Intel technical consultancy and support via Slack

Day 3

Start End
08:45 09:00 Drop in to Zoom
09:00 09:05 Welcome and Introduction to Day 3
09:05 10:05 Application profiling for CPU and/or mixed hardware with the Intel® VTune™ Profiler with Demos
10:05 10:15 Break
10:15 11:25 Application profiling for CPU and/or mixed hardware with the Intel® VTune™ Profiler with Demos
11:25 11:35 Break
11:35 12:05 Intel® VTune™ Profiler Examples from Cookbook
12:05 12:10 Break
12:10 12:40 Application profiling for CPU and mixed hardware with the Intel® Advisor™ with Demos
  • Intel® Advisor™ Roofline for application performance analysis
12:40 12:50 Instructions on lab exercises
12:50 13:00 Questions and Answers - Wrap up
13:00 14:00 Lunch break
14:00 15:00 Self-paced hands-on with TCE support via Slack
15:00 15:10 Break
15:10 16:00 Self-paced hands-on with TCE support via Slack

Exercises

The morning lectures include demonstrations only. We will also show how to access Intel's DevCloud, where participants can explore and work on the examples themselves in the afternoon. Additionally, Intel will offer support to a limited number of participants.

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full. Places might become available.

Please be aware that the talks and Q&A sessions will be recorded. By registering, you declare that you are aware of and consent to the recording.

Registration closes on September 8, 2024 (extended registration phase).

Late registrations after that date are still possible, subject to course capacity.

Fees

This course is free of charge.

Contact

Tobias Haas, phone 0711 685 87223, training(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

Further courses

See the training overview and the Supercomputing Academy pages.
