Optimization of Scaling, I/O and Node-level Performance on Hawk

Hybrid course

To increase the efficiency of our users' applications on Hawk, HLRS, HPE and AMD offer this workshop to improve the node-level performance, I/O and scaling of participants' codes. This allows users to increase both the volume and the quality of their scientific findings while the cost (in core hours) remains constant.

In this workshop, you can tune compiler flags, environment settings, etc. to make your code run faster. In our experience from prior workshops, this "low-hanging fruit" can yield a significant speedup with little effort.

Furthermore, you will analyze the runtime behavior of your code, locate bottlenecks, design and discuss potential solutions as well as implement them. All categories of bottlenecks (CPU, memory subsystem, communication and I/O) will be addressed, according to the respective requirements. Ideally, the above steps will be repeated multiple times in order to address several bottlenecks.

In addition, there will be an introduction to using Ray on Hawk-AI nodes to scale Python and AI tasks. A real-world example that combines these with HPC simulations, as well as a use case featuring distributed data preparation, training, and hyperparameter optimization, will give you an idea of what you could do within your project.

Every attending group will be supported during the entire workshop by a dedicated member of the HLRS user support group. In addition, HPE and AMD specialists will be available to help with issues specific to e.g. MPI and the processor.

To make it easy for you to attend, we provide this workshop in a hybrid format. Besides meeting in person at HLRS, we will also set up breakout rooms in a Zoom session. These enable remote participants to communicate with support staff, share screens, and let staff remote-control applications, providing the same options for interaction as meeting in person.

Learning outcomes

Basic knowledge of important tools for improving the node-level performance, I/O and scaling of participants' codes on Hawk, as well as practical experience in applying them.

Target audience: Groups holding a compute time budget to be used on Hawk.

Location

This hybrid event will take place online and at
HLRS, University of Stuttgart
Nobelstraße 19
70569 Stuttgart, Germany
Room 0.439 / Rühle Saal
Location and nearby accommodations

Start date

Nov 06, 2023
09:00

End date

Nov 10, 2023
17:00

Language

German

Entry level

Advanced

Course subject areas

Performance Optimization & Debugging

Topics

Code Optimization

Machine Learning

MPI

OpenMP

Prerequisites and content levels

In order to attend the workshop, you should already have an account on Hawk and your application should already be running on the system. Furthermore, we require that you bring your own code including a test case which is set up according to the following rules:

use case selection:
- When processing the test case, your code should show a behavior and profile close to that of current and possibly future production runs.
- If possible, the test case should be representative of those production runs of your group that consume the largest part of your compute time budget.
number of cores:
- In order to be representative, the test case should be comparable in size to the respective current and possibly future production runs.
- In order to save valuable resources and to allow for a productive workflow, it should however be as small as possible.
- So consider reducing the size of your test case as long as the profile stays almost constant. This can often be achieved by shrinking the simulated domain while keeping the computational load per core constant ("weak down scaling").
wall time:
- In order to allow for a productive workflow, the wall time should be a few minutes only.
- At the same time, it should cover all important parts of the code, i.e. computation, communication and I/O.
- So consider reducing the number of simulated time steps.
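
The sizing rules above can be sketched as a small calculation. This is only an illustrative sketch, not an HLRS tool: the names `Case` and `weak_down_scale` are hypothetical, the numbers are made up, and it assumes that the time per step stays roughly constant when the load per core is kept constant (ideal weak scaling) and that startup and I/O time are negligible. Real codes should be checked by profiling the reduced case.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical description of a run (all fields are illustrative)."""
    cells: int          # size of the simulated domain
    cores: int          # number of cores used
    time_steps: int     # number of simulated time steps
    wall_time_s: float  # total wall time in seconds


def weak_down_scale(prod: Case, test_cores: int, target_wall_time_s: float) -> Case:
    """Derive a smaller test case from a production run.

    The domain shrinks with the core count so that the load per core stays
    constant ("weak down scaling"); the number of time steps is then reduced
    until the estimated wall time fits the target.
    """
    if not 0 < test_cores <= prod.cores:
        raise ValueError("test_cores must be between 1 and the production core count")

    # Keep the computational load per core constant.
    cells_per_core = prod.cells / prod.cores
    test_cells = round(cells_per_core * test_cores)

    # Assumption: time per step is unchanged under ideal weak scaling.
    seconds_per_step = prod.wall_time_s / prod.time_steps
    test_steps = max(1, int(target_wall_time_s // seconds_per_step))

    return Case(test_cells, test_cores, test_steps, test_steps * seconds_per_step)


# Made-up production run: 512 million cells on 8192 cores, 10 hours wall time.
production = Case(cells=512_000_000, cores=8192, time_steps=100_000, wall_time_s=36_000.0)

# Reduced test case on 256 cores that should finish in about five minutes.
test_case = weak_down_scale(production, test_cores=256, target_wall_time_s=300.0)
print(test_case)
```

Whether the reduced case is still representative has to be verified by comparing its profile with that of the production run, since communication and I/O rarely scale ideally.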

Content levels

Intermediate level: 1 hour 30 minutes
Community-target and domain-specific level: 31 hours

Learn more about course curricula and content levels.

The language of instruction is generally German, but it can be changed to English if required.

Instructors

HLRS, HPE and AMD user support staff

Agenda (subject to update)

Local times: Central European Time Zone (Berlin).
Communication format: Face-to-face, via Slack, Zoom and Email.

The on-site/online workshop starts on Monday with:

8:30 Local registration (verification, seating)/drop in to Zoom
9:00 Start of the workshop

The workshop ends on Friday at 17:00 at the latest.

Daily Mo-Fr agenda:

9:00 - 17:30 Workshop ON-SITE/ONLINE
Talk on November 8 at 16:00 CET:
“Ray on Hawk-AI nodes: how to scale Python and AI tasks”, Dr. Kerem Kayabay (HLRS)
Every attending group will be supported during the entire workshop by a dedicated member of the HLRS user support group. In addition, HPE and AMD specialists will be available to help with issues specific to e.g. MPI and the processor. For more details, please refer to the workshop overview above.
We will update this information if opening times etc. change and inform you more precisely before the course starts.

Handout

Handouts will be available to participants as PDFs.

HLRS concept for on-site courses

Besides the content of the training itself, another important aspect of this event is the scientific exchange among the participants. We try to facilitate such communication by

offering common coffee and lunch breaks and
working together in groups of two during the exercises. Laptops will be provided.

Fees

Members of German universities and public research institutes: none
Members of universities and public research institutes within EU or PRACE member countries: none
Members of other universities and public research institutes: 120 EUR
Others: 400 EUR
Our course fee includes coffee breaks (on-site courses only)

Contact

Andreas Ruopp phone 0711 685 87259, andreas.ruopp@hlrs.de
Björn Dick phone 0711 685 87189, dick(at)hlrs.de
Khatuna Kakhiani phone 0711 685 65796, khatuna.kakhiani(at)hlrs.de
Tobias Haas phone 0711 685 87223, tobias.haas(at)hlrs.de

HLRS Training Collaborations in HPC

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.

Official course URL(s)

http://www.hlrs.de/training/2023/HPE2

Further courses

See the training overview and the Supercomputing Academy pages.
