Optimization of Scaling and Node-level Performance on Hazel Hen (Cray XC40)

Overview

 

To increase the efficiency of our users' applications on Hazel Hen, HLRS and Cray offer this workshop to improve both the node-level performance and the scaling of participants' codes. This allows users to increase the volume and quality of their scientific findings while the cost in core hours remains constant.

Target audience: Code owners of applications already running on Hazel Hen.

Program

Supported by HLRS and Cray specialists, you will analyze the runtime behavior of your code, locate bottlenecks, develop and discuss potential solutions, and implement them. All categories of bottlenecks (CPU, memory subsystem, communication, and I/O) will be addressed as required by the individual codes. Ideally, these steps will be repeated several times in order to address more than one bottleneck.

To achieve usable efficiency improvements, it is important to discuss the pros and cons of potential solutions. This requires both application and hardware expertise. The workshop therefore brings together application specialists (the users) and hardware specialists (the machine experts).

Besides working on the codes, there will be lectures and discussions on:

  • profiling tools
  • node-level performance tuning
  • parallel I/O (MPI-IO, NetCDF, HDF5)

Daily schedule

First day
9:00 -  9:30 Local registration
9:30 - 17:30 Course

2nd and 3rd day
9:00 - 17:30 Course

Last day
9:00 - 16:30 Course

Prerequisites

To attend the workshop, you should already have an account on Hazel Hen, and your application should already be running on the system. Furthermore, you must bring your own code, including a dataset. Before the workshop starts, you are required to profile your application once with CrayPAT-lite, using the following steps:

  1. Set up a test case. While doing so, please adhere to the following requirements:
    • number of cores:

      • To be representative, the test case should be comparable in size to your current and possible future production runs.
      • To save valuable resources, it should nevertheless be as small as possible.
      • Therefore, consider reducing the size of your test case as long as the profile obtained in step 4 stays almost constant (see the pat_report manuals on how to view profiles).

    • wall time:

      • To allow for a productive and effective workflow, the wall time should be only a few minutes.
      • At the same time, the run should cover all important parts of the code, including I/O.
      • Therefore, consider reducing the number of simulated time steps.
  2. Load the perftools modules:
    • $> module load perftools-base
    • $> module load perftools-lite
  3. Rebuild your application, e.g.:
    • $> make clean; make
  4. Run the job with the newly built executable from within a Workspace (cf. https://wickie.hlrs.de/platforms/index.php/Workspace_mechanism).
  5. As a result of step 4, a profiling directory <executable_name>+<some-id>s is created in the working directory.
  6. Please send us this directory by copying it to
    /lustre/cray/ws8/ws/hpcmscho-XC_WS_preparation/<your_application_your_name>
    (please also change the permissions).
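
The copy in step 6 could be sketched as follows. This is only a minimal sketch under stated assumptions, not an official HLRS procedure: the directory name `my_app+12345-67s`, the target name `my_app_jane_doe`, and the permission mode `a+rX` are hypothetical choices for illustration (the step above only asks you to copy the directory and change the permissions). So that the sketch runs anywhere, temporary directories stand in for your working directory and for the workshop target directory.

```shell
#!/bin/sh
# Sketch of step 6: copy the profiling directory and open read access.
# All names below are hypothetical placeholders; replace them with your own.

# Stand-ins so that the sketch runs on any machine. On Hazel Hen, WORK_DIR
# would be your working directory and TARGET_BASE the workshop directory
# named in step 6.
WORK_DIR=$(mktemp -d)
TARGET_BASE=$(mktemp -d)

# Hypothetical profiling directory, as created by the job in step 5.
PROFILE_DIR="$WORK_DIR/my_app+12345-67s"
mkdir -p "$PROFILE_DIR"
echo "dummy profile data" > "$PROFILE_DIR/data.txt"

# Hypothetical value for <your_application_your_name>.
DEST="$TARGET_BASE/my_app_jane_doe"

# Copy the whole profiling directory into the target.
mkdir -p "$DEST"
cp -r "$PROFILE_DIR" "$DEST/"

# Assumed permission change: read access for everyone, plus
# search (x) permission on directories only.
chmod -R a+rX "$DEST"

# Show the result so the permissions can be checked by eye.
ls -ld "$DEST" "$DEST"/*
```

The capital `X` in `a+rX` adds execute permission only to directories (and to files that are already executable), so data files do not become executable by accident.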

Please note that merely attending the introductory course (e.g. in Sep. 2017) does not provide you with all of these prerequisites!

Language

German (English, if required)

Teacher

Stefan Andersson and TBA (Cray)

Handouts

Handouts will be available to participants in both printed and digital form.

Registration

via online registration form

Deadline

for registration is Mar. 25, 2018

Fee

Members of German universities and public research institutes: none
Members of universities and public research institutes within EU or PRACE-member countries: none
Members of other universities and public research institutes: 120 EUR
Others: 400 EUR
(includes coffee breaks)

Organization

Travel Information and Accommodation

see our How to find us page.

PRACE PATC and bwHPC-C5

HLRS is part of the Gauss Centre for Supercomputing (GCS), which is one of the six PRACE Advanced Training Centres (PATCs) that started in Feb. 2012.
HLRS is also a member of the Baden-Württemberg initiative bwHPC-C5.
This course is provided within the framework of the bwHPC-C5 user support. It is not part of the PATC curriculum and is not sponsored by the PATC program.

Contact

Rolf Rabenseifner phone 0711 - 685 65530, rabenseifner@hlrs.de
Björn Dick phone 0711 - 685 87189, dick@hlrs.de
Lucienne Dettki phone 0711 - 685 63894, dettki@hlrs.de

Shortcut-URL & Course Number

http://www.hlrs.de/training/2018/XC40-1