Course Curricula and Content Levels

Before registering for a course, please review the Entry Level and Content Level information shown on the course information page. This will help you select a training activity that is appropriate for your knowledge level and professional needs.

Entry levels

The Entry Level indicates whether there are prerequisites for taking a course. It is shown in the right column near the top of every course listing.

Basic courses: There are no prerequisites, although some knowledge of computer programming may be helpful.
Intermediate and advanced courses: Some prerequisites are necessary for these courses. Please check for details in the course description.

For more details, see the Prerequisites section within the course description.

Learning levels

Each course description also indicates the number of hours taught at each learning level.

Beginners' content: There are no prerequisites.
Intermediate content: Basic knowledge of the topic is required for these parts of the course.
Advanced content: This content may be especially relevant for high-performance computing (HPC).
Community-targeted and domain-specific content: This content is designed to address the interests of specific HPC user communities.

Important: Many courses combine beginners', intermediate, advanced, and community-targeted parts.

Typically, in a 5-day course, the first day(s) start with beginners' content, followed by intermediate and advanced content. A main topic may therefore appear twice in the agenda: first at the beginners' level and later in the week at the intermediate or advanced level.

0-to-100: It often helps to start from zero and progressively learn all you need to use an HPC (high performance computing) system.

Partial courses: Depending on the specific agenda of such courses (see each course website), you may register for specific days only, for example, by choosing the beginners' + intermediate parts.

Content levels for subjects of instruction

The following topics give more detailed information about course content and learning levels.

MPI: Message Passing Interface

Section numbers in parentheses refer to the MPI Standard Version 3.1:

Beginners' content

MPI Overview
MPI Process Model (8.7-8)
Point-to-Point Communication (3.1-6, 8.6)
Non-Blocking Communication (3.7+10)
Collective Communication (5.1-11+13)
Error Handling (8.3-5)

Intermediate content

Groups and communicators, Environment Management, MPI_Comm_split, intra- & inter-communicators (6.1-5)
Virtual Topologies (7.1-5, 3.11)
One-sided Communication (11, 8.2)
Derived Datatypes (4.1.1-5+9)
Parallel File I/O (basics) (13)
Parallel File I/O (fileviews) (13.3)
Parallel File I/O (access methods) (13)
Best Practice
The new Fortran module mpi_f08 (17.1)
Collective Communication, advanced topics, Nonblocking Collectives (5.12), MPI_IN_PLACE (5.2.1)
Re-numbering on a cluster, Collective communication on inter-communicators (6.6), Info object (9), Attribute caching & naming (6.7-8), Implementation information (8.1)
Neighborhood Communication (7.6-7) and MPI_BOTTOM (4.1.5)
Shared Memory One-sided Communication (11.2.3)
Shared Memory synchronization rules (11.4-5)

Advanced content, Community-targeted and domain-specific content

Derived Datatypes and Resizing (4.1.7-12, 4.2-3)
MPI and Threads (12.4)
Probe, Cancel, Persistent Requests (3.8-9)
Process Creation and Management (10)
Other MPI Features (1-2, 12.1-3, 14-16, 2.6.1, 17.2, 8)
MPI Parameter Tuning

Shared memory parallelization with OpenMP

Beginners' content

Overview and execution model (OpenMP 3.1 features)
Work sharing directives (OpenMP 3.1 features)
Data environment (OpenMP 3.1 features)
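As a rough illustration of how the three beginners' topics fit together, the following C++ sketch computes a dot product with a single OpenMP directive. It is not taken from the course material; the function name `dot_product` is hypothetical. The `parallel for` construct creates a team of threads (execution model) and distributes the loop iterations among them (work sharing), while the `reduction` clause gives each thread a private copy of `sum` and combines the copies at the end (data environment).

```cpp
#include <cassert>
#include <vector>

// Dot product of two equally sized vectors (illustrative helper, not course code).
// If the compiler is not invoked with OpenMP support, the pragma is ignored
// and the loop runs serially.
double dot_product(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    const long n = static_cast<long>(x.size());
    double sum = 0.0;
    // Iterations are shared among threads; each thread accumulates into a
    // private copy of sum, and the copies are added together at the end.
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Calling `dot_product({1, 2, 3}, {4, 5, 6})` yields 32. With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag. For general floating-point data, the parallel result may differ slightly from the serial one because the summation order changes.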

Intermediate content

Heat Example (practical/homework)
Pitfalls

Community-targeted and domain-specific content

OpenMP 4.0 and 4.5 Extensions (OpenMP 4.0 and 4.5 features, without GPU support)

Iterative solvers and parallelization

Beginners' content

Parallelization of Explicit and Implicit Solvers
Numerical and parallel libraries
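To give a flavor of what an explicit solver looks like, here is a minimal C++ sketch of one explicit time step for the 1D heat equation. It is an assumption about a typical teaching example, not the course's actual code; the function name `explicit_heat_step` and the parameter `r` (the ratio alpha * dt / dx^2) are chosen for illustration. Each new value depends only on old values, so the interior updates are independent of one another and can be distributed across threads or processes, whereas an implicit scheme requires solving a linear system in every time step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One forward-Euler / central-difference step for u_t = alpha * u_xx
// with fixed boundary values. r = alpha * dt / dx^2; this scheme is
// stable for r <= 0.5.
std::vector<double> explicit_heat_step(const std::vector<double>& u, double r) {
    assert(u.size() >= 2);
    assert(r > 0.0 && r <= 0.5);
    std::vector<double> next(u);  // copy: boundary points keep their values
    // Interior points: each update reads only the old array u, so the
    // iterations are independent and could be executed in parallel.
    for (std::size_t i = 1; i + 1 < u.size(); ++i) {
        next[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1]);
    }
    return next;
}
```

For the initial state `{0, 0, 1, 0, 0}` and `r = 0.25`, one step gives `{0, 0.25, 0.5, 0.25, 0}`. In a distributed-memory version, each process would own a block of the array and exchange one boundary value with each neighbor before every step.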

Advanced content

Parallel Programming Models on Hybrid Systems / MPI + OpenMP

Community-targeted and domain-specific content

Particle-based domain decomposition
PETSc, An Introduction
Laplace-Example with MPI and PETSc: Introduction
Laplace-Example with MPI and PETSc: Writing a parallel MPI program with a CG solver
Laplace-Example with MPI and PETSc: Laplace-Example with PETSc
Iterative Linear Solvers
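The CG solver mentioned in the Laplace example can be sketched in a few lines. The version below is a serial, dense-matrix C++ sketch under simplifying assumptions: it uses neither MPI nor PETSc, and the names `Vec`, `matvec`, `inner`, and `conjugate_gradient` are hypothetical. It is meant only to show the structure of the iteration for a symmetric positive definite matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Dense row-major matrix-vector product y = A * x, with A of size n x n.
Vec matvec(const Vec& A, const Vec& x) {
    const std::size_t n = x.size();
    assert(A.size() == n * n);
    Vec y(n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            y[i] += A[i * n + j] * x[j];
    return y;
}

// Euclidean inner product of two equally sized vectors.
double inner(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Unpreconditioned conjugate gradient for A x = b (A symmetric positive
// definite). Starts from x = 0 and stops once the residual 2-norm is at
// most tol or after max_iter iterations.
Vec conjugate_gradient(const Vec& A, const Vec& b, double tol, int max_iter) {
    const std::size_t n = b.size();
    Vec x(n, 0.0);
    Vec r = b;  // residual b - A*x for the initial guess x = 0
    Vec p = r;  // first search direction
    double rr = inner(r, r);
    for (int k = 0; k < max_iter && std::sqrt(rr) > tol; ++k) {
        const Vec Ap = matvec(A, p);
        const double alpha = rr / inner(p, Ap);  // step length
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rr_new = inner(r, r);
        const double beta = rr_new / rr;  // makes the new direction A-conjugate
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}
```

For the 2 x 2 system with rows (4, 1) and (1, 3) and right-hand side (1, 2), the exact solution is (1/11, 7/11). In a parallel setting, the inner products become global reductions and the matrix-vector product requires communication of boundary data, which is where MPI or a library such as PETSc comes in.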

Visualization

Beginners' content

Introduction to concepts of visualization

Introduction to the use of graphic tools

Intermediate content

Extension of graphic tools with own programming steps

Coupling of simulations with real-time visualization

Visualization of parallel applications

Advanced content

Advanced themes

GPU programming

Beginners' content

Introduction to GPU programming, Introduction to directive-based programming using OpenACC or OpenMP GPU

Intermediate content

Asynchronous execution, Parallel execution of kernels, Introduction to Multi-GPU programming

Advanced content

Advanced and Modern Concepts (CUDA Cooperative Groups, CUDA Graphs), Advanced Multi-GPU programming using, e.g., MPI, NCCL, or NVSHMEM

Data analysis and statistics

Levels not yet decided.

C++ programming language

Beginners' content

Introduction

Intermediate content

Intermediate C++: Look for the courses "Modern C++ Software Design (Intermediate)"

Advanced content

Advanced C++: Look for the courses "Modern C++ Software Design (Advanced)"
C++ and HPC performance programming
C++14 shared memory programming

Fortran programming language

We thank Reinhold Bader at LRZ, a member of the Fortran standardization body, for organizing the topic "Fortran Programming Language", which can also be found here.

Numbered references in parentheses are to the Fortran 2018 standard (ISO/IEC 1539-1:2018); they are often entry points to further cross-references inside the standard. In addition, processor/platform dependencies and indications of best practices are pointed out where appropriate.

Beginners' content

Concept of program units (5.2.1-4, 6.3.1): program (14.1) and module (14.2, simplest usage).
Layout of tokens and program lines (6.2) in free source form (6.3.2)
Concept of execution sequence (5.3.1-5) for a single image, program termination via END PROGRAM or STOP (5.3.7)
Type system (7.1)
1. Intrinsic types (7.4) and type parameters (7.2) for intrinsic types
2. Simple derived types (7.5)
3. Implicit typing (8.7), specifically its avoidance (best practice)
Declaration and processing of data (scalar or array, see below) of intrinsic or simple derived type:
1. Declaration of objects (8.2) for non-dynamic entities
2. Intrinsic operations (5.5.6, 7.4.2)
3. Expressions (10.1) and their evaluation; conversions and recommendations on their use (best practice)
4. Assignment (10.2) for intrinsic and simple derived types
Concept of attribute (8.1, 8.5), specifically the DIMENSION (8.5.8) attribute (for arrays, see below) and PARAMETER (8.5.13) attribute (for constants).
Array concept (5.4.6):
1. Arrays (9.5) of intrinsic or simple derived type; rank, shape and bounds of an array (5.4.6)
2. Construction of array values (7.8), implied-do loops
3. Array sections, array element sequence (9.5.3) and performance impact of their use (best practice)
Execution control via block constructs (11.1), specifically BLOCK (11.1.4), DO (11.1.7), IF (11.1.8) and SELECT CASE (11.1.9)
Intrinsic procedures (16.9):
1. Overview of simple and commonly used (elemental) mathematical and string processing intrinsic functions
2. Conversion intrinsics
3. Inquiry intrinsics for array properties
Fortran environment:
1. The ISO_FORTRAN_ENV intrinsic module (16.10.2) with names for commonly used intrinsic type parameter values, for standard storage size values, and for preconnected I/O unit values.
2. Intrinsic procedures (16.9) for processing environment information (variable values and command line arguments)
Procedures (15):
1. Function vs. subroutine (15.2.1); concept of dummy argument (15.6.2) for static objects of intrinsic or simple derived type; function results and the RESULT clause (15.6.2.2)
2. Invoking a procedure (15.5.1); association between actual and dummy argument (15.5.2) by position or keyword; also includes rules for aliasing and side effects and recommendations (best practice)
3. Procedure interfaces (15.4): prefer explicit to implicit interfaces (15.4.2, best practice), use interface blocks (15.4.3.2) where necessary
4. Module procedures and internal procedures (15.2.2.2, best practice)
5. (Non-)recursive (15.6.2) and PURE (15.7) procedures
6. Passing arrays to procedures: assumed-size (8.5.8.5), explicit-shape (8.5.8.2) and assumed-shape (8.5.8.3) dummy arguments
7. The INTENT (8.5.10, best practice), OPTIONAL (8.5.12), and VALUE (8.5.18) attributes
I/O facilities (12, 13):
1. Concept of external file (12.3.1) and record (12.2.1)
2. Connecting a unit (12.5.3) with a file (12.5.4); the OPEN (12.5.6) and CLOSE (12.5.7) statements for sequential (12.3.3.2) formatted (12.2.2) files
3. Data transfer I/O lists with implied-do loops (12.6.3)
4. Use of preconnected units (12.5.5, 12.5.1)
5. READ and WRITE (or PRINT) statements (12.6) for list-directed (13.10) or format-controlled (13.1-4) I/O data transfers
6. The most commonly used data edit descriptors (13.7.1-4) and control edit descriptors (13.8); character string editing (13.9)
Interoperability with C (18):
1. Type parameter values (18.2.2) supplied by the intrinsic module ISO_C_BINDING (18.2)
2. Interoperation of scalar or arrays of intrinsic C type (18.3.1) or simple C struct type (18.3.3) through BIND(C) procedure calls (18.3.6, 18.10.2)
3. The C_SIZEOF procedure (18.2.3)

Intermediate content

Intrinsic procedures (16.9):
1. Classification according to inquiry and transformational procedures (16.1)
2. Treatment of array arguments, including semantics of MASK and DIM (16.2), in particular for reduction functions
3. Numerical models (16.4) and their associated intrinsics
Modules (14.2):
1. General syntax and semantics (14.2.1)
2. Use association (14.2.2) and accessibility (8.5.2); the PROTECTED (8.5.15) attribute
3. Limited namespace management via USE, ONLY and renaming of module entities (best practice)
4. Opaque (7.5.4.1) and private types (best practices)
Constant expressions (10.1.12), initializations (8.4) and specification expressions (10.1.11)
1. DATA statement (8.6.7) and its implied-do semantics
Host association (19.5.1.4) and scoping rules (19.1)
1. Controlling host access via the IMPORT statement (8.8)
Dynamic data:
1. The ALLOCATABLE (8.5.3), POINTER (8.5.14) and TARGET (8.5.17) attributes
2. Dynamic association (9.7) and lifecycle management of dynamic objects (semantics of allocation, deallocation and pointer association)
3. Auto-allocation on assignment (10.2.1.3), moving an allocation (16.9.137), checking allocation status (16.9.11) and association status (16.9.16)
4. Allocatable scalars and deferred-length strings
5. Pointer assignment (10.2.2), including rules for rank changing and bounds remapping.
6. Potential memory management issues for POINTER objects (implementation dependencies and best practices)
7. Automatic objects (8.3); discussion of heap vs. stack allocation (best practices)
Array processing (9.5):
1. Passing arrays to procedures: deferred-shape (8.5.8.4) and assumed-rank (8.5.8.7) dummy arguments
2. The WHERE (10.2.3) array assignment construct
3. The SELECT RANK (11.1.10) block construct
4. The CONTIGUOUS attribute (8.5.7) and its semantics; simply contiguous array designators (9.5.4). Discussion of performance tradeoffs (best practice)
Procedures:
1. Elemental procedures (15.8)
2. Factory procedures: subroutines with dynamic dummy arguments (15.5.2.5-7); functions with dynamic result variables (15.6.2.2)
3. Using procedure arguments (15.5.2.9) for a functional programming style
4. IMPURE procedures (15.6.2.1, 15.7)
5. Procedure pointers (15.2.2.4, 8.5.14)
Object-based programming:
1. Type components with the POINTER or ALLOCATABLE attributes (7.5.4.1)
2. Semantics of such type components for structure construction (7.5.10), assignment (10.2.1), and in procedure invocations (8.5.10)
3. Avoiding resource leaks through final procedures (7.5.6)
4. The ASSOCIATE block construct (11.1.3)
5. Derived types parameterized by length and kind components (7.2, 7.5.3); simple scenarios for use of assumed and deferred length type parameters. Using parameterized types for performance-driven design, based on memory layout (best practice).
Generic programming:
1. Named interfaces (15.4.3), distinguishability of specific procedures with the same generic identifier (15.4.3.4.5), and resolution of generic calls at compilation time (15.5.5.1)
2. Defined operations (10.1.6) and assignment (10.2.1.4)
I/O facilities (12, 13):
1. File inquiry (12.10)
2. File positioning (12.3.4) and non-advancing I/O (12.6.2.4); file positioning statements (12.8)
3. Concept of file storage unit (12.3.5)
4. Further file access modes: direct access (12.3.3.3), stream access (12.3.3.4)
5. Unformatted I/O (12.2.3); tradeoffs of performance vs. portability (best practice)
6. Internal files (12.4)
7. Handling groups of key-value pairs via namelist I/O (13.11)
8. I/O error handling (12.11)

Advanced content

Interoperability with C (18):
1. Interoperation of global data (18.9)
2. Interoperation with C pointer types (18.3.2); the C_LOC, C_ASSOCIATED, and C_F_POINTER module procedures (18.2.3)
3. Interoperation with C procedure pointers (18.3.2); the C_FUNLOC and C_F_PROCPOINTER module procedures (18.2.3)
IEEE Arithmetic and floating point exception handling (17)
1. The IEEE_EXCEPTIONS, IEEE_ARITHMETIC, and IEEE_FEATURES intrinsic modules (17.1) and the types defined therein (17.2)
2. IEEE conforming floating point representations and associated real kind numbers; the procedures IEEE_SELECTED_REAL_KIND (17.11.34) and IEEE_SUPPORT_DATATYPE (17.11.48)
3. IEEE exceptions (17.3) and associated flags; signaling vs. non-signaling exceptions
4. Rounding (17.4) and its control (17.11)
5. Underflow mode (17.5) and its control (17.11)
6. Halting (17.6) and its control (17.11)
7. Managing (17.11) the floating point status (17.7)
8. Exceptional floating point numbers (17.8) and their creation (17.11)
Object-oriented programming:
1. Type extension, abstract types and inheritance (7.5.7)
2. Data and interface polymorphism - declaration of variables with the CLASS specifier (7.3.2.1, 7.3.2.3); dynamic and declared type; type compatibility
3. Typed, sourced and molded allocation (9.7.1)
4. Type-bound procedures (7.5.5) and object-bound procedures (10.2.2.2); the PASS and NOPASS attributes (7.5.4.5)
5. Overriding type-bound procedures (7.5.7.3) and the NON_OVERRIDABLE attribute (7.5.5)
6. Run time type and class identification with the SELECT TYPE block construct (11.1.11); type inquiry intrinsics SAME_TYPE_AS (16.9.165) and EXTENDS_TYPE_OF (16.9.76).
7. Abstract interfaces (15.4.3.2) and deferred type-bound procedures (7.5.5)
8. Unlimited polymorphic objects (7.3.2.3)
9. Polymorphic assignment (10.2), overloading of the structure constructor (15.4.3.4) and polymorphic object construction
10. Generic type-bound procedures and operations (7.5.5)
11. Invocation of specific and generic type-bound procedures (15.5.6)
Dependency inversion with submodules (5.2.5):
1. Syntax and semantics of submodules (14.2.3)
2. Separate module procedures (15.6.2.5)
3. Avoidance of compilation cascades, implementation of dependency-inverted object-oriented patterns (use cases, best practices)
Parallel programming:
1. SPMD-style multi-image execution (5.3.4-5)
2. Coarray concept (5.4.7): The CODIMENSION attribute (8.5.6) and symmetric data decomposition
3. Coarray use: programming one-sided (put- or get-style) data exchange between images through coindexing (9.6)
4. Intrinsic procedures for parallel execution control and coarray inquiry: NUM_IMAGES (16.9.145), THIS_IMAGE (16.9.190), IMAGE_INDEX (16.9.97)
5. Synchronization: Concept of an execution segment (11.6.2) and the need for image control statements (11.6.1) to impose segment ordering; collective synchronization with SYNC ALL (11.6.3)
6. Collective intrinsic subroutines (16.6) for data redistribution and computation: CO_REDUCE (16.9.49), CO_MIN (16.9.48), CO_MAX (16.9.47), CO_SUM (16.9.50), CO_BROADCAST (16.9.46)
7. One-sided segment ordering through EVENT POST (11.6.7) and EVENT WAIT (11.6.8)
8. Mutual exclusion via CRITICAL blocks (11.1.6) or locks (11.6.10); synchronization of image subsets via SYNC IMAGES (11.6.4)
9. Unsynchronized coarray updates through atomic procedures (16.5, 16.9.20-30); support for user-defined segment ordering via SYNC MEMORY (11.6.5)
10. Coarray dummy arguments (15.5.2.8,13)
11. Dynamic coarrays: symmetric and unsymmetric allocation (9.7.1), the MOVE_ALLOC intrinsic subroutine (16.9.137) for coarrays
12. Features and limitations of coarray use in an object-based or object-oriented context (allocatable/polymorphic coarrays or coarray components)
13. Creation and execution of (disjoint) image teams via the FORM TEAM statement (11.6.9) and the CHANGE TEAM block construct (11.1.5), respectively
14. Cross-image access (9.6) to established coarrays (5.4.8) within teams and across team boundaries
15. Team-wide synchronization with the SYNC TEAM image control statement (11.6.6)
I/O facilities (12, 13):
1. Asynchronous data transfers (12.6.2.5) and the ASYNCHRONOUS attribute (8.5.4); assuring completion of asynchronous data transfers (12.7)
2. User-defined derived type I/O: Derived type edit descriptor (13.7.6) and its type-bound or generic procedures (12.6.4.8)

Community-targeted and domain-specific content

Interoperability topics (18) for advanced library interface development:
1. Binding non-interoperable Fortran interfaces to C through use of a C descriptor (18.3.6, 18.4)
2. Manipulation of a C descriptor (18.5) and rules for its use (18.6-8)
3. Assumed-type dummy argument (7.3.2) and its interoperation semantics
4. Using assumed-type and assumed-rank entities for effective suppression of procedure argument TKR checking
5. Extended use of the ASYNCHRONOUS attribute (18.10.4)
Modernization of legacy code:
1. Compiler support for flagging non-standard, standard-level, obsolescent, or removed features (tools)
2. Fixed source form (6.3.3, B.3.7) and conversion tools
3. Non-standard notations for intrinsic types and type promotion by the compiler
4. CHARACTER* declaration (7.4.4.2, B.3.8)
5. Legacy notation for operators (10.1.5)
6. Legacy execution control:
  1. Branching (11.2)
  2. arithmetic IF (deleted)
  3. computed GOTO (11.2.3)
  4. assigned GOTO and ASSIGN (deleted)
  5. non-block DO loop (deleted) and labeled DO loop (B.3.10)
  6. non-integer loop control variable (deleted)
7. Legacy type concepts: SEQUENCE types (7.5.2.3) and (non-standard) record types
8. Procedures:
  1. Implicit interfaces (15.4.2, 15.4.3.8) and external procedures
  2. Arguments declared without INTENT (8.5.10)
  3. Statement functions (15.6.4, B.3.4)
  4. Alternate return arguments (15.6.2.7, B.3.2)
  5. Assumed character length function result (B.3.6)
  6. ENTRY statement (B.3.9)
9. Specific names for intrinsic functions (B.3.12)
10. COMMON blocks and their initialization with BLOCK DATA (B.3.11)
11. Enforcing storage association with EQUIVALENCE (B.3.11); replacement by appropriate POINTER entities, ALLOCATABLE entities, or the TRANSFER intrinsic function (16.9.193)
12. Non-standard dynamic memory with Cray Pointers and its replacement by either C interoperability features or dynamic Fortran objects
13. I/O
  1. Hollerith edit descriptor (deleted)
  2. vertical format control (deleted)
  3. PAUSE statement (deleted)
14. Array assignments with FORALL (B.3.13)
Design of user-defined low-level data representations:
1. The TRANSFER intrinsic function (16.9.193)
2. Bit model and sequences (16.3) and intrinsic procedures for bit manipulation
3. BOZ literals (7.7) and their use
Resilience for large-scale parallel programs: This covers the concept of continuing program execution in the face of image failure. A processor is not obliged to support this feature.
1. Concept of failed image (5.3.6) and the STAT_FAILED_IMAGE constant
2. Termination model for Fortran (5.3.7) and forcing an image failure via FAIL IMAGE (11.5)
3. Explicit status query on image control statements (9.7.4, 11.6.11) and collective procedures (16.6) as a prerequisite for resilient applications
4. Semantic differences between stopped and failed images (best practices)
5. Effect of coindexed accesses involving failed images (9.6)
6. Use of teams to maintain program execution integrity (C.6.8)

High-Performance Computing Center Stuttgart Nobelstraße 19, 70569 Stuttgart, Germany

A member of the Gauss Centre for Supercomputing, HLRS is one of three German national centers for high-performance computing.

www.gauss-centre.eu

HLRS is a central unit of the University of Stuttgart.

www.uni-stuttgart.de/en/

© 2025 HLRS. All rights reserved.