High-Performance Computing Center Stuttgart

Supercomputing Academy: Deployable Data Analysis & AI Pipelines with HPC

This course is designed for practitioners in SMEs, startups, and applied research teams who need to turn data into actionable insights and repeatable, deployable ML workflows using high-performance computing (HPC) resources. Participants learn practical patterns to build reproducible and scalable analytics and ML pipelines that remain reliable under real-world constraints, including messy data, growing volumes, performance bottlenecks, and execution across heterogeneous compute environments.

For teams preparing to use HPC resources, the course provides a clear path from notebook prototypes to reproducible, job-based workflows. Participants develop solutions in Jupyter Notebooks for rapid iteration and then operationalize them as script-based jobs suitable for production-style execution on high-performance systems. The hands-on curriculum covers scalable data processing with Apache Spark, performance-aware execution, and portable environments that help teams turn allocated compute time into measurable progress.

By the end of the course, participants will take away reusable assets they can apply immediately, including a project template (notebooks and job scripts), evaluation reports, scalable preprocessing pipelines, and runnable workflows that scale from laptop to HPC environments.

Location

Flexible online course: Combination of self-study and live seminars (HLRS Supercomputing Academy)
Organizer: HLRS, University of Stuttgart, Germany

Start date

Sep 07, 2026

End date

Oct 09, 2026

Language

English

Entry level

Basic

Course subject areas

Data in HPC / Deep Learning / Machine Learning

Supercomputing Academy

Topics

Big Data

Data Storage & Management

Deep Learning

Machine Learning

Python

Prerequisites and content levels

Prerequisites

  • Working knowledge of Linux/Unix: shell commands, SSH, file permissions, directory structure, and basic script editing (nano/vim/emacs).
  • Basic Python skills (or strong experience in another language such as C/C++/Fortran/Java and the ability to ramp up quickly). 

Recommended (not required):

  • Basic Git usage (clone/commit/branch).
  • Familiarity with typical data/ML libraries (numpy, pandas, matplotlib, seaborn) and Jupyter Notebooks.
  • Familiarity with virtual environments (venv/conda) or containers.

Content levels

  • Beginners: 20 hours
  • Intermediate: 20 hours
  • Advanced: 10 hours

Target audience

This course is intended for, but not limited to:

  • HPC users who want to run modern data analytics and ML workflows efficiently on shared compute infrastructure.
  • Engineers and researchers who need practical skills in data preparation, supervised ML, deep learning fundamentals, and scalable processing.
  • Industry data roles, including data analysts/scientists, data engineers, ML engineers, and platform/AI infrastructure engineers who want to move from prototypes to repeatable, scalable pipelines.
  • Professionals who want to stay up-to-date with Big Data, HPC, and modern AI trends.

Learning outcomes

After completing this course, participants will be able to:

  • Build end-to-end analytics and ML workflows from raw data to evaluated models.
  • Perform EDA, data profiling, preprocessing, and feature engineering with practical patterns that reduce downstream issues.
  • Implement and compare supervised learning approaches (classification/regression) and select metrics that reflect real objectives.
  • Avoid common real-world pitfalls (e.g., data leakage, improper splitting, misleading metrics) and document assumptions clearly.
  • Use Apache Spark to scale preprocessing and feature engineering workflows.
  • Train and run deep learning models using PyTorch, with practical GPU-aware considerations.
  • Understand responsible AI principles and key ethical/legal considerations relevant to real-world usage.
  • Gain exposure to emerging HPC/AI trends relevant to modern data and ML workloads.

Instructors

Dr.-Ing. Lorenzo Zanon (HLRS) lorenzo.zanon(at)hlrs.de and
Junghwa Lee (HLRS) junghwa.lee(at)hlrs.de

Agenda

  • Week 1: Introduction to Data Analysis and AI within HPC
  • Week 2: Machine Learning (ML) and Deep Learning (DL) — from fundamentals to first runnable workflows
  • Weeks 3–4: Practical exercises in data pre-processing, feature engineering, and machine learning applications
  • Week 5: Emerging HPC/AI trends — agentic AI, containers, and cost/performance scaling

Registration information

Register via the button at the top of this page.
We encourage you to register for the waiting list if the course is full, as places may become available.

Registration closes on August 23, 2026.

Fees

  • 0 Euro: Employees of HLRS, the Jülich Supercomputing Centre (JSC), or the Leibniz Supercomputing Centre (LRZ)
  • 65,00 Euro: Students without master’s degree or equivalent
  • 155,00 Euro: PhD students or employees at a German university or public research institute
  • 305,00 Euro: PhD students or employees at a university or public research institute in an EU, EU-associated or PRACE country other than Germany
  • 610,00 Euro: PhD students or employees at a university or public research institute outside of EU, EU-associated or PRACE countries
  • 1060,00 Euro: Participants from other public service providers, or government
  • 1690,00 Euro: Other participants, e.g. from industry

Link to the EU and EU-associated (Horizon Europe), and PRACE countries.

HLRS concept for flexible learning

Flexible learning

This course offers flexible learning, allowing you to learn at your own pace and access online course materials and cluster resources. Web-seminars are held weekly to discuss the learning modules and to answer your questions. We also provide forum channels that enable you to communicate with the lecturer and peers, as well as to share your experiences.

Learning duration

The course is divided into multiple learning units of 10 hours each. Participants can work through the learning content on their own schedule. In addition, this course has fixed dates for virtual seminars and the exam.

Course certificate & confirmation of participation

The High-Performance Computing Center (HLRS) issues participants a confirmation of participation if they have joined all seminars and handed in at least 50% of the assignments, as well as a course certificate if they have passed the exam at the end of the course.

Technical requirements

  • Stable Internet connection so you can access and download the learning materials.
  • Access to a video conferencing tool with camera and microphone for participation in regular seminars.

Contact

Jasper Seehofer, phone +49 711 68587229, training(at)hlrs.de

HLRS training collaborations in HPC and AI

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. EuroCC@GCS is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC. Since 2025, HLRS has coordinated HammerHAI.

This course is provided within the framework of the bwHPC training program.

Further courses and training team

See the training overview and the Supercomputing Academy pages.
See also information about the HLRS training department and staff.
