High-Performance Computing Center Stuttgart

Supercomputing Academy: Deployable Data Analysis & AI Pipelines with HPC

This course is designed for practitioners in SMEs, startups, and applied research teams who need to turn data into actionable insights and repeatable, deployable ML workflows using high-performance computing (HPC) resources. Participants will learn practical patterns to build reproducible and scalable analytics and ML pipelines that remain reliable under real-world constraints, including messy data, growing volumes, performance bottlenecks, and execution across heterogeneous compute environments.

For teams preparing to use HPC resources, the course provides a clear path from notebook prototypes to reproducible, job-based workflows. Participants will develop solutions in Jupyter Notebooks for rapid iteration and then operationalize them as script-based jobs suitable for production-style execution on high-performance systems. The hands-on curriculum covers scalable data processing with Apache Spark, performance-aware execution, and portable environments that help teams turn allocated compute time into measurable progress.

By the end of the course, participants will take back reusable assets they can apply immediately, including a project template (notebooks and job scripts), evaluation reports, scalable preprocessing pipelines, and runnable workflows that scale from laptop to HPC environments.

Location

Flexible online course: Combination of self-study and live seminars (HLRS Supercomputing Academy)
Organizer: HLRS, University of Stuttgart, Germany

Start date

Sep 07, 2026

End date

Oct 09, 2026

Language

English

Entry level

Basic

Course subject areas

Data in HPC / Deep Learning / Machine Learning

Supercomputing Academy

Topics

Big Data

Data Storage & Management

Deep Learning

Machine Learning

Python


Prerequisites and content levels

Prerequisites

  • Working knowledge of Linux/Unix: shell commands, SSH, file permissions, directory structure, and basic script editing (nano/vim/emacs).
  • Basic Python skills (or strong experience in another language such as C/C++/Fortran/Java and the ability to ramp up quickly). 

Recommended (not required):

  • Basic Git usage (clone/commit/branch)
  • Familiarity with typical data/ML libraries (numpy, pandas, matplotlib, seaborn) and Jupyter Notebooks.
  • Familiarity with virtual environments (venv/conda) or containers

Content levels

  • Beginners: 20 hours
  • Intermediate: 20 hours
  • Advanced: 10 hours

Target audience

This course is intended for, but not limited to:

  • HPC users who want to run modern data analytics and ML workflows efficiently on shared compute infrastructure.
  • Engineers and researchers who need practical skills in data preparation, supervised ML, deep learning fundamentals, and scalable processing.
  • Industry data roles, including data analysts/scientists, data engineers, ML engineers, and platform/AI infrastructure engineers who want to move from prototypes to repeatable, scalable pipelines.
  • Professionals who want to stay up-to-date with Big Data, HPC, and modern AI trends.

Learning outcomes

After completing this course, participants will be able to:

  • Build end-to-end analytics and ML workflows from raw data to evaluated models.
  • Perform EDA, data profiling, preprocessing, and feature engineering with practical patterns that reduce downstream issues.
  • Implement and compare supervised learning approaches (classification/regression) and select metrics that reflect real objectives.
  • Avoid common real-world pitfalls (e.g., data leakage, improper splitting, misleading metrics) and document assumptions clearly.
  • Use Apache Spark to scale preprocessing and feature engineering workflows.
  • Train and run deep learning models using PyTorch, with practical GPU-aware considerations.
  • Understand responsible AI principles and key ethical/legal considerations relevant to real-world usage.
  • Gain exposure to emerging HPC/AI trends relevant to modern data and ML workloads.

Agenda

  • Week 1: Introduction to Data Analysis and AI within HPC
  • Week 2: Machine Learning (ML) and Deep Learning (DL) — from fundamentals to first runnable workflows
  • Weeks 3–4: Practical exercises in data pre-processing, feature engineering, and machine learning applications
  • Week 5: Emerging HPC/AI trends — agentic AI, containers, and cost/performance scaling
     
Dates for the Seminars and Exam (Preliminary schedule)
  • Seminars are scheduled on Mondays, 16:30-18:00: Sep. 7 (kick-off, Seminar 1), and Sep. 14, 21, 28, and Oct. 5.
  • Exam is scheduled for Friday, Oct. 16. You may start the approximately 2-hour exam anytime between 06:00 and 23:00. The official course dates reflect the course weeks only, not your exam preparation or the exam itself.
  • Although the schedule is preliminary, we strongly recommend that you reserve these dates when you register for this course. 

Registration information

Apply for this course via the button at the top of this page.

Registration closes on Sunday, August 23, 2026.

Late applications may still be accepted depending on course capacity. For late applications, we cannot guarantee that an ILIAS account will be available before the course begins.

Fees

  • 0 Euro: Employees of HLRS, the Jülich Supercomputing Centre (JSC), or the Leibniz Supercomputing Centre (LRZ)
  • 65,00 Euro: Students without a master’s degree or equivalent
  • 155,00 Euro: PhD students or employees at a German university or public research institute
  • 305,00 Euro: PhD students or employees at a university or public research institute in an EU, EU-associated or PRACE country other than Germany
  • 610,00 Euro: PhD students or employees at a university or public research institute outside the EU, EU-associated or PRACE countries
  • 1060,00 Euro: Participants from other public service providers, or government
  • 1690,00 Euro: Other participants, e.g. from industry

Link to the EU and EU-associated (Horizon Europe), and PRACE countries.

HLRS concept for flexible learning

Flexible learning

This course offers flexible learning, allowing you to learn at your own pace and access online course materials and cluster resources. Web-seminars are held weekly to discuss the learning modules and to answer your questions. We also provide forum channels that enable you to communicate with the lecturer and peers, as well as to share your experiences.

Learning duration

The course is divided into multiple learning units of 10 hours each. Participants can work through the learning content on their own schedule. In addition, this course has fixed dates for virtual seminars and the exam.

Course certificate & confirmation of participation

The High-Performance Computing Center Stuttgart (HLRS) issues a confirmation of participation to participants who complete the course survey and hand in at least 50% of the assignments. Participants who additionally pass the exam at the end of the course receive a course certificate.

Technical requirements
  • Stable Internet connection so you can access and download the learning materials.
  • Access to a video conferencing tool with camera and microphone for participation in the regular seminars.

Contact

Jasper Seehofer, phone +49 711 68587229, training(at)hlrs.de

Further courses and training team

See the training overview and the Supercomputing Academy pages.
See also information about the HLRS training department and staff.

HLRS Training Collaborations in HPC and AI

HLRS is part of the Gauss Centre for Supercomputing (GCS), together with JSC in Jülich and LRZ in Garching near Munich. SIDE is the German National Competence Centre (NCC) for High-Performance Computing. HLRS is also a member of the Baden-Württemberg initiative bwHPC.
Since 2025, HLRS has been coordinating one of the AI Factories of the EuroHPC JU: HammerHAI.

This event is offered as part of HammerHAI, Germany’s first AI Factory, which has a dedicated focus on industry, manufacturing, engineering, and research. HammerHAI provides AI resources and solutions, an upcoming AI-optimized supercomputer, and personalized expert support for AI users at all stages in the AI lifecycle. This project has received funding from the European High Performance Computing Joint Undertaking under grant agreement No. 101234027. This project is co-funded by the European Commission, the German Federal Ministry of Research, Technology and Space (BMFTR), the Baden-Württemberg Ministry of Science, Research and the Arts, the Bavarian State Ministry of Science and the Arts and the Lower Saxony Ministry of Science and Culture.
