--- Performance analysis and GPU programming (CUDA, OpenACC, OpenMP, Kokkos) ---
Personal exchange is very important to us and essential for workshops like this. We therefore kindly ask for your understanding that we offer neither streaming of the event nor online participation.
Organized bus shuttle from Jülich to JSC
Welcome, agenda, housekeeping
With JUPITER, Europe's first exascale system is right on the doorstep. The system features two modules, a CPU-centric JUPITER Cluster and a highly scalable JUPITER Booster, using nearly 24 000 GPUs for 1 EFLOP/s of sustained HPL performance. The talk will introduce the JUPITER system design, the current status, and the key defining features of the GPU technology selected to enable this computational milestone in Europe.
To effectively harness the computing capabilities of today's and future
supercomputing systems, performance analysis and optimization should
be a regular activity during scientific software development. Instead
of using do-it-yourself solutions usually based on coarse-grained
timers (e.g., time per timestep or solver iteration), developers of
scientific code bases can resort to a variety of specialized tools
that have been specifically developed to assist them with this task.
In this part of the workshop, we will introduce the open-source tools
Score-P and Cube, and explore their usage and capabilities with a
number of hands-on exercises.
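For orientation, the kind of do-it-yourself, coarse-grained timer mentioned above might look like the following minimal C++ sketch. The names `timestep` and `time_per_step` and the smoothing loop are hypothetical placeholders, not taken from any workshop material or application.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one simulation timestep: an in-place smoothing pass.
void timestep(std::vector<double>& field) {
    for (std::size_t i = 1; i + 1 < field.size(); ++i) {
        field[i] = 0.5 * field[i] + 0.25 * (field[i - 1] + field[i + 1]);
    }
}

// Run `steps` timesteps and return the wall-clock seconds spent in each one.
std::vector<double> time_per_step(std::vector<double>& field, int steps) {
    assert(steps >= 0);
    using clock = std::chrono::steady_clock;
    std::vector<double> seconds;
    seconds.reserve(static_cast<std::size_t>(steps));
    for (int s = 0; s < steps; ++s) {
        const auto start = clock::now();
        timestep(field);
        const auto stop = clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

A timer like this reports how long each step took, but says nothing about where inside the step the time was spent.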
In the Seekasino canteen, you can choose your own meal from the menu using your meal vouchers.
In this first session, we will introduce the key concepts of parallel
performance analysis using Score-P. After introducing some basic
terminology and the tools ecosystem centered around the Score-P
instrumentation and measurement system, the general workflow of
collecting application profiles with Score-P and examining them in the
Cube graphical user interface will be explained via hands-on exercises
with a small benchmark code.
The goal of this second session is to deepen the knowledge gained
during the first session using a series of hands-on exercises with a
production ESM application. These cover basic performance analysis
using collected profiles, as well as a cross-experiment scalability
analysis. If time permits, the challenges in examining coupled MPMD
simulations will also be addressed.
To wrap up, we will present an experience report on applying Score-P
to the ICON weather and climate model. We will summarize the steps
taken, the challenges we encountered, and how they have been
addressed.
Organized bus shuttle from Jülich to JSC
JUPITER will utilize nearly 24 000 NVIDIA GPUs to enter the Exascale Era. While CUDA is the native programming model for NVIDIA GPUs, there are alternatives which can offer higher productivity or more portability, like OpenACC, OpenMP, or Kokkos. This tutorial will present the relevant programming models and offer exercises to showcase the respective strengths.
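As a rough illustration of the directive-based style that OpenMP offers, here is a minimal C++ sketch of a SAXPY loop with OpenMP target offload. The function name `saxpy` and the problem itself are illustrative assumptions and are not taken from the tutorial exercises.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// SAXPY (y = a*x + y) written with an OpenMP target offload directive.
// With an offload-capable compiler the loop may run on a GPU; if OpenMP is
// not enabled, the pragma is ignored and the loop runs serially on the host.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const float* xp = x.data();
    float* yp = y.data();
    // Copy x to the device, copy y to the device and back again.
    #pragma omp target teams distribute parallel for map(to: xp[0:n]) map(tofrom: yp[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

The same source compiles for CPU-only builds, which is one reason directive-based models are often described as more portable than writing kernels in CUDA directly.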
In the Seekasino canteen, you can choose your own meal from the menu using your meal vouchers.