Towards exascale-ready astrophysics

Europe/Berlin
Virtual Meeting

Description

Exascale computing is a transformative tool that enables researchers to model and analyse complex systems in greater detail, thereby enhancing scientific discovery. With a strong emphasis on numerical simulations in astrophysics, this workshop presents the latest developments in exascale technology and its applications in astrophysical research. By bringing together scientists, code developers, and high-performance computing (HPC) experts, it aims to discuss current challenges and future opportunities in utilising exascale computing for astrophysics and cosmology.

 

The scope of the workshop includes, but is not limited to, the following topics:

  • Computational astrophysics and cosmology in the exascale era: challenges and limitations
  • Exascale-ready astrophysics codes: case studies and implementations
  • Energy-efficient (green) computing for environmentally sustainable astrophysics research
  • FAIR astrophysics simulations and data
  • Big data astronomy and machine learning techniques
  • Deep learning for accelerating astrophysical simulations

 

The workshop will feature a mix of invited and contributed presentations, as well as interactive tutorials focusing on astrophysics simulation codes, high-performance computing, machine learning, and data analytics. Attendees will have opportunities for knowledge sharing, discussing innovative ideas, and fostering collaborations through technical talks, hands-on training sessions, and networking events. We strongly encourage female and early career researchers to participate, as their unique perspectives and contributions are invaluable to our discussions.

Keynote speakers (confirmed)

  • Jeroen Bédorf, minds.ai, USA
  • Geoffroy Lesur, Institute for Planetary Sciences and Astrophysics of Grenoble, France
  • Junichiro Makino, Preferred Networks, Japan
  • Jason McEwen, University College London, UK
  • Rüdiger Pakmor, Max Planck Institute for Astrophysics, Germany
  • Evan Schneider, University of Pittsburgh, USA
  • Volker Springel, Max Planck Institute for Astrophysics, Germany

Tutorials on

  • AI for astrophysicists
  • AthenaPK (astrophysical MHD code)
  • IDEFIX (astrophysical fluid dynamics)
  • JUPITER (exascale supercomputer)
  • PLUTO (astrophysical fluid dynamics)

 

Time and Location

 

The workshop will start on September 25th (9 am) and end on September 27th (12:30 pm), 2024. It will be held as a virtual meeting. There is no registration fee.

In case of questions, please contact the organisers via tera2024@fz-juelich.de.

Participants
  • Alessandro Ruzza
  • Alex Merrow
  • Andrea Mignone
  • Annika Hagemeier
  • Anuran Sarkar
  • Arghyadeep Basu
  • Arman Khalatyan
  • Aviv Padawer-Blatt
  • Brian O'Shea
  • Cairns Turnbull
  • Caterina Caravita
  • Chris Byrohl
  • Christian Boly
  • David Felipe Bambague Sichaca
  • David Smolinski
  • Dylan Nelson
  • Eduard Vorobyov
  • Elena Lacchin
  • Elena Sacchi
  • Eleni Antonopoulou
  • Eleonora Panini
  • Eva Sciacca
  • Evan Schneider
  • Filippo Barbani
  • Fournier Martin
  • Francesco Maria Flammini Dotti
  • Frank Wagner
  • Furkan Dincer
  • Harry Enke
  • Hitesh Kishore Das
  • Holger Stiele
  • Jason McEwen
  • Jayesh Badwaik
  • Jibin Joseph
  • Jolanta Zjupa
  • Jomana Ehab
  • Junichiro Makino
  • Lorenzo Maria Perrone
  • Manolis Marazakis
  • Marcel Trattner
  • Marco Rossazza
  • Marisa Zanotti
  • Mark Ivan Ugalino
  • Martin Obergaulinger
  • Martynas Laužikas
  • Massimo Gaspari
  • Matheus Da Costa
  • Mohamed Mohamed Helmy
  • Moorits Mihkel Muru
  • Mukadi Chisabi
  • Navonil Saha
  • Nesa Abedini
  • Oliver Zier
  • Pablo Daniel Contreras Guerra
  • Peenal Gupta
  • Philipp Grete
  • Prasanna Ponnusamy
  • Rainer Spurzem
  • Robert Wissing
  • Rüdiger Pakmor
  • Sacha Gavino
  • Saim Ali
  • Sebastian T. Gomez
  • Sergey Khoperskov
  • Shifang Li
  • Simon Portegies Zwart
  • Susanne Pfalzner
  • Sébastien Paine
  • Thomas Guillet
  • Tiago Batalha de Castro
  • Tim-Eric Rathjen
  • Tobias Buck
  • Valentina Cesare
  • Vignesh Vaikundaraman
  • Vincenzo Antonuccio-Delogu
  • Volker Springel
  • Wolfram Schmidt
  • Åke Nordlund
  • and 36 further participants

Timetable
    • 1
      Welcome
      Speaker: Susanne Pfalzner (FZJ)
    • Session 1: Computational astrophysics and cosmology in the exascale era: challenges and limitations
      Convener: Salvatore Cielo (Leibniz Supercomputing Centre)
      • 2
        The promise of next generation hydrodynamic cosmological simulations

        Cosmological hydrodynamical simulations are an indispensable and uniquely powerful tool for linking the fundamental parameters of cosmological theories with small-scale astrophysics, thereby allowing predictions of numerous observables far into the non-linear regime. In the future, we seek to build upon successful recent calculations such as IllustrisTNG by expanding the physical faithfulness of the numerical treatments of star formation and black hole growth, as well as their associated energetic feedback processes, and by enlarging the size and statistical power of the leading cosmological models, as required to take full advantage of upcoming new survey data. In my talk, I will review the methodologies we currently pursue to obtain future multi-physics, multi-scale simulations that realize more reliable and thus more predictive calculations. The road towards such a next generation of galaxy formation simulations is rich with technical challenges and scientific opportunities, especially for the arriving exascale supercomputers.

        Speaker: Volker Springel (Max Planck Institute for Astrophysics)
      • 3
        On the difficulties of calculating gravity

        The calculation of gravitational interactions scales with the square of the number of particles, which creates enormous computational demands for large particle numbers. To mitigate this issue, multiple algorithms, such as tree or grid approaches, have been used in the past to reduce the computational cost and to allow for larger simulations with larger numbers of particles. While these algorithms work well on CPUs, they are difficult to port to GPUs and usually result in significant overhead.

        In OpenGADGET3, an N-body/SPH code for massive cosmological simulations and part of the SPACE CoE, we are implementing a new approach that combines the strengths of a gravitational tree with the computational power of the GPU. (An illustrative direct-summation sketch follows this abstract.)

        Speaker: Geray Karademir (USM, LMU)
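        As a point of reference for the scaling argument above, here is a minimal, illustrative Python/NumPy sketch of direct pairwise force summation. It is not code from OpenGADGET3; the double loop over particles is exactly what produces the O(N^2) cost that tree and grid methods avoid, and the softening length eps is an assumed regularisation choice.

        ```python
        import numpy as np

        def direct_sum_acceleration(pos, mass, eps=1e-3):
            """Gravitational acceleration on every particle by direct O(N^2) summation.

            pos  : (N, 3) array of particle positions
            mass : (N,) array of particle masses
            eps  : Plummer softening length, avoids divergences at small separations
            """
            n = len(mass)
            acc = np.zeros_like(pos)
            for i in range(n):                      # loop over targets
                for j in range(n):                  # loop over sources -> O(N^2) pairs
                    if i == j:
                        continue
                    dr = pos[j] - pos[i]
                    r2 = np.dot(dr, dr) + eps**2
                    acc[i] += mass[j] * dr / r2**1.5    # G = 1 units
            return acc

        # Toy usage: even a few hundred particles make the quadratic cost visible.
        rng = np.random.default_rng(42)
        pos = rng.normal(size=(256, 3))
        mass = np.full(256, 1.0 / 256)
        a = direct_sum_acceleration(pos, mass)
        ```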
      • 4
        Black Hole Imaging: Radiative transfer in extreme gravity

        The GRAVITY instrument made a remarkable observation during the Near-Infrared flares of 2018, detecting a fast-moving hot spot in what seemed to be a circular orbit around SgrA*, the supermassive black hole in our Galactic Center. These profound observations have motivated the development of an advanced Python code for General-Relativistic Radiative Transfer calculations within the framework of Kerr spacetime. We provide a deeper understanding of the inner workings of ray tracing schemes in general and offer detailed insights into the challenges encountered during the development process. Moreover, we present rigorous tests to evaluate the accuracy of the code’s results and highlight the importance of implementing high performance computing techniques in general relativistic numerical simulations. This work investigates how General Relativity and the spin of a black hole shape photon geodesics and studies the effect of parameters such as the hot spot’s angular velocity and the observer’s inclination on the resulting trajectory. More specifically, we employ our radiative transfer algorithm to interpret the observed flaring events in the vicinity of our Galactic Center and seek out the optimal orbital parameters for modeling similar phenomena. In accordance with the latest state-of-the-art GRMHD simulations, our research scope encompasses physically motivated ejected hot spot models, such as helical and conical configurations, that represent the most suitable candidates for replicating the observed flares.

        Speaker: Eleni Antonopoulou (National & Kapodistrian University of Athens / Academy of Athens)
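        As a much-simplified illustration of the ray-tracing problem described in the abstract above (the talk treats the full Kerr case; this sketch assumes the non-rotating Schwarzschild limit), one can integrate the standard null-geodesic equation u'' + u = 3 M u^2, with u = 1/r, to follow how a photon trajectory is bent by the black hole. A hedged Python sketch, with the impact parameter b chosen arbitrarily:

        ```python
        import numpy as np
        from scipy.integrate import solve_ivp

        M = 1.0  # black hole mass in geometric units (G = c = 1)

        def null_geodesic(phi, y):
            """Schwarzschild photon orbit equation: u'' + u = 3 M u^2, u = 1/r."""
            u, du = y
            return [du, 3.0 * M * u**2 - u]

        # Photon starting far away (r = 1000 M), moving inwards with impact parameter b.
        b = 10.0 * M
        u0 = 1.0 / (1000.0 * M)
        du0 = np.sqrt(1.0 / b**2 - u0**2 + 2.0 * M * u0**3)  # from the null condition

        sol = solve_ivp(null_geodesic, (0.0, 2.0 * np.pi), [u0, du0], max_step=1e-2)

        u, phi = sol.y[0], sol.t
        mask = u > 0                       # keep the part of the ray at finite radius
        r, phi = 1.0 / u[mask], phi[mask]
        ```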
      • 5
        The role of planets in the mass segregation of stellar systems

        The dynamical evolution of planetary objects in stellar clusters is still largely uncharted territory, and observations of planets in such environments are extremely limited. In this talk, I will first explain how these objects are created and then numerically explore their dynamical evolution for different stellar cluster densities. The main objective is to understand whether dynamical mass segregation in these clusters affects planets and, if not, what the reason behind this is. The initial work of this series suggests that the planets follow the potential of the central core, which is supported by the fact that denser cores statistically retain more planets. Moreover, if we vary the energy distribution of the planet population, the results are similar.

        Speaker: Francesco Maria Flammini Dotti (University of Heidelberg)
    • 10:40
      Coffee break
    • 6
      Tutorial: Exascale supercomputer JUPITER
      Speaker: Mathis Bode (Jülich Supercomputing Centre)
    • 12:30
      Lunch break
    • Session 2: Large-scale simulations in astrophysics and cosmology
      Convener: Jolanta Zjupa (JSC/FZJ)
      • 7
        Challenges and solutions to run and analyse large cosmological simulation boxes IllustrisTNG and MillenniumTNG

        Large cosmological boxes are a crucial tool for understanding the universe and for interpreting ongoing and future cosmological surveys. The largest cosmological box simulations are among the biggest simulations run on current supercomputers. I will discuss our experiences running and analysing them, with a focus on the technical, machine-related, and algorithmic problems we encountered when running MillenniumTNG (10^11 resolution elements, 122,000 cores on SuperMUC-NG, 170M core-h) and the solutions we came up with to overcome them.

        Speaker: Rüdiger Pakmor (Max Planck Institute for Astrophysics)
      • 8
        Nuclear and Globular Star Clusters on the path to Exascale

        Rainer Spurzem and Silk Road Team
        National Astronomical Observatories, CAS, Beijing, China
        Astronomisches Rechen-Institut, Center for Astronomy (ARI/ZAH), Univ. of Heidelberg, Germany
        Kavli Institute for Astronomy and Astrophysics (KIAA), Peking Univ., Beijing, China

        Nuclear and globular star clusters (NSCs and GCs) are spectacular self-gravitating stellar systems in our Galaxy and across the Universe - in many respects. They populate the disks and spheroids of galaxies as well as almost every galactic center. In massive elliptical galaxies, NSCs harbor supermassive black holes, which might influence the evolution of their host galaxies as a whole. The evolution of star clusters is not only governed by the aging of their stellar populations and simple Newtonian dynamics: with increasing particle number, unique gravitational effects of collisional many-body systems begin to dominate the early cluster evolution. Direct N-body simulations are the most computationally expensive but also the most astrophysically advanced method to simulate GC and NSC evolution, using massively parallel supercomputers with GPU acceleration. The current legacy code Nbody6++GPU has seen many algorithmic and astrophysical improvements in recent years. A timing model, confirmed by benchmarks with up to 16 million bodies (a record number in this domain), is presented; such simulations approach the exaflop regime. Current and projected astrophysical results will be shown, for example on intermediate-mass black hole formation in star clusters, on clusters as gravitational wave sources, and on powerful tools to predict and analyze the properties of TDEs (tidal disruption events in nuclear star clusters, where stars are disrupted by the tidal forces of a supermassive black hole). Such events will be observable in large numbers with next-generation astrophysical instruments.

        Speaker: Prof. Rainer Spurzem (ARI/ZAH Univ. Heidelberg, NAOC/CAS Beijing)
    • 14:20
      Coffee break
    • 9
      Tutorial: Astrophysics code AthenaPK
      Speaker: Philipp Grete (University of Hamburg)
    • 16:10
      Coffee break
    • Session 3.1: Exascale-ready codes for astrophysical problems: case studies and implementations
      Convener: Susanne Pfalzner (FZJ)
      • 10
        Galaxy Simulations in the Era of Exascale

        Recent years have witnessed enormous gains in the complexity of astrophysics simulations and in the computational power of the machines that run them. Only a couple of decades ago, models of galaxy formation and evolution employed calculations with a few million cells or particles -- now those numbers typically exceed billions. With the advent of modern, GPU-based machines, such as Frontier at Oak Ridge National Laboratory, the first machine to break the exascale barrier, a new opportunity arises to increase the resolution of simulations by further orders of magnitude... provided our software can keep up. In this talk, I will describe our work to prepare the GPU-native astrophysics code Cholla to run trillion-cell galaxy simulations on Frontier. With ~parsec-scale resolution throughout the domain, these simulations are able to self-consistently capture the cycle of star formation, supernovae, and outflows on the scales of our own Milky Way, allowing us to probe new regimes of resolved galaxy evolution and answer long-standing questions about the nature of our Galaxy.

        Speaker: Evan Schneider (University of Pittsburgh)
      • 11
        Arepo-RT: Moving Mesh Radiation Hydrodynamics with GPU Acceleration

        Radiative transfer (RT) is essential for modeling many astrophysical phenomena, but its integration into radiation-hydrodynamics (RHD) simulations is computationally intensive due to the stringent time-stepping and high dimensionality requirements. The emergence of exascale supercomputers, equipped with large numbers of CPU cores and GPU accelerators, offers new avenues for optimizing these simulations. This talk will outline our progress in adapting Arepo-RT for exascale environments. Key advancements include a new node-to-node communication strategy utilizing shared memory, which significantly reduces intra-node communication overhead by leveraging direct memory access (illustrated schematically after this abstract). By consolidating inter-node messages, we increase network bandwidth utilization, improving performance on both large-scale and smaller-scale systems. Additionally, transitioning RT calculations to GPUs has led to a speedup of approximately 15 times for standard benchmarks. As a case study, cosmological RHD simulations of the Epoch of Reionization demonstrate a threefold improvement in efficiency without requiring modifications to the core Arepo codebase. These developments have broad implications for the scalability and efficiency of future astrophysical simulations, offering a framework for porting similar simulation codes based on unstructured resolution elements to GPU-centric architectures.

        Speaker: Oliver Zier (CFA Harvard)
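        As a schematic illustration of the node-local shared-memory idea described in the abstract above (Arepo-RT itself is written in C; the sketch below uses mpi4py purely to keep the example compact, and the array size is arbitrary), ranks on the same node can address a single shared allocation directly instead of exchanging intra-node messages:

        ```python
        import numpy as np
        from mpi4py import MPI

        world = MPI.COMM_WORLD

        # Split the global communicator into node-local communicators: all ranks in
        # one of these communicators share physical memory.
        node = world.Split_type(MPI.COMM_TYPE_SHARED)

        # One shared buffer per node; only node-rank 0 allocates the memory.
        n_cells = 1_000_000
        itemsize = MPI.DOUBLE.Get_size()
        size = n_cells * itemsize if node.rank == 0 else 0
        win = MPI.Win.Allocate_shared(size, itemsize, comm=node)

        # Every rank on the node queries a direct pointer to the same buffer.
        buf, _ = win.Shared_query(0)
        shared = np.ndarray(buffer=buf, dtype='d', shape=(n_cells,))

        # Ranks fill disjoint slices by direct memory access, no intra-node messages.
        win.Fence()
        lo = node.rank * n_cells // node.size
        hi = (node.rank + 1) * n_cells // node.size
        shared[lo:hi] = node.rank
        win.Fence()

        if node.rank == 0:
            print("node-local sum:", shared.sum())
        ```

        Run with, e.g., `mpirun -n 8 python shared_window.py` (the file name is hypothetical).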
      • 12
        Thor - a multi-target resonant emission line radiative transfer code

        I will present the single-source CPU/GPU resonant emission line radiative transfer code “thor” for emission and absorption line post-processing studies in astrophysical simulations. Relying on SYCL and modern C++ allows us to provide an elegant approach to write data-structure agnostic kernel implementations and follow the DRY principle. I will share my experience on performance characteristics and writing modern multi-target code using the C++ SYCL framework, enabling straightforward migration between different current and upcoming accelerator-enabled HPC systems.

        Speaker: Chris Byrohl (ITA Heidelberg)
    • Session 3.2: Exascale-ready codes for astrophysical problems: case studies and implementations
      Convener: Susanne Pfalzner (FZJ)
      • 13
        The Idefix code: Looking back at the development of an exascale code, from design to production on pre-exascale clusters

        Idefix is a versatile Godunov MHD finite volume code designed to run on accelerated supercomputers using the C++ Kokkos framework. In this keynote, I will discuss our motivations for creating a new code (in contrast to porting an existing one) and the path we followed. As the code is now public and is becoming more widely used, I will also illustrate the difficulties physicists encounter when using codes of this kind, and how to address them to maximise the transition of our communities to the new generation of accelerated machines.

        Speaker: Geoffroy Lesur (Grenoble Alpes University, CNRS, IPAG)
      • 14
        First experiences at the exascale with Parthenon – a performance portable block-structured adaptive mesh refinement framework

        On the path to exascale, the landscape of computer device architectures and corresponding programming models has become much more diverse. While various low-level performance-portable programming models are available, support at the application level lags behind. To address this issue, we present the performance-portable block-structured adaptive mesh refinement (AMR) framework Parthenon, derived from the well-tested and widely used Athena++ astrophysical magnetohydrodynamics code, but generalized to serve as the foundation for a variety of downstream multi-physics codes. Parthenon adopts the Kokkos programming model and provides various levels of abstraction, from multi-dimensional variables, to packages defining and separating components, to the launching of parallel compute kernels. Parthenon allocates all data in device memory to reduce data movement, supports the logical packing of variables and mesh blocks to reduce kernel launch overhead, and employs point-to-point, asynchronous MPI calls to reduce communication overhead in multi-node simulations. At the largest scale, a Parthenon-based hydrodynamics miniapp reaches a total of 17 trillion cell-updates per second on 9,216 nodes (73,728 logical GPUs) on Frontier at ~92% weak-scaling parallel efficiency (starting from a single node; a back-of-the-envelope breakdown of these numbers follows this abstract). In this talk, I will highlight our performance-motivated key design decisions in developing Parthenon. Moreover, I will share our experiences and challenges in scaling up, with an emphasis on handling the number of concurrent messages on the interconnect, writing large output files, and post-processing them for visualization, all of which also translate to other applications.

        Speaker: Philipp Grete (University of Hamburg)
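        The headline numbers quoted above can be unpacked with a short back-of-the-envelope check; all inputs below are taken from the abstract, and everything derived is simple arithmetic rather than additional benchmark data.

        ```python
        # Weak-scaling figures quoted for the Parthenon hydrodynamics miniapp on Frontier.
        total_updates_per_s = 17e12   # aggregate cell-updates per second
        nodes = 9216
        gpus = 73728                  # logical GPUs
        efficiency = 0.92             # weak-scaling efficiency relative to one node

        per_node = total_updates_per_s / nodes
        per_gpu = total_updates_per_s / gpus
        single_node_baseline = per_node / efficiency   # implied single-node throughput

        print(f"per node: {per_node:.2e} cell-updates/s")
        print(f"per GPU : {per_gpu:.2e} cell-updates/s")
        print(f"implied single-node baseline: {single_node_baseline:.2e} cell-updates/s")
        ```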
      • 15
        ngFEOSAD: a coarray- and GPU-accelerated radiation hydrodynamics code for studying the initial stages of star and planet formation

        Numerical simulations of the initial stages of star and planet formation require the development of specialized codes that can capture a large range of spatial scales and employ efficient parallelization techniques. We will discuss the problems, and their solutions, that we encountered when developing the nested-grid numerical gravito-radiation hydrodynamics code ngFEOSAD. Subtleties of the interface between Coarray Fortran, a fast and user-friendly alternative to MPI, and CUDA Fortran parallelization techniques will be presented.

        Speaker: Eduard Vorobyov (University of Vienna)
      • 16
        High performance massively parallel direct N-body simulations on large hybrid CPU/GPU clusters.

        Theoretical numerical modeling has become a third pillar of science, alongside theory and experiment (in the case of astrophysics, experiment is mostly replaced by observation). Numerical modeling allows one to compare theory with experimental or observational data in unprecedented detail, and it also provides theoretical insight into physical processes at work in complex systems. We are in the midst of a new revolution in parallel processor technologies and a shift in parallel programming paradigms that can help push today's software to the exaflop/s scale and help better solve and understand typical multi-scale problems. The current revolution in parallel programming has been largely catalyzed by the use of graphics processing units (GPUs) for general-purpose programming, but it is not clear that this will continue to be the case in the future. GPUs are now widely used to accelerate a wide range of applications, including computational physics and astrophysics, image/video processing, engineering simulations, and quantum chemistry, to name a few. In this work, we present direct astrophysical N-body simulations with up to six million bodies using our parallel MPI/CUDA code on large hybrid CPU/GPU clusters (JUREAP/Germany and LUMI/Finland) with different types of mixed CPU/GPU hardware. We achieve about one third of the peak GPU performance for this code in a real application scenario with individual hierarchical block time-steps, high-order (4th, 6th, and 8th) Hermite integration schemes (a minimal predictor sketch follows this abstract), and a realistic core-halo density structure of the modeled stellar systems.

        Speaker: Peter Berczik (Main Astronomical Observatory, National Academy of Sciences of Ukraine)
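        For orientation, the predictor step of the 4th-order Hermite scheme mentioned above extrapolates positions and velocities using the acceleration and its time derivative (the "jerk"). The sketch below is a minimal NumPy illustration of that predictor together with a direct-sum acceleration/jerk evaluation; it is not taken from the MPI/CUDA production code, and the softening eps is an assumed choice.

        ```python
        import numpy as np

        def acc_jerk(pos, vel, mass, eps=1e-4):
            """Direct-sum acceleration and jerk for all particles (G = 1 units)."""
            n = len(mass)
            acc = np.zeros_like(pos)
            jerk = np.zeros_like(vel)
            for i in range(n):
                for j in range(n):
                    if i == j:
                        continue
                    dr = pos[j] - pos[i]
                    dv = vel[j] - vel[i]
                    r2 = np.dot(dr, dr) + eps**2
                    r3 = r2 * np.sqrt(r2)
                    rv = np.dot(dr, dv)
                    acc[i] += mass[j] * dr / r3
                    jerk[i] += mass[j] * (dv / r3 - 3.0 * rv * dr / (r3 * r2))
            return acc, jerk

        def hermite_predict(pos, vel, acc, jerk, dt):
            """Predictor of the 4th-order Hermite scheme (Taylor expansion to dt^3)."""
            pos_p = pos + vel * dt + 0.5 * acc * dt**2 + jerk * dt**3 / 6.0
            vel_p = vel + acc * dt + 0.5 * jerk * dt**2
            return pos_p, vel_p
        ```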
    • 10:30
      Coffee break
    • Session 4: FAIR astrophysics simulations and data, green computing and environmentally sustainable research
      Convener: Holger Stiele
      • 17
        Energy-efficient large-scale simulations: Hardware and software perspectives

        In the last 50 years, the energy consumption of supercomputers has increased by more than two orders of magnitude. In recent years, due to the rapid growth of AI use in data centers, the total power consumption of data centers has also increased quickly and is predicted to continue rising at a similar rate. In this talk, I discuss the physical and technological reasons for this rapid increase in power consumption and what we can do to reduce it, from the viewpoints of both hardware and software.

        Speaker: Junichiro Makino (Kobe University/Preferred Networks)
      • 18
        The Ecological Impact of High-Performance Computing

        Computer use in science continues to increase, and so does its impact on the environment. To minimize these effects, scientists should avoid interpreted scripting languages such as Python, favour the optimal use of energy-efficient workstations, and shun supercomputers powered by carbon-based energy sources.

        Speaker: Simon Portegies Zwart (Leiden Observatory)
      • 19
        Towards FAIR Astrophysical Simulations

        Scientific work needs to be reproducible, which requires open access to the data. In the context of astronomical observations, the FAIR principles of research data management are widely discussed. However, it is equally important that numerical simulations in astrophysics become FAIR and reproducible. We discuss the need to share not only simulation codes, but also results data and diagnostic tools. Currently, the degree to which this data is shared varies significantly. While code sharing has become more common, often only older or reduced versions are made publicly available. Additionally, the actual result data and diagnostic tools are much less widely disseminated. This situation hinders the reproducibility of research and the reuse of results by other researchers. To address this, we review the requirements for making data FAIR in computational astrophysics and discuss supporting tools, platforms, methods, and best practices.

        Speaker: Frank Wagner (Jülich Supercomputing Centre)
    • 12:00
      Lunch break
    • 20
      Tutorial: Astrophysics code PLUTO
      Speaker: Andrea Mignone (UNITO)
    • 14:30
      Coffee break
    • 21
      Tutorial: Astrophysics code Idefix
      Speaker: Geoffroy Lesur (Grenoble Alpes University, CNRS, IPAG)
    • 16:20
      Coffee break
    • Session 5.1: Big data astronomy and ML/DL techniques
      Conveners: Christian Boily (Observatory of Strasbourg (OAS)), Simon Portegies Zwart (Leiden Observatory)
      • 22
        Harnessing GPU Power: The Past, Present, and Future of AI and Machine Learning in Astronomy

        In this talk, we will briefly review the introduction of GPU accelerators — the hardware driving the AI revolution — into the field of astronomy. After this historical overview, we will shift our focus to the future, examining how GPUs are transforming the computational landscape and their impact on data-intensive applications in astronomy. Additionally, we will explore the expanding role of AI and machine learning in advancing astronomical research, enabled by the power of modern chips.

        Speaker: Jeroen Bédorf (minds.ai)
      • 23
        Interpreting the largest cosmological simulations using Representation Learning

        Numerical simulations are the best approximation to experimental laboratories in astrophysics and cosmology. However, the complexity and richness of their outputs severely limit the interpretability of their predictions. We describe a new assumption-free approach to obtaining scientific insights from cosmological simulations. The method can be applied to today’s largest simulations and will be essential to solve the extreme data access, exploration, and analysis challenges posed by the Exascale computing era. Our tool automatically learns compact representations of simulated galaxies in a low-dimensional space that naturally describes their intrinsic features. The data is seamlessly projected onto this latent space for interactive inspection, visual interpretation, sample selection, and local analysis. We present a working prototype using a Hyperspherical Variational Convolutional Autoencoder trained on the entire sample of simulated galaxies from the IllustrisTNG project. The tool produces an interactive visualization of a “Hubble tuning fork”-style similarity space of simulated galaxies on the surface of a sphere. The hierarchical spherical projection of the data can be extended to arbitrarily large simulations with millions of galaxies.

        Speaker: Sebastian Trujillo (Heidelberg Institute for Theoretical Studies (HITS))
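        As a toy counterpart to the representation-learning prototype described above, the PyTorch sketch below builds a small convolutional autoencoder whose latent vectors are projected onto the unit hypersphere by a simple L2 normalisation. This is only a stand-in: the actual tool uses a Hyperspherical Variational Convolutional Autoencoder with a proper variational posterior, and the 64x64 input shape, layer sizes, and latent dimension here are arbitrary assumptions.

        ```python
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SphericalConvAutoencoder(nn.Module):
            """Toy conv autoencoder with latent vectors constrained to the unit sphere."""

            def __init__(self, latent_dim=16):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 64 -> 32
                    nn.ReLU(),
                    nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 32 -> 16
                    nn.ReLU(),
                    nn.Flatten(),
                    nn.Linear(32 * 16 * 16, latent_dim),
                )
                self.decoder_fc = nn.Linear(latent_dim, 32 * 16 * 16)
                self.decoder = nn.Sequential(
                    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 16 -> 32
                    nn.ReLU(),
                    nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),   # 32 -> 64
                )

            def forward(self, x):
                z = F.normalize(self.encoder(x), dim=1)        # project onto the sphere
                h = self.decoder_fc(z).view(-1, 32, 16, 16)
                return self.decoder(h), z

        # Toy usage on random 64x64 "galaxy images".
        model = SphericalConvAutoencoder()
        images = torch.randn(8, 1, 64, 64)
        recon, latent = model(images)
        loss = F.mse_loss(recon, images)
        loss.backward()
        ```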
    • 24
      Tutorial: AI for astrophysicists
      Speaker: Stefan Kesselheim
    • 10:30
      Coffee break
    • Session 5.2: Big data astronomy and ML/DL techniques
      Conveners: Andrea Mignone (UNITO), Stephan Hachinger (Leibniz Supercomputing Centre (LRZ) of the BAdW)
      • 25
        Towards learned exascale computational imaging for the SKA

        Exascale computational challenges in astrophysics range across big-data, big-models, and big-sims, all of which require big-compute. In this talk I will focus on the big-data challenge of the Square Kilometre Array (SKA), the next-generation of radio interferometric telescopes, which is currently under construction. The SKA will deliver unprecedented resolution and sensitivity that will unlock numerous science goals, ranging from studying dark matter and dark energy, to extreme tests of general relativity, to observing for the first time the epoch when luminous objects in the Universe formed. However, the SKA also presents unprecedented data processing challenges and is a truly exascale experiment. Imaging raw observations of the SKA requires solving an ill-posed inverse problem that has been identified as a critical bottleneck in current data processing pipelines. I will review highly distributed and parallelised algorithms to scale computational inverse imaging to the exascale. Furthermore, I will describe how artificial intelligence (AI) can be integrated into this approach to realise a hybrid physics-AI approach that can leverage big-sims and, perhaps surprisingly, small-models, providing superior reconstruction quality, further acceleration, and uncertainty quantification.

        Speaker: Jason McEwen (UCL)
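        To make the "ill-posed inverse problem" mentioned above concrete in the simplest possible terms, the NumPy toy below recovers a sparse sky x from incomplete, noisy linear measurements y = A x + n by gradient descent on the Tikhonov-regularised objective (1/2)||y - A x||^2 + (lambda/2)||x||^2. This generic sketch is far removed from the distributed, AI-accelerated SKA pipelines the talk describes; the operator A, problem sizes, and regularisation weight are all arbitrary assumptions.

        ```python
        import numpy as np

        rng = np.random.default_rng(0)

        # Toy linear measurement operator: fewer measurements than unknowns -> ill-posed.
        n_pix, n_meas = 256, 128
        A = rng.normal(size=(n_meas, n_pix)) / np.sqrt(n_meas)
        x_true = np.zeros(n_pix)
        x_true[rng.choice(n_pix, 10, replace=False)] = 1.0   # a few point sources
        y = A @ x_true + 0.01 * rng.normal(size=n_meas)      # noisy "visibilities"

        # Gradient descent on the regularised least-squares objective.
        lam, step, x = 1e-2, 0.1, np.zeros(n_pix)
        for _ in range(2000):
            grad = A.T @ (A @ x - y) + lam * x
            x -= step * grad

        print("relative reconstruction error:",
              np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
        ```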
      • 26
        Analysing edge-on galaxies with deep learning

        The advent of large astronomical surveys, such as Euclid, will offer unprecedented insights into the statistical properties of galaxies. However, the large amounts of data that will be generated by these surveys call for the application of machine learning methods. For this purpose, we trained the YOLOv5 algorithm to accurately detect spiral, edge-on galaxies in astronomical images (a minimal usage sketch follows this abstract) and the SCSS-Net neural network to generate segmentation masks, so that the detected galaxies can be used for any further analysis. This algorithm was applied to current astronomical images; however, its real power lies in its applicability to data from future surveys, where it can lead to new discoveries. We will also present one of our future goals, the study of galactic warps: a well-known distortion of the galactic disc occurring in most spiral galaxies, including the Milky Way. Although we know hundreds of warped galaxies of different shapes and sizes, it is still not clear how the warp is created. We will show how our algorithm can yield a deeper statistical analysis that will enable us to make connections between different warps, understand their environmental dependencies, and thus contribute to understanding how this feature forms and what role it plays in galactic evolution.

        Speaker: Dr Žofia Chrobáková (Mullard Space Science Laboratory, University College London, Holmbury St Mary, Dorking, Surrey RH5 6NT, UK)
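        For readers who have not used YOLOv5, the snippet below shows the generic pretrained-model inference workflow via torch.hub. It is a hedged illustration of the off-the-shelf interface only, not the custom-trained galaxy detector or the SCSS-Net segmentation step described in the abstract; the image file name is a placeholder.

        ```python
        import torch

        # Load the small pretrained YOLOv5 model published by Ultralytics via torch.hub.
        # (The detector in the abstract was re-trained on labelled galaxy images; this
        # only demonstrates the inference interface on the stock weights.)
        model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

        # 'galaxy_cutout.png' stands in for an image cutout exported from survey data.
        results = model('galaxy_cutout.png')

        detections = results.xyxy[0]   # rows of (x1, y1, x2, y2, confidence, class)
        print(detections)
        results.save()                 # writes an annotated copy under runs/detect/
        ```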
      • 27
        The Gaia AVU-GSR solver: a CPU + GPU parallel code toward Exascale systems

        The solver module of the Astrometric Verification Unit - Global Sphere Reconstruction (AVU-GSR) pipeline aims to find the astrometric parameters of ~10^8 stars in the Milky Way, besides the attitude and instrumental settings of the Gaia satellite and the parametrized post-Newtonian parameter gamma, with a resolution of 10-100 micro-arcseconds. To perform this task, the code solves a system of linear equations with the iterative LSQR algorithm, where the coefficient matrix is large (10-50 TB) and sparse, and the iterations stop when convergence is reached in the least-squares sense. The two matrix-by-vector products performed at each LSQR step were GPU-ported, first with OpenACC and then with CUDA, resulting in 1.5x and 14x speedups, respectively, over an original code version entirely parallelized on the CPU with MPI + OpenMP. The CUDA code was further optimized and then ported with portable programming frameworks, obtaining a further 2x acceleration factor. One critical section of the code is the computation of covariances, whose total number is Nunk x (Nunk - 1)/2 and which occupy ~1 EB, with Nunk ~ 5 x 10^8 the total number of unknowns. This "Big Data" issue cannot be faced with standard approaches: we defined an I/O-based pipeline made of two concurrently launched jobs, where one job, the LSQR, writes the files and the second job reads them, iteratively computes the covariances, and deletes them. The pipeline does not present significant bottlenecks up to ~8 x 10^6 covariance elements. The code currently runs in production on the Leonardo infrastructure at CINECA. (A minimal LSQR usage sketch follows this abstract.)

        Speaker: Valentina Cesare (Istituto Nazionale di Astrofisica (INAF) - Osservatorio Astrofisico di Catania)
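        As a minimal illustration of the LSQR building block named above, the sketch below runs SciPy's implementation on a small random sparse system (nothing like the 10-50 TB production matrix) and evaluates the covariance-count formula from the abstract for the quoted number of unknowns.

        ```python
        import numpy as np
        from scipy.sparse import random as sparse_random
        from scipy.sparse.linalg import lsqr

        rng = np.random.default_rng(1)

        # Small, sparse, overdetermined toy system solved in the least-squares sense.
        A = sparse_random(2000, 500, density=0.01, format='csr', random_state=42)
        x_true = rng.normal(size=500)
        b = A @ x_true + 1e-3 * rng.normal(size=2000)

        x_est, istop, itn = lsqr(A, b, atol=1e-10, btol=1e-10)[:3]
        print(f"stopped with code {istop} after {itn} iterations")

        # Covariance count quoted in the abstract: Nunk * (Nunk - 1) / 2 pairs.
        n_unk = 5e8
        print(f"number of covariance elements: {n_unk * (n_unk - 1) / 2:.2e}")
        ```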
    • 28
      Discussion & Farewell
      Speaker: Susanne Pfalzner (FZJ)