https://doi.org/10.5194/egusphere-2023-2547
08 Jan 2024

Accelerating Lagrangian transport simulations on graphics processing units: performance optimizations of MPTRAC v2.6

Lars Hoffmann, Kaveh Haghighi Mood, Andreas Herten, Markus Hrywniak, Jiri Kraus, Jan Clemens, and Mingzhao Liu

Abstract. Lagrangian particle dispersion models are indispensable tools for the study of atmospheric transport processes. However, Lagrangian transport simulations can become numerically expensive when large numbers of air parcels are involved. To accelerate these simulations, we made considerable efforts to port the Massive-Parallel Trajectory Calculations (MPTRAC) model to graphics processing units (GPUs). Here we discuss performance optimizations of the major bottleneck of the GPU code of MPTRAC, the advection kernel. Timeline, roofline, and memory analyses of the baseline GPU code revealed that the application is memory-bound and performance suffers from near-random memory access patterns. By changing the data structure of the horizontal wind and vertical velocity fields of the global meteorological data driving the simulations from Structure of Arrays (SoA) to Array of Structures (AoS), and by introducing a sorting method for better memory alignment of the particle data, performance was greatly improved. We evaluated the performance on NVIDIA A100 GPUs of the Jülich Wizard for European Leadership Science (JUWELS) Booster module at the Jülich Supercomputing Center, Germany. For our largest test case, transport simulations with 10⁸ particles driven by the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis, we found that the runtime for the full set of physics computations was reduced by 75 %, including a reduction of 85 % for the advection kernel. In addition to demonstrating the benefits of code optimization for GPUs, we show that the runtime of CPU-only simulations is also improved. For our largest test case, we found a runtime reduction of 34 % for the physics computations, including a reduction of 65 % for the advection kernel.
The code optimizations discussed here bring the MPTRAC model closer to applications on upcoming exascale high performance computing systems, and will also be of interest for optimizing the performance of other models using particle methods.
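
The two optimizations named in the abstract can be illustrated with a minimal C sketch. This is not the MPTRAC source: the toy grid size, the type names (`met_soa_t`, `met_aos_t`, `parcel_t`), and the functions (`soa_to_aos`, `box_index`, `sort_parcels`) are hypothetical and chosen only for illustration. The sketch assumes each parcel already knows the indices of the grid box that contains it, and it uses the standard library `qsort` as a stand-in for whatever sorting method the model actually applies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy grid: longitude, latitude, and pressure level counts. */
enum { NX = 4, NY = 3, NP = 2 };

/* Structure of Arrays: each wind component lives in its own 3-D array. */
typedef struct {
  float u[NX][NY][NP];
  float v[NX][NY][NP];
  float w[NX][NY][NP];
} met_soa_t;

/* Array of Structures: u, v, w of one grid point are adjacent in memory. */
typedef struct {
  float uvw[NX][NY][NP][3];
} met_aos_t;

/* Copy the three separate component arrays into the interleaved layout. */
static void soa_to_aos(const met_soa_t *in, met_aos_t *out) {
  assert(in != NULL && out != NULL);
  for (int ix = 0; ix < NX; ix++)
    for (int iy = 0; iy < NY; iy++)
      for (int ip = 0; ip < NP; ip++) {
        out->uvw[ix][iy][ip][0] = in->u[ix][iy][ip];
        out->uvw[ix][iy][ip][1] = in->v[ix][iy][ip];
        out->uvw[ix][iy][ip][2] = in->w[ix][iy][ip];
      }
}

/* Hypothetical parcel record: indices of the grid box holding the parcel. */
typedef struct {
  int ix, iy, ip;
} parcel_t;

/* Linear index of a grid box, in the same order as the arrays above. */
static size_t box_index(const parcel_t *a) {
  return ((size_t)a->ix * NY + (size_t)a->iy) * NP + (size_t)a->ip;
}

/* qsort comparator: order parcels by the linear index of their grid box. */
static int cmp_parcel(const void *a, const void *b) {
  const size_t ka = box_index(a), kb = box_index(b);
  return (ka > kb) - (ka < kb);
}

/* Sort parcels so that neighbours in the array read neighbouring grid boxes. */
static void sort_parcels(parcel_t *parcels, size_t n) {
  qsort(parcels, n, sizeof *parcels, cmp_parcel);
}
```

With the interleaved layout, the three wind components needed to advect a parcel sit next to each other in memory, so a single cache line or memory transaction can deliver all of them. After sorting, consecutive parcels tend to touch consecutive grid boxes, which is the kind of access pattern that counters the near-random memory accesses described above.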

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Journal article(s) based on this preprint

17 May 2024
Accelerating Lagrangian transport simulations on graphics processing units: performance optimizations of Massive-Parallel Trajectory Calculations (MPTRAC) v2.6
Lars Hoffmann, Kaveh Haghighi Mood, Andreas Herten, Markus Hrywniak, Jiri Kraus, Jan Clemens, and Mingzhao Liu
Geosci. Model Dev., 17, 4077–4094, https://doi.org/10.5194/gmd-17-4077-2024, 2024

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2023-2547', Anonymous Referee #1, 01 Feb 2024
  • RC2: 'Comment on egusphere-2023-2547', Anonymous Referee #2, 03 Feb 2024
  • RC3: 'Comment on egusphere-2023-2547', Anonymous Referee #3, 04 Feb 2024
  • AC1: 'Comment on egusphere-2023-2547', Lars Hoffmann, 02 Apr 2024

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
AR by Lars Hoffmann on behalf of the Authors (02 Apr 2024): Author's response, Author's tracked changes, Manuscript
ED: Publish as is (09 Apr 2024) by Xiaomeng Huang
AR by Lars Hoffmann on behalf of the Authors (09 Apr 2024)

Data sets

Supplementary material to "Accelerating Lagrangian transport simulations on graphics processing units: performance optimizations of MPTRAC v2.6", Lars Hoffmann, https://doi.org/10.5281/zenodo.10065785

Model code and software

Massive-Parallel Trajectory Calculations (MPTRAC) v2.6, L. Hoffmann et al., https://doi.org/10.5281/zenodo.10067751

Viewed

Total article views: 399 (including HTML, PDF, and XML)
  • HTML: 296
  • PDF: 81
  • XML: 22
  • Total: 399
  • BibTeX: 9
  • EndNote: 8
Views and downloads (calculated since 08 Jan 2024)

Viewed (geographical distribution)

Total article views: 396 (including HTML, PDF, and XML) Thereof 396 with geography defined and 0 with unknown origin.

Latest update: 03 Sep 2024

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Lagrangian particle dispersion models are crucial for studying atmospheric transport, but they can be computationally intensive. To speed up simulations, the MPTRAC model was adapted for GPUs. Performance optimizations of data structures and memory alignment resulted in run-time improvements of up to 75 % on NVIDIA A100 GPUs for ERA5-based simulations with 100 million particles. These optimizations make the MPTRAC model well suited for upcoming HPC systems.
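
For readers who prefer speedup factors to percentages, the runtime reductions reported for this preprint can be converted with a standard identity. The symbols r (fractional runtime reduction) and S (speedup) are introduced here for illustration and do not appear in the original text; the rounded values follow directly from the quoted percentages.

```latex
% Speedup S obtained from a fractional runtime reduction r
S = \frac{t_\mathrm{old}}{t_\mathrm{new}} = \frac{1}{1 - r}

% GPU, physics computations:  r = 0.75 \Rightarrow S = 4
% GPU, advection kernel:      r = 0.85 \Rightarrow S \approx 6.7
% CPU, physics computations:  r = 0.34 \Rightarrow S \approx 1.5
% CPU, advection kernel:      r = 0.65 \Rightarrow S \approx 2.9
```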