A GPU-parallelization of the neXtSIM-DG dynamical core (v0.3.1)

Jendersie, Robert; Lessig, Christian; Richter, Thomas

https://doi.org/10.5194/egusphere-2024-2539


Abstract. The cryosphere plays a crucial role in Earth’s climate system, making accurate sea ice simulation essential for improving climate projections. To achieve higher-resolution simulations, graphics processing units (GPUs) have become increasingly appealing due to their higher peak floating-point performance and superior energy efficiency compared to CPUs. However, harnessing the full theoretical performance of GPUs often requires significant effort in redesigning algorithms and careful implementation. Recently, several frameworks have emerged that aim to simplify general-purpose GPU programming. In this study, we evaluate multiple such frameworks, including CUDA, SYCL, Kokkos, and PyTorch, for the parallelization of neXtSIM-DG, a finite-element-based dynamical core for sea ice. Based on our assessment of usability and performance, CUDA demonstrates the best performance, while Kokkos is a suitable option for its robust heterogeneous computing capabilities. Our complete implementation of the momentum equation using Kokkos achieves a sixfold speedup on the GPU compared to our OpenMP-based CPU code, while maintaining competitiveness when run on the CPU. Additionally, we explore the impact of different discretization orders and the use of lower-precision floating-point types on the GPU, showing that switching to single precision can further accelerate sea ice codes.

Received: 12 Aug 2024 – Discussion started: 25 Sep 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.





