https://doi.org/10.5194/egusphere-2025-4435
03 Nov 2025
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Actionable reporting of CPU-GPU performance comparisons: Insights from a CLUBB case study

Gunther Huebler, Vincent E. Larson, John Dennis, and Sheri Voelz

Abstract. Graphics Processing Units (GPUs) are becoming increasingly central to high-performance computing (HPC), but fair comparison with central processing units (CPUs) remains challenging, particularly for applications that can be subdivided into smaller workloads. Traditional metrics such as speedup ratios can overstate GPU advantages and obscure the conditions under which CPUs are competitive, as they depend strongly on workload choice. We introduce two peak-based performance metrics, the Peak Ratio Crossover (PRC) and the Peak-to-Peak Ratio (PPR), which provide clearer comparisons by accounting for the best achievable performance of each device. Using a case study of the performance of the Cloud Layers Unified by Binormals (CLUBB) standalone model, we demonstrate these metrics in practice, show how they can guide execution strategy, and examine how they shift under factors that affect workload. We further analyze how implementation choices and code structure influence these metrics, showing how they enable performance comparisons to be expressed in a concise and actionable way, while also helping identify which optimization efforts should be prioritized to meet different performance goals.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 29 Dec 2025)

Model code and software

GitHub repo of CLUBB code (Gunther Huebler and Vincent Larson): https://github.com/larson-group/clubb_release/tree/clubb_performance_testing

Zenodo archive of CLUBB code and profiling results (Gunther Huebler): https://doi.org/10.5281/zenodo.17081296

Latest update: 03 Nov 2025
Short summary
Central processing units (CPUs) and graphics processing units (GPUs) are different devices that suit different kinds of work. Using a climate modeling component, we provide a clearer way to tell which device type is faster for a given task. This matters because runs usually use only one device type. Our results are actionable: they guide device choice, report performance gains fairly, highlight code areas to improve, and show how code structure and optimization can change conclusions.