Climate model spread outweighs glacier model spread in 21st-century drought buffering projections
Abstract. Drought risk is changing as the hydrological cycle responds to anthropogenic climate change. Projections of future drought risk used to inform water management would ideally be conducted at local scale, but local-scale projections demand local data and computational resources that are often not available. As an alternative, global-scale projections of glacier runoff and the hydrological cycle can provide important insights for the local scale, particularly when interpreted carefully. Here, we use an ensemble of latest-generation (CMIP6) climate models to force three different global glacier models, and we examine changes in glacial drought buffering for 75 major river basins in the early, mid-, and late 21st century. Despite differences in absolute glacier runoff simulated by each global glacier model, their glacial drought buffering results are broadly consistent. By contrast, we find that the spread in glacial drought buffering among different climate models is large and likely under-sampled. This work highlights that, for downstream hydrological studies: (1) no one global glacier model is more suitable than another, and (2) analysing a representative ensemble of climate models is imperative. Our findings illustrate that differences in glacier model outputs that appear consequential to glaciologists may be less consequential for downstream impact metrics.
Summary
The manuscript evaluates how uncertainty in climate forcing (GCMs) and uncertainty across three global glacier models (GloGEM, PyGEM, OGGM) propagate to projections of glacial drought buffering for 75 large glacierized river basins worldwide. Using CMIP6 forcings and monthly glacier runoff outputs, the authors compute two versions of a 3-month SPEI, one that includes glacier runoff and one that does not. They summarize buffering as the difference in drought severity between the two SPEIs in three periods (early, mid-, and late 21st century) and compare the spread arising from the GCMs with the spread across the three glacier models. Main findings: (1) glacial buffering generally increases through the 21st century and correlates with basin glacier fraction; (2) qualitative buffering trends are similar across the three glacier models; (3) inter-GCM spread in buffering substantially exceeds inter-glacier-model spread, and the commonly used 11-member glacier forcing ensemble undersamples the broader CMIP6 spread.
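To make the central comparison concrete, the following is a minimal sketch of how the two spreads could be contrasted for a single basin and period. It is an illustration only, not the authors' procedure: the function name `spread_by_source`, the array layout (rows are GCMs, columns are glacier models), and the use of the max-minus-min range as the spread measure are all assumptions of mine.

```python
import numpy as np


def spread_by_source(buffering):
    """Compare inter-GCM and inter-glacier-model spread in a buffering metric.

    `buffering` is a 2-D array-like with one row per GCM and one column per
    glacier model (assumed layout). Spread is measured as max minus min.
    """
    b = np.asarray(buffering, dtype=float)
    if b.ndim != 2:
        raise ValueError("expected a 2-D array: (n_gcm, n_glacier_model)")

    # Range across GCMs, computed per glacier model, then averaged.
    gcm_spread = (b.max(axis=0) - b.min(axis=0)).mean()
    # Range across glacier models, computed per GCM, then averaged.
    glacier_spread = (b.max(axis=1) - b.min(axis=1)).mean()
    return float(gcm_spread), float(glacier_spread)


# Placeholder values for illustration only (3 GCMs x 3 glacier models).
example = [
    [0.10, 0.20, 0.15],
    [0.50, 0.60, 0.55],
    [0.90, 1.00, 0.95],
]
gcm_spread, glacier_spread = spread_by_source(example)
```

In this toy input the rows differ far more than the columns, so `gcm_spread` exceeds `glacier_spread`, which is the qualitative pattern the manuscript reports.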
This paper addresses an important and timely research question, building on previous work (Ultee et al., 2022; Wimberly et al., 2025). The manuscript is generally well-conceived, clearly written, and presents a valuable approach, particularly in its use of a large multi-GCM and multi-realization ensemble. Before acceptance, some methodological clarifications, additional contextual details, and sensitivity analyses would further strengthen the manuscript. Below, I outline detailed comments and specific suggestions to guide the authors in refining their work.
Major Comments (not in order of importance)
Inclusion of “hot” models. While reading the manuscript, I wondered whether sampling the entire CMIP6 spread is necessarily the most informative approach. Recent studies (e.g., Hausfather et al., 2022) have highlighted that several CMIP6 models exhibit unrealistically high climate sensitivities, leading to warming projections that exceed observationally constrained ranges. Consequently, many multi-model analyses now either exclude these “hot” models or reframe their results by grouping or averaging simulations according to warming levels. It would therefore be valuable for the authors to discuss how the inclusion of such “hot” models may influence their conclusions. Specifically, would the finding that inter-GCM spread outweighs inter-glacier-model spread remain robust if the analysis were repeated using a more balanced subset of GCMs?
Hausfather, Z., Marvel, K., Schmidt, G. A., Nielsen-Gammon, J. W., & Zelinka, M. (2022). Climate simulations: recognize the ‘hot model’ problem. Nature, 605(7908), 26–29
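One way to run the suggested robustness check is sketched below, assuming the authors screen models on transient climate response (TCR). The default bounds of 1.4–2.2 °C are what I understand to be the assessed likely TCR range discussed by Hausfather et al. (2022); the function names, model labels, and numerical values are hypothetical placeholders, not data from the manuscript.

```python
def screen_by_tcr(tcr_by_model, low=1.4, high=2.2):
    """Return the models whose TCR (in deg C) lies within [low, high]."""
    return sorted(m for m, tcr in tcr_by_model.items() if low <= tcr <= high)


def ensemble_range(buffering_by_model, models):
    """Max-minus-min range of a buffering metric over the chosen models."""
    values = [buffering_by_model[m] for m in models]
    if not values:
        raise ValueError("no models selected")
    return max(values) - min(values)


# Placeholder inputs for illustration only (not real model values).
tcr = {"model_a": 1.6, "model_b": 2.0, "model_c": 2.7}
buffering = {"model_a": 0.20, "model_b": 0.35, "model_c": 0.90}

kept = screen_by_tcr(tcr)  # models inside the assumed TCR bounds
full_range = ensemble_range(buffering, list(buffering))
screened_range = ensemble_range(buffering, kept)
```

If the screened range still exceeds the inter-glacier-model spread, the headline conclusion would be robust to the "hot model" issue.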
GCM selection and ensemble representativeness. The paper's central claim depends on comparing the 11-member ensemble used to force the glacier models against the wider CMIP6 ensemble. The manuscript should explicitly list the 11 GCMs and briefly explain how they were selected, if specific criteria were applied. In addition, please provide a table of the 24 GCMs used for comparison, including the number of realizations per GCM, to support reproducibility.
Details about downscaling / bias correction. The Methods state that the glacier models were forced by a single continuous historical (2000–2014) + SSP (2015–2100) simulation per GCM. Please provide details on any bias correction or downscaling applied to the GCM variables before they were used to force the glacier models and to compute the SPEI.
Influence of precipitation factors. Several glacier models apply precipitation correction factors to compensate for known underestimation biases in precipitation datasets. While such factors primarily affect mass balance calibration, they can significantly influence the magnitude of simulated runoff. Consequently, these corrections may amplify the apparent glacier buffering capacity. It was not entirely clear from the manuscript whether such precipitation correction factors were applied within the glacier model simulations used here, and if so, whether they differ across the three models. I recommend that the authors clarify this aspect in the Methods section. If corrections were applied, please specify their magnitude. If not, a brief discussion on the potential implications of uncorrected precipitation biases for the glacier runoff component would strengthen the paper.
Glacial runoff definition. In L90, glacial runoff is defined as the sum of ice and rain runoff from glacierized areas. However, because the three glacier models use fixed-gauge water runoff, this definition likely also includes seasonal snowmelt. This is critical, as the study’s core argument attributes all runoff from glacierized regions to glacier change, while part of it may originate from seasonal snowmelt. This distinction has been highlighted in recent discussions (Gascoin, 2024), stressing the need for precise terminology when describing “glacial runoff.” If possible, the analysis should isolate ice melt to maintain conceptual consistency with drought buffering through glacier mass loss. Otherwise, the authors should clearly acknowledge this limitation and the potential overestimation of glacier influence.
Use of multiple realizations per GCM. To my understanding, this is the first study to use such a large archive of future simulations (>100), incorporating multiple models and realizations. This is a valuable and novel aspect, as most studies rely on single realizations per GCM. While perhaps beyond the current scope (L127–128), it would be interesting to include some discussion on how representative ensembles with only one realization per GCM compare to this larger multi-realization approach. Providing brief insights on this point would offer a useful lesson for modelers and practitioners regarding ensemble design.
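As a starting point for that discussion, a simple comparison could look like the sketch below. It assumes a hypothetical mapping from each GCM to a list of per-realization buffering values and, as an arbitrary choice, takes the first listed realization as the single-realization ensemble; none of these names or values come from the manuscript.

```python
def value_range(values):
    """Max-minus-min range of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("empty ensemble")
    return max(values) - min(values)


def single_vs_multi_realization(buffering_by_gcm):
    """Return (range with one realization per GCM, range with all realizations).

    `buffering_by_gcm` maps a GCM label to a list of buffering values,
    one per realization (assumed structure).
    """
    first_only = [runs[0] for runs in buffering_by_gcm.values()]
    all_runs = [v for runs in buffering_by_gcm.values() for v in runs]
    return value_range(first_only), value_range(all_runs)


# Placeholder values for illustration only.
toy = {"gcm_a": [1.0, 3.0], "gcm_b": [2.0, 0.5]}
single_range, multi_range = single_vs_multi_realization(toy)
```

The ratio of the two ranges gives a rough indication of how much spread a one-realization-per-GCM design would miss; repeating the calculation with randomly drawn realizations would show how sensitive that ratio is to the draw.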
Specific Comments
Title and abstract
Section 1 (Introduction)
Section 2 (Methods)
Section 3 (Results)
Section 4–5