Climate model spread outweighs glacier model spread in 21st-century drought buffering projections

Ultee, Lizz; Wimberly, Finn; Coats, Sloan; Mackay, Jonathan; Holmgren, Erik

doi:10.5194/egusphere-2025-3965

Preprints

08 Sep 2025

Abstract. Drought risk is changing as the hydrological cycle responds to anthropogenic climate change. Projections of future drought risk used to inform water management would ideally be conducted at local scale, but local-scale projections demand local data and computational resources that are often not available. As an alternative, global-scale projections of glacier runoff and the hydrological cycle can provide important insights for the local scale, particularly when interpreted carefully. Here, we use an ensemble of latest-generation (CMIP6) climate models to force three different global glacier models, and we examine changes in glacial drought buffering for 75 major river basins in the early, mid-, and late 21st century. Despite differences in absolute glacier runoff simulated by each global glacier model, their glacial drought buffering results are broadly consistent. By contrast, we find that the spread in glacial drought buffering among different climate models is large and likely under-sampled. This work highlights that, for downstream hydrological studies: (1) no one global glacier model is more suitable than another, and (2) analysing a representative ensemble of climate models is imperative. Our findings illustrate that differences in glacier model outputs that appear consequential to glaciologists may be less consequential for downstream impact metrics.

Received: 13 Aug 2025 – Discussion started: 08 Sep 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on egusphere-2025-3965', Anonymous Referee #1, 14 Oct 2025
Summary
The manuscript evaluates how uncertainty in climate forcing (GCMs) and uncertainty across three global glacier models (GloGEM, PyGEM, OGGM) propagate to projections of glacial drought buffering for 75 large glacierized river basins worldwide. The authors compute two versions of a 3-month SPEI (one that includes glacier runoff) using CMIP6 forcings and monthly glacier runoff outputs. They summarize buffering as the difference in drought severity between SPEIs in three periods and compare the spread arising from the GCMs versus the spread across the three glacier models. Main findings: Glacial buffering generally increases through the 21st century and correlates with basin glacier fraction. Qualitative buffering trends are similar across the three glacier models. Inter-GCM spread in buffering substantially exceeds inter-glacier-model spread, and the commonly used 11-member glacier forcing ensemble undersamples the broader CMIP6 spread.
This paper addresses an important and timely research question, building on previous work (Ultee et al., 2022; Wimberly et al., 2025). The manuscript is generally well-conceived, clearly written, and presents a valuable approach, particularly in its use of a large multi-GCM and multi-realization ensemble. Before acceptance, some methodological clarifications, additional contextual details, and sensitivity analyses would further strengthen the manuscript. Below, I outline detailed comments and specific suggestions to guide the authors in refining their work.
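To make the comparison summarized above concrete, the following is a minimal sketch of how the two spreads could be contrasted. It is not the authors' code: the buffering values, the GCM labels, and the use of a min-max range as the spread measure are all assumptions chosen for illustration.

```python
from statistics import mean

# Hypothetical buffering values (difference in drought severity, SPEI units),
# indexed as buffering[gcm][glacier_model]. All numbers are invented.
buffering = {
    "gcm_a": {"GloGEM": 0.10, "PyGEM": 0.12, "OGGM": 0.11},
    "gcm_b": {"GloGEM": 0.30, "PyGEM": 0.33, "OGGM": 0.29},
    "gcm_c": {"GloGEM": 0.55, "PyGEM": 0.50, "OGGM": 0.52},
}


def value_range(values):
    """Min-max range, used here as an assumed measure of spread."""
    values = list(values)
    return max(values) - min(values)


def inter_gcm_spread(table):
    """Range across GCMs, averaged over glacier models."""
    glacier_models = list(next(iter(table.values())).keys())
    return mean(
        value_range(table[gcm][model] for gcm in table)
        for model in glacier_models
    )


def inter_glacier_model_spread(table):
    """Range across glacier models, averaged over GCMs."""
    return mean(value_range(row.values()) for row in table.values())


gcm_spread = inter_gcm_spread(buffering)
glacier_spread = inter_glacier_model_spread(buffering)
print(f"inter-GCM spread: {gcm_spread:.3f}")
print(f"inter-glacier-model spread: {glacier_spread:.3f}")
```

With these invented numbers the inter-GCM range is roughly ten times the inter-glacier-model range; other spread measures (standard deviation, interquartile range) could be substituted in `value_range` without changing the structure.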
Major Comments (not in order of importance)
Inclusion of “hot” models. While reading the manuscript, I wondered whether sampling the entire CMIP6 spread is necessarily the most informative approach. Recent studies (e.g., Hausfather et al., 2022) have highlighted that several CMIP6 models exhibit unrealistically high climate sensitivities, leading to warming projections that exceed observationally constrained ranges. Consequently, many multi-model analyses now either exclude these “hot” models or reframe their results by grouping or averaging simulations according to warming levels. It would therefore be valuable for the authors to discuss how the inclusion of such “hot” models may influence their conclusions. Specifically, would the finding that inter-GCM spread outweighs inter-glacier-model spread remain robust if the analysis were repeated using a more balanced subset of GCMs?
Hausfather, Z., Marvel, K., Schmidt, G. A., Nielsen-Gammon, J. W., & Zelinka, M. (2022). Climate simulations: recognize the ‘hot model’ problem. Nature, 605(7908), 26–29
GCM selection and ensemble representativeness. The paper's central claim depends on comparing the 11-member ensemble used to force glacier models against the wider CMIP6 ensemble. The manuscript should explicitly list the 11 GCMs and briefly explain how these GCMs were selected (if there is a reason). In addition, provide a table of the 24 GCMs used for comparison/reproducibility purposes (including the number of realizations per GCM).
Details about downscaling / bias correction. The Methods state that the glacier models were forced by a single continuous historical (2000–2014) + SSP (2015–2100) simulation per GCM. Please provide details on any bias correction/downscaling applied to GCM variables before feeding glacier models and computing SPEI.
Influence of precipitation factors. Several glacier models apply precipitation correction factors to compensate for known underestimation biases in precipitation datasets. While such factors primarily affect mass balance calibration, they can significantly influence the magnitude of simulated runoff. Consequently, these corrections may amplify the apparent glacier buffering capacity. It was not entirely clear from the manuscript whether such precipitation correction factors were applied within the glacier model simulations used here, and if so, whether they differ across the three models. I recommend that the authors clarify this aspect in the Methods section. If corrections were applied, please specify their magnitude. If not, a brief discussion on the potential implications of uncorrected precipitation biases for the glacier runoff component would strengthen the paper.
Glacial runoff definition. In L90, glacial runoff is defined as the sum of ice and rain runoff from glacierized areas. However, because the three glacier models use fixed-gauge water runoff, this definition likely also includes seasonal snowmelt. This is critical, as the study’s core argument attributes all runoff from glacierized regions to glacier change, while part of it may originate from seasonal snowmelt. This distinction has been highlighted in recent discussions (Gascoin, 2024), stressing the need for precise terminology when describing “glacial runoff.” If possible, the analysis should isolate ice melt to maintain conceptual consistency with drought buffering through glacier mass loss. Otherwise, the authors should clearly acknowledge this limitation and the potential overestimation of glacier influence.
Use of multiple realizations per GCM. To my understanding, this is the first study to use such a large archive of future simulations (>100), incorporating multiple models and realizations. This is a valuable and novel aspect, as most studies rely on single realizations per GCM. While perhaps beyond the current scope (L127–128), it would be interesting to include some discussion on how representative ensembles with only one realization per GCM compare to this larger multi-realization approach. Providing brief insights on this point would offer a useful lesson for modelers and practitioners regarding ensemble design.
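One way to approach the ensemble-design question raised above is to resample one realization per GCM and compare the resulting range with that of the full archive. The sketch below is only an illustration under stated assumptions: the GCM labels, the per-realization buffering values, and the choice of the min-max range are hypothetical and do not come from the manuscript.

```python
import random

# Hypothetical buffering values per realization, grouped by GCM (invented).
ensemble = {
    "gcm_a": [0.10, 0.18, 0.07],
    "gcm_b": [0.30, 0.41],
    "gcm_c": [0.52, 0.47, 0.60, 0.44],
}


def full_range(ens):
    """Min-max range over every realization of every GCM."""
    values = [v for runs in ens.values() for v in runs]
    return max(values) - min(values)


def one_per_gcm_ranges(ens, n_draws=1000, seed=0):
    """Ranges obtained when a single realization is drawn at random per GCM."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n_draws):
        picks = [rng.choice(runs) for runs in ens.values()]
        ranges.append(max(picks) - min(picks))
    return ranges


sampled = one_per_gcm_ranges(ensemble)
print(f"full multi-realization range: {full_range(ensemble):.2f}")
print(f"mean one-per-GCM range: {sum(sampled) / len(sampled):.2f}")
```

By construction a one-per-GCM subsample can never exceed the full range, so the gap between the two numbers gives a rough sense of how much spread single-realization designs leave unsampled.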
Specific Comments
Title and abstract
Title: Consider adding “CMIP6” to make the forcing context explicit.

Abstract: Good summary. Suggest explicitly stating the number of GCMs (11 vs. 24) and glacier models (3). Please also consider briefly mentioning the scenario used.

Abstract: “…likely under-sampled.” Under-sampled compared to what? The reader currently needs to read the Methods to understand this.

Section 1 (Introduction)
L41: Clarify that the initial increase is only expected in some cases, as the “peak water” has already been reached in many regions.

Section 2 (Methods)
Study areas: This section reads more like a description of the figures rather than an introduction to the study areas. Include more information about the basins themselves, such as their geographic distribution, climatic or hydrological diversity, and relevance to the research question.

Study areas: Please briefly justify the >3000 km² and >30 km² thresholds.

Glacier model descriptions: Include a concise summary (or supplementary table) outlining key information for each glacier model, such as version used, variables used, main structural differences, and calibration approach. While the manuscript references Wimberly et al. (2025), providing this overview here would greatly improve readability and help interpret potential differences in the results.

Historical climate: Does the single continuous historical run correspond to the ‘historical’ experiment from each GCM, or is it derived from a reanalysis dataset? Please clarify.

SPEI selection: The argument for using SPEI could be moderated to better reflect its scope. SPEI only accounts for precipitation and potential evapotranspiration, but does not explicitly represent other catchment hydrological processes.

SPEI computation: Potential evapotranspiration is calculated using the Penman-Monteith equation. Please indicate the variables used here rather than in Section 2.4 Model Spread.

SPEI time window: The authors use a 3-month SPEI and define droughts as SPEI ≤ −1. Please justify: (a) the choice of a 3-month accumulation window—why not 6 or 12 months, given that droughts typically develop over longer periods? (b) the use of the 1900–1979 period for standardization. Shouldn’t this align with the period used for historical glacier simulations or with the baseline applied in bias correction of the climate projections?
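
For readers unfamiliar with the two choices questioned here, a simplified sketch of an SPEI-like index follows. It is not the authors' implementation. SPEI is conventionally computed by fitting a log-logistic distribution per calendar month; this sketch swaps in an empirical rank-based standardization and ignores seasonality to stay short. The function names and the `window` and `threshold` parameters are hypothetical.

```python
from statistics import NormalDist


def rolling_sum(series, window=3):
    """Accumulate each value with the (window - 1) values preceding it."""
    return [
        sum(series[i - window + 1 : i + 1])
        for i in range(window - 1, len(series))
    ]


def standardized_index(balance, reference):
    """Convert values to standard-normal scores via their rank in `reference`.

    `reference` plays the role of the standardization period; the empirical
    plotting position replaces the usual log-logistic fit (an assumption).
    """
    ref = sorted(reference)
    n = len(ref)
    std_normal = NormalDist()
    scores = []
    for x in balance:
        rank = sum(1 for r in ref if r <= x)
        p = (rank + 0.5) / (n + 1)  # stays strictly inside (0, 1)
        scores.append(std_normal.inv_cdf(p))
    return scores


def drought_months(scores, threshold=-1.0):
    """Indices of time steps at or below the drought threshold."""
    return [i for i, s in enumerate(scores) if s <= threshold]
```

Under this sketch, the accumulation window only changes the `window` argument, while the standardization period only changes which values are passed as `reference`, so both sensitivities can be tested independently. A glacier-inclusive variant would add glacier runoff, expressed in the same units as precipitation minus potential evapotranspiration, to the water balance before accumulation; that unit conversion is an assumption and is not shown.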

Glacier area for SPEI: Please clarify how glacier area is treated. By “initial glacierized area of the basin,” do you refer to the year 2000, 2014, or the year reported in the RGI dataset?

Section 3 (Results)
Figures 2 vs. 3: Please check that the color schemes for the different models are consistent. Also, use consistent names/units for glacial drought buffering.

Section 3.1: “Expressed in terms of reduced number of droughts”—this is only shown in terms of severity in Figure 2.

Figure 2: Why is it Δ2 in the axis label?

Figures 2–3: Please remind the reader that the values correspond to the difference between SPEI_N and SPEI_g. Also, indicate that the whiskers correspond to the min–max GCM range.

Figures 4–5: Consider replacing points/circles with a bar plot, adding whiskers to represent the spread among GCMs. Colors could indicate glacier area (as in Fig. 1) or mean annual precipitation/temperature. Combining panels a and b to extend horizontal space may help, while panel c could show aggregated results across all basins.

Sentences like “results demonstrate conclusively that inter-GCM spread outweighs inter-glacier-model spread” should be tempered to reflect the conditional nature of the findings, acknowledging potential limitations (parameter or initialization uncertainty, calibration choices, data sources, etc.).

Year ranges: Ensure consistency—some text uses 2080–2100, some captions 2081–2100.

Figure 5: Please add “relative to 2000–2020” as in Figure 4 (axis label).

Section 4–5
L205–206: I don’t fully agree with this statement. For hydrological applications, absolute runoff is important for potential model coupling. Therefore, differences among glacier models may be directly relevant, especially when no normalization is applied, as is the case for many metrics.

Discussion: The section is thoughtful and balanced. Please add a short, practical recommendation for users. For example, if only one glacier model can be used (very common), which steps should be prioritized—choosing a glacier model suited to the region, increasing the sample size of climate models, or selecting GCMs based on their skill in representing regional teleconnections?
Citation: https://doi.org/10.5194/egusphere-2025-3965-RC1
RC2: 'Comment on egusphere-2025-3965', Anonymous Referee #2, 04 Nov 2025

The authors investigate the global drought buffering capacity of glaciers throughout the 21st century as simulated by an ensemble of glacier evolution models (GEMs) and global circulation models (GCMs). Their goal is not to obtain actual drought buffering estimates, but rather to analyze how drought buffering estimates differ among different GEMs and GCMs, and which of these two model types represents the largest source of uncertainty (expressed as the width of the ensemble spread). Across 75 basins, 3 GEMs and 11 GCMs, they compute a 3-month drought index (SPEI) and define the drought buffering capacity as the difference in drought index between GCM projections with and without GEM coupling. Their conclusions are three-fold: 1) in accordance with previous work, drought buffering increases with increasing basin glacier fraction, 2) drought buffering estimates are much more sensitive to GCM choice than to GEM choice, and 3) the ensemble spread of drought buffering estimates would likely be larger had the authors expanded the GCM ensemble to include the full 112-member CMIP6 climate product. The paper is well-structured, well-written and concise.
The methodology and scope are similar to a previous study by some of the same authors (Ultee et al., 2022). In that study, they looked at a single GEM forced by 8 GCMs and analyzed the actual drought buffering capacity of glaciers worldwide. The current paper offers little methodological novelty relative to the previous one: it essentially repeats the previous study with more GEMs and GCMs and across a larger number of basins, and compares the results from the different model combinations. However, the topic and the results of the work are highly relevant for both the scientific community and the general public. While the previous study presented the community with evidence of the end-of-century drought buffering capacity of glaciers worldwide, the present paper shows the relative importance of GEMs and GCMs in making global drought projections, a highly relevant matter on which a large number of stakeholders rely on the scientific community. Despite the limited methodological novelty, I would therefore recommend publication after the following comments have been addressed.
Major comment (potentially)
The SPEI is computed using PET estimates, which are said to be computed using the Penman-Monteith equation (L83). It is not clear whether the computations are done by the authors themselves or whether they originate from each of the GCMs. If they are done by the authors, further justification of this particular PET model is necessary. Many different models exist, and their estimates can vary to the point that it is often recommended to use multiple PET models to account for their uncertainty (Vremec et al., 2024). If there is no solid evidence in the literature that the Penman-Monteith model can safely be used as a single, global PET model, then the paper would benefit greatly from a multi-model PET ensemble alongside the multi-model GEM and GCM ensembles. This would allow quantification of the uncertainty arising from PET model choice. If the uncertainty ends up being small, then a small addition to the discussion chapter could be sufficient.
Minor comments
A minor point of revision concerns the differences in glacier runoff between the different GEMs. The result that GEM uncertainty is much smaller than GCM uncertainty is presented as surprising because the authors knew the runoff differences between GEMs to be considerable. However, the reader is not presented with any evidence of this, except a reference to Wimberly et al. (2025). It would be good to give the reader at least an idea of how large these differences are.
A brief discussion on the choice of the SPEI as a drought index is missing. In case other drought indices exist, the reader might also wonder how the choice of a different drought index would change the results and the conclusions of the manuscript.
Specific comments
L2-4: I would argue there is inherent value in global projections (of glacier drought or other), independent of local-scale relevance. In any case, this fits better in the introduction than as the main motivation for the manuscript in the abstract.
L5: omit "latest generation", and potentially "CMIP6".
L6 "despite differences in absolute glacier runoff": the authors don't show this in the manuscript. I advise the authors to find a way to explain the results without mentioning this explicitly.
L18: isn't this the case for all classes of climate models?
L21: Not a good reference for this statement. See Hanus et al. (2024): "Large-scale hydrological models mostly lack glacier representation. Only one out of 16 models used to simulate water availability in the Inter-Sectoral Impact Model Intercomparison Project phase 2 (ISIMIP2) had some representation of mountain glaciers, namely CWatM (Telteu et al., 2021). However, even this model has a very simplistic glacier representation resembling a snow redistribution method to avoid snow accumulation (Burek et al., 2020; Telteu et al., 2021)."
L26: what about global scale analyses?
L28: replace "possible" with "feasible"
L32: specify "settings"
L33-35: link with previous sentences not sufficiently smooth
L44: SPEI requires better introduction
L45: add "state-of-the-art" in front of "climate models"
Figure 1: North American, Icelandic, Patagonian and New Zealand basins are not distinguishable, perhaps zoom-boxes would increase their visibility. Caption: Which Appendix?
L55-60: This paragraph does not benefit the storyline. These instructions can simply be stated in the figure captions.
L65: Explain which scenario this is
L69: questionable use of "we"
L76: there is no section 2.3.2, so there is perhaps no need for a sub-section
L82: Why the letter D?
L97-98: omit quotation marks
L98-L99: Unclear sentence, rephrase
L107: complicated sentence
L107-109: is there no parameter uncertainty in GCMs?
L138-139: elaborate on the theoretical understanding of hydrological trade-offs
Figure 2: increase figure width and apply a (semi-transparent) grid to both sub-figures. In the top figure it is difficult to see which points represent the same glacier
L193: unclear
L206: "that appear large to glaciologists" needs reference or elaboration
L216-217: "climate scenario" not the right terminology
L236: are there other global glacier models? They have not been mentioned in the manuscript as far as I am aware
L236: replace "most-up-to-date" with "state-of-the-art"
L240-241: That depends on which downstream impact studies. This conclusion is specific to drought buffering in large glacierized basins.

Hanus, S., Schuster, L., Burek, P., Maussion, F., Wada, Y., and Viviroli, D.: Coupling a large-scale glacier and hydrological model (OGGM v1.5.3 and CWatM V1.08) – towards an improved representation of mountain water resources in global assessments, Geosci. Model Dev., 17, 5123–5144, https://doi.org/10.5194/gmd-17-5123-2024, 2024.
Ultee, L., Coats, S., and Mackay, J.: Glacial runoff buffers droughts through the 21st century, Earth Syst Dynam, 13, 935–959, https://doi.org/10.5194/esd-13-935-2022, 2022.
Vremec, M., Collenteur, R. A., and Birk, S.: PyEt v1.3.1: a Python package for the estimation of potential evapotranspiration, Geosci. Model Dev., 17, 7083–7103, https://doi.org/10.5194/gmd-17-7083-2024, 2024.

Citation: https://doi.org/10.5194/egusphere-2025-3965-RC2

Viewed

Total article views: 1,709 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,629	61	19	1,709	35	43

Views and downloads (calculated since 08 Sep 2025)

Month	HTML	PDF	XML	Total
Sep 2025	1,477	17	8	1,502
Oct 2025	89	29	7	125
Nov 2025	54	13	4	71
Dec 2025	9	2	0	11

Latest update: 05 Dec 2025

Short summary

Runoff from glaciers can be an important water source in mountain regions. Global climate models used to understand future changes in the water cycle do not include glacier changes. We simulated glacier change in all available glacier models using information from global climate models as input. We found that for analysis of future drought, it is more important to understand the climate input than to use all available glacier models together.