DCG-MIP: The Debris-Covered Glacier melt Model Intercomparison exPeriment
Abstract. In a warming world, the scientific community has dedicated increasing attention to debris-covered glaciers and their response to climate. A variety of models with distinct complexity and data requirements have been developed and widely used to simulate melt under debris at different sites and scales, but their skill has never been compared. As part of the activities of the International Association of Cryospheric Sciences (IACS) Debris Covered Glacier Working Group, we present an intercomparison exercise aimed at advancing our understanding of model skill in simulating ice melt under a debris layer. We compare 14 models of differing complexity at nine sites in the European Alps, Caucasus, Chilean Andes, Nepalese Himalaya and the Southern Alps of New Zealand, over one melt season. We run the models with meteorological data measured at automatic weather stations and with estimated or measured debris properties. We consider four main model categories: i) energy balance models, which calculate melt by solving the physics of heat transfer through the debris layer but require a large amount of input data; ii) a simplified energy balance model; iii) an enhanced temperature-index model; and iv) simple empirical temperature-index models, which have been used extensively given their low data requirements but need calibration of their empirical parameters. Model performance is evaluated against on-site measurements of sub-debris melt (for all models) and surface temperature (for models based on the surface energy balance). Our results show that physically based energy balance models and empirical temperature-index models perform in distinct ways. At one end of the spectrum, simple temperature-index models are accurate when recalibrated or when using site-specific literature parameters, but perform poorly when their parameters are uncalibrated. At the other end, energy balance models show a range of performance: the most accurate are those with the highest degree of complexity at the atmosphere-debris interface. An important data gap emerged from our experiment: the poor performance of all models at three sites was related to poor knowledge of debris properties, specifically thermal conductivity. Future work should focus on: i) consistent data acquisition to evaluate existing models and support new model developments; and ii) advancing models by accounting for processes such as debris-snow interactions, moisture in the debris, and refreezing. We suggest that a systematic effort of model development using a common model framework could be carried out in phase II of the Working Group.
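To make the contrast between these model categories concrete, the sketch below implements the two end members in minimal form. This is illustrative only: the function names, the steady-state conduction assumption, and all parameter values (thermal conductivity, degree-day factor) are assumptions for exposition, not code or values from the study.

```python
WATER_DENSITY = 1000.0       # kg m-3
LATENT_HEAT_FUSION = 3.34e5  # J kg-1

def energy_balance_melt(surface_temp_c, debris_thickness_m, conductivity_w_mk=1.0):
    """Energy-balance end member (steady state): heat conducted through the
    debris layer to ice at 0 degC drives melt. Returns melt in m w.e. s-1.
    Debris thermal conductivity is the key, often poorly known, property."""
    if surface_temp_c <= 0.0 or debris_thickness_m <= 0.0:
        return 0.0
    conductive_flux = conductivity_w_mk * surface_temp_c / debris_thickness_m  # W m-2
    return conductive_flux / (WATER_DENSITY * LATENT_HEAT_FUSION)

def temperature_index_melt(air_temp_c, ddf=0.003, threshold_c=0.0):
    """Temperature-index end member: melt (m w.e. day-1) scales with positive
    air temperature via an empirical degree-day factor (ddf) that must be
    calibrated for each site and debris thickness."""
    return ddf * max(air_temp_c - threshold_c, 0.0)

# Illustrative comparison for one day: a 15 degC debris surface over 0.3 m of
# debris, versus a 10 degC air temperature with an assumed sub-debris ddf.
print(energy_balance_melt(15.0, 0.3) * 86400.0)  # ~0.013 m w.e. day-1
print(temperature_index_melt(10.0))              # 0.030 m w.e. day-1
```

Note how the energy-balance estimate scales linearly with the assumed thermal conductivity, which is one way to read the data-gap finding above: an error in that single debris property propagates directly into the simulated melt.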
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3837', Anonymous Referee #1, 18 Sep 2025
- RC2: 'Comment on egusphere-2025-3837', Duncan J. Quincey, 30 Sep 2025
This is a comprehensive analysis of the available debris-covered glacier melt models and their performance across a range of sites. It represents a considerable effort on behalf of the DCG community and despite the complexity of the experimental approach, the paper is relatively easy to follow and the key points are simple to digest. The summary figures are well-constructed and convey a clear message, and the transparency in describing the limitations of the approach is particularly appreciated. I have few substantial comments overall, and I am in support of the paper being published, but I would like to raise one issue (see comments under ‘Table 3’ below) that may require some consideration and might change some of the interpretation of which models are the ‘best’ performers, if it is addressed.
General comments
Throughout, I find the bold lettering in the main text to be distracting. In some places it doesn’t highlight anything that seems to be particularly important – maybe just let the reader make their own judgement, as in most other papers?
L115: it is unclear here whether the phrase ‘where most previous research has been carried out’ refers to the European Alps, or to the remote sites outside the European Alps – it can be interpreted both ways.
Figure 1: At first look the colour bar seems to relate to the shading on each of the inset glaciers, rather than to the data in the map. I think it’s because the inset glaciers and the colour bar are the most distinctive elements (with the map data being more subtle). Maybe shrinking the colour bar, or having a more complete legend that includes the two glacier shades as additional items, rather than describing them in the caption, would make it clearer? The darkest colours in the 1 x 1 degree cells (i.e. those approaching black) also seem to be beyond the minimum value in the colour bar?
Table 2: Kayasha = Kayastha?
Table 3: I have a doubt about using the median of a range of signed values to evaluate how well a model is performing, since the positive and negative values effectively cancel each other out such that the median tends to zero even where a model is performing poorly. The median therefore tells us only whether the model is biased towards under- or over-prediction, rather than anything about the accuracy. It seems to make more sense to treat the values as absolute, and then take the median of those. Or better still, use a magnitude-based metric like MAE or RMSE – which would also give consistency with the analysis of surface temperatures that follows. If you do this re-calculation, the ranking changes, and this will feed through into some of the interpretations and might impact the discussion as well.
Consider shading the cells in Table 3 according to how close to zero they are? I think that would help the reader to more quickly identify the patterns across models and across sites.
Line 504: this value of 4.3% is misleading for the same reason given above. Indeed, the models rarely perform as well as this (Table 3); the uninitiated reader could easily read the paper and think the models overall perform exceptionally well, when the truth is somewhat different. If this is instead calculated using absolute values, the median error is 23.3%, which I would suggest is a much more realistic estimation of the overall model performances.
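The cancellation effect described in the two comments above is easy to reproduce with a toy example; the error values below are invented purely for illustration:

```python
import statistics

# Hypothetical signed model errors (%) across sites, invented for illustration.
errors = [-40.0, -25.0, 4.0, 30.0, 45.0]

print(statistics.median(errors))                    # 4.0  -> looks accurate
print(statistics.median(abs(e) for e in errors))    # 30.0 -> median |error|
print(sum(abs(e) for e in errors) / len(errors))    # 28.8 -> MAE
```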
Figure 9: It’s interesting that the error exceeds the uncertainty in the majority of cases. What is the uncertainty analysis missing here? Does it come down to the factors discussed in Section 6.2, or is it something more systematic? Maybe you can just add a line or two about that here?
L633-647: this seems like a key point – why not run all of the models with and without these modifications given that they were ‘known’ to be incorrect, at least in terms of the debris thickness? I appreciate that would be a pain, but it doesn’t seem right to have knowingly run the models with incorrect values…?
L940: One clear outcome from the analysis is that TI models perform surprisingly well – indeed four out of the top five models are temperature index based (Table 4). I wonder if you can highlight this a bit more clearly in the conclusions? L941 states they perform ‘very well’ once calibrated, but then goes on to point out that they don’t perform well if not calibrated. I don’t think this does them justice, given Figure 7 very clearly shows that the calibrated versions of the TI models comfortably outperform the EB models in the scenarios tested! I think it’s worth flagging this up more explicitly to the reader here.
L944: On a similar point, it’s clear that the EB models can quite seriously overestimate melt, but the second bullet has quite a positive spin – highlighting that increased complexity improves process-based representation at the debris-atmosphere interface. That may be so but do the results not also suggest that increased complexity can lead to overestimation? The bullet point might just be better balanced by also clearly stating where they are also deficient.
Just to note that the data and simulations are not currently publicly accessible – I presume they will be made so if/when the paper is published.
Consider also giving the links to the model codes (rather than just their references)?
Citation: https://doi.org/10.5194/egusphere-2025-3837-RC2
This is an eagerly awaited report of results from the debris-covered glacier intercomparison project. It has a good geographical range and number of participating models, but is limited to only one melt season per site. This does not allow spin-up of debris and ice temperatures, or division into calibration and validation periods. Repeating the experiments with more years of data would be beyond the scope of feasible revisions for this paper, but the reasons for and implications of this restriction should be discussed in more detail.
L134: For annual mass balance simulations, temperature-index models also require precipitation as input.
Table 1: Position to four significant figures locates the glaciers to within about 10 m.
Table 2: With net solar radiation used as an input, why is KO2 not classified as an enhanced temperature-index model?
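For reference, a widely used enhanced temperature-index formulation (the kind of model this comment alludes to; not necessarily the exact form of KO2) adds a net-shortwave term to the temperature term:

$$
M =
\begin{cases}
TF\,T + SRF\,(1-\alpha)\,I, & T > T_T \\
0, & T \le T_T
\end{cases}
$$

where $T$ is air temperature, $I$ incoming shortwave radiation, $\alpha$ albedo (so $(1-\alpha)I$ is net solar radiation), and $TF$, $SRF$ and $T_T$ are empirical parameters.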
Figure 4: Notation for radiation fluxes differs from Equation 1.
L272: Elsewhere it is stated that net shortwave radiation is given, not calculated.
L282: Relative humidity is a property of air. Are the assumptions rather on the wetness of the debris surface?
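For context, the standard bulk-aerodynamic form of the latent heat flux shows why surface wetness, rather than air humidity alone, is the operative assumption (a textbook formulation, not taken from the paper):

$$
Q_L = \rho_a\, L_v\, C_E\, u\, \big(q_a - q_s(T_s)\big),
$$

where $\rho_a$ is air density, $L_v$ the latent heat of vaporization, $C_E$ a bulk transfer coefficient, $u$ wind speed, and $q_a$ the specific humidity of the air; taking the surface humidity $q_s$ at saturation for the surface temperature $T_s$ amounts to assuming a wet debris surface.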
Figure 5: (a) and (b) labels are missing from the figure.
L458-474: With this much discussion of Figure S3, it would be better to include it in the paper.
L577: Djankuat is just Fig. 10c.
Figure 11: A reminder that fluxes are negative when away from the surface would be useful in the figure caption.
L795: The "uncalibrated" version of Hyper-fit with parameters for the same glacier but a different time period would be regarded as a calibrated model in any other study.
I have not checked the reference list in detail, but the authors should. Kuzmin (1961) at least is missing.
Table S1: Although the models are not required to calculate albedo, it would be interesting to know the measured debris albedo for each site.
The paper is well written, with few errors that I noticed:
L73: “which has has”
L821: “This suggests suggests”