the Creative Commons Attribution 4.0 License.
Learning to melt: Emulating Greenland surface melt from a polar RCM with machine learning
Abstract. Predicting surface melt on the Greenland ice sheet is critical for understanding surface mass balance (SMB) and sensitivity to climate change. Polar regional climate models are the primary tools for simulating melt and projecting future SMB, but different models produce significantly different results, and they are too computationally expensive to run the large ensembles needed to quantify this uncertainty. We develop a neural-network-based emulator that predicts daily surface melt from atmospheric variables, trained on output from the polar regional climate model HIRHAM5 and its firn model DMIHH forced by ERA-Interim reanalysis. The emulator uses a physics-informed design combining short-term weather patterns with long-term climate memory, capturing both immediate atmospheric forcing and accumulated firn characteristics. The emulator achieves mean absolute error below 0.23 mm w.e. per day across all six Greenland drainage basins, with the errors primarily attributable to spatial over-smoothing. Our work demonstrates that machine learning can successfully emulate firn model behavior from climate forcing alone with computational costs orders of magnitude lower than traditional simulations. Once retrained for specific climate forcings, the emulator thus enables extensive ensemble projections. Furthermore, the modular architecture can be readily adapted to emulate other SMB quantities such as runoff. This represents a crucial first step toward computationally efficient emulation of polar regional climate models and surrogate modeling of SMB components in Earth system modeling.
Competing interests: At least one of the (co-)authors is a member of the editorial board of The Cryosphere.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 10 Mar 2026)
-
RC1: 'Comment on egusphere-2026-7', Anonymous Referee #1, 02 Mar 2026
-
AC1: 'Reply on RC1', Elke Schlager, 05 Mar 2026
We thank the referee for the positive assessment of our study and for the constructive feedback. We believe that addressing these comments and revising the manuscript accordingly will improve the clarity of the paper.
The comments with our replies on how we plan to address each of them are provided in the attachment, with the referee comments in black, and our replies in blue.
-
RC2: 'Comment on egusphere-2026-7', Anonymous Referee #2, 02 Mar 2026
Overall, I think the authors have done a very good job. The manuscript is clearly written, the structure is logical, and the figures are generally of high quality. With some minor revisions, I believe the paper should be ready for publication. Please see the in-text comments for specific suggestions to improve clarity and to make certain figures more rigorous.
One point I would like to raise concerns the choice of using a single year (2016) as the test set. The manuscript does not provide a clear justification for this decision. Using only one test year raises the concern that the evaluation may depend on a particularly “lucky” year, or alternatively on a year with atypical behavior. In either case, it becomes difficult to convincingly demonstrate the model's generalisability.
As a reader without specific expertise in the Greenland Ice Sheet (GIS), I am unsure whether 2016 is representative of typical conditions or whether it may have experienced unusual or extreme events. From a scientific robustness perspective, it would strengthen the study to repeat the prediction-versus-observation scatter plots for one or two additional test years. This would allow the reader to assess whether, for example, the melt overestimation by the autoregressive and modular neural network models is a persistent feature or specific to 2016. If the authors intentionally selected 2016, I would encourage them to provide a clear justification. For example, an appendix figure showing the distribution of SMB over the GIS compared to other years could help demonstrate whether 2016 is representative or exceptional.
A similar concern applies to Figure 5, which focuses on 21 July 2016. While this date is interesting, showing only a single summer day risks giving the impression of a carefully selected (“lucky”) example. I would encourage the authors to include additional dates in the appendix, ideally covering different seasons, for example, shoulder seasons or winter periods when little or no melt is expected. This would provide a more comprehensive picture of model behavior across varying surface mass balance regimes.
In addition, it could be helpful to show aggregated diagnostics, such as spatial plots of the mean SMB over several months (or seasonal averages) for the best-performing model. Such analyses would provide stronger evidence that the model captures robust patterns rather than performing well on isolated dates.
Overall, I consider this a valuable contribution, but addressing these points would further strengthen the manuscript.
-
AC2: 'Reply on RC2', Elke Schlager, 05 Mar 2026
We thank the referee for the positive assessment of our study and for the helpful suggestions. We believe that including the requested information will significantly strengthen the manuscript.
The comments with our replies on how we plan to address each of the comments are provided in the attachment, with the referee comments in black, and our replies in blue.
-
EC1: 'Comment on egusphere-2026-7', Andrew Orr, 10 Mar 2026
This manuscript examines the use of a machine-learning based emulator of surface melt over the Greenland ice sheet, trained on output from a regional climate model and its firn model.
The manuscript is relatively well written, although it does feel rather underwhelming – perhaps due to the focus on the development of an emulator / machine learning method, and this not being followed up by using the emulator to investigate a pressing science question. A similar concern would be around the novelty of the work not being especially clear. For example, the Introduction mentions using several machine-learning approaches, including neural networks, yet never explains why the neural network approach used here was adopted, or how this work builds on existing work (including limitations). This would be my major concern #1.
I also feel the language could be a lot tighter in places, and also that there is occasionally some information missing that would make it much easier to read. There also seems to be something missing from the text to make it a ‘Cryosphere’ paper. For example, the Conclusions mentions ‘diverse climatic regimes of Greenland’ – but there has been little acknowledgement or explanation of this up to now, so it seems rather too little too late, and no references are given. This would be my major concern #2.
Additionally, I simply don’t see section 3 as a ‘Results and discussion’ section – it came across as simply explaining the results, with no discussion of them. A ‘discussion’ thoroughly interprets, analyses, and explains the significance of the study's findings in relation to the research question and existing literature. The Conclusion section actually contains some interesting discussion points, so I would suggest bolstering that instead, and also adding references.
I also have a few small points that I noticed on first reading:
- NN is defined twice in the Introduction,
- Introduction mentions XGBoost and Neural Networks, yet these may not be immediately familiar to a reader. A sentence explaining the basis for these approaches here may be useful.
- section 2.1 could be tightened, as for example it’s not immediately clear that the atmospheric forcing mentioned (first line of the second paragraph) is for the firn model, and also it’s not clear what time resolution the sub-daily SMB outputs have, which are subsequently aggregated to daily values (third paragraph),
- section 2.1 and elsewhere use subheadings as bold text such as ‘Data cleaning’. I have to say I really don’t like this approach, and find it rather lazy.
- given the importance of the atmospheric forcing used, I think some additional information on the HIRHAM5 model would be useful, and also on the appropriateness of using ERAI to force it. For example there is no mention of the spatial resolution. Albedo is mentioned later in the manuscript, yet there is no mention here of how this is computed by HIRHAM5.
- in section 2.1 there is no mention of how the firn model is spun-up.
- in section 2.1 it’s not clear what ‘symlog’ means, or what the variable x is.
- In section 2.1, the period 1980-1990 is 11 years, not 10. It might also be worthwhile explaining the differences between the training, validation, and test periods – as this is not intuitively obvious to someone not familiar with machine-learning. Also, the manuscript should include some justification for selecting these periods, such as why only a single year (2016) is used for the test period.
- Figure 1 caption typo. Schema -> Schematic.
- Section 2.2, ‘regressing the surface melt based on atmospheric variables’ is rather vague. What are these variables? How are they chosen? Presumably these are the predictor values?
- Is Eq. 1 missing an explanation of what N is?
- Section 2.3, typo ‘of the of’
- The captions could do with more information. For example, the caption for Fig. 2 does not explain the various labels in the figure (SW, SE, CE, etc.). The caption for Table 2 does not mention what R^2 is.
- Line 255 mentions ‘residuals’, but what this means is not defined – is it the difference between the actual and emulated values?
- Conclusion section uses ‘neural network’ and not NN.
- For the Conclusion section, I would recommend mentioning the appropriate figure or table when the results are being reiterated, so the reader is absolutely clear about the novelty of the work.
Andrew Orr
Citation: https://doi.org/10.5194/egusphere-2026-7-EC1
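On the editor's question about ‘symlog’ in section 2.1: the manuscript's exact definition is not reproduced on this page, so the following is only a minimal sketch of one common form of the symmetric logarithm, sign(x) · log(1 + |x|), written with NumPy. The assumption that x is a signed heat flux, and the sample values, are illustrative and not taken from the paper.

```python
import numpy as np

def symlog(x):
    """Symmetric log transform, sign(x) * log(1 + |x|).

    Assumed form; the manuscript may use a different variant or scaling.
    It compresses large magnitudes while keeping the sign and mapping 0 to 0.
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x))

def symlog_inverse(y):
    """Inverse of symlog, sign(y) * (exp(|y|) - 1)."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.expm1(np.abs(y))

# Hypothetical signed flux values (illustrative only, not from the paper).
flux = np.array([-250.0, -3.0, 0.0, 3.0, 250.0])
scaled = symlog(flux)
restored = symlog_inverse(scaled)
```

A transform of this kind is typically chosen for heavy-tailed variables that take both signs, where a plain logarithm is undefined for non-positive values.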
Data sets
Output of Learning to melt: Emulating Greenland surface melt from a polar RCM with machine learning Elke Schlager https://doi.org/10.5281/zenodo.17913228
Model code and software
MeltEmulation Elke Schlager https://github.com/eschlager/MeltEmulation
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 315 | 116 | 21 | 452 | 17 | 30 |
Review of “Learning to melt: Emulating Greenland surface melt from a polar RCM with machine learning” by Elke Schlager et al.
The Cryosphere (TC): egusphere-2026-7
General comments
This paper introduces a newly developed neural network-based emulator that predicts the temporal evolution of Greenland ice sheet surface melt. The emulator was trained on the output from the polar regional climate model HIRHAM5 and its firn model DMIHH, forced by the ERA-Interim reanalysis. It is clearly shown that the Modular NN configuration of the emulator, the standard setting developed in this study, can provide realistic information on the spatiotemporal evolution of ice-sheet surface melt, along with the daily melt amount. My impression is that this is a unique study that can provide useful information on the synergy between machine learning and cryosphere science. Although I think the information provided, in particular on the methods, can be improved, the results and discussion sound reasonable and sufficient to me. Therefore, I suggest that this paper can be published after revisions. I list some specific comments below.
Specific comments
L. 9 “mean absolute error below 0.23 mm w.e.”: Compared to what? What is the reference data for this comparison? Please explain.
L. 45 ~ 58: It is worth reviewing and citing the paper by Hu et al. (https://doi.org/10.5194/tc-15-5639-2021) in this part.
L. 59 “high temporal variability”: Can the authors explain this point quantitatively and add a reference for this argument if possible?
L. 60 “temporal context”: I don’t think this technical term is widely recognized in the cryosphere community. Can the authors introduce additional explanations about the term so that more readers can easily understand?
L. 60 “While the models predicting annual ~”: Does “the models” refer to ML emulators or to RCMs? Please clarify.
L. 63: What do the authors mean by “lag effects”? Please explain in more detail.
L. 67: What do the authors mean by “model generalization”? Please explain.
L. 73 “Our model can be re-trained on data for future scenarios ~”: If the NN will be used for the future simulations of the ice sheet surface melt, do the authors have to train the NN using the output from the future climate simulations by an RCM such as HIRHAM5? Please explain more explicitly.
L. 78 ~ 79: Please explain all the properties included in the daily output of the polar RCM HIRHAM5 with its firn model DMIHH.
L. 79: What is the total snow and ice model layer thickness that DMIHH considers with the 32 model layers?
L. 84: It is better to explain how bare ice is determined in the DMIHH model.
L. 85: Atmospheric forcing for what? For DMIHH? Or for the newly developed emulator? Please clarify. In addition, please list all the properties included in the atmospheric forcing.
L. 90: It is unclear what the “input data” are. Input data for DMIHH? Or input data for the emulator?
L. 97 “they can be problematic when training ML models.”: Please explain the reason for this argument in more detail.
L. 107: Does a negative sensible heat flux mean that heat is directed from the ice sheet surface to the atmosphere, or the opposite? Please explain.
L. 129: Why is the number 5000 selected here? A more detailed explanation is needed.
L. 178 “we choose the hidden layers of the network to be 64-128-128-64-32-16-16”: Please explain the meanings of each number, in particular for non-specialists in NN.
L. 182, L. 184, and L. 185: Same as the comment on L. 178.
L. 188: Please explain in more detail about “LeakyReLU activation function.”
L. 193 “the optimal number of days to be used in the short-term module”: What do the authors mean by “optimal”? Please explain in more detail.
L. 215~216 “the total computational cost remains far lower than physical firn models”: Can the authors add quantitative information for this explanation? I think such information is useful for other emulator developers.
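For readers wondering about the comments on L. 178 and L. 188: a specification such as “64-128-128-64-32-16-16” is conventionally read as the number of units in each successive hidden layer of a fully connected network, and LeakyReLU is the activation max(z, 0) + slope · min(z, 0). The sketch below illustrates that reading with a plain NumPy forward pass. The input size (10 predictors), the single output, the slope of 0.01, and the random initialisation are assumptions for illustration, not the authors' configuration.

```python
import numpy as np

# Units per hidden layer, as quoted in the referee comment on L. 178.
HIDDEN = [64, 128, 128, 64, 32, 16, 16]

def leaky_relu(z, slope=0.01):
    """LeakyReLU: identity for z > 0, small linear slope otherwise (slope assumed)."""
    return np.where(z > 0, z, slope * z)

def init_params(n_inputs, hidden=HIDDEN, n_outputs=1, seed=0):
    """Create one (weights, bias) pair per layer of a fully connected network."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs, *hidden, n_outputs]
    return [
        (rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def forward(params, x):
    """Apply each hidden layer with LeakyReLU, then a linear output layer."""
    h = x
    for w, b in params[:-1]:
        h = leaky_relu(h @ w + b)
    w, b = params[-1]
    return h @ w + b

params = init_params(n_inputs=10)   # 10 predictors is a hypothetical choice
batch = np.zeros((4, 10))           # 4 dummy samples
melt = forward(params, batch)       # shape (4, 1): one predicted value per sample
n_weights = sum(w.size + b.size for w, b in params)
```

Under this reading, each number controls the width of one layer, and the total parameter count (`n_weights`) follows directly from the layer sizes.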
Technical corrections
L. 89: It is better to add something like “within DMIHH” at the end of this sentence.
L. 111: It is better to add the mathematical symbol “x” after “heat flux values.”
Table 1 caption: Please add “Autoreg” after “the autoregressive element.”
Figure 3 caption: It is better to explain the numbers in Gt listed in each panel.
L. 319: Suggest adding “surface” before “atmospheric variables.”
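Several comments above ask what the mean absolute error is measured against (L. 9), what R^2 denotes (Table 2), and what ‘residuals’ means (Line 255). As a hedged illustration only, the sketch below uses the standard definitions, assuming the reference is the melt simulated by HIRHAM5/DMIHH and that a residual is the emulated value minus the reference. The sign convention and the sample numbers are assumptions, not values from the manuscript.

```python
import numpy as np

def mean_absolute_error(reference, emulated):
    """Mean of |emulated - reference| over all samples."""
    reference = np.asarray(reference, dtype=float)
    emulated = np.asarray(emulated, dtype=float)
    return float(np.mean(np.abs(emulated - reference)))

def r_squared(reference, emulated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    reference = np.asarray(reference, dtype=float)
    emulated = np.asarray(emulated, dtype=float)
    ss_res = np.sum((reference - emulated) ** 2)
    ss_tot = np.sum((reference - reference.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up daily melt values in mm w.e. (illustrative only).
reference = np.array([0.0, 1.0, 2.0, 3.0])   # stands in for RCM/firn-model melt
emulated = np.array([0.0, 1.5, 2.0, 2.5])    # stands in for emulator output
residuals = emulated - reference             # assumed sign convention
mae = mean_absolute_error(reference, emulated)
r2 = r_squared(reference, emulated)
```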