A Deep Learning Approach for Lake Ice Cover Forecasting

Johnston, Samuel J.; Murfitt, Justin; Duguay, Claude

doi:10.5194/egusphere-2025-5576

24 Feb 2026

Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Abstract. Lakes cover a significant proportion of the high-latitude landscape and exert a strong influence on local weather and climate. Their seasonal lake ice cover (LIC) further impacts lake-atmosphere interactions, while also providing key socioeconomic services for northern communities. Climate change is impacting LIC and its thickness, two thematic products of Lakes as an Essential Climate Variable (ECV). Accurate prediction of LIC improves numerical weather prediction (e.g. lake-effect snowfall and thermal moderation) and is crucial for anticipating the impacts of climate change in lake-rich regions of the Northern Hemisphere.

This paper introduces LIF-DL (Lake Ice Forecasting using Deep Learning), a novel data-driven model for forecasting LIC extent across entire lake surfaces. LIF-DL uses Spatial-Temporal Transformer Networks (STTN) to capture relationships between lake conditions (ice and open water), lake depth and atmospheric forcings. The study focuses on five large Canadian lakes with pronounced ice phenology: Great Slave Lake, Great Bear Lake, Lake Winnipeg, Lake Athabasca, and Reindeer Lake. Data sources included ice cover observations from the Interactive Multi-Sensor Snow and Ice Monitoring System (IMS), atmospheric reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF) 5th generation of European ReAnalysis (ERA5 and ERA5-Land), and Canadian Ice Service (CIS) records for external validation. To benchmark the proposed approach against a traditional physics-based model, the widely used Freshwater Lake (FLake) model embedded in ERA5 and ERA5-Land was employed. LIF-DL was trained to produce one-week forecasts using data from 2004–2017 and then deployed auto-regressively to predict ice cover during the 2018–2022 holdout period. Forecasts were evaluated against IMS and CIS observations and compared with those from FLake.
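The auto-regressive deployment described above can be illustrated with a minimal sketch. It is not the LIF-DL code: `one_week_model` is a hypothetical toy stand-in for a trained forecaster, and the array shapes, the single temperature forcing, and the 7-day horizon are assumptions made only for illustration. The point is the feedback loop, where each one-week forecast starts from the last day of the previous forecast rather than from an observation.

```python
import numpy as np

def one_week_model(ice_map, forcings_week):
    """Hypothetical stand-in for a trained forecaster (NOT the LIF-DL network).

    ice_map:       (H, W) ice probability on the last known day.
    forcings_week: (7, H, W) daily air temperature in deg C (assumed forcing).
    Returns (7, H, W) ice probabilities for the next seven days.
    """
    out = np.empty_like(forcings_week, dtype=float)
    state = ice_map.astype(float)
    for d in range(forcings_week.shape[0]):
        # Toy rule: ice grows below 0 deg C and melts above it.
        state = np.clip(state - 0.05 * forcings_week[d], 0.0, 1.0)
        out[d] = state
    return out

def rollout(model, initial_ice, forcings, horizon=7):
    """Chain fixed-horizon forecasts over the whole forcing record."""
    n_days = forcings.shape[0]
    state = initial_ice
    chunks = []
    for start in range(0, n_days - horizon + 1, horizon):
        pred = model(state, forcings[start:start + horizon])
        chunks.append(pred)
        state = pred[-1]  # feed the model's own output back in, no observations
    return np.concatenate(chunks, axis=0)

rng = np.random.default_rng(0)
forcings = rng.normal(loc=-5.0, scale=3.0, size=(28, 4, 4))  # four cold weeks
initial_ice = np.zeros((4, 4))                               # open water at start
forecast = rollout(one_week_model, initial_ice, forcings)    # (28, 4, 4)
```

Because errors in one week become the starting state of the next, stability over a multi-year rollout is a meaningful test of a model deployed this way.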

Across all evaluations—phenology timing, ice cover fraction, and spatial patterning—LIF-DL consistently outperformed FLake. Freeze-up and break-up events were predicted within 3–9 days of observations (versus 5–22 days for FLake), and ice cover fraction (range 0–1) root mean squared errors were reduced (0.06–0.16 versus 0.1–0.2). A key advantage of LIF-DL was its capacity to represent spatial dependencies across lake surfaces, producing coherent freeze-up and break-up dynamics and realistic spatial clustering of early and late ice timing compared to the fragmented patterns of FLake. These improvements reduced extreme timing biases—from as much as 30 days to only 4–6 days—particularly for large, deep lakes. Variable importance analysis indicated sensitivity to physically meaningful drivers, including air temperature, accumulated degree days, solar radiation, and lake depth, suggesting that LIF-DL learned relevant physical processes rather than statistical artifacts. Finally, the model maintained stable performance when iteratively forecasting over a four-year period, demonstrating robustness under varying atmospheric conditions.
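As a rough illustration of the two headline metrics, the sketch below computes an ice cover fraction time series, its root mean squared error, and a timing error in days. The binary maps are synthetic, and the rule used for the freeze-up date (first day with at least 90 % ice cover) is a hypothetical choice for this example; the manuscript's own phenology definitions are not given on this page.

```python
import numpy as np

def ice_cover_fraction(ice_maps, lake_mask):
    """Fraction of lake pixels that are ice on each day.

    ice_maps:  (T, H, W) array of 0 (open water) / 1 (ice).
    lake_mask: (H, W) boolean array, True on lake pixels.
    """
    lake_pixels = ice_maps[:, lake_mask]  # (T, n_lake_pixels)
    return lake_pixels.mean(axis=1)

def rmse(pred, obs):
    """Root mean squared error between two equal-length series."""
    diff = np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def first_day_at_or_above(fraction, threshold=0.9):
    """Index of the first day with fraction >= threshold, or None.
    The 0.9 threshold is an assumption for this sketch."""
    hits = np.flatnonzero(np.asarray(fraction) >= threshold)
    return int(hits[0]) if hits.size else None

def make_maps(counts, lake_mask):
    """Synthetic maps with counts[t] ice pixels on day t (test data only)."""
    maps = np.zeros((len(counts),) + lake_mask.shape, dtype=int)
    rows, cols = np.nonzero(lake_mask)
    for t, k in enumerate(counts):
        maps[t, rows[:k], cols[:k]] = 1
    return maps

lake_mask = np.array([[True, True], [True, False]])  # three lake pixels
obs_maps = make_maps([0, 1, 2, 3, 3], lake_mask)     # "observed" freeze-up
pred_maps = make_maps([0, 0, 1, 2, 3], lake_mask)    # forecast lags by a day

obs_fraction = ice_cover_fraction(obs_maps, lake_mask)
pred_fraction = ice_cover_fraction(pred_maps, lake_mask)
fraction_rmse = rmse(pred_fraction, obs_fraction)

obs_day = first_day_at_or_above(obs_fraction)
pred_day = first_day_at_or_above(pred_fraction)
timing_error_days = abs(pred_day - obs_day)
```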

The demonstrated accuracy, robustness, and physical interpretability of LIF-DL highlight the potential of deep learning for advancing lake ice modelling. Future research should focus on integrating physical constraints to develop hybrid physics-machine learning frameworks, improving model interpretability, and expanding to new predictive variables such as ice thickness and snow cover. Leveraging emerging high-resolution satellite datasets will further enhance spatial fidelity and enable application to smaller lakes. Ultimately, spatiotemporal deep learning represents a transformative step toward next-generation, spatially resolved lake ice forecasts that can improve weather and climate prediction, inform northern transportation planning, and support climate change adaptation in lake-rich regions of the Northern Hemisphere.

Received: 11 Nov 2025 – Discussion started: 24 Feb 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 01 May 2026)

RC1: 'Comment on egusphere-2025-5576', Anonymous Referee #1, 18 Mar 2026
The manuscript addresses an important modelling problem concerning the cryosphere with a novel spatial machine learning approach. Generally, the manuscript is well written and well organised, and it is particularly strong in its evaluation, the interpretation of results, and the generation of insights. The scientific context is sufficiently laid out and limitations are carefully addressed throughout the manuscript. Data and code are provided to a detailed extent. Nonetheless, there are two major concerns, primarily regarding the clarity in introducing the machine learning model, and the framing of this work as a forecasting system:

Formal definition of model, inputs and outputs: Section 2.4 and the sections that follow, which introduce the model and its evaluation, lack clarity. No notation is used to introduce the inputs and outputs of the model formally and mathematically. This makes it impossible to grasp the modelling pipeline and should be revised. I suggest introducing variable names and spatio-temporal indices, together with a formula that formalises the input-to-output mapping. Without this formal introduction, the sequence-to-sequence nature of the model and the spatio-temporal attributes of the chosen architecture are difficult to follow even for an attentive reader. Figures 3 to 5 should be revised as well: they should contain notation and also connect the various model components.

Forecasting given future reanalysis data: The second major concern is that this work frames the proposed task as forecasting. However, it uses future ERA data, which in real life would not be available beforehand, to forecast future LIC. This set-up does not emulate a realistic forecasting scenario, so it remains questionable whether the study can claim to evaluate the ability of LIF-DL to forecast LIC. Actual forecasts can only rely on forecasted (rather than reanalysis/proxy ground truth) atmospheric forcings, and such forecasts themselves contain more uncertainty than ERA data. Furthermore, because of the first concern and missing descriptions, it is not necessarily clear that future forcings are the critical input.

These two concerns should be addressed by a revised manuscript.
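One way the requested formalisation could look is sketched below. The symbols, tensor dimensions, and window length are illustrative assumptions rather than the authors' notation, and the forcing window is written as extending over the forecast horizon only because that is how the referee reads the manuscript.

```latex
% Illustrative notation only; symbols and window lengths are assumptions.
% y_t    : lake phase on day t (1 = ice, 0 = open water) on an H x W grid
% a_t    : C atmospheric forcing channels on day t
% d      : static lake depth
% k, h   : input window length and forecast horizon (h = 7 for one week)
% \Omega : set of lake pixels
\begin{align}
  y_t &\in \{0,1\}^{H \times W}, \qquad
  \mathbf{a}_t \in \mathbb{R}^{C \times H \times W}, \qquad
  d \in \mathbb{R}^{H \times W}, \\
  \hat{y}_{t+1:t+h} &= f_\theta\!\left( y_{t-k+1:t},\; \mathbf{a}_{t-k+1:t+h},\; d \right), \\
  \mathrm{ICF}_t &= \frac{1}{|\Omega|} \sum_{(i,j) \in \Omega} \hat{y}_{t,i,j}.
\end{align}
```

Written this way, the second concern becomes visible in the indices: the forcing argument runs to day t+h, so an operational forecast would have to replace the reanalysis values for days t+1 to t+h with forecast forcings. The last line also gives one possible definition of the one-dimensional ice cover fraction time series queried in the Line 352 comment.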

Below I present more minor line-wise comments by section:

Abstract:
Line 6: Lakes cover a significant proportion of the Northern, but not the Southern, high-latitude landscape. Insert "Northern" to make this distinction clear.
Line 8: Mentioning lake ice thickness this early on alongside LIC may lead readers to assume that lake ice thickness is also modelled in this paper. For the abstract I recommend focussing on the key variable modelled in this paper.
Line 9: The word "lakes" should not be capitalised here.
Line 9: Double "prediction": Potentially change the first "prediction" to "forecasting".
Line 15: Changing "lake conditions" to "lake phase" or "lake state" could help to avoid confusion about what conditions are modelled.
Line 15: I recommend changing the order to naming inputs first and outputs second, as such: "[…] to capture relationships between atmospheric forcings, lake depth, and lake phase (frozen or open water)."
Line 16: I suggest sticking to one order of naming the five lakes throughout the paper (e.g. the order used in Figure 1).
Line 21: Referring to "one-week-ahead" forecasts would be clearer here. Maybe also specify that the model makes daily LIC predictions at 4 km spatial resolution, and that forecasts are 1 to 7 days ahead. Referring to the task as a segmentation task would also add more clarity early on.
The abstract exceeds typical word count limits and should be shortened.
Introduction:
Line 51: Bracket "(freeze-up/break-up)" is not necessary here and harms reading flow. This is already explained in the latter part of the same sentence.
Line 52: Replace "stimulating" e.g. with "leading to".
Line 55: Potentially relate to similar trends observed in sea ice to provide wider scientific context.
Line 63: "Lakes" does not need to be capitalised.
Line 73: What composition variable is CLIMo predicting?
Line 73: I suggest replacing "more wholly" with something like "more comprehensively".
Line 74: Specify what aspect of the model is two-layer, and explain what type of model FLake is.
Line 85: I suggest referring to this as the "point-wise gridded application of one-dimensional lake models" to conform with the wider literature. "Multiple points" is not specific enough.
Line 90: No comma needed: "Data-driven deep learning approaches […]."
Line 91: Include additional citations, for example:
Rolnick, David, et al. "Tackling climate change with machine learning." ACM Computing Surveys (CSUR) 55.2 (2022): 1-96.
Reichstein, Markus, et al. "Deep learning and process understanding for data-driven Earth system science." Nature 566.7743 (2019): 195-204.

Line 92: The positioning of the subsentence ", such as Spatial-Temporal Transformer Networks […]", is not ideal, as it may rather suggest that STTNs are datasets.
Line 96: "Towards" reads oddly.
Line 99: I believe this should say "develop a deep learning model" rather than "develop a model using deep learning"?
Line 101: Be more specific about the adaptation of the pre-existing model: Was STTN extended?
Line 102: The text only just mentioned that the STTN was developed for video inpainting.
Data Sources:
Line 112: North Pole should be capitalised.
Line 115: Improve "applied persistence". Is temporal extrapolation used to fill data gaps? Or say "lake phase was assumed to remain unchanged (or stationary in time) when no data was available".
Line 120: Add a sentence to explain how these two "ground truth" datasets differ and foreshadow which one is used in this study. Also mention the spatial resolution of CIS.
Line 135: Potentially say "two additional temporally aggregated variables".
Line 138: From what I understand this is the sum of days with temperatures below/above 0, not the sum of temperatures. Table 1 also misspecifies this. Is this calculated for each calendar year or for 365 days following 1 August? Maybe add a sentence to convey the intent here ("freezing days since the last summer are accumulated…")
Line 164: Replace "also" with "additionally" to make clear that LIF-DL does not predict these.
Study Lakes:
Line 180: Figure 1: Order of the lake plots: Tile 3 is usually expected on the left (reading direction).
Line 181: Table 1: Left-align the text in Table 1 for readability (particularly the leftmost column). There is also an issue with the relative humidity row. Fix the AFDD and ATDD descriptions: some places suggest this is a number of days while others suggest it is a temperature.
Data preprocessing:
Line 195: Explain why nearest neighbour interpolation/regridding was chosen over e.g. bilinear interpolation.
Line 197: Why do we need one-hot encoding when "masked" is not part of the prediction task? Inference can just be run for test regions.
Line 208: "To provide additional testing, CIS records and FLake model predictions were used over the testing period." How else were these used?
Line 209: "For some of the lakes in this study, the CIS record divided the lake body into two sections. These separate records were combined and averaged to obtain a single ice cover observation for the entire lake." This is not clear to me. Why did various records exist for the same areas?
LIF-DL:
Line 215: Point to Figure 5 for the complete model visualisation.
Line 217: Specify that it produces a daily one-week-ahead forecast (or a 1- to 7-day-ahead forecast).
Line 217: "Parametrization" means something different in the context of machine learning. I suggest saying "forecast horizon" or "supervised learning set-up" to avoid confusion.
Line 221: See general comment on this point. Why would we assume to have access to ERA data for the future? This is reanalysis data, not forecast data.
Line 225: This schematic lacks clarity: The "first time step" gate is not very logical and inputs and outputs should have variable names, and formally introduced temporal indexing.
Line 231: At this point it is not clear at all that this is a sequence-to-sequence set-up. Formal notation (in addition to a slight hint in Figure 5) in the main text must be used to comprehensively introduce inputs, outputs, their dimensionalities and time indices. This is a main weakness of the manuscript.
Line 235: Figures 3, 4 and 5 are not very well connected in the text or visualisations. "MODEL" in Figure 3 should be replaced with "LIF-DL", and the three inputs into the STTN blocks (Q, K, V) from Figure 5 should match the inputs shown in Figure 4 for more coherence and clarity. Visually integrating the dual-branch encoder-STTN-decoder set-up in Figure 3 would be beneficial. Figure 4 is not needed if the reader can refer to existing literature and no novel aspects are presented.
Line 245: The pre-defined Transformer architecture already utilises "multiple layers". What is meant here? Stacking multiple layers of Transformers or using the original Transformer architecture?
Line 250: You may want to refer to this as a dual-branch architecture.
Line 252: Relating to overall comment: It is not clear at this point in the text that future atmospheric forcing data is assumed to be available.
Line 264: The figure caption is not sufficient; variable names and indexing need to be used.
Model Optimisation:
Line 265: Maybe change to "hyperparameter tuning and parameter/model training" for parallelism and clarity.
Line 271: This is an unusual description since the model is also built in PyTorch.
Line 272: Calling this a "custom loss" if one filtering operation is applied is a bit of an overstatement.
Evaluation Methods:
Line 297: Table 2 may be moved to the Appendix.
Line 312: From line 293 I expect a comparison with both IMS and CIS.
Line 327: Mention if this is done per lake.
Line 352: It is not clear enough what a one-dimensional fraction of ice cover time series is. This is why notation (e.g. tensor notation) is necessary.
Results and Discussion:
Line 374: Maybe change to "the thermodynamics of lakes drive […]".
Line 379: Figure 6: In the top right corner or figure caption repeat the definitions of the Freeze-up and Break-up seasons and make clear that these are "variable importance estimates for predictions during freeze-up and break-up seasons".
Line 385: Air temperature does not need to be capitalised.
Line 420: Table 3: Consider displaying an additional digit to make differences a bit more clear. The superior performance should be highlighted in some way (applies to all results tables).
Line 467: Figure 7: Add (proposed) and (ground truth) labels for clarity. Maybe add a visualisation of the errors in the Appendix.
Line 526: Figure 11: FLake and LIF-DL are visually hard to discern (orange and red dotted/dashed lines are too similar). Maybe only choose a zoomed-in view of a selection of FUS/BUS segments.
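The two readings of AFDD that the Line 138 and Line 181 comments distinguish can be made concrete with a short sketch. It assumes daily mean air temperature in degrees Celsius and a series that begins on the accumulation start date; both are assumptions for illustration, and which definition the manuscript actually uses is exactly what the authors are asked to clarify.

```python
import numpy as np

def accumulated_freezing_degree_days(temps_c):
    """Running sum of degrees below 0 deg C (units: deg C * day).

    The series is assumed to start on the accumulation start date.
    """
    temps_c = np.asarray(temps_c, dtype=float)
    return np.cumsum(np.maximum(-temps_c, 0.0))

def accumulated_freezing_day_count(temps_c):
    """Running count of days with mean temperature below 0 deg C (units: days)."""
    temps_c = np.asarray(temps_c, dtype=float)
    return np.cumsum(temps_c < 0.0)

daily_mean_temp = [2.0, -1.0, -4.0, 0.5, -10.0]  # synthetic values
afdd = accumulated_freezing_degree_days(daily_mean_temp)
freezing_days = accumulated_freezing_day_count(daily_mean_temp)
```

For this series the degree-day sum ends at 15 while the day count ends at 3, so the two quantities differ in both units and magnitude. The thawing counterpart (ATDD) would follow by swapping the sign of the comparison.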

I acknowledge the significant work contained in this manuscript and encourage the authors to address the two major and additional minor weaknesses. The FUS and BUS perspectives in the evaluation as well as the variable importance analysis already provide significant scientific insight. For future work I would also suggest considering “teacher forcing” as a training strategy for auto-regressive roll-outs, and a short discussion of the importance of lake ice cover to Indigenous communities. Another research avenue would be to train a unified model across the full region.

Citation: https://doi.org/10.5194/egusphere-2025-5576-RC1

Supplement

https://doi.org/10.5194/egusphere-2025-5576-supplement

Data sets

Lake Ice Forecasting with Deep Learning - Archived Data. Samuel Johnston. https://doi.org/10.5281/zenodo.17575279

Model code and software

H2OSam/lif-dl: LIF-DL v1.0. Samuel Johnston. https://doi.org/10.5281/zenodo.17543536

Short summary

The Lake Ice Forecasting using Deep Learning model produces spatially explicit forecasts, which consistently outperform the popular Freshwater Lake model. Freeze-up and break-up timing was improved to within 3–9 days with greatly enhanced spatial accuracy of forecasted ice cover patterns. This establishes the potential of data-driven methods to advance lake ice models, with implications for enhancing weather prediction, northern transportation planning, and climate change adaptation.

