the Creative Commons Attribution 4.0 License.
A data-driven U-Net model with residual structures and attention mechanisms for short-term prediction of Arctic sea ice concentration
Abstract. Sea ice is vital in the global climate system, ecological balance and polar navigation. Arctic sea ice concentration (SIC) exhibits significant spatial heterogeneity and complex evolutionary patterns. To address these challenges, this study proposes a predictive model named sea ice concentration U-Net (SICUNet). SICUNet is a data-driven U-Net model that integrates attention mechanisms and residual structures for short-term prediction of SIC in the Arctic region. The model enhances the perception of multi-scale features through spatial-channel attention mechanisms. Meanwhile, it integrates residual structures to alleviate vanishing gradients and improve training stability. SICUNet is trained and validated using SIC data from 1988 to 2020 and evaluated during the testing phase using data from 2021 to 2024. To accurately capture seasonal variations in SIC, each year is divided into a melting season and a freezing season. Model training and prediction are conducted separately for each season. The model input is a 448×304 tensor with 7 channels built from daily SIC data over seven consecutive days. It then predicts SIC for the subsequent 7 days. SICUNet is trained and validated based on this input-output structure, and further applied to recursive prediction of SIC. During the 2021–2024 testing period, SICUNet effectively predicts SIC for the upcoming 7 days and maintains stable and accurate performance across multiple recursive steps. It outperforms traditional U-Net, U2Net and numerical simulation methods, showing robust results under extreme SIC conditions.
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2025-4935', Anonymous Referee #1, 14 Jan 2026
-
AC1: 'Reply on RC1', Jinyun Guo, 23 Jan 2026
Thank you for your careful review and valuable comments. We have revised the manuscript accordingly and compiled our detailed responses in the attached ZIP file, which includes a PDF of the response to reviewers and the revised figures. Please kindly check.
-
RC2: 'Reply on AC1', Anonymous Referee #1, 01 Feb 2026
I thank the authors for their responses to my questions and comments.
Most of the proposed revisions are appropriate and well-integrated.
However, I still believe further clarification is needed regarding the comparison with TOPAZ5.
The authors mention that they use nearest-neighbour interpolation to limit smoothing during resampling. However, this doesn’t really address my main concern. Since TOPAZ has a higher nominal resolution than the satellite product, it can resolve smaller-scale features. When the model is downsampled using nearest-neighbour interpolation, this small-scale variability is aliased onto the coarser grid, potentially altering the representation of larger scales. As a result, the comparison with a coarser satellite product may partially reflect representativeness error due to unresolved subgrid-scale features rather than true model error.
For this reason, the authors should discuss the implications of the chosen interpolation method in relation to the effective spatial spectral resolution of the datasets being compared, and consider whether more robust approaches (e.g. conservative remapping, filtering followed by resampling) would be more appropriate.
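The aliasing concern can be illustrated with a small synthetic example. The sketch below is an illustration only, not the resampling procedure used in the manuscript or by TOPAZ5: the grid size, the downsampling factor of 4, and the function names `subsample_nearest` and `block_average` are assumptions. It compares keeping one fine cell per coarse cell (nearest-neighbour-like decimation) with averaging all fine cells inside each coarse cell, which on a regular grid is a simple stand-in for filtering followed by resampling.

```python
import numpy as np

def subsample_nearest(field, factor):
    """Keep one fine cell per coarse cell (nearest-neighbour-like decimation)."""
    return field[::factor, ::factor]

def block_average(field, factor):
    """Average all fine cells inside each coarse cell (regular-grid assumption)."""
    ny, nx = field.shape
    assert ny % factor == 0 and nx % factor == 0
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))

# Synthetic fine-grid SIC: a checkerboard of 0 and 1 at the fine grid scale,
# i.e. variability that the coarse grid cannot resolve.
ny, nx, factor = 16, 16, 4
yy, xx = np.indices((ny, nx))
fine = ((yy + xx) % 2).astype(float)

nearest = subsample_nearest(fine, factor)   # samples only the "0" cells
averaged = block_average(fine, factor)      # preserves the area mean

print("fine mean:", fine.mean())
print("nearest-neighbour mean:", nearest.mean())
print("block-average mean:", averaged.mean())
```

In this deliberately extreme case the decimated field misses half of the ice, while the block average preserves the area mean. Real fields are smoother, so the effect would be smaller, but the mechanism is the same.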
It would also be useful to check whether the presented results are consistent with those reported by MET Norway for TOPAZ5 (https://cmems.met.no/ARC-MFC/Validation/index.html).
Minor comment:
I appreciate the authors’ efforts to improve the readability of the figures. I suggest limiting the plots in Figures 5 and 7 to higher latitudes and possibly presenting them as horizontal, full-page figures. In any case, please ensure that all figure labels remain readable in an A4 printed version.
Citation: https://doi.org/10.5194/egusphere-2025-4935-RC2
-
RC3: 'Comment on egusphere-2025-4935', Anonymous Referee #2, 05 Feb 2026
Summary
The manuscript presents SICUNet, an AI-based data-driven model for short-term sea ice concentration (SIC) prediction. The model architecture is based on a U-Net with Res-CBAM (Residual Convolutional Block Attention Module) blocks, which combine lightweight channel and spatial attention mechanisms to extract relevant spatio-temporal features, along with residual connections to stabilize training. SICUNet predicts Arctic SIC 7 days ahead using the previous 7 days as input, with two separate models trained for ice growth and melt seasons. Performance is first evaluated through non-overlapping 7-day forecasts applied every 7 days over the test period. In a second step, an autoregressive approach is used to extend predictions to monthly horizons and assess longer-term forecast capability. Performance is compared against TOPAZ5 (a physics-based operational ocean-ice forecast system) and baseline U-Net architectures (with/without residual or attention mechanisms) to demonstrate the benefits of CBAM blocks for weekly to monthly SIC predictions.
General comments
The Res-CBAM-based architecture could have offered a valuable contribution from a model intercomparison perspective, but the manuscript misses important recent references (e.g., Durand et al. (2024), though model-driven) that could have provided (1) better physical grounding for the study and (2) a methodological guideline to help structure the paper. The approach would also benefit from focusing on sea ice edge dynamics (i.e., SIC increments) to better constrain the loss function and improve training quality. The current prediction strategy (7-day forecasts based solely on the previous 7 days) raises concerns: the model performs pure temporal extrapolation with contextual information but without physical constraints. Consequently, it learns statistical patterns of sea ice development from training data that may become physically inconsistent with actual atmospheric conditions during inference. Moreover, the Res-CBAM spatial attention mechanism appears under-utilized for this task. Combined with the seasonal distinction (melt/growth), these limitations reduce both the model's generalizability across seasonal transitions and diverse sea ice evolution scenarios (e.g., future ice-free Arctic summers) and its suitability for longer-term autoregressive forecasting. Additionally, figure clarity, tables, and metric selection require attention: more targeted, standard sea ice-specific metrics are needed. The current configuration makes it difficult for the intended readers in the sea ice community to assess whether the model truly improves SIC prediction by capturing sea ice edge evolution compared to persistence and/or predictions from other AI-based architectures.
Finally, the paper suffers from the absence of essential discussions: (1) how the model provides a better or more appropriate solution for weekly to monthly SIC prediction than recent literature and existing approaches (AI or physics-based models), and (2) a critical assessment of the method's limitations and how to potentially address them. These omissions weaken the manuscript considerably. More fundamentally, the current study should not be limited to a tentative extension of the work from Ren et al. (2022, 2023) using the Res-CBAM architecture from Woo et al. (2018) without demonstrating clear added scientific value.
Please find below my comments that might help revising the present manuscript.
Specific comments
Clarify relationship to work from Ren et al. (2022, 2023)
The manuscript shares considerable similarity with Ren et al. (2022, 2023) in structure, content, metrics, and, in some instances, phrasing, in the introduction, data description, model and methodology, figures (1, 2), and way of presenting results.
References:
Ren et al. (2022). SICNet with Res-TSAM blocks. DOI: 10.1109/TGRS.2022.3177600 (not cited in the manuscript)
Ren & Li (2023). SICNet90 for daily SIC prediction at 90 days horizon for the melting season. DOI: 10.1109/TGRS.2023.3279089
While generalizing the work from Ren and coauthors to year-round sea ice concentration prediction could constitute a reasonable objective, the current manuscript neither adequately acknowledges this foundation nor goes beyond their study.
The authors should:
- explicitly state from the introduction that they build upon Ren et al.'s framework
- systematically reference Ren et al.’s work throughout the methodology section.
- ensure all text is either properly paraphrased or attributed.
- demonstrate what significant benefits or improvements the proposed approach provides for year-round sea ice prediction compared to the work from Ren and coauthors.
Introduction:
1. The introduction does not adequately highlight the proposed approach or establish a clear research question.
It would benefit from:
- a concise description of the Res-CBAM architecture particularities (attention mechanisms and residual connections) with appropriate references to Woo et al. (2018) and possibly with some successful application in computer vision or medical imaging
- clear explanation for why this architecture is suited to the task and how it addresses limitations identified in previous studies
- explicit statement of the prediction task including temporal resolution (data frequency daily/monthly), forecast horizon (7 days to extended prediction with autoregressive approach), target variable (SIC), and spatial domain (Arctic)
2. Several key recent references related to AI applied to sea ice prediction are missing—including studies employing model-driven approaches—that could have offered valuable physical grounding and methodological insights for structuring this work.
Durand et al. (2024): DOI: 10.5194/tc-18-1791-2024
Finn et al. (2024): DOI: 10.1029/2024MS004395
Data:
- It is unclear why TOPAZ5 is introduced in the data section when it serves only as a comparison in the results (section 4.5). TOPAZ5 is a more complex operational forecasting system (with data assimilation and dynamic sea ice state prediction). The potentially insightful information would be the comparison of the error growth characteristics from weekly to monthly timescales in the discussion to demonstrate forecast abilities.
Methodology:
- Did you consider training to predict SIC increments rather than SIC maps? That would definitely help the model focus on learning ice edge dynamics where changes occur, rather than stable pack ice regions (concentration ≈1). This approach would potentially eliminate the need for the NIIEE penalization term you introduced to enforce ice edge accuracy.
- The prediction strategy—7-day forecasts based on the previous 7 days of SIC—relies on pure temporal extrapolation without incorporating physical constraints or atmospheric forcing. This means the model learns statistical patterns from training data that may not align with actual atmospheric conditions during inference. It can potentially limit forecast skill when atmospheric drivers differ from historical patterns and for unseen conditions (e.g. ice free summers).
- The recursive approach used to extend 7-day forecasts to monthly predictions should be detailed in this section for clarity (not in the results section). The proposed approach raises concerns about error accumulation. Did you consider instead predicting a few days ahead to constrain sea ice edge dynamics, using only the first predicted day to update the input window (rolling forward by one day), and repeating this process recursively to build weekly or monthly forecasts? This rolling daily approach with shorter horizons (requiring fewer days of input) could better constrain error propagation. Alternatively, consider using n days of past data directly for n-day predictions, as proposed by Ren et al. (2022).
- How is overfitting addressed in your training procedure? Several aspects of the experimental setup suggest high risk of overfitting: (1) the 7-day prediction task, and (2) the imbalanced data split (31 years training vs. only 2 years validation). Please provide the reviewers with a plot showing the evolution of training and validation loss during training. Additionally, what regularization techniques were implemented? What is the reason for the reduced batch size?
- The evaluation metrics employed are not appropriate for assessing sea ice prediction performance. They cover the entire active domain, including stable regions (pack ice at SIC≈1, open ocean at SIC≈0) where values remain constant. Since changes occur primarily in narrow ice edge regions, these domain-averaged metrics are dominated by trivial predictions of stable areas. Consequently, a naive persistence forecast (maintaining conditions at time=t) would score comparably well, making it impossible to assess whether your model provides meaningful improvement over baseline approaches. The manuscript would benefit from employing sea ice-specific metrics that isolate edge dynamics, such as the IIEE (Integrated Ice Edge Error), which is explicitly designed for ice edge prediction assessment. This would demonstrate whether the model genuinely improves ice edge prediction. Could you also define here the SIEE (employed in the results section)?
- What justifies separate models for melt and growth seasons? This approach complicates operational forecasting (requiring model switching), creates transition period uncertainties, and limits generalizability. Do both models perform similarly during transitions?
Did you consider training a single unified model for all conditions? A year-round model would be more operationally practical and could better capture seasonal transitions. The training strategies suggested in previous comments (SIC increments, loss, physical constraints) may help achieve this.
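The comparison against persistence requested above can be sketched as follows. This is a minimal illustration, not the evaluation code of the manuscript: it uses the usual IIEE definition (the area where forecast and observation disagree on whether SIC exceeds 15 %, split into overestimated and underestimated parts), and the uniform cell area of 625 km², the toy 4×6 grid, and the function name `iiee` are assumptions.

```python
import numpy as np

def iiee(forecast, observed, cell_area_km2=625.0, threshold=0.15, mask=None):
    """Integrated Ice Edge Error: area where forecast and observation
    disagree on whether SIC exceeds the threshold.

    Returns (total, overestimated, underestimated) areas in km^2.
    A uniform cell area is assumed here for simplicity.
    """
    f_ice = forecast > threshold
    o_ice = observed > threshold
    if mask is None:
        mask = np.ones_like(f_ice, dtype=bool)
    over = f_ice & ~o_ice & mask    # forecast ice where none is observed
    under = ~f_ice & o_ice & mask   # forecast open water where ice is observed
    o_area = over.sum() * cell_area_km2
    u_area = under.sum() * cell_area_km2
    return o_area + u_area, o_area, u_area

# Toy example: the ice edge advances by one column between t0 and t0 + 7 days.
obs_t0 = np.zeros((4, 6))
obs_t0[:, :3] = 1.0
obs_t7 = np.zeros((4, 6))
obs_t7[:, :4] = 1.0

# Hypothetical model forecast that misses a single edge cell.
model_t7 = obs_t7.copy()
model_t7[0, 3] = 0.0

# Persistence baseline: the last observed field is used as the forecast.
iiee_persistence, _, _ = iiee(obs_t0, obs_t7)
iiee_model, over_model, under_model = iiee(model_t7, obs_t7)

print("IIEE persistence (km^2):", iiee_persistence)
print("IIEE model (km^2):", iiee_model)
```

Because cells in consolidated pack ice and open water contribute nothing to the IIEE, a model only scores better than persistence if it actually moves the ice edge in the right direction.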
Results:
- The performance of your model is not clearly highlighted in the current figures and tables. Please use more specific, targeted visualizations with sea ice edge-related metrics to improve readability and clarity.
- Figure 1 (a-d): (see comment 5 in the above section) Comparative IIEE plots (seasonal mean, for example) showing your 7-day predictions against persistence and climatology baselines would be more effective for demonstrating performance and quantifying error growth over the forecast horizon. Nonetheless, the plots reveal significant error growth in SIC predictions, which is confirmed and detailed in Figures 4d-e. There is a systematic failure to capture sea ice edge evolution, particularly evident in dynamic ice regions. This systematic error likely reflects the methodological limitations discussed in the section above.
- “Section 4.2 Stability of predictions”. What do you mean by "stability"? Presenting a single 7-day prediction of the sea ice minimum has limited value since the minimum's timing is climatologically known around September 15th. The real challenge is predicting it months in advance via extended recursive rollout or longer-range prediction (Ren et al., 2023). In sea ice forecasting, "stability" typically means maintaining low error growth over monthly-seasonal timescales, which doesn't appear to be demonstrated here.
- Figure 5: see comment 1 and comment related to figures 4 (d-e). Figure 6 does not provide substantial added value to the manuscript and could be removed
- Section 4.3. Recursive forecast: Table 3 and Figure 7 demonstrate significant performance degradation when using recursive forecasts, driven by substantial error growth over the 7-day prediction horizon. This confirms the methodological limitations discussed above.
- Section 4.4: compared performance with other UNet architectures. Please state in a table or in appendix what are the specifics of the other UNet and use IIEE for comparison.
It would be interesting to compare your model's performance with: (1) the UNet Res-TSAM developed by Ren et al. (2022), which represents an improvement over Res-CBAM, and (2) a CBAM variant without spatial attention (i.e., channel attention only), given that spatial attention appears under-utilized when changes occur primarily at the narrow ice edge region.
- What "stability" is being demonstrated here? The approach switches between dedicated melt and growth models on fixed dates (March 31, September 29). Can either model actually capture the seasonal transitions (freeze-up, melt onset), or does performance degrade during these periods? The significant error growth in both models makes it difficult to assess whether genuine transition-period skill exists.
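The two recursive strategies discussed in these comments, feeding back a full 7-day block versus rolling forward by one day, can be contrasted in a short sketch. Everything here is an assumption for illustration: `predict_fn` stands for any model mapping a (7, H, W) window to a (7, H, W) forecast, and `toy_model` is a placeholder with a constant decline, not SICUNet.

```python
import numpy as np

def block_rollout(predict_fn, window, n_days):
    """Feed each full 7-day forecast back as the next input window."""
    out = []
    while len(out) < n_days:
        pred = predict_fn(window)          # shape (7, H, W)
        out.extend(pred)                   # append the 7 daily maps
        window = pred                      # the next input is entirely predicted
    return np.stack(out[:n_days])

def daily_rollout(predict_fn, window, n_days):
    """Keep only the first predicted day and slide the window by one day."""
    out = []
    for _ in range(n_days):
        first = predict_fn(window)[0]      # shape (H, W)
        out.append(first)
        window = np.concatenate([window[1:], first[None]], axis=0)
    return np.stack(out)

def toy_model(window):
    """Placeholder model: last input day minus 0.01 per lead day, clipped to [0, 1]."""
    leads = np.arange(1, 8).reshape(7, 1, 1)
    return np.clip(window[-1][None] - 0.01 * leads, 0.0, 1.0)

history = np.full((7, 4, 4), 0.8)          # seven observed days of SIC = 0.8
block = block_rollout(toy_model, history, 14)
daily = daily_rollout(toy_model, history, 14)

print(block.shape, daily.shape)
```

With this placeholder the two strategies coincide, since the toy model is exactly consistent with itself. With a learned model they generally differ: the block strategy replaces the whole input window with predictions after one step, whereas the daily strategy replaces one observed day at a time.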
Discussion:
- The current "Discussion" section presents additional exploratory results rather than critical analysis and interpretation.
- A proper discussion section is absent. The manuscript does not critically assess whether objectives were achieved or address evident limitations.
Specifically:
- How does this work advance beyond Ren et al. (2023) for year-round prediction?
- The results demonstrate clear difficulties with ice edge prediction (the primary objective) and substantial error growth due to methodological bottlenecks and a lack of physical grounding.
The manuscript needs to be revised accordingly with a substantive discussion. Scientific rigor requires honest evaluation of whether goals were achieved, interpretation of why limitations exist, and constructive future directions.
Minor comments
Introduction:
- The introduction lists limitations of physical and statistical models but does not explain how AI-based approaches address or overcome these limitations.
- Temporal specifications (data frequency, prediction type, forecast horizon) for each cited work would help locate your approach within the existing literature. The limitations would be clearer if connected to the specific studies they reference rather than presented all at the end.
- Line 53: (Ren et al., 2023)
Data:
- Please specify the spatial domain, temporal coverage, and final grid dimensions, as presented in the data description by Ren et al. (2022). Data rejection proportions for each season (melt/growth) would be a valuable addition, as would introducing the data split directly here (training, validation, test).
- The manuscript would benefit from restructuring the data description into a unified "Data and Methodology" section, which would also improve section balance. Consider also moving the definition of the evaluation metrics to the end. The loss, in contrast, is tied to your modeling process. More details about your hyperparameter setup and callbacks would also be relevant for reproducibility purposes.
Methodology:
- Figures 1 & 2 contain considerable redundancy (CBAM cells and similar encoder-decoder convolutional layers). Consider either: (1) condensing them into a single figure with proper attribution ("adapted from Ren et al. 2022, doi:10.1109/TGRS.2022.3177600"), or (2) removing Figure 2 entirely and simply citing Ren et al. (2022) in the text where the architecture is described in section 3.1.
- Figure 3 does not provide substantial new information if the methodology section is clearly written. Consider removing it or relocating it to an appendix.
Citation: https://doi.org/10.5194/egusphere-2025-4935-RC3
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 181 | 82 | 20 | 283 | 15 | 16 |
In this study, the authors propose a data-driven model for short-term sea ice concentration (SIC) prediction based on a U-Net neural network architecture, termed SICUNet. The model extends a baseline U-Net by incorporating spatial-channel attention mechanisms. It is trained exclusively on satellite observations and uses daily SIC maps from seven consecutive days to predict SIC over the subsequent seven days. Longer forecast horizons are obtained through recursive application of the model, and results are presented up to a 35-day lead time (five recursive cycles). Performance evaluation over four test years indicates improved skill relative to other neural network architectures and to a reference numerical model.
The manuscript is generally clear and focuses primarily on the performance gains achieved through the proposed architectural modifications. However, the comparison with alternative approaches is somewhat limited, and the manuscript would benefit from a more critical discussion of the results, particularly with respect to non-conventional methodological choices (e.g., network architecture and loss function).
Below I outline several points that should be considered in revising the manuscript.
Minor comments:
https://doi.org/10.5194/tc-18-1791-2024
https://doi.org/10.1029/2024MS004395
https://doi.org/10.48550/arXiv.2508.14984