Super-resolution of Arctic Sea Ice thickness using a conditional diffusion model
Abstract. Small‑scale variability (3–60 km) in Arctic sea‑ice thickness plays a crucial role in sea‑ice predictability and in the climate system. However, these scales are neither directly observed nor adequately represented in climate models. While coarse‑resolution observational products (e.g., CS2SMOS) and some high‑resolution model simulations exist, bridging the scale gap remains challenging.
In this work, we use machine learning to develop a super‑resolution algorithm that reconstructs small‑scale sea‑ice thickness features from low‑resolution input fields. The algorithm is trained on realistic high‑resolution model simulations and is based on diffusion models conditioned on low‑resolution observations. This class of models is inherently probabilistic, enabling the generation of an ensemble of plausible high‑resolution reconstructions from a single coarse‑resolution input.
We apply the method both to model simulations, where high‑resolution ground truth is available, and to the CS2SMOS observational product. We demonstrate that the algorithm produces realistic high‑resolution sea‑ice thickness fields with improved accuracy and provides meaningful uncertainty estimates through the ensemble spread.
This manuscript presents a conditional diffusion-model approach for super-resolving Arctic sea-ice thickness from low-resolution observational products. The model is trained using high-resolution neXtSIM simulations and synthetic low-resolution counterparts designed to mimic CS2SMOS and related satellite-derived inputs. The approach is then evaluated both in an idealised model setting, where high-resolution truth is available, and on CS2SMOS data.
Overall, I find the manuscript promising and potentially publishable after revision. The method appears sound, the paper is generally clear, and the results show meaningful improvement over the low-resolution baseline. However, several issues should be addressed before publication; they are detailed below.
Main comments:
1:
The baseline comparison is limited. Comparing against the low-resolution input field is necessary, but not sufficient, to assess the added value of the proposed diffusion model. The paper would benefit from at least one more demanding baseline, such as a deterministic neural-network super-resolution model, a U-Net trained directly to predict the SIT increment (the diffusion model used here already has a U-Net backbone), a simpler stochastic baseline, or an ablation that isolates the contribution of the diffusion formulation. This would help distinguish improvements due to the diffusion model from improvements due simply to supervised learning from neXtSIM.
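To make this suggestion concrete, one possible way to score a deterministic baseline and the diffusion ensemble on a common footing is the continuous ranked probability score (CRPS), which reduces to the mean absolute error for a single deterministic field. The sketch below is only an illustration: the function names and the random toy fields are my own assumptions, not the authors' code, data, or results.

```python
import numpy as np

def rmse(pred, truth):
    """Root-mean-square error between two fields of equal shape."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def ensemble_crps(ens, truth):
    """Field-mean CRPS of an ensemble with members along axis 0.

    Uses the energy form E|X - y| - 0.5 * E|X - X'|; for a single
    member this reduces to the mean absolute error.
    """
    ens = np.asarray(ens, dtype=float)
    abs_err = np.mean(np.abs(ens - truth[None]), axis=0)
    spread = np.mean(np.abs(ens[:, None] - ens[None, :]), axis=(0, 1))
    return float(np.mean(abs_err - 0.5 * spread))

if __name__ == "__main__":
    # Toy stand-ins only: random fields, not sea-ice data.
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(16, 16))
    deterministic = truth + 0.3  # hypothetical deterministic baseline output
    ensemble = truth[None] + rng.normal(scale=0.3, size=(8, 16, 16))
    print("deterministic CRPS (= MAE):", ensemble_crps(deterministic[None], truth))
    print("ensemble CRPS:", ensemble_crps(ensemble, truth))
    print("ensemble-mean RMSE:", rmse(ensemble.mean(axis=0), truth))
```

Reporting such a score for the low-resolution input, a deterministic U-Net, and the diffusion ensemble side by side would show directly how much of the gain comes from the probabilistic formulation.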
2:
The construction of the synthetic low-resolution/high-resolution pairs is central to this study: in the main quantitative validation, high-resolution neXtSIM fields are smoothed to create low-resolution inputs, and the original neXtSIM fields then serve as ground truth. The observational application, however, reveals a domain-shift issue. The authors state that spurious coastal features occur because OSI SAF deformation masks were not present in the training data. Since the model is intended for CS2SMOS/OSI SAF inputs, the synthetic training data should, where possible, also mimic their missing-data patterns. Why were realistic OSI SAF masks not included during training? The authors should clarify this point and, if feasible, include such masks in the training/evaluation setup to test whether the coastal artefacts are reduced. It is also unclear which kinds of CS2SMOS patterns are absent from the training data, where they occur, and how strongly they affect the results. The authors should elaborate on this point and provide clearer examples.
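As an illustration of what I mean by mimicking missing-data patterns, a minimal sketch of mask-aware pair construction is given below. It assumes a boolean missing-data mask already regridded to the high-resolution grid, and it uses simple block averaging as a stand-in for whatever smoothing operator the authors actually apply; `coarsen`, `make_training_pair`, and the toy arrays are hypothetical names and data of my own.

```python
import numpy as np

def coarsen(field, factor):
    """Block-average a 2-D field by `factor`, ignoring NaN cells.

    Coarse cells whose block contains no valid data stay NaN.
    """
    h, w = field.shape
    if h % factor or w % factor:
        raise ValueError("field shape must be divisible by factor")
    blocks = field.reshape(h // factor, factor, w // factor, factor)
    valid = ~np.isnan(blocks)
    count = valid.sum(axis=(1, 3))
    total = np.where(valid, blocks, 0.0).sum(axis=(1, 3))
    out = np.full(count.shape, np.nan)
    np.divide(total, count, out=out, where=count > 0)
    return out

def make_training_pair(hires, missing_mask, factor):
    """Return (low-res input, high-res target).

    `missing_mask` is True where the observational product has no data;
    the gaps are applied to the input only, never to the target.
    """
    masked = np.where(missing_mask, np.nan, hires)
    return coarsen(masked, factor), hires

if __name__ == "__main__":
    # Toy field and a toy gap; not neXtSIM output or a real OSI SAF mask.
    hires = np.arange(64, dtype=float).reshape(8, 8)
    mask = np.zeros((8, 8), dtype=bool)
    mask[:4, :2] = True
    lowres, target = make_training_pair(hires, mask, factor=4)
    print(lowres.shape, target.shape)
```

Training on pairs built this way, with masks sampled from the real product, would expose the network to the same gaps it meets at inference time and would let the authors quantify how much of the coastal problem is attributable to the masks.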
Other comments: