This work is distributed under the Creative Commons Attribution 4.0 License.
Combining the U-Net model and a Multi-textRG algorithm for fine SAR ice-water classification
Abstract. Synthetic Aperture Radar (SAR)-based sea ice classification is challenging because surfaces such as wind-roughened open water (OW), smooth thin ice, and melting ice produce similar backscatter. Previous algorithms combine pixel-based and region-based machine learning methods or statistical classifiers, yet their accuracy gains remain limited by these ambiguous surfaces and by the scarcity of manual labels. In this study, we propose an automated algorithm framework that combines semantic segmentation of ice regions with multi-stage detection of ice pixels to produce high-accuracy, high-resolution ice-water classification data. First, we trained a U-Net convolutional neural network on preprocessed GCOM-W1 AMSR2 36.5 GHz H-polarization data, Sentinel-1 SAR EW dual-polarization data, and CIS/DMI ice chart labels, and performed semantic segmentation of the major ice distribution regions with near-100 % accuracy. Subsequently, within the U-Net-segmented ice regions, we redesigned the GLCM textures and the HV/HH polarization ratio of the Sentinel-1 SAR images to create a combined texture, on which the Multi-textRG algorithm applies multi-stage region growing to retrieve ice pixel details. We validated the SAR classification results against Landsat-8 and Sentinel-2 optical data, obtaining an overall accuracy (OA) of 84.9 %, a low false negative (FN) rate of 4.24 % that indicates underestimation of low-backscatter ice surfaces, and a higher false positive (FP) rate of 10.8 % that reflects the resolution difference between the sensors along ice edges. Through detailed analyses and discussion of the classification results under the similar ice and water conditions mentioned above, we anticipate that the proposed algorithm framework can achieve accurate ice-water classification across all seasons and ease the labelling of ice pixel samples.
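For readers unfamiliar with the GLCM texture features that the abstract and the reviews below refer to (the reviewers single out Contrast and Sum Average), here is a minimal NumPy sketch of how such features are computed on one 8-bit image patch. The gray-level count, quantization, and the single (0, 1) pixel offset are illustrative assumptions, not the settings used in the manuscript:

```python
import numpy as np

def glcm(patch, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for a single pixel offset."""
    q = patch.astype(int) // (256 // levels)   # quantize 0..255 down to 0..levels-1
    dy, dx = offset
    h, w = q.shape
    P = np.zeros((levels, levels))
    for i in range(h - dy):                    # count co-occurring gray-level pairs
        for j in range(w - dx):
            P[q[i, j], q[i + dy, j + dx]] += 1
    return P / P.sum()                         # normalize to a joint probability

def contrast(P):
    """Haralick Contrast: sum over P(i, j) * (i - j)^2."""
    i, j = np.indices(P.shape)
    return float(np.sum(P * (i - j) ** 2))

def sum_average(P):
    """Haralick Sum Average: sum over s * p_{x+y}(s), the diagonals i + j = s."""
    i, j = np.indices(P.shape)
    s_vals = np.arange(2 * P.shape[0] - 1)
    p_sum = np.array([P[(i + j) == s].sum() for s in s_vals])
    return float(np.sum(s_vals * p_sum))
```

In practice these statistics would be evaluated in a sliding window over the SAR scene (e.g. with scikit-image's `graycomatrix`/`graycoprops`), producing one texture band per feature; a uniform patch yields zero Contrast, while a high-frequency pattern such as a checkerboard maximizes it.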
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2760', Anonymous Referee #1, 02 Jun 2025
In this manuscript, a U-Net model combined with the Multi-textRG algorithm is proposed for fine ice-water classification on SAR imagery from the AI4Arctic dataset. Overall, the manuscript is clearly written with sufficient detail. However, since U-Net and other techniques such as GLCM feature extraction have already been broadly used for sea ice mapping, and particularly on the AI4Arctic dataset, the manuscript should further emphasize the methodological novelty of this research and include comparisons with previous methods as baselines. Detailed comments are listed below.
- The authors mentioned in the abstract that the proposed algorithm successfully addresses ice-water classification across all seasons. However, as sea ice evolves, the proportions of ice types vary between seasons: in warmer seasons, melting ice surfaces affect the classification results, while in colder seasons, snow cover on sea ice also influences the outcomes. The authors did not evaluate the algorithm's performance under varying environmental conditions, so its adaptability and effectiveness in different seasonal contexts remain undemonstrated. This limitation should be clearly stated and analyzed.
- The authors stated that the framework is primarily designed for ice-covered regions and wind-driven open water areas. However, wind forcing can also affect the classification accuracy within ice-covered regions. Therefore, the authors should clarify how wind-driven dynamics influence the classification performance across different types of ice-covered areas.
- In the third paragraph, the authors claim that the algorithm combining a CNN with empirical methods represents the optimal automatic approach for sea ice labeling. However, this claim appears too strong, as there is insufficient experimental validation to substantiate it; multiple experimental results are needed to support such a conclusion. Furthermore, the proposed algorithm lacks comparative analysis, whether quantitative or qualitative, with other sea ice classification methods, which further weakens the assertion of its superiority.
- Line 105, the related publication concerning the AutoIce Challenge should be mentioned here (doi: 10.5194/tc-18-3471-2024), which would facilitate readers to refer to this particular challenge and its details.
- Line 120, the U-Net-based model has already been further improved by the AutoIce participants using a range of techniques, achieving relatively high accuracy on the AI4Arctic dataset (illustrated in doi: 10.5194/tc-18-3471-2024 and doi: 10.5194/tc-18-1621-2024). According to Fig. 2, the U-Net used in this research appears to have the same architecture as the one used in the challenge. Therefore, it is necessary to illustrate how the U-Net-based method proposed in this research differs from the previous ones. It is also necessary to implement those U-Net-based models as benchmarks to compare against the proposed method in the manuscript.
Citation: https://doi.org/10.5194/egusphere-2024-2760-RC1
AC1: 'Reply on RC1', Yan Sun, 07 Jun 2025
We sincerely appreciate the reviewer’s insightful comments and constructive feedback. Our detailed responses to each point have been provided in the attached supplementary PDF file. We are glad to engage in further discussion if additional clarifications are needed.
CC1: 'Comment on egusphere-2024-2760', Morteza Karimzadeh, 16 Jun 2025
1. To clearly demonstrate the effectiveness of the proposed U-Net + Multi-textRG approach, it is necessary to include a quantitative comparison table showing the performance of the baseline U-Net model. While the paper reports an overall accuracy (OA) of 84.9% for the proposed approach, validated using Landsat-8 and Sentinel-2 optical data, no equivalent performance metrics (e.g., OA, false positive rate, false negative rate) are provided for the baseline model. Including these results is required to have a first assessment of improvement of this approach upon the baseline model and to clarify the contribution of the Multi-textRG algorithm.
2. The authors validate their U-Net + Multi-textRG method using optical imagery from Landsat-8 and Sentinel-2, but there are still some concerns about how accurate and reliable the created labels are for evaluating a SAR-based classification model. The ground truth is based on QA snow/ice flags from optical data, with further refinement using MNDWI and manual visual interpretation. While these steps show a careful attempt to improve label quality, they also introduce uncertainty: for example, it is not clear how consistent the manual corrections were, how the MNDWI threshold was selected, or how cloud-related misclassifications were handled across scenes. It is important to note that optical and SAR sensors are sensitive to fundamentally different surface properties, and the QA-based snow/ice flags, despite being algorithmically generated, are not always reliable, especially under thin or patchy cloud cover. Even with visual correction, the final labels may still be subjective and difficult to reproduce. To further strengthen the assessment, it would be worth providing results on a SAR-based dataset such as ExtremeEarth, which includes pixel-level, high-quality annotations derived directly from Sentinel-1 images. If not, this limitation of the work should be clearly discussed and the claims of the paper toned down accordingly.
3. The authors applied the Multi-textRG algorithm as a separate, post-processing procedure following the primary segmentation by the U-Net model. This approach is computationally inefficient, but more importantly, it appears that Multi-textRG refinement is non-learnable and end-to-end trainability is not supported, which defeats the purpose of deep learning-based approaches. The approach also introduces additional overhead and therefore may be less suitable for operational or large-scale applications. Current studies have also suggested that texture information (e.g., GLCM texture features) can be more effectively utilized when incorporated into the model during training. For instance, the article "Deep learning techniques for enhanced sea-ice types classification in the Beaufort Sea via SAR imagery" explains how the application of GLCM-based texture information during training can be utilized to improve the classification of sea ice using a Dual-Branch U-Net (DBU-Net).
4. The paper lacks a quantitative comparison with more recent state-of-the-art sea ice segmentation models or any other recognized baseline to make it suitable for publication in a high-end journal. It would be helpful to include comparisons with deep learning-based approaches such as DeepLabv3, which captures multiscale textural information through atrous spatial pyramid pooling and has shown improvement on benchmarks such as AutoICE. Including such comparisons would help clearly demonstrate the effectiveness of the proposed method relative to baseline models that support end-to-end training and are more easily replaceable and adaptable in operational or large-scale settings.
5. While the use of GLCM features here is well grounded in earlier work, the specific selection of Sum Average and Contrast features, together with the chosen window sizes, normalization intervals, and nonlinear HV/HH ratio transformation functions, appears to be based primarily on internal experimentation using the J-M distance metric. These choices are not justified through ablation or sensitivity analysis, and their individual contribution and robustness are hard to assess, particularly when applied to other SAR observations. Ideally, further experiments over larger areas and additional polar regions are also necessary to evaluate the robustness of the proposed algorithm as stated in this paper, but at the very least, more ablation would show the value of the work.
Citation: https://doi.org/10.5194/egusphere-2024-2760-CC1
RC2: 'Comment on egusphere-2024-2760', Sepideh Jalayer & Morteza Karimzadeh (co-review team), 17 Jun 2025
1. To clearly demonstrate the effectiveness of the proposed U-Net + Multi-textRG approach, it is necessary to include a quantitative comparison table showing the performance of the baseline U-Net model. While the paper reports an overall accuracy (OA) of 84.9% for the proposed approach, validated using Landsat-8 and Sentinel-2 optical data, no equivalent performance metrics (e.g., OA, false positive rate, false negative rate) are provided for the baseline model. Including these results is required to have a first assessment of improvement of this approach upon the baseline model and to clarify the contribution of the Multi-textRG algorithm.
2. The authors validate their U-Net + Multi-textRG method using optical imagery from Landsat-8 and Sentinel-2, but there are still some concerns about how accurate and reliable the created labels are for evaluating a SAR-based classification model. The ground truth is based on QA snow/ice flags from optical data, with further refinement using MNDWI and manual visual interpretation. While these steps show a careful attempt to improve label quality, they also introduce uncertainty: for example, it is not clear how consistent the manual corrections were, how the MNDWI threshold was selected, or how cloud-related misclassifications were handled across scenes. It is important to note that optical and SAR sensors are sensitive to fundamentally different surface properties, and the QA-based snow/ice flags, despite being algorithmically generated, are not always reliable, especially under thin or patchy cloud cover. Even with visual correction, the final labels may still be subjective and difficult to reproduce. To further strengthen the assessment, it would be worth providing results on a SAR-based dataset such as ExtremeEarth, which includes pixel-level, high-quality annotations derived directly from Sentinel-1 images. If not, this limitation of the work should be clearly discussed and the claims of the paper toned down accordingly.
3. The authors applied the Multi-textRG algorithm as a separate, post-processing procedure following the primary segmentation by the U-Net model. This approach is computationally inefficient, but more importantly, it appears that Multi-textRG refinement is non-learnable and end-to-end trainability is not supported, which defeats the purpose of deep learning-based approaches. The approach also introduces additional overhead and therefore may be less suitable for operational or large-scale applications. Current studies have also suggested that texture information (e.g., GLCM texture features) can be more effectively utilized when incorporated into the model during training. For instance, the article "Deep learning techniques for enhanced sea-ice types classification in the Beaufort Sea via SAR imagery (https://doi.org/10.1016/j.rse.2024.114204)" explains how the application of GLCM-based texture information during training can be utilized to improve the classification of sea ice using a Dual-Branch U-Net (DBU-Net).
4. The paper lacks a quantitative comparison with more recent state-of-the-art sea ice segmentation models or any other recognized baseline to make it suitable for publication in a high-end journal. It would be helpful to include comparisons with deep learning-based approaches such as DeepLabv3, which captures multiscale textural information through atrous spatial pyramid pooling and has shown improvement on benchmarks such as AutoICE. Including such comparisons would help clearly demonstrate the effectiveness of the proposed method relative to baseline models that support end-to-end training and are more easily replaceable and adaptable in operational or large-scale settings.
5. While the use of GLCM features here is well grounded in earlier work, the specific selection of Sum Average and Contrast features, together with the chosen window sizes, normalization intervals, and nonlinear HV/HH ratio transformation functions, appears to be based primarily on internal experimentation using the J-M distance metric. These choices are not justified through ablation or sensitivity analysis, and their individual contribution and robustness are hard to assess, particularly when applied to other SAR observations. Ideally, further experiments over larger areas and additional polar regions are also necessary to evaluate the robustness of the proposed algorithm as stated in this paper, but at the very least, more ablation would show the value of the work.
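For reference on the MNDWI refinement questioned in point 2 above: the index follows the standard definition (Xu, 2006), MNDWI = (Green − SWIR1) / (Green + SWIR1), with water tending toward positive values. A minimal sketch follows; the zero threshold is a common default and an assumption here, since the reviewers note that the manuscript's actual threshold selection is undocumented:

```python
import numpy as np

def mndwi(green, swir1, eps=1e-9):
    """Modified Normalized Difference Water Index per pixel (Xu, 2006)."""
    green = green.astype(float)
    swir1 = swir1.astype(float)
    return (green - swir1) / (green + swir1 + eps)  # eps avoids division by zero

def water_mask(green, swir1, threshold=0.0):
    """Binary open-water mask: MNDWI above the threshold flags water."""
    return mndwi(green, swir1) > threshold
```

Open water reflects strongly in the green band and absorbs in the shortwave infrared, so it yields MNDWI values above zero, while snow and ice surfaces also reflect in SWIR and fall lower; this is why the threshold choice and cloud handling materially affect the derived labels.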
Citation: https://doi.org/10.5194/egusphere-2024-2760-RC2
AC2: 'Reply on RC2', Yan Sun, 04 Jul 2025
AC3: 'Comment on egusphere-2024-2760 (Authors' Statement)', Yan Sun, 25 Jul 2025
  Large AI models are rapidly evolving. Deep CNNs, ResNet, ViT, and other architectures demonstrate immense potential for feature extraction, while multi-modal diffusion models, generative adversarial networks (GANs), foundation models (FMs), and large language models (LLMs) have shown remarkable performance in remote sensing object classification and recognition tasks. However, the application of these algorithms to polar sea-ice remote sensing has lagged. This is primarily due to the heightened demand for precision in parameter inversion for, e.g., climate prediction studies, which necessitates extensive sample labeling for supervised machine learning.
  The work presented in this paper was initiated in 2023. It employs a CNN for coarse ice-water classification combined with the Multi-textRG empirical statistical algorithm for fine-grained sea ice detection. The Multi-textRG algorithm has demonstrated stable identification of sea-ice pixels through 2024. We achieved automated sample labeling (i.e., automated sea ice identification), though this process initially required manual training.
  The proposed methods diverge somewhat from the prevailing AI research trends of 2025. However, their paramount significance lies in resolving the challenge of fine-grained, pixel-level sea-ice identification in the absence of equally precise sample labels. This breakthrough consequently addresses the scarcity of finely labeled sea ice concentration (SIC) data. Therefore, this study holds promise for providing essential training data for multi-modal large AI models in polar sea-ice remote sensing. Furthermore, our multi-level ice detection process, which relies on designed GLCM texture features, may offer valuable insights for feature extraction within model backbone modules.
  We will dedicate more efforts to the development of AI models for polar sea-ice remote sensing, aiming to provide novel approaches for precise sea-ice parameter inversion under complex sea conditions.
Citation: https://doi.org/10.5194/egusphere-2024-2760-AC3
EC1: 'Comment on egusphere-2024-2760', Ted Maksym, 27 Aug 2025
Note from handling editor:
After assessing the manuscript, the reviews, and the authors' replies, I have noted that much of the concerns about the manuscript may arise from some confusion about the intent of the combined technique of the U-Net and Multi-textRG, and the degree of novelty and/or validation for each. I urge the authors to carefully address the reviewer concerns in their manuscript, and I also add some comments below that I believe, based on the reviews, are key points to address:
1. One reviewer has noted that the U-Net is essentially the same as previously published, without any substantial improvement in results. The authors contend that it serves as a first step in classification. As such, the paper is primarily about the Multi-textRG pixel-level classification scheme. I would encourage the authors to make this clearer and to downplay the U-Net analysis. To me, it appears to be just a technique to first provide regional classifications. In fact, I believe it may not even be a necessary step, since one could use a variety of techniques for a first classification. The real innovation in the paper is the Multi-textRG algorithm, and the paper could be clearer on this.
2. Following from this, it is not clear to me, based on my understanding of the Multi-textRG algorithm, that the first step regional classification is even necessary at all. Since the pixel level classification can identify ice, open water, or wind-affected open water within the regional ice classes, then can it not do the same for areas that were outside of these original classes? If I misunderstood this, then the authors should clarify why the first step is needed. If I am correct, I don't think the authors need to remove the U-Net classification, but it should be clarified why it may be advantageous to use (or a similar technique). Why the regional segmentation was needed is not clear.
3. The reviewers make a valid point about how well the Multi-textRG algorithm works versus the state of the art. I appreciate the performance metrics that are included, but the reader is left to judge for themselves whether the results are actually acceptable. Even though in some cases presented the results appear subjectively good, no context is provided for how good they actually are compared to alternatives. In some cases, the false positive and false negative rates might be viewed as unacceptable. This is the critical factor if the manuscript is to be acceptable for publication in The Cryosphere. Based on my assessment of the reviews, I feel the authors need to show that the Multi-textRG algorithm works effectively compared to standard classification schemes or other previously proposed approaches.
4. The reviewers express some concern about the quality of the training data for the Multi-textRG algorithm. I agree that there is a question about how well ice classes are classified in that imagery. It is challenging to show definitively that their classification of these images is robust, but some additional evaluation and discussion of this is warranted to give the reader a sense of how much to trust the results.
Citation: https://doi.org/10.5194/egusphere-2024-2760-EC1
AC5: 'Reply on EC1', Yan Sun, 04 Sep 2025
AC4: 'Comment on egusphere-2024-2760', Yan Sun, 04 Sep 2025