Towards a manual-free labelling approach for deep learning-based ice floe instance segmentation in airborne and high-resolution optical satellite images
Abstract. Floe size distribution (FSD) has become a parameter of great interest in sea ice observations because of its influence on climate change, marine ecosystems, and human activities in the polar ocean. The sizes of ice floes can range from less than a square metre to hundreds of square kilometres, so the most effective way to monitor FSD in ice-covered regions is to apply image processing techniques to airborne and satellite remote sensing data. The segmentation of individual ice floes is crucial for obtaining FSD from remotely sensed images, and separating floes that appear to be connected is a challenge. Although deep learning (DL) networks have achieved great success in image processing, they still have limitations in this application. A key reason is the lack of sufficient labelled data, which is costly and time-consuming to produce. To alleviate this issue, we use classical image processing techniques to achieve manual-label-free ice floe image annotation, which is then used to train DL models for fast and adaptive individual ice floe segmentation, especially for separating visibly connected floes. A post-processing algorithm is also proposed in our work to refine the segmentation. Our approach has been applied to both airborne and high-resolution optical (HRO) satellite images, and has successfully derived FSD at local and global scales.
Qin Zhang and Nick Hughes
Status: open (until 28 Apr 2023)
- RC1: 'Comment on egusphere-2023-295', Anonymous Referee #1, 30 Mar 2023
In this study the authors used a classical image processing technique to label the ice floe samples, and then used these samples for training a deep learning model, which was used for ice floe segmentation. The authors evaluated the algorithm using two types of remote sensing data and compared its accuracy and runtime with other methods. They claimed that this approach can achieve faster processing speed and higher accuracy. Deep learning models have been widely used in remote sensing image processing, but the prerequisite for obtaining ideal accuracy is usually a sufficient amount of training samples. Sample labeling is usually done manually, which often requires a lot of manpower and time. Using an automatic labeling method to obtain samples has certain advantages.
Although the deep learning method achieved the best results, which was also expected, using an automatic method for labeling a large number of samples and then for training deep learning models is a commonly used approach. The paper did not provide sufficient innovation, whether in terms of methodology or scientific application. It is recommended that the authors focus more on the methodology itself to address the specific technical issues encountered in the ice floe segmentation, rather than simply using samples to train the deep learning models to obtain so-called high accuracy.
Using simple methods for automatic labeling of samples and applying them to the training of deep learning models is a common practice, and this paper does not provide enough innovation in this regard. Therefore, I believe that the originality of the paper is relatively limited.
As the authors point out, one of the advantages of this method is that it can reduce the running time. Classical methods for sample labeling take a considerable amount of time. As the number of training samples increases with the further application of the model, the training time of the model will also increase. If we only compare it with classical methods, this method additionally needs the time for model training. Of course, if we only compare the running time, the deep learning model takes less. But what is the practical significance of shortening the time? Can it be used for some near-real-time applications?
What is the difference between results from classical methods and deep learning methods? The training samples of deep learning come from the classification results of classical methods. If there are some errors in the training samples, these errors may also be introduced into the deep learning model. Although the authors believe that deep learning can overcome this problem by itself, the influence will still exist. How do the classical methods and deep learning methods affect the subsequent acquisition of ice floe parameters, and is the difference obvious?
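One way to make the question about floe parameters concrete is to compare floe counts and areas derived from the two segmentation results. The following is a minimal Python sketch under stated assumptions, not the authors' procedure: both results are assumed to be integer instance label maps of the same scene with 0 as background, and the function names `floe_areas` and `compare_floe_stats`, the `pixel_area` parameter, and the toy arrays are all hypothetical.

```python
import numpy as np


def floe_areas(labels: np.ndarray, pixel_area: float = 1.0) -> np.ndarray:
    """Area of each floe in an instance label map (label 0 = background)."""
    counts = np.bincount(labels.ravel())  # pixels per label value
    floe_counts = counts[1:]              # drop the background bin
    return floe_counts[floe_counts > 0] * pixel_area


def compare_floe_stats(labels_a: np.ndarray, labels_b: np.ndarray,
                       pixel_area: float = 1.0) -> dict:
    """Floe count and mean floe area for two label maps of the same scene."""
    areas_a = floe_areas(labels_a, pixel_area)
    areas_b = floe_areas(labels_b, pixel_area)
    return {
        "count_a": int(areas_a.size),
        "count_b": int(areas_b.size),
        "mean_area_a": float(areas_a.mean()) if areas_a.size else 0.0,
        "mean_area_b": float(areas_b.mean()) if areas_b.size else 0.0,
    }


# Toy data (assumed): map A merges two touching floes into one object,
# map B separates them (labels 1 and 3).
labels_classical = np.array([
    [1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 2],
    [1, 1, 1, 1, 0, 2],
    [0, 0, 0, 0, 0, 0],
])
labels_dl = np.array([
    [1, 1, 3, 3, 0, 0],
    [1, 1, 3, 3, 0, 2],
    [1, 1, 3, 3, 0, 2],
    [0, 0, 0, 0, 0, 0],
])

stats = compare_floe_stats(labels_classical, labels_dl)
print(stats)
```

In this toy case the first map reports two floes with a mean area of 7 pixels, while the second reports three floes with a smaller mean area. A difference of this kind would shift a derived FSD toward larger sizes when connected floes are not separated.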
The authors used two resolutions of remote sensing images to test the method, but I did not see the comparison of the two results. Will the spatial resolution have an impact on the algorithm? How does the sample size of different resolutions compare? What kind of impact will it have on the training of the model? How sensitive is the proposed method to the size of the ice floe? Can similar accuracy be achieved in other regions of the Arctic or at other times?
The parameter settings of the deep learning model are not clear enough, and the influences of multiple parameters on the results need to be compared to obtain the best training and classification results. The method process is not clear, making it difficult for readers to follow and implement. The number of test images is also limited, which makes it difficult to demonstrate the robustness of the method.
In addition, this method does have some practical value, and the authors could consider making the method and model public.
Line 15: What is the difference between "sea ice" and "floe" here?
Lines 15-20: This sentence is too long. It is recommended to rewrite it.
Line 20: What does "environment information" refer to?
Line 21: What does "floe parameters" refer to?
Line 22: Classical methods have some difficulties in distinguishing connected floes, and these errors may also exist in the training samples of deep learning models. Why can deep learning models overcome these problems?
Line 50: The poor performance of these models may be due to their structures; other types of models (e.g., the model you used) may resolve this problem.
Figure 2: The title of the figure is too simple and lacks necessary descriptions.
Section 3: Although you have written a lot in this section, it is still difficult to understand the processes. It is recommended that this part should be rewritten with a clearer logic, so that readers can follow and implement the method. A method flowchart is recommended here.
Lines 86-87: Some data or references are needed to support this.
Line 98: What is the ratio of airborne data to satellite data in the MIZ image, and will it affect the performance of deep learning model?
Lines 111-112: Is this correct? Does it contain real ice floe boundaries?
Line 119: Do you mean that this method may not achieve good results in the local regions?
Line 129: “an ice floe can be resized into several smaller ones of different scales”. Do you mean that you create more small ice floe objects by resizing the large one?
Section 3.1.2: I don't understand why you divided the images into multiple scales and how to implement it specifically. More details are needed here.
In the method section, I did not see the impact of sample size on the method. You used data sources with two different resolutions. What is the impact of the ratio between their sample counts on the method? If this method is applied to other regions, it may be more realistic to increase the amount of satellite data. What kind of impact would that have on classification results based mainly on satellite images?
Line 170: So you applied quality control to the samples beforehand. Is this still manual-free?
Line 172: Is the sample size too small for the deep learning model?
Line 217: There are no obvious differences between deep learning models of the same category, and their performances are also similar.
Figure 9: It is difficult to distinguish the difference in results between different methods, and the same applies to the image below.
Lines 243-244: This may affect the classification performance of deep learning.
Line 250: The training samples also contain these errors. Why can deep learning automatically overcome this problem?
Figure 15: Similarly, the title of the figure is too simple and lacks necessary information.