This work is distributed under the Creative Commons Attribution 4.0 License.
Towards a manual-free labelling approach for deep learning-based ice floe instance segmentation in airborne and high-resolution optical satellite images
Abstract. Floe size distribution (FSD) has become a parameter of great interest in sea ice observations because of its influence on climate, marine ecosystems, and human activities in the polar ocean. Ice floes range in size from less than a square metre to hundreds of square kilometres, so the most effective way to monitor FSD in ice-covered regions is to apply image processing techniques to airborne and satellite remote sensing data. Segmenting individual ice floes is crucial for deriving FSD from remotely sensed images, and separating floes that appear connected remains a challenge. Although deep learning (DL) networks have achieved great success in image processing, they still have limitations in this application, a key reason being the lack of sufficient labelled data, which is costly and time-consuming to produce. To alleviate this issue, we use classical image processing techniques to annotate ice floe images without manual labelling; the resulting annotations are then used to train DL models for fast and adaptive segmentation of individual ice floes, especially for separating visibly connected ones. We also propose a post-processing algorithm to refine the segmentation. Our approach has been applied to both airborne and high-resolution optical (HRO) satellite images, and has successfully derived FSD at local and global scales.
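The classical labelling step described above can be illustrated with a minimal sketch of distance-transform marker seeding, a generic classical technique for splitting touching objects. This is an illustrative example on synthetic data with an arbitrary 0.8 depth threshold, not the authors' exact pipeline:

```python
import numpy as np
from scipy import ndimage as ndi

# Synthetic binary "ice" image: two overlapping discs stand in for
# visibly connected floes.
yy, xx = np.mgrid[0:100, 0:100]
floe_a = (yy - 50) ** 2 + (xx - 35) ** 2 < 20 ** 2
floe_b = (yy - 50) ** 2 + (xx - 65) ** 2 < 20 ** 2
ice = floe_a | floe_b

# Plain connected-component labelling cannot separate them: one blob.
_, n_plain = ndi.label(ice)

# Distance-transform markers: pixels deep inside the ice form one seed
# per floe, because the neck joining touching floes is shallow.
dist = ndi.distance_transform_edt(ice)
markers, n_markers = ndi.label(dist > 0.8 * dist.max())

print(n_plain, n_markers)  # -> 1 2
```

In a full pipeline the seeds would then be grown back to the floe boundaries (e.g. with a watershed transform) before the labelled floes are used as training masks for a DL model.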
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-295', Anonymous Referee #1, 30 Mar 2023
In this study the authors used a classical image processing technique to label the ice floe samples, and then used these samples for training a deep learning model, which was used for ice floe segmentation. The authors evaluated the algorithm using two types of remote sensing data and compared its accuracy and runtime with other methods. They claimed that this approach can achieve faster processing speed and higher accuracy. Deep learning models have been widely used in remote sensing image processing, but the prerequisite for obtaining ideal accuracy is usually a sufficient amount of training samples. Sample labeling is usually done manually, which often requires a lot of manpower and time. Using an automatic labeling method to obtain samples has certain advantages.
Although the deep learning method achieved the best results, which was also expected, using an automatic method for labeling a large number of samples and then for training deep learning models is a commonly used approach. The paper did not provide sufficient innovation, whether in terms of methodology or scientific application. It is recommended that the authors focus more on the methodology itself to address the specific technical issues encountered in the ice floe segmentation, rather than simply using samples to train the deep learning models to obtain so-called high accuracy.
General comments:
Using simple methods for automatic labeling of samples and applying them to the training of deep learning models is a common practice, and this paper does not provide enough innovation in this regard. Therefore, I believe that the originality of the paper is relatively limited.
As the authors point out, one of the advantages of this method is that it can reduce the running time. Classical methods for sample labeling take a considerable amount of time. As the number of training samples increases with the further application of the model, the training time of the model will also increase. If we only compare it with classical methods, this method additionally needs the time for model training. Of course, if we only compare the running time, the deep learning model takes less. But what is the practical significance of shortening the time? Can it be used for some near-real-time applications?
What is the difference between results from classical methods and deep learning methods? The training samples of deep learning come from the classification results of classical methods. If there are some errors in the training samples, these errors may also be introduced into the deep learning model. Although the authors believe that deep learning can overcome this problem by itself, the influence will still exist. How do the classical methods and deep learning methods affect the subsequent acquisition of ice floe parameters, and is the difference obvious?
The authors used two resolutions of remote sensing images to test the method, but I did not see the comparison of the two results. Will the spatial resolution have an impact on the algorithm? How does the sample size of different resolutions compare? What kind of impact will it have on the training of the model? How sensitive is the proposed method to the size of the ice floe? Can similar accuracy be achieved in other regions of the Arctic or at other times?
The parameter settings of the deep learning model are not clear enough, and the influences of multiple parameters on the results need to be compared to obtain the best training and classification results. The method's workflow is not clear, making it difficult for readers to follow and implement. The set of testing images is also limited, which makes it difficult to demonstrate the robustness of the method.
In addition, this method does have some practical value, and the authors could consider making the model public.
Specific comments:
Line 15: What is the difference between "sea ice" and "floe" here?
Line 15-20: This sentence is too long. It is recommended to rewrite it.
Line 20: What does "environment information" refer to?
Line 21: What does "floe parameters" refer to?
Line 22: Classical methods have some difficulties in distinguishing connected floes, and these errors may also exist in the training samples of deep learning models. Why can deep learning models overcome these problems?
Line 50: The poor performance of these models may be due to model structures, other types of models (e.g., the model you used) may resolve this problem.
Figure 2: The title of the figure is too simple and lacks necessary descriptions.
Section 3: Although you have written a lot in this section, it is still difficult to understand the processes. It is recommended that this part should be rewritten with a clearer logic, so that readers can follow and implement the method. A method flowchart is recommended here.
Line 86-87: Some data or references are needed to support this.
Line 98: What is the ratio of airborne data to satellite data in the MIZ image, and will it affect the performance of deep learning model?
Line 111-112: Is this correct? Does it contain real ice floe boundaries?
Line 119: Do you mean that this method may not achieve good results in the local regions?
Line 129: “an ice floe can be resized into several smaller ones of different scales”, do you mean that you create more small ice floe objects by resizing the large one?
Section 3.1.2: I don't understand why you divided the images into multiple scales and how to implement it specifically. More details are needed here.
In the method section, I did not see the impact of sample size on the method. You used two different resolution data sources. What is the impact of the number ratio between them on the method? If this method is applied to other regions, it may be more realistic to increase the amount of satellite data. What kind of impact will it have on the classification results mainly based on satellite images?
Line 170: So you applied a prior quality control to the samples; is this manual-free?
Line 172: Is the sample size too small for the deep learning model?
Line 217: There are no obvious differences between deep learning models of the same category, and their performances are also similar.
Figure 9: It is difficult to distinguish the difference in results between different methods, and the image below is the same.
Line 243-244: This may affect the classification performance of deep learning.
Line 250: The training samples also contain these errors. Why can deep learning automatically overcome this problem?
Figure 15: Similarly, the title of the figure is too simple and lacks necessary information.
Citation: https://doi.org/10.5194/egusphere-2023-295-RC1
AC1: 'Reply on RC1', Qin Zhang, 11 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-295/egusphere-2023-295-AC1-supplement.pdf
RC2: 'Comment on egusphere-2023-295', Anonymous Referee #2, 14 Apr 2023
Floe size distribution (FSD) has become a very important parameter in present-day sea ice modelling; however, high-resolution imagery seems to be the only source for such information. Thus, an automatic image-processing method is also important in this field. This study provides a deep learning-based segmentation method to process airborne and optical satellite images, and obtains good FSD results. It seems that a completely automatic method for obtaining FSD is becoming possible.
Actually, this is not the first time I have reviewed this manuscript. I appreciate the solid revision the authors have conducted to improve the paper. I still encourage the authors to address the remaining issues and make the manuscript smoother to follow. Such an interesting topic merits publication and will be valuable for more accurate estimation of FSD.
- The abstract talks more about the background, instead of the solid achievements in the present study. I suggest a shorter background, and more results of the present study should be presented.
- Is there any relationship between the airborne data in 2.1 and the satellite data in 2.2? Or are both employed simply to test the effect of the new method on different kinds of imagery?
- Lines 210-215. “U-Net++ with the depth of 5 achieved the best floe instance segmentation”, is this a result of “experiments to compare the performance of U-Net++ with other SoA semantic segmentation architectures”? I mean if you have known U-Net++ is the best among all, why do you compare them again? And for the other methods such as ResUNet, ResUNet++, additional explanations should be added here to tell the difference between them.
- There are two fig10e. And for figures 9-10, the differences between these results are very difficult to distinguish if no additional annotations, such as those in fig11e, are presented.
- It is a little difficult for me to follow the contents of sections 4 and 5. A possible reason is that many names of processing methods are presented, and two kinds of imagery are included as examples to show the effect of these methods. I was not told why airborne data were employed in one place but satellite imagery in another. Thus the main improvements of the present study are submerged by this information.
- There are some very interesting results in section 6. For the variations in the power-law exponent, can you give some more explanation? Otherwise, it is not necessary to present so many pictures as examples without any discussion.
- The conclusion section stops rather abruptly; can you give some evaluation of the limitations of the present study?
Citation: https://doi.org/10.5194/egusphere-2023-295-RC2
AC2: 'Reply on RC2', Qin Zhang, 11 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-295/egusphere-2023-295-AC2-supplement.pdf
RC3: 'Comment on egusphere-2023-295', Anonymous Referee #3, 08 May 2023
AC3: 'Reply on RC3', Qin Zhang, 11 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-295/egusphere-2023-295-AC3-supplement.pdf
EC1: 'Comment on egusphere-2023-295', Bin Cheng, 15 May 2023
Dear Authors,
Your manuscript received comments from 3 reviewers. Two reviewers suggested rejection, and one suggested that a major revision is necessary. One reviewer even indicated "it is not the first time for me to review this manuscript", which means your manuscript has been submitted to other journal(s) before. You should have pointed out this information in your cover letter when submitting this manuscript. TC requires such information:
Information about previous/concurrent submission or preprint:
Not available. Please remember to do so for your next submission. Back to this manuscript: the reviewers pointed out the potential of your work but also gave important comments and criticism of this study. I will give you one more chance to take all reviewer comments into account to improve your manuscript. Please provide a thoroughly revised manuscript with a truly major revision, along with point-by-point responses to the reviewers' comments. If you cannot finish the major revision within the time window of the TC system, you may submit it as a new submission.
Citation: https://doi.org/10.5194/egusphere-2023-295-EC1
AC4: 'Reply on EC1', Qin Zhang, 11 Jun 2023
Dear Editor,
Thanks for giving us the opportunity to revise our manuscript, and we thank the reviewers for their valuable comments. We will address the comments and revise our manuscript.
Citation: https://doi.org/10.5194/egusphere-2023-295-AC4
Peer review completion
Journal article(s) based on this preprint