the Creative Commons Attribution 4.0 License.
Ensemble reconstruction of missing satellite data using a denoising diffusion model: application to chlorophyll a concentration in the Black Sea
Abstract. Satellite observations provide global or near-global coverage of the World Ocean. They are, however, affected by clouds (among other factors), which severely reduce their spatial coverage. Different methods have been proposed in the literature to reconstruct missing data in satellite observations. For many applications of satellite observations, it has become increasingly important to accurately reflect the underlying uncertainty of the reconstructed observations. In this paper, we investigate the use of a denoising diffusion model to reconstruct missing observations. Such methods can naturally provide an ensemble of reconstructions where each member is spatially coherent with the scales of variability and with the available data. Rather than providing a single reconstruction, an ensemble of possible reconstructions can be computed, and the ensemble spread reflects the underlying uncertainty. We show how this method can be trained from a collection of satellite data without requiring a prior interpolation of missing data and without resorting to data from a numerical model. The reconstruction method is tested with chlorophyll a concentration from the Ocean and Land Color Instrument (OLCI) sensor (on board the satellites Sentinel-3A and Sentinel-3B) on a small area of the Black Sea and compared with the neural network DINCAE (Data-Interpolating Convolutional Auto-Encoder). The spatial scales of the reconstructed data are assessed via a variogram, and the accuracy and statistical validity of the produced ensemble reconstructions are quantified using the continuous ranked probability score and its decomposition into reliability, resolution and uncertainty.
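As an illustration of the ensemble diagnostics named in the abstract, the following is a minimal sketch of how the ensemble mean, the ensemble spread and the continuous ranked probability score (CRPS) could be computed at a single pixel. It uses the standard empirical CRPS estimator for a finite ensemble and is not taken from the paper; the function names `ensemble_crps` and `ensemble_summary` and the sample values are hypothetical.

```python
from statistics import mean, stdev


def ensemble_crps(members, observation):
    """Empirical CRPS of an ensemble for one scalar observation.

    Uses the standard estimator
        CRPS = mean_i |x_i - y| - 0.5 * mean_{i,j} |x_i - x_j|,
    which is an assumption of this sketch, not the paper's code.
    """
    n = len(members)
    # Mean absolute error of the members against the observation.
    mae = sum(abs(x - observation) for x in members) / n
    # Mean absolute difference over all ordered pairs of members.
    pairwise = sum(abs(a - b) for a in members for b in members) / (n * n)
    return mae - 0.5 * pairwise


def ensemble_summary(members):
    """Return (ensemble mean, ensemble spread) for one pixel.

    The spread is the sample standard deviation, so at least
    two members are required.
    """
    return mean(members), stdev(members)


# Hypothetical reconstructed values at one pixel (arbitrary units).
members = [1.0, 2.0, 3.0]
center, spread = ensemble_summary(members)
score = ensemble_crps(members, observation=2.0)
print(center, spread, score)
```

For a single-member ensemble this estimator reduces to the absolute error, which is why a deterministic reconstruction can be scored on the same scale as an ensemble.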
Status: closed
- RC1: 'Comment on egusphere-2024-1075', Anonymous Referee #1, 22 May 2024
The study presents a novel application of denoising diffusion probabilistic models in the context of missing observation reconstruction in satellite data. The authors solve the problem of training such a model on images with missing data and demonstrate the effectiveness of their method on a specific case of chlorophyll a reconstruction.
I consider the contents of the manuscript to be worthy of publication; however, some shortcomings have to be addressed prior to publication. A detailed description of the raised comments is provided in the attachment.
- AC1: 'Reply on RC1', Alexander Barth, 02 Aug 2024
- RC2: 'Comment on egusphere-2024-1075', Anonymous Referee #2, 07 Jun 2024
GENERAL COMMENTS
This research work proposes a new method that can be applied to the reconstruction of missing data in geophysical datasets, and more specifically cloudy satellite images. The method is applied successfully to chlorophyll images, improving on the skill of state-of-the-art reconstruction methods. The paper is well written, scientifically consistent and contains enough novel contributions. Without doubt it deserves publication, but only after some technical clarifications/corrections and a couple of small extensions of the results.
Thinking of a broad audience dealing with the problem of missing data in geophysical datasets, some parts of the document might not be very reader-friendly and look biased towards the computer vision community. Considering the scope of this journal, an effort in that direction would be appreciated.
The relatively small area considered is the main weakness of the study and introduces some concern about the applicability of the technique elsewhere. However, as the diffusion model is already trained using the complete Black Sea, it would be possible to extend the study by repeating the analysis (with the same hyper-parameters) for other areas (dynamically similar and not) to produce a figure like Figure 9 for multiple locations (see details later).
Identification of specific comments and technical corrections: P==page; L==line
SPECIFIC COMMENTS
P2L41: DINCAE refers to version in Barth et al., 2020; or alternatively to modified version in Barth et al., 2022? To both?
P3L67-73: for the broad audience dealing with geophysical missing data reconstruction techniques, this could be made a bit less technical and more descriptive
P3L86-P4L90: it is not clear if the mean and std are those of a pixel over the group of images, or alternatively are calculated image by image
P4L91-93: alpha and beta parameters are identified as diffusion step t dependent. True? Make that more explicit if it is the case. For the ease of understanding also make explicit that degradation level increases with increasing t
P4L106: implications for this application of “small steps sizes beta” not very clear
P6L135: image frequency is? 3h, 6h, daily…?
P6L140: why 20%? Conservative approach? Sensitivity detected for certain threshold?
P6L138-143: training and validation datasets are defined here, but later also test dataset is referred.
P6L140: is that the number of training images or the number after breaking the original figures in tiles?
P6L139: reason for the 64x64 tile splitting is given later; here is confusing without justification; say choice is justified later?
P6L146: what is meant with DINCAE being only applied with a fixed location? Does it mean that while the diffusion model is trained using data over the complete area DINCAE is applied only to the small box in Figure 2, and hence they are only compared over that small location?
P6L146-147: that justifies the use of a small area for testing purposes, but why not extend the comparison to other locations in the area (by for example adding extra small areas of the same size, maybe also with some overlapping?)
P6L146-149: why is this relevant? Why proceed this way? It seems relevant for the image reconstruction community, but is that the case for an audience dealing with geophysical data reconstruction?
P9L165: eq. 1 also?
P9L165: how does it ensure spatial coherence?
P9L173-174: it refers to the pixels of the added cloud mask? T is randomly selected for each pixel in that mask or the whole mask? From figure 4 and explanation in P11L196-197 it looks that is shared for all pixels in the mask…
P9L177 and Figure 4 caption: “Scaled diffusion step” in the figure lacks explanation at this point (it comes later); a basic description in the figure caption would help
P10L180-182 and Figure 4: the way “predicted noise” in figure 4 is created from the “partially corrupted image”, based on the description within this line, is not easy to understand and could be more explicit (how the neural network operates to create the figure)
P10 Figure 5: this figure is not mentioned in the text and it is not clear how it interacts with the neural network; if it is part of it or a separated process…
P11L198 would indicate that the training operates in the forward direction, while P5L127-129 indicates that it is the same trained network that is applied, but in the reverse mode, to produce an ensemble of possible reconstructed versions… True? Make this clearer
P11L197: then -1/2 refers to initial or non degraded while 1/2 is fully degraded? Make that explicit
P11L205-212: No comment about future or present public code availability?
P11L211-212: training and validation datasets were presented in “Data” section; does this refer to “validation” dataset? Is there a third “test” dataset?
P12L222: same as previous comments
P12L225: comment about “P9L165” on the spatial coherence made before clarifies here, maybe is enough to mention that only here
P13L229: any reason to select that particular ensemble size?
P13L241: Blue for zero std? Mask out using another out of the color bar (white,…)?
P15L248: regarding proposal of extension to other areas made in the “general comments”, this would imply training of DINCAE for such areas in this step.
P15L248-250: previous comments on the distinction of training, verification and test periods apply here in comparison with development phase mentioned now
P15L257-258: RMS error of the diffusion model is calculated as the ensemble average of individual RMS values?
P15L259-260 and Tables 3 and 4: It would be interesting to add some metric about the intra-ensemble variability of the RMS for the diffusion model (in Tables 3 and 4): RMS_max and RMS_min and/or variance/std of the RMS inside the ensemble. This would provide a basic idea about the quality of individual reconstructions inside the ensemble.
P18L270: “randomly chosen locations” including a certain number of comparisons? all possible?
P18L271-272: that means that comparisons are made for individual members and then averaged over the ensemble to produce the metric value?
P18L278: “does not converge to zero” for the original data and the diffusion model, not for DINCAE?
P20L295-297: meaning of the over estimated upper and lower limits?
P20-21L299-320: Although interesting, the analysis of the CRPS seems somewhat weak without any reference to compare with...
P22L336-342: multivariate reconstructions are mentioned, but what about univariate reconstructions like for Sea Surface Temperature or Sea Surface Salinity, for instance, how is the method expected to behave?
TECHNICAL CORRECTIONS
P2L33-35: the reference to the error of initially missing values is confusing
P2L51-53: hard to understand, could be clearer? An example maybe?
Citation: https://doi.org/10.5194/egusphere-2024-1075-RC2
- AC2: 'Reply on RC2', Alexander Barth, 02 Aug 2024
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
447 | 180 | 32 | 659 | 19 | 19 |