AI-derived 3D cloud tomography from geostationary 2D satellite data

Brüning, Sarah; Niebler, Stefan; Tost, Holger

doi:https://doi.org/10.5194/egusphere-2023-1834

Preprints

https://doi.org/10.5194/egusphere-2023-1834

Preprints

16 Aug 2023

| 16 Aug 2023

AI-derived 3D cloud tomography from geostationary 2D satellite data

Sarah Brüning, Stefan Niebler, and Holger Tost

Abstract. Satellite instruments provide spatially extended data with a high temporal resolution on almost global scales. However, nowadays, it is still a challenge to extract fully three-dimensional data from the current generation of satellite instruments, which either provide horizontal patterns or vertical profiles along the orbit track. Following this, we train a neural network in this study to generate three-dimensional cloud structures from MSG SEVIRI satellite data in high spatio-temporal resolution. We evaluate the derived artificial intelligence-based predictions against the along-track radar reflectivity from the CloudSat satellite. By inferring the pixel-wise cloud column to the satellite’s full disk, our results emphasize that spatio-temporal dynamics can be delineated for the whole domain. Robust reflectivities are derived for different cloud types with a clear distinction regarding the cloud's intensity, height, and shape. Cloud-free pixels tend to be over-represented because of the high imbalance between cloudy and clear-sky samples. The average error (RMSE) spans about 7.5 % (3.41 dBZ) of the total value range enabling the advanced analysis of vertical cloud properties. Although we receive high accordance between radar data and our predictions, the quality of the results varies with the complexity of the cloud structure. The representation of multi-level and mesoscale clouds is often simplified. Despite current limitations, the obtained results can help close current data gaps and exhibit the potential to be applied to various climate science questions, like the further investigation of deep convection through time and space.

Received: 11 Aug 2023 – Discussion started: 16 Aug 2023

Download & links

Preprint (PDF, 8128 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Preprint (8128 KB)

Download & links

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

09 Feb 2024

Artificial intelligence (AI)-derived 3D cloud tomography from geostationary 2D satellite data

Sarah Brüning, Stefan Niebler, and Holger Tost

Atmos. Meas. Tech., 17, 961–978, https://doi.org/10.5194/amt-17-961-2024,https://doi.org/10.5194/amt-17-961-2024, 2024

Short summary

Sarah Brüning, Stefan Niebler, and Holger Tost

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-1834', Anonymous Referee #1, 08 Sep 2023

Review of "AI-derived 3D cloud tomography from geostationary 2D satellite data" by Brüning et al.
This paper presents a neural network that can reconstruct 3D cloud structures from 2D visible/IR satellite data. The authors describe a network architecture that maps the 2D observations to 3D fields. They then present a comparison to other machine-learning methods and compare the statistics of network-predicted and real observations.
The most interesting part of the paper to me is the network structure that generates the 3D fields. The U-Net structure is appropriate for the problem at hand, although it is not quite clear from the description how the transformation from 2D to 3D is performed.
While the network is interesting, I wonder about the usefulness of the model outputs since the model clearly suffers from regression to the mean and blurriness of the results. Also, the uncertainty of the outputs is not estimated. Could you comment a bit on which applications would find the current results useful? And how the abovementioned issues could be improved?
The language in the paper needs some improvement. There are awkward sentence structures throughout the paper, as well as misuse of vocabulary and terminology, which make the paper hard to understand at times. I point out some examples in the specific comments, but these are not an exclusive list; the whole paper would benefit from editing. In contrast, the figures in the paper are clear and well made.
Specific comments:
Lines 24-25: "Passive sensors such as geostationary satellites": the statement needs more precision, satellites are not sensors
Lines 36-37: "The large-scale generability of these methods is expandable since their 3D results are limited to the cloud’s spatial vicinity": I don't understand this sentence, please clarify.
Line 48: "time efficiency and feasibility": it's not clear to me what this means
Line 80: "orbiting the globe on a sinusoidal track": how does a satellite orbit on a "sinusoidal track"?
Lines 91-92: "resampled to a geographic grid": what kind of grid, a lat-lon one?
Section 2.1.2: CloudSat is on a sun-synchronous orbit, meaning it sees every location at the same local solar time. This might introduce some diurnal bias to the data; this should be acknowledged.
Line 109: The use of "XY" is confusing, I read this initially as "X times Y" but apparently you mean a diagonal transect through the image? Or did I misunderstand?
Line 125: "smoothing" should probably be "filtering"
Line 137: The use of the word "delineate" here and a couple of other places in the places seems incorrect
Line 159: I would like some more details on how the network structure maps the 2D input fields to the 3D output fields. Is the channels dimension used transformed to the Z dimension of the output?
Lines 163-164: How is the 3D scene predicted by the network compared to the CloudSat data during training? CloudSat only gets a 2D vertical cross section of the scene. Is only part of the scene selected for comparison? Also, what loss function do you use for training?
Line 176: "Both models": unclear which models this refers to
Line 181: "pictures" is used incorrectly here.
Lines 189-190: In the joined 2400 x 2400 pixel 3D prediction, is the field continuous at the borders of the 100 x 100 pixel tiles? Or do you see discontinuities or artifacts?
Line 220: "That said" seems out of place here - please revise.
Line 231: I don't understand that "denominational structure" means here.
Lines 262-263: High, thin ice clouds may also not be observed by CloudSat due to being under the minimum detectable reflectivity.
Figure 7: Maybe you could add a panel showing the difference of a and b to illustrate the biases better.
Line 288: "Leaving out the affected channels downgrades the overall performance": it would be good to see something to demonstrate this.
Lines 292-293: "In contrast to pixel-based DL methods like the CNN or CGAN, the Res-UNet utilizes a larger receptive field preserving the spatial dimensionality and global context information during the training routine." This is not a correct statement regarding the CNN or CGAN architectures. CNN architectures can also achieve large receptive fields and global context using downsampling. In fact the UNet itself is a type of CNN - its distinguishing feature is the addition of skip connections to preserve resolution. As for the CGAN, it refers to a certain training setup of generative models that could be implemented with either normal CNNs or with (Res-)UNets.
Lines 312-313: Approximately 1 km resolution is also already available from the GOES-R series and Himawari 8/9 satellites.
Lines 321-322: "Since it is independent of external or interconnected data sources, the bias within the data is reduced.": unclear sentence, I'm not sure how the latter follows from the former.

Citation: https://doi.org/10.5194/egusphere-2023-1834-RC1
- AC1: 'Reply on RC1', Sarah Brüning, 03 Nov 2023
  
  We appreciate your comments and thank you for taking your time to write the detailed review. Please find our answers to each of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC1
RC2:
'Comment on egusphere-2023-1834', Anonymous Referee #2, 06 Oct 2023
This paper by Brüning et al. presents an ML-based method to reconstruct 3D cloud profiles from 2D images. The authors employ a Res-UNet architecture to translate data collected by the GOES satellite to cloud profiles measured by CloudSat, and compare their methodology to pixel-based translation approaches.
This is an interesting piece of work with promising results. To the best of my knowledge, existing work using ML to reconstruct cloud profiles primarily focuses on pixel-based approaches, i.e. translating 1D to 2D information. In this context, this work presents a substantial improvement, translating 2D images into 3D cloud structures.
While I recommend the paper for publication, I suggest addressing a series of general and specific comments beforehand to increase the rigor and clarity of the presented work.
General comments:
In my opinion, any application of machine learning to scientific research questions requires some level of uncertainty quantification. I appreciate that this might go beyond the scope of the paper, but estimating model uncertainties would greatly improve the trustworthiness of the results, and likely help inform how (and for which scenarios) the model needs improving.

The writing is at times too verbose and unclear. I am including a number of specific comments below, but I would suggest going through the paper again and streamlining the narrative.

At times, the section headings could be more specific. Some of them (e.g. “Comprehensive predictions”) are not particularly descriptive of the work done.

Please add tables of the training and hyperparameters in the paper (or better Supplementary Information). Most importantly, please include information about the number of parameters of each of your models.

Specific comments:
Line 9; Abstract: “average error” is a bit vague and not the same as RMSE. Please revise. Furthermore, it is not clear what “total value range” in this sentence is referring to.

Line 27: “[...] spatially and temporally limited perspective [...].” Please expand on what the spatial and temporal characteristics of the different instruments used in this work are. Specifically, I think the authors could expand on the benefit of geostationary satellites when it comes to temporal resolution, as polar-orbiting satellites often require 1+ days to observe the same area again.

Line 29: “While this analysis often rests upon subjective labeling or fixed thresholds [...].” Please add example citations to previous work.

Line 33: “joined” ==> joint

Line 36: I am not sure what “generability” means.

Line 54: I would suggest including other works, such as https://arxiv.org/abs/1911.04227, when discussing cloud classification.

Line 70+: The sentence is very long and hard to follow. Please rephrase.

Line 78: “[...] originates a geostationary satellite.” ==> “from” missing.

Line 84: The EUMETSAT abbreviation was already used in line 78, before it was defined here.

Figure 1: It would be great if you could add information about the size of the boxes shown in Part 1, to make it more clear how the matching algorithm works. Furthermore, it was confusing to see 90 height levels in the figure, when the text previously described CloudSat to have 125 height levels. Could you also comment on the missing data shown in the CloudSat profile?

Line 114: “Extracted satellite samples display the physical predictors fed into the network [...].” Please rephrase, as it is not clear to me what you mean.

Line 115: I would rename “samples” as “matched image-profile pairs” or similar to be more exact.

Line 117: How come you have no test set? Validation sets are great, but susceptible to “human gradient descent”, since we usually justify model modifications by improved performances on the validation set. This doesn’t necessarily mean that the model is better, it just means that performance on these specific examples is improved.

Line 123: How was the reduction from 125 to 90 height levels done?

Line 140: “By seeking non-linear approximations of between the input and the output data [...].” Please rephrase.

Line 163: “As flipped images are perceived as new samples, we enhance the amount of training data by giving all samples a chance of 25% to be either vertically or horizontally rotated.” Please comment on whether these transformations are valid for the context of satellite measurements, especially CloudSat, with its ascending and descending orbits. Are the two acquisitions totally equal, in that flipping one can simulate the other?

Equation 3: Either define all variables in the equation, or remove from the paper, as the RMSE is a relatively standard quantity.

Line 173-174: Rephrase “horizontal diagonal”.

Figure 3: Please explain how many samples the joint plot was calculated over.

Line 218+: This sentence is very hard to follow. Please re-write.

Line 220+: Do you have any idea why you observe different performance trends for your pixel-based and Res-UNet approaches?

Line 226: How were the four samples chosen?

Line 228: What do you mean by “transferability” in this context?

Line 231: “A denominational structure [...].” I am not sure what you mean by this.

Figure 4: Please label each subplot, and refer to the subplots as you discuss the results in the main text to make it easier to follow your arguments.

Line 233: “[...] the Res-UNet shows more robust results [...].” I am not 100% convinced that the RMSE and visual inspection of the results qualify this sentence. Have you looked at any other metrics, or quantitatively studied performance as a function of cloud type for more than a couple of samples?

Line 235+: I am not sure I follow your discussion about quality flags. Please clarify.

Line 238-239: “[...] this leads to a lower model uncertainty.” Do you mean “lower error”? Model uncertainties to me means quantification of how certain a model is in its predictions.

Line 239: “That said [...].” This sentence isn’t quite clear to me. Please rephrase/clarify.

Line 260: “The first peak [...].” It would be really helpful if you could refer to the labels of the subplots when discussing the results.

Line 264+: “These channels are identified as essential information [...].” It would be great if you could show evidence for this, or expand the discussion on what makes you draw this conclusion.

Line 267+: “Comparing [...] reveals an overall high agreement.” Do you have any quantitative comparisons between your model outputs and the CTH from CLAAS, or is this mainly from visual inspection?

Line 277: “The approach offers [...].” Which approach are you referring to?

Figure 6: Is the comparison calculated across the entire FD of the model predictions, or just the tracks that overlap with CloudSat? If it’s non-normalized frequencies, they should be calculated across the same number of datapoints to be comparable, if I am not mistaken? Please clarify in the caption and main text.

Discussion section: I found it at times hard to follow when you are referring to your own work, versus work done by other people. Could you please go through this section again?

Line 280: Please clarify what you mean by “minimal architecture”.

Line 282: “others” ==> others’ work

Line 284: “Nonetheless, defining those variables as additional predictors has a negligible effect on the model performance.” Are you showing evidence for this somewhere?

Line 288: “Leaving out the affected channels downgrades the overall performance.” Same here, do you have evidence supporting this statement?
Citation: https://doi.org/10.5194/egusphere-2023-1834-RC2
- AC2: 'Reply on RC2', Sarah Brüning, 03 Nov 2023
  
  Thank you for the overall positive evaluation and for taking your time to write the detailed review. We appreciate your comments and interesting suggestions. Please find our detailed answer to all of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC2
RC3:
'Comment on egusphere-2023-1834', Anonymous Referee #3, 11 Oct 2023

This work sets out to provide a 3D radar reflectance ``retrieval'' from SEVIRI data using a machine learning approach. By training a network to reconstruct CloudSat reflectivity profiles, the authors reconstruct reflectivity profiles across the centre of the SEVIRI disc, demonstrating reasonable skill in the network to retrieve vertical cloud properties from SEVIRI radiances alone.
I think this is an exciting study of interest to the readers of AMT. I would consider it suitable for publication, subject to some modifications.

Main points
This looks to me as if this study has the characteristics of an ML-based retrieval, where to properties being retrieved from the SEVIRI radiances are the vertical profile of radar reflectivity. This has some similarities to other algorithms, such as the GPROF retrieval (Kummerow et al, J. App. Met., 2001), which could be discussed. Perhaps the method in this paper could produce improved precipitation retrievals in a similar style to the GPROF retrieval (although not in this publication)?
Although it doesn't seem to be at the level where I could be comfortable using the retrieved reflectivity profiles yet, but perhaps there are certain conditions where the data could be more trustworthy? Would limiting the study to a smaller area (e.g. close to nadir for SEVIRI) produce better results?

Line by Line comments
Bias in retrievals based on approximations and cloud physical properties L9 - It is later stated that RMSE varies significantly by cloud situation - how is this value derived?
L10 - receive high accordance -> find high agreement?
L30 - at risk of bias. I would argue this is almost always an issue (and would be for a ML method too)
L30 - If passive sensors lack this information, how can it be recovered from this method?
L33 - joined -> combined
L88 - The near IR channels also include reflected solar radiation (depending on channel wavelength and time of day)
L109 - How is parallax addressed? SEVIRI has an off-nadir view for most points on the disc, while CloudSat always views at nadir. This could cause registration issues between different cloud layers. Changes in resolution might also become an issue closer to the edge of the disc.
L110 - I found the last sentence difficult to follow. What does the 'factor to coarse grain the data' mean here?
Fig. 2 - Why not low cloud layers? This is explained later on, but could have been included in the description of the radar data (unless I missed it?)
Fig. 3 - How is this normalisation done? Is it across the whole image?
L220 - Why is the DL model different here? Is this due to a regression-to-the-mean effect for the Res-UNet?
L262 - I think thin clouds are also difficult to detect from CloudSat, given they might have a small radar reflectivity. Both SEVIRI and CloudSat on their own are able to produce this second, higher peak in cloud top height, suggesting they can detect these thinner clouds.
Fig. 4 - Perhaps blur the Cloudsat data to match the DL resolution? It would also be nice to have 5km marked at the bottom of the y-axis (if that is the case), to highlight this doesn't go to zero.
L270 - So it is (or could be) a simultaneous retrieval, due to the inclusion of the water vapour channels?
Fig. 7 - Much more structure on the Res-UNet results - why is this? Also, the Res-UNet image seems to have almost all the cloud tops at the same altitude, other than a small fraction of low-level cloud in some regions.
L299 - I guess this is much like satellite retrievals in general? Presumably there is actually just a lack of information there?
L300 - 5km is quite high from the ground to deal with clutter. It also cuts out a lot of the lower level clouds with a more challenging cloud top height retrieval, which might make the DL method seem better overall? I am not suggesting they should be included (working with just high clouds is an important task), but could be worth mentioning.
L321 - I think this reduction in bias should be shown if it is claimed. Those external data sources might decrease bias themselves in some situations (e.g. with a better estimate of water vapour than is available from the SEVIRI IR channels).
L325 - maybe ``remote oceanic regions'' instead of ``secluded regions above the sea surface''?

Citation: https://doi.org/10.5194/egusphere-2023-1834-RC3
- AC3: 'Reply on RC3', Sarah Brüning, 03 Nov 2023
  
  Thank you for your overall positive feedback and for taking your time to write the detailed review. We appreciate your comments and interesting suggestions. Please find our detailed answer to all of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC3

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-1834', Anonymous Referee #1, 08 Sep 2023

Review of "AI-derived 3D cloud tomography from geostationary 2D satellite data" by Brüning et al.
This paper presents a neural network that can reconstruct 3D cloud structures from 2D visible/IR satellite data. The authors describe a network architecture that maps the 2D observations to 3D fields. They then present a comparison to other machine-learning methods and compare the statistics of network-predicted and real observations.
The most interesting part of the paper to me is the network structure that generates the 3D fields. The U-Net structure is appropriate for the problem at hand, although it is not quite clear from the description how the transformation from 2D to 3D is performed.
While the network is interesting, I wonder about the usefulness of the model outputs since the model clearly suffers from regression to the mean and blurriness of the results. Also, the uncertainty of the outputs is not estimated. Could you comment a bit on which applications would find the current results useful? And how the abovementioned issues could be improved?
The language in the paper needs some improvement. There are awkward sentence structures throughout the paper, as well as misuse of vocabulary and terminology, which make the paper hard to understand at times. I point out some examples in the specific comments, but these are not an exclusive list; the whole paper would benefit from editing. In contrast, the figures in the paper are clear and well made.
Specific comments:
Lines 24-25: "Passive sensors such as geostationary satellites": the statement needs more precision, satellites are not sensors
Lines 36-37: "The large-scale generability of these methods is expandable since their 3D results are limited to the cloud’s spatial vicinity": I don't understand this sentence, please clarify.
Line 48: "time efficiency and feasibility": it's not clear to me what this means
Line 80: "orbiting the globe on a sinusoidal track": how does a satellite orbit on a "sinusoidal track"?
Lines 91-92: "resampled to a geographic grid": what kind of grid, a lat-lon one?
Section 2.1.2: CloudSat is on a sun-synchronous orbit, meaning it sees every location at the same local solar time. This might introduce some diurnal bias to the data; this should be acknowledged.
Line 109: The use of "XY" is confusing, I read this initially as "X times Y" but apparently you mean a diagonal transect through the image? Or did I misunderstand?
Line 125: "smoothing" should probably be "filtering"
Line 137: The use of the word "delineate" here and a couple of other places in the places seems incorrect
Line 159: I would like some more details on how the network structure maps the 2D input fields to the 3D output fields. Is the channels dimension used transformed to the Z dimension of the output?
Lines 163-164: How is the 3D scene predicted by the network compared to the CloudSat data during training? CloudSat only gets a 2D vertical cross section of the scene. Is only part of the scene selected for comparison? Also, what loss function do you use for training?
Line 176: "Both models": unclear which models this refers to
Line 181: "pictures" is used incorrectly here.
Lines 189-190: In the joined 2400 x 2400 pixel 3D prediction, is the field continuous at the borders of the 100 x 100 pixel tiles? Or do you see discontinuities or artifacts?
Line 220: "That said" seems out of place here - please revise.
Line 231: I don't understand that "denominational structure" means here.
Lines 262-263: High, thin ice clouds may also not be observed by CloudSat due to being under the minimum detectable reflectivity.
Figure 7: Maybe you could add a panel showing the difference of a and b to illustrate the biases better.
Line 288: "Leaving out the affected channels downgrades the overall performance": it would be good to see something to demonstrate this.
Lines 292-293: "In contrast to pixel-based DL methods like the CNN or CGAN, the Res-UNet utilizes a larger receptive field preserving the spatial dimensionality and global context information during the training routine." This is not a correct statement regarding the CNN or CGAN architectures. CNN architectures can also achieve large receptive fields and global context using downsampling. In fact the UNet itself is a type of CNN - its distinguishing feature is the addition of skip connections to preserve resolution. As for the CGAN, it refers to a certain training setup of generative models that could be implemented with either normal CNNs or with (Res-)UNets.
Lines 312-313: Approximately 1 km resolution is also already available from the GOES-R series and Himawari 8/9 satellites.
Lines 321-322: "Since it is independent of external or interconnected data sources, the bias within the data is reduced.": unclear sentence, I'm not sure how the latter follows from the former.

Citation: https://doi.org/10.5194/egusphere-2023-1834-RC1
- AC1: 'Reply on RC1', Sarah Brüning, 03 Nov 2023
  
  We appreciate your comments and thank you for taking your time to write the detailed review. Please find our answers to each of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC1
RC2:
'Comment on egusphere-2023-1834', Anonymous Referee #2, 06 Oct 2023
This paper by Brüning et al. presents an ML-based method to reconstruct 3D cloud profiles from 2D images. The authors employ a Res-UNet architecture to translate data collected by the GOES satellite to cloud profiles measured by CloudSat, and compare their methodology to pixel-based translation approaches.
This is an interesting piece of work with promising results. To the best of my knowledge, existing work using ML to reconstruct cloud profiles primarily focuses on pixel-based approaches, i.e. translating 1D to 2D information. In this context, this work presents a substantial improvement, translating 2D images into 3D cloud structures.
While I recommend the paper for publication, I suggest addressing a series of general and specific comments beforehand to increase the rigor and clarity of the presented work.
General comments:
In my opinion, any application of machine learning to scientific research questions requires some level of uncertainty quantification. I appreciate that this might go beyond the scope of the paper, but estimating model uncertainties would greatly improve the trustworthiness of the results, and likely help inform how (and for which scenarios) the model needs improving.

The writing is at times too verbose and unclear. I am including a number of specific comments below, but I would suggest going through the paper again and streamlining the narrative.

At times, the section headings could be more specific. Some of them (e.g. “Comprehensive predictions”) are not particularly descriptive of the work done.

Please add tables of the training and hyperparameters in the paper (or better Supplementary Information). Most importantly, please include information about the number of parameters of each of your models.

Specific comments:
Line 9; Abstract: “average error” is a bit vague and not the same as RMSE. Please revise. Furthermore, it is not clear what “total value range” in this sentence is referring to.

Line 27: “[...] spatially and temporally limited perspective [...].” Please expand on what the spatial and temporal characteristics of the different instruments used in this work are. Specifically, I think the authors could expand on the benefit of geostationary satellites when it comes to temporal resolution, as polar-orbiting satellites often require 1+ days to observe the same area again.

Line 29: “While this analysis often rests upon subjective labeling or fixed thresholds [...].” Please add example citations to previous work.

Line 33: “joined” ==> joint

Line 36: I am not sure what “generability” means.

Line 54: I would suggest including other works, such as https://arxiv.org/abs/1911.04227, when discussing cloud classification.

Line 70+: The sentence is very long and hard to follow. Please rephrase.

Line 78: “[...] originates a geostationary satellite.” ==> “from” missing.

Line 84: The EUMETSAT abbreviation was already used in line 78, before it was defined here.

Figure 1: It would be great if you could add information about the size of the boxes shown in Part 1, to make it more clear how the matching algorithm works. Furthermore, it was confusing to see 90 height levels in the figure, when the text previously described CloudSat to have 125 height levels. Could you also comment on the missing data shown in the CloudSat profile?

Line 114: “Extracted satellite samples display the physical predictors fed into the network [...].” Please rephrase, as it is not clear to me what you mean.

Line 115: I would rename “samples” as “matched image-profile pairs” or similar to be more exact.

Line 117: How come you have no test set? Validation sets are great, but susceptible to “human gradient descent”, since we usually justify model modifications by improved performances on the validation set. This doesn’t necessarily mean that the model is better, it just means that performance on these specific examples is improved.

Line 123: How was the reduction from 125 to 90 height levels done?

Line 140: “By seeking non-linear approximations of between the input and the output data [...].” Please rephrase.

Line 163: “As flipped images are perceived as new samples, we enhance the amount of training data by giving all samples a chance of 25% to be either vertically or horizontally rotated.” Please comment on whether these transformations are valid for the context of satellite measurements, especially CloudSat, with its ascending and descending orbits. Are the two acquisitions totally equal, in that flipping one can simulate the other?

Equation 3: Either define all variables in the equation, or remove from the paper, as the RMSE is a relatively standard quantity.

Line 173-174: Rephrase “horizontal diagonal”.

Figure 3: Please explain how many samples the joint plot was calculated over.

Line 218+: This sentence is very hard to follow. Please re-write.

Line 220+: Do you have any idea why you observe different performance trends for your pixel-based and Res-UNet approaches?

Line 226: How were the four samples chosen?

Line 228: What do you mean by “transferability” in this context?

Line 231: “A denominational structure [...].” I am not sure what you mean by this.

Figure 4: Please label each subplot, and refer to the subplots as you discuss the results in the main text to make it easier to follow your arguments.

Line 233: “[...] the Res-UNet shows more robust results [...].” I am not 100% convinced that the RMSE and visual inspection of the results qualify this sentence. Have you looked at any other metrics, or quantitatively studied performance as a function of cloud type for more than a couple of samples?

Line 235+: I am not sure I follow your discussion about quality flags. Please clarify.

Line 238-239: “[...] this leads to a lower model uncertainty.” Do you mean “lower error”? Model uncertainties to me means quantification of how certain a model is in its predictions.

Line 239: “That said [...].” This sentence isn’t quite clear to me. Please rephrase/clarify.

Line 260: “The first peak [...].” It would be really helpful if you could refer to the labels of the subplots when discussing the results.

Line 264+: “These channels are identified as essential information [...].” It would be great if you could show evidence for this, or expand the discussion on what makes you draw this conclusion.

Line 267+: “Comparing [...] reveals an overall high agreement.” Do you have any quantitative comparisons between your model outputs and the CTH from CLAAS, or is this mainly from visual inspection?

Line 277: “The approach offers [...].” Which approach are you referring to?

Figure 6: Is the comparison calculated across the entire FD of the model predictions, or just the tracks that overlap with CloudSat? If it’s non-normalized frequencies, they should be calculated across the same number of datapoints to be comparable, if I am not mistaken? Please clarify in the caption and main text.

Discussion section: I found it at times hard to follow when you are referring to your own work, versus work done by other people. Could you please go through this section again?

Line 280: Please clarify what you mean by “minimal architecture”.

Line 282: “others” ==> others’ work

Line 284: “Nonetheless, defining those variables as additional predictors has a negligible effect on the model performance.” Are you showing evidence for this somewhere?

Line 288: “Leaving out the affected channels downgrades the overall performance.” Same here, do you have evidence supporting this statement?
Citation: https://doi.org/10.5194/egusphere-2023-1834-RC2
- AC2: 'Reply on RC2', Sarah Brüning, 03 Nov 2023
  
  Thank you for the overall positive evaluation and for taking your time to write the detailed review. We appreciate your comments and interesting suggestions. Please find our detailed answer to all of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC2
RC3:
'Comment on egusphere-2023-1834', Anonymous Referee #3, 11 Oct 2023

This work sets out to provide a 3D radar reflectance ``retrieval'' from SEVIRI data using a machine learning approach. By training a network to reconstruct CloudSat reflectivity profiles, the authors reconstruct reflectivity profiles across the centre of the SEVIRI disc, demonstrating reasonable skill in the network to retrieve vertical cloud properties from SEVIRI radiances alone.
I think this is an exciting study of interest to the readers of AMT. I would consider it suitable for publication, subject to some modifications.

Main points
This looks to me as if this study has the characteristics of an ML-based retrieval, where to properties being retrieved from the SEVIRI radiances are the vertical profile of radar reflectivity. This has some similarities to other algorithms, such as the GPROF retrieval (Kummerow et al, J. App. Met., 2001), which could be discussed. Perhaps the method in this paper could produce improved precipitation retrievals in a similar style to the GPROF retrieval (although not in this publication)?
Although it doesn't seem to be at the level where I could be comfortable using the retrieved reflectivity profiles yet, but perhaps there are certain conditions where the data could be more trustworthy? Would limiting the study to a smaller area (e.g. close to nadir for SEVIRI) produce better results?

Line by Line comments
Bias in retrievals based on approximations and cloud physical properties L9 - It is later stated that RMSE varies significantly by cloud situation - how is this value derived?
L10 - receive high accordance -> find high agreement?
L30 - at risk of bias. I would argue this is almost always an issue (and would be for a ML method too)
L30 - If passive sensors lack this information, how can it be recovered from this method?
L33 - joined -> combined
L88 - The near IR channels also include reflected solar radiation (depending on channel wavelength and time of day)
L109 - How is parallax addressed? SEVIRI has an off-nadir view for most points on the disc, while CloudSat always views at nadir. This could cause registration issues between different cloud layers. Changes in resolution might also become an issue closer to the edge of the disc.
L110 - I found the last sentence difficult to follow. What does the 'factor to coarse grain the data' mean here?
Fig. 2 - Why not low cloud layers? This is explained later on, but could have been included in the description of the radar data (unless I missed it?)
Fig. 3 - How is this normalisation done? Is it across the whole image?
L220 - Why is the DL model different here? Is this due to a regression-to-the-mean effect for the Res-UNet?
L262 - I think thin clouds are also difficult to detect from CloudSat, given they might have a small radar reflectivity. Both SEVIRI and CloudSat on their own are able to produce this second, higher peak in cloud top height, suggesting they can detect these thinner clouds.
Fig. 4 - Perhaps blur the Cloudsat data to match the DL resolution? It would also be nice to have 5km marked at the bottom of the y-axis (if that is the case), to highlight this doesn't go to zero.
L270 - So it is (or could be) a simultaneous retrieval, due to the inclusion of the water vapour channels?
Fig. 7 - Much more structure on the Res-UNet results - why is this? Also, the Res-UNet image seems to have almost all the cloud tops at the same altitude, other than a small fraction of low-level cloud in some regions.
L299 - I guess this is much like satellite retrievals in general? Presumably there is actually just a lack of information there?
L300 - 5km is quite high from the ground to deal with clutter. It also cuts out a lot of the lower level clouds with a more challenging cloud top height retrieval, which might make the DL method seem better overall? I am not suggesting they should be included (working with just high clouds is an important task), but could be worth mentioning.
L321 - I think this reduction in bias should be shown if it is claimed. Those external data sources might decrease bias themselves in some situations (e.g. with a better estimate of water vapour than is available from the SEVIRI IR channels).
L325 - maybe ``remote oceanic regions'' instead of ``secluded regions above the sea surface''?

Citation: https://doi.org/10.5194/egusphere-2023-1834-RC3
- AC3: 'Reply on RC3', Sarah Brüning, 03 Nov 2023
  
  Thank you for your overall positive feedback and for taking your time to write the detailed review. We appreciate your comments and interesting suggestions. Please find our detailed answer to all of your comments in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2023-1834-AC3

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Sarah Brüning on behalf of the Authors (15 Nov 2023) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (17 Nov 2023) by Cuiqi Zhang

RR by Anonymous Referee #3 (06 Dec 2023)

RR by Anonymous Referee #2 (07 Dec 2023)

ED: Publish subject to minor revisions (review by editor) (11 Dec 2023) by Cuiqi Zhang

AR by Sarah Brüning on behalf of the Authors (13 Dec 2023) Author's response Author's tracked changes Manuscript

ED: Publish as is (22 Dec 2023) by Cuiqi Zhang

AR by Sarah Brüning on behalf of the Authors (30 Dec 2023)

Journal article(s) based on this preprint

09 Feb 2024

Artificial intelligence (AI)-derived 3D cloud tomography from geostationary 2D satellite data

Sarah Brüning, Stefan Niebler, and Holger Tost

Atmos. Meas. Tech., 17, 961–978, https://doi.org/10.5194/amt-17-961-2024,https://doi.org/10.5194/amt-17-961-2024, 2024

Short summary

Sarah Brüning, Stefan Niebler, and Holger Tost

Viewed

Total article views: 572 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
340	201	31	572	14	12

HTML: 340
PDF: 201
XML: 31
Total: 572
BibTeX: 14
EndNote: 12

Views and downloads (calculated since 16 Aug 2023)

Month	HTML	PDF	XML	Total
Aug 2023	131	106	5	242
Sep 2023	66	18	4	88
Oct 2023	69	28	6	103
Nov 2023	43	14	10	67
Dec 2023	11	11	5	27
Jan 2024	15	14	1	30
Feb 2024	5	10	0	15

Cumulative views and downloads (calculated since 16 Aug 2023)

Month	HTML	PDF	XML	Total
Aug 2023	131	106	5	242
Sep 2023	66	18	4	88
Oct 2023	69	28	6	103
Nov 2023	43	14	10	67
Dec 2023	11	11	5	27
Jan 2024	15	14	1	30
Feb 2024	5	10	0	15

Viewed (geographical distribution)

Total article views: 556 (including HTML, PDF, and XML) Thereof 556 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 09 Feb 2024

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint (8128 KB)
Metadata XML

Short summary

This study applies a Res-UNet to derive a comprehensive 3D cloud tomography from 2D satellite data over heterogeneous landscapes. We combine observational data from passive and active remote sensing sensors by an automated matching algorithm. This data is fed into a neural network to predict cloud reflectivities on the whole satellite domain between 5–24 km height. With an average RMSE of 3.41 dBZ, we contribute to closing existing data gaps in the representation of real-world cloud structures.

AI-derived 3D cloud tomography from geostationary 2D satellite data

Journal article(s) based on this preprint

Interactive discussion

Interactive discussion

Peer review completion

Suggestions for revision or reasons for rejection

Suggestions for revision or reasons for rejection

Journal article(s) based on this preprint

Viewed

Viewed (geographical distribution)


Total:	0
HTML:	0
PDF:	0
XML:	0