This work is distributed under the Creative Commons Attribution 4.0 License.
Applying machine learning to improve the near-real-time products of the Aura Microwave Limb Sounder
Abstract. A new algorithm to derive near-real-time (NRT) data products for the Aura Microwave Limb Sounder (MLS) is presented. The old approach was based on a simplified optimal estimation retrieval algorithm (OE-NRT) to reduce computational demands and latency. This manuscript describes the setup, training, and evaluation of a redesigned approach based on artificial neural networks (ANN-NRT), which is trained on > 17 years of MLS radiance observations and composition profile retrievals. Comparisons of joint histograms and performance metrics between the two NRT results and the operational MLS products demonstrate a noticeable statistical improvement from ANN-NRT. The new approach yields higher correlation coefficients, as well as lower root-mean-square deviations and biases, at almost all retrieval levels compared to OE-NRT. The exceptions are pressure levels with concentrations close to 0 ppbv, where the ANN models tend to underfit and predict zero. Depending on the application, this behavior might be advantageous. While the developed models can take advantage of the extended MLS data record, this study demonstrates that training ANN-NRT on just a single year of MLS observations is sufficient to improve upon OE-NRT. This confirms the potential of applying machine learning to the NRT efforts of other current and future mission concepts.
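The statistical comparison described in the abstract rests on standard per-level metrics (correlation coefficient, root-mean-square deviation, bias) between an NRT product and the operational Level 2 retrieval. A minimal sketch of how such metrics could be computed; the array layout and function name are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def evaluate_nrt(nrt, l2):
    """Per-level comparison metrics between an NRT product and the
    operational L2 retrieval: mean bias, root-mean-square deviation,
    and Pearson correlation coefficient.

    `nrt` and `l2` are (n_profiles, n_levels) arrays (hypothetical layout,
    one row per retrieved profile, one column per pressure level)."""
    bias = np.mean(nrt - l2, axis=0)
    rmsd = np.sqrt(np.mean((nrt - l2) ** 2, axis=0))
    # Correlation per retrieval level, computed across all profiles
    r = np.array([np.corrcoef(nrt[:, k], l2[:, k])[0, 1]
                  for k in range(nrt.shape[1])])
    return r, rmsd, bias
```

A product that is perfectly correlated with L2 but offset by a constant would show r = 1 with a nonzero bias, which is why all three metrics are reported.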
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-101', Anonymous Referee #1, 16 Feb 2023
The authors present a near real-time processor of Aura/MLS observations using a supervised neural network. The manuscript is easy to follow and shows that the processor has very good performance, very close to the operational processor. The new method presents a significant improvement compared to the previous near real-time processor based on a simplified optimal estimation method. I recommend the manuscript for publication, but I have minor comments that could be clarified by the authors.
General comments
1) I am impressed by the results overall, and more particularly with the ability of the model to capture the increase in H2O induced by the volcanic eruption, though the statistical weight of such events in the training dataset should be low. This illustrates the high potential of the model to capture special disturbances that occur over a restricted spatio-temporal range. However, I found that such abnormal conditions are not sufficiently discussed in the manuscript. Indeed, these are scientifically the most interesting cases but have a low impact on the overall statistical evaluation. For example, in Figure 4b, the increase in H2O at 100 hPa over India and part of Southeast Asia is clearly underestimated with ANN-NRT. This should be discussed in the manuscript and the authors should mention if they have found other cases where significant discrepancies were seen.
2) More generally, the authors do not show results for the whole test dataset (5% of 17 years corresponds to almost 1 year), in particular wintertime, which is strongly disturbed in the northern hemisphere. Is there a seasonal pattern in the results? The authors should clarify why the test data are well suited for describing the capability of the model, and the limitations of this choice (which could be investigated further in future studies). For instance, I would personally have used 2 entire years with very different conditions (e.g., SSW strength or QBO phase) to test the models.
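The referee's alternative evaluation strategy (holding out entire years with distinct geophysical conditions rather than a random 5% of all profiles) could be sketched as follows; the sample structure is hypothetical:

```python
def split_by_year(samples, test_years):
    """Chronological hold-out: reserve entire years (e.g. ones with a
    strong SSW or a particular QBO phase) for testing, instead of
    drawing a random 5% of all profiles.

    `samples` is a list of (year, features, target) tuples
    (an illustrative layout, not the paper's data format)."""
    train = [s for s in samples if s[0] not in test_years]
    test = [s for s in samples if s[0] in test_years]
    return train, test
```

Because a random split leaves near-duplicate neighboring profiles in both sets, a chronological split of this kind gives a stricter test of generalization to unseen conditions.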
3) Regarding the vertical resolution of profiles predicted with ANN-NRT: this issue is not addressed in the manuscript and could be clarified. If I understand the NN setup correctly, the vertical resolution of the predicted profile is the same as that of the level 2 operational product (here I am referring to the resolution derived from the operational averaging kernels, not the retrieval level spacing). Am I right?
For low-SNR cases, the authors mention that the NN tends to smooth the noise compared to the operational product. Could this effect be related to a degradation of the vertical resolution, similar to the regularization effect in the OE method?
Specific comments
Line 87: “n” is already used to define the number of input features. It would be clearer if another letter is used for the number of neurons per hidden layer.
Line 93: Is the number of levels of the predicted profile the same as the number of levels of the operational product?
Table 1: I understand that the hyperparameters are defined by a set of tests, but the differences between the models could be discussed. Why is the number of hidden neurons much smaller for the H2O model than for T and O3? Why is the tanh activation preferred over ReLU for some species? (ReLU is generally considered to make training more efficient.)
Line 188/Table 3: Are the scores calculated for the same periods as in Figure 3?
Line 207: “Here the ANN … , and the results are close to L2 data”: there is a clear underestimation of the H2O vmr over India and East Asia. This issue could be mentioned; what could be the reason?
Line 219/Line 241: Would it be possible to supplement a small training dataset with simulated data?
Line 244: I don’t understand the sentence “The previous version…”. Do the authors mean: The previous version of MLS NRT data products (OE-NRT, Lambert et al., 2022) is replaced with predictions from an artificial neural network (ANN).
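To make the hyperparameters questioned above concrete (hidden-layer width and the tanh-versus-ReLU activation choice), here is a minimal forward pass through a fully connected regression network; all shapes and names are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def mlp_forward(x, layers, activation="tanh"):
    """Forward pass through fully connected hidden layers with a linear
    output, the generic regression architecture under discussion.

    `layers` is a list of (W, b) weight/bias pairs; the hidden width is
    set by the column count of each W (shapes here are hypothetical)."""
    act = np.tanh if activation == "tanh" else lambda z: np.maximum(z, 0.0)
    h = x
    for W, b in layers[:-1]:
        h = act(h @ W + b)   # hidden layers: tanh or ReLU nonlinearity
    W, b = layers[-1]
    return h @ W + b         # linear output layer for profile regression
```

In this framing, Table 1's choices amount to picking the number of (W, b) pairs, the width of each W, and which `activation` branch is taken per species.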
Citation: https://doi.org/10.5194/egusphere-2023-101-RC1
AC1: 'Reply on RC1', Frank Werner, 16 Apr 2023
RC2: 'Comment on egusphere-2023-101', Anonymous Referee #2, 20 Feb 2023
General comments
This paper presents new near-real-time products of the Aura Microwave Limb Sounder (MLS) derived using artificial neural networks (ANN-NRT). The ANN-NRT products show good performance and demonstrate the potential of applying machine learning to generate NRT products. The paper is clearly written and the study is well explained. I recommend the manuscript for publication, but I have some minor comments.
(1) Global maps show that ANN-NRT is better than OE-NRT, but more discussion should be given to the specific areas where the ANN overestimates or underestimates.
(2) For the performance evaluation of the T model, I think it is more intuitive to use units of K rather than relative values. At least this should be described in the paper.
Specific comments
Line 96: I know the number of brightness temperatures sampled over 2005–2022 is very large. However, it would be better to describe the exact number of input features for the training, validation, and test data.
Table 1: The numbers of neurons for T and O3 are much larger than for the other products; is this necessary? Why choose so many neurons instead of adding hidden layers? The MBS for T (i.e., 8192) is much larger than for the others (i.e., 32); this should be discussed.
Line 189: The SO2 statistics in Table 3 are based on observations that were also included in the training data set, so the comparison of OE and ANN doesn't make much sense. Are there no other data for comparison?
Line 237: All metrics improve with increasing training data except the absolute bias in Fig. 5c; this should be discussed.
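The behavior raised in the last comment (metrics as a function of training-set size, as in Fig. 5) could be probed with a simple learning-curve sweep; the `fit` and `metric` callables below are placeholders for whatever model and score are of interest, not the paper's implementation:

```python
def learning_curve(x_train, y_train, x_test, y_test, sizes, fit, metric):
    """Refit on growing subsets of the training data and record a test
    metric, to see which scores saturate with more data and which (like
    the absolute bias noted above) behave non-monotonically.

    `fit(x, y)` must return a callable model; `metric(pred, truth)`
    returns a scalar score."""
    return [metric(fit(x_train[:n], y_train[:n])(x_test), y_test)
            for n in sizes]
```

Plotting the returned scores against `sizes` separates sampling noise in a single metric from a genuine trend.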
Citation: https://doi.org/10.5194/egusphere-2023-101-RC2
AC2: 'Reply on RC2', Frank Werner, 16 Apr 2023
RC3: 'Comment on egusphere-2023-101', Anonymous Referee #3, 28 Feb 2023
GENERAL COMMENTS
================
The paper describes the application of an artificial neural network (ANN) to the retrieval of trace gas profiles from the MLS instrument. ANNs have recently been applied to a variety of problems, in some cases with great success.
Here, the intent is to replace a primarily fast but comparatively inaccurate
near-real-time retrieval with something both faster and more accurate.
The presented results indicate that the approach has succeeded on both counts.
The study is on point, well described, and well executed.
The topic fits the journal.
I recommend publication.
SPECIFIC COMMENTS
=================
lines 80ff: The underlying software seems to be readily available. Could the trained model employed here be made available as well? It might be applicable to similar tasks and/or other limb sounders.
lines 130ff: A general problem with trained models is how the model copes with
unexpected situations. Here, you describe how you adapted the training data set
to cope with volcanic activity. How important was this for the performance and
how likely is it that, e.g., the ozone hole would have been missed?
lines 235ff: This result suggests that the training data set contains a lot of
redundancy, as is expected for such a large set measuring effectively the same
planet all over. Do you have means to identify profiles with high influence
on the training performance? And if yes, what were they?
Do you foresee a possibility to generate a synthetic set of training data for a new instrument, for which no historic data are available?
How would this compare for instruments that measure less frequently, such as ACE-FTS? Would a year of data still be sufficient to train the retrieval?
lines 244ff: Typically, level 2 products are associated with a zoo of diagnostic data
from precision to resolution etc. How is the data provided by the ANN characterised?
lines 261ff: The speed-up of the NRT retrieval is impressive and very useful for the
purpose of providing near-real-time data. How does this relate to the computational
effort for training the model? Is this (over the foreseen runtime) still a net
positive or does one trade in training effort for faster operational results? Does one
need a supercomputer/cloud service for training, or is this feasible with a well-equipped
workstation?
lines 263ff: Are NRT retrievals the only application of the ANN model discussed here?
Could this data serve as an initial guess to the OE to speed up convergence or are
there reasons not to use this?
Citation: https://doi.org/10.5194/egusphere-2023-101-RC3
AC3: 'Reply on RC3', Frank Werner, 16 Apr 2023
Nathaniel J. Livesey
Luis F. Millán
William G. Read
Michael J. Schwartz
Paul A. Wagner
William H. Daffer
Alyn Lambert
Sasha N. Tolstoff
Michelle L. Santee