the Creative Commons Attribution 4.0 License.
A Machine-learning Based Marine Planetary Boundary Layer (MPBL) Moisture Profile Retrieval Product from GNSS-RO Deep Refraction Signals
Abstract. Marine planetary boundary layer (MPBL) water vapor amount and gradient impact the global energy transport through directly affecting the sensible and latent heat exchange between the ocean and atmosphere. Yet, it is a well-known challenge for satellite remote sensing to profile MPBL water vapor, especially when clouds or sharp gradients of water vapor are present. Wu et al. (2022) identified good correlations between Global Navigation Satellite System (GNSS) deep refraction signals (SNR) and the global MPBL water vapor specific humidity when the radio occultation (RO) signal is ducted by the moist PBL layer, and they laid out the underlying physical mechanisms to explain such a correlation. In this work, we apply a machine-learning/artificial intelligence (ML/AI) technique to realize pixel-level MPBL water vapor profiling. A convolutional neural network (CNN) model is trained using 20 months of global collocated hourly ERA-5 reanalysis and COSMIC1 1 Hz SNR observations between 975 and 850 hPa with 25 hPa vertical resolution, and then the model is applied to both COSMIC1 and COSMIC2 in other time ranges for independent retrieval and validation. The Monte Carlo Dropout method was employed for the uncertainty estimation. Comparison against multiple field campaign radiosonde/dropsonde observations globally suggests SNR-retrieved water vapor consistently outperforms ERA-5 reanalysis and the Level-2 standard retrieval product at all six pressure levels between 975 hPa and 850 hPa, indicating real and useful information is gained from the SNR signal albeit training was performed against the reanalysis. The only exception is in the deep tropics where the fundamental assumption for SNR-retrieval to work is invalidated frequently by interactions among ocean surface, MPBL and shallow convection. The climatology and diurnal cycle of MPBL structure constructed from the ML-SNR technique are studied and compared to the reanalysis.
Disparities in climatology suggest ERA-5 may systematically produce dry biases at high latitudes, and wet biases in marine stratocumulus regions. The diurnal cycle amplitudes are too weak and off-phase in ERA-5, especially in Arctic and stratocumulus regions. These areas are particularly prone to PBL processes where this GNSS-SNR water vapor product may contribute the most.
Status: open (until 26 Jul 2024)
RC1: 'Comment on egusphere-2024-973', Anonymous Referee #1, 28 Jun 2024
General comments
This paper proposes a method to obtain vertical profiles of water vapor from GNSS radio occultation (RO) observation in the marine planetary boundary layer (MPBL) using machine learning (ML), which is a form of artificial intelligence (AI). I know little about ML and AI (the paper should be reviewed by an expert in this area), so I reviewed the rest of the paper. The basic idea may be useful in a practical sense, but in my opinion it is not acceptable for publication because it is difficult to understand and is unclear and imprecise in many places. Thus, it does not make a convincing case of the merit of using ML/AI to improve RO retrievals of water vapor in the MPBL. I recommend major revisions with care taken to use clear, precise, and understandable language.
A clear description of the scientific basis for the method, under what atmospheric conditions it is valid and useful, its limitations, and how it compares with 1D-Var retrievals of water vapor profiles in the MPBL would be useful. This is especially important for this paper, since it presents results from a technique that is unfamiliar to most experts in radio occultation. There are some odd words and phrases that should be replaced with more scientific or precise words. Please see some examples in the detailed comments section; these are only examples; I stopped looking carefully for language issues after a while.
A recent paper that is relevant to the discussion regarding SNR and the HSL is Sokolovskiy et al. (2024), and this paper should be referenced in the Introduction (somewhere after Line 65).
The paper uses several different names for the ML method, most often “SNR-method” which is misleading. I suggest using the more descriptive name “ML-SNR method” (as done in Line 16) consistently throughout the paper. Whatever acronym is used, it should be consistent everywhere.
The authors use marine planetary boundary layer (MPBL), which is OK. But they may wish to consider using marine atmospheric boundary layer (MABL) instead to be consistent with what they used before in the closely related paper Wu et al. (2022).
The numbering of Section 2 is currently:
2. Data and Model
2.1 Training and Validation Datasets
2.1.1 Machine Learning Model Selection
The number of Section 2.1.1 should be changed to 2.2.
It would be useful to have a short, simple summary of the steps used to train the model and then to validate it, perhaps as a numbered series of steps or a flow chart. This could go at the end of Section 2. The highly technical first paragraph of Section 2.1.1 is not very useful to the nonexpert in ML. A new subsection containing the series of steps or the flow chart could be added to Section 2: 2.3 Summary of ML-SNR model and validation
The authors use ERA5 MPBL water vapor data to train the ML-SNR model and then test the model using independent datasets. Although ERA5 is a well-tested and widely used reanalysis, there are likely significant uncertainties in the MPBL water vapor analysis, so it is only an approximation to “Truth.” A ML model trained on ERA5 data that is tested with an independent data set will return retrievals that are consistent with ERA5. This seems to be the case in Fig. 4, although there is a lot of scatter. The comparisons of the ML-method retrievals of water vapor to radiosondes as done in the paper will contain the influence of ERA5 data. It would be useful to discuss the influence of the training data set on the retrievals. It would also be interesting to discuss how the retrieval of individual water vapor profiles would be used if the scatter (or uncertainty) of each profile is as large as the scatter in Fig. 4 suggests.
The physical basis for the correlation of SNR as a function of the HSL with MPBL water vapor content, which was found in the related paper by Wu et al. (2022) (I did not review this paper), is not explained well. The availability of meaningful SNR (SNR above the noise level) at all HSL levels depends on the boundary layer structure. For moist boundary layers that have no sharp inversions, this correlation is understandable; usable SNR values are available at all HSL levels in their model. However, for dry boundary layers there may be no useful SNR at deep levels, and for moist boundary layers with sharp inversions, there may be HSL levels with no useful values of SNR. The difference between dry and moist boundary layers, the difference between moist boundary layers with and without sharp inversions, and the effect of ducting and superrefraction should be discussed. All boundary layer structures are lumped together in this paper. Related to this issue is the confusing sentence beginning in Line 68 “The paper attributed such a positive correlation to the strong refraction from a horizontally stratiform and dynamically quiet MPBL water vapor layer that acts to enhance the SNR amplitude at deep HSL through ducting and diffraction interference.” A similar issue exists with the sentence in Lines 244-247, which I do not understand.
The paper refers to the 1D-Var retrieval of water vapor as “the standard Level-2 product” (line 12) which is imprecise and will mean nothing to most readers. Apparently it refers to the retrieval of water vapor and temperature from 1D-variational analysis (wetPrf in their paper). Please use a clear and precise term for this product and define it.
Johnston et al. (2021) use the newer wetPf2 water vapor data (Wee et al. 2022). The Gong et al. paper refers to wetPrf and uses it in the comparison with the ML-SNR retrievals. WetPrf is the older and less accurate retrieval.
Why do the differences between the atmPrf and wetPrf penetration rates differ between COSMIC-1 (blue) and COSMIC-2 (red)? In COSMIC-1 the wetPrf retrieval rate at low levels is greater than the atmPrf retrieval rate (e.g. at 4 km, ~28% for wetPrf and less than 2% for atmPrf). In COSMIC-2, the opposite is shown; the penetration rate for atmPrf is greater than that for wetPrf in the low levels. Why do the authors use the number of Level-1B files as the denominator? It would be better to use the total number of files (profiles) of each type in the denominator (success rate = number of retrieved values/number of profiles).
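The success-rate definition suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `penetration_rate`, the use of the lowest retrieved height per profile as input, and the NaN convention for profiles with no retrieval are hypothetical and are not taken from the paper.

```python
import numpy as np

def penetration_rate(lowest_heights_km, levels_km):
    """Fraction of profiles that reach down to each height level.

    lowest_heights_km: lowest retrieved height of every profile of one
        product (hypothetical input); NaN marks a profile with no retrieval.
    levels_km: heights at which the rate is evaluated.
    """
    lowest = np.asarray(lowest_heights_km, dtype=float)
    levels = np.asarray(levels_km, dtype=float)
    # Failed profiles never reach any level but still count in the denominator.
    lowest = np.where(np.isnan(lowest), np.inf, lowest)
    reached = lowest[:, None] <= levels[None, :]
    # Denominator is the number of profiles of this product,
    # not the number of Level-1B files.
    return reached.sum(axis=0) / lowest.size

rates = penetration_rate([0.2, 1.5, np.nan, 3.0], [0.5, 2.0, 4.0])
print(rates)
```

Computing the rate separately for each product, with that product's own profile count in the denominator, keeps the atmPrf and wetPrf curves directly comparable.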
How are the uncertainty values in Section 3.2 and Fig. 6 defined and determined?
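For context, the abstract states that the Monte Carlo Dropout method was employed. One common way to define uncertainty under that method is the spread of repeated predictions made with dropout left active at inference time. The sketch below illustrates that general idea only; the two-layer network, the weight arrays, the dropout probability, and the number of passes are all hypothetical and do not represent the authors' CNN or their definition.

```python
import numpy as np

def mc_dropout_predict(x, w_hidden, w_out, drop_prob=0.2, n_passes=200, seed=0):
    """Mean and standard deviation over stochastic forward passes.

    x: inputs of shape (n_samples, n_features), e.g. SNR features.
    w_hidden, w_out: weights of a toy two-layer network (hypothetical).
    """
    rng = np.random.default_rng(seed)
    keep = 1.0 - drop_prob
    samples = []
    for _ in range(n_passes):
        hidden = np.maximum(x @ w_hidden, 0.0)   # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep   # dropout stays on at inference
        hidden = hidden * mask / keep            # inverted-dropout scaling
        samples.append(hidden @ w_out)
    samples = np.stack(samples)
    # The per-output standard deviation serves as the uncertainty estimate.
    return samples.mean(axis=0), samples.std(axis=0)

rng = np.random.default_rng(1)
x = rng.normal(size=(5, 8))          # 5 hypothetical profiles, 8 features
w_hidden = rng.normal(size=(8, 16))
w_out = rng.normal(size=(16, 6))     # 6 outputs, one per pressure level
mean, std = mc_dropout_predict(x, w_hidden, w_out)
print(mean.shape, std.shape)
```

If the paper uses a different statistic (for example a percentile range instead of a standard deviation), stating it explicitly would answer the question above.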
The References are not in alphabetical order.
Lines 55-56-In addition to decreasing SNR, which limits the vertical penetration of the RO profiles, superrefraction in the PBL is an issue. Superrefraction makes it impossible to obtain a unique bending angle profile.
Figure 2 needs to be improved. The numbers on the x- and y-axes are not legible, and the axes are not labeled. It appears that there are two figures in 2a and 2b, grid indices at the top and correlations with ERA5 specific humidities in the lower right corner. But the lower right corner is solid dark green, indicating a perfect correlation of 1.0? The other “boxes” at the bottom of the figure to the left of the solid green box at the lower right corner oscillate between positive and negative correlations, and this should be discussed. The caption refers to Table A2, which does not exist.
Fig. 3 has a lot of blank space, and in the three campaign regions it is difficult to see the details. Consider three separate maps of the three regions.
Figure 7 also needs a better explanation. The gray dashed line (ERA5) is difficult to see. How are the solid black lines and the dashed black lines constructed? They are irregular so they don’t look like best fit lines. I presume the solid thin straight black line is the 1:1 line, but why does it not extend to the corners of the grid? SNR retrieval should be ML-SNR retrieval in the caption. “Level-2 retrievals” in the caption should be “wetPrf” retrievals. There are very faint orthogonal red and black lines in the figure—what are these? This figure contains a lot of information and detail; consider breaking into two figures and/or making them larger. Presenting results at 4 pressure levels rather than 6 might help. This is an important figure and should be clear and explained well.
I did not review Section 4 carefully.
Detailed comments
- Lines 1 and 3, also Line 299—what kind of gradient? Horizontal gradient or vertical gradient?
- Line 5 Define SNR as signal to noise ratio—it is not an acronym for deep refraction signals.
- Line 7—what is “pixel-level water vapor profiling?” What pixels?
- Lines 30-31: What is “polar proneness to the climate change”?
- Line 38: grammatically incorrect; I suggest rewriting “Emissions from clouds often overwhelm the emission signal…”
- Line 39: Delete “in the scene”
- Line 44: delete “which couldn’t be used to gain….MPBL.”
- Line 46-use “high resolution” rather than “superb resolution” and give the nominal vertical resolution (100-200 m).
- Line 52: “coarse horizontal resolution” should be replaced by “relatively large horizontal footprint.” Resolution refers to the average distance between observation points; footprint refers to the spatial scale of the atmosphere that affects the observation (see Boukabara et al. 2021).
- Line 53-typo-concern.
- Line 53- What does the sentence “This is typically not a big concern in MPBL as vertical gradient if much sharper and harder to characterize if not using in-situ measurements )e.g. shipborne radiosonde” mean?
- Line 56—SNR decreases with decreasing height. Current sentence “decreases with height” implies that it decreases upward.
- Lines 57-59—This is misleading. It says the water vapor retrievals fail to converge because they require a high SNR, when in fact the main issue is that the RO signals do not penetrate deeply enough because of decreasing SNR near the surface.
- Line 61: Fig. 1 does not show that C2 has improved its SNR; it shows that C2 has a deeper penetration rate than C1, which is a result of higher SNR. This is another example of an imprecise/incorrect statement.
- Line 64 and 119-The Maneshan et al. (2024) reference is not in the References section.
- In Fig. 1, it should be stated that the height is mean-sea level rather than impact height, which is often used, and the y-axis should be labeled MSL (km).
- Line 72-delete “understandably”
- Lines 79-80: These two sentences could be replaced by something like “Artificial Intelligence/Machine Learning (AI/ML) has been increasingly used in remote sensing in recent years.”
- Line 95—delete “thoroughly”
- Line 110-define fL1
- Line 112: How can you say ERA-5 is the best reanalysis? I agree that it is very good, but the best? Do you mean ERA5 is better than MERRA-2 in the metric you talk about in the next few sentences?
- Lines 113-114—Johnston et al. (2021) used the improved wetPf2, not wetPrf.
- Line 144---Reference for CNN model?
- Line 144 and 157—replace “old-fashioned” with something like “earlier” or “simpler”.
- Lines 202, 235 and other places—Use “wetPrf” retrievals instead of “Level-2 retrievals.”
- Line 258---Rewrite to say “…the general patterns in the ML-SNR method and MERRA-2 specific humidities agree fairly well.”
- Line 263---I would not describe the comparisons shown in Fig. 11 as “more boring.” I am not sure what is meant by this characterization. Perhaps it means that the structures are less complicated than in Fig. 10, or that the agreement is better? In any case, that is not necessarily “more boring.”
- Line 270—delete “notorious.”
- Fig. 12 is not mentioned in the text of Section 4.2. It should be introduced in the text somewhere around Line 275.
- Line 287—replace “drops down” with “decreases.”
- Line 293—delete “from this exercise.”
- Line 296—replace “will keep this topic foggy” by something like “make observing and verifying the true diurnal cycle difficult.”
- Line 297—delete “disentangle this mystery”
References
Boukabara, S.-A., J. Eyre, R.A. Anthes, K. Holmlund, K. St. Germain, and R.N. Hoffman, 2021: The Earth-Observing Satellite Constellation: A review from a meteorological perspective of a complex, interconnected global system with extensive applications, IEEE Geoscience and Remote Sensing Magazine, 9, 3, 26-42. https://doi.org/10.1109/MGRS.2021.3070248
Sokolovskiy, S., Z. Zeng, D. Hunt, J.-P. Weiss, J. Braun, W. Schreiner, R. Anthes, Y.-H. Kuo, H. Zhang, D. Lenschow, and T. VanHove, 2024: Detection of super-refraction at the top of the atmospheric boundary layer from COSMIC-2 radio occultations. J. Atmos. and Ocean Tech., 40, 65-78. https://doi.org/10.1175/JTECH-D-22-0100.1
Wee, T.-K., R.A. Anthes, D.C. Hunt, W.S. Schreiner, and Y.-H. Kuo, 2022: Atmospheric GNSS RO 1D-Var in Use at UCAR: Description and Validation. Remote Sens., 14, 5614. https://doi.org/10.3390/rs14215614
Citation: https://doi.org/10.5194/egusphere-2024-973-RC1