This work is distributed under the Creative Commons Attribution 4.0 License.
Direct assimilation of ground-based microwave radiometer observations with machine learning bias correction based on developments of RTTOV-gb v1.0 and WRFDA v4.5
Abstract. The application of ground-based microwave radiometers (MWRs), which provide high-quality and continuous vertical atmospheric observations, has traditionally focused on the indirect assimilation of retrieved profiles. This study advanced this application by developing a direct assimilation capability for MWR radiance observations within the Weather Research and Forecasting model data assimilation (WRFDA) system, along with a bias correction scheme based on the random forest technique. The proposed bias correction scheme effectively reduced the observation-minus-background (O−B) biases and standard deviations by 0.83 K (97.1 %) and 1.63 K (64.6 %), respectively. A series of ten-day-long experiments demonstrated that assimilating MWR radiances improves both the initial conditions and the forecasts, with additional benefits from higher assimilation frequencies. In the initial conditions, hourly assimilation significantly enhanced low-level temperature and humidity fields, reducing the root-mean-square error (RMSE) for temperature and water vapor mixing ratio by 6.32 % below 1 km and 1.98 % below 5 km, respectively. These improvements extended to the forecasts, where 2 m temperature and humidity showed sustained benefits for over 12 hours, and precipitation forecasts exhibited notable gains, particularly for higher-intensity events. The time-averaged Fractions Skill Score (FSS) for 3 h accumulated precipitation within the 24 h forecasts increased by 0.04–0.11 (10.2–58.1 %) for thresholds of 6–15 mm.
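As context for the random-forest bias correction described in the abstract, here is a minimal illustrative sketch using synthetic data (the paper's actual predictors, channel set, and hyperparameters may differ): a regressor is trained to predict the O−B departure from state-dependent features, and the predicted bias is subtracted from new observations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training set. In practice the predictors would be observed
# brightness temperatures plus auxiliary variables; these four generic
# features are purely illustrative.
X = rng.normal(size=(500, 4))
true_bias = 1.2 + 0.5 * X[:, 0]                    # state-dependent bias
omb = true_bias + rng.normal(scale=0.3, size=500)  # O-B departures to learn

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, omb)

# Correction step: subtract the predicted bias from new observations.
X_new = rng.normal(size=(10, 4))
obs = rng.normal(loc=280.0, size=10)
obs_corrected = obs - model.predict(X_new)
```

Because the correction is learned as a function of the atmospheric state rather than a single constant offset, it can remove both the mean bias and part of the state-dependent spread, which is consistent with the reported reduction in both O−B bias and standard deviation.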
Status: open (until 11 Apr 2025)
CEC1: 'Comment on egusphere-2025-12 - No compliance with the policy of the journal', Juan Antonio Añel, 12 Feb 2025
reply
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code on several servers or sites which are not suitable repositories for scientific publication (in fact, only one complies: the code stored in Zenodo). Also, to access part of the code you request that readers contact the corresponding author, which is totally unacceptable according to our policy, which clearly states that all the code and data must be published before submitting a paper and be available to anyone without any restriction.
Therefore, the current situation with your manuscript is irregular. Please, publish your code and input and output data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible, as we can not accept manuscripts in Discussions that do not comply with our policy. In the meantime I advise the Topical Editor to stall the review process for your manuscript, as it should have not been accepted for Discussions given the mentioned shortcomings.
I note that if you do not fix this problem and reply to this comment with the requested information, we will have to reject your manuscript for publication in our journal.
Please, note that after addressing the mentioned issues, if your manuscript is considered acceptable to continue under review, you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript, containing the DOI and links of the new repositories.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2025-12-CEC1
AC1: 'Reply on CEC1', Qing Zheng, 14 Feb 2025
reply
Dear Juan A. Añel,
Thank you for your feedback and for bringing this to our attention.
As requested, we have now archived all relevant code in a single Zenodo repository. The RTTOV-gb v1.0, WRF v4.5, WRFDA v4.5, along with the code for developing a direct assimilation module for MWR radiances and training a machine learning-based MWR bias correction model, are now fully available on Zenodo: https://doi.org/10.5281/zenodo.14865778.
Please let us know if any further adjustments are needed.
Best regards,
Qing Zheng
Citation: https://doi.org/10.5194/egusphere-2025-12-AC1
RC1: 'Comment on egusphere-2025-12', Anonymous Referee #1, 12 Mar 2025
reply
First of all, I'd like to emphasize the very high quality of this article, in terms of science, innovation and writing. This article, although covering a complex subject, is very well explained.
The results presented here are very interesting, since they show that the impact of MWR data assimilation on thermodynamic fields is propagated by physics over several lead times, but also within microphysics, since it ultimately improves precipitation scores.
This paper will be a good reference in addition to that of Vural et al. (2024), who worked on the direct assimilation of MWRs but with a LETKF-type assimilation scheme, as well as that of Thomas et al. (2024), who use a 3D-Var-type assimilation scheme but indirectly assimilate temperatures retrieved from MWR brightness temperatures.
Before publishing this article, I do have a few minor comments or questions:
- Firstly, I'd like to have more details about the 10-day period over which the assimilation experiments were conducted. Unless I'm mistaken, it's implicitly understood that this is a clear-sky period (especially when the cloud cover mask is mentioned), but this is explicitly stated only at the end of the article. On the other hand, I think it would have been nice to provide more justification for this particular period.
- Still on the subject of this 10-day period, I'd like to know whether you have carried out assimilation experiments over other periods. If so, were the results equivalent? Why wasn't a longer period considered?
- Regarding the single observation experiment, I'd like to know why the specific humidity analysis increments aren't totally isotropic (although they're close) as they are for temperature.
- My final comment concerns Figures 10 and 12. I think it would have been clearer to present the RMSE or FSS values directly, rather than the differences with the control experiment. I understand that this removes a curve from the graphs and perhaps improves readability. But having the RMSE values would be informative about the errors made by the model.
In conclusion, despite some minor comments and questions, I would like to emphasize once again the quality of this article. It will make an excellent contribution to the GMD journal.
Citation: https://doi.org/10.5194/egusphere-2025-12-RC1
RC2: 'Comment on egusphere-2025-12', Alistair Bell, 01 Apr 2025
reply
This paper addresses the important issue of the direct assimilation of ground-based microwave radiometer measurements for the purpose of numerical weather prediction. A sensible and innovative methodology is proposed to correct for the bias of certain channels in the microwave radiometers used, which is currently a key limiting factor for the use of these observations in operational data assimilation.
The authors demonstrate a successful methodology for the assimilation of brightness temperatures from HATPRO and MP3000A microwave radiometers. The assimilation results in improvements in low-level temperature and humidity fields, and has a neutral impact on temperature above 1 km. It was also found that increasing assimilation frequency enhances observational impacts on the forecast, with the 1 h assimilation frequency giving the best results from the experiment. Another positive impact was on the wind field, which improved in the analysis after the assimilation of microwave radiometer observations. The authors also show a positive impact of the observations on short-range forecasts for temperature (0-6 h), humidity (0-12 h), and for precipitation (9-24 h).
General Comments
Could you please include a few more details about the assimilation framework? Specifically:
- Timing of Observations: Do you select the observation that most closely matches the analysis time, or do you integrate over a broader time window?
- Data Processing Before Bias Correction: Aside from using the cloud mask, do you apply any additional procedures to improve measurement quality before performing bias correction?
- QC: Do you use any quality-control checks (e.g., discarding observations based on O−B thresholds), and if so, how many observations are rejected?
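For reference, the kind of O−B threshold check the QC question alludes to can be sketched as follows (synthetic data; `qc_omb` and the 3-sigma factor are illustrative, not the paper's actual procedure): departures far outside the bulk of the O−B distribution are flagged as gross errors and rejected.

```python
import numpy as np

def qc_omb(obs, bg, k=3.0):
    """Gross-error check: keep observations whose O-B departure lies
    within k standard deviations of the mean departure."""
    omb = obs - bg
    return np.abs(omb - omb.mean()) <= k * omb.std()

# Synthetic demonstration: 1000 observations, the first 5 corrupted
# by a +10 K gross error.
rng = np.random.default_rng(4)
bg = np.full(1000, 280.0)
obs = bg + rng.normal(0.0, 1.0, 1000)
obs[:5] += 10.0
keep = qc_omb(obs, bg)  # boolean mask of accepted observations
```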
In Chapter 3, the O−B statistics are examined in detail. From the scatter plots, it appears that the K-band O−B distribution may be bimodal for both radiometer types—a potential issue for 3D-Var, which typically assumes unimodal (Gaussian) errors. After showing the initial scatter plots of BT(sim) vs. BT(obs), the paper primarily focuses on bias and standard deviation. It would be very interesting to see if the bias correction addresses this bimodality. I encourage you to present histograms or PDFs of the O−B errors before and after the bias correction is applied. I would also encourage you to comment briefly on the skew and kurtosis of the distributions.
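The distribution diagnostics suggested here (histograms plus skew and kurtosis of the O−B departures) can be computed with scipy; a minimal sketch on deliberately bimodal synthetic departures standing in for the real per-channel values:

```python
import numpy as np
from scipy import stats

# Synthetic, deliberately bimodal O-B departures (two Gaussian modes),
# illustrative of the clustering visible in the K-band scatter plots.
rng = np.random.default_rng(1)
omb = np.concatenate([rng.normal(-1.0, 0.5, 300),
                      rng.normal(1.5, 0.5, 300)])

counts, edges = np.histogram(omb, bins=30)  # histogram of departures
skew = stats.skew(omb)                      # near 0 for a symmetric mixture
kurt = stats.kurtosis(omb)                  # excess kurtosis; 0 for a Gaussian
```

Note that a roughly symmetric bimodal mixture can have near-zero skew while showing strongly negative excess kurtosis, so the histogram itself remains the clearest diagnostic of bimodality.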
The FSS improvements in rain rate seem impressive given that the only change here is the assimilation of ground-based radiometers. I suppose the reason any impact is visible only after 9 hours is that cloudy data are excluded. It may be interesting to show some statistics of the rainfall events in the 10-day period. Is one heavy precipitation event responsible for this improvement in forecast skill? Do you intend to repeat the experiment including cloudy data?
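For context, the Fractions Skill Score compares neighbourhood exceedance fractions between forecast and observation; a minimal sketch with a uniform square neighbourhood and synthetic precipitation fields (the paper's grid, thresholds, and neighbourhood sizes may differ):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(forecast, observed, threshold, window):
    """Fractions Skill Score over a square neighbourhood (1 = perfect)."""
    f = uniform_filter((forecast >= threshold).astype(float), size=window)
    o = uniform_filter((observed >= threshold).astype(float), size=window)
    mse = np.mean((f - o) ** 2)
    ref = np.mean(f ** 2) + np.mean(o ** 2)
    return 1.0 - mse / ref if ref > 0 else np.nan

# Synthetic precipitation fields on a 100 x 100 grid.
rng = np.random.default_rng(2)
observed = rng.gamma(2.0, 2.0, size=(100, 100))
forecast = observed + rng.normal(0.0, 1.0, size=(100, 100))
score = fss(forecast, observed, threshold=6.0, window=9)
```

Because the score is computed on neighbourhood fractions rather than point matches, it rewards forecasts that place precipitation approximately in the right area, which is why it is well suited to the 3 h accumulation thresholds discussed in the paper.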
Could you add some information about how the background error covariance matrix was generated? Was any testing done to optimise this? Similarly, it would be helpful to know how you specified the observation-error covariances—for instance, the assumed observation-error standard deviations or correlations.
Specific Comments
Figure 1: Was the south-west China domain pre-defined in advance of the study? Were these conditions defined from the border of the computation domain or otherwise? Please justify.
line 106: “However, MWR radiances are upward-looking microwave observations, which differ from the downward-looking observation of satellites.”
I find this sentence quite jarring. Earlier in the article you refer to “ground-based MWRs”, so when you state that “MWR radiances are upward-looking” it seems as though you refer to the microwave radiometer instrument in general, not simply ground-based microwave radiometers. I would change the sentence to refer simply to the platform (ground-based vs satellite) and not contrast microwave radiometer with satellite, as this doesn’t make sense.
Line 134: Could you say (at some point in the paper, not necessarily here) when the three month training data for your bias correction algorithm was taken?
Figure 2: Could you explain the plot axis label d Tans/ d ln P, as I am not familiar with weighting functions of this type. Are the temperature/humidity increments representative of all data assimilation experiments? If not, please elaborate in the text. For plots c, d, e and f, please label the colourbars and make the magnitude (1eX) more evident.
Line 160: “For HATPRO, more than 6,000 samples are analyzed. The O−B biases are 1.25 K for the K band (channel 1) and 2.14 K for the V band (channel 13)”. Are these statistics the average for the whole band or that particular channel? If they are for the whole band, make this explicit on Figure 3. If not, then these results should not be generalised to the whole band.
Figure 3: Why were these particular channels selected? Could you comment in the text on whether these plots are representative of the whole band? It would also be nice if you could include the same plots for the other channels in the appendix.
With respect to the apparent two clusters in the V band, were the (clustered) offset values consistently from the same radiometers? Do the offsets also correspond to a particular time period (e.g. before vs after calibration) or weather condition (after rain, in direct sunlight)?
line 182: “Regarding the correlation coefficients between observed and simulated brightness temperatures, the overall values are high but slightly lower for channels 4 to 9.”
What is high? Please be more precise.
Figure 4: Please add a colourbar axis label and units for all subplots. For HATPRO channel 14, values are mainly white, but this colour is not included in the colourbar: does that mean that the values are out of range, missing, or otherwise?
Line 215: “These hyperparameters were tuned using GridSearchCV”
Please properly reference scikit-learn (or the library in question) when referring to this.
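For reference, scikit-learn's GridSearchCV performs an exhaustive cross-validated search over a parameter grid; a minimal sketch with an illustrative grid (not necessarily the search space used in the paper):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression problem standing in for the bias-correction task.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))
y = X[:, 0] + 0.1 * rng.normal(size=200)

# Illustrative hyperparameter grid; the paper's actual grid may differ.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

search = GridSearchCV(RandomForestRegressor(random_state=0),
                      param_grid, cv=3, scoring="neg_mean_squared_error")
search.fit(X, y)
best = search.best_params_  # hyperparameters with the best CV score
```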
Line 235: “. Furthermore, some individual channels, such as channel 10, display bimodal distributions.”
As stated above, it would be interesting to see these plots.
Figure 6: temperature and water vapour bands are labelled the wrong way around. I’m not sure if a continuous line plot is the most appropriate for the unlinked variables (bottom plot).
Line 415: “The RTTOV-gb coefficient files are trained on global profiles and are not tailored to the plateau region; consequently, their vertical coordinates extend up to 1050 hPa, while surface pressure in the plateau region typically exceeds 700 hPa.”
Could you please clarify why having the RTTOV-gb coefficient files only extend down to 1050 hPa creates an issue at high-altitude stations? Conceptually, if the sensor is located in a region where the surface pressure is around 700 hPa, that site is “above” the portion of the atmosphere from 1050–700 hPa (which would presumably be below ground in the plateau setting). How does this mismatch in the vertical extent of the RTTOV-gb coefficient files lead to biases for sensors effectively well above 1050 hPa? In other words, why does the “upper pressure limit” become problematic when the actual surface is located at a pressure lower (i.e., a higher altitude) than the coefficient file’s assumed maximum pressure?
Line 435: “It is noted that satellite-based microwave radiometers are primarily sensitive to the middle and upper atmosphere...”
This isn't technically correct. In atmospheric science, the middle atmosphere generally refers to the stratosphere + mesosphere, i.e. from the tropopause (roughly 8–17 km, depending on latitude) up to ~80–85 km. The upper atmosphere is often taken to be the thermosphere and above. Many operational satellite microwave sensors (e.g., AMSU, MHS, ATMS) retrieve temperature and humidity by sounding channels peaked in the troposphere and lower stratosphere, though they can extend somewhat upward. Their highest sensitivity is thus often in the mid- to upper troposphere, not solely above it.
Citation: https://doi.org/10.5194/egusphere-2025-12-RC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote
---|---|---|---|---|---
186 | 47 | 9 | 242 | 4 | 4
Viewed (geographical distribution)
Country | # | Views | %
---|---|---|---
United States of America | 1 | 74 | 30
China | 2 | 64 | 26
France | 3 | 15 | 6
Germany | 4 | 12 | 5
Brazil | 5 | 11 | 4