the Creative Commons Attribution 4.0 License.
An ensemble machine-learning first-guess approach for physics-based retrieval of ice particle size distributions from multi-frequency radar, validated with CCREST-M aircraft observations
Abstract. The Characterising CiRrus and icE cloud acrosS the specTrum-Microwave (CCREST-M) aircraft campaign (February–March 2024) was based around the Chilbolton Observatory, UK, and used the Facility for Airborne Atmospheric Measurements (FAAM) BAe-146 aircraft. The campaign was designed primarily as a testbed for ice-cloud scattering and radiative transfer models across the microwave and sub-millimetre spectrum. A key requirement for such closure tests is a near one-to-one relationship between the ice particle size distributions (PSDs) that enter the radiative transfer model and the radiometric measurements. Because the FAAM BAe-146 aircraft cannot simultaneously perform above-cloud radiometric measurements and in-situ sampling within the same volume of cloud, we retrieve PSDs from the ground-based zenith-pointing radars at the time of the radiometric overpasses and then use the aircraft in-situ PSDs as an independent validation dataset.
We present a novel hybrid retrieval framework for mid-latitude ice PSD parameters (slope λ, intercept N₀, and shape μ of the gamma size distribution) that combines a machine-learning (ML) ensemble with physics-based multi-frequency radar retrievals using 3, 35, and 94 GHz reflectivities. An ensemble of ML models is trained on observations from the Parameterising Ice Clouds using Airborne ObServationS and triple-frequency dOppler radar (PICASSO) campaign, also centred on Chilbolton Observatory. These models predict PSD moments from temperature, pressure, 3 GHz-retrieved ice water content (IWC), and the mean mass-weighted dimension. The ML predictions are converted into first-guess gamma-PSD parameters at each height. A subsequent deterministic optimisation then adjusts N₀ and λ, using a randomly oriented rosette-aggregate scattering model, to enforce simultaneous agreement with the observed 35 and 94 GHz reflectivities. In this way, the ML ensemble acts as a compact, data-driven representation of the prior information, offering an alternative to the Bayesian optimal-estimation framework.
We apply the retrieval to three of the CCREST-M cases with coincident in-situ aircraft data. We show that the ML ensemble reproduces PSD moments well for two cases but fails when extrapolating beyond its trained temperature range in the third case. Retrieved IWCs from the 3 GHz radar compare favourably with PICASSO-derived in-situ measurements of IWC, and exponential (μ = 0) and gamma PSD assumptions show comparable performance overall. Retrieved mean and median PSDs show generally good agreement with in-situ PSDs as a function of temperature, although systematic biases remain in one case, likely due to temporal cloud evolution between radar and in-situ sampling. The IWCs derived from the retrieved PSDs are generally within about 50 % of the in-situ measured IWCs over much of the −50 to −10 °C temperature range, with near-unity agreement between the estimated and in-situ IWCs for one of the cases. Independent validation using 200 GHz radar reflectivity profiles confirms retrieval consistency where ML predictions are reliable and for a well-constrained case, reinforcing the robustness of the retrieval approach and ice crystal scattering model. The retrieved PSDs provide radar-constrained inputs for forthcoming radiative transfer closure studies using collocated mm-wave and sub-mm-wave radiometer observations.
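The abstract's step of converting ML-predicted PSD moments into first-guess gamma-PSD parameters can be illustrated with a minimal sketch. It assumes the standard gamma form N(D) = N₀ D^μ exp(−λD), whose n-th moment is M_n = N₀ Γ(μ+n+1) / λ^(μ+n+1), and a fixed shape μ. The abstract does not state which moments the ML models predict, so the moment orders used here (2 and 4), the function names, and the numerical values are hypothetical, and this is not necessarily the authors' procedure.

```python
import math


def gamma_moment(n0, lam, mu, n):
    """n-th moment of N(D) = n0 * D**mu * exp(-lam * D), integrated over D >= 0."""
    return n0 * math.gamma(mu + n + 1) / lam ** (mu + n + 1)


def gamma_params_from_moments(m_j, m_k, j, k, mu=0.0):
    """Invert two moments (orders j != k) for the intercept n0 and slope lam.

    The shape mu is held fixed; mu = 0 gives the exponential PSD.
    The moment orders are an illustrative assumption, not taken from the paper.
    """
    if j == k:
        raise ValueError("moment orders must differ")
    # M_j / M_k = Gamma(mu+j+1) / Gamma(mu+k+1) * lam**(k - j)
    ratio = (m_j * math.gamma(mu + k + 1)) / (m_k * math.gamma(mu + j + 1))
    lam = ratio ** (1.0 / (k - j))
    # Back-substitute into the expression for M_j to recover the intercept.
    n0 = m_j * lam ** (mu + j + 1) / math.gamma(mu + j + 1)
    return n0, lam


if __name__ == "__main__":
    # Round trip with made-up values: build moments from known parameters,
    # then recover those parameters from the moments.
    true_n0, true_lam, mu = 1.0e6, 2000.0, 1.0
    m2 = gamma_moment(true_n0, true_lam, mu, 2)
    m4 = gamma_moment(true_n0, true_lam, mu, 4)
    print(gamma_params_from_moments(m2, m4, 2, 4, mu=mu))
```

With μ fixed, two moments determine N₀ and λ in closed form, which is why such a conversion can be applied independently at each height before any optimisation against the 35 and 94 GHz reflectivities.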
Status: closed
- RC2: 'Comment on egusphere-2026-784', Haoran Li, 23 Apr 2026
Haoran Li (lihr@cma.gov.cn), Chinese Academy of Meteorological Sciences
This work presents a retrieval algorithm that integrates machine-learning-derived prior constraints with dual-frequency radar inversion. The 3 GHz radar observations provide prior information for the machine learning model, whereas the 35 GHz and 94 GHz radars are adopted for subsequent microphysical parameter retrieval. To the best of my knowledge, CCREST-M data is very valuable from the perspective of aircraft validation to triple-frequency retrievals. The results were compared to aircraft observations and G-band radar observations. I do not have critiques on the presented results, while I have a feeling that this algorithm could benefit from a direct triple-frequency retrieval. Since triple-frequency radars were practically used in this work, it remains unclear why the synergistic retrieval of 3 GHz, 35 GHz and 94 GHz radars was not adopted directly. If the value of single-frequency plus dual-frequency retrieval is well supported, I would be very happy to see its publication.
Major comments:
1. As discussed above, I am very confused that you did not use a triple-frequency approach. Alternatively, is it possible to use the Ka-band or W-band radar in the ML model? Then, you do not need the third frequency.
2. The adequacy of using a single ice type. It seems to be a bold assumption to me that a single ice type was used in the radar retrieval. I understand that the CPI imagers suggest the presence of ice rosettes, but the sampling area is very limited compared to radar observations. I would encourage a thorough discussion on this limitation.
3. Redundant figures for PSD comparisons. It is recommended that in-situ observations, ML predictions and dual-frequency radar retrievals be integrated into a single panel. In this way, direct and intuitive comparisons among PSD results from different methods can be easily conducted.
4. Similarly, Tables 2, 3, 4 should be integrated into one table for a direct comparison. The same to all the PDF plots of different moments (e.g., Figs. 9 and 10).
5. ML predictions for PSD moments were validated to in situ observations, but the dual-frequency validations are missing.
6. I like the G-band validation part. Since the G-band radar was collocated with other radars, I would recommend a long-term validation. I believe it is very handy to implement.
Some technical comments:
1. Figure 2. Mark the periods where the aircraft validations were made.
2. L204. Liquid clouds are not uniformly distributed. How did you do the liquid attenuation correction with LWP?
3. L212. L267. Looks conflicting. You definitely need gaseous attenuation for Ka- and W-band radars.
4. L210. It should be 1 dB. In addition, I would not say ‘much less’ than 1 dB. We recently compared different parameterizations, and it is sub-dB difference.
Li, Q., Li, H., Sun, X., et al. (2026). A survey of snow growth signatures from tropics to Antarctica using triple-frequency radar observations. Atmospheric Chemistry and Physics, 26(2), 1249-1264.
Citation: https://doi.org/10.5194/egusphere-2026-784-RC2
- AC2: 'Reply on RC2', Anthony Baran, 27 May 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-784/egusphere-2026-784-AC2-supplement.pdf
- RC1: 'Comment on egusphere-2026-784', Anonymous Referee #1, 21 Apr 2026
This manuscript introduces a new multi-frequency ice particle size distribution retrieval and evaluates the algorithm against airborne in situ observations from the Characterising CiRrus and icE cloud acrosS the specTrum-Microwave (CCREST-M) campaign, which took place in 2024 at the Chilbolton Observatory in the UK. The paper is well written, and the analyses support the findings. The subject is appropriate for Atmospheric Measurement Techniques. My primary concern is the length of the manuscript. This arises because the paper includes a detailed description of the CCREST-M campaign/measurements as well as the algorithm (which has several elements) and its evaluation using multiple diagnostics from three separate case studies. While this has the advantage of presenting all this information in one place, it results in a very dense manuscript that was somewhat challenging to read. Though this is not a critical flaw (I believe there are no strict word/page limits), I encourage the authors to explore opportunities to make the manuscript more concise. For example, there is some repetition in the results and explanations of algorithm behavior that could probably be shortened. Since this is just a suggestion, my recommendation is to accept with minor revisions.
- The abstract could also be more concise. Some of the background in the first paragraph could be left for the Introduction and some of the details in the 3rd paragraph could be consolidated into a more concise description of the results.
- The authors have decided to combine the description of CCREST-M and the algorithm evaluation into a single manuscript. While I think I understand the motivation for this choice, these topics could probably be covered in two distinct, concise, and equally interesting, papers.
- The results (Section 5) are very thorough, but somewhat repetitive. For example, many figures are repeated for different variables, temperature ranges, and different cases and several points are reiterated multiple times. I recognize that each case illustrates a distinct aspect of algorithm performance, but I wonder if the results would be more impactful if they were conveyed more succinctly, perhaps consolidating the complementary findings from individual cases/comparisons where possible. Also, could some of the figures that represent variants of the same plot be moved to an appendix or supplementary material?
- The labels on several figures are very small (e.g. Figures 4, 5, 8, 9, and 10), some to the point of being illegible (Figures 11, 12, 14, 17, and 20–23).
Citation: https://doi.org/10.5194/egusphere-2026-784-RC1
- AC1: 'Reply on RC1', Anthony Baran, 27 May 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-784/egusphere-2026-784-AC1-supplement.pdf