This work is distributed under the Creative Commons Attribution 4.0 License.
Improved Simulation of Thunderstorm Characteristics and Polarimetric Signatures with LIMA 2-Moment Microphysics in AROME
Abstract. Thunderstorm forecasting remains challenging despite advances in numerical weather prediction (NWP) systems. The microphysics scheme, which represents clouds in the model, is one source of uncertainty in the simulations. To better understand the discrepancies, synthetic radar data simulated by a radar forward operator (applied to model outputs) are usually compared to dual-polarization radar observations, as the latter provide insight into the microphysical structure of clouds. However, despite the diversity of microphysics schemes and forward operators, the modelling of polarimetric values and radar signatures such as the ZDR column (ZDRC) remains a complex issue, especially above the freezing level, where simulated values are often too low.
The aim of this work is to assess the ability of the AROME NWP convective model, when coupled with two distinct microphysics schemes (ICE3 one-moment and LIMA partially two-moment), to accurately reproduce thunderstorm characteristics. A statistical evaluation is conducted on 34 convective days of 2022 using both a global and an object-oriented approach, and a ZDRC detection algorithm is implemented. Simulations performed with LIMA microphysics showed good agreement with observed ZH, ZDR and KDP below the melting layer in convective cores. Moreover, they demonstrated a remarkable capacity to generate a realistic number of ZDRCs, as well as distributions of (1) ZDRC area and (2) first ZDRC occurrence that are very close to the observations. Enhancements to the forward operator are also suggested to improve the simulations in the mixed-phase and cold-phase regions.
These findings are highly encouraging in the context of data assimilation, where the combination of advanced microphysics schemes and improved forward operators could be leveraged to enhance storm forecasts.
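For context, ZDRC detection algorithms in the literature typically flag columns where ZDR remains enhanced from the environmental 0 °C level upward; a minimal sketch of that general idea in Python (the threshold and continuity criterion are common choices from the literature, not necessarily those implemented in this paper):

```python
import numpy as np

def zdrc_depth(zdr, z_agl, z_freezing, zdr_thresh=1.0):
    """Depth (m) of a ZDR column above the freezing level at one (x, y) point.

    Generic sketch: starting at the 0 deg C level, accumulate the depth over
    which ZDR continuously exceeds `zdr_thresh` (1 dB is a common choice; the
    paper's own criteria may differ). `zdr` and `z_agl` are 1-D vertical
    profiles, with `z_agl` sorted in ascending order.
    """
    above = z_agl >= z_freezing
    exceed = zdr[above] > zdr_thresh
    dz = np.diff(z_agl[above], prepend=z_freezing)  # layer thicknesses
    depth = 0.0
    for ok, step in zip(exceed, dz):
        if not ok:
            break  # the column must be vertically continuous
        depth += step
    return depth  # a ZDRC is detected where depth exceeds a minimum value
```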
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-685', Anonymous Referee #1, 26 Mar 2025
General comments:
The paper analyses the ability of the AROME NWP convection model to represent convective thunderstorms. The model is run with two different microphysical schemes, which are validated on the basis of the French polarimetric radar network for 34 convective days in 2022.
In general, the paper is well written; the work is comprehensive, but also quite lengthy as a result. However, I would not have a direct suggestion on how it should be shortened: the introduction is well researched, chapters 2 and 3 are mandatory for data and methodology, and the results section with comparisons of a) precipitation, b) polarimetric moments, and c) ZDR columns is adequate. I also liked the appendix with the more technical insights into the forward operator's melting scheme. Nevertheless, I would have a few minor points regarding clarity. Otherwise, I am pleasantly surprised that I found hardly any technical corrections.
Specific comments:
L117: Why were not all 51 days selected for this study? Is it because only 34 days of these 51 days have QC1 or QC2? That is not entirely clear here.
L124: I wonder whether AROME is perhaps predicting too many convective events in general because the events were selected for days on which thunderstorms actually occurred. It would be interesting to know whether AROME perhaps has a too high false alarm rate, i.e. whether too many thunderstorms were predicted even on calm days.
L126: So the newly calculated polarimetric data come from all three bands that were mixed together? Depending on the bands, the distance and the hydrometeors present, the polarimetric signal could be different. What is selected then? I assume that ANTILOPE QPE takes this into account, but how is this taken into account in section 4.3?
L255: Is coverage by 3 different radar stations per grid box required? Or do you mean at least 3 radar observations per grid box? If 3 radar stations are really required, you would have to exclude a lot of data, or how dense is the radar network? And again, how are the S, C and X bands combined?
L259: Have you analyzed the impact of the lead time of your prediction? In other words, does it make a difference how long after the NWP start the event took place? I would imagine that your results could be different depending on the lead time, as the model tends to produce fuzzier predictions with longer lead times.
L357: A 50 x 50 km box means that, with the regridded radar resolution of 500 m per lon/lat, each box contains 10000 values, right? And from that the 99th percentile is taken? And what about the model data with 1.3 km resolution? Something like 1600 values per 50 x 50 km box? Why are you not using the same grid?
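For reference, the per-box sample sizes quoted here follow directly from the grid spacings (my own arithmetic, assuming regular grids at the stated resolutions):

```python
# Number of grid points per 50 x 50 km evaluation box at each resolution
# (illustrative arithmetic, assuming regular grids; not code from the paper).
box_km = 50.0
n_obs = (box_km / 0.5) ** 2    # radar grid at 500 m spacing -> 10000 values
n_mod = (box_km / 1.3) ** 2    # model grid at 1.3 km spacing -> ~1479 values
print(int(n_obs), int(n_mod))  # 10000 1479
```

At 1.3 km spacing the exact count is closer to 1480 than 1600, but the referee's point stands: the two grids contribute very different sample sizes per box.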
L365: How is the bootstrapping done? Out of the 50x50 km boxes? How often?
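For concreteness, one standard way to bootstrap at the box level, as a generic sketch of what the referee may be asking about (the function name and procedure are my assumptions, not the paper's):

```python
import numpy as np

def bootstrap_box_mean(box_scores, n_boot=1000, seed=0):
    """Resample per-box scores with replacement to estimate a 95 % confidence
    interval for the mean. A generic box-level bootstrap sketch; whether the
    paper resamples boxes, days, or cells (and how often) is the question."""
    rng = np.random.default_rng(seed)
    box_scores = np.asarray(box_scores)
    means = [rng.choice(box_scores, size=box_scores.size, replace=True).mean()
             for _ in range(n_boot)]
    return np.percentile(means, [2.5, 97.5])
```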
L386/ Figure 3: As far as I understand, there is a discrepancy between the resolution of the model and the observation grid. So does it make sense to compare the very small observed cells with the coarser model? I don't understand that here. And the discussion about the small cells being ‘simulated too rarely’ is then misleading. How should a model with a resolution of 1.3 km be able to simulate these very small cells (i.e. with a size of about 0 km^2)? I think there should be a discussion about model resolution in this section. The same goes for the discussion about lifetimes: I don't think AROME is able to predict the very small lifetimes. That is related to the model resolution.
L416: Why should LIMA be compared to OBS_no_hail? LIMA does not directly include hail, but it is included in the graupel class.
L425/Figure 6: Why CFADs instead of CFTDs (Contoured Frequency by Temperature Diagrams)? Maybe then the melting layer would be sharper and the whole discussion in 4.3 more meaningful. I could imagine that the 44 events vary a lot in surface temperature, and that the distributions of observed pol. moments in Fig. 6 are therefore broader.
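For illustration, a CFTD differs from a CFAD only in the vertical binning coordinate (temperature instead of altitude), which aligns the melting layer across events with different surface temperatures; a minimal sketch, not taken from the paper:

```python
import numpy as np

def cftd(values, temperature, v_bins, t_bins):
    """Contoured Frequency by Temperature Diagram: 2-D histogram of a
    polarimetric variable binned by temperature instead of altitude,
    normalized per temperature bin. Generic sketch of the referee's
    suggestion; argument names are illustrative."""
    h, _, _ = np.histogram2d(temperature.ravel(), values.ravel(),
                             bins=[t_bins, v_bins])
    return h / np.maximum(h.sum(axis=1, keepdims=True), 1)  # frequency per level
```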
Technical corrections:
L366/Figure 2: "The HSS score": leave out "score" as it is already in HSS.
Figure 9: 'Altitudes are given in km AGL': Is this a remark for Figure 8?
Citation: https://doi.org/10.5194/egusphere-2025-685-RC1
RC2: 'Comment on egusphere-2025-685', Anonymous Referee #2, 28 Mar 2025
Summary
This study presents an evaluation of the AROME model with two cloud microphysics schemes based on polarimetric radar observations over a dataset of 34 days of convective precipitation in France. The evaluation encompasses general precipitation metrics and convective cell characteristics, with a special focus on ZDR column statistics. Leveraging polarimetric radar observations for model evaluation is a promising approach, and this study benefits from a large sample size. It is remarkable that the model with LIMA is able to reproduce area and frequency statistics of ZDR columns so well, as demonstrated by the authors. I find the paper to be well written, well structured and with informative and well visualized images. The methods are explained thoroughly and clearly. The content is extensive, resulting in a very long paper. I appreciate the discussion about limitations of the approach. In general, I think this paper is already of high quality, and my suggestions below are mainly to address clarity or for quality improvements.
General comments
- At some instances, I find the result section to be rather descriptive and to sometimes lack possible explanations for the observed discrepancies. This concerns mainly section 4.2. It would be interesting to know about potential reasons for these discrepancies. This might be related to grid spacing or interpolation issues and should be discussed.
- I do not fully understand the focus on the problems regarding the bright band and melting scheme of the forward operator as it seems to me that this is not important for the purpose of this study. The uncertainties in that region are dominated by the forward operator and as such do not help much in evaluating the microphysics model. Perhaps you could add a paragraph explaining why you think this is important for your topic. Alternatively, less attention to this topic might be a way to reduce the length of this paper.
Specific comments
- Line 6: I think I disagree with the logic. I would argue it is the other way around: It is a complex issue, which results in a diversity of forward operators and microphysics schemes.
- Lines 36 - 57: This paragraph is well written and I enjoyed reading it. However, I do not understand the focus on nowcasting in the scope of your paper. Generally, AROME is not used for nowcasting, is it? I do think there is plenty of motivation to analyze ZDR columns anyway, as is described, because of data assimilation, and because ZDR columns are an indicator to analyze the performance of models in terms of correct updraft characteristics.
- Line 118: Why do you select only 34 of the 51 days with hailstones?
- Line 126: How was the network in 2022, given your evaluation is done with data from 2022?
- Line 127: Does the radar frequency have an effect on your evaluation? E.g., are ZDR columns or convective cells detected in X-band the same way as in S-band?
- Lines 171 and 186: What do you mean by flexible? I would suggest either explaining or omitting this word.
- Line 199: If hail is included in the graupel species, what does that mean for the density of this particle class? Given that you evaluate hail events, I would expect differences to the observed events purely as a result of the particle property assumptions here. Is there a reason why you chose not to use hail as a sixth category?
- Line 217: Why is melting snow transferred to graupel instead of rain? Is this a common approach?
- Line 249: I do not understand how the nearest radar determines the ROI for a given grid point. How is the radar affecting the radius that is taken into account for interpolation to a given grid point? Especially since you use multiple radars for interpolation, but only one determines the ROI?
- Line 256: Do you mean three radars are required for each 3D grid point? Or only after projection to a 2D plane? I suspect the latter, because only then your argument with vertical coverage makes sense. Perhaps you can rephrase this to make it clearer.
- Line 261: The word 'constrained' sounds like you make the domain smaller. But I think your point is that you take an extra area around the observed event into account, to include also mislocated predicted convection. Perhaps rephrase this to make this more clear.
- Line 264: So the model output is at a different horizontal resolution than the radar observations. Also, given the much higher native resolution of the radar compared to the model, I would expect differences that are solely based on these resolution effects. This is never discussed in this paper, but might be important.
- Line 281: Perhaps for clarification you could write 'top of the ZDR column'. This is clear from the context, but one might also think of the full model column here.
- Line 313: What effect does that have? If you have a large precipitation area of > 36 dBZ, with multiple embedded cores of 40 dBZ < max reflectivity < 48 dBZ, then they would count multiple times for your simulations, but only one time for your observations. Am I understanding this right? If only the highest reflectivities are missing in simulations (e.g., due to density assumptions), then it would perhaps be fairer to lower the threshold for both observations and simulations. Perhaps a simple histogram of the observed / simulated reflectivities could help identify which range of reflectivities is affected by this bias (a sketch of such a comparison is given after this list). Perhaps you could also elaborate on why exactly you chose to reduce the threshold to 40 dBZ, and not any other number. And do you have an idea about the reason for the bias? I think this was not mentioned in the result section. I think this choice requires a bit more discussion about the reasoning and the implications.
- Line 319: It is not clear to me what the objects are that you use for later analysis. Are these the features identified in the identification step with 36 and 40/48 dBZ? Or the areas as defined in the segmentation step with 40 dBZ?
- Line 319: Following up on the previous question, what happens if you identify a core with a maximum reflectivity of less than 40 dBZ? Then you have identified a core, but no area assigned to it, given that the threshold you use for segmentation is higher than 36 dBZ, in my understanding.
- Line 329: I assume you use a 2 dB threshold as described earlier for the segmentation process? This should be mentioned here.
- Line 359: I do not understand the reasoning here. Convective precipitation is indeed typically associated with intense precipitation. However, RRtot is the total accumulated precipitation over the entire event length, as you describe in line 352. That means RRtot is not directly an intensity measure, as an event of medium/low intensity that lasts over a long time period might also produce high RRtot values.
- Line 373: This is a significant difference between model and observations of more than a factor of 10. Is there an explanation for this? Is this a known problem of the model?
- Line 439: Why is KDP increasing towards the ground in LIMA, even though the actual rain mass is decreasing? The explanation of KDP as a proxy for the amount of liquid water seems to hold only for ICE3, not for LIMA.
- Line 454: Could there be other differences between observations and model purely as a result of interpolation issues? Perhaps this should be discussed, similar to grid resolution differences.
- Part 4.3.2: It seems to me that the melting scheme is producing unrealistically strong melting signals. Are the bright band signatures important for your study? Perhaps an analysis without the bright band and without a melting scheme might provide results just as good?
- Line 472: How can spherical droplets produce a ZDR that is not 0?
- Line 468 - 477: This is a forward operator issue. The simulated ZDR/KDP depends strongly on the orientation and shape of the particles. What does the forward operator assume here for graupel and snow? This could also be related to the forward operator being based on the T-matrix, as the T-matrix assumes soft spheroids and hence cannot represent the density/shape of realistic aggregates which can strongly deviate from soft spheroids. You discuss this well in your discussion section 5.1. I would suggest to mention here that there is a discussion about this later on.
- Line 484: Is this really the main reason for the reflectivity differences? The ice class surely stands out in the mixing ratio plot (6d / 6e), but mainly due to a high relative difference (basically no ice in ICE3). However, the simulated snow mixing ratio also deviates a lot between the schemes. And at least the Q75 for graupel is significantly higher in LIMA too. Generally, the effect on reflectivity for a given mixing ratio depends a lot on the assumed density and the simulated particle sizes. Drawing conclusions about reflectivity biases from mixing ratio differences, without context about size distributions and density assumptions, is difficult. Some radar forward operators distinguish between class contributions to the total signal in their output. Is this perhaps possible with the radar forward operator applied here?
- Line 514: Do you have an idea about why cloud water is available in such a high proportion within the identified ZDR columns? Cloud water should not increase ZDR, and as such should not be relevant to ZDR column detection, in my understanding. Is there a physical reason why cloud water is enhanced within ZDR columns, e.g., updrafts?
- Lines 524-531: While I do agree that high ZDR is potentially a result of large raindrops, I am not sure about size-sorting as the reason, because the ZDR is already very high at 3.5 - 4 km. However, comparing the ZDR distribution to the mixing ratio of rain, it seems convincing that rain is generally responsible for the high ZDR values. I am also not sure about your argument regarding the melting of graupel: 1) ZDR values are already too large above the melting layer, and 2) if melting graupel is the reason for too large rain drops, then the ZDR values within the convective core (Figure 6b) would also be too large. Perhaps some updraft-related process is producing rain drop sizes that are too large? I am not sure about the actual reason, but this would be interesting to discuss.
- Line 549: How did you come up with these numbers? Based on previous studies? Based on your own observations?
- Line 561: Was this discussed before? You say, 'to summarize', but I could not find a discussion about the lack of rain (relative to ICE3 and only in ZDRC, I assume) and the impact on liquid water fraction.
- Line 577: Are you referring to the ZDRC? Within the convective cores, both schemes show the same rain mixing ratio distribution.
- Line 668: You applied the same threshold for all simulations/observations. That means a lower number of ZDRC in the ICE3 simulations is not purely a result of the threshold, but rather points to the inability of that model to produce high ZDR values in these columns.
- Line 677: If the detected ZDRCs consist mainly of wet graupel, this should be part of the ZDRC section (4.4). In that section, the reasons for the discrepancies should be discussed and the impact of wet graupel is not mentioned so far.
- Line 677: Following up, I would expect that graupel does not produce high ZDR. Why do you think that the detected ZDRCs consist of wet graupel? If that is the case, why is LIMA then detecting so much more ZDRC than ICE3, given that ICE3 has higher wet graupel water fraction? (Figure 8d)
- Line 687: How do you know that the rain mixing ratios below freezing temperatures were insufficient? You don't have real measured rain mixing ratios. I agree that rain is probably not reaching up high enough given that the ZDR columns reach up higher in observations. But statements about biases in the actual mixing ratios are difficult to make in your context. If you think statements about actual mixing ratio biases are possible from your analysis, please elaborate.
- Line 709: What could be reasons for this?
- Figure 1: The ZDR column contours are really hard to see. Perhaps a different color would help to distinguish them from the background map? I am not sure. Also, what are the spherical grey dots with black contours? Generally, the best way to demonstrate the tracking behavior would be a video as a supplement. I am aware that this might be too much work, so this is just a suggestion.
- Figure 5: Reflectivities below 36 dBZ cannot exist, due to your tracking threshold of at least 36 dBZ. Perhaps adjust the x-axis limits correspondingly; otherwise one might be confused by the fact that there are never reflectivities below 36 dBZ.
- Figure 5: Is this the maximum cell reflectivity over the whole lifetime of the storm? Or does a cell contribute multiple times, like in Figure 3?
- Figure 5: Perhaps the absolute frequency is more helpful than the relative frequency. I would expect the observations then to overestimate the smaller max reflectivity values. Is this the case?
- Figure 5: It is remarkable that small reflectivities are as frequent as high reflectivities, e.g., that this is more or less a Gaussian distribution. I would have expected the extremely high reflectivities to occur much more rarely than smaller max reflectivities. Is this a result of the events you chose? Or a result of the thresholds applied within the tracking?
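Returning to the comment on Line 313 above: a minimal sketch of the suggested observed-versus-simulated reflectivity histogram (illustrative only; the input arrays and their names are hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

def compare_reflectivity_histograms(zh_obs, zh_sim, bins=np.arange(36, 66, 2)):
    """Overlay observed and simulated reflectivity histograms to see which
    dBZ range drives the bias discussed around Line 313. Inputs are flat
    arrays of, e.g., per-cell maximum reflectivities in dBZ; this is an
    illustrative sketch, not the paper's diagnostic."""
    plt.hist(zh_obs, bins=bins, alpha=0.5, density=True, label="observed")
    plt.hist(zh_sim, bins=bins, alpha=0.5, density=True, label="simulated")
    plt.xlabel("ZH (dBZ)")
    plt.ylabel("relative frequency")
    plt.legend()
    plt.show()
```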
Technical corrections
- Line 708: Our results show
- Figure 8: Description of d) missing.
Citation: https://doi.org/10.5194/egusphere-2025-685-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 96 | 14 | 2 | 112 | 5 | 3 |
Viewed (geographical distribution)
| Country | Rank | Views | % |
|---|---|---|---|
| France | 1 | 27 | 25 |
| United States of America | 2 | 26 | 25 |
| China | 3 | 13 | 12 |
| Germany | 4 | 13 | 12 |
| United Kingdom | 5 | 4 | 3 |