This work is distributed under the Creative Commons Attribution 4.0 License.
Benefits of the simplified MEV for analyzing hourly precipitation extremes in a changing climate
Abstract. Predicting the likelihood of extreme hourly rainfall events is crucial for mitigating risks associated with flash floods and related hazards. Previous research shows that, for limited sample sizes, the simplified Metastatistical Extreme Value (sMEV) distribution can significantly reduce the uncertainty in rainfall return levels compared to the more commonly used Generalized Extreme Value (GEV) distribution. Recent research also highlights the possibility of analyzing the effects of climate change using non-stationary versions of both distributions. Thus, we evaluate the performance of the sMEV and GEV distributions for hourly precipitation obtained from a convection-permitting regional climate model. The global climate model MIROC5 is employed to drive the regional climate model COSMO over the greater Germany area for historical, near-future and far-future periods. To our knowledge, this is the first application of the sMEV distribution to time series from a convection-permitting model. The results show that the sMEV outperforms the GEV in terms of uncertainty across almost all return periods, regardless of the length of observational records. In addition, there is a north-south gradient in the return level difference, the uncertainty difference and, crucially, the adequacy of the sMEV left-censoring threshold. Investigating non-stationary versions of the sMEV and GEV shows that the non-stationary sMEV is more suitable for describing the change in return levels under climate change. However, both non-stationary versions analyzed lack complexity and should be used carefully when projecting future rainfall extremes.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6419', Anonymous Referee #1, 12 Jan 2026
AC1: 'Reply on RC1', Marc Lennartz, 23 Feb 2026
Dear Referee,
Thank you for your timely review and suggested improvements. We agree with your critique that the objective of the study needs to be clarified, as well as with most of your specific comments. By restating the aim and novelty of the study more clearly, we hope to alleviate your concerns and improve the quality of the study. We will start with an explanatory introduction before answering your comments (in italics) point by point.
The objectives of the study are to: (i) quantify differences in return levels between sMEV and GEV under uniform left-censoring across different global warming levels, return periods, and regions; (ii) assess differences in sensitivity to small samples across return periods, sample sizes, and regions; and (iii) explore the adequacy of non-stationary formulations in representing the relationship between regional temperature changes and return levels.
While we hold the opinion that the study provides important and original results, we agree that the claim of novelty needs to be restated. We will summarize the novelty in three points:
(i) For Germany, we analyze the effects of climate change on hourly return level differences between the GEV and sMEV on an unprecedented spatial scale. Among other spatial and temporal patterns, our analysis shows that the return level differences between estimates made using the GEV and sMEV reverse in a warmer climate. These findings are novel, especially at the spatial scale at which they are presented.
(ii) The sensitivity to small samples for hourly quantiles, quantified using the RRMSE, shows interesting spatial differences. These spatial variations are also present in the RRMSE of the individual distributions. The patterns across return periods and sample lengths are similar to results from Schellander et al. for daily precipitation over Austria (Schellander, H., Lieb, A., and Hell, T.: "Error structure of metastatistical and generalized extreme value distributions for modeling extreme rainfall in Austria", Earth and Space Science, 6(9), 2019), even though a slightly different methodology is used. However, for hourly precipitation over Germany, we newly find that the sMEV provides added value over the GEV in terms of RRMSE across the full matrix of sample lengths and return periods, while Schellander et al. found added value only for return periods above the sample length and for sample lengths of less than 30 years.
(iii) The last novelty is the exploration of non-stationary versions of the sMEV and GEV. Here we try to gauge how well two example formulations capture changes in return levels under global warming. This comparison is novel for hourly precipitation over such a large continuous spatial domain. However, the comparison is done without clear performance indicators. Nonetheless, it can provide a helpful point of reference for future studies.
As pointed out, statements claiming a “first application” of sMEV to convection-permitting simulations were included incorrectly and will be removed.
Major comments:
If the methodological objective is to say that the SMEV method is more robust than the GEV approach, there is nothing new here, as this has been demonstrated in previous work.
We agree with this statement and admit that objective has not been formulated clearly. We will revise the manuscript according to the points raised in the introductory part above and clarify where our work goes beyond previous work.
On the other hand, I think it would be interesting to present future projections for hourly rainfall in Germany using these two approaches. If this objective is retained, a more comprehensive literature review is needed to explain what projections are currently available in Germany on hourly extremes and how the results of this study either support them or produce new results.
Thank you for the suggestion of viewing the results of this study from a different angle. We agree that comparing return level estimations to other studies would be interesting. We will add a set of figures in the supplement showing the spatial distributions of return levels for different return periods and different regional warming levels. We will add two of these maps for 10-year return levels, as well as two maps showing the spatial distribution of the temperature scaling rate, in the main article. This will make it possible to directly compare the results of the GEV and sMEV methods to the peaks-over-threshold method, as calculated by Rybka et al. (2023) for the same data set. However, we note that gauging the reliability and plausibility of such projections is out of scope, as they are governed by model uncertainty, scenario uncertainty, and internal climate variability beyond the differences in extreme value approaches.
[…] In fact, this is not really what is done here to evaluate the performance of one model compared to another; they must be compared to an observed reference. Furthermore, talking about performance is not very clear here, as it is not a very specific objective.
We agree that the phrase "evaluate performance" could be misleading by implying that the distributions are evaluated based on comparisons between their respective quantile estimations and a common ground truth. In fact, the return level estimations are not validated, and only the sensitivity to small samples is used as a performance-related metric. However, this is a very important property, especially when estimating rare extremes, and it has been applied in many similar studies. We believe that the presented variability in sensitivity between the GEV and sMEV is not trivial and exhibits interesting patterns across return periods and across space. This has implications for deciding which distribution is most appropriate when estimating a specific return period across Germany.
A comparison to an observed reference, in turn, evaluates the capabilities of the climate model to reproduce the characteristics of hourly precipitation extremes. This has already been done by Rybka et al. (2023) as described in lines 106ff. For clarification, we will slightly extend this description so that the reader is well aware of the comparison COSMO vs. observational reference.
To our knowledge, this is the first application of the sMEV distribution to time series from a convection-permitting model => this is wrong, see these references below, among others: […] Generally speaking, as soon as I read "for the first time" in an abstract, I think to myself that the results of an article are potentially overstated.
Thank you for pointing out this inaccuracy. The relevant statement will be removed, and paragraphs claiming novelty in this context will be reformulated accordingly.
Specific comments:
Page 4, line 95. It is somewhat surprising to read that the scenario is not suitable because it is too pessimistic. A little more context is needed here to explain why this scenario was selected in this context.
We agree that the section should be reformulated to provide better clarity. Here, the term "pessimistic" referred to recent literature suggesting that future emissions may remain below RCP8.5 levels (Hausfather, Z. and Peters, G. P.: "Emissions – the 'business as usual' story is misleading", Nature, 2020). The scenario was selected because it spans a wide range of regional warming levels, allowing analysis beyond weak warming conditions.
Page 5, line 115. It is very good to cite the sources for the codes used. However, it would be interesting to specify what changes have been made and possibly produce the modified code.
Thank you for the suggestion. We will now clearly specify that only selected sub-functions from the existing code (e.g. event separation and inversion routines) were used, while the greater framework of the analysis was developed independently. The full workflow, including preprocessing, main calculations, and plotting functions, will be made publicly available via the GFZ data repository.
Page 5, line 116 The lmoment method does not focus solely on extreme values. Above all, the method allows for a more robust estimation of the parameters. This needs to be changed here.
We agree that the original phrasing was lacking important properties of the L-moments approach. We will add an emphasis on robust parameter estimation rather than focusing solely on extremes.
Page 5, line 117 There are also regional approaches that provide much more robust estimates for parameters of frequency models compared to pixel-by-pixel estimation. Even though such approaches are not used here, I think we should include this methodological warning.
Thank you for this important context, which will be added as a limitation. More specifically, we can mention the study by Schellander et al. (2019), where spatially smoothed parameter values were used as the "ground truth" against which the individual point values were evaluated. Furthermore, we will add a few sentences on regional frequency analysis (RFA) and "smooth GEV models" as in Blanchet & Lehning (2010, https://doi.org/10.5194/hess-14-2527-2010), which estimate smooth trend surfaces for the GEV parameters from observational data and allow for covariates for enhanced robustness.
Page 5, line 120. I think some comments should be added about the difficulty of working with 200-year return periods calculated from 30-year time series, particularly for highly variable precipitation extremes. I think this is a highly questionable methodological choice given the methodology and data available.
We acknowledge the difficulty of estimating 200-year return periods from 30-year samples. However, this choice was made deliberately, as one advantage of the sMEV framework is the robust estimation of extremes from short samples. The large uncertainties present in such return period estimations will be discussed more explicitly.
Page 5, line 123. This is my main issue with the methodology here. I don't see how repeating by bootstrap observational data over a long series will give more confidence in validating one model over another. I think the methodology here is not suitable for comparing the two distributions. To compare distributions, you need to use observational data and compare the fit to the observed data and calculate, for example, the errors between observed and simulated quantiles for the two distributions.
The reviewer is correct that the bootstrap approach is not suitable for validation against observations. We would like to clarify that the aim is instead to compare the sensitivity to small samples. This is quantified using the relative root mean square error (RRMSE), applied to simulated climate model data.
You can also look at the confidence interval produced by the two approaches with the two distributions.
If we understand the suggestion correctly, it seems to us that the RRMSE values we present are not very different from calculating confidence intervals. Confidence intervals are typically calculated via bootstrapping the available years. They show the sensitivity of the return levels to sample variability. Similarly, via bootstrapping we show the return level sensitivity to the combined effect of sample variability and sample length.
In addition, as pointed out by the authors, this bootstrap approach does not take into account temporal dependence or possible trends at all. So it is difficult to see how this approach can be adapted to model comparison.
Indeed, effects of interannual autocorrelation and temporal trends are assumed to be negligible when comparing the sensitivity to small samples between the GEV and sMEV distributions. Justifying this assumption demands further investigation. We now compare the results of the RRMSE when applied to the three time periods individually. These time periods are themselves not stationary, but they exhibit significantly less variability within each period than across periods. These newly produced results reveal the same spatial patterns as well as similar relative differences between the GEV and sMEV for sample sizes below the maximum of 30 years. These figures will be added in the supplement of the revised article.
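For illustration, the bootstrap-based RRMSE comparison can be sketched as follows. This is a minimal stand-in, not the manuscript's implementation: it fits the GEV by maximum likelihood via scipy (the study uses L-moments), the annual maxima are synthetic, and all function and variable names are our illustrative choices.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(42)

def gev_return_level(annual_maxima, return_period):
    """Fit a GEV to annual maxima (MLE here; the study uses L-moments)
    and invert it for the requested return period."""
    shape, loc, scale = genextreme.fit(annual_maxima)
    return genextreme.ppf(1.0 - 1.0 / return_period, shape, loc=loc, scale=scale)

def bootstrap_rrmse(annual_maxima, sample_size, return_period, n_boot=500):
    """Relative RMSE of return levels estimated from bootstrap subsamples
    of `sample_size` years, relative to the full-sample estimate."""
    reference = gev_return_level(annual_maxima, return_period)
    estimates = np.array([
        gev_return_level(
            rng.choice(annual_maxima, size=sample_size, replace=True),
            return_period,
        )
        for _ in range(n_boot)
    ])
    return np.sqrt(np.mean((estimates - reference) ** 2)) / reference

# Synthetic 30-year record of hourly-precipitation annual maxima (mm/h)
maxima = genextreme.rvs(-0.1, loc=20, scale=5, size=30, random_state=1)
print(bootstrap_rrmse(maxima, sample_size=10, return_period=100))
```

Scanning `sample_size` and `return_period` over their ranges yields an RRMSE matrix; repeating the same loop with an sMEV estimator in place of `gev_return_level` gives the matrix used for the comparison.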
Page 6, line 147. The GEV parameters are associated with temperature as a covariate. However, there is no justification of the added value of this covariate. For example, a deviance test could be used to verify the added value of this covariate compared to a stationary model. Otherwise, the authors do not provide sufficient guarantees to verify that the principle of parsimony is satisfied.
We agree that the added value of temperature as a covariate should be formally tested. The revised manuscript will therefore include a deviance test comparing stationary models to non-stationary models with temperature and time as covariates. More specifically, we will estimate the Kullback–Leibler divergence using the corrected Akaike information criterion, similar to Kim et al. (2017, http://dx.doi.org/10.1016/j.jhydrol.2017.02.005).
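As an illustration of such a comparison, the following sketch contrasts a stationary GEV with a GEV whose location depends linearly on a temperature covariate, scoring both with the corrected AIC. The parameterization, starting values, and synthetic data are illustrative assumptions, not the manuscript's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def aicc(neg_loglik, k, n):
    """Corrected Akaike information criterion for k parameters, n data points."""
    return 2.0 * neg_loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def nll_stationary(params, x):
    shape, loc, scale = params
    if scale <= 0:
        return np.inf
    return -np.sum(genextreme.logpdf(x, shape, loc=loc, scale=scale))

def nll_nonstationary(params, x, T):
    """Location parameter varies linearly with the temperature covariate T."""
    shape, loc0, loc1, scale = params
    if scale <= 0:
        return np.inf
    return -np.sum(genextreme.logpdf(x, shape, loc=loc0 + loc1 * T, scale=scale))

# Synthetic annual maxima whose location drifts with regional warming
T = np.linspace(0.0, 2.0, 90)                               # warming covariate (K)
x = genextreme.rvs(-0.1, loc=20 + 5.0 * T, scale=5, random_state=0)

fit_s = minimize(nll_stationary, [0.0, np.mean(x), np.std(x)],
                 args=(x,), method="Nelder-Mead")
fit_ns = minimize(nll_nonstationary, [0.0, 20.0, 5.0, 5.0],
                  args=(x, T), method="Nelder-Mead")

print("AICc stationary:    ", aicc(fit_s.fun, k=3, n=len(x)))
print("AICc non-stationary:", aicc(fit_ns.fun, k=4, n=len(x)))
```

A lower AICc for the non-stationary fit indicates that the covariate adds value beyond its extra parameter; equivalently, the deviance 2·(nll_stationary − nll_nonstationary) can be compared against a chi-squared quantile.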
Furthermore, it is rather difficult to understand why, in the context of trend detection, time is not used as a covariate, which is the most commonly used approach.
Regional temperature was chosen because of its more direct physical relationship with precipitation extremes compared to time. More specifically, in central Europe the moisture for precipitation extremes is supplied by local sources (Keune, J. and Miralles, D. G.: "A precipitation recycling network to assess freshwater vulnerability: Challenging the watershed convention", Water Resources Research). The assumption is that higher regional temperatures feed back positively on the intensity of these extremes, in line with the thermodynamic scaling of extreme precipitation.
Page 8, line 215 I don't understand the consistency here between using a GEV with a linear dependence on temperature, an SMEV model with an exponential dependence, and then explaining in this section that ultimately a linear dependence is used.
We would like to clarify that the ns-GEV and ns-sMEV formulations differ fundamentally: the ns-GEV includes a linear temperature dependence in the location parameter, while the ns-sMEV combines an exponential dependence in the scale parameter with a linear dependence in the rate of ordinary events. The relevant section will be rewritten to remove ambiguity.
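To make the distinction concrete, the two non-stationary dependences can be written out as follows (function and parameter names are our illustrative choices; the manuscript gives the exact formulations):

```python
import numpy as np

def ns_gev_location(T, mu0, mu1):
    """ns-GEV: the GEV location parameter varies linearly with the
    regional temperature anomaly T (in K)."""
    return mu0 + mu1 * T

def ns_smev_weibull_scale(T, C0, alpha):
    """ns-sMEV: the Weibull scale of the ordinary events varies
    exponentially with T; alpha = ln(1.07) ≈ 0.068 corresponds to a
    ~7 %/K (Clausius-Clapeyron-like) scaling."""
    return C0 * np.exp(alpha * T)

def ns_smev_event_rate(T, n0, n1):
    """ns-sMEV: the mean number of ordinary events per year varies
    linearly with T."""
    return n0 + n1 * T

# e.g. a 7 %/K scale increase after 2 K of regional warming:
print(ns_smev_weibull_scale(2.0, C0=1.0, alpha=np.log(1.07)))  # ≈ 1.145
```

The ns-GEV thus shifts the whole distribution of annual maxima, while the ns-sMEV changes both the tail of the ordinary events and how many of them occur per year.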
Page 11, line 258. It is unclear how the root mean square error is calculated here. This concerns the observed data? or the simulation approach proposed in the methodology?
The RRMSE is calculated using only simulated data, as detailed in the methodology. The section will be rewritten to remove ambiguity.
Page 15, line 296. Could this result be influenced by the covariate temperature? I don't understand why the quantiles are not compared with the stationary models here.
These results are derived from the stationary models. At this point, the covariate temperature is not yet included explicitly; thus, it does not directly influence these quantiles of the distributions. The section will be reformulated to add this information.
Page 18, line 400 This sentence is incomprehensible. The authors write that they do not make scenarios about extreme precipitation events, yet that is exactly what they are doing. Explaining that this work is on the “challenges and opportunities of stationary and non-stationary distributions” does not mean much.
Thank you for pointing out this mistake. The sentence does not add any value and will be removed. Please refer to the introduction for an updated overview of the objectives and novelties.
Page 19, line 431. “, which is why more complex versions are needed to represent strong change” = what does it mean?
Thank you for highlighting this poor phrasing. We would like to rephrase the sentence starting on page 19, line 430 as follows: "Neither of the chosen non-stationary distributions is able to fully capture the effect of climate change on extreme precipitation. It is likely that more degrees of freedom in the non-stationary parameters of the distributions are necessary to improve their ability to capture the effect of climate change."
Again, we greatly appreciate all your comments. Hopefully, we were able to address your concerns by incorporating your suggestions and restating the aim of the study.
Kind regards
Marc Lennartz, Benjamin Poschlod, and Bruno Merz
Citation: https://doi.org/10.5194/egusphere-2025-6419-AC1
RC2: 'Comment on egusphere-2025-6419', Anonymous Referee #2, 02 Feb 2026
The paper presents an interesting assessment of how using the simplified MEV distribution can capture variations in return level values for short-duration precipitation in contexts of climate change with less uncertainty.
The application uses data and assumptions that limit the generality of the results, but the authors clearly highlight these limitations. Therefore, I believe the paper offers a contribution that nevertheless has its own originality. For this reason, I believe that the work can be published after a minor revision.
In general, I don't believe that the application to modeling results represents a source of originality fundamental to the paper's value. In my opinion, the database to which the methodologies are applied is a tool rather than an objective, so it doesn't matter whether the application is the first, and therefore I would relax the various statements in this regard.
It seems to me that the variations in RL captured by the distributions for the different RPs are not very far from the RRMSE values for the same RPs. I would suggest, perhaps in the limitations section, mentioning this more explicitly when discussing the uncertainties of the distributions due to the assumptions.
par. 2.1 study area: maybe adding a figure with the K-G climate regions would help readability (maybe in the supplement?)
par. 3 methods, row 130 "effects introduced by climate change... etc" is an important statement that should require some references
Par. 3.2 sMEV: row 180-189 I found it difficult to interpret what is written in the paragraph. It certainly refers to the method of Marra et al., but I would suggest expanding the section to include more details on the specific application of the method.
Par. 5.1 The first comment clearly refers to GEV, but it's not immediately clear from the text. I'd suggest rewriting it, immediately citing the context to which the comment refers. It would also be helpful to include, at row 290, a reference to the figure where the median shape parameter is represented (Fig. 3?).
Citation: https://doi.org/10.5194/egusphere-2025-6419-RC2
AC2: 'Reply on RC2', Marc Lennartz, 23 Feb 2026
Dear Referee,
Thank you for your constructive review and for highlighting the potential value of this study. We appreciate your assessment that the work offers an original contribution and your recommendation for publication after minor revisions. We fully agree with all suggestions made. To add more detail on the implementation, we will go through your comments (in italics) point by point.
In general, I don't believe that the application to modeling results represents a source of originality fundamental to the paper's value. In my opinion, the database to which the methodologies are applied is a tool rather than an objective, so it doesn't matter whether the application is the first, and therefore I would relax the various statements in this regard.
We agree with your observation and appreciate the suggestion. Consequently, we will remove all claims of a "first application" to avoid overstating the results. For a revised version of the study's objectives and aims, please refer to our response to Referee #1.
It seems to me that the variations in RL captured by the distributions for the different RPs are not very far from the RRMSE values for the same RPs. I would suggest, perhaps in the limitations section, mentioning this more explicitly when discussing the uncertainties of the distributions due to the assumptions.
Thank you for the hint. Indeed, the spatial variations in return levels, even between neighbouring grid cells, are of a similar magnitude to the relative root mean square error. This is important when putting the uncertainty differences between the GEV and sMEV into context. This will be added in the limitations section.
par. 2.1 study area: maybe adding a figure with the K-G climate regions would help readability (maybe in the supplement?)
We agree that a map of the climate regions would be very helpful, especially since climatological properties greatly influence the results of our analysis. We will include a map of the Köppen-Geiger climate regions based on Beck et al. (2023, https://doi.org/10.1038/s41597-023-02549-6) in the supplementary material.
par. 3 methods, row 130 "effects introduced by climate change... etc" is an important statement that should require some references
Indeed, effects of interannual autocorrelation and temporal trends are assumed to be negligible when comparing the sensitivity to small samples between the GEV and sMEV distributions. Justifying this assumption demands further investigation. We can compare the results of the RRMSE when applied to the three time periods individually. These time periods are themselves not stationary, but they exhibit significantly less variability within each period than across periods. The results reveal the same spatial patterns as well as similar relative differences between the GEV and sMEV for sample sizes below the maximum of 30 years. These figures will be added in the supplement.
Par. 3.2 sMEV: row 180-189 I found it difficult to interpret what is written in the paragraph. It certainly refers to the method of Marra et al., but I would suggest expanding the section to include more details on the specific application of the method.
We recognize that the description of the iterative sMEV procedure was overly condensed. The method will be explained in greater detail. The loop applied in the test will be described in a more structured way, explaining every step iteratively and clearly stating the break condition.
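To indicate the level of detail we intend to add, the core of the procedure can be sketched as follows: a left-censored Weibull fit to the ordinary events and the inversion of the resulting sMEV distribution. This is a simplified regression-based stand-in for the Marra et al. estimator; all names and the synthetic data are ours.

```python
import numpy as np

def fit_weibull_censored(ordinary, censor_quantile=0.9):
    """Fit a two-parameter Weibull to the right tail of the ordinary events,
    left-censoring values below the chosen quantile. The fit is a linear
    regression in Weibull-transformed coordinates, a simple stand-in for
    the Marra et al. estimator."""
    x = np.sort(ordinary)
    p = np.arange(1, len(x) + 1) / (len(x) + 1)        # Weibull plotting positions
    keep = p >= censor_quantile                        # left-censoring threshold
    # Weibull cdf: 1 - exp(-(x/C)^w)  =>  ln(-ln(1-p)) = w*ln(x) - w*ln(C)
    y = np.log(-np.log(1.0 - p[keep]))
    slope, intercept = np.polyfit(np.log(x[keep]), y, 1)
    shape = slope
    scale = np.exp(-intercept / slope)
    return shape, scale

def smev_return_level(shape, scale, n_events, return_period):
    """Invert the sMEV cdf G(x) = F(x)^n for the requested return period."""
    p_annual = 1.0 - 1.0 / return_period
    p_event = p_annual ** (1.0 / n_events)             # per-event non-exceedance
    return scale * (-np.log(1.0 - p_event)) ** (1.0 / shape)

# Synthetic ordinary events (mm/h), ~100 events per year over 20 years
rng = np.random.default_rng(3)
ordinary = 2.0 * rng.weibull(0.8, size=2000)
w, C = fit_weibull_censored(ordinary)
print(smev_return_level(w, C, n_events=100, return_period=100))
```

The iterative procedure referred to above would then repeat such a fit for successive censoring thresholds until its break condition is met, which is the loop we will describe step by step in the revision.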
Par. 5.1 The first comment clearly refers to GEV, but it's not immediately clear from the text. I'd suggest rewriting it, immediately citing the context to which the comment refers. It would also be helpful to include, at row 290, a reference to the figure where the median shape parameter is represented (Fig. 3?).
Thank you for these suggestions for improving clarity. We will rewrite the opening of Paragraph 5.1 to explicitly state when we are referring to the GEV distribution. A reference to Figure 3 will also be added.
We appreciate all comments. Hopefully, we were able to address all your concerns.
Kind regards
Marc Lennartz, Benjamin Poschlod, and Bruno Merz
Citation: https://doi.org/10.5194/egusphere-2025-6419-AC2
This manuscript applies the GEV and SMEV statistical distributions to estimate extreme rainfall return periods using data from a high-resolution climate model that allows for convection. The article is relatively well written, but I find many sentences and sections to be very succinct and in need of further development. In terms of objectives, they are not clear to me. Is the methodological objective to compare distributions, in which case I think the method is not appropriate, or is it to compare the projections produced by the two frequency models?
I strongly recommend a major revision of this manuscript to clarify the methods used and, above all, to specify the objective of this work. If the methodological objective is to say that the SMEV method is more robust than the GEV approach, there is nothing new here, as this has been demonstrated in previous work. On the other hand, I think it would be interesting to present future projections for hourly rainfall in Germany using these two approaches. If this objective is retained, a more comprehensive literature review is needed to explain what projections are currently available in Germany on hourly extremes and how the results of this study either support them or produce new results.
Abstract: "Thus, we evaluate the performance of the sMEV and GEV distributions for hourly precipitation obtained from a convection-permitting regional climate model" => In fact, this is not really what is done here; to evaluate the performance of one model compared to another, they must be compared to an observed reference. Furthermore, talking about performance is not very clear here, as it is not a very specific objective.
Abstract: "To our knowledge, this is the first application of the sMEV distribution to time series from a convection-permitting model" => this is wrong, see these references below, among others:
Dallan, E., Marra, F., Fosser, G., Marani, M., Formetta, G., Schär, C., and Borga, M.: How well does a convection-permitting regional climate model represent the reverse orographic effect of extreme hourly precipitation?, Hydrol. Earth Syst. Sci., 27, 1133–1149, https://doi.org/10.5194/hess-27-1133-2023, 2023.
Dallan, E., Marra, F., Fosser, G., Marani, M., & Borga, M. (2024). Dynamical Factors Heavily Modulate the Future Increase of Sub‐Daily Extreme Precipitation in the Alpine‐Mediterranean Region. Earth’s Future, 12(12). https://doi.org/10.1029/2024ef005185
Vohnicky, P., Dallan, E., Marra, F., Fosser, G., & Borga, M. (2025). Future precipitation extremes: Differential changes from point to catchment scale revealed by a convection-permitting model ensemble. Journal of Hydrology, 662, 133822. https://doi.org/10.1016/j.jhydrol.2025.133822
Lompi, M., Marra, F., Deidda, R., Caporali, E., Borga, M., & Dallan, E. (2025). Non-stationary frequency analysis of long-term convection permitting simulations reveals sub-daily extreme precipitation changes in central-southern Europe. Advances in Water Resources, 205, 105071. https://doi.org/10.1016/j.advwatres.2025.105071
Generally speaking, as soon as I read an abstract "for the first time", I think to myself that the results of an article are potentially overstated.
The following are specific comments mainly related to methodology, as I believe these points need to be clarified before analyzing in detail the results.
Page 4, line 95. It is somewhat surprising to read that the scenario is not suitable because it is too pessimistic. A little more context is needed here to explain why this scenario was selected in this context.
Page 5, line 115. It is very good to cite the sources for the codes used. However, it would be interesting to specify what changes have been made and possibly produce the modified code.
Page 5, line 116 The lmoment method does not focus solely on extreme values. Above all, the method allows for a more robust estimation of the parameters. This needs to be changed here.
Page 5, line 117 There are also regional approaches that provide much more robust estimates for parameters of frequency models compared to pixel-by-pixel estimation. Even though such approaches are not used here, I think we should include this methodological warning.
Page 5, line 120. I think some comments should be added about the difficulty of working with 200-year return periods calculated from 30-year time series, particularly for highly variable precipitation extremes. I think this is a highly questionable methodological choice given the methodology and data available.
Page 5, line 123. This is my main issue with the methodology here. I don't see how repeating by bootstrap observational data over a long series will give more confidence in validating one model over another. I think the methodology here is not suitable for comparing the two distributions. To compare distributions, you need to use observational data and compare the fit to the observed data and calculate, for example, the errors between observed and simulated quantiles for the two distributions. You can also look at the confidence interval produced by the two approaches with the two distributions. In addition, as pointed out by the authors, this bootstrap approach does not take into account temporal dependence or possible trends at all. So it is difficult to see how this approach can be adapted to model comparison.
Page 6, line 147. The GEV parameters are associated with temperature as a covariate. However, there is no justification of the added value of this covariate. For example, a deviance test could be used to verify the added value of this covariate compared to a stationary model. Otherwise, the authors do not provide sufficient guarantees to verify that the principle of parsimony is satisfied.
Furthermore, it is rather difficult to understand why, in the context of trend detection, time is not used as a covariate, which is the most commonly used approach.
Page 8, line 215 I don't understand the consistency here between using a GEV with a linear dependence on temperature, an SMEV model with an exponential dependence, and then explaining in this section that ultimately a linear dependence is used.
Page 11, line 258. It is unclear how the root mean square error is calculated here. This concerns the observed data? or the simulation approach proposed in the methodology?
Page 15, line 296. Could this result be influenced by the covariate temperature? I don't understand why the quantiles are not compared with the stationary models here.
Page 18, line 400 This sentence is incomprehensible. The authors write that they do not make scenarios about extreme precipitation events, yet that is exactly what they are doing. Explaining that this work is on the “challenges and opportunities of stationary and non-stationary distributions” does not mean much.
Page 19, line 431. ", which is why more complex versions are needed to represent strong change" = what does it mean?