Technical note: Temperature dependence of precipitation tail heaviness in the TENAX model
Abstract. Climate change is causing the magnitudes of extreme sub-daily precipitation events to increase. The ability to predict changes to these precipitation extremes is crucial for disaster preparedness. The TENAX model was proposed to predict return levels of sub-daily extreme precipitation under climate change based on the projected temperature shifts. It combines a Weibull distribution with an exponential temperature dependence in the scale parameter, accounting for the Clausius–Clapeyron relation, with an explicit representation of the temperatures during precipitation events. The Weibull distribution's shape parameter could also have a temperature dependence, which would mean that the tail heaviness changes with temperature. This implies that the rarest events may increase at faster rates. However, implementing this dependence increases the number of parameters to be estimated, affecting the model's accuracy. Here, we use hourly data from thousands of rain gauges in Germany, Japan, the UK, and the USA to assess the dependence of the Weibull shape parameter on temperature, exploring how it should be implemented in the TENAX model. We find that there is a significant dependence in many stations and that the magnitude and sign of the dependence have regional patterns. In the majority of stations, the sign is negative, implying that rarer events intensify with temperature at a higher rate. However, Monte Carlo simulations show that including this dependence without careful consideration may lead to overestimation of precipitation return levels and increase the model uncertainty. The dependence should therefore be introduced with caution, in the context of surrounding stations.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences
The technical note "Temperature dependence of precipitation tail heaviness in the TENAX model" by Thomas et al. investigates the possibility of including a varying shape parameter in the TENAX model by Marra et al. (2024). The note examines potential consequences of including the varying shape and probes some possible modelling choices. It is quite interesting and explores a very relevant topic in the modelling of extreme precipitation: changes in the shape of the distribution of natural hazards might result in dramatic changes in risk levels (indeed, I think a recent interesting exploration of this topic should at least be mentioned by the authors: https://doi.org/10.1029/2023WR036426). The note is a bit crude in some of the analyses; it could go deeper in some explorations and present more in-depth statistical analysis, but I imagine this is why it was submitted as a technical note and not as a research article. It is clearly an initial step towards a larger change in the TENAX approach, which I think will be an interesting development. Overall, I think the authors did a very good job of presenting the material: the aim is clear, the analyses are to the point, and the methods are relatively well presented. That said, while reading the manuscript I noted down some comments, which I hope the authors will consider to improve the presentation of the material.
Major comments:
- "that is, how much larger extremes are with respect to the average." I don't love this definition, since it sort of defines the tail itself rather than the heaviness (which, especially for the Weibull, is often defined in relation to the exponential tail). While this is not a major comment, I do think the authors could help the reader better understand what changes in the tail/shape parameter might entail by providing more intuition and explanation of the meaning of the shape parameter in the Weibull distribution, especially when it is allowed to vary. Maybe some pictures of a fitted magnitude model would help in appreciating the implications of an increase or decrease of the shape parameter?
- Line 90 (and throughout the manuscript): the shape parameter is assumed to be positive; I imagine this is why others have enforced this assumption by using the exponential transformation. When using a linear model, do you do anything to enforce a positive shape? Could the linear model not be problematic if one wished to use the fitted model with, for example, climate projections, in which one might need to evaluate the model over a range of T values such that κ(T) < 0?
- I found the explanation on lines 118-119 quite hard to follow. Do you generate five times the number of events at each station (if so, why?) or five times the number of stations (you have one set of parameters per station and you calculate the parameters five times?)? Also, I think a better wording would be that you "approximate stochastically the sampling distribution of the parameters"; I found "stochastically calculate" quite unclear (you estimate, no?). Indeed, line 120: "The distribution of these parameter" *estimates*
- Line 145: while I think it is OK to simply show the at-site information on the detection of the trend at this stage, I think the authors should acknowledge the issues linked to multiple hypothesis testing and field significance (see for example https://doi.org/10.1029/2021WR030172 or https://doi.org/10.1029/2007WR006268, but the topic is very present in the literature on trend detection). Also, I have found it quite useful for this type of trend detection to see the histogram of the test statistics derived for each station: how far is it from the standard Gaussian distribution (a cruder way of assessing the overall signal)?
- Line 173: it is not entirely clear to me why the likelihood of outliers appearing should increase. The total number of "outliers" might increase, but why the likelihood? Also, I find the term outlier not ideal: these are situations generated by the controlled data-generating process, they are simply part of the tail of the distribution, not really outliers. Of course, they complicate the story, but they are part of the possible outcomes. Also, I don't fully understand/agree with the sentence "The increase in outliers shows that the number of events at a station affects the uncertainty in estimating the parameters." First of all, I don't fully understand how this is related to the number of events at a station. More importantly, the fact that larger sample sizes result in more precise estimation is surely not newsworthy, just like the fact that maximum likelihood is not robust (to outliers, but I think you mean something else here).
- Line 194: is the test used to carry out this check the LRT mentioned in line 127? Make this more explicit if so (and actually, more details in Section 4 could be beneficial to appreciate how the testing is done). I admit I did not fully understand Figure 4 and the discussion around it. It is not clear to me what the "significantly different in one case only" category means: are you testing all the parameters? What is the one case? I think the discussion of the figure could be improved.
- Line 209: I am not surprised to hear that the estimation of the temperature-varying parameters is challenging and that the estimated functions tend to compensate each other. Could one think of models in which the change in shape and scale is somehow interlinked (so estimating a unique slope, modulo some constant)?
- Line 274: I think the idea of a regional trend is a good one; there is plenty of literature on this for similar applications (https://doi.org/10.1016/j.advwatres.2021.103852 or https://doi.org/10.1029/2005WR004591 for example).
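The first two major comments (what the Weibull shape parameter means for tail heaviness, and whether a linear κ(T) can become non-positive) can be made concrete with a minimal sketch. It assumes the Weibull parameterisation F(x) = 1 - exp(-(x/λ)^κ) and a linear shape κ(T) = κ0 + b·T. The numerical values of `KAPPA0` and `B`, the function names, and the particular exponential link used for comparison are illustrative assumptions, not taken from the manuscript.

```python
import numpy as np


def weibull_quantile(p, scale, shape):
    """Quantile of a Weibull with CDF F(x) = 1 - exp(-(x / scale) ** shape)."""
    return scale * (-np.log1p(-p)) ** (1.0 / shape)


def tail_ratio(shape, p_rare=0.999, p_common=0.5):
    """Rare quantile divided by the median; the scale parameter cancels out."""
    return weibull_quantile(p_rare, 1.0, shape) / weibull_quantile(p_common, 1.0, shape)


def linear_shape(T, kappa0, b):
    """Linear dependence kappa(T) = kappa0 + b * T (can become negative)."""
    return kappa0 + b * np.asarray(T, dtype=float)


def exp_shape(T, kappa0, b):
    """Exponential link with the same value and slope at T = 0; always positive."""
    return kappa0 * np.exp(b * np.asarray(T, dtype=float) / kappa0)


# Hypothetical parameter values, chosen only for illustration.
KAPPA0, B = 0.8, -0.02  # shape at 0 degC and its slope per degC

# Smaller shape -> heavier tail: the rare quantile grows relative to the median.
for shape in (1.0, 0.8, 0.6):
    print(f"shape={shape:.1f}: q(0.999)/median = {tail_ratio(shape):.1f}")

# Temperature at which the linear shape reaches zero.
T_zero = -KAPPA0 / B
print(f"linear kappa(T) reaches zero at T = {T_zero:.0f} degC")
for T in (0.0, 20.0, 45.0):
    print(T, float(linear_shape(T, KAPPA0, B)), float(exp_shape(T, KAPPA0, B)))
```

With a negative slope b, the shape decreases with temperature, so the ratio between rare and typical intensities grows as it gets warmer. The linear form reaches zero at T = -κ0/b, which is why evaluating a fitted model at projected temperatures beyond the observed range needs a check; an exponential link avoids the sign change but alters how the shape behaves away from T = 0.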
Some other minor points:
- Line 20, when mentioning the event in Italy: any references? You provide refs for the other events.
- Line 98: a Monte Carlo approximation with $N = 2·10^5$ iterations is used *to approximate it* (or something to make the sentence clearer)
- Line 109: ERA5-Land cells are indeed cells, so their coordinates are representative of the cell. I don't remember (I always need to double check) whether the cell coordinates are the centre of the cell or one of the corners, but I am a bit puzzled by the sentence.
- Line 123: make clear that what is assumed to be time-invariant is the dependence on T, and that this implies that the physical relationship between temperature and precipitation is constant. It would be possible to create a TENAX model in which the parameters change with time, so the choice of investigating whether the parameter estimates change in time as a check for the validity of the model is something you should motivate (I agree that it is a sensible check). One more small point on this: could some of the differences in the parameter estimates between the early and later periods also be due to differences in the distribution g(T) between the two periods?
- Line 137 (and Figure 5): if you want to call the quantity you compute Bias, it should at least be Relative bias (Bias would be estimate - true). Honestly, I would simply call it the ratio of estimated to true return levels, since you also discuss the variability of the estimates when discussing the results, so you are really interested in the bias/variance trade-off. Also in the Figure: I would maybe delete the long y-axis label from panels b and c (and only leave it on panel a) to keep the left sides a bit less cluttered. Also, in the discussion of Figure 5 I think you should very clearly say that panel a shows the risk of ignoring trends in the shape, in terms of underestimation. Given that you mostly found negative trends in the countries you have studied, this is a very consequential fact.
- Line 142: just write p-value, you can use the extra 4 characters!
- In Figure 2: if I understand correctly, there are no boxplots to show for the lower panels of the right columns. Having those little bars made me wonder if I was looking at very precise estimates: I would leave them blank or put a cross (or a different symbol). Also, what are the black lines in the plot? Make this explicit in the caption or legend.
- Line 167: I guess it is not very surprising to see that all the variation cannot be explained by sampling errors; surely stations are affected by the same climate and are not independent of each other. There is a plethora of studies on this, and the behaviour shown in the Figure is to be expected: this fact is the reason behind the existence of spatial interpolation and statistics.
- Line 188: b is always included in the model, the point is that it is included *as an unknown parameter which needs to be estimated*
- Line 249: "We conclude" -> this is maybe a bit strong, you don't give very much evidence for this. "This seems to indicate that ..."
- Line 295: I imagine one could extend the recent work on Bayesian spatial models to this type of model (https://doi.org/10.1007/s13253-025-00719-0)
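The terminology point raised for Line 137 (bias versus relative bias versus a ratio of estimated to true return levels) can be sketched as follows. This is a minimal illustration under assumed inputs: the function name `summarize_return_level_estimates`, the returned keys, and the example numbers are hypothetical and are not the manuscript's definitions.

```python
import numpy as np


def summarize_return_level_estimates(estimates, true_value):
    """Summarise Monte Carlo return-level estimates against a known true value."""
    estimates = np.asarray(estimates, dtype=float)
    ratios = estimates / true_value            # estimated / true, 1.0 means unbiased
    bias = estimates.mean() - true_value       # bias in the units of the return level
    q25, q75 = np.percentile(ratios, [25, 75])
    return {
        "bias": bias,
        "relative_bias": bias / true_value,    # dimensionless, 0.0 means unbiased
        "median_ratio": float(np.median(ratios)),
        "ratio_iqr": float(q75 - q25),         # spread of the ratio (variance side)
        "rmse": float(np.sqrt(np.mean((estimates - true_value) ** 2))),
    }


# Made-up estimates of a return level whose true value is 100 (arbitrary units).
summary = summarize_return_level_estimates([90.0, 100.0, 110.0, 120.0], 100.0)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting the spread of the ratio alongside its centre keeps both sides of the bias/variance trade-off visible, which is the reason given above for preferring the "ratio" wording.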