the Creative Commons Attribution 4.0 License.
Precipitation-temperature scaling: current challenges and proposed methodological strategies
Abstract. Sub-daily to daily extreme precipitation intensities are expected to increase in a warming climate, consistent with the Clausius-Clapeyron (CC) relationship, which predicts a ∼7 % increase in atmospheric moisture-holding capacity per °C of warming. Many studies have benchmarked observed extreme precipitation–temperature (P–T) scaling rates against this theoretical value, finding that global averages align closely with CC, while regional and seasonal estimates often diverge substantially. Significant challenges remain, however, in accurately estimating and interpreting P–T scaling rates, particularly at point scales. In this study, we use observational data from the Upper Colorado River Basin to explore these challenges and propose methodological improvements. Specifically, we compare multiple approaches, including those using raw (non-normalized) and normalized data, to estimate P–T scaling for hourly and daily extreme precipitation. Model performance is assessed using a cross-validation framework. Our results demonstrate that normalizing data, independently for every station and each calendar month, is essential to account for spatial and temporal climatological variability. Without normalization, estimated scaling rates can be inaccurate and misleading.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3881', Anonymous Referee #1, 24 Sep 2025
- RC2: 'Comment on egusphere-2025-3881', Anonymous Referee #2, 06 Oct 2025
This paper proposes an integrated approach to improve our estimates of precipitation-dew point scaling rates. The idea relies on the seasonal normalisation of dew point (additive normalisation) and monthly maxima of precipitation at daily/hourly scales (multiplicative normalisation).
Overall, the manuscript collects a set of ideas from the literature (although sometimes with incomplete referencing) and proposes an approach to integrate those ideas. For this reason, I think the title is slightly misleading, as it tends to overgeneralise the breadth of the contribution and oversell the novelty.
Before commenting, I’d like to say that I found the manuscript difficult to follow, so some of my comments may be associated with misunderstandings. I’ll be happy to discuss them further.
I think the methodological idea has potential, but the implementation is rather convoluted and it hinges on some subjective choices not fully motivated in the text. Also, I am not convinced by the validation that, to my understanding, is based on mean deviations - while in precipitation-temperature scaling we are typically interested in extremes.
Overall, the study has potential, but I think major work is required before it can be reconsidered for publication.
I provide my specific comments below, in the order they appear in the manuscript.
- Line 7-8: here it is not yet clear what ‘normalised’ means. I suggest briefly explaining it.
- Line 20: it looks odd to have two significant digits for the 1 degree and not for the 7% (which is approximated).
- Line 20: Extreme precipitation events depend on other variables, not only moisture in the column. I suggest to slightly rephrase.
- Line 42: Technically, dew point is defined as “the temperature at which air saturates when cooled at constant pressure” (e.g., see Wikipedia or any atmospheric sciences book). It follows that dew point over a given location contains information only on the available moisture, and not on temperature. In fact, knowing the dew point and the pressure, it is not possible to calculate the temperature. For this reason, the sentence in lines 49-50 also needs to be updated (“the chosen temperature variable” should be changed to something like “the chosen variable”).
- Line 45: Marra & al 2024 do not use the binning method, please remove the reference. The reference instead can be relevant to the sentence at lines 65-67 and 67-68.
- Line 54: data normalisation has not been defined. What do you mean by that? I expect to learn it later, but it should be explained earlier.
- Line 69: you seem to use the binning method. The introduction mentions this method but does not mention that there are alternatives, notably quantile regression, which is known to provide more robust estimates (if used properly) and to require far fewer subjective choices. This becomes more relevant given that you later use an exponential model for the means and that the evaluation is done on mean values. I believe quantile regression could help you solve both these issues.
- Line 75: the introduction fails to mention the literature that investigated the impact of process heterogeneity on the emerging scaling rates (e.g., Molnar et al 2015, which is cited but for other reasons, or De Silva & al 2025 Nat Geo). Given the focus on seasonality, this is a critical aspect that needs to be addressed. For example, is the normalisation handling the same problems? Is it only an approximation of what a classification would do in a more proper manner?
- Line 108: “make skilful predictions of extreme precipitation”. It is not yet clear what you mean by “predictions”. What are you trying to predict? Is it the scaling rate for a given place and month? Is it the precipitation magnitude associated with a given probability (percentile) at a given temperature? This needs to be clear right from here, otherwise it is difficult to follow.
- Fig. 2 and the related analyses. Technically, the hook structure could be created by lack of sufficient observations to properly estimate rare percentiles (Marra & al 2024). What the normalisation does is to remove the heterogeneity. The result is that the sample is homogeneous and its portion at high temperatures becomes more populated, allowing for a better estimate of the large percentiles. Therefore, the normalisation alone is not the “cure” to the hook structure, it also needs sufficient data sample. I suggest to better state this.
- Lines 198-205: this is a lot of text to say “precipitation is stochastic”.
- Lines 209-210: this resembles some of the ideas behind Marra & al 2024
- Line 236: why is only the mean included in the normalisation? Doesn’t the variance also count? Something should be stated about the assumption behind this normalisation.
- Line 246-246: you repeatedly claim that the normalisation above “effectively timely removed the three common challenges”. This needs to be shown, for example you can compare the distributions before and after for some example cases.
- Line 251-252: “any reference…” I suggest to include this part, in some way (perhaps referring to section 3.2), much earlier in the text.
- Line 294: I don’t understand how this is possible. Perhaps the way daily and hourly maxima are defined is not sufficiently clear?
- Line 308 and Figures 6,7,8,11,12: if I understood correctly, the evaluation is done on the mean values of the bins, and not on the extremes. Is this correct? I don’t understand how this is useful for extremes, which are the target of P-T scaling applications. I think more reasoning should be provided here.
- Fig 7 and several other results/validation: why are 2 degree C bins used? What is the sensitivity of the outcomes to this choice?
- Fig 7 and 10: usually the reference value (observed in our case) is plotted in the x-axis to facilitate interpretation (model overestimation is above the 1:1 line, and underestimation below)
- Line 460-461: I agree with this consideration, but I wonder how statistically robust it can be considered. For example, the plots for -3 and +3 (Fig 11a,f) show quite a few large dots at the boundaries of the distribution. Perhaps a statistical test can help here. Perhaps the same Monte Carlo approach used here could be applied, if many more samples are generated?
- Lines 492-494: I don’t understand how this finding relates with the finding that normalised values allowed for putting different months together (e.g., fig 9). Isn’t this result suggesting that we should not mix months even with normalisation? Please explain.
Citation: https://doi.org/10.5194/egusphere-2025-3881-RC2
- RC3: 'Comment on egusphere-2025-3881', Anonymous Referee #3, 09 Oct 2025
The paper presents a relevant and interesting idea. Normalizing station-month data to reduce artefacts in precipitation-temperature scaling is sensible, and the exponential model fitted to normalized anomalies improves predictive skill in the Upper Colorado River Basin. However, several aspects of the analysis and interpretation need tightening before the conclusions can be considered robust.
The main issue is the lack of discussion on data quality control. The paper doesn’t explain how precipitation observations were checked or filtered. Since Ali et al. (2022) highlights errors from coarse measurement precision and faulty readings, this needs to be addressed directly in the data section, with a brief note on the checks used or potential uncertainties.
The temperature binning method could also be improved. Fixed temperature intervals cause uneven sampling: cooler bins dominate while warm bins remain sparse. Using bins with roughly equal numbers of data pairs would produce more balanced estimates.
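To make the equal-count suggestion concrete, here is a minimal Python sketch, not the authors' procedure. The function names, the choice of the median temperature as the bin centre, and the 99th percentile default are all assumptions made for illustration.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    if lo == hi:
        return ordered[lo]
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def equal_count_bins(pairs, n_bins, q=99.0):
    """Split (temperature, precipitation) pairs into bins of near-equal size.

    Returns one (median temperature, q-th precipitation percentile, count)
    tuple per bin, ordered from the coolest to the warmest bin.
    """
    ordered = sorted(pairs)  # tuples sort by temperature first
    n = len(ordered)
    if not 1 <= n_bins <= n:
        raise ValueError("n_bins must be between 1 and the number of pairs")
    summary = []
    for k in range(n_bins):
        # Integer arithmetic spreads any remainder across the bins.
        chunk = ordered[k * n // n_bins:(k + 1) * n // n_bins]
        temps = [t for t, _ in chunk]
        precs = [p for _, p in chunk]
        summary.append((percentile(temps, 50.0), percentile(precs, q), len(chunk)))
    return summary
```

With this layout every bin holds nearly the same number of pairs, so the high percentile in the warmest bin is estimated from as many observations as in the coolest one. The cost is that bin widths in °C are no longer uniform.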
Normalization is useful for removing spatial and seasonal effects, but it can hide genuine long-term trends. Subtracting historical station-month means risks erasing real climate signals in dew point or rainfall. The assumption of stationarity and the leave-one-year-out validation don’t fully test for this. If the data are non-stationary, the resulting scaling estimates may be biased.
From a statistical standpoint, a hierarchical model would be more robust than treating all stations equally or independently. It would allow shared information across stations while preserving local variations. Alternatively, quantile regression or a generalized additive model could capture nonlinear relationships without relying on arbitrary bins, and would better describe high-end percentiles.
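As a rough illustration of the bin-free alternative, the sketch below fits a quantile regression line to log-precipitation against temperature using only the Python standard library. It is not the method of the manuscript or of any cited paper. It relies on the known property that, for a straight-line fit, some minimiser of the pinball (check) loss passes through two data points, so an exhaustive search over pairs is exact for small samples. The function names and the default quantile of 0.95 are assumptions; a real analysis would use a dedicated solver.

```python
import math
from itertools import combinations


def pinball_loss(residuals, tau):
    """Check loss that quantile regression minimises for quantile tau."""
    return sum(tau * r if r >= 0 else (tau - 1.0) * r for r in residuals)


def quantile_line(x, y, tau):
    """Exact tau-quantile regression line y = a + b*x for small samples.

    Tests every line through two data points and keeps the one with the
    smallest pinball loss. The cost grows as O(n^3), so this is a teaching
    sketch only.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, slope undefined
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = pinball_loss([yk - (a + b * xk) for xk, yk in zip(x, y)], tau)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def scaling_rate_percent(temps, precip, tau=0.95):
    """Scaling rate in % per degree from a log-linear quantile fit."""
    _, slope = quantile_line(temps, [math.log(p) for p in precip], tau)
    return 100.0 * math.expm1(slope)
```

Because the fit is done on log-precipitation, the slope is converted to a percentage with exp(slope) − 1 rather than being read off directly.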
The fitted exponential coefficient also needs clearer interpretation. The slope parameter b is treated as “% per °C,” but the correct fractional rate is exp(b) − 1. Using b directly can slightly misstate the scaling rate.
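The size of this discrepancy is easy to check. The sketch below assumes a model of the form P = a·exp(b·T), which is an assumption about the manuscript's parameterisation, and the function names are illustrative.

```python
import math


def percent_per_degree(b):
    """Percentage change in P per 1 degree of warming when P = a * exp(b * T)."""
    return 100.0 * math.expm1(b)  # expm1(b) = exp(b) - 1


def slope_for_percent(rate_percent):
    """Inverse: the slope b that corresponds to a given % per degree."""
    return math.log1p(rate_percent / 100.0)  # log1p(r) = ln(1 + r)
```

For b = 0.07 the implied rate is about 7.25 % per °C rather than 7 %, and a rate of exactly 7 % per °C corresponds to b ≈ 0.0677. The gap widens for larger slopes, which matters when super-CC rates are reported.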
Equation 7 divides precipitation by its mean for each station-month, but many of these means are very small or zero, inflating anomalies and adding noise. Although this is mentioned briefly, its effect isn’t explored or corrected.
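A small sketch of the multiplicative normalisation shows how near-zero means can be screened. Equation 7 is not reproduced in this discussion, so the record layout `(station, month, value)`, the function names and the `min_mean` threshold are all hypothetical choices for illustration, not the authors' implementation.

```python
from collections import defaultdict


def station_month_means(records):
    """Mean value per (station, month); records are (station, month, value)."""
    totals = defaultdict(lambda: [0.0, 0])
    for station, month, value in records:
        totals[(station, month)][0] += value
        totals[(station, month)][1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


def normalise_precip(records, min_mean=0.1):
    """Divide each value by its station-month mean.

    Groups whose mean falls below min_mean (a hypothetical threshold, in the
    same units as the data) are returned separately instead of being divided,
    so that tiny denominators cannot inflate the ratios.
    """
    means = station_month_means(records)
    ratios, skipped = [], []
    for station, month, value in records:
        mean = means[(station, month)]
        if mean < min_mean:
            skipped.append((station, month, value))
        else:
            ratios.append((station, month, value / mean))
    return ratios, skipped
```

Without the guard, a station-month with values 0.0 and 0.02 has a mean of 0.01 and yields a ratio of 2 for a negligible amount of rain. Reporting how many station-months fall below the threshold would show how much of the noise comes from this effect.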
Using monthly-mean dew point as a predictor helps correlation but weakens the physical link to rainfall extremes, which depend on short-term moisture and dynamics such as CAPE or large-scale ascent. Higher-frequency predictors would strengthen the physical interpretation.
The assumption that station-month maxima are statistically independent isn’t fully demonstrated. Dependence across years or from climate modes like ENSO could still exist. Block-bootstrap or similar resampling methods would provide more realistic uncertainty estimates.
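One way to implement the resampling suggested above is a moving-block bootstrap over the yearly sequence, sketched below with the Python standard library. The block length, the number of resamples, the seed and the function names are assumptions for illustration, and the block length in particular would need to be justified from the observed dependence.

```python
import random


def moving_block_bootstrap(series, block_len, n_resamples, seed=0):
    """Resample a series by concatenating randomly chosen contiguous blocks.

    Keeping blocks intact preserves short-range dependence (for example
    between consecutive years) that an ordinary bootstrap would destroy.
    """
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")
    rng = random.Random(seed)
    n_starts = n - block_len + 1
    resamples = []
    for _ in range(n_resamples):
        sample = []
        while len(sample) < n:
            start = rng.randrange(n_starts)
            sample.extend(series[start:start + block_len])
        resamples.append(sample[:n])  # trim the last block to the original length
    return resamples


def percentile_interval(estimates, lower=2.5, upper=97.5):
    """Simple percentile interval from a list of bootstrap estimates."""
    ordered = sorted(estimates)

    def pick(q):
        return ordered[min(len(ordered) - 1, int(q / 100.0 * len(ordered)))]

    return pick(lower), pick(upper)
```

In use, the scaling rate would be re-estimated on each resample and the spread of those estimates summarised with `percentile_interval`. If dependence is present, the resulting intervals are typically wider than those from resampling individual years independently.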
Citation: https://doi.org/10.5194/egusphere-2025-3881-RC3
Let me start by stating the manuscript is well written and I believe worthy of publication. However, I am concerned that the novelty has been misrepresented. The main issue here is the authors provide 3 shortcomings they have overcome building on Zhang et al (2017). I do not agree with this statement as I believe each individual shortcoming has been addressed in previous literature. Here the novelty lies in combining the approaches from three different manuscripts. I believe this combining of methods is a worthy contribution and an important one. I have two major comments which I hope the authors can address.
Major comment 1.
I strongly believe that data for P-T scaling should not be pooled without standardising, and I agree with the authors on this, but it was already demonstrated in Visser et al (2022) and Molnar et al (2015). The authors have cited these papers in their manuscript, and by their own admission at line 379 their work bears a strong resemblance to Visser et al (2022) and Molnar et al (2015). So why not make the point in the introduction that they build on these authors’ work?
Further, Figure 2: A standardisation was already proposed in Visser et al (2022) and I quote in their introduction “We introduce standardized pooling…”. This should be acknowledged here.
Figure 3 and 4: The issues that bins at the extremes contain fewer events, and that some binning techniques don’t consider independence, were both raised in Wasko and Sharma (2014), which is why quantile regression using independent events was proposed. This point also relates to Line 406. This should be acknowledged here.
Figure 5: The use of a monthly (or seasonal) temperature was proposed by Zhang et al (2017). This should be acknowledged here.
In sum, while the justification of the proposed methodology presented in this manuscript is much more elaborate than in previous manuscripts (and hence I am a proponent of it being published), I believe the framing needs to change to pay due respect to the previous research. The method proposed here is more a combination of the methods proposed by Visser et al (2022), Wasko and Sharma (2014), and Zhang et al (2017), and the introduction and conclusion should be restructured accordingly.
Major comment 2.
It is odd that the authors choose 7% per degree as their truth when calculating the skill when, by their own admission in Figure 12, the scaling is not aligned with CC. This should be addressed in some way, with at least more focus on the actual scaling rates. The reason is that these are empirical relationships, without a “truth”.
Minor comments:
Title: The title suggests a review. Given that some of the current challenges have already been resolved, the title could be amended.
Line 1-2: Does sub-daily rainfall scale at 7%? There is now much review/meta-analysis work showing it is likely higher? e.g. Fowler et al (2021); Wasko et al (2024). The IPCC reports also point to higher than 7% scaling for sub-daily rainfall.
Line 60 onwards: The point of pooling resulting in “incorrect” scaling was well made in Molnar et al (2015) and has been made in papers by Berg and Haerter – making the point that a lot of this has to do with different storm types, but this was never mentioned here?
Line 113: “incorrect” is a strong word when we don’t know the truth; scalings are correlations, and they are all true in some way regardless of the method.
Figure 8 nicely shows that the pooling of standardized data works, but Figure 8d also shows that monthly data can be safely pooled after standardization with similar performance (Column 1 vs Column 4). Could this point be made in the text?
References:
Fowler, H.J., Lenderink, G., Prein, A.F., Westra, S., Allan, R.P., Ban, N., Barbero, R., Berg, P., Blenkinsop, S., Do, H.X., Guerreiro, S., Haerter, J.O., Kendon, E.J., Lewis, E., Schaer, C., Sharma, A., Villarini, G., Wasko, C., Zhang, X., 2021. Anthropogenic intensification of short-duration rainfall extremes. Nature Reviews Earth & Environment 2, 107–122. https://doi.org/10.1038/s43017-020-00128-6
Molnar, P., Fatichi, S., Gaál, L., Szolgay, J., Burlando, P., 2015. Storm type effects on super Clausius–Clapeyron scaling of intense rainstorm properties with air temperature. Hydrology and Earth System Sciences 19, 1753–1766. https://doi.org/10.5194/hess-19-1753-2015
Visser, J.B., Wasko, C., Sharma, A., Nathan, R., 2021. Eliminating the “hook” in Precipitation-Temperature Scaling. Journal of Climate 34, 9535–9549. https://doi.org/10.1175/JCLI-D-21-0292.1
Wasko, C., Sharma, A., 2014. Quantile regression for investigating scaling of extreme precipitation with temperature. Water Resources Research 50, 3608–3614. https://doi.org/10.1002/2013WR015194
Wasko, C., Westra, S., Nathan, R., Pepler, A., Raupach, T.H., Dowdy, A., Johnson, F., Ho, M., McInnes, K.L., Jakob, D., Evans, J., Villarini, G., Fowler, H.J., 2024. A systematic review of climate change science relevant to Australian design flood estimation. Hydrology and Earth System Sciences 28, 1251–1285. https://doi.org/10.5194/hess-28-1251-2024
Zhang, X., Zwiers, F.W., Li, G., Wan, H., Cannon, A.J., 2017. Complexity in estimating past and future extreme short-duration rainfall. Nature Geoscience 10, 255–259. https://doi.org/10.1038/ngeo2911