the Creative Commons Attribution 4.0 License.
Developing and evaluating a Bayesian weather generator for UK precipitation conditioned on discrete storm types
Abstract. Weather generators (WGs) are important tools for downscaling General Circulation Model (GCM) output for climate impact modelling. This study introduces a precipitation WG conditioned on a recent storm types dataset and outlines a methodology for evaluating WGs using proper scoring rules. The storm types are a set of discrete weather types that use atmospheric variables to categorise ERA5 grid cells into fronts, cyclones and thunderstorms or combinations of these. The WG is a Bayesian Generalised Linear Model (GLM) with vertical velocity and humidity as covariates, conditioned on the storm types, and trained on 6 hourly precipitation accumulations. This approach contrasts with previous WGs based on weather types, which use clustering methods such as k-means to generate weather types. A Bayesian model framework is used, instead of typical maximum likelihood approaches, and an informed prior choice is made on observable quantities using the prior predictive distribution. The WG is assessed using proper scoring rules and Diebold-Mariano (DM) significance tests. The use of a DM test to assess the statistical significance of average proper score differences is a key addition to typical WG evaluation approaches, as it helps model developers avoid changes that improve an average score by chance. Calibration is assessed using the probability integral transform histogram and by comparing draws from the posterior predictive distribution to observations. Compared to the same WG not conditioned on storm types, the inclusion of storm types improved the average Continuous Ranked Probability Score (CRPS) by a statistically significant amount across the stations considered. When storm types are used as an alternative to continuous atmospheric variables, they provide 33 % of the improvement in average CRPS that the atmospheric variables do, averaged over the stations. To quantify the WG's ability to represent extremes the threshold weighted CRPS (twCRPS) is explored. For three different thresholds the twCRPS corroborates the results for the CRPS. The use of proper scoring rules in conjunction with a DM test is highlighted as a powerful tool for assessing WG skill.
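For readers unfamiliar with the twCRPS, the following is a minimal sketch of one common sample-based formulation, not the authors' implementation. It assumes an indicator weight on amounts above a threshold, which amounts to applying the chaining function v(z) = max(z, threshold) to both the draws and the observation before computing the usual ensemble CRPS. The function names and the example numbers are illustrative only.

```python
def crps_ensemble(draws, obs):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    m = len(draws)
    term_obs = sum(abs(x - obs) for x in draws) / m
    term_spread = sum(abs(a - b) for a in draws for b in draws) / (m * m)
    return term_obs - 0.5 * term_spread


def tw_crps_ensemble(draws, obs, threshold):
    """Threshold-weighted CRPS with weight 1{z >= threshold}.

    Assumption: the weight is an indicator above `threshold`, so the score
    is the ordinary CRPS after censoring every value below the threshold
    up to the threshold itself.
    """
    censored_draws = [max(x, threshold) for x in draws]
    censored_obs = max(obs, threshold)
    return crps_ensemble(censored_draws, censored_obs)


# Illustrative values only (not data from the paper).
draws = [1.0, 3.0]
print(crps_ensemble(draws, 2.0))          # unweighted score
print(tw_crps_ensemble(draws, 2.0, 2.5))  # only behaviour above 2.5 counts
```

With a threshold below every draw and the observation, the weighted score reduces to the ordinary CRPS; with a threshold above all of them, it is zero because nothing in the region of interest distinguishes forecast from observation.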
Status: open (until 13 Nov 2025)
- RC1: 'Potentially interesting, but doesn't read like a HESS paper and doesn't make a strong enough case at present', Richard Chandler, 08 Oct 2025
- RC2: 'Comment on egusphere-2025-4110', Anonymous Referee #2, 09 Oct 2025
The paper entitled “Developing and evaluating a Bayesian weather generator for UK precipitation conditioned on discrete storm types” by Paul Bell and co-authors presents a 6h-resolution single-site precipitation generator and evaluates the performance of the model for several locations in the UK. The three novelties claimed by the authors are (1) the use of easy to interpret storm types to define the weather types used in the precipitation generator, (2) the use of a Bayesian framework, and (3) the assessment of the precipitation generator using scoring rules.
In my opinion the topic of precipitation generators fits within the scope of HESS, and the paper is well written. However, I was not able to understand the purpose for which the authors are developing this new precipitation generator (i.e., what are the expected applications?), nor to fully grasp the novelty of their approach. As acknowledged by the authors, all the building blocks of the proposed framework already exist, and the novelty probably comes from putting them together. But in that case the new model must be able to do something that no existing model can do, which does not seem to be the case here. Indeed, the results of the case study are not very good and do not allow one to conclude that the proposed generator outperforms existing models. On balance, I don’t think this manuscript could be published in its present form.
I summarize my main concerns below:
1) Lack of motivation for the model.
The main application mentioned in the introduction is stochastic downscaling of GCM projections. However, the proposed model has very limited skill, which will most likely prevent the use of the downscaled precipitation in impact studies. Indeed, the proposed generator can only deal with precipitation (no other weather variable), it is single-site, and it does not include temporal correlation/persistence. I therefore believe that many existing precipitation generators can do the same, and even more. To justify their model the authors should demonstrate that in the present setting the proposed model performs better for (at least) one application of GCM downscaling, i.e. it better reproduces a set of statistics that matter for a specific application.
2) Model performance is poor and only partially assessed
The manuscript describes an interesting framework to compare precipitation generators, but the assessment of model performance in absolute terms (i.e., evaluating how the simulated precipitation statistics compare to the observed ones) is very sparse. Only Fig 9 and 10 go in this direction and the results shown in these figures are disappointing. In my opinion the model is not able to properly simulate the distribution of precipitation amount, which is concerning. The authors mention that it may be due to the use of a Gamma distribution, but usually this distribution performs reasonably well for moderate intensities and problems mostly occur in the tail of the distribution. Here, however, even the moderate amounts are biased, which may point towards a more structural problem than just the choice of the Gamma distribution to model precipitation amounts. And if the authors think that the choice of the Gamma distribution is the problem, they must propose an alternative that properly models the data instead of ignoring the problem.
In addition, figures 9 and 10 must be complemented by figures assessing the distribution of precipitation occurrence, persistence, and also extreme amounts. About extremes: the current framework compares different “flavors” of the same model with regard to extreme precipitation simulation (using the twCRPS), but does not evaluate how well these models simulate intense precipitation. This could be done using quantile-quantile plots (focusing on high quantiles) or maybe using some tail distribution index.
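As a rough illustration of the high-quantile comparison suggested above, the sketch below tabulates observed against simulated empirical quantiles at a few upper-tail probability levels, which are the points a quantile-quantile plot of the tail would show. The probability levels, function names and the linear-interpolation quantile rule are assumptions made for the example, not part of the manuscript.

```python
import math


def empirical_quantile(values, p):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lower = int(math.floor(pos))
    upper = min(lower + 1, len(xs) - 1)
    frac = pos - lower
    return xs[lower] * (1 - frac) + xs[upper] * frac


def upper_tail_qq(observed, simulated, probs=(0.90, 0.95, 0.99)):
    """Return (p, observed quantile, simulated quantile) for each level p.

    Points far from the 1:1 line indicate that the generator misrepresents
    intense precipitation at that probability level.
    """
    return [
        (p, empirical_quantile(observed, p), empirical_quantile(simulated, p))
        for p in probs
    ]


# Toy series only: the "simulated" amounts are deliberately twice too large.
observed = [float(x) for x in range(101)]
simulated = [2.0 * x for x in range(101)]
for p, q_obs, q_sim in upper_tail_qq(observed, simulated):
    print(f"p={p:.2f}  observed={q_obs:.1f}  simulated={q_sim:.1f}")
```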
3) The use of scoring rules to evaluate this model (which is not designed for forecasting) is questionable
I find the framework proposed to compare probabilistic forecasts interesting, and I like the idea of using the Diebold-Mariano test to evaluate the significance of CRPS. But the proposed model is not really suited to perform forecasting, but rather for stochastic simulation. Indeed the model does not include temporal correlation or persistence, so using it for forecasting seems surprising to me.
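To make the comparison under discussion concrete, here is a minimal sketch of a Diebold-Mariano test applied to two series of per-time-step scores (for example CRPS values from two model variants). It is an illustration under stated assumptions rather than the authors' procedure: the long-run variance uses a simple truncated autocovariance sum with a user-chosen `max_lag`, the statistic is referred to a standard normal distribution, and all names and numbers are hypothetical.

```python
import math
from statistics import mean


def diebold_mariano(scores_a, scores_b, max_lag=0):
    """DM statistic and two-sided normal p-value for the mean score difference.

    A negative statistic means model A has the lower (better) average score.
    Assumption: long-run variance = autocov(0) + 2 * sum of autocovariances
    up to `max_lag` (rectangular kernel); max_lag=0 ignores serial correlation.
    """
    d = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(d)
    d_bar = mean(d)

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, max_lag + 1))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    z = d_bar / math.sqrt(long_run_var / n)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided N(0, 1) tail
    return z, p_value


# Toy scores only (not results from the manuscript).
scores_with_types = [1.0, 2.0, 1.0, 2.0]
scores_without_types = [2.0, 2.0, 3.0, 3.0]
z, p = diebold_mariano(scores_with_types, scores_without_types)
print(f"DM statistic = {z:.3f}, p-value = {p:.4f}")
```

Because precipitation scores at 6 hourly resolution are likely to be serially correlated, a non-zero `max_lag` would usually be the more defensible choice in practice.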
4) The use of “physically defined” storm types as weather types is interesting, but unfortunately it is not compared with other ways to define weather types
The idea of using easy-to-interpret/physically-defined storm types as weather types is novel and interesting. Unfortunately the impact of such storm types on the performance of a weather-type precipitation generator is not compared to alternative approaches. For instance, I would like to see whether the same precipitation generator built on weather types defined by k-means clustering performs better or worse than the one using storm types as weather types.
5) The choice of a Bayesian approach is not sufficiently justified
Using a Bayesian perspective is potentially interesting but also largely complicates the modeling. So I think it should be carefully motivated and evaluated. Does the Bayesian approach allow a better interpretation of the values of the parameters (by providing uncertainty ranges)? Or does it provide better simulations with more realistic precipitation variability? The current manuscript does not answer these questions.
Citation: https://doi.org/10.5194/egusphere-2025-4110-RC2
Interactive computing environment
Companion Rmarkdown to "Developing and evaluating a Bayesian weather generator for UK precipitation conditioned on discrete storm types" Paul Bell https://doi.org/10.5281/zenodo.16795621
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 2,104 | 45 | 13 | 2,162 | 24 | 18 | 18 |