Using synthetic case studies to explore the spread and calibration of ensemble atmospheric dispersion forecasts
Abstract. Ensemble predictions of atmospheric dispersion that account for the meteorological uncertainties in a weather forecast are constructed by propagating the individual members of an ensemble numerical weather prediction forecast through an atmospheric dispersion model. Two event scenarios involving hypothetical atmospheric releases are considered: a near-surface radiological release from a nuclear power plant accident, and a large eruption of an Icelandic volcano releasing volcanic ash into the upper air. Simulations were run twice-daily in real time over a four month period to create a large data set of cases for this study. Performance of the ensemble predictions is measured against retrospective simulations using analysed meteorological fields. The focus of this paper is on comparing the spread of the ensemble members against forecast errors and on the calibration of probabilistic forecasts derived from the ensemble distribution.
Results show good overall performance by the dispersion ensembles in both studies, but with simulations for the upper air ash release generally performing better than those for the near-surface release of radiological material. The near-surface results demonstrate a sensitivity to the release location, with good performance in areas dominated by the synoptic-scale meteorology and generally poorer performance at some other sites where, we speculate, the global-scale meteorological ensemble used in this study has difficulty in adequately capturing the uncertainty from local and regional scale influences on the boundary layer. The ensemble tends to be under-spread, or over-confident, for the radiological case in general, especially at earlier forecast steps. The limited ensemble size of 18 members may also affect its ability to fully resolve peak values or adequately sample outlier regions. Probability forecasts of threshold exceedances show a reasonable degree of calibration, though the over-confident nature of the ensemble means that it tends to be too keen on using the extreme forecast probabilities.
Ensemble forecasts for the volcanic ash study demonstrate an appropriate degree of spread and are generally well-calibrated, particularly for ash concentration forecasts in the troposphere. The ensemble is slightly over-spread, or under-confident, within the troposphere at the first output time step T+6, thought to be attributable to a known deficiency in the ensemble perturbation scheme in use at the time of this study, but improves with probability forecasts becoming well-calibrated here by the end of the period. Conversely, an increasing tendency towards over-confident forecasts is seen in the stratosphere, which again mirrors an expectation for ensemble spread to fall away at higher altitudes in the met ensemble. Results in the volcanic ash case are also broadly similar between the three different eruption scenarios considered in the study, suggesting that good ensemble performance might apply to a wide range of eruptions with different heights and mass eruption rates.
Andrew Richard Jones et al.
Status: open (until 21 Jun 2023)
- RC1: 'Comment on egusphere-2023-628', Slawomir Potempski, 08 May 2023
Hypothetical ensemble dispersion model runs with statistical verification https://doi.org/10.5281/zenodo.4770066
Andrew Richard Jones et al.
The paper deals with the propagation of meteorological forecast uncertainty through an atmospheric dispersion model. An ensemble prediction system with 18 forecast members from MOGREPS-G has been used to drive atmospheric dispersion simulations with the NAME model, so the final output takes the form of an ensemble of atmospheric dispersion predictions. Investigating the spread and calibration of this ensemble is one of the main purposes of the work. Two main hypothetical scenarios have been investigated: low-elevation radiological releases at 12 selected sites in Europe, and 3 volcanic ash releases at high or even very high elevation. Very extensive simulations covering a period of 5 months, with two releases daily for both scenarios, have been performed. As a result, a huge set of data has been produced, giving a sound basis for statistical analysis. The setup of such an experiment is highly appreciated and can be recommended for in-depth analysis of the behaviour of any atmospheric dispersion ensemble system, in particular those used operationally. The final aim should be to estimate the uncertainty of atmospheric dispersion modelling under various meteorological conditions. In this respect, a comparison with other models and with real measurements will also be necessary at some stage, but proper calibration of the ensemble is one of the key factors to address first, which is why the authors concentrate in the paper on the analysis of spread and calibration. However, it would probably be worth putting the work into a somewhat broader context, so that the reader can better understand the whole process of uncertainty analysis and the complexity of the problem, all the more so since a number of works analysing various types of ensembles, from both theoretical and practical points of view, have already been published.
It should also be added that the value of such extensive calculations, which produce a large data set, is that various analyses can be performed, for example by comparing the results for different places or under different meteorological conditions.
1. One of the basic questions about the presented methodology is whether 18 members are enough to produce statistics sufficient to cover the range of possible results of interest. It seems that there are situations where this is not the case, and the authors are aware that either more ensemble members would be needed or other models could be applied. ECMWF produces a large forecast ensemble that could be used to drive atmospheric dispersion calculations, although this would be very time-consuming. The other possibility is to produce a multi-model ensemble, which usually has a larger spread than an ensemble based on one dispersion model. In fact, many articles dealing with these issues have already been published.
2. Table 1 contains the thresholds used for both scenarios. Obviously, for an operational system it would be best if these thresholds reflected criteria used operationally. For the radiological scenario, most criteria are expressed in terms of dose; however, in some countries, such as Austria, time-integrated concentration and deposition are also used. For example, some agricultural countermeasures can be implemented if the time-integrated concentration of Cs-137 exceeds 350 Bq*s/m3 or the deposition is higher than 650 Bq/m2 (for iodine I-131 the respective values are 170 Bq*s/m3 and 700 Bq/m2). The thresholds shown in Table 1 are much higher, but this is obviously an arbitrary choice of the modellers.
3. The authors use quite simple indicators (rank histogram, attribute diagram, spread-error relation), but these seem to be mostly sufficient. On the other hand, it would be convenient to see the values in tabular form (ensemble spread vs. error in the ensemble mean) to show how the results change in time. Some additional indicators could also be considered, such as a factor of 2 for the spread-error diagram.
4. The presentation of the rank maps with two colour sections is appreciated. However, the reader should be warned against too simple an interpretation of these maps. The fact that the ensemble system predicts areas where the "real plume" (i.e. the one from the analysis) is not present does not mean that the ensemble gave a bad prognosis. If the ensemble shows a low probability for such areas, that is fine; otherwise one can say that the prognosis was not very accurate. The role of the ensemble is to predict areas where the plume can, but need not necessarily, appear.
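As an illustration of the indicators named in comment 3 and the threshold probabilities in comment 2, the following is a minimal sketch of how a rank histogram, a spread-error pair, and an exceedance probability might be computed from an ensemble and a verifying value. It is not the authors' verification code: the function names, the `(members, truth)` data layout, and the synthetic Gaussian data are assumptions made only for this example, and ties between members and the verifying value (common where concentrations are zero) are not treated.

```python
import math
import random
from statistics import mean, variance


def verification_rank(members, truth):
    """Rank of the verifying value among the members, from 0 to len(members).

    Assumption: ties are ignored (a member equal to `truth` is not counted
    as lying below it). Real dispersion fields would need explicit tie handling.
    """
    return sum(1 for m in members if m < truth)


def rank_histogram(cases):
    """Count how often the verifying value falls in each rank.

    `cases` is a sequence of (members, truth) pairs with a fixed ensemble size.
    """
    cases = list(cases)
    n_members = len(cases[0][0])
    counts = [0] * (n_members + 1)
    for members, truth in cases:
        counts[verification_rank(members, truth)] += 1
    return counts


def spread_and_error(cases):
    """Return (RMS ensemble spread, RMSE of the ensemble mean) over all cases."""
    cases = list(cases)
    mean_var = mean(variance(members) for members, _ in cases)  # sample variance
    mse = mean((mean(members) - truth) ** 2 for members, truth in cases)
    return math.sqrt(mean_var), math.sqrt(mse)


def exceedance_probability(members, threshold):
    """Fraction of ensemble members strictly above `threshold`."""
    return sum(1 for m in members if m > threshold) / len(members)


if __name__ == "__main__":
    # Hypothetical synthetic data: members and verifying value drawn from the
    # same distribution, so the rank histogram should be roughly flat.
    rng = random.Random(0)
    n_members, n_cases = 18, 2000
    cases = [
        ([rng.gauss(0.0, 1.0) for _ in range(n_members)], rng.gauss(0.0, 1.0))
        for _ in range(n_cases)
    ]
    print(rank_histogram(cases))
    print(spread_and_error(cases))
```

Tabulating the two numbers returned by `spread_and_error` for each forecast step would give the kind of spread-versus-error table that comment 3 asks for.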
The main comment is a request to include mathematical formulas for the quantities used in the article, firstly to avoid any ambiguity and secondly simply for the reader's convenience. This also concerns the way in which the figures have been constructed.
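To indicate the kind of formulas meant, the block below gives common textbook definitions of the quantities discussed here. They are offered only as an assumed notation, not as the definitions actually used in the manuscript: $x_{i,k}$ denotes member $k$ of $N$ for verification case $i$ of $M$, $y_i$ the verifying (analysis-based) value, and $c$ a threshold.

```latex
\begin{align}
  \bar{x}_i &= \frac{1}{N}\sum_{k=1}^{N} x_{i,k}
    && \text{(ensemble mean)} \\
  s_i^2 &= \frac{1}{N-1}\sum_{k=1}^{N} \left(x_{i,k} - \bar{x}_i\right)^2
    && \text{(ensemble variance)} \\
  \mathrm{Spread} &= \sqrt{\frac{1}{M}\sum_{i=1}^{M} s_i^2}
    && \text{(RMS spread)} \\
  \mathrm{RMSE} &= \sqrt{\frac{1}{M}\sum_{i=1}^{M} \left(\bar{x}_i - y_i\right)^2}
    && \text{(error of the ensemble mean)} \\
  r_i &= \sum_{k=1}^{N} \mathbf{1}\!\left[x_{i,k} < y_i\right]
    && \text{(rank of the verifying value, } 0 \le r_i \le N\text{)} \\
  p_i(c) &= \frac{1}{N}\sum_{k=1}^{N} \mathbf{1}\!\left[x_{i,k} > c\right]
    && \text{(forecast probability of exceeding } c\text{)}
\end{align}
```

Under these definitions, and assuming the verifying value is statistically indistinguishable from the members, the expected relation is $\mathrm{RMSE}^2 \approx \frac{N+1}{N}\,\mathrm{Spread}^2$, so an under-spread (over-confident) ensemble shows $\mathrm{Spread}$ noticeably below $\mathrm{RMSE}$.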