the Creative Commons Attribution 4.0 License.
Forecasting European temperature-related mortality in Summer 2024: data-driven vs physics-based forecast approaches
Abstract. Heat has emerged as a major public health concern. Over 62,000 heat-related deaths were estimated to have occurred during the European summer of 2024, exemplifying the pressing need to develop effective early warning systems. Such systems depend critically on the quality of the underlying forecasts, and recent work has focused on developing impact-based forecasts for heat-related mortality, which provide impact-oriented information. To date, heat-related mortality forecasts have been based on the output of numerical weather prediction models, or physics-based forecasts. The field of weather forecasting is undergoing a rapid transformation with the advent of skillful data-driven forecasts. This study compares European temperature-related mortality forecasts for summer 2024 based on physics-based weather forecasts with those based on data-driven weather forecasts. Our results highlight that both the physics-based and data-driven forecasts systematically underestimate temperature-related mortality, more pronouncedly so in the latter. Both types of forecasts appear sensitive to forecast errors at hot temperatures, due to the non-linear relationship between temperature and mortality. Nevertheless, temperature-related mortality forecasts based on data-driven weather forecasts appear to be a promising alternative to traditional physics-based weather forecasts, and targeted improvement of the representation of hot temperatures through bias correction or adjustment of the loss function to give greater weighting to hot temperatures would be beneficial for temperature-related mortality forecasting. We suggest the application of this approach to both data-driven and physics-based forecast ensembles as an important next step in the continued development of informative, impact-oriented forecasts.
Status: open (until 20 May 2026)
- RC1: 'Comment on egusphere-2026-1144', Anonymous Referee #1, 13 Mar 2026
- RC2: 'Comment on egusphere-2026-1144', Anonymous Referee #2, 28 Apr 2026
The paper addresses an important current issue regarding the applicability of AI-generated weather forecasts for impact-oriented predictions of mortality.
The analysis is presented in a clear and structured manner throughout, making it easy to follow the line of reasoning.
Overall, the authors achieve the paper’s objectives with their presentation.
Nevertheless, I recommend making a few minor adjustments to the technical and visual presentation, primarily to facilitate the interpretation of the results.
- Both the title and the abstract give the impression of a systematic comparison of data-driven and physics-based forecasts. However, the paper compares only two deterministic models for one summer, so the generalisability of the findings is not obvious. In the interest of transparency, it would therefore be advisable to address this limitation already in the abstract.
- The authors may add their reasons for the choice of the two particular models.
- It might be helpful for the reader if the ordering in section 2.1 were changed so that the models used are mentioned before their specifics are explained.
- The population data used for the weighting are not mentioned. This should be added to increase transparency. Additionally, a map or a list of the cities used would help the reader to assess the geographical distribution of the data.
- Since several results are visually on the edge of significance, a more thorough use of statistical methods could be helpful for the interpretation of the stated results. In particular this holds for the following points:
  - The means of figures 1c-f are compared (lines 117-118) without stating the uncertainty of the calculated means, which should be accessible from the mentioned bootstrapping procedure. The results of a statistical test on the significance of the difference would then help to interpret the result.
  - The same is true for the interpretation of figure 7c-d, where the significance of the difference would support the observations reported in lines 167-170.
  - The comparison in line 156 mentions a Kolmogorov-Smirnov test whose results should be included for transparency.
  - The fits in figures A4 and A5 are subject to uncertainties that could be shown visually by plotting the confidence band. Since figure A4 is a central result discussed in section 4 (lines 175-179), the significance of the difference between the fits, especially at high temperatures, is important for the interpretation of the stated conclusion. Testing whether the fitted slope differs significantly from zero would also be informative.
- Lines 151-159 describe the results of figure 6, which compares the distribution of the daily averaged AF forecast biases with the distribution of daily averaged temperatures in the form of a QQ plot. However, I do not see what information can be obtained from such a comparison, and it is not used in any further discussion. The authors should clarify why these two distributions are expected to be related and what the intended interpretation is, or replace the figure with the panels of figure A4 (see next point).
- The central result that the forecasts underestimate AF at hot temperatures (lines 175-179) is visible only in figure A4, which is relegated to the appendix. This should at least be mentioned via a reference and/or by moving the figure to the main part of the paper.
- Several plots contain inconsistencies which should be corrected.
  - The values of the mean bias in figures 1c-f and figures A2a-d differ, although they should be equal. Further, the y axes in figures A2a-d do not show the whole data range, which is from -0.4 to 0.4 as shown in figures 1c-f.
  - The text explanation of figure 6 (line 152) mentions a deviation from the diagonal; however, the plot shows a horizontal line. If this really is a QQ plot, a diagonal line should be added.
  - Figure A3 gives mean MAE values, which I assume are the mean values over all temperatures. However, from the shown data it is obvious that these values do not match the mean of the shown data. Is there data missing in the plot? If so, the axes should be changed such that all data is included.
  - The data in figures A4a-d should be the same as in figures 3c-f. Therefore, the magnitude of the values should match, which is not the case. For example, figure A4d has values between -0.1 and 0.25, while figure 3f has values between -0.05 and 0.12.
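As an illustration of the uncertainty estimate requested above for the compared means, the following is a minimal sketch of a paired percentile bootstrap for the difference in mean bias between two forecasts. It is not the authors' bootstrapping procedure, which is not described here; the function name, the parameters, and the example values are all hypothetical. It also assumes that days can be resampled independently, which ignores temporal autocorrelation in daily biases; a block bootstrap may be more appropriate.

```python
import random
import statistics


def bootstrap_mean_diff(bias_a, bias_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(bias_a) - mean(bias_b).

    Hypothetical helper: both series are assumed to hold one bias value
    per day for the same days, so days are resampled in pairs.
    """
    if len(bias_a) != len(bias_b):
        raise ValueError("both series must cover the same days")
    rng = random.Random(seed)
    n = len(bias_a)
    # Paired daily differences between the two forecasts.
    diffs = [a - b for a, b in zip(bias_a, bias_b)]
    boot_means = []
    for _ in range(n_boot):
        # Resample days with replacement (assumes independent days).
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(statistics.fmean(sample))
    boot_means.sort()
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return statistics.fmean(diffs), lower, upper


# Made-up daily biases for two forecasts, for illustration only.
physics_bias = [-0.02, -0.01, -0.03, 0.00, -0.02, -0.04]
data_driven_bias = [-0.03, -0.02, -0.05, -0.01, -0.02, -0.06]
mean_diff, lower, upper = bootstrap_mean_diff(physics_bias, data_driven_bias)
print(f"mean difference {mean_diff:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

If the resulting interval excludes zero, the difference between the two mean biases could be reported as significant at the chosen level, subject to the independence assumption stated above.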
Finally, the following list contains several purely technical issues.
- In lines 93-94 the order should be reversed to "difference between forecast and reference" to match the given formula 4.
- Formula 4 uses j as index while the text and the following formula use i. This should be aligned.
- Figures 1, 3, A2 and A4 describe the mean of the time series as "mean error", a name which is already used as an alternative to "bias". It should thus be "mean mean error" or, better, "mean bias".
- The caption of figure 1 is missing the word "line" in the last sentence.
- The caption of figure 3 has wrong sub-figure labelling: a-d should be c-f.
- Figure 5 has different y-scales for the four plots. For a better comparison these should be aligned.
- For a better comparison, the histograms in figure A1 should use the same bin width, not the same number of bins, for both forecasts.
- The caption to figure A2 mentions a time series, but it is a bias-temperature scatter-plot.
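The equal-bin-width suggestion for figure A1 could be implemented by deriving one set of bin edges from both samples and reusing it for each histogram. The sketch below assumes a user-chosen bin width; the function names and the sample values are hypothetical and not taken from the manuscript.

```python
import math


def shared_bin_edges(sample_a, sample_b, bin_width):
    """Return equal-width bin edges that cover both samples."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    lowest = min(min(sample_a), min(sample_b))
    highest = max(max(sample_a), max(sample_b))
    # Align the first edge to a multiple of the bin width.
    start = math.floor(lowest / bin_width) * bin_width
    n_bins = max(1, math.ceil((highest - start) / bin_width))
    return [start + i * bin_width for i in range(n_bins + 1)]


def histogram(sample, edges):
    """Count values per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    width = edges[1] - edges[0]
    for x in sample:
        idx = min(int((x - edges[0]) / width), len(counts) - 1)
        counts[idx] += 1
    return counts


# Made-up samples standing in for the two forecasts.
forecast_a = [0.0, 0.5, 1.5]
forecast_b = [2.0, 3.9]
edges = shared_bin_edges(forecast_a, forecast_b, bin_width=1.0)
counts_a = histogram(forecast_a, edges)
counts_b = histogram(forecast_b, edges)
```

Because both histograms share the same edges, bar heights can be compared bin by bin even when the two forecasts span different ranges.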
Citation: https://doi.org/10.5194/egusphere-2026-1144-RC2
Viewed
- HTML: 959
- PDF: 510
- XML: 83
- Total: 1,552
- BibTeX: 61
- EndNote: 59
The manuscript addresses an important and timely topic, namely the development of impact-based forecasts for heat-related mortality. Overall, the manuscript clearly achieves its stated objectives and is well written, with a logical structure that makes the analysis easy to follow. I suggest some minor revisions, mainly related to clarification of certain aspects, as outlined below.
"We also use 2m temperature from archived forecasts from two different types of weather prediction models, one physical model (IFS HRES cycle 48r1), and one data-driven model (AIFS single v1)." It would be helpful if the authors briefly explained why these specific models were selected.
The epidemiological framework relies on exposure–response functions from the multi-city analysis of Masselot et al. (2023), which is based on all-cause mortality data. It would be helpful to state this explicitly, as the estimated impacts therefore correspond to temperature-related mortality across all causes rather than specific cause-of-death categories (e.g., cardiovascular or respiratory mortality, which are the causes most commonly used and studied in health impact studies). A brief clarification on this point would help.