the Creative Commons Attribution 4.0 License.
Composition, frequency and magnitude of future rain-on-snow floods in Germany
Abstract. In Germany, severe trans-basin winter floods are often generated by rain-on-snow (ROS) phenomena. Under suitable conditions, when rain falls on the snow cover, the snow can melt and produce extreme amounts of runoff. In a warming climate, the frequency of ROS events is expected to change locally depending on elevation and regionally based on the general climate conditions. Consequently, the characteristics of ROS-driven winter floods are also anticipated to change. To investigate these changes, streamflow for multiple gauge stations in Germany was simulated using a deep learning model driven by an ensemble of downscaled climate projections. Germany, as a representative mid-latitude country with a considerable portion of historical floods generated by ROS, offers hydrological observations with extensive spatial coverage and long records, which supports efficient training of the deep learning model. We used explainable artificial intelligence to examine flood-generating processes, focusing primarily on ROS, for every simulated flood peak. Changes in frequency, feature importance, and magnitude of ROS flood events were assessed for individual streamflow gauges and for trans-basin floods across four major river basins in Germany. We found that with regard to the ensemble median, the frequency of ROS floods will decrease at the scale of individual gauges, as well as at the trans-basin scale for the Rhine, Elbe, and Weser River basins but increase in the Danube River basin. For all regions, the snowmelt component during ROS floods becomes less relevant, whereas the contribution of rainfall to these events increases. Interestingly, the severity of both the mean and the most extreme ROS trans-basin floods is projected to increase compared to the historical period in all major river basins in Germany, even though several individual gauges may experience a decrease in magnitude.
Despite the overall agreement in the trends of the input features across climate models, the resulting trends in ROS floods are considerably disparate. This discrepancy is primarily attributed to the variations in snow dynamics in different climate models.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3532', Anonymous Referee #1, 30 Nov 2025
  - AC1: 'Reply on RC1', Christian Czakay, 16 Apr 2026
- RC2: 'Comment on egusphere-2025-3532', Anonymous Referee #2, 05 Mar 2026
The study by Czakay et al. applies an LSTM model, trained on ERA5 data, in combination with an ensemble of 31 downscaled CORDEX CMIP5 projections to investigate how climate change may impact rain-on-snow floods in Germany. The study is well-designed and comprehensive, and it provides some interesting new findings. Still, I have some questions and comments that should be considered in a revised manuscript.
General comments
- My major concern is about the detection of peak-over-threshold events and the definition of floods. The authors use a threshold value to derive on average five peak flow events per year (POT5). A flood is then defined as a POT5 event exceeding a return period of two years. How is the independence of flood events guaranteed? Two subsequent peaks may actually belong to the same flood event. This could have a major impact on the results and their interpretation.
- Flood magnitudes are displayed by ranks. This requires better introduction, explanation and guidance for the reader when explaining the results.
- The authors could elaborate more about the uncertainties of their simulations and projections. In particular, the role of the deep learning approach on the overall uncertainty could be discussed in some more detail.
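The independence concern in the first general comment can be illustrated with a simple declustering rule. The sketch below is not the authors' procedure: the function name `independent_peaks`, the minimum separation `min_gap_days`, and the toy discharge series are all assumptions made for illustration. It keeps only the largest peak whenever two threshold exceedances fall closer together than the chosen separation.

```python
from typing import List, Tuple


def independent_peaks(
    flow: List[float], threshold: float, min_gap_days: int
) -> List[Tuple[int, float]]:
    """Return (day_index, discharge) for declustered peaks over a threshold.

    Hypothetical helper: if two peaks are closer than min_gap_days,
    only the larger one is kept as an independent event.
    """
    # Candidate peaks: local maxima that exceed the threshold.
    candidates = [
        (i, q)
        for i, q in enumerate(flow)
        if q > threshold
        and (i == 0 or q >= flow[i - 1])
        and (i == len(flow) - 1 or q > flow[i + 1])
    ]
    # Greedy selection: visit peaks from largest to smallest and keep a peak
    # only if it is far enough from every peak that was already kept.
    kept: List[Tuple[int, float]] = []
    for i, q in sorted(candidates, key=lambda p: p[1], reverse=True):
        if all(abs(i - j) >= min_gap_days for j, _ in kept):
            kept.append((i, q))
    return sorted(kept)


# Toy daily discharge series (assumed values, arbitrary units).
demo_flow = [1, 5, 3, 6, 2, 1, 1, 1, 1, 7, 1]
demo_peaks = independent_peaks(demo_flow, threshold=4, min_gap_days=3)
print(demo_peaks)
```

With a separation of three days, the peaks on days 1 and 3 are treated as one event and only the larger is retained. Practical schemes often add a second criterion, such as requiring the flow to recede sufficiently between two peaks, which is omitted here.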
Specific comments
- The results presented in the abstract could contain more specific, quantitative information. The authors may also consider adding details on the data and methods used (which deep learning model was used, etc.).
- The introduction lacks a clear research question or research objective. Furthermore, the introduction should also introduce and explain trans-basin floods in some more detail. I also suggest highlighting the relevance of rain-on-snow events compared to other flood types in Germany in some more detail.
- L39 add direction of change
- L54 why?
- L55-56 this sentence is not complete
- L70-78 on which spatial scales? What about uncertainties in the (hydrological) impact models?
- L93 already mentioned
- L96 add spatial scale of ERA data
- L107 would it make sense to apply an ensemble of LSTMs to learn more about the uncertainty of the deep learning approach?
- L135 check for consistent use of IGsm and IGsnw
- L186 the authors should provide more information on how they performed the bias correction.
- L187 This could need a clearer explanation. Maybe highlighting the difference between ERA5 and ERA-Interim could help.
- L230-235 What about anthropogenic activities in the study catchments in general? I would guess that this is an issue. How does this impact the results?
- Figure 3: I do not understand the orange line in combination with lines 240-244.
- Figure 4: The LSTM predicts floods not contained in the observation data. What does this imply for future projections?
- L290-295 I think these data sets need a better introduction in chapter 2.
- Figure 7c, f (plus corresponding text, L293): I do not see this improvement.
- With reference to the skill in predicting high flows, this should read a bit more critically. How well does the deep learning approach compare against a regular hydrological model? Is it possible to assess/discuss this?
- L344 add a reference to Figure 10 underlining the spatial information
- Figure 10: is it possible to indicate the significance of changes here?
- Figure 11: notches in the boxplots could indicate the significance of changes. Fig. 11c: why is “all” smaller than the individual river basins?
- Figure 12 (c): Do I understand correctly that ROS events are currently by far the most dominant flood types in Germany? I can hardly believe that.
- Figure 13: maybe the short statement in the text on changes in the seasonality is enough and does not require this figure, which does not provide much extra information.
- L449 I wonder if percentage change is an appropriate measure here. If we assume the same absolute decrease in the number of days with snow cover in lowlands and the Alps, then the percentage decrease is of course smaller in the Alps, since the total number of snow days is generally larger in high-mountain regions.
- L486-490: repetition, this has already been mentioned in the method section
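The L449 comment on percentage change can be made concrete with a small worked example. The snow-day counts below are hypothetical and chosen only to show the effect; they are not taken from the manuscript. The same absolute loss of snow-cover days yields a much larger relative change where the baseline is small.

```python
from typing import Tuple


def snow_day_change(baseline_days: float, future_days: float) -> Tuple[float, float]:
    """Return (absolute change in days, relative change in percent)."""
    absolute = future_days - baseline_days
    relative_pct = 100.0 * absolute / baseline_days
    return absolute, relative_pct


# Hypothetical annual snow-cover days (assumed values for illustration).
lowland_change = snow_day_change(baseline_days=30, future_days=10)
alpine_change = snow_day_change(baseline_days=150, future_days=130)

for name, (abs_days, rel_pct) in [("lowland", lowland_change), ("alpine", alpine_change)]:
    print(f"{name}: {abs_days:+.0f} days, {rel_pct:+.1f} %")
```

Both hypothetical sites lose 20 days, yet the relative decrease is about 67 % in the lowland case and about 13 % in the alpine case, so reporting both measures side by side avoids a misleading comparison.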
Citation: https://doi.org/10.5194/egusphere-2025-3532-RC2
- AC2: 'Reply on RC2', Christian Czakay, 16 Apr 2026
Data sets
Streamflow and flood-generating processes for CORDEX-CMIP5-driven streamflow simulations (1950-2100) using an LSTM and XAI. Christian Czakay et al. https://doi.org/10.5281/zenodo.16368450
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 404 | 140 | 27 | 571 | 80 | 27 | 24 |
This study investigates how climate conditions may impact rain-on-snow (ROS) floods in Germany using an LSTM model trained on downscaled climate projections and explainable artificial intelligence techniques. The results suggest that while ROS flood frequency will decrease in most regions, the severity of extreme ROS floods is expected to increase across all major river basins, and the role of snowmelt in these floods will become less significant. Overall, this study is comprehensive and well-designed. However, I still have several questions and suggestions for improving the current work.
1) The "magnitude" of floods is mentioned in the title. However, flood magnitudes are represented by relative ranks rather than by absolute values.
2) Please clearly define “trans-basin” floods. How can one distinguish whether a flood is "trans-basin" or "within-basin"?
3) Lines 5 and 82: The LSTM model may no longer be the state-of-the-art deep learning approach. It would be better to compare its performance to that of emerging alternatives, e.g., physics-informed deep learning. An ensemble-based approach would also be beneficial given the limitations of any single deep learning model.
4) Lines 18, 471-472, and 523-524: The conclusion that the discrepancy is mainly due to different snow dynamics in climate models should be supported by more detailed arguments and evidence.
5) It would be better to make the Introduction Section more concise and to clearly state the research gaps and study objectives.
6) Figure 1 and the other maps: It is suggested to add a north arrow and a scale bar.
7) Line 98: How are the "subsurface runoff" and the "day length" defined?
8) Line 108: The evaluation metric, NSE, has some inherent limitations. Also, given the sampling uncertainty and measurement errors in both temporal and spatial data, it is suggested to present the values of metrics through a statistical distribution instead of a fixed number for a single evaluation period and to apply the metric to the variable of interest like specific flood peaks. The authors can refer to the article below for more information about the limitations of some commonly used evaluation metrics in hydrology.
Reference: “Beyond a fixed number: Investigating uncertainty in popular evaluation metrics of ensemble flood modeling using bootstrapping analysis” (https://doi.org/10.1111/jfr3.12982)
9) Table 1: Could you add a reference for the criterion, "SWE>15mm"?
10) Line 230: What about the potential impacts of human activities on the other gauges? Any thoughts about how to incorporate such infrastructure into the modeling process, especially for urban areas?
11) Figures 3, 7, 8, and 9: The horizontal axis, "Stations", is sorted by different criteria, such as the mean NSE and the median number of ROS floods. Is it possible to keep the ordering consistent, so that each gauge station is easier to identify?
12) Figure 8: Please add an explanation of the shaded areas to the figure caption.
13) Line 370: It would be better to specify the exact p-value rather than "P<0.05".
14) Line 479: Is it "GMCs" or "GCMs"?
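Comment 8 suggests reporting evaluation metrics as distributions rather than single values. A minimal sketch of that idea follows, assuming a plain bootstrap over paired daily values. The function names and the toy series are hypothetical, and the resampling treats days as independent; daily streamflow is autocorrelated, so a block bootstrap may be more appropriate in practice.

```python
import random
from typing import List, Sequence, Tuple


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def bootstrap_nse(
    obs: Sequence[float], sim: Sequence[float], n_boot: int = 1000, seed: int = 0
) -> Tuple[float, float, float]:
    """Return approximate (2.5 %, 50 %, 97.5 %) quantiles of bootstrapped NSE."""
    rng = random.Random(seed)
    n = len(obs)
    values: List[float] = []
    for _ in range(n_boot):
        # Resample paired (obs, sim) days with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        obs_b = [obs[i] for i in idx]
        if len(set(obs_b)) < 2:
            continue  # skip degenerate resamples with zero variance
        values.append(nse(obs_b, [sim[i] for i in idx]))
    values.sort()
    m = len(values)
    return (
        values[int(0.025 * (m - 1))],
        values[(m - 1) // 2],
        values[int(0.975 * (m - 1))],
    )


# Toy observed and simulated discharge (assumed values, arbitrary units).
demo_obs = [2.1, 3.4, 8.9, 15.2, 7.3, 4.0, 2.8, 2.2]
demo_sim = [2.4, 3.0, 7.5, 13.9, 8.1, 4.6, 3.1, 2.0]
low, median, high = bootstrap_nse(demo_obs, demo_sim, n_boot=500, seed=42)
print(f"NSE median {median:.2f}, 95 % interval [{low:.2f}, {high:.2f}]")
```

The same resampling could be restricted to flood peaks only, which would address the suggestion to apply the metric to the variable of interest.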