the Creative Commons Attribution 4.0 License.
Computing Extreme Storm Surges in Europe Using Neural Networks
Abstract. Because of the computational costs of computing storm surges with hydrodynamic models, projections of changes in extreme storm surges are often based on small ensembles of climate model simulations. This may be resolved by using data-driven storm-surge models instead, which are computationally much cheaper to apply than hydrodynamic models. However, the potential performance of data-driven models at predicting extreme storm surges is unclear because previous studies did not train their models to specifically predict the extremes, which are underrepresented in observations. Here, we investigate the performance of neural networks at predicting extreme storm surges at 9 tide-gauge stations in Europe when trained with a cost-sensitive learning approach based on the density of the observed storm surges. We find that density-based weighting improves both the error and timing of predictions of exceedances of the 99th percentile made with Long-Short-Term-Memory (LSTM) models, with the optimal degree of weighting depending on the location. At most locations, the performance of the neural networks also improves by exploiting spatiotemporal patterns in the input data with a convolutional LSTM (ConvLSTM) layer. The neural networks generally outperform an existing multi-linear regression model, and at the majority of locations, the performance of especially the ConvLSTM models approximates that of the hydrodynamic Global Tide and Surge Model. While the neural networks still predominantly underestimate the highest extreme storm surges, we conclude that addressing the imbalance in the training data through density-based weighting helps to improve the performance of neural networks at predicting the extremes and forms a step forward towards their use for climate projections.
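The density-based weighting idea in the abstract can be illustrated with a minimal sketch. It assumes a simple histogram estimate of the target density and a single strength parameter `alpha`; the function names, the binning and the clipping constant `eps` are hypothetical choices made for this illustration and are not taken from the manuscript, whose exact weighting scheme is not described on this page.

```python
from collections import Counter

def density_based_weights(values, alpha=1.0, n_bins=20, eps=1e-6):
    """Return one weight per sample; samples in sparsely populated bins weigh more.

    alpha = 0 gives uniform weights; larger alpha emphasises rare values more.
    Histogram binning is an assumption of this sketch, not the paper's method.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant series
    # Assign each sample to a bin; the maximum value goes into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = Counter(bins)
    max_count = max(counts.values())
    # Normalised density in (0, 1]; equals 1 for the most populated bin.
    density = [counts[b] / max_count for b in bins]
    raw = [max(1.0 - alpha * d, eps) for d in density]
    mean_raw = sum(raw) / len(raw)
    # Rescale so that the weights average to 1.
    return [w / mean_raw for w in raw]

def weighted_mse(y_true, y_pred, weights):
    """Mean squared error with per-sample weights."""
    return sum(w * (t - p) ** 2 for w, t, p in zip(weights, y_true, y_pred)) / len(y_true)

# Toy series: many small surges, a few moderate ones, one extreme.
surges = [0.0] * 90 + [0.5] * 9 + [2.0]
weights = density_based_weights(surges, alpha=1.0)
```

In this toy series the single extreme value receives the largest weight and the frequent near-zero values the smallest, so errors on the extremes dominate the weighted loss.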
Status: open (until 09 May 2025)
RC1: 'Comment on egusphere-2025-196', Anonymous Referee #1, 10 Feb 2025
Review of the Manuscript “Computing Extreme Storm Surges in Europe Using Neural Networks”
The manuscript presents an approach to storm surge prediction using deep learning. Several methodological issues must be addressed to improve its clarity and strength. Specifically, the authors should clarify how the dataset was constructed, justify their hyperparameter choices, and strengthen the performance evaluation. In its current form, the manuscript is not appropriate for publication.
Introduction:
- L41-48: The last sentence, “Furthermore, because several … hydrodynamics models”, does not follow logically. It reads as though the authors draw a general conclusion from only “several” previous studies.
Methodology – Data Preparation:
- Dataset Size and Class Distribution: The paper mentions using data from 1979 to 2017 at a three-hour resolution. However, it is unclear how many training samples remain after filtering and how the extreme events (exceedances of the 99th and 99.9th percentiles) are distributed.
- Explanatory Variables: While the paper includes zonal, meridional, and absolute wind speed as predictors, absolute wind speed is directly derivable from the other two. The authors need to justify this inclusion. Otherwise, removing absolute wind speed could prevent redundancy and improve efficiency.
- Construction of Data Points: Storm surges can persist for several days. It is essential to clarify whether data points overlap, whether each event is treated as an independent sample, or if multi-day storm surges are captured uniquely.
- Atmospheric Variables: The authors predict sea-level height based on ERA5 atmospheric data but do not specify whether land-based data is included. If land data is incorporated, the authors need to justify and discuss how it was handled.
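To make the dataset-size and overlap questions above concrete, the following sketch computes an upper bound on the number of three-hourly samples for 1979 to 2017 and groups threshold exceedances into separate events. It assumes a gap-free record, and the 72-hour separation window and all function names are hypothetical choices for illustration, not values from the manuscript.

```python
from datetime import datetime, timedelta

def n_timesteps(start, end, step_hours=3):
    """Number of time steps from start (inclusive) to end (exclusive)."""
    return int((end - start) / timedelta(hours=step_hours))

def decluster(times_hours, values, threshold, min_gap_hours=72):
    """Merge exceedances less than min_gap_hours apart into one event.

    times_hours must be sorted in ascending order.
    Returns a list of (peak_time, peak_value), one entry per event.
    """
    events = []
    last_exceedance = None
    for t, v in zip(times_hours, values):
        if v <= threshold:
            continue
        if last_exceedance is not None and t - last_exceedance < min_gap_hours:
            # Same event as the previous exceedance: keep the larger peak.
            if v > events[-1][1]:
                events[-1] = (t, v)
        else:
            events.append((t, v))
        last_exceedance = t
    return events

# Upper bound on sample count, before any filtering for gaps.
total = n_timesteps(datetime(1979, 1, 1), datetime(2018, 1, 1))
print(total)           # 113960
print(total * 0.01)    # roughly 1140 samples above the 99th percentile
print(total * 0.001)   # roughly 114 samples above the 99.9th percentile

# Toy series: one three-step surge, a long calm period, one isolated peak.
values = [0.1, 1.2, 1.5, 1.1] + [0.1] * 40 + [1.3, 0.1]
times = [3 * i for i in range(len(values))]
events = decluster(times, values, threshold=1.0)
print(events)          # [(6, 1.5), (132, 1.3)]
```

Because consecutive exceedances usually belong to the same storm, the number of independent extreme events is smaller than the number of exceeding samples, which is why reporting both counts would help.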
Model Training & Hyperparameter Tuning:
- Training Epochs: The authors use a maximum of 100 training epochs with early stopping. Given the complexity of the models, 100 epochs may not be sufficient. The authors should demonstrate that the models converge within 100 epochs.
- Dropout Rate: The dropout rate of 0.1–0.2 may be too low. LSTM and ConvLSTM models often use dropout rates of 0.3–0.5 to prevent overfitting. If different dropout rates have been tested, discussing their impact would improve transparency.
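One way to address the convergence question is to report whether early stopping triggered before the epoch cap was reached. The sketch below replays a sequence of per-epoch validation losses; the patience of 10 epochs and the function name are assumptions for illustration, since the manuscript's stopping criterion is not given here.

```python
def replay_early_stopping(val_losses, max_epochs=100, patience=10):
    """Replay per-epoch validation losses under early stopping.

    Returns (stop_epoch, best_epoch, hit_cap), where hit_cap is True when
    training ran to max_epochs without the patience criterion triggering.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop early.
            return epoch, best_epoch, False
    stop_epoch = min(len(val_losses), max_epochs)
    return stop_epoch, best_epoch, len(val_losses) >= max_epochs

# Loss improves for 20 epochs and then plateaus: stopping triggers early.
plateau = [1.0 / e for e in range(1, 21)] + [0.06] * 80
print(replay_early_stopping(plateau))      # (30, 20, False)

# Loss is still improving at epoch 100: the cap, not convergence, ended training.
improving = [1.0 / e for e in range(1, 101)]
print(replay_early_stopping(improving))    # (100, 100, True)
```

If `hit_cap` is False for all trained models, the 100-epoch limit was not binding; if it is True, a higher limit would be worth testing.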
Performance Evaluation:
- Evaluation Metrics: The authors use the F1-score as a primary evaluation metric. In extreme event prediction, recall is often more important than precision, as missing a storm surge event is more consequential than a false positive. A high F1-score does not necessarily indicate strong model performance if recall is low. Reporting recall and precision alongside the F1-score would provide a more comprehensive assessment. A confusion matrix could also be beneficial.
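The requested metrics all follow from the confusion matrix of threshold exceedances. The sketch below is a minimal illustration with made-up numbers; the function name, the threshold and the toy series are hypothetical and not results from the manuscript.

```python
def exceedance_scores(observed, predicted, threshold):
    """Confusion matrix, precision, recall and F1 for exceedances of a threshold."""
    tp = fp = fn = tn = 0
    for obs, pred in zip(observed, predicted):
        obs_exceeds, pred_exceeds = obs > threshold, pred > threshold
        if obs_exceeds and pred_exceeds:
            tp += 1   # event observed and predicted
        elif obs_exceeds:
            fn += 1   # event observed but missed
        elif pred_exceeds:
            fp += 1   # false alarm
        else:
            tn += 1   # correctly predicted non-event
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f1": f1}

observed = [1.2, 1.5, 0.3, 0.2, 1.1, 0.4]
predicted = [1.1, 0.8, 0.2, 1.3, 0.9, 0.1]
scores = exceedance_scores(observed, predicted, threshold=1.0)
print(scores)
```

In this toy case two of the three observed exceedances are missed, giving a recall of one third against a precision of one half, a distinction that the F1-score of about 0.4 alone does not show.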
Discussion & Conclusion:
- The discussion and conclusion are well-written based on the current results of the study, but they will need to be updated after the revision of the manuscript.
Minor Edits:
- "v.s." to "vs."
- L41: “more moderate” to “moderate.”
- L318: Remove “at least.”
Citation: https://doi.org/10.5194/egusphere-2025-196-RC1