the Creative Commons Attribution 4.0 License.
Storm surge dynamics in the northern Adriatic Sea: comparing AI emulators with high-resolution numerical simulations
Abstract. Accurate storm surge forecasting is vital for protecting coastal regions, particularly in the northern Adriatic Sea where sea-level rise and increasingly severe storm events pose growing risks. Machine Learning (ML) approaches offer compelling speed and flexibility, yet their ability to emulate high-resolution dynamic models, especially for extreme surge events, has not been sufficiently assessed across methods and loss functions. In this study, a range of ML emulators, from Multivariate Linear Regression (MLR) to Long Short-Term Memory (LSTM) networks, is benchmarked against a high-resolution hydrodynamic model optimized for extreme surge representation. We also evaluate the impact of training loss functions, comparing the conventional Mean Squared Error (MSE) with the corrected Mean Absolute Deviation squared (MADc²), designed to better capture surge peaks. Results show that even simple models like MLR, when trained with MADc², achieve performance comparable to advanced neural networks while remaining orders of magnitude faster. These findings demonstrate that with appropriate training strategies, data-driven emulators can rival physics-based models in reproducing extremes. The MLR-MADc² configuration emerges as a practical balance between computational efficiency and accuracy, underscoring the potential of ML emulators for coastal forecasting and risk assessment.
Competing interests: Co-author Massimo Tondello is employed by the company HS Marine SrL. Co-author Michalis Vousdoukas is employed by the company MV Coastal and Climate Research Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationship that could be construed as a potential conflict of interest.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 14 Jan 2026)
RC1: 'Comment on egusphere-2025-5313', Anonymous Referee #1, 08 Dec 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-5313/egusphere-2025-5313-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2025-5313-RC1
AC1: 'Reply on RC1', Rodrigo Campos Caba, 09 Dec 2025
Dear reviewer,
We would like to sincerely thank you for the time and care dedicated to evaluating our manuscript. Your constructive comments highlight several important aspects that will help us improve the clarity, transparency, and contextualization of our work.
As this is an initial response within the open discussion, ahead of the full revised manuscript and detailed rebuttal, we take the opportunity to address the major scientific points raised, clarify aspects of the methodology that may have been misunderstood, and acknowledge the many legitimate suggestions that will strengthen the revised version.
Before addressing the major comments, we note that two of the critiques concern developments that occurred after our original submission date, while another relates to the interpretation of our methodological framework. We respectfully clarify these points below while also emphasizing that we greatly appreciate the reviewer’s careful assessment and the improvements their feedback enables.
POINT 1: NOVELTY AND TIMELINE OF EXTREME-FOCUSED LOSS FUNCTIONS
The reviewer notes that recent studies (Hermans et al., 2025; Longo et al., 2025) have also explored loss functions tailored to extremes, suggesting that our contribution may be less novel than stated. We acknowledge these contributions and thank you for mentioning them. We respectfully note, however, that Hermans et al. (2025) was published on 21 November 2025, after our submission (18 November 2025), and was therefore not available during manuscript preparation. Longo et al. is still a preprint, as is our own manuscript. Although preprints may be accessible before formal publication, they have not yet undergone peer review and their content may still evolve. For this reason, preprints are typically not treated as part of the established scientific record at the time of submission and are therefore not used to retrospectively reassess the novelty of a contribution.
Our work was developed independently. The fact that Hermans et al. (2025) later introduced a dense-loss strategy for extremes, while Longo et al. (preprint) explored quantile-based alternatives, reinforces rather than diminishes the novelty of our contribution. This convergence of ideas across multiple groups highlights a timely and emerging research direction in which the community increasingly recognizes the need for loss functions tailored to the tails of the distribution, which is precisely the motivation that led us to propose MADc². The independent appearance of related approaches underscores the relevance of our methodology and the importance of explicitly addressing extremes in data-driven storm surge modeling.
A central contribution of our work is the introduction of the MADc² loss function, which builds upon the MADc metric that we first introduced in Campos-Caba et al. (2024a) for the evaluation of extreme storm surge simulations. That earlier study demonstrated that MADc uniquely identifies the configuration of our high-resolution dynamic model, which was specifically designed with a coastal resolution of 50 m, forced by a dedicated atmospheric downscaling, and calibrated to maximize performance on extreme events. This provided us with a physically credible and exceptionally strong benchmark for evaluating ML emulators, which we consider a further novel contribution of the study.
Following the publication of that work, we extended MADc into a differentiable loss function (MADc²) and formally presented it as a learning objective for emulators in several workshops (Campos-Caba et al., 2024b; Mentaschi et al., 2025; Campos-Caba et al., 2025).
In the revision, we will:
• Clarify the developmental timeline of MADc and MADc².
• Contextualize our contribution within this emerging research direction.
POINT 3: USE OF MED-MFC SEA SURFACE HEIGHT AS A PREDICTOR
We thank the reviewer for raising this important point. You express concern that using high-resolution Med-MFC sea surface height (SSH) as a predictor may be “circular” or may “defeat the purpose” of data-driven emulators. We respectfully clarify that this interpretation does not apply to our methodological framework, and we take the opportunity to underscore another contribution of our work. Our objective is statistical downscaling, not full model replacement.
Using coarse-resolution model output to produce refined local predictions is a long-established and foundational practice in coastal oceanography, forming the backbone of operational storm surge forecasting for decades (e.g., Flather, 2000; von Storch & Woth, 2008). Dynamical downscaling systems routinely use coarse ocean model SSH to force high-resolution coastal models (Trotta et al., 2016; Federico et al., 2017). Our ML emulator performs the same function statistically: refining operationally available basin-scale fields to resolve coastal processes that coarse models cannot capture.
Therefore, a further novel aspect of our approach is integrating ML emulation directly into operational forecasting workflows. Most ML storm surge studies predict from atmospheric variables alone. In contrast, we mirror operational downscaling chains: coarse SSH from Copernicus Med-MFC (which is freely and consistently available in near-real time) is statistically refined to tide-gauge scale. This positions our emulator as a computationally efficient component within existing systems, not a standalone replacement.
Far from circular, this approach is operationally advantageous: it leverages high-quality basin-scale output (already assimilating observations) and focuses ML on coastal refinement, where dynamical modeling is most expensive.
That said, your suggestion is valuable. In the revised manuscript, we will comment on this alternative approach and clarify the distinction between our statistical downscaling framework and time-series forecasting methods. We will also:
• More explicitly describe the downscaling framework and distinguish it from full surrogate modeling.
• Comment on the complementary experiment you propose (evaluating models using atmospheric predictors only).
• Clarify the strengths and limitations of using SSH-driven statistical downscaling for extreme-value prediction.
We also acknowledge that your proposed improvements in this section are legitimate and will meaningfully enhance the manuscript.
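To make the distinction between statistical downscaling and full surrogate modeling concrete, the following is a minimal sketch of the downscaling idea on synthetic data. It is an illustration under stated assumptions, not the manuscript's implementation: the arrays `coarse_ssh` and `gauge`, the number of retained components `n_modes`, the 80/20 chronological split, and the use of ordinary least squares (that is, an MSE objective rather than MADc²) are all hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for coarse basin-scale SSH: n_times snapshots on n_grid points.
# (Assumption: real predictors would come from an operational basin-scale product.)
n_times, n_grid, n_modes = 2000, 50, 5
latent = rng.standard_normal((n_times, 3))        # a few large-scale patterns
patterns = rng.standard_normal((3, n_grid))
coarse_ssh = latent @ patterns + 0.1 * rng.standard_normal((n_times, n_grid))

# Synthetic tide-gauge target: linear in the large-scale patterns, plus noise.
gauge = latent @ np.array([0.5, -0.3, 0.2]) + 0.05 * rng.standard_normal(n_times)

# Chronological split (no shuffling): the test period follows the training period.
n_train = int(0.8 * n_times)
X_train, X_test = coarse_ssh[:n_train], coarse_ssh[n_train:]
y_train, y_test = gauge[:n_train], gauge[n_train:]

# PCA via SVD of the centred training fields; keep the leading n_modes components.
mean_field = X_train.mean(axis=0)
_, _, vt = np.linalg.svd(X_train - mean_field, full_matrices=False)
components = vt[:n_modes]                         # shape (n_modes, n_grid)

def encode(fields):
    """Project SSH fields onto the principal components fitted on training data."""
    return (fields - mean_field) @ components.T

def design(pcs):
    """Add an intercept column to the principal-component predictors."""
    return np.column_stack([np.ones(len(pcs)), pcs])

# Multivariate linear regression fitted by ordinary least squares (MSE objective).
coef, *_ = np.linalg.lstsq(design(encode(X_train)), y_train, rcond=None)
y_pred = design(encode(X_test)) @ coef

rmse = float(np.sqrt(np.mean((y_pred - y_test) ** 2)))
corr = float(np.corrcoef(y_pred, y_test)[0, 1])
print(f"test RMSE = {rmse:.3f}, correlation = {corr:.3f}")
```

The point of the sketch is structural: the regression never replaces the basin-scale model, it only maps that model's output to a local target. A different training loss would change only the fitting step, leaving the encoding and the split untouched.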
OTHER COMMENTS
We acknowledge the remaining points and thank the reviewer for highlighting them. In the revised manuscript, we will:
• Expand the discussion of PCA limitations and clarify that several encoding approaches exist but were not explored here.
• Strengthen Section 4 by situating our findings more clearly within the broader literature on data-driven storm surge modeling (including Hermans et al., Longo et al., Tiggeloven et al., Tadesse et al., and others).
• Provide full hyperparameter details for all neural network architectures (depth, units, learning rate, dropout rates, and optimization settings).
• Add the predictor spatial domains directly to Figure 1.
• Clarify the rationale for using the 99th percentile threshold and include a brief discussion of behavior above the 99.9th percentile.
• Acknowledge the limitation that only two locations are analyzed.
• Replace rainbow color maps with perceptually uniform alternatives.
• Make all code and data available in a public repository to guarantee full reproducibility.
• Add clarifications regarding the temporal split and the representativeness of extremes across the three-year testing period.
These are constructive suggestions, and we will incorporate them systematically into the revised manuscript.
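The percentile-threshold item above can be illustrated with a short sketch of how skill might be compared above the 99th and 99.9th percentiles. This is a hedged example on synthetic data, not the evaluation used in the manuscript: the helper `tail_rmse`, the Gumbel-distributed "observations", and the emulator that damps peaks by 20 % are assumptions introduced purely for illustration.

```python
import numpy as np

def tail_rmse(obs, pred, percentile):
    """Return (RMSE, sample count) over time steps where obs exceeds its percentile."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    threshold = np.percentile(obs, percentile)
    mask = obs > threshold
    if not mask.any():
        raise ValueError("no samples above the requested percentile")
    rmse = float(np.sqrt(np.mean((pred[mask] - obs[mask]) ** 2)))
    return rmse, int(mask.sum())

rng = np.random.default_rng(1)

# Three years of hourly synthetic "observed" surge with a skewed upper tail.
obs = rng.gumbel(loc=0.0, scale=0.15, size=3 * 365 * 24)

# Hypothetical emulator that underestimates peaks by 20 % (illustration only).
pred = 0.8 * obs + 0.02 * rng.standard_normal(obs.size)

results = {p: tail_rmse(obs, pred, p) for p in (99.0, 99.9)}
for p, (rmse, n) in results.items():
    print(f"above P{p}: n = {n}, RMSE = {rmse:.3f}")
```

Reporting the sample count next to each score matters here: with three years of hourly data, only a few dozen samples lie above the 99.9th percentile, so any statistic at that level is much noisier than the one at the 99th percentile.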
Finally, we thank the reviewer again for their thoughtful and constructive feedback. We believe that addressing these points will substantially strengthen the manuscript, particularly in clarifying the novelty of our contribution, aligning the methodology with established downscaling practices, and enhancing transparency in our ML implementation. We are preparing a full revised version of the manuscript and a detailed rebuttal accordingly.
REFERENCES
Federico, I., Pinardi, N., Coppini, G., Oddo, P., Lecci, R., & Mossa, M. (2017). Coastal ocean forecasting with an unstructured grid model in the southern Adriatic and northern Ionian seas. Natural Hazards and Earth System Sciences, 17(1), 45–59. https://doi.org/10.5194/nhess-17-45-2017.
Flather, R. (2000). Existing operational oceanography. Coastal Engineering, 41, 13-40.
Hermans, T., Hammouda, C., Treu, S., Tiggeloven, T., Couasnon, A., Busecke, J., and van de Wal, R. (2025). Computing extreme storm surges in Europe using neural networks. Nat. Hazards Earth Syst. Sci., 25, 4593-4612. https://doi.org/10.5194/nhess-25-4593-2025.
Campos-Caba, R., Alessandri, J., Camus, P., Mazzino, A., Ferrari, F., Federico, I., Vousdoukas, M., Tondello, M., and Mentaschi, L. (2024a). Assessing storm surge model performance: what error indicators can measure the model’s skill? Ocean Sci., 20, 1513-1526. https://doi.org/10.5194/os-20-1513-2024.
Campos-Caba, R., Mentaschi, L., Pinardi, N., Alessandri, J., Camus, P., Tondello, M., Mazzino, A., and Ferrari, F. (2024b). Developments on a machine learning downscaling system for storm surge in the Northern Adriatic Sea. Fourth ESA-ECMWF workshop: Machine Learning for Earth system observation and prediction. Frascati, Italy. Available at: [https://www.ml4esop.esa.int/posters].
Campos-Caba, R., Alessandri, J., Camus, P., Mazzino, A., Ferrari, F., Federico, I., Vousdoukas, M., Tondello, M., Coppini, G., and Mentaschi, L. (2025). Enhancing storm surge downscaling: A comparative study of machine learning and dynamical modeling in the northern Adriatic Sea. 4th International Workshop on Waves, Storm Surges, and Coastal Hazards. Santander, Spain.
Longo, E., Ficchi, A., Verlaan, M., Muis, S., Castelletti, A. (preprint). A deep learning framework for extreme storm surge modelling under future climate scenarios. Manuscript submitted to Earth’s Future.
Mentaschi, L., Campos-Caba, R., Alessandri, J., Camus, P., Mazzino, A., Ferrari, F., Federico, I., Vousdoukas, M., Tondello, M., and Coppini, G. (2025). Storm surge prediction in the Northern Adriatic Sea: a comparison between Machine Learning and numerical modelling. European Geosciences Union, General Assembly 2025. Vienna, Austria. Abstract available at: [https://meetingorganizer.copernicus.org/EGU25/EGU25-17094.html].
Trotta, F., Fenu, E., Pinardi, N., Bruciaferri, D., Giacomelli, L., Federico, I., & Coppini, G. (2016). A structured and unstructured grid relocatable ocean platform for forecasting (SURF). Deep-Sea Research II, 133, 54–75. https://doi.org/10.1016/j.dsr2.2016.05.004.
Von Storch, H., and Woth, K. (2008). Storm surges: perspectives and options. Sustain Sci. 3:33-43. https://doi.org/10.1007/s11645-008-0044-2.
Citation: https://doi.org/10.5194/egusphere-2025-5313-AC1
RC2: 'Reply on AC1', Anonymous Referee #1, 10 Dec 2025
The initial response of the authors to my review is sensible, but I disagree with the authors’ stance on using preprints. Dismissing preprints because they ‘are not part of the established scientific record’ goes against the purpose of preprints, which is to accelerate science by making research available earlier. Even though they have not been peer-reviewed yet, preprints have their own DOI and are therefore traceable and citeable. The specific studies I referred to have been available as preprints for multiple months, so could have been included. Furthermore, in the past years, different ways to address data imbalance in regression problems in general have been investigated as well. With the comment in my review I did not mean to discredit the present study, but rather to point out how, in my view, the authors could make a bigger contribution to advancing the science in this regard with relatively little additional effort. I will leave it up to the editor to decide whether they find that additional effort necessary or not, but it is good to read that the authors at least plan to include the latest research in their introduction and discussion.
Citation: https://doi.org/10.5194/egusphere-2025-5313-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 243 | 73 | 28 | 344 | 40 | 20 | 20 |