Climate Adaptation-Aware Flood Prediction for Coastal Cities Using Deep Learning

Hassan, Bilal; Karapetyan, Areg; Chow, Aaron Chung Hin; Madanat, Samer

doi:10.5194/egusphere-2025-838

Preprints

https://doi.org/10.5194/egusphere-2025-838

27 Mar 2025

Abstract. Climate change and sea-level rise (SLR) pose escalating threats to coastal cities, intensifying the need for efficient and accurate methods to predict potential flood hazards. Traditional physics-based hydrodynamic simulators, although precise, are computationally prohibitive and impractical for city-scale coastal planning applications. Deep Learning (DL) techniques offer promising alternatives; however, they are often constrained by challenges such as data scarcity and high-dimensional output requirements. Leveraging a recently proposed vision-based, low-resource DL framework, we develop a novel, lightweight Convolutional Neural Network (CNN)-based model designed to predict coastal flooding under variable SLR projections and shoreline adaptation scenarios. Furthermore, we demonstrate the ability of the model to generalize across diverse geographical contexts by utilizing datasets from two distinct regions: Abu Dhabi and San Francisco. Our findings demonstrate that the proposed model significantly outperforms state-of-the-art methods, reducing the mean absolute error (MAE) in predicted flood depth maps by nearly 20 % on average. These results highlight the potential of our approach to serve as a scalable and practical tool for coastal flood management, empowering decision-makers to develop effective mitigation strategies in response to the growing impacts of climate change. Project Page: https://caspiannet.github.io

Received: 23 Feb 2025 – Discussion started: 27 Mar 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 44493 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Supplement (4405 KB)

Journal article(s) based on this preprint

11 Mar 2026

Climate adaptation-aware flood prediction for coastal cities using Deep Learning

Bilal Hassan, Areg Karapetyan, Aaron Chung Hin Chow, and Samer Madanat

Hydrol. Earth Syst. Sci., 30, 1333–1358, https://doi.org/10.5194/hess-30-1333-2026, 2026

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-838', Anonymous Referee #1, 25 Apr 2025
The manuscript proposes a novel deep learning framework for predicting coastal flooding (only tidal). The framework has been applied to the urban areas of Abu Dhabi and San Francisco. The proposed model performs better in comparison to the existing DL methods. The manuscript is free from grammatical errors and is interesting to read. However, the following serious concerns need to be addressed:
The developed model is only tuned for tidal flooding, and it does not account for the storm surge, which is a major source of coastal flooding. Especially, San Francisco is vulnerable to flooding by hurricanes. The paper doesn’t mention the idea behind the omission of flooding due to storm surge, and why modelling only tidal flooding is crucial in these example cases. The motivation for setting up this kind of prediction system is very unclear.

Also, the level of damage that can occur due to tidal flooding (as the storm surge is not included here) needs to be discussed so that the readers can understand the significance of, or the need for, this framework.

The manuscript needs to contain the details of how the hydrodynamic model, whose results have been used for training the DL model, is validated. What kind of events were used? How closely they matched the observed flooding also needs to be included in the paper.

The paper argues that the developed model is computationally efficient, but it fails to elucidate the details on how fast it is in comparison to the hydrodynamic model. This is very crucial to establish the significance of this work.

There is no spatial validation of the over-/under-/correctly predicted flooded areas. For example, see Nithila Devi et al. (2024), where a spatial fitness index has been used to assess the spatial accuracy.

In San Francisco, riverine and pluvial flooding are also present. How is modelling only the tidal flooding sufficient in this case for assessing the flood protection capabilities?

Flood protection measures on the coast are very vaguely discussed. It is very difficult to understand what they are, how they function, and how they are incorporated into the hydrodynamic and DL models. It will be nice to discuss with figures showing their placement, functionality and model representation explicitly.

Regarding the writing style, I find that the paper is more from a computer science background, rather than explaining clearly the application and relevance to the hydraulic modelling application. For example, the manuscript lacks details on training data used, accuracy of the training data, spatial prediction accuracy, specifics of protection measures, etc., as mentioned in the earlier comments.

The paper is very wordy, contains much irrelevant information, and lacks crucial information necessary for the understanding of the readers, for example, information on the kinds of damage due to tidal flooding, validation, etc. Please discuss the important information briefly rather than just citing other works.

The following are the specific comments listed for each section:
Introduction and Related Works
The introduction is very wordy and unorganised. A separate section of “related works” is unwarranted and has a lot of redundant information. Therefore, sections 1 and 2 (introduction and related works) need to be shortened and merged into one, ensuring that there is a proper flow of content in them.

Lines 17-20 can be shortened.

Lines 26 – 34 need to be shortened, retaining information relevant to the kind of study the paper deals with.

Please use terms like computationally expensive/intensive instead of the term “computationally prohibitive” if possible.

Mention literature related to computationally efficient modelling approaches (subgrid, parallelisation, simplified models, etc.), such as De Almeida et al. (2013), Neal et al. (2012), Li and Hodges (2019), Sanders and Schubert (2019), Nithila Devi et al. (2024), etc. Discuss such methodologies and bring out the importance of DL/ML/AI in flood forecasting and modelling.

Highlight the need for high-resolution modelling, especially in the complex urban terrain and cite relevant literature.

Line 56, what do you mean by high dimensionality of outputs?

Since this paper expands on a deep-vision based framework, explain briefly about this in the introduction so that the readers from diverse backgrounds can appreciate the work. (Line 57)

The paper lacks a dedicated and concrete objective description; rather, it mentions the conclusions from the study in the introduction. Please move lines 61 – 81 to the conclusions or discussion or, if they are redundant, consider removing them.

Lines 83 – 90, redundant information. Please remove them while merging sections 1 and 2.

Lines 91 – 98, irrelevant information.

Mention what kind of flood protection strategies exist here.

Lines 123 – 125, not clear.

How is the training done for different SLRs?

Figure 1 can be moved to Methodology and please describe the overall framework and the steps involved there.

Study Area and Data Description
Line 136, what do you mean by environmental effects?

Briefly describe OLU discretization in a few sentences for the benefit of readers.

Mention past flood damages caused by tides.

Briefly mention here about the training dataset.

How was the Delft 3D model validated? How accurate is it?

Rather than points, use lines or a raster to show the areas susceptible to flooding in Figure 2a.

Table 1 – What does main set, hold out set, etc., mean? Please explain when you have a table or figure in the manuscript for a general reader.

What are Shamal winds? Describe in a sentence or two with relevant literature.

It is unclear what you mean by three months. Period of simulation or the computational time itself? What is the significance of choosing this?

Method
Section 4.1, please shorten this to retain relevant general information. The rest can be moved to supplementary so that the readers don’t lose interest.

Describe a spatial fitness metric.

Results
Tabulate and discuss the values of the spatial fitness index for different DL/ML models.

Table 5, please describe the performance of the proposed method compared to the existing methods in the paper.

Please use a zoomed-in figure to illustrate the effect of the representation of the protection measure in the manuscript.

Section 6.2.2, Lines 475 – 478, lacks a clear explanation.

Section 6.2.4, a repeated heading, please be more specific and clearer.

Line 494, how is the ground truth information collected? What is the associated error?

Line 506, summarize in a few sentences about data augmentation.

Figure 9, please move it to the supplementary.

References
Bates, P. D., Horritt, M. S., & Fewtrell, T. J. (2010). A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. Journal of Hydrology, 387(1–2), 33–45. https://doi.org/10.1016/j.jhydrol.2010.03.027
De Almeida, G. A. M., & Bates, P. (2013). Applicability of the local inertial approximation of the shallow water equations to flood modeling. Water Resources Research, 49(8), 4833–4844. https://doi.org/10.1002/wrcr.20366
Li, Z., & Hodges, B. R. (2019). Modeling subgrid-scale topographic effects on shallow marsh hydrodynamics and salinity transport. Advances in Water Resources, 129, 1–15. https://doi.org/10.1016/j.advwatres.2019.05.004
Neal, J., Schumann, G., & Bates, P. (2012). A subgrid channel model for simulating river hydraulics and floodplain inundation over large and data sparse areas. Water Resources Research, 48(11), 1–16. https://doi.org/10.1029/2012WR012514
Nithila Devi, N., & Kuiry, S. N. (2024). A novel local‐inertial formulation representing subgrid scale topographic effects for urban flood simulation. Water Resources Research, 60(5), e2023WR035334.
Sanders, B. F., & Schubert, J. E. (2019). PRIMo: Parallel raster inundation model. Advances in Water Resources, 126, 79–95. https://doi.org/10.1016/J.ADVWATRES.2019.02.007
Citation: https://doi.org/10.5194/egusphere-2025-838-RC1
- AC1:
  'Reply on RC1', Bilal Hassan, 21 May 2025
  Anonymous Referee # 1
  The manuscript proposes a novel deep learning framework for predicting coastal flooding (only tidal). The framework has been applied to the urban areas of Abu Dhabi and San Francisco. The proposed model performs better in comparison to the existing DL methods. The manuscript is free from grammatical errors and is interesting to read. However, the following serious concerns need to be addressed:
  
  Response:
  We thank the reviewer for their positive assessment of our manuscript and for recognizing the novelty and potential of the proposed deep learning framework for coastal flood prediction. We appreciate the encouraging comments regarding the manuscript’s clarity and relevance. We also acknowledge the important concerns raised and will address each of them in detail in the revised manuscript. Below, we outline the actions we plan to take to respond to each point and improve the manuscript accordingly.
  The developed model is only tuned for tidal flooding, and it does not account for the storm surge, which is a major source of coastal flooding. Especially, San Francisco is vulnerable to flooding by hurricanes. The paper doesn’t mention the idea behind the omission of flooding due to storm surge, and why modelling only tidal flooding is crucial in these example cases. The motivation for setting up this kind of prediction system is very unclear.
  
  Response:
  We thank the reviewer for the comment. We acknowledge that the San Francisco area does indeed experience storm surges, especially along the Pacific coastline and to a lesser extent within San Francisco Bay. In Section S1 of the Supplementary Materials (line 16), we justified not using a storm surge model for San Francisco Bay because significant wave heights there are typically about 0.07–0.2 m, an order of magnitude smaller than the 2.0–3.0 m found on the California coast outside San Francisco Bay. We will update our manuscript to clarify that our focus area is San Francisco Bay and its shoreline urban communities, as opposed to the coastal communities facing the Pacific Ocean. We focused here because this area has more extensive urban infrastructure, such as highways, that could suffer flood-induced interruptions affecting the entire Bay Area transport network. Additionally, independent of storms, adding coastal protections (in the form of sea walls or levees) to one part of the Bay may affect other parts of the Bay: Holleman and Stacey, 2014 showed that protecting the southern portion of the Bay may increase maximum water levels by 0.2 in the north of the Bay, and Wang et al., 2018 showed that protection of the southern portions of the West Bay adversely affects the water levels of the East Bay, and vice versa.
  Meanwhile, the Abu Dhabi coastline experiences storm surges caused by sustained northwesterly winds that generally occur in winter (the Shamal winds). They are incorporated into the model by coupling Delft3D with SWAN, a spectral wave model, and storm runup calculations were performed along the Abu Dhabi coastline in the manner described in Chow and Sun, 2022. Details of the Abu Dhabi hydrodynamic models were provided in the Supplementary Materials, but in the next revision we will incorporate more detailed descriptions of the coupled Delft3D and SWAN models for Abu Dhabi into the main body of the manuscript.
  Also, the level of damage that can occur due to tidal flooding (as the storm surge is not included here) needs to be discussed so that the readers can understand the significance of, or the need for, this framework.
  
  Response:
  Thank you for the question. In the revised manuscript, we will explain that our motivation for studying changes in tidal flooding within San Francisco Bay under different protection scenarios comes from observations in previous studies that adding coastal protections to one part of the Bay may affect other parts of the Bay (Holleman and Stacey, 2014; Wang et al., 2018).
  Meanwhile, storm surges and tidal flooding have both been modeled for the Abu Dhabi coastline, since there are effects both from storm surges driven by sustained onshore winds and from tidal interactions between the multiple mangrove islands in the Abu Dhabi area (as shown in Figure 2).
  The manuscript needs to contain the details of how the hydrodynamic model, whose results have been used for training the DL model, is validated. What kind of events were used? How closely they matched the observed flooding also needs to be included in the paper.
  
  Response:
  Thank you for the comment. Previous work validated the hydrodynamic models used in this paper against current tidal gauges, so that the observed water levels (without sea level rise) are well reproduced by the simulation; see Wang et al., 2017 and Chow and Sun, 2022. The simulations were performed with a 0.5 m SLR, which reflects a possible future scenario for San Francisco Bay somewhere between 2050 and 2100, depending on the climate change scenario pathway (between SSP2-4.5 and SSP5-8.5) in the IPCC AR6 report (referenced in Section 3.1.1 of the Supplementary Materials). As such, the simulated flooding without SLR serves as a predicted flooding extent that is not currently observed.
  The paper argues that the developed model is computationally efficient, but it fails to elucidate the details on how fast it is in comparison to the hydrodynamic model. This is very crucial to establish the significance of this work.
  
  Response:
  We appreciate the reviewer’s comment regarding the need for a computational efficiency analysis. In the revised manuscript, we will include a detailed comparison of training time, inference time, and GPU memory usage across all evaluated models. Specifically, our CASPIAN-v2 model requires approximately 23 hours to train on a local machine equipped with an NVIDIA RTX 4090 GPU. Its inference time is around 0.227 seconds per scenario, and the model is lightweight, comprising only 0.38 million parameters.
  In contrast, simulating a single scenario using traditional hydrodynamic models (Delft3D for San Francisco and Delft3D coupled with SWAN for Abu Dhabi) takes approximately 14 hours and 19 hours, respectively, on high-performance computing infrastructure. This means that running all 72 scenarios in the test set (36 for Abu Dhabi and 36 for San Francisco) would take around 49 days using these simulators. In comparison, CASPIAN-v2 can process the same 72 scenarios in roughly 17 seconds.
  We will include these comparisons in a new table and accompanying discussion to clearly demonstrate the significant time and resource savings achieved by our surrogate model, directly addressing the computational motivation highlighted in the manuscript.
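  The arithmetic behind the runtime figures quoted above can be checked with a short sketch. It uses only the numbers stated in this reply and assumes that the simulator runs are executed one after another; the variable and function names are illustrative, and the one-off training cost (about 23 hours) is deliberately left out of the comparison.

```python
# Figures quoted in the reply; names are illustrative, not from the paper.
HOURS_PER_RUN = {
    "San Francisco (Delft3D)": 14.0,
    "Abu Dhabi (Delft3D + SWAN)": 19.0,
}
SCENARIOS_PER_REGION = 36
SURROGATE_SECONDS_PER_SCENARIO = 0.227


def simulator_hours(hours_per_run, n_per_region):
    """Total wall-clock hours if every scenario is simulated serially."""
    return sum(hours * n_per_region for hours in hours_per_run.values())


def surrogate_seconds(n_scenarios, seconds_each):
    """Total inference time of the surrogate for the same scenarios."""
    return n_scenarios * seconds_each


total_hours = simulator_hours(HOURS_PER_RUN, SCENARIOS_PER_REGION)
total_days = total_hours / 24.0
n_scenarios = SCENARIOS_PER_REGION * len(HOURS_PER_RUN)
surrogate_total = surrogate_seconds(n_scenarios, SURROGATE_SECONDS_PER_SCENARIO)
speedup = total_hours * 3600.0 / surrogate_total

print(f"simulators: {total_hours:.0f} h ({total_days:.1f} days)")
print(f"surrogate:  {surrogate_total:.1f} s for {n_scenarios} scenarios")
print(f"inference-only speedup: about {speedup:.0f}x")
```

  Under these assumptions the simulator total is 1188 hours (49.5 days), and the surrogate total is about 16.3 seconds, consistent with the rounded values given in the reply.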
  There is no spatial validation of the over-/under-/correctly predicted flooded areas. For example, see Nithila Devi et al. (2024), where a spatial fitness index has been used to assess the spatial accuracy.
  
  Response:
  We thank the reviewer for the suggestion and for referencing Nithila Devi et al. (2024). We agree that spatial validation adds important insight into model performance. In the revised manuscript, we will reference past tidal gauge validation performed for the San Francisco model (Wang et al., 2017) and the Abu Dhabi model (Chow and Sun, 2022) for vertical water level validation with observations. While no flooding extent has been observed in the case of zero sea level rise, our model attempts to predict the extent of flooding in a future scenario of 0.5m sea level rise.
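  For readers unfamiliar with spatial fit measures, the sketch below shows one widely used flood-extent score: the number of cells that are wet in both maps divided by the number of cells that are wet in either map. It is offered only as an illustration and is not necessarily the index defined in Nithila Devi et al. (2024); the function name, the wet/dry threshold, and the toy arrays are assumptions. Since no flood extent has been observed, the reference map here would be the hydrodynamic simulator output.

```python
import numpy as np


def extent_fit(pred_depth, ref_depth, wet_threshold=0.0):
    """Compare predicted and reference flood extents cell by cell.

    A cell counts as wet when its depth exceeds wet_threshold (an assumed
    cutoff). Returns counts of correctly predicted, over-predicted and
    under-predicted wet cells, plus the fit = hits / (hits + over + under).
    """
    pred_wet = np.asarray(pred_depth) > wet_threshold
    ref_wet = np.asarray(ref_depth) > wet_threshold
    hits = int(np.sum(pred_wet & ref_wet))
    over = int(np.sum(pred_wet & ~ref_wet))
    under = int(np.sum(~pred_wet & ref_wet))
    union = hits + over + under
    fit = hits / union if union else 1.0  # both maps dry: perfect agreement
    return {"hits": hits, "over": over, "under": under, "fit": fit}


# Toy 2x3 depth maps in metres (made-up values for illustration only).
ref = np.array([[0.0, 0.3, 0.5],
                [0.0, 0.0, 0.2]])
pred = np.array([[0.1, 0.4, 0.0],
                 [0.0, 0.0, 0.3]])
scores = extent_fit(pred, ref)
print(scores)
```

  In the toy example, two cells are wet in both maps, one is over-predicted and one is under-predicted, so the fit is 2 / 4 = 0.5.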
  In San Francisco, riverine and pluvial flooding are also present. How is modelling only the tidal flooding sufficient in this case for assessing the flood protection capabilities?
  
  Response:
  Thank you for the comment. Although our San Francisco model does include riverine input from the Sacramento and San Joaquin Rivers, the inflow rates into the Bay were baseline values rather than values for extreme fluvial flood events. We acknowledge that more refined hydrodynamic forcing conditions, covering pluvial and riverine floods as well as extreme storm events, would allow the hydrodynamic model to reflect more extreme flooding. However, the scope of this paper is the use of machine learning as a surrogate for a hydrodynamic model run under more average conditions, but with a sea level rise of 0.5 m and with coastal protection along many different stretches of the shoreline. Our focus on tidal flooding within San Francisco Bay was intended to highlight the unique tidal behavior within the Bay, where the construction of sea walls along certain portions of the Bay shoreline may in fact raise the water level within the Bay by up to 1 m (Holleman and Stacey, 2014).
  Flood protection measures on the coast are very vaguely discussed. It is very difficult to understand what they are, how they function, and how they are incorporated into the hydrodynamic and DL models. It will be nice to discuss with figures showing their placement, functionality and model representation explicitly.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will move more of the information contained in the Supplementary Materials into the main body of the paper, namely Section 3.1.2 for San Francisco, which references previous work that provides more details of the coastal protection assumptions, and Section 3.1.1, which describes the flood protection measures taken for Abu Dhabi and also references previous work that provides more details.
  Regarding the writing style, I find that the paper is more from a computer science background, rather than explaining clearly the application and relevance to the hydraulic modelling application. For example, the manuscript lacks details on training data used, accuracy of the training data, spatial prediction accuracy, specifics of protection measures, etc., as mentioned in the earlier comments.
  
  Response:
  We appreciate the observation of the reviewer regarding the writing style and the need to better align the manuscript with the expectations of the hydraulic modeling community. In the revised version, we will move some of the material presented in the Supplementary Materials into the main body of the manuscript, and we will revisit relevant sections to better highlight the application relevance and hydrodynamic context of our work (with some of the model development and validation material taken from Wang et al., 2018 and Chow and Sun, 2022). Finally, we will improve the explanation of the training data (including its source and accuracy), provide further clarification on spatial prediction accuracy, and include additional details on the definitions of the Operational Landscape Units (OLUs) and the protection scenarios of OLUs that were used. This will ensure that the manuscript clearly communicates both the modeling and application aspects of the study.
  The paper is very wordy, contains much irrelevant information, and lacks crucial information necessary for the understanding of the readers, for example, information on the kinds of damage due to tidal flooding, validation, etc. Please discuss the important information briefly rather than just citing other works.
  
  Response:
  Thank you for the feedback. In the revised manuscript, we will reduce wordiness and remove less relevant content. We will also briefly include key information, such as tidal flood impacts and data validation, to improve clarity and minimize reliance on external citations.
  The following are the specific comments listed for each section,
  
  Introduction and Related Works
  
  The introduction is very wordy and unorganised. A separate section of “related works” is unwarranted and has a lot of redundant information. Therefore, sections 1 and 2 (introduction and related works) need to be shortened and merged into one, ensuring that there is a proper flow of content in them.
  
  Response:
  Thank you for this suggestion. In the revised manuscript, we will merge the Introduction and Related Works sections to create a more concise and cohesive narrative. Redundant content will be removed, and we will ensure a clearer flow that integrates background, motivation, and relevant literature more effectively.
  Lines 17-20 can be shortened.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will shorten lines 17–20 to convey the key message more concisely while maintaining clarity and relevance.
  Lines 26 – 34 need to be shortened, retaining information relevant to the kind of study the paper deals with.
  
  Response:
  Thank you for the suggestion. We will revise lines 26–34 to remove general content and retain only information directly relevant to the scope and objectives of this study.
  Please use terms like computationally expensive/intensive instead of the term “computationally prohibitive” if possible.
  
  Response:
  Thank you for the suggestion. As suggested by the reviewer, we will replace the term “computationally prohibitive” with “computationally expensive” in the revised manuscript, where applicable.
  Mention literature related to computationally efficient modelling approaches (subgrid, parallelisation, simplified models, etc.), such as De Almeida et al. (2013), Neal et al. (2012), Li and Hodges (2019), Sanders and Schubert (2019), Nithila Devi et al. (2024), etc. Discuss such methodologies and bring out the importance of DL/ML/AI in flood forecasting and modelling.
  
  Response:
  Thank you for this helpful recommendation regarding various methods of employing spatially nested models. In the revised manuscript, we will revise the introduction to include a brief discussion of computationally efficient hydrodynamic modeling approaches, as suggested by the reviewer. We wish to note that nesting has not been necessary for the Delft3D portion of the model, as the model grid has been downscaled to about 30 m horizontal resolution for areas of interest such as the city of Abu Dhabi, and the model is capable of modeling dry and wet grid cells. We did, however, employ a nesting method to incorporate SWAN for modeling storm surges and waves for Abu Dhabi. We will highlight the limitations of traditional approaches and clearly motivate the role and advantages of AI-based methods in improving computational efficiency and scalability for flood forecasting and modeling.
  Highlight the need for high-resolution modelling, especially in the complex urban terrain and cite relevant literature.
  
  Response:
  We appreciate the suggestion of the reviewer to emphasize the need for high-resolution modeling, particularly in complex urban terrain. Initially, data from hydrodynamic simulations was in tabular format, which, although containing peak water levels and coordinates, lacked explicit spatial structure and contextual relationships among points. This made it difficult for a learning algorithm to fully understand the spatial dependencies critical in urban flood prediction.
  To address this, we transformed the tabular data into a high-dimensional spatial grid format (1024×1024) suitable for convolutional neural networks (CNNs). This transformation allowed us to encode not just the water level values but also the precise spatial positioning of each point within the urban landscape. By incorporating additional contextual information, we classified points by assigning two pixel values to each grid cell (one indicating proximity to protected OLU boundaries and the other to unprotected areas). This spatial encoding allowed the model to learn fine-grained flood patterns and localized behaviors more effectively within the CNN framework. This high-resolution spatial representation is essential in urban areas, where small variations in terrain or infrastructure can significantly influence inundation patterns. In future work, this framework also supports the integration of additional channels or modalities to further enrich spatial context.
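  A minimal sketch of such a tabular-to-grid transformation is given below. It is an illustration under stated assumptions rather than the preprocessing used in the paper: the column layout (x, y, peak water level, protection flag), the two pixel codes, the nearest-pixel mapping, and the rule that the last point written to a pixel wins are all hypothetical choices.

```python
import numpy as np

GRID_SIZE = 1024          # grid resolution mentioned in the reply
PROTECTED_CODE = 1.0      # hypothetical pixel value for protected points
UNPROTECTED_CODE = 2.0    # hypothetical pixel value for unprotected points


def to_pixel_indices(x, y, grid_size):
    """Map point coordinates to (row, col) indices of the nearest pixel."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_span = max(x.max() - x.min(), 1e-12)   # guard against zero extent
    y_span = max(y.max() - y.min(), 1e-12)
    col = np.rint((x - x.min()) / x_span * (grid_size - 1)).astype(int)
    row = np.rint((y.max() - y) / y_span * (grid_size - 1)).astype(int)
    return row, col


def rasterize_scenario(x, y, peak_level, is_protected, grid_size=GRID_SIZE):
    """Turn a table of points into a condition map and a target map.

    condition: one of two pixel codes per occupied cell (0 = no data)
    target:    peak water level per occupied cell (0 = no data)
    If several points fall into one pixel, the last one written is kept.
    """
    row, col = to_pixel_indices(x, y, grid_size)
    condition = np.zeros((grid_size, grid_size), dtype=np.float32)
    target = np.zeros((grid_size, grid_size), dtype=np.float32)
    flags = np.asarray(is_protected, dtype=bool)
    condition[row, col] = np.where(flags, PROTECTED_CODE, UNPROTECTED_CODE)
    target[row, col] = np.asarray(peak_level, dtype=np.float32)
    return condition, target


# Four made-up points at the corners of a square, on a tiny 4x4 grid.
xs = [0.0, 10.0, 0.0, 10.0]
ys = [0.0, 0.0, 10.0, 10.0]
levels = [0.2, 0.0, 1.5, 0.7]
protected = [True, False, False, True]
condition, target = rasterize_scenario(xs, ys, levels, protected, grid_size=4)
print(condition)
print(target)
```

  With the default `grid_size` the same call yields 1024×1024 arrays, which is the shape a CNN would consume as input and predict as output.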
  Line 56, what do you mean by high dimensionality of outputs?
  
  Response:
  Thank you for pointing this out. By “high dimensionality of outputs”, we refer to the pixel-wise prediction of inundation values over large spatial grids (e.g., 1024×1024), where the model must produce a continuous value for each pixel. This increases the complexity of capturing spatial dependencies and patterns, particularly under varying coastal boundary and sea level rise conditions. In the revised manuscript, we will clarify the sentence accordingly to avoid ambiguity.
  Since this paper expands on a deep-vision based framework, explain briefly about this in the introduction so that the readers from diverse backgrounds can appreciate the work. (Line 57)
  
  Response:
  Thank you for this suggestion. In the revised manuscript, we will include a brief explanation of the deep vision-based framework in the Introduction, as suggested by the reviewer. This addition will help readers from diverse backgrounds understand that our approach involves using convolutional neural networks (CNNs) to learn spatial patterns directly from gridded input data, enabling pixel-wise prediction of flood extent under varying coastal conditions.
  The paper lacks a dedicated and concrete objective description; rather, it mentions the conclusions from the study in the introduction. Please move lines 61 – 81 to the conclusions or discussion or, if they are redundant, consider removing them.
  
  Response:
  We thank the reviewer for the feedback. The content in lines 61–81 was intended to highlight the key contributions of the study, which is a common practice in scientific writing to help readers quickly grasp the novelty and scope of the work. However, we understand that it may have been interpreted as a set of concluding remarks. To address this, we will revise the wording to clearly frame these points as the main contributions and ensure they are presented as part of the objective and motivation in the Introduction. If any parts appear repetitive or better suited for the conclusion, we will move or remove them accordingly to improve the flow and clarity.
  Lines 83 – 90, redundant information. Please remove them while merging sections 1 and 2.
  
  Response:
  Thank you for the suggestion. We will remove lines 83–90 while merging the sections, as they are redundant and do not contribute additional value to the streamlined narrative.
  Lines 91 – 98, irrelevant information.
  
  Response:
  Thank you for the comment. As part of merging and streamlining the Introduction and Related Works sections, we will remove lines 91–98, as they contain information that is not directly relevant to the focus of this study.
  Mention what kind of flood protection strategies exist here.
  
  Response:
  We thank the reviewer for the comment. While structural flood protection strategies, such as levees, seawalls, and revetments, are commonly used, our study does not focus on the specific type or engineering design of these structures. Instead, our primary concern is to evaluate the spatial importance of different shoreline segments in mitigating flood impacts.
  Specifically, the study area in Abu Dhabi includes 17 defined shoreline protection segments, while San Francisco includes 30. Our analysis explores how different combinations of these segments, when protected or left unprotected, influence inland inundation patterns. The objective is to identify which segments are most critical in reducing flood extent and should be prioritized for protection, irrespective of the exact structural form eventually adopted.
  It is also important to note that the Delft3D hydrodynamic simulator used in this study does not inherently distinguish between different types of flood protection structures in its core computations. Whether a barrier is a levee, seawall, or any other type, the model treats it generically as a solid obstruction defined by its geometry and crest elevation. This further justifies our focus on the presence or absence of protection along spatial segments rather than structural details.
  This evaluation is performed using our trained deep learning model, which can rapidly assess the impacts of hundreds of protection patterns across scenarios. In contrast, performing such analysis with traditional hydrodynamic simulators, such as Delft3D or Delft3D+SWAN, would be highly time-consuming; a single simulation can take up to 19 hours. Our surrogate model reduces this computation to a matter of seconds per scenario, enabling efficient and scalable analysis of spatial protection strategies. We will clarify this focus more in the revised manuscript.
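  To give a sense of scale, the sketch below treats each shoreline segment as a binary choice (protected or unprotected), counts how many protection patterns exist for 17 and 30 segments, and draws a few random patterns of the kind a surrogate could screen. The binary encoding and the random sampling are assumptions made for illustration; the reply does not state how the evaluated patterns were selected, and the per-scenario timings are simply the figures quoted above.

```python
import random

N_SEGMENTS = {"Abu Dhabi": 17, "San Francisco": 30}   # from the reply
SIMULATOR_HOURS = 19.0        # "up to 19 hours" per simulation
SURROGATE_SECONDS = 0.227     # quoted inference time per scenario


def n_patterns(n_segments):
    """Number of distinct protect/leave-unprotected combinations."""
    return 2 ** n_segments


def sample_patterns(n_segments, n_samples, seed=0):
    """Draw distinct random patterns; 1 = protected, 0 = unprotected."""
    n_samples = min(n_samples, n_patterns(n_segments))
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_samples:
        seen.add(tuple(rng.randint(0, 1) for _ in range(n_segments)))
    return sorted(seen)


for region, n in N_SEGMENTS.items():
    total = n_patterns(n)
    sim_years = total * SIMULATOR_HOURS / (24 * 365)
    surrogate_hours = total * SURROGATE_SECONDS / 3600
    print(f"{region}: {total} patterns, "
          f"about {sim_years:.0f} simulator-years vs "
          f"{surrogate_hours:.1f} surrogate-hours (serial, upper bound)")

patterns = sample_patterns(N_SEGMENTS["Abu Dhabi"], n_samples=5)
for pattern in patterns:
    print("".join(str(bit) for bit in pattern))
```

  Exhaustive enumeration is shown only to illustrate why the hydrodynamic simulators cannot be used for this kind of screening; in practice a much smaller set of candidate patterns would be evaluated.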
  Lines 123 – 125, not clear.
  
  Response:
  We thank the reviewer for pointing this out. We agree that the lines could be made clearer. In the revised manuscript, we will rephrase this part as follows:
  Despite progress in flood modeling, current approaches often face limitations in computational efficiency and in incorporating dynamic factors such as land subsidence and urban morphology (Xu and Gao, 2024). Some recent studies have introduced hybrid surrogate models that combine machine learning with traditional hydrodynamic simulations for real-time coastal flood forecasting (Xu and Gao, 2024). However, many existing efforts remain focused on specific flood types or single triggering factors, without jointly considering long-term climate impacts such as sea level rise and evolving shoreline adaptation strategies.
  This revised version restores the missing context and clarifies the challenges and gaps that motivate our study.
  How is the training done for different SLRs?
  
  Response:
  We thank the reviewer for the question regarding the training process for different SLR scenarios. Our dataset includes two regions: Abu Dhabi and San Francisco. For Abu Dhabi, we had simulation data available for a single SLR level: 0.5 meters. For San Francisco, we had data corresponding to three SLR levels: 0.5 m, 1.0 m, and 1.5 m. Among these, the 1.0 m SLR scenario had sufficient simulation samples, while the other two had relatively limited data.
  To ensure stable and effective training, we used all available 0.5 m data from Abu Dhabi and the 1.0 m data from San Francisco for the primary training phase. This dataset was split into training, validation, and test sets to train and initially evaluate the model.
  After training, we performed fine-tuning using a subset of the limited samples available for the 0.5 m and 1.5 m SLR scenarios in San Francisco. The purpose of this step was to adapt the model and assess its ability to generalize across different SLR levels. The final evaluation for these additional SLR depths was then conducted on the remaining, unseen samples.
  This approach allowed the model to effectively learn from the most data-rich scenarios while still extending its predictive capability to other SLR conditions through targeted fine-tuning.
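  The two-stage procedure described above can be sketched as a data-partitioning step. This is a minimal illustration, not the authors' pipeline: the record fields (`region`, `slr_m`), the region codes, and the 50 % fine-tuning fraction are all assumptions made for the example.

```python
import random


def split_for_training(scenarios, finetune_fraction=0.5, seed=0):
    """Partition scenarios into a primary pool, a fine-tuning subset, and unseen samples.

    Primary pool: Abu Dhabi at 0.5 m SLR and San Francisco at 1.0 m SLR.
    The remaining San Francisco scenarios (0.5 m and 1.5 m) are divided into a
    fine-tuning subset and a held-out evaluation subset.
    """
    primary, extra = [], []
    for s in scenarios:
        is_primary = (
            (s["region"] == "AD" and s["slr_m"] == 0.5)
            or (s["region"] == "SF" and s["slr_m"] == 1.0)
        )
        (primary if is_primary else extra).append(s)

    rng = random.Random(seed)
    rng.shuffle(extra)
    k = int(len(extra) * finetune_fraction)  # assumed fraction, not from the paper
    return primary, extra[:k], extra[k:]


# Toy scenario list with made-up counts, for illustration only.
toy = (
    [{"region": "AD", "slr_m": 0.5, "id": i} for i in range(6)]
    + [{"region": "SF", "slr_m": 1.0, "id": i} for i in range(6)]
    + [{"region": "SF", "slr_m": 0.5, "id": i} for i in range(4)]
    + [{"region": "SF", "slr_m": 1.5, "id": i} for i in range(4)]
)
primary, finetune, unseen = split_for_training(toy)
print(len(primary), len(finetune), len(unseen))
```

  The primary pool would then be split further into training, validation, and test sets, as described in the response.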
  Figure 1 can be moved to Methodology and please describe the overall framework and the steps involved there.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will move Figure 1 to the Methodology section and enhance the accompanying description to clearly outline the overall framework and the key steps involved in the proposed approach.
  Study Area and Data Description
  
  Line 136, what do you mean by environmental effects?
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will clarify by stating that the environmental impacts are mostly focused on the effects on transportation links, and specifically whether important arterials such as shoreline highways or freeways will be flooded due to SLR.
  Briefly describe OLU discretization in a few sentences for the benefit of readers.
  
  Response:
  Thank you for the question. In the revised manuscript, we will clarify that for San Francisco Bay, the discretization into Operational Landscape Units (OLUs) was taken from SFEI (2019). For Abu Dhabi, the urban coastline was also discretized into OLUs based on the Abu Dhabi Plan 2030 (Abu Dhabi Urban Planning Council, 2007). Maps of the OLU boundaries used for both San Francisco Bay and Abu Dhabi are shown in Section S1 of the Supplementary Materials.
  Mention past flood damages caused by tides.
  
  Response:
  Thank you for the question. Our focus on tidal flooding within San Francisco Bay was intended to highlight the Bay's distinctive tidal behavior, where constructing sea walls along certain portions of the Bay shoreline may in fact raise water levels in other areas of the Bay. For example, Holleman and Stacey (2014) showed that protecting the southern portion of the Bay may increase maximum water levels by 0.2 in the north of the Bay; Wang et al. (2018) showed that protection of the southern portions of the West Bay adversely affects the water levels of the East Bay, and vice versa.
  Briefly mention here about the training dataset.
  
  Response:
  We appreciate the suggestion to provide more information about the training dataset. In the revised manuscript, we will briefly elaborate the dataset in the Study Area and Data Description section as suggested.
  How was the Delft 3D model validated? How accurate is it?
  
  Response:
  For San Francisco Bay, the Delft3D model was adapted from the CoSMoS model originally developed by Barnard (2014) and adapted to San Francisco Bay by Wang et al. (2017). It was validated using tidal gages at 9 locations in and around San Francisco Bay. Pearson correlation coefficients ranged from 0.9862 to 0.9996, while the RMS ratios (the ratio of modelled to measured RMS amplitudes) ranged from 0.973 to 1.027 (please refer to Wang et al., 2017).
  For Abu Dhabi, the Delft3D model was validated using water level data from 196 tidal gage locations throughout the Persian Gulf (as the hydrodynamic model encompassed the entire Gulf in addition to the western portions of the Gulf of Oman). The water levels at these locations were compared with one month’s worth of hydrodynamic simulation, and the resulting absolute RMSE values ranged from 0.0013 to 0.0043 m in the vicinity of Abu Dhabi. More validation details for Abu Dhabi can be found in Chow and Sun (2022).
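  For readers unfamiliar with the three validation statistics quoted above, a minimal sketch of how they can be computed from a modelled and a measured water-level series follows. It assumes both series are sampled at the same times and that the RMS amplitude is taken directly on the supplied values (for example, on demeaned tidal anomalies); the cited validation studies may define these details differently.

```python
import math


def pearson(model, obs):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(model)
    mm, mo = sum(model) / n, sum(obs) / n
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    sm = math.sqrt(sum((m - mm) ** 2 for m in model))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return cov / (sm * so)


def rms(series):
    """Root-mean-square amplitude of a series."""
    return math.sqrt(sum(v * v for v in series) / len(series))


def rms_ratio(model, obs):
    """Ratio of modelled to measured RMS amplitude (1.0 means equal amplitude)."""
    return rms(model) / rms(obs)


def rmse(model, obs):
    """Root-mean-square error between modelled and measured values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


# Synthetic example: a tide-like signal and a model that overestimates it by 2 %.
obs = [math.sin(2 * math.pi * t / 12.42) for t in range(200)]
model = [1.02 * v for v in obs]
print(pearson(model, obs), rms_ratio(model, obs), rmse(model, obs))
```

  A correlation near 1 indicates matching phase, while the RMS ratio and RMSE capture amplitude bias and absolute error respectively, which is why the statistics are reported together.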
  Rather than points, use line or raster to areas susceptible to flooding. Figure 2a.
  
  Response:
  Thank you for the suggestion. As recommended, we will revise Figure 2a to replace the point-based visualization with a line or raster representation that more clearly delineates areas susceptible to flooding. This will improve readability and better convey spatial flood risk patterns across the region.
  Table 1 – What does main set, hold out set, etc., mean? Please explain when you have a table or figure in the manuscript for a general reader.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will add a clear explanation of the terms used in Table 1 for general readers. Specifically, we will clarify that the main set is used for training, validation, and testing of the model; the holdout set includes manually selected challenging scenarios to evaluate model performance under complex protection configurations; and the generalizability set consists of San Francisco data with two additional sea level rise depths, used to fine-tune and assess the model’s ability to generalize to new conditions not originally trained on.
  What are Shamal winds? Describe in a sentence or two with relevant literature.
  
  Response:
  Thank you for the question. While we have referenced Langodan et al. (2023) in Section 3.1.1 in relation to the Shamal winds, we will add an explanation of these winds to the main body of the revised text. While the Persian Gulf does not typically experience tropical cyclones, it is known for northwesterly winds of about 20 m/s that begin suddenly and can be sustained for up to 3-5 days. These are called the Shamal winds (meaning “North” in Arabic) and occur at least 10 times annually, mainly during the winter months (Al Senafi and Anis, 2015; Li et al., 2020).
  It is unclear what you mean by three months. Period of simulation or the computational time itself? What is the significance of choosing this?
  
  Response:
  Thank you for the question. Three months refers to the simulated period, which was chosen for both the San Francisco and the Abu Dhabi simulations. This length balances the need to cover a large number of tidal cycles in each simulation against the computational time and storage space the simulations require.
  Method
  
  Section 4.1, please shorten this to retain relevant general information. The rest can be moved to supplementary so that the readers don’t lose interest.
  
  Response:
  Thank you for the suggestion. As recommended, we will shorten Section 4.1 to retain only the most relevant general information about the model architecture. The more detailed architectural components and implementation specifics will be moved to the supplementary material to maintain reader engagement while keeping the core content concise.
  Describe a spatial fitness metric.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will include a spatial fitness metric to evaluate the alignment between predicted and ground truth flood extents, specifically assessing spatial accuracy in terms of overprediction, underprediction, and correctly matched inundated areas.
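  One common way to express such a metric is an intersection-over-union style fit index on wet/dry masks, reported together with the counts of overpredicted and underpredicted cells. The sketch below is an assumption about what the metric could look like, not the definition the authors will adopt; the wet/dry threshold and the function name are hypothetical.

```python
def spatial_fit(pred_depth, true_depth, wet_threshold=0.0):
    """Compare predicted and reference flood extents cell by cell.

    A cell counts as wet when its depth exceeds `wet_threshold` (assumed value).
    Returns the fit index hits / (hits + over + under) plus the raw counts.
    """
    hits = over = under = 0
    for pred_row, true_row in zip(pred_depth, true_depth):
        for p, t in zip(pred_row, true_row):
            p_wet, t_wet = p > wet_threshold, t > wet_threshold
            if p_wet and t_wet:
                hits += 1      # correctly predicted inundation
            elif p_wet:
                over += 1      # predicted wet, reference dry
            elif t_wet:
                under += 1     # predicted dry, reference wet
    union = hits + over + under
    fit = hits / union if union else 1.0  # both maps entirely dry: perfect match
    return {"fit": fit, "hits": hits, "over": over, "under": under}


# Tiny 2x2 example with depths in metres (made-up values).
pred = [[0.5, 0.2], [0.0, 0.0]]
true = [[0.4, 0.0], [0.3, 0.0]]
result = spatial_fit(pred, true)
print(result)
```

  Because dry cells that both maps agree on are excluded from the denominator, this index is not inflated by the large non-inundated area, unlike pixel-wise accuracy.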
  Results
  
  Tabulate and discuss the values of the spatial fitness index for different DL/ML models.
  
  Response:
  Thank you for this valuable suggestion. In the revised manuscript, we plan to compute and tabulate the spatial fitness metric for our proposed CASPIAN-v2 model, along with any baseline DL/ML models evaluated in the study. We will include the resulting Table in the main text and briefly discuss the findings to highlight how CASPIAN-v2 performs in terms of spatial prediction accuracy relative to alternative approaches.
  Table 5, please describe the performance of the proposed method compared to the existing methods in the paper.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will include a discussion describing the performance of the proposed method relative to existing approaches as summarized in Table 5. However, as most existing ML/DL-based flood prediction methods in the literature have not open-sourced their code and are often designed for point-based or aggregated outputs rather than 2D gridded flood maps, direct benchmarking remains challenging. Nonetheless, we will highlight key differences in methodology and performance trends to contextualize the advantages of CASPIAN-v2.
  Please use a zoomed-in figure to illustrate the effect of the representation of the protection measure in the manuscript.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will include a zoomed-in figure, as part of the results section (e.g., Figures 5 and 6), to illustrate the effect of the representation of the protection measures. This will help highlight how different shoreline protection configurations influence localized inundation patterns and improve the interpretability of the results.
  Section 6.2.2, Lines 475 – 478, lacks a clear explanation.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will clarify that the referenced abrupt changes in local flood behavior relate to spatial variations in inundation caused by diverse OLU protection scenarios, as illustrated in Figure 6. The figure presents results on a holdout set specifically curated to include challenging scenarios, such as configurations where one side of a Bay is protected while the other is not. These cases test the ability of the model to capture complex spatial dynamics. We will also highlight that the model demonstrates strong generalization to such unseen scenarios, even though the training data covers only a small subset of the possible 2^OLUs combinations. This indicates that CASPIAN-v2 learns to infer inundation patterns based on the spatial protection configuration of OLUs.
  Section 6.2.4, a repeated heading, please be more specific and clearer.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will revise the headings within Sections 6.1 and 6.2 to avoid repetition and improve clarity.
  Line 494, how is the ground truth information collected? What is the associated error?
  
  Response:
  Thank you for this important question. In the revised manuscript, we will clarify that the ground truth inundation data used for training and evaluation were derived from physics-based hydrodynamic simulations. We will also provide information on the validation of these simulations and discuss the expected level of error or uncertainty in the simulation outputs, where available. This clarification will be included near Line 494 to improve transparency regarding data quality.
  Line 506, summarize in a few sentences about data augmentation.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will summarize the data augmentation strategy in a few concise sentences, as suggested.
  Figure 9, please move it to the supplementary.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will move Figure 9 to the supplementary material, as recommended.
  References
  
  Bates, P. D., Horritt, M. S., & Fewtrell, T. J. (2010). A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. Journal of Hydrology, 387(1–2), 33–45. https://doi.org/10.1016/j.jhydrol.2010.03.027
  
  De Almeida, G. A. M., & Bates, P. (2013). Applicability of the local inertial approximation of the shallow water equations to flood modeling. Water Resources Research, 49(8), 4833–4844. https://doi.org/10.1002/wrcr.20366
  
  Li, Z., & Hodges, B. R. (2019). Modeling subgrid-scale topographic effects on shallow marsh hydrodynamics and salinity transport. Advances in Water Resources, 129, 1–15. https://doi.org/10.1016/j.advwatres.2019.05.004
  
  Neal, J., Schumann, G., & Bates, P. (2012). A subgrid channel model for simulating river hydraulics and floodplain inundation over large and data sparse areas. Water Resources Research, 48(11), 1–16. https://doi.org/10.1029/2012WR012514
  
  Nithila Devi, N., & Kuiry, S. N. (2024). A novel local‐inertial formulation representing subgrid scale topographic effects for urban flood simulation. Water Resources Research, 60(5), e2023WR035334.
  
  Sanders, B. F., & Schubert, J. E. (2019). PRIMo: Parallel raster inundation model. Advances in Water Resources, 126, 79–95. https://doi.org/10.1016/J.ADVWATRES.2019.02.007
  
  Response:
  Thank you for sharing these references. We will incorporate the relevant literature, as suggested, into the revised manuscript to strengthen the discussion on computationally efficient flood modeling approaches and better contextualize the role of deep learning in this domain.
  Response References:
  Abu Dhabi Urban Planning Council. Plan Abu Dhabi 2030. Urban Structure Framework Plan. 2007. Available online: https://www.ecouncil.ae/PublicationsEn/plan-abu-dhabi-full-version-EN.pdf
  
  Al Senafi, F.; Anis, A. Shamals and climate variability in the Northern Arabian/Persian Gulf from 1973 to 2012. Int. J. Clim. 2015, 35, 4509–4528.
  
  Chow, A.C. and Sun, J., 2022. Combining Sea level rise inundation impacts, tidal flooding and extreme wind events along the Abu Dhabi coastline. Hydrology, 9(8), p.143.
  
  Holleman, R.C. and Stacey, M.T., 2014. Coupling of sea level rise, tidal amplification, and inundation. Journal of Physical Oceanography, 44(5), pp.1439-1455.
  
  IPCC. Summary for Policymakers. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S.L., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M.I., et al., Eds.; Cambridge University Press: Cambridge, UK, 2021.
  
  Li, D.; Anis, A.; Al Senafi, F. Physical response of the Northern Arabian Gulf to winter Shamals. J. Mar. Syst. 2020, 203, 103280.
  
  San Francisco Estuary Institute (SFEI), 2019 (Beagle, J., Lowe, J., McKnight, K., Safran, S., Tam, L. and Szambelan, S.J., authors). San Francisco Bay shoreline adaptation atlas: Working with nature to plan for sea level rise using operational landscape units (No. SFEI publication# 915).
  
  Wang, R.Q., Herdman, L.M., Erikson, L., Barnard, P., Hummel, M. and Stacey, M.T., 2017. Interactions of estuarine shoreline infrastructure with multiscale sea level variability. Journal of Geophysical Research: Oceans, 122(12), pp.9962-9979.
  
  Wang, R.Q., Stacey, M.T., Herdman, L.M.M., Barnard, P.L. and Erikson, L., 2018. The influence of sea level rise on the regional interdependence of coastal infrastructure. Earth's Future, 6(5), pp.677-688.
  
  Citation: https://doi.org/10.5194/egusphere-2025-838-AC1
RC2:
'Comment on egusphere-2025-838', Anonymous Referee #2, 28 Apr 2025
Anonymous Referee #2
Summary
This manuscript presents CASPIAN-v2, a novel deep learning framework for predicting coastal flooding under varying sea level rise (SLR) scenarios and shoreline protection strategies. The authors test their approach on two distinct geographical regions (Abu Dhabi and San Francisco) and demonstrate superior performance compared to state-of-the-art methods. The paper makes several contributions, including a new CNN-based architecture, comprehensive datasets from vulnerable coastal areas, and validation of generalizability across different SLR scenarios.
While the work represents a significant advancement in data-driven coastal flood prediction, several aspects require substantial revision before the manuscript is suitable for publication in HESS.
General Comments
Model Architecture Presentation and Justification

The CASPIAN-v2 architecture is sophisticated but presented in an overly complex manner that hampers understanding. Figure 4 contains too much information without sufficient explanation of design choices. The authors introduce multiple novel components (MARX blocks, SEE blocks) without adequate justification for these specific innovations over simpler alternatives.
Example: In Section 4.1.2, the rationale for integrating ResNeXt blocks with CBAM is not clearly connected to the specific challenges of flood prediction. The authors should explain why this combination addresses spatial dependencies in coastal flooding better than other attention mechanisms.
Recommendation: Provide a simplified schematic of the architecture alongside the detailed one and clearly justify each novel component in relation to the specific requirements of flood prediction tasks.
Computational Efficiency Analysis

A primary motivation for developing surrogate models is the computational burden of physics-based simulations. However, the paper lacks a rigorous comparison of computational efficiency between the proposed model and alternatives.
Example: While Table 5 thoroughly compares prediction accuracy, it contains no information about training times, inference times, or memory requirements. This is particularly important given that lines 45-49 on page 2 emphasize computational burden as a key limitation of current approaches.
Recommendation: Include a comprehensive analysis of computational efficiency, comparing training and inference times across all evaluated models, and explicitly stating the practical time savings compared to hydrodynamic simulations.
Uncertainty Quantification

The model provides deterministic predictions without addressing prediction uncertainties, which is crucial for risk assessment and decision support in coastal planning.
Example: The error maps in Figures 5-7 show where predictions differ from ground truth, but they don't indicate the model's confidence in its predictions, which is essential for reliable risk assessment.
Recommendation: Incorporate uncertainty quantification into the model (e.g., through ensemble methods, Bayesian techniques, or prediction intervals) or thoroughly discuss this limitation and its implications for practical use.
Data Imbalance Handling

Figure 9 reveals severe class imbalance in the dataset, with non-inundated areas predominating. While the authors acknowledge this challenge, they don't adequately explain how their approach specifically addresses it.
Example: Section 7 mentions the imbalance issue but doesn't describe specific techniques beyond the hybrid loss function that were employed to mitigate its effects. It's unclear how the model achieves its reported high accuracy despite this challenge.
Recommendation: Elaborate on specific techniques used to address data imbalance, potentially including specialized sampling strategies, data augmentation approaches tailored to rare flood events, or custom components in the architecture designed for imbalanced spatial data.
Real-world Application Context

The practical utility of the model for coastal planning is asserted but not demonstrated through concrete examples or integration pathways.
Example: The conclusion claims CASPIAN-v2 is "an essential tool for coastal resilience planning" (lines 539-540, page 28), but doesn't provide specific guidance on how planners might integrate this tool with existing decision-making frameworks.
Recommendation: Include a case study or conceptual workflow showing how the model could be integrated into actual coastal planning processes, identifying key stakeholders and decision points where the model adds value.

Specific Comments
Mathematical Notation Inconsistency

The paper uses inconsistent notation, particularly in Section 4.1, making the mathematical formulations difficult to follow.
Example: In Equations 1-5, subscripts sometimes denote indices and sometimes represent different variables entirely. The relationship between tensors across equations is not always clear.
Recommendation: Standardize notation throughout the paper and provide a notation table for reference.
Evaluation Metrics Justification

While the paper employs multiple evaluation metrics, the rationale for these specific choices and their relevance to practical flood prediction applications isn't fully explained.
Example: The threshold exceedance metric (δ > Δ) is introduced in Section 5.4, but its practical significance for flood risk assessment isn't discussed.
Recommendation: Justify the choice of each evaluation metric in terms of its relevance to practical flood prediction applications and decision-making contexts.
Data Preprocessing Details

The data preprocessing section (3.3) lacks sufficient detail on critical aspects that could impact model performance.
Example: The method for mapping inundation coordinates onto a 1024×1024 grid (lines 182-184, page 9) is mentioned but not described in detail, despite this being a critical step that affects the spatial resolution of predictions.
Recommendation: Provide more detailed explanation of preprocessing steps, potentially with illustrative examples showing the transformation from raw data to model inputs.
Ablation Study Presentation

The paper mentions ablation studies in the supplementary material but doesn't adequately summarize key findings in the main text.
Example: Line 341-342 on page 16 mentions "extensive ablation studies" but doesn't present the key insights derived from these experiments.
Recommendation: Include a summary table of ablation study results in the main text, highlighting the contribution of each novel component to overall performance.
Figure Clarity and Interpretation

Several figures are complex and difficult to interpret, with insufficient explanation in captions and text.
Example: Figure 7 compares model predictions across different approaches, but the subtle differences between models are difficult to discern with the chosen color scale and presentation format.
Recommendation: Improve figure clarity through better color scales, simplified presentations, or additional explanatory elements like difference maps to highlight where each model performs better or worse.
Summary Assessment
This manuscript presents valuable research on deep learning for coastal flood prediction, with promising results that could significantly advance the field. However, major revisions are needed to address issues related to model architecture presentation, computational efficiency analysis, uncertainty quantification, data imbalance handling, and real-world application context.
With these improvements, the paper has the potential to make a significant contribution to both the technical literature on deep learning for environmental modeling and practical coastal planning applications.
Citation: https://doi.org/10.5194/egusphere-2025-838-RC2
- AC2:
  'Reply on RC2', Bilal Hassan, 21 May 2025
  Anonymous Referee # 2
  Summary
  
  This manuscript presents CASPIAN-v2, a novel deep learning framework for predicting coastal flooding under varying sea level rise (SLR) scenarios and shoreline protection strategies. The authors test their approach on two distinct geographical regions (Abu Dhabi and San Francisco) and demonstrate superior performance compared to state-of-the-art methods. The paper makes several contributions, including a new CNN-based architecture, comprehensive datasets from vulnerable coastal areas, and validation of generalizability across different SLR scenarios.
  
  While the work represents a significant advancement in data-driven coastal flood prediction, several aspects require substantial revision before the manuscript is suitable for publication in HESS.
  
  Response:
  We thank the reviewer for their thoughtful summary and for recognizing the key contributions of our work. At the same time, we fully understand that several areas require further refinement. In the revised manuscript, we will address all raised concerns in detail to ensure the work meets the standards of HESS. We believe these revisions will significantly improve clarity, rigor, and impact of our work.
  General Comments
  
  Model Architecture Presentation and Justification
  
  The CASPIAN-v2 architecture is sophisticated but presented in an overly complex manner that hampers understanding. Figure 4 contains too much information without sufficient explanation of design choices. The authors introduce multiple novel components (MARX blocks, SEE blocks) without adequate justification for these specific innovations over simpler alternatives.
  
  Example: In Section 4.1.2, the rationale for integrating ResNeXt blocks with CBAM is not clearly connected to the specific challenges of flood prediction. The authors should explain why this combination addresses spatial dependencies in coastal flooding better than other attention mechanisms.
  
  Recommendation: Provide a simplified schematic of the architecture alongside the detailed one and clearly justify each novel component in relation to the specific requirements of flood prediction tasks.
  
  Response:
  We thank the reviewer for the helpful feedback regarding the clarity and justification of the CASPIAN-v2 architecture. In the revised manuscript, we plan to redraw Figure 4 to include a simplified schematic of the overall encoder–bottleneck–decoder structure, alongside the existing detailed version. This will help improve readability and guide the reader through the hierarchical design.
  Additionally, to better justify the inclusion of the MARX and SEE blocks, we will update the ablation study to show model performance when these components are entirely removed or replaced with alternative modules. While the current ablation study already includes results with different numbers of MARX and SEE blocks, we recognize the importance of evaluating their overall necessity. We have also experimented with other configurations in the bottleneck, including standard ResNet, ResNeXt blocks, and Squeeze-and-Excitation modules. These additional comparisons will be included to demonstrate why the selected components (ResNeXt blocks with CBAM) in the bottleneck are more effective in capturing spatial dependencies and multi-scale interactions critical for accurate flood prediction.
  Computational Efficiency Analysis
  
  A primary motivation for developing surrogate models is the computational burden of physics-based simulations. However, the paper lacks a rigorous comparison of computational efficiency between the proposed model and alternatives.
  
  Example: While Table 5 thoroughly compares prediction accuracy, it contains no information about training times, inference times, or memory requirements. This is particularly important given that lines 45-49 on page 2 emphasize computational burden as a key limitation of current approaches.
  
  Recommendation: Include a comprehensive analysis of computational efficiency, comparing training and inference times across all evaluated models, and explicitly stating the practical time savings compared to hydrodynamic simulations.
  
  Response:
  We appreciate the reviewer’s comment regarding the need for a computational efficiency analysis. In the revised manuscript, we will include a detailed comparison of training time, inference time, and GPU memory usage across all evaluated models. Specifically, our CASPIAN-v2 model requires approximately 23 hours to train on a local machine equipped with an NVIDIA RTX 4090 GPU. Its inference time is around 0.227 seconds per scenario, and the model is lightweight, comprising only 0.38 million parameters.
  In contrast, simulating a single scenario using traditional hydrodynamic models (Delft3D for San Francisco and Delft3D coupled with SWAN for Abu Dhabi) takes approximately 14 hours and 19 hours, respectively, on high-performance computing infrastructure. This means that running all 72 scenarios in the test set (36 for Abu Dhabi and 36 for San Francisco) would take around 49 days using these simulators. In comparison, CASPIAN-v2 can process the same 72 scenarios in roughly 17 seconds.
  We will include these comparisons in a new table and accompanying discussion to clearly demonstrate the significant time and resource savings achieved by our surrogate model, directly addressing the computational motivation highlighted in the manuscript.
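  The arithmetic behind these figures can be reproduced with a short calculation. The sketch uses only the numbers quoted in this response and ignores the one-off training cost of about 23 hours; the variable names are illustrative.

```python
# Figures quoted in the response.
HOURS_PER_RUN_SF = 14      # Delft3D, San Francisco
HOURS_PER_RUN_AD = 19      # Delft3D + SWAN, Abu Dhabi
N_SF = N_AD = 36           # test scenarios per region
INFERENCE_SECONDS = 0.227  # CASPIAN-v2 inference time per scenario

# Total simulator time if all 72 runs are executed sequentially.
sim_hours = N_SF * HOURS_PER_RUN_SF + N_AD * HOURS_PER_RUN_AD
sim_days = sim_hours / 24

# Total surrogate inference time for the same 72 scenarios.
surrogate_seconds = (N_SF + N_AD) * INFERENCE_SECONDS

# Ratio of simulator time to surrogate time, excluding training.
speedup = sim_hours * 3600 / surrogate_seconds

print(f"simulators: {sim_hours} h ({sim_days:.1f} days)")
print(f"surrogate:  {surrogate_seconds:.1f} s")
print(f"speed-up:   about {speedup:,.0f}x")
```

  The totals are consistent with the roughly 49 days and the order of 17 seconds stated above, a speed-up of more than five orders of magnitude at inference time.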
  Uncertainty Quantification
  
  The model provides deterministic predictions without addressing prediction uncertainties, which is crucial for risk assessment and decision support in coastal planning.
  
  Example: The error maps in Figures 5-7 show where predictions differ from ground truth, but they don't indicate the model's confidence in its predictions, which is essential for reliable risk assessment.
  
  Recommendation: Incorporate uncertainty quantification into the model (e.g., through ensemble methods, Bayesian techniques, or prediction intervals) or thoroughly discuss this limitation and its implications for practical use.
  
  Response:
  We appreciate the valuable feedback from the reviewer regarding uncertainty quantification. We fully agree that predictive uncertainty is essential for informed decision-making in coastal risk management. While we previously used Grad-CAM to provide qualitative insights into the spatial regions influencing the model’s predictions, we acknowledge that this approach supports interpretability but does not quantify uncertainty.
  To address this comment, in the revised manuscript we plan to incorporate a predictive uncertainty estimation method. Specifically, we propose to apply Monte Carlo Dropout by enabling dropout at inference time to generate multiple outputs for the same input. Alternatively, we may implement deep ensembles, where multiple independently trained models are used to derive a distribution over predictions. In addition, we will explore the use of random cutout–based test-time augmentation, where different variants of the same input are passed through the model, and the variability in outputs is used to estimate uncertainty. This method leverages our existing data augmentation strategy and provides a simple, architecture-agnostic way to probe model robustness.
  All three approaches will help generate confidence maps alongside the predicted inundation, offering a more robust basis for risk-sensitive planning. We will include the resulting uncertainty maps and a corresponding discussion in the revised manuscript.
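  Whichever of the three sampling strategies is used, the aggregation step is the same: several predicted depth maps for one input are reduced to a per-cell mean and spread. A minimal sketch of that step follows, assuming the sampled maps are already available as nested lists; how the samples are generated (dropout passes, ensemble members, or augmented inputs) is outside the sketch, and the function name is hypothetical.

```python
import statistics


def aggregate_predictions(samples):
    """Reduce several depth maps for the same scenario to mean and std maps.

    `samples` is a list of 2D maps (lists of rows) with identical shape, e.g.
    one map per stochastic forward pass or per ensemble member.
    """
    n_rows, n_cols = len(samples[0]), len(samples[0][0])
    mean_map = [[0.0] * n_cols for _ in range(n_rows)]
    std_map = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            values = [s[i][j] for s in samples]
            mean_map[i][j] = statistics.fmean(values)
            # Population standard deviation across samples as the spread.
            std_map[i][j] = statistics.pstdev(values)
    return mean_map, std_map


# Two made-up samples on a 1x2 grid (depths in metres).
samples = [[[1.0, 0.0]], [[3.0, 0.0]]]
mean_map, std_map = aggregate_predictions(samples)
print(mean_map, std_map)
```

  Cells where the samples disagree receive a large standard deviation, which is what a confidence map would display next to the predicted inundation.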
  Data Imbalance Handling
  
  Figure 9 reveals severe class imbalance in the dataset, with non-inundated areas predominating. While the authors acknowledge this challenge, they don't adequately explain how their approach specifically addresses it.
  
  Example: Section 7 mentions the imbalance issue but doesn't describe specific techniques beyond the hybrid loss function that were employed to mitigate its effects. It's unclear how the model achieves its reported high accuracy despite this challenge.
  
  Recommendation: Elaborate on specific techniques used to address data imbalance, potentially including specialized sampling strategies, data augmentation approaches tailored to rare flood events, or custom components in the architecture designed for imbalanced spatial data.
  
  Response:
  Thank you for highlighting this important point. We acknowledge the significant class imbalance in the dataset, as shown in Figure 9, with non-inundated points dominating the spatial distribution. While we did not apply explicit sampling or augmentation strategies to overcome this, our current formulation incorporates a hybrid loss function and custom architectural components (e.g., MARX and SEE blocks) designed to learn and prioritize spatially meaningful features, particularly around protection boundaries and inundation-prone regions.
  As further evidenced in Figure 10, the model tends to focus more on areas near unprotected OLUs, which are more likely to flood. This behavior suggests that, despite the numerical dominance of non-inundated points, the network implicitly assigns greater representational focus to the spatial characteristics associated with flood-prone zones. In the revised manuscript, we will clarify this point and also briefly discuss possible future enhancements, such as spatial weighting schemes or scenario-focused data balancing, to further address the imbalance in flood prediction tasks.
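One of the possible future enhancements named above, a spatial weighting scheme, could look roughly like the following sketch. It is not the hybrid loss used in the manuscript; the function name `weighted_mae`, the weight given to inundated cells, and the wet threshold are hypothetical choices for illustration.

```python
def weighted_mae(pred, target, wet_weight=5.0, wet_threshold=0.0):
    """Mean absolute error in which inundated target cells count more.

    pred, target: 2D lists of flood depths with identical shape.
    wet_weight and wet_threshold are illustrative assumptions.
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            # Cells that are flooded in the target get a larger weight.
            w = wet_weight if t > wet_threshold else 1.0
            weighted_error += w * abs(p - t)
            weight_sum += w
    return weighted_error / weight_sum


# Toy 2x3 depth maps (metres): only one target cell is inundated.
target = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
pred = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.4]]

plain = weighted_mae(pred, target, wet_weight=1.0)  # ordinary MAE
weighted = weighted_mae(pred, target, wet_weight=5.0)
```

In this toy case the ordinary MAE is about 0.1 while the weighted version is about 0.3, so an error on the single flooded cell is no longer diluted by the five dry cells.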
  Real-world Application Context
  
  The practical utility of the model for coastal planning is asserted but not demonstrated through concrete examples or integration pathways.
  
  Example: The conclusion claims CASPIAN-v2 is "an essential tool for coastal resilience planning" (lines 539-540, page 28), but doesn't provide specific guidance on how planners might integrate this tool with existing decision-making frameworks.
  
  Recommendation: Include a case study or conceptual workflow showing how the model could be integrated into actual coastal planning processes, identifying key stakeholders and decision points where the model adds value.
  
  Response:
  Thank you for this insightful comment. In the revised manuscript, we will include a workflow explanation describing how CASPIAN-v2 can be applied in real-world coastal planning contexts. Specifically, we will outline how the output of the model, such as scenario-based flood maps and uncertainty estimates, can support planners, engineers, and policymakers at various decision points. We will also describe how the model complements traditional hydrodynamic simulations by enabling rapid scenario analysis, which is particularly useful for emergency preparedness and evaluating the impacts of different shoreline protection strategies. This addition will clarify the practical value of the work and strengthen its positioning as a decision-support tool.
  Specific Comments
  
  Mathematical Notation Inconsistency
  
  The paper uses inconsistent notation, particularly in Section 4.1, making the mathematical formulations difficult to follow.
  
  Example: In Equations 1-5, subscripts sometimes denote indices and sometimes represent different variables entirely. The relationship between tensors across equations is not always clear.
  
  Recommendation: Standardize notation throughout the paper and provide a notation table for reference.
  
  Response:
  Thank you for pointing out the issue with notation clarity. In the revised manuscript, we will standardize the mathematical notation, ensuring consistent use of subscripts, tensor dimensions, and variable references. Additionally, we will include a notation table to clearly define all symbols and improve readability.
  Evaluation Metrics Justification
  
  While the paper employs multiple evaluation metrics, the rationale for these specific choices and their relevance to practical flood prediction applications isn't fully explained.
  
  Example: The threshold exceedance metric (δ > Δ) is introduced in Section 5.4, but its practical significance for flood risk assessment isn't discussed.
  
  Recommendation: Justify the choice of each evaluation metric in terms of its relevance to practical flood prediction applications and decision-making contexts.
  
  Response:
  Thank you for highlighting this important point. In the revised manuscript, we will elaborate on the rationale behind our choice of evaluation metrics. While we included multiple metrics to comprehensively assess model performance from different perspectives, we now recognize the need to clarify what each metric represents and how it relates to practical flood prediction and decision-making.
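To make the threshold exceedance idea concrete, the sketch below assumes that δ is the per-cell absolute error in predicted depth and Δ is a tolerance chosen by the analyst; the exact definition in Section 5.4 is not reproduced in this discussion, so this reading and the name `exceedance_rate` are assumptions.

```python
def exceedance_rate(pred, target, delta_threshold):
    """Fraction of grid cells whose absolute depth error exceeds the threshold."""
    errors = [
        abs(p - t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    exceeding = sum(1 for e in errors if e > delta_threshold)
    return exceeding / len(errors)


# Toy 2x2 depth maps in metres.
target = [[0.0, 0.2], [1.0, 0.5]]
pred = [[0.0, 0.5], [0.5, 0.5]]
rate = exceedance_rate(pred, target, delta_threshold=0.25)
```

Read this way, the metric answers a planning question directly: over what share of the area is the prediction off by more than a depth that would change a decision?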
  Data Preprocessing Details
  
  The data preprocessing section (3.3) lacks sufficient detail on critical aspects that could impact model performance.
  
  Example: The method for mapping inundation coordinates onto a 1024×1024 grid (lines 182-184, page 9) is mentioned but not described in detail, despite this being a critical step that affects the spatial resolution of predictions.
  
  Recommendation: Provide more detailed explanation of preprocessing steps, potentially with illustrative examples showing the transformation from raw data to model inputs.
  
  Response:
  Thank you for this observation. Due to space constraints in the main manuscript, we provided a detailed description of the data preprocessing steps in Section S2 of the supplementary material. This section outlines each stage of the transformation from raw simulation outputs to the standardized 1024×1024 input grids used by the model. We will ensure that this is clearly referenced in Section 3.3 of the revised manuscript.
  Ablation Study Presentation
  
  The paper mentions ablation studies in the supplementary material but doesn't adequately summarize key findings in the main text.
  
  Example: Lines 341-342 on page 16 mention "extensive ablation studies" but don't present the key insights derived from these experiments.
  
  Recommendation: Include a summary table of ablation study results in the main text, highlighting the contribution of each novel component to overall performance.
  
  Response:
  Thank you for this helpful suggestion. In the revised manuscript, we will include a summary table of the ablation study results in the main text, highlighting the performance impact of each novel component. This will provide a clearer understanding of their individual contributions and key insights from the experiments currently detailed in the supplementary material.
  Figure Clarity and Interpretation
  
  Several figures are complex and difficult to interpret, with insufficient explanation in captions and text.
  
  Example: Figure 7 compares model predictions across different approaches, but the subtle differences between models are difficult to discern with the chosen color scale and presentation format.
  
  Recommendation: Improve figure clarity through better color scales, simplified presentations, or additional explanatory elements like difference maps to highlight where each model performs better or worse.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will improve the clarity of key figures, particularly Figure 7, by adopting more perceptually distinct color scales and adding explanatory elements to better highlight variations between model predictions. We will also enhance the figure captions and in-text descriptions to guide interpretation and ensure that the visual comparisons are more accessible and informative.
  Summary Assessment
  
  This manuscript presents valuable research on deep learning for coastal flood prediction, with promising results that could significantly advance the field. However, major revisions are needed to address issues related to model architecture presentation, computational efficiency analysis, uncertainty quantification, data imbalance handling, and real-world application context.
  
  With these improvements, the paper has the potential to make a significant contribution to both the technical literature on deep learning for environmental modeling and practical coastal planning applications.
  
  Response:
  We sincerely thank the reviewer for their encouraging assessment and constructive feedback. We appreciate the recognition of the potential impact of our work and fully acknowledge the areas that require improvement. In response, we plan to undertake substantial revisions to enhance the clarity of the model architecture presentation, provide a comprehensive computational efficiency analysis, incorporate or discuss uncertainty quantification, clarify our approach to data imbalance handling, and better contextualize the results for real-world coastal planning applications. We are confident that these revisions will strengthen the manuscript and align it more closely with the expectations of both technical and applied research communities.
  
  Citation: https://doi.org/10.5194/egusphere-2025-838-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-838', Anonymous Referee #1, 25 Apr 2025
The manuscript proposes a novel deep learning framework for predicting coastal flooding (only tidal). The framework has been applied to the urban areas of Abu Dhabi and San Francisco. The proposed model performs better in comparison to the existing DL methods. The manuscript is free from grammatical errors and is interesting to read. However, the following serious concerns need to be addressed:
The developed model is only tuned for tidal flooding, and it does not account for the storm surge, which is a major source of coastal flooding. Especially, San Francisco is vulnerable to flooding by hurricanes. The paper doesn’t mention the idea behind the omission of flooding due to storm surge, and why modelling only tidal flooding is crucial in these example cases. The motivation for setting up this kind of prediction system is very unclear.

Also, the level of damage that can occur due to tidal flooding (as storm surge is not included here) needs to be discussed so that readers can understand the significance of, or the need for, this framework.

The manuscript needs to contain the details of how the hydrodynamic model, whose results have been used for training the DL model, is validated. What kind of events were used? How closely they matched the observed flooding also needs to be included in the paper.

The paper argues that the developed model is computationally efficient, but it fails to elucidate the details on how fast it is in comparison to the hydrodynamic model. This is very crucial to establish the significance of this work.

There is no spatial validation of the over-/under-/correctly predicted flooded areas. For example, see Nithila Devi et al. (2024), where a spatial fitness index has been used to assess the spatial accuracy.

In San Francisco, riverine and pluvial flooding are also present. How is modelling only the tidal flooding sufficient in this case for assessing the flood protection capabilities?

Flood protection measures on the coast are very vaguely discussed. It is very difficult to understand what they are, how they function, and how they are incorporated into the hydrodynamic and DL models. It will be nice to discuss with figures showing their placement, functionality and model representation explicitly.

Regarding the writing style, I find that the paper is more from a computer science background, rather than explaining clearly the application and relevance to the hydraulic modelling application. For example, the manuscript lacks details on training data used, accuracy of the training data, spatial prediction accuracy, specifics of protection measures, etc., as mentioned in the earlier comments.

The paper is very wordy, contains much irrelevant information, and lacks crucial information necessary for the readers' understanding, for example, information on the kind of damage caused by tidal flooding, validation, etc. Please discuss the important information briefly rather than just citing other works.

The following are the specific comments listed for each section:
Introduction and Related Works
The introduction is very wordy and unorganised. A separate section of “related works” is unwarranted and has a lot of redundant information. Therefore, sections 1 and 2 (introduction and related works) need to be shortened and merged into one, ensuring that there is a proper flow of content in them.

Lines 17-20 can be shortened.

Lines 26 – 34 need to be shortened, retaining information relevant to the kind of study the paper deals with.

Please use terms like computationally expensive/intensive instead of the term “computationally prohibitive” if possible.

Mention literature related to computationally efficient modelling approaches (subgrid, parallelisation, simplified models, etc.), such as De Almeida et al. (2013), Neal et al. (2012), Li and Hodges (2019), Sanders and Schubert (2019), Nithila Devi et al. (2024), etc. Discuss such methodologies and bring out the importance of DL/ML/AI in flood forecasting and modelling.

Highlight the need for high-resolution modelling, especially in the complex urban terrain and cite relevant literature.

Line 56, what do you mean by high dimensionality of outputs?

Since this paper expands on a deep-vision based framework, explain briefly about this in the introduction so that the readers from diverse backgrounds can appreciate the work. (Line 57)

The paper lacks a dedicated and concrete objective description; rather, it mentions the conclusions from the study in the introduction. Please move lines 61 – 81 to the conclusions or discussion or, if they are redundant, consider removing them.

Lines 83 – 90, redundant information. Please remove them while merging sections 1 and 2.

Lines 91 – 98, irrelevant information.

Mention what kind of flood protection strategies exist here.

Lines 123 – 125, not clear.

How is the training done for different SLRs?

Figure 1 can be moved to Methodology and please describe the overall framework and the steps involved there.

Study Area and Data Description
Line 136, what do you mean by environmental effects?

Briefly describe OLU discretization in a few sentences for the benefit of readers.

Mention past flood damages caused by tides.

Briefly mention here about the training dataset.

How was the Delft 3D model validated? How accurate is it?

In Figure 2a, use lines or a raster rather than points to show the areas susceptible to flooding.

Table 1 – What does main set, hold out set, etc., mean? Please explain when you have a table or figure in the manuscript for a general reader.

What are Shamal winds? Describe in a sentence or two with relevant literature.

It is unclear what you mean by three months. Period of simulation or the computational time itself? What is the significance of choosing this?

Method
Section 4.1, please shorten this to retain relevant general information. The rest can be moved to supplementary so that the readers don’t lose interest.

Describe a spatial fitness metric.

Results
Tabulate and discuss the values of the spatial fitness index for different DL/ML models.

Table 5, please describe the performance of the proposed method compared to the existing methods in the paper.

Please use a zoomed-in figure to illustrate the effect of the representation of the protection measure in the manuscript.

Section 6.2.2, Lines 475 – 478, lacks a clear explanation.

Section 6.2.4, a repeated heading, please be more specific and clearer.

Line 494, how is the ground truth information collected? What is the associated error?

Line 506, summarize in a few sentences about data augmentation.

Figure 9, please move it to the supplementary.

References
Bates, P. D., Horritt, M. S., & Fewtrell, T. J. (2010). A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. Journal of Hydrology, 387(1–2), 33–45. https://doi.org/10.1016/j.jhydrol.2010.03.027
De Almeida, G. A. M., & Bates, P. (2013). Applicability of the local inertial approximation of the shallow water equations to flood modeling. Water Resources Research, 49(8), 4833–4844. https://doi.org/10.1002/wrcr.20366
Li, Z., & Hodges, B. R. (2019). Modeling subgrid-scale topographic effects on shallow marsh hydrodynamics and salinity transport. Advances in Water Resources, 129, 1–15. https://doi.org/10.1016/j.advwatres.2019.05.004
Neal, J., Schumann, G., & Bates, P. (2012). A subgrid channel model for simulating river hydraulics and floodplain inundation over large and data sparse areas. Water Resources Research, 48(11), 1–16. https://doi.org/10.1029/2012WR012514
Nithila Devi, N., & Kuiry, S. N. (2024). A novel local‐inertial formulation representing subgrid scale topographic effects for urban flood simulation. Water Resources Research, 60(5), e2023WR035334.
Sanders, B. F., & Schubert, J. E. (2019). PRIMo: Parallel raster inundation model. Advances in Water Resources, 126, 79–95. https://doi.org/10.1016/J.ADVWATRES.2019.02.007
Citation: https://doi.org/10.5194/egusphere-2025-838-RC1
- AC1:
  'Reply on RC1', Bilal Hassan, 21 May 2025
  Anonymous Referee #1
  The manuscript proposes a novel deep learning framework for predicting coastal flooding (only tidal). The framework has been applied to the urban areas of Abu Dhabi and San Francisco. The proposed model performs better in comparison to the existing DL methods. The manuscript is free from grammatical errors and is interesting to read. However, the following serious concerns need to be addressed:
  
  Response:
  We thank the reviewer for their positive assessment of our manuscript and for recognizing the novelty and potential of the proposed deep learning framework for coastal flood prediction. We appreciate the encouraging comments regarding the manuscript’s clarity and relevance. We also acknowledge the important concerns raised and will address each of them in detail in the revised manuscript. Below, we outline the actions we plan to take to respond to each point and improve the manuscript accordingly.
  The developed model is only tuned for tidal flooding, and it does not account for the storm surge, which is a major source of coastal flooding. Especially, San Francisco is vulnerable to flooding by hurricanes. The paper doesn’t mention the idea behind the omission of flooding due to storm surge, and why modelling only tidal flooding is crucial in these example cases. The motivation for setting up this kind of prediction system is very unclear.
  
  Response:
  We thank the reviewer for the comment. We acknowledge that the San Francisco area does experience storm surges, especially along the Pacific coastline and to a lesser extent within San Francisco Bay. In Section S1 (line 16) of the Supplementary Materials, we justified not using a storm surge model for San Francisco Bay on the grounds that significant wave heights there are typically about 0.07-0.2 m, an order of magnitude smaller than the 2.0-3.0 m found on the California coast outside the Bay. We will update the manuscript to clarify that our focus area is San Francisco Bay and its shoreline urban communities, as opposed to the coastal communities facing the Pacific Ocean. We focus on this area because it has more extensive urban infrastructure, such as highways, where flood-induced interruptions could affect the entire Bay Area transport network. Additionally, independent of storms, adding coastal protections (in the form of sea walls or levees) to one part of the Bay may affect other parts of the Bay: Holleman and Stacey (2014) showed that protecting the southern portion of the Bay may increase maximum water levels by 0.2 in the north of the Bay, and Wang et al. (2018) showed that protecting the southern portions of the West Bay adversely affects water levels in the East Bay, and vice versa.
  Meanwhile, the Abu Dhabi coastline experiences storm surges caused by sustained northwesterly winds that generally occur in winter (the Shamal winds). These are incorporated into the model by coupling Delft3D with SWAN, a spectral wave model, and storm runup calculations were performed along the Abu Dhabi coastline in the manner described in Chow and Sun (2022). Details of the Abu Dhabi hydrodynamic models were provided in the Supplementary Materials, but in the next revision we will incorporate more detailed descriptions of the coupled Delft3D and SWAN models for Abu Dhabi into the main body of the manuscript.
  Also, the level of damage that can occur due to tidal flooding (as storm surge is not included here) needs to be discussed so that readers can understand the significance of, or the need for, this framework.
  
  Response:
  Thank you for the question. We will state in the revised manuscript that our motivation for studying changes in tidal flooding within San Francisco Bay under different protection scenarios comes from observations in previous studies that adding coastal protections to one part of the Bay may affect other parts of the Bay (Holleman and Stacey, 2014; Wang et al., 2018).
  Meanwhile, both storm surges and tidal flooding have been modeled for the Abu Dhabi coastline, since there are effects both from storm surges driven by sustained onshore winds and from tidal interactions between the multiple mangrove islands in the Abu Dhabi area (as shown in Figure 2).
  The manuscript needs to contain the details of how the hydrodynamic model, whose results have been used for training the DL model, is validated. What kind of events were used? How closely they matched the observed flooding also needs to be included in the paper.
  
  Response:
  Thank you for the comment. Previous work has validated the hydrodynamic models used in this paper against current tidal gauges, so that the observed water levels (without sea level rise) are well reproduced by the simulations; see Wang et al. (2017) and Chow and Sun (2022). The simulations were performed with a 0.5 m SLR, which reflects a possible future scenario for San Francisco Bay somewhere between 2050 and 2100, depending on the climate change scenario pathway (between SSP2-4.5 and SSP5-8.5) from the IPCC AR6 report (referenced in Section 3.1.1 of the Supplementary Materials). As such, the simulated flooding under this SLR scenario serves as a predicted flooding extent that is not currently observed.
  The paper argues that the developed model is computationally efficient, but it fails to elucidate the details on how fast it is in comparison to the hydrodynamic model. This is very crucial to establish the significance of this work.
  
  Response:
  We appreciate the reviewer’s comment regarding the need for a computational efficiency analysis. In the revised manuscript, we will include a detailed comparison of training time, inference time, and GPU memory usage across all evaluated models. Specifically, our CASPIAN-v2 model requires approximately 23 hours to train on a local machine equipped with an NVIDIA RTX 4090 GPU. Its inference time is around 0.227 seconds per scenario, and the model is lightweight, comprising only 0.38 million parameters.
  In contrast, simulating a single scenario using traditional hydrodynamic models (Delft3D for San Francisco and Delft3D coupled with SWAN for Abu Dhabi) takes approximately 14 hours and 19 hours, respectively, on high-performance computing infrastructure. This means that running all 72 scenarios in the test set (36 for Abu Dhabi and 36 for San Francisco) would take around 49 days using these simulators. In comparison, CASPIAN-v2 can process the same 72 scenarios in roughly 17 seconds.
  We will include these comparisons in a new table and accompanying discussion to clearly demonstrate the significant time and resource savings achieved by our surrogate model, directly addressing the computational motivation highlighted in the manuscript.
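The arithmetic behind these totals can be checked with a few lines. The per-scenario figures are the ones quoted in this response; the assumption that the simulator runs execute one after another, and the variable names, are ours.

```python
# Per-scenario simulator runtimes in hours, as quoted in the response.
HOURS_PER_SCENARIO = {"San Francisco": 14, "Abu Dhabi": 19}
SCENARIOS_PER_REGION = 36   # test-set scenarios per region
INFERENCE_SECONDS = 0.227   # surrogate inference time per scenario

# Assumes the simulator runs are executed sequentially.
simulator_hours = sum(
    hours * SCENARIOS_PER_REGION for hours in HOURS_PER_SCENARIO.values()
)
simulator_days = simulator_hours / 24

total_scenarios = SCENARIOS_PER_REGION * len(HOURS_PER_SCENARIO)
surrogate_seconds = INFERENCE_SECONDS * total_scenarios

# Ratio of wall-clock times; the one-off training cost is not included.
speedup = simulator_hours * 3600 / surrogate_seconds
```

This gives 1188 simulator hours (49.5 days) against about 16.3 seconds for the surrogate, a ratio on the order of 2.6 × 10^5 at inference time. The roughly 23 hours of one-off training would need to be counted separately in any end-to-end comparison.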
  There is no spatial validation of the over-/under-/correctly predicted flooded areas. For example, see Nithila Devi et al. (2024), where a spatial fitness index has been used to assess the spatial accuracy.
  
  Response:
  We thank the reviewer for the suggestion and for referencing Nithila Devi et al. (2024). We agree that spatial validation adds important insight into model performance. In the revised manuscript, we will reference past tidal gauge validation performed for the San Francisco model (Wang et al., 2017) and the Abu Dhabi model (Chow and Sun, 2022) for vertical water level validation with observations. While no flooding extent has been observed in the case of zero sea level rise, our model attempts to predict the extent of flooding in a future scenario of 0.5 m sea level rise.
  In San Francisco, riverine and pluvial flooding are also present. How is modelling only the tidal flooding sufficient in this case for assessing the flood protection capabilities?
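For readers unfamiliar with spatial fit measures, the sketch below shows one widely used form: the flooded area common to prediction and reference divided by the area flooded in either. This is an assumption made for illustration and may differ from the index used by Nithila Devi et al. (2024); the function name and the wet-depth threshold are hypothetical.

```python
def flood_fit_index(pred, target, wet_threshold=0.05):
    """Return (fit, hits, over, under) for two flood depth maps.

    hits:  cells flooded in both maps (correctly predicted)
    over:  cells flooded only in the prediction (over-prediction)
    under: cells flooded only in the reference (under-prediction)
    fit:   hits / (hits + over + under)
    """
    hits = over = under = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            pred_wet = p > wet_threshold
            true_wet = t > wet_threshold
            if pred_wet and true_wet:
                hits += 1
            elif pred_wet:
                over += 1
            elif true_wet:
                under += 1
    union = hits + over + under
    fit = hits / union if union else 1.0  # two fully dry maps agree
    return fit, hits, over, under


# Toy 2x3 depth maps in metres.
target = [[0.0, 0.3, 0.6], [0.0, 0.0, 0.4]]
pred = [[0.0, 0.2, 0.5], [0.1, 0.0, 0.0]]
fit, hits, over, under = flood_fit_index(pred, target)
```

Reporting `over` and `under` alongside the fit value separates over-predicted from under-predicted extent, which a depth-error average alone cannot do.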
  
  Response:
  Thank you for the comment. Although our San Francisco model does include riverine input from the Sacramento and San Joaquin Rivers, the inflow rates into the Bay were baseline values rather than values for extreme fluvial flood events. We acknowledge that more refined hydrodynamic forcing conditions, covering pluvial and riverine floods as well as extreme storm events, would allow the hydrodynamic model to reflect more extreme flooding. However, the scope of this paper is the use of machine learning as a surrogate for a hydrodynamic model running under more average conditions, but with a sea level rise of 0.5 m and with coastal protection applied over many different stretches of the shoreline. Our focus on tidal flooding within San Francisco Bay was intended to highlight the unique tidal behavior within the Bay, where the construction of sea walls along certain portions of the Bay shoreline may in fact raise the sea level within the Bay by up to 1 m (Holleman and Stacey, 2014).
  Flood protection measures on the coast are very vaguely discussed. It is very difficult to understand what they are, how they function, and how they are incorporated into the hydrodynamic and DL models. It will be nice to discuss with figures showing their placement, functionality and model representation explicitly.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will move more of the information currently in the Supplementary Materials into the main body of the paper, namely Section 3.1.2 for San Francisco, which references previous work that provides more details of the coastal protection assumptions, and Section 3.1.1, which describes the flood protection measures adopted for Abu Dhabi and also references previous work that provides more details.
  Regarding the writing style, I find that the paper is more from a computer science background, rather than explaining clearly the application and relevance to the hydraulic modelling application. For example, the manuscript lacks details on training data used, accuracy of the training data, spatial prediction accuracy, specifics of protection measures, etc., as mentioned in the earlier comments.
  
  Response:
  We appreciate the observation of the reviewer regarding the writing style and the need to better align the manuscript with the expectations of the hydraulic modeling community. In the revised version, we will move some of the material presented in the Supplementary Materials into the main body of the manuscript and revisit relevant sections to better highlight the application relevance and hydrodynamic context of our work (with some of the model development and validation material taken from Wang et al., 2018 and Chow and Sun, 2022). Finally, we will improve the explanation of the training data (including its source and accuracy), provide further clarification on spatial prediction accuracy, and include additional details on the definitions of the Operational Landscape Units (OLUs) and the protection scenarios of OLUs that were used. This will ensure that the manuscript clearly communicates both the modeling and application aspects of the study.
  The paper is very wordy, contains much irrelevant information, and lacks crucial information necessary for the readers' understanding, for example, information on the kind of damage caused by tidal flooding, validation, etc. Please discuss the important information briefly rather than just citing other works.
  
  Response:
  Thank you for the feedback. In the revised manuscript, we will reduce wordiness and remove less relevant content. We will also briefly include key information, such as tidal flood impacts and data validation, to improve clarity and minimize reliance on external citations.
  The following are the specific comments listed for each section:
  
  Introduction and Related Works
  
  The introduction is very wordy and unorganised. A separate section of “related works” is unwarranted and has a lot of redundant information. Therefore, sections 1 and 2 (introduction and related works) need to be shortened and merged into one, ensuring that there is a proper flow of content in them.
  
  Response:
  Thank you for this suggestion. In the revised manuscript, we will merge the Introduction and Related Works sections to create a more concise and cohesive narrative. Redundant content will be removed, and we will ensure a clearer flow that integrates background, motivation, and relevant literature more effectively.
  Lines 17-20 can be shortened.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will shorten lines 17–20 to convey the key message more concisely while maintaining clarity and relevance.
  Lines 26 – 34 need to be shortened, retaining information relevant to the kind of study the paper deals with.
  
  Response:
  Thank you for the suggestion. We will revise lines 26–34 to remove general content and retain only information directly relevant to the scope and objectives of this study.
  Please use terms like computationally expensive/intensive instead of the term “computationally prohibitive” if possible.
  
  Response:
  Thank you for the suggestion. As suggested by the reviewer, we will replace the term “computationally prohibitive” with “computationally expensive” in the revised manuscript, where applicable.
  Mention literature related to computationally efficient modelling approaches (subgrid, parallelisation, simplified models, etc.), such as De Almeida et al. (2013), Neal et al. (2012), Li and Hodges (2019), Sanders and Schubert (2019), Nithila Devi et al. (2024), etc. Discuss such methodologies and bring out the importance of DL/ML/AI in flood forecasting and modelling.
  
  Response:
  Thank you for this helpful recommendation regarding various methods of employing spatially nested models. In the revised manuscript, we will revise the introduction to include a brief discussion of computationally efficient hydrodynamic modeling approaches, as suggested by the reviewer. We wish to note that nesting was not necessary for the Delft3D portion of the model, because the model grid has been downscaled to about 30 m horizontal resolution for areas of interest such as the city of Abu Dhabi, and the model is capable of modeling dry and wet grid cells. We did, however, employ a nesting method to incorporate SWAN for modeling storm surges and waves for Abu Dhabi. We will highlight the limitations of traditional approaches and clearly motivate the role and advantages of AI-based methods in improving computational efficiency and scalability for flood forecasting and modeling.
  Highlight the need for high-resolution modelling, especially in the complex urban terrain and cite relevant literature.
  
  Response:
  We appreciate the suggestion of the reviewer to emphasize the need for high-resolution modeling, particularly in complex urban terrain. Initially, data from hydrodynamic simulations was in tabular format, which, although containing peak water levels and coordinates, lacked explicit spatial structure and contextual relationships among points. This made it difficult for a learning algorithm to fully understand the spatial dependencies critical in urban flood prediction.
  To address this, we transformed the tabular data into a high-dimensional spatial grid format (1024×1024) suitable for convolutional neural networks (CNNs). This transformation allowed us to encode not just the water level values but also the precise spatial position of each point within the urban landscape. To incorporate additional context, we assigned two pixel values to each grid cell (one indicating proximity to protected OLU boundaries and the other to unprotected areas). This spatial encoding allowed the model to learn fine-grained flood patterns and localized behaviors more effectively within the CNN framework. This high-resolution spatial representation is essential in urban areas, where small variations in terrain or infrastructure can significantly influence inundation patterns. In future work, this framework also supports the integration of additional channels or modalities to further enrich spatial context.
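  To illustrate the tabular-to-grid transformation described above, the following is a minimal sketch only, not the preprocessing code used in the study. The function name `rasterize_points`, the bounding-box normalisation, and the rule of keeping the largest value when several points share a cell are all assumptions made for illustration; the two-value protected/unprotected encoding is omitted.

```python
import numpy as np


def rasterize_points(x, y, values, grid_size=1024, fill_value=0.0):
    """Map scattered (x, y, value) records onto a square grid.

    Hypothetical helper: assumes non-negative water levels and uses the
    bounding box of the points as the extent of the grid.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    values = np.asarray(values, dtype=float)

    # Normalise coordinates to [0, 1] over the bounding box of the data.
    x_span = max(x.max() - x.min(), 1e-12)
    y_span = max(y.max() - y.min(), 1e-12)
    col = np.floor((x - x.min()) / x_span * (grid_size - 1)).astype(int)
    # Row 0 is the top of the image, so the y axis is flipped.
    row = np.floor((y.max() - y) / y_span * (grid_size - 1)).astype(int)

    grid = np.full((grid_size, grid_size), fill_value, dtype=float)
    # If several points land in the same cell, keep the largest value.
    np.maximum.at(grid, (row, col), values)
    return grid
```

  A call such as `rasterize_points(lon, lat, peak_level)` would return a 1024×1024 array that can be stacked with other channels as CNN input; cells without any simulation point keep `fill_value`.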
  Line 56, what do you mean by high dimensionality of outputs?
  
  Response:
  Thank you for pointing this out. By “high dimensionality of outputs”, we refer to the pixel-wise prediction of inundation values over large spatial grids (e.g., 1024×1024), where the model must produce a continuous value for each pixel. This increases the complexity of capturing spatial dependencies and patterns, particularly under varying coastal boundary and sea level rise conditions. In the revised manuscript, we will clarify the sentence accordingly to avoid ambiguity.
  Since this paper expands on a deep-vision based framework, explain briefly about this in the introduction so that the readers from diverse backgrounds can appreciate the work. (Line 57)
  
  Response:
  Thank you for this suggestion. In the revised manuscript, we will include a brief explanation of the deep vision-based framework in the Introduction, as suggested by the reviewer. This addition will help readers from diverse backgrounds understand that our approach involves using convolutional neural networks (CNNs) to learn spatial patterns directly from gridded input data, enabling pixel-wise prediction of flood extent under varying coastal conditions.
  The paper lacks a dedicated and concrete objective description; rather, it mentions the conclusions from the study in the introduction. Please move lines 61 – 81 to the conclusions or discussion; if redundant, consider removing them.
  
  Response:
  We thank the reviewer for the feedback. The content in lines 61–81 was intended to highlight the key contributions of the study, which is a common practice in scientific writing to help readers quickly grasp the novelty and scope of the work. However, we understand that it may have been interpreted as a set of concluding remarks. To address this, we will revise the wording to clearly frame these points as the main contributions and ensure they are presented as part of the objective and motivation in the Introduction. If any parts appear repetitive or better suited for the conclusion, we will move or remove them accordingly to improve the flow and clarity.
  Lines 83 – 90, redundant information. Please remove them while merging sections 1 and 2.
  
  Response:
  Thank you for the suggestion. We will remove lines 83–90 while merging the sections, as they are redundant and do not contribute additional value to the streamlined narrative.
  Lines 91 – 98, irrelevant information.
  
  Response:
  Thank you for the comment. As part of merging and streamlining the Introduction and Related Works sections, we will remove lines 91–98, as they contain information that is not directly relevant to the focus of this study.
  Mention what kind of flood protection strategies exist here.
  
  Response:
  We thank the reviewer for the comment. While structural flood protection strategies, such as levees, seawalls, and revetments, are commonly used, our study does not focus on the specific type or engineering design of these structures. Instead, our primary concern is to evaluate the spatial importance of different shoreline segments in mitigating flood impacts.
  Specifically, the study area in Abu Dhabi includes 17 defined shoreline protection segments, while San Francisco includes 30. Our analysis explores how different combinations of these segments, when protected or left unprotected, influence inland inundation patterns. The objective is to identify which segments are most critical in reducing flood extent and should be prioritized for protection, irrespective of the exact structural form eventually adopted.
  It is also important to note that the Delft3D hydrodynamic simulator used in this study does not inherently distinguish between different types of flood protection structures in its core computations. Whether a barrier is a levee, seawall, or any other type, the model treats it generically as a solid obstruction defined by its geometry and crest elevation. This further justifies our focus on the presence or absence of protection along spatial segments rather than structural details.
  This evaluation is performed using our trained deep learning model, which can rapidly assess the impacts of hundreds of protection patterns across scenarios. In contrast, performing such analysis with traditional hydrodynamic simulators, such as Delft3D or Delft3D+SWAN, would be highly time-consuming; a single simulation can take up to 19 hours. Our surrogate model reduces this computation to a matter of seconds per scenario, enabling efficient and scalable analysis of spatial protection strategies. We will clarify this focus more in the revised manuscript.
  Lines 123 – 125, not clear.
  
  Response:
  We thank the reviewer for pointing this out. We agree that the lines could be made clearer. In the revised manuscript, we will rephrase this part as follows:
  Despite progress in flood modeling, current approaches often face limitations in computational efficiency and in incorporating dynamic factors such as land subsidence and urban morphology (Xu and Gao, 2024). Some recent studies have introduced hybrid surrogate models that combine machine learning with traditional hydrodynamic simulations for real-time coastal flood forecasting (Xu and Gao, 2024). However, many existing efforts remain focused on specific flood types or single triggering factors, without jointly considering long-term climate impacts such as sea level rise and evolving shoreline adaptation strategies.
  This revised version restores the missing context and clarifies the challenges and gaps that motivate our study.
  How is the training done for different SLRs?
  
  Response:
  We thank the reviewer for the question regarding the training process for different SLR scenarios. Our dataset includes two regions: Abu Dhabi and San Francisco. For Abu Dhabi, we had simulation data available for a single SLR level: 0.5 meters. For San Francisco, we had data corresponding to three SLR levels: 0.5 m, 1.0 m, and 1.5 m. Among these, the 1.0 m SLR scenario had sufficient simulation samples, while the other two had relatively limited data.
  To ensure stable and effective training, we used all available 0.5 m data from Abu Dhabi and the 1.0 m data from San Francisco for the primary training phase. This dataset was split into training, validation, and test sets to train and initially evaluate the model.
  After training, we performed fine-tuning using a subset of the limited samples available for the 0.5 m and 1.5 m SLR scenarios in San Francisco. The purpose of this step was to adapt the model and assess its ability to generalize across different SLR levels. The final evaluation for these additional SLR depths was then conducted on the remaining, unseen samples.
  This approach allowed the model to effectively learn from the most data-rich scenarios while still extending its predictive capability to other SLR conditions through targeted fine-tuning.
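  The partitioning described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the record fields `region` and `slr_m`, the function name `split_by_slr`, and the 50 % fine-tuning fraction are hypothetical (the response only states that "a subset" of the limited samples was used for fine-tuning).

```python
import random

# Scenarios used for the primary training phase, as stated in the response.
PRIMARY = {("abu_dhabi", 0.5), ("san_francisco", 1.0)}


def split_by_slr(samples, finetune_fraction=0.5, seed=0):
    """Separate samples into a main pool, a fine-tuning subset, and
    held-out samples for the additional SLR levels.

    `samples` is a list of dicts with hypothetical keys "region" and "slr_m".
    """
    rng = random.Random(seed)
    main_pool, finetune, held_out = [], [], []
    by_level = {}
    for sample in samples:
        key = (sample["region"], sample["slr_m"])
        if key in PRIMARY:
            main_pool.append(sample)
        else:
            by_level.setdefault(key, []).append(sample)

    # Split each additional SLR level separately so that every level is
    # represented in both the fine-tuning and the held-out subsets.
    for key in sorted(by_level):
        group = by_level[key]
        rng.shuffle(group)
        n_finetune = int(len(group) * finetune_fraction)
        finetune.extend(group[:n_finetune])
        held_out.extend(group[n_finetune:])
    return main_pool, finetune, held_out
```

  The main pool would then be divided into training, validation, and test sets, while the held-out samples are reserved for the final evaluation at the additional SLR levels.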
  Figure 1 can be moved to Methodology and please describe the overall framework and the steps involved there.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will move Figure 1 to the Methodology section and enhance the accompanying description to clearly outline the overall framework and the key steps involved in the proposed approach.
  Study Area and Data Description
  
  Line 136, what do you mean by environmental effects?
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will clarify by stating that the environmental impacts are mostly focused on the effects on transportation links, and specifically whether important arterials such as shoreline highways or freeways will be flooded due to SLR.
  Briefly describe OLU discretization in a few sentences for the benefit of readers.
  
  Response:
  Thank you for the question. In the revised manuscript, we will clarify that for San Francisco Bay, the discretization of the Operational Landscape Units (OLUs) was taken from SFEI (2019). For Abu Dhabi, the urban coastline was also discretized into OLUs based on the Abu Dhabi Plan 2030 (Abu Dhabi Urban Planning Council, 2007). Maps of the OLU boundaries used for both San Francisco Bay and Abu Dhabi are shown in Section S1 of the Supplementary Material.
  Mention past flood damages caused by tides.
  
  Response:
  Thank you for the question. Our focus on tidal flooding within San Francisco Bay was intended to highlight the unique tidal behavior within the Bay, where the construction of sea walls along certain portions of the Bay shoreline may in fact raise water levels in other areas of the Bay. For example, Holleman and Stacey (2014) showed that protecting the southern portion of the Bay may increase maximum water levels by 0.2 in the north of the Bay; Wang et al. (2018) showed that protection of the southern portions of the West Bay adversely affects the water levels of the East Bay, and vice versa.
  Briefly mention here about the training dataset.
  
  Response:
  We appreciate the suggestion to provide more information about the training dataset. In the revised manuscript, we will briefly describe the training dataset in the Study Area and Data Description section, as suggested.
  How was the Delft 3D model validated? How accurate is it?
  
  Response:
  For San Francisco Bay, the Delft3D model was based on the CoSMoS model originally developed by Barnard (2014), which was adapted to San Francisco Bay by Wang et al. (2017) and validated against measurements at 9 tidal gage locations in and around San Francisco Bay. Pearson correlation coefficients ranged from 0.9862 to 0.9996, while the RMS ratios (the ratio of modelled versus measured RMS amplitudes) ranged from 0.973 to 1.027 (please refer to Wang et al., 2017).
  For Abu Dhabi, the Delft3D model was validated using water level data from 196 tidal gage locations throughout the Persian Gulf (as the hydrodynamic model encompassed the entire Gulf in addition to the western portions of the Gulf of Oman). The water levels at these locations were compared with one month’s worth of hydrodynamic simulation, and the resulting absolute RMSE values ranged from 0.0013 to 0.0043 m in the vicinity of Abu Dhabi. More validation details for Abu Dhabi can be found in Chow and Sun (2022).
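  For readers unfamiliar with the validation statistics quoted above, the following minimal sketch shows how they could be computed for a single gage. It is not the code used in the cited validation studies; the function name `gauge_validation` is hypothetical, and taking the RMS amplitude about the series mean is an assumption, since the response does not spell out that detail.

```python
import numpy as np


def gauge_validation(modelled, measured):
    """Compare modelled and measured water levels at one tidal gage.

    Returns the Pearson correlation, the ratio of modelled to measured
    RMS amplitude (about the mean, an assumption), and the RMSE.
    """
    modelled = np.asarray(modelled, dtype=float)
    measured = np.asarray(measured, dtype=float)

    pearson = float(np.corrcoef(modelled, measured)[0, 1])

    # RMS amplitude of each series after removing its mean level.
    mod_rms = np.sqrt(np.mean((modelled - modelled.mean()) ** 2))
    obs_rms = np.sqrt(np.mean((measured - measured.mean()) ** 2))
    rms_ratio = float(mod_rms / obs_rms)

    rmse = float(np.sqrt(np.mean((modelled - measured) ** 2)))
    return {"pearson": pearson, "rms_ratio": rms_ratio, "rmse": rmse}
```

  A correlation close to 1 together with an RMS ratio close to 1 indicates that the model reproduces both the timing and the amplitude of the observed tidal signal.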
  Rather than points, use line or raster to areas susceptible to flooding. Figure 2a.
  
  Response:
  Thank you for the suggestion. As recommended, we will revise Figure 2a to replace the point-based visualization with a line or raster representation that more clearly delineates areas susceptible to flooding. This will improve readability and better convey spatial flood risk patterns across the region.
  Table 1 – What does main set, hold out set, etc., mean? Please explain when you have a table or figure in the manuscript for a general reader.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will add a clear explanation of the terms used in Table 1 for general readers. Specifically, we will clarify that the main set is used for training, validation, and testing of the model; the holdout set includes manually selected challenging scenarios to evaluate model performance under complex protection configurations; and the generalizability set consists of San Francisco data with two additional sea level rise depths, used to fine-tune and assess the model’s ability to generalize to new conditions not originally trained on.
  What are Shamal winds? Describe in a sentence or two with relevant literature.
  
  Response:
  Thank you for the question. Although we referenced Langodan et al. (2023) in Section 3.1.1 in relation to the Shamal winds, we will add an explanation of these winds to the main body of the revised text. While the Persian Gulf does not typically experience tropical cyclones, it is known for its northwesterly winds, which generally reach about 20 m/s, have a sudden onset, and are sustained over a period of up to 3-5 days. These are called the Shamal winds (meaning “North” in Arabic) and occur at least 10 times annually, mainly during the winter months (Al Senafi and Anis, 2015; Li et al., 2020).
  It is unclear what you mean by three months. Period of simulation or the computational time itself? What is the significance of choosing this?
  
  Response:
  Thank you for the question. The three months refer to the period of simulation, which was used for both the San Francisco and the Abu Dhabi simulations. This period was chosen to balance the need to cover a large number of tidal cycles per simulation against the computational time and storage space required for the simulations.
  Method
  
  Section 4.1, please shorten this to retain relevant general information. The rest can be moved to supplementary so that the readers don’t lose interest.
  
  Response:
  Thank you for the suggestion. As recommended, we will shorten Section 4.1 to retain only the most relevant general information about the model architecture. The more detailed architectural components and implementation specifics will be moved to the supplementary material to maintain reader engagement while keeping the core content concise.
  Describe a spatial fitness metric.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will include a spatial fitness metric to evaluate the alignment between predicted and ground truth flood extents, specifically assessing spatial accuracy in terms of overprediction, underprediction, and correctly matched inundated areas.
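  As an illustration of what such a metric could look like, the sketch below computes a commonly used fit index: the ratio of correctly predicted wet cells to all cells that are wet in either map. The response does not specify the exact formula the authors will adopt, so this definition, the function name `spatial_fit`, and the wet/dry threshold are assumptions.

```python
import numpy as np


def spatial_fit(pred_depth, true_depth, wet_threshold=0.0):
    """Compare predicted and reference flood extents cell by cell.

    A cell counts as inundated when its depth exceeds `wet_threshold`
    (hypothetical default). Returns the counts of correctly matched,
    overpredicted, and underpredicted cells plus the fit index
    hits / (hits + over + under).
    """
    pred_wet = np.asarray(pred_depth, dtype=float) > wet_threshold
    true_wet = np.asarray(true_depth, dtype=float) > wet_threshold

    hits = int(np.count_nonzero(pred_wet & true_wet))
    over = int(np.count_nonzero(pred_wet & ~true_wet))
    under = int(np.count_nonzero(~pred_wet & true_wet))

    union = hits + over + under
    # Two completely dry maps agree perfectly by convention.
    fit = hits / union if union else 1.0
    return {"hits": hits, "over": over, "under": under, "fit": fit}
```

  The fit index equals 1 when the predicted and reference extents coincide and falls toward 0 as overprediction or underprediction grows, which is why reporting the three counts alongside it helps separate the two error types.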
  Results
  
  Tabulate and discuss the values of the spatial fitness index for different DL/ML models.
  
  Response:
  Thank you for this valuable suggestion. In the revised manuscript, we plan to compute and tabulate the spatial fitness metric for our proposed CASPIAN-v2 model, along with any baseline DL/ML models evaluated in the study. We will include the resulting Table in the main text and briefly discuss the findings to highlight how CASPIAN-v2 performs in terms of spatial prediction accuracy relative to alternative approaches.
  Table 5, please describe the performance of the proposed method compared to the existing methods in the paper.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will include a discussion describing the performance of the proposed method relative to existing approaches as summarized in Table 5. However, as most existing ML/DL-based flood prediction methods in the literature have not open-sourced their code and are often designed for point-based or aggregated outputs rather than 2D gridded flood maps, direct benchmarking remains challenging. Nonetheless, we will highlight key differences in methodology and performance trends to contextualize the advantages of CASPIAN-v2.
  Please use a zoomed-in figure to illustrate the effect of the representation of the protection measure in the manuscript.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will include a zoomed-in figure, as part of the results section (e.g., Figures 5 and 6), to illustrate the effect of the representation of the protection measures. This will help highlight how different shoreline protection configurations influence localized inundation patterns and improve the interpretability of the results.
  Section 6.2.2, Lines 475 – 478, lacks a clear explanation.
  
  Response:
  Thank you for the comment. In the revised manuscript, we will clarify that the referenced abrupt changes in local flood behavior relate to spatial variations in inundation caused by diverse OLU protection scenarios, as illustrated in Figure 6. The figure presents results on a holdout set specifically curated to include challenging scenarios, such as configurations where one side of a Bay is protected while the other is not. These cases test the ability of the model to capture complex spatial dynamics. We will also highlight that the model demonstrates strong generalization to such unseen scenarios, even though the training data covers only a small subset of the possible 2^N combinations (where N is the number of OLUs). This indicates that CASPIAN-v2 learns to infer inundation patterns based on the spatial protection configuration of OLUs.
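  To give a sense of the size of this configuration space, the short calculation below uses figures quoted earlier in this reply: 17 shoreline protection segments for Abu Dhabi, 30 for San Francisco, and up to 19 hours per hydrodynamic simulation. It assumes, for illustration only, that each segment corresponds to one OLU with a binary protected/unprotected state and that simulations would run one after another; the function names are hypothetical.

```python
SIM_HOURS = 19  # upper bound per simulation quoted in the responses


def n_configurations(n_segments: int) -> int:
    """Number of protected/unprotected patterns for binary segments."""
    return 2 ** n_segments


def exhaustive_sim_hours(n_segments: int, hours_per_run: int = SIM_HOURS) -> int:
    """Sequential compute time, in hours, to simulate every pattern."""
    return n_configurations(n_segments) * hours_per_run


for region, n_segments in {"Abu Dhabi": 17, "San Francisco": 30}.items():
    print(region, n_configurations(n_segments), exhaustive_sim_hours(n_segments))
```

  Under these assumptions, exhaustive simulation is out of reach even for the smaller region, which is why the ability to generalize from a small sampled subset of configurations matters.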
  Section 6.2.4, a repeated heading, please be more specific and clearer.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will revise the headings within Sections 6.1 and 6.2 to avoid repetition and improve clarity.
  Line 494, how is the ground truth information collected? What is the associated error?
  
  Response:
  Thank you for this important question. In the revised manuscript, we will clarify that the ground truth inundation data used for training and evaluation were derived from physics-based hydrodynamic simulations. We will also provide information on the validation of these simulations and discuss the expected level of error or uncertainty in the simulation outputs, where available. This clarification will be included near Line 494 to improve transparency regarding data quality.
  Line 506, summarize in a few sentences about data augmentation.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will summarize the data augmentation strategy in a few concise sentences, as suggested.
  Figure 9, please move it to the supplementary.
  
  Response:
  Thank you for the suggestion. In the revised manuscript, we will move Figure 9 to the supplementary material, as recommended.
  References
  
  Bates, P. D., Horritt, M. S., & Fewtrell, T. J. (2010). A simple inertial formulation of the shallow water equations for efficient two-dimensional flood inundation modelling. Journal of Hydrology, 387(1–2), 33–45. https://doi.org/10.1016/j.jhydrol.2010.03.027
  
  De Almeida, G. A. M., & Bates, P. (2013). Applicability of the local inertial approximation of the shallow water equations to flood modeling. Water Resources Research, 49(8), 4833–4844. https://doi.org/10.1002/wrcr.20366
  
  Li, Z., & Hodges, B. R. (2019). Modeling subgrid-scale topographic effects on shallow marsh hydrodynamics and salinity transport. Advances in Water Resources, 129, 1–15. https://doi.org/10.1016/j.advwatres.2019.05.004
  
  Neal, J., Schumann, G., & Bates, P. (2012). A subgrid channel model for simulating river hydraulics and floodplain inundation over large and data sparse areas. Water Resources Research, 48(11), 1–16. https://doi.org/10.1029/2012WR012514
  
  Nithila Devi, N., & Kuiry, S. N. (2024). A novel local‐inertial formulation representing subgrid scale topographic effects for urban flood simulation. Water Resources Research, 60(5), e2023WR035334.
  
  Sanders, B. F., & Schubert, J. E. (2019). PRIMo: Parallel raster inundation model. Advances in Water Resources, 126, 79–95. https://doi.org/10.1016/J.ADVWATRES.2019.02.007
  
  Response:
  Thank you for sharing these references. We will incorporate the relevant literature, as suggested, into the revised manuscript to strengthen the discussion on computationally efficient flood modeling approaches and better contextualize the role of deep learning in this domain.
  Response References:
  Abu Dhabi Urban Planning Council. Plan Abu Dhabi 2030. Urban Structure Framework Plan. 2007. Available online: https://www.ecouncil.ae/PublicationsEn/plan-abu-dhabi-full-version-EN.pdf
  
  Al Senafi, F.; Anis, A. Shamals and climate variability in the Northern Arabian/Persian Gulf from 1973 to 2012. Int. J. Clim. 2015, 35, 4509–4528.
  
  Chow, A.C. and Sun, J., 2022. Combining Sea level rise inundation impacts, tidal flooding and extreme wind events along the Abu Dhabi coastline. Hydrology, 9(8), p.143.
  
  Holleman, R.C. and Stacey, M.T., 2014. Coupling of sea level rise, tidal amplification, and inundation. Journal of Physical Oceanography, 44(5), pp.1439-1455.
  
  IPCC. Summary for Policymakers. In Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Masson-Delmotte, V., Zhai, P., Pirani, A., Connors, S.L., Péan, C., Berger, S., Caud, N., Chen, Y., Goldfarb, L., Gomis, M.I., et al., Eds.; Cambridge University Press: Cambridge, UK, 2021.
  
  Li, D.; Anis, A.; Al Senafi, F. Physical response of the Northern Arabian Gulf to winter Shamals. J. Mar. Syst. 2020, 203, 103280.
  
  San Francisco Estuary Institute (SFEI), 2019 (Beagle, J., Lowe, J., McKnight, K., Safran, S., Tam, L. and Szambelan, S.J., authors). San Francisco Bay shoreline adaptation atlas: Working with nature to plan for sea level rise using operational landscape units (No. SFEI publication# 915).
  
  Wang, R.Q., Herdman, L.M., Erikson, L., Barnard, P., Hummel, M. and Stacey, M.T., 2017. Interactions of estuarine shoreline infrastructure with multiscale sea level variability. Journal of Geophysical Research: Oceans, 122(12), pp.9962-9979.
  
  Wang, R.Q., Stacey, M.T., Herdman, L.M.M., Barnard, P.L. and Erikson, L., 2018. The influence of sea level rise on the regional interdependence of coastal infrastructure. Earth's Future, 6(5), pp.677-688.
  
  Citation: https://doi.org/10.5194/egusphere-2025-838-AC1
RC2:
'Comment on egusphere-2025-838', Anonymous Referee #2, 28 Apr 2025
Anonymous Referee #2
Summary
This manuscript presents CASPIAN-v2, a novel deep learning framework for predicting coastal flooding under varying sea level rise (SLR) scenarios and shoreline protection strategies. The authors test their approach on two distinct geographical regions (Abu Dhabi and San Francisco) and demonstrate superior performance compared to state-of-the-art methods. The paper makes several contributions, including a new CNN-based architecture, comprehensive datasets from vulnerable coastal areas, and validation of generalizability across different SLR scenarios.
While the work represents a significant advancement in data-driven coastal flood prediction, several aspects require substantial revision before the manuscript is suitable for publication in HESS.
General Comments
Model Architecture Presentation and Justification

The CASPIAN-v2 architecture is sophisticated but presented in an overly complex manner that hampers understanding. Figure 4 contains too much information without sufficient explanation of design choices. The authors introduce multiple novel components (MARX blocks, SEE blocks) without adequate justification for these specific innovations over simpler alternatives.
Example: In Section 4.1.2, the rationale for integrating ResNeXt blocks with CBAM is not clearly connected to the specific challenges of flood prediction. The authors should explain why this combination addresses spatial dependencies in coastal flooding better than other attention mechanisms.
Recommendation: Provide a simplified schematic of the architecture alongside the detailed one and clearly justify each novel component in relation to the specific requirements of flood prediction tasks.
Computational Efficiency Analysis

A primary motivation for developing surrogate models is the computational burden of physics-based simulations. However, the paper lacks a rigorous comparison of computational efficiency between the proposed model and alternatives.
Example: While Table 5 thoroughly compares prediction accuracy, it contains no information about training times, inference times, or memory requirements. This is particularly important given that lines 45-49 on page 2 emphasize computational burden as a key limitation of current approaches.
Recommendation: Include a comprehensive analysis of computational efficiency, comparing training and inference times across all evaluated models, and explicitly stating the practical time savings compared to hydrodynamic simulations.
Uncertainty Quantification

The model provides deterministic predictions without addressing prediction uncertainties, which is crucial for risk assessment and decision support in coastal planning.
Example: The error maps in Figures 5-7 show where predictions differ from ground truth, but they don't indicate the model's confidence in its predictions, which is essential for reliable risk assessment.
Recommendation: Incorporate uncertainty quantification into the model (e.g., through ensemble methods, Bayesian techniques, or prediction intervals) or thoroughly discuss this limitation and its implications for practical use.
Data Imbalance Handling

Figure 9 reveals severe class imbalance in the dataset, with non-inundated areas predominating. While the authors acknowledge this challenge, they don't adequately explain how their approach specifically addresses it.
Example: Section 7 mentions the imbalance issue but doesn't describe specific techniques beyond the hybrid loss function that were employed to mitigate its effects. It's unclear how the model achieves its reported high accuracy despite this challenge.
Recommendation: Elaborate on specific techniques used to address data imbalance, potentially including specialized sampling strategies, data augmentation approaches tailored to rare flood events, or custom components in the architecture designed for imbalanced spatial data.
Real-world Application Context

The practical utility of the model for coastal planning is asserted but not demonstrated through concrete examples or integration pathways.
Example: The conclusion claims CASPIAN-v2 is "an essential tool for coastal resilience planning" (lines 539-540, page 28), but doesn't provide specific guidance on how planners might integrate this tool with existing decision-making frameworks.
Recommendation: Include a case study or conceptual workflow showing how the model could be integrated into actual coastal planning processes, identifying key stakeholders and decision points where the model adds value.

Specific Comments
Mathematical Notation Inconsistency

The paper uses inconsistent notation, particularly in Section 4.1, making the mathematical formulations difficult to follow.
Example: In Equations 1-5, subscripts sometimes denote indices and sometimes represent different variables entirely. The relationship between tensors across equations is not always clear.
Recommendation: Standardize notation throughout the paper and provide a notation table for reference.
Evaluation Metrics Justification

While the paper employs multiple evaluation metrics, the rationale for these specific choices and their relevance to practical flood prediction applications isn't fully explained.
Example: The threshold exceedance metric (δ > Δ) is introduced in Section 5.4, but its practical significance for flood risk assessment isn't discussed.
Recommendation: Justify the choice of each evaluation metric in terms of its relevance to practical flood prediction applications and decision-making contexts.
Data Preprocessing Details

The data preprocessing section (3.3) lacks sufficient detail on critical aspects that could impact model performance.
Example: The method for mapping inundation coordinates onto a 1024×1024 grid (lines 182-184, page 9) is mentioned but not described in detail, despite this being a critical step that affects the spatial resolution of predictions.
Recommendation: Provide more detailed explanation of preprocessing steps, potentially with illustrative examples showing the transformation from raw data to model inputs.
Ablation Study Presentation

The paper mentions ablation studies in the supplementary material but doesn't adequately summarize key findings in the main text.
Example: Lines 341-342 on page 16 mention "extensive ablation studies" but don't present the key insights derived from these experiments.
Recommendation: Include a summary table of ablation study results in the main text, highlighting the contribution of each novel component to overall performance.
Figure Clarity and Interpretation

Several figures are complex and difficult to interpret, with insufficient explanation in captions and text.
Example: Figure 7 compares model predictions across different approaches, but the subtle differences between models are difficult to discern with the chosen color scale and presentation format.
Recommendation: Improve figure clarity through better color scales, simplified presentations, or additional explanatory elements like difference maps to highlight where each model performs better or worse.
Summary Assessment
This manuscript presents valuable research on deep learning for coastal flood prediction, with promising results that could significantly advance the field. However, major revisions are needed to address issues related to model architecture presentation, computational efficiency analysis, uncertainty quantification, data imbalance handling, and real-world application context.
With these improvements, the paper has the potential to make a significant contribution to both the technical literature on deep learning for environmental modeling and practical coastal planning applications.
Citation: https://doi.org/10.5194/egusphere-2025-838-RC2
- AC2:
  'Reply on RC2', Bilal Hassan, 21 May 2025
  Anonymous Referee #2
  Summary
  
  This manuscript presents CASPIAN-v2, a novel deep learning framework for predicting coastal flooding under varying sea level rise (SLR) scenarios and shoreline protection strategies. The authors test their approach on two distinct geographical regions (Abu Dhabi and San Francisco) and demonstrate superior performance compared to state-of-the-art methods. The paper makes several contributions, including a new CNN-based architecture, comprehensive datasets from vulnerable coastal areas, and validation of generalizability across different SLR scenarios.
  
  While the work represents a significant advancement in data-driven coastal flood prediction, several aspects require substantial revision before the manuscript is suitable for publication in HESS.
  
  Response:
  We thank the reviewer for their thoughtful summary and for recognizing the key contributions of our work. At the same time, we fully understand that several areas require further refinement. In the revised manuscript, we will address all raised concerns in detail to ensure the work meets the standards of HESS. We believe these revisions will significantly improve the clarity, rigor, and impact of our work.
  General Comments
  
  Model Architecture Presentation and Justification
  
  The CASPIAN-v2 architecture is sophisticated but presented in an overly complex manner that hampers understanding. Figure 4 contains too much information without sufficient explanation of design choices. The authors introduce multiple novel components (MARX blocks, SEE blocks) without adequate justification for these specific innovations over simpler alternatives.
  
  Example: In Section 4.1.2, the rationale for integrating ResNeXt blocks with CBAM is not clearly connected to the specific challenges of flood prediction. The authors should explain why this combination addresses spatial dependencies in coastal flooding better than other attention mechanisms.
  
  Recommendation: Provide a simplified schematic of the architecture alongside the detailed one and clearly justify each novel component in relation to the specific requirements of flood prediction tasks.
  
  Response:
  We thank the reviewer for the helpful feedback regarding the clarity and justification of the CASPIAN-v2 architecture. In the revised manuscript, we plan to redraw Figure 4 to include a simplified schematic of the overall encoder–bottleneck–decoder structure, alongside the existing detailed version. This will help improve readability and guide the reader through the hierarchical design.
  Additionally, to better justify the inclusion of the MARX and SEE blocks, we will update the ablation study to show model performance when these components are entirely removed or replaced with alternative modules. While the current ablation study already includes results with different numbers of MARX and SEE blocks, we recognize the importance of evaluating their overall necessity. We have also experimented with other configurations in the bottleneck, including standard ResNet, ResNeXt blocks, and Squeeze-and-Excitation modules. These additional comparisons will be included to demonstrate why the selected components (ResNeXt blocks with CBAM) in the bottleneck are more effective in capturing spatial dependencies and multi-scale interactions critical for accurate flood prediction.
  Computational Efficiency Analysis
  
  A primary motivation for developing surrogate models is the computational burden of physics-based simulations. However, the paper lacks a rigorous comparison of computational efficiency between the proposed model and alternatives.
  
  Example: While Table 5 thoroughly compares prediction accuracy, it contains no information about training times, inference times, or memory requirements. This is particularly important given that lines 45-49 on page 2 emphasize computational burden as a key limitation of current approaches.
  
  Recommendation: Include a comprehensive analysis of computational efficiency, comparing training and inference times across all evaluated models, and explicitly stating the practical time savings compared to hydrodynamic simulations.
  
  Response:
  We appreciate the reviewer’s comment regarding the need for a computational efficiency analysis. In the revised manuscript, we will include a detailed comparison of training time, inference time, and GPU memory usage across all evaluated models. Specifically, our CASPIAN-v2 model requires approximately 23 hours to train on a local machine equipped with an NVIDIA RTX 4090 GPU. Its inference time is around 0.227 seconds per scenario, and the model is lightweight, comprising only 0.38 million parameters.
  In contrast, simulating a single scenario using traditional hydrodynamic models (Delft3D for San Francisco and Delft3D coupled with SWAN for Abu Dhabi) takes approximately 14 hours and 19 hours, respectively, on high-performance computing infrastructure. This means that running all 72 scenarios in the test set (36 for Abu Dhabi and 36 for San Francisco) would take around 49 days using these simulators. In comparison, CASPIAN-v2 can process the same 72 scenarios in roughly 17 seconds.
  We will include these comparisons in a new table and accompanying discussion to clearly demonstrate the significant time and resource savings achieved by our surrogate model, directly addressing the computational motivation highlighted in the manuscript.
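The arithmetic behind the figures quoted above can be reproduced with a minimal sketch. It uses only the numbers stated in the reply (14 h and 19 h per simulated scenario, 36 scenarios per region, 0.227 s per surrogate inference) and assumes the simulator runs are executed sequentially; the variable and function names are illustrative and are not taken from the authors' code.

```python
# Figures quoted in the reply; sequential execution of simulator runs is an assumption.
HOURS_PER_RUN = {"San Francisco": 14.0, "Abu Dhabi": 19.0}  # hydrodynamic simulator, per scenario
SCENARIOS_PER_REGION = 36
SURROGATE_SECONDS_PER_SCENARIO = 0.227  # CASPIAN-v2 inference time, per scenario


def simulator_hours(hours_per_run, scenarios_per_region):
    """Total simulator wall-clock hours when every scenario is run one after another."""
    return sum(hours * scenarios_per_region for hours in hours_per_run.values())


def surrogate_seconds(seconds_per_scenario, n_scenarios):
    """Total surrogate inference time in seconds (training time is not included)."""
    return seconds_per_scenario * n_scenarios


total_hours = simulator_hours(HOURS_PER_RUN, SCENARIOS_PER_REGION)
total_days = total_hours / 24.0
n_scenarios = SCENARIOS_PER_REGION * len(HOURS_PER_RUN)
total_seconds = surrogate_seconds(SURROGATE_SECONDS_PER_SCENARIO, n_scenarios)
speedup = total_hours * 3600.0 / total_seconds

print(f"simulator: {total_hours:.0f} h (~{total_days:.1f} days)")
print(f"surrogate: {total_seconds:.1f} s for {n_scenarios} scenarios")
print(f"speed-up at inference: ~{speedup:,.0f}x")
```

The simulator total comes to 1,188 hours, or about 49.5 days, matching the "around 49 days" in the reply. The product 72 × 0.227 s is about 16.3 s, close to the "roughly 17 seconds" quoted. The comparison covers inference only and leaves out the one-time training cost of approximately 23 hours.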
  Uncertainty Quantification
  
  The model provides deterministic predictions without addressing prediction uncertainties, which is crucial for risk assessment and decision support in coastal planning.
  
  Example: The error maps in Figures 5-7 show where predictions differ from ground truth, but they don't indicate the model's confidence in its predictions, which is essential for reliable risk assessment.
  
  Recommendation: Incorporate uncertainty quantification into the model (e.g., through ensemble methods, Bayesian techniques, or prediction intervals) or thoroughly discuss this limitation and its implications for practical use.
  
  Response:
  We appreciate the valuable feedback from the reviewer regarding uncertainty quantification. We fully agree that predictive uncertainty is essential for informed decision-making in coastal risk management. While we previously used Grad-CAM to provide qualitative insights into the spatial regions influencing the model’s predictions, we acknowledge that this approach supports interpretability but does not quantify uncertainty.
  To address this comment, in the revised manuscript we plan to incorporate a predictive uncertainty estimation method. Specifically, we propose to apply Monte Carlo Dropout by enabling dropout at inference time to generate multiple outputs for the same input. Alternatively, we may implement deep ensembles, where multiple independently trained models are used to derive a distribution over predictions. In addition, we will explore the use of random cutout–based test-time augmentation, where different variants of the same input are passed through the model, and the variability in outputs is used to estimate uncertainty. This method leverages our existing data augmentation strategy and provides a simple, architecture-agnostic way to probe model robustness.
  All three approaches will help generate confidence maps alongside the predicted inundation, offering a more robust basis for risk-sensitive planning. We will include the resulting uncertainty maps and a corresponding discussion in the revised manuscript.
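Whichever of the three approaches is used, the final step is the same: several predicted depth maps for one input are reduced to a per-cell mean and a per-cell spread. The following is a minimal sketch of that aggregation step only, assuming the stochastic passes, ensemble members, or augmented variants have already produced their outputs. The function name `ensemble_summary`, the toy 2×2 maps, and the use of the population standard deviation as the spread measure are illustrative assumptions, not the authors' implementation.

```python
import statistics


def ensemble_summary(member_maps):
    """Per-cell mean and spread over a list of equally sized 2-D depth maps."""
    n_rows = len(member_maps[0])
    n_cols = len(member_maps[0][0])
    mean_map = [[0.0] * n_cols for _ in range(n_rows)]
    std_map = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            values = [member[i][j] for member in member_maps]
            mean_map[i][j] = statistics.fmean(values)   # point prediction
            std_map[i][j] = statistics.pstdev(values)   # disagreement between members
    return mean_map, std_map


# Toy 2x2 flood-depth maps in metres, standing in for three model outputs
# (hypothetical values; real maps in the study are 1024x1024).
members = [
    [[0.0, 1.0], [2.0, 0.0]],
    [[0.0, 1.2], [2.6, 0.0]],
    [[0.0, 0.8], [1.4, 0.0]],
]
mean_map, std_map = ensemble_summary(members)
print("mean depth:", mean_map)
print("spread:    ", std_map)
```

Cells where all members agree (here the two dry cells) get zero spread, while cells where the members disagree get a larger value, which is the quantity a confidence map would display.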
  Data Imbalance Handling
  
  Figure 9 reveals severe class imbalance in the dataset, with non-inundated areas predominating. While the authors acknowledge this challenge, they don't adequately explain how their approach specifically addresses it.
  
  Example: Section 7 mentions the imbalance issue but doesn't describe specific techniques beyond the hybrid loss function that were employed to mitigate its effects. It's unclear how the model achieves its reported high accuracy despite this challenge.
  
  Recommendation: Elaborate on specific techniques used to address data imbalance, potentially including specialized sampling strategies, data augmentation approaches tailored to rare flood events, or custom components in the architecture designed for imbalanced spatial data.
  
  Response:
  Thank you for highlighting this important point. We acknowledge the significant class imbalance in the dataset, as shown in Figure 9, with non-inundated points dominating the spatial distribution. While we did not apply explicit sampling or augmentation strategies to overcome this, our current formulation incorporates a hybrid loss function and custom architectural components (e.g., MARX and SEE blocks) designed to learn and prioritize spatially meaningful features, particularly around protection boundaries and inundation-prone regions.
  As further evidenced in Figure 10, the model tends to focus more on areas near unprotected OLUs, which are more likely to flood. This behavior suggests that despite the numerical dominance of non-inundated points, the network implicitly assigns greater representational focus to the spatial characteristics associated with flood-prone zones. In the revised manuscript, we will clarify this point and also briefly discuss possible future enhancements, such as spatial weighting schemes or scenario-focused data balancing, to further address the imbalance in flood prediction tasks.
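Why the imbalance matters for reported accuracy can be illustrated with a small sketch: when dry cells dominate, an overall error average can look low even if errors on flooded cells are large. The example below assumes a cell counts as flooded when its depth exceeds a threshold of 0 m; the function names, the threshold, and the toy values are hypothetical and do not come from the manuscript.

```python
def flood_confusion(pred, truth, wet_threshold=0.0):
    """Confusion counts for flooded (depth > threshold) versus non-flooded cells."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, t in zip(pred, truth):
        pred_wet, true_wet = p > wet_threshold, t > wet_threshold
        if pred_wet and true_wet:
            counts["tp"] += 1
        elif pred_wet:
            counts["fp"] += 1
        elif true_wet:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def stratified_mae(pred, truth, wet_threshold=0.0):
    """Mean absolute error computed separately on truly flooded and truly dry cells."""
    wet = [abs(p - t) for p, t in zip(pred, truth) if t > wet_threshold]
    dry = [abs(p - t) for p, t in zip(pred, truth) if t <= wet_threshold]
    return {
        "flooded": sum(wet) / len(wet) if wet else None,
        "non_flooded": sum(dry) / len(dry) if dry else None,
    }


# Toy flattened depth maps in metres: six dry cells and two flooded cells.
truth = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 1.0]
pred = [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.4, 0.0]

counts = flood_confusion(pred, truth)
by_class = stratified_mae(pred, truth)
overall = sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)

print("confusion:", counts)
print("overall MAE:", round(overall, 4))
print("stratified MAE:", by_class)
```

In this toy case the overall MAE is about 0.16 m, while the MAE restricted to flooded cells is 0.55 m, which is the kind of gap a stratified breakdown is meant to expose.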
  Real-world Application Context
  
  The practical utility of the model for coastal planning is asserted but not demonstrated through concrete examples or integration pathways.
  
  Example: The conclusion claims CASPIAN-v2 is "an essential tool for coastal resilience planning" (lines 539-540, page 28), but doesn't provide specific guidance on how planners might integrate this tool with existing decision-making frameworks.
  
  Recommendation: Include a case study or conceptual workflow showing how the model could be integrated into actual coastal planning processes, identifying key stakeholders and decision points where the model adds value.
  
  Response:
  Thank you for this insightful comment. In the revised manuscript, we will include a workflow explanation describing how CASPIAN-v2 can be applied in real-world coastal planning contexts. Specifically, we will outline how the output of the model, such as scenario-based flood maps and uncertainty estimates, can support planners, engineers, and policymakers at various decision points. We will also describe how the model complements traditional hydrodynamic simulations by enabling rapid scenario analysis, which is particularly useful for emergency preparedness and evaluating the impacts of different shoreline protection strategies. This addition will clarify the practical value of the work and strengthen its positioning as a decision-support tool.
  Specific Comments
  
  Mathematical Notation Inconsistency
  
  The paper uses inconsistent notation, particularly in Section 4.1, making the mathematical formulations difficult to follow.
  
  Example: In Equations 1-5, subscripts sometimes denote indices and sometimes represent different variables entirely. The relationship between tensors across equations is not always clear.
  
  Recommendation: Standardize notation throughout the paper and provide a notation table for reference.
  
  Response:
  Thank you for pointing out the issue with notation clarity. In the revised manuscript, we will standardize the mathematical notation, ensuring consistent use of subscripts, tensor dimensions, and variable references. Additionally, we will include a notation table to clearly define all symbols and improve readability.
  Evaluation Metrics Justification
  
  While the paper employs multiple evaluation metrics, the rationale for these specific choices and their relevance to practical flood prediction applications isn't fully explained.
  
  Example: The threshold exceedance metric (δ > Δ) is introduced in Section 5.4, but its practical significance for flood risk assessment isn't discussed.
  
  Recommendation: Justify the choice of each evaluation metric in terms of its relevance to practical flood prediction applications and decision-making contexts.
  
  Response:
  Thank you for highlighting this important point. In the revised manuscript, we will elaborate on the rationale behind our choice of evaluation metrics. While we included multiple metrics to comprehensively assess model performance from different perspectives, we now recognize the need to clarify what each metric represents and how it relates to practical flood prediction and decision-making.
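As a hedged illustration of how two of the metrics discussed here relate to planning questions, the sketch below computes the mean absolute error and a threshold exceedance share. It assumes that δ denotes the per-cell absolute depth error and Δ a tolerance in metres, so that the metric reports the fraction of cells with δ > Δ; the manuscript's exact definition may differ, and the function names and sample values are illustrative.

```python
def mean_absolute_error(pred, truth):
    """Average absolute depth error over all cells (metres)."""
    errors = [abs(p - t) for p, t in zip(pred, truth)]
    return sum(errors) / len(errors)


def exceedance_fraction(pred, truth, tolerance):
    """Share of cells whose absolute error (delta) is strictly above the tolerance (Delta)."""
    errors = [abs(p - t) for p, t in zip(pred, truth)]
    return sum(e > tolerance for e in errors) / len(errors)


# Toy flattened depth maps in metres (hypothetical values).
truth = [0.0, 0.0, 0.5, 1.2, 2.0]
pred = [0.0, 0.1, 0.4, 1.8, 2.0]

mae = mean_absolute_error(pred, truth)
share_above_half_metre = exceedance_fraction(pred, truth, tolerance=0.5)

print(f"MAE: {mae:.2f} m")
print(f"cells with error above 0.5 m: {share_above_half_metre:.0%}")
```

Read this way, MAE summarizes typical depth error, while the exceedance share answers a different question: how much of the map is wrong by more than a margin that would change a planning decision.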
  Data Preprocessing Details
  
  The data preprocessing section (3.3) lacks sufficient detail on critical aspects that could impact model performance.
  
  Example: The method for mapping inundation coordinates onto a 1024×1024 grid (lines 182-184, page 9) is mentioned but not described in detail, despite this being a critical step that affects the spatial resolution of predictions.
  
  Recommendation: Provide more detailed explanation of preprocessing steps, potentially with illustrative examples showing the transformation from raw data to model inputs.
  
  Response:
  Thank you for this observation. Due to space constraints in the main manuscript, we provided a detailed description of the data preprocessing steps in Section S2 of the supplementary material. This section outlines each stage of the transformation from raw simulation outputs to the standardized 1024×1024 input grids used by the model. We will ensure that this is clearly referenced in Section 3.3 of the revised manuscript.
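The authors' actual procedure is described in Section S2 of the supplement and is not reproduced here. Purely as an illustration of what mapping point coordinates onto a fixed grid can involve, the sketch below bins (x, y, depth) points into a square grid by their position inside a bounding box. The function name `rasterize`, the bounding-box convention, the north-up row order, and the rule of keeping the maximum depth per cell are all assumptions made for this example.

```python
def rasterize(points, bounds, size=1024):
    """Bin (x, y, depth) points into a size x size grid; cells with no point stay at 0.0."""
    xmin, ymin, xmax, ymax = bounds
    grid = [[0.0] * size for _ in range(size)]
    for x, y, depth in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # ignore points outside the study area
        # Scale to [0, size) and clamp points lying exactly on the upper edge.
        col = min(int((x - xmin) / (xmax - xmin) * size), size - 1)
        row = min(int((ymax - y) / (ymax - ymin) * size), size - 1)  # row 0 is the northern edge
        # Assumed rule: when several points share a cell, keep the deepest one.
        grid[row][col] = max(grid[row][col], depth)
    return grid


# Small 4x4 demonstration grid over a 4 x 4 unit bounding box (hypothetical values).
points = [
    (0.5, 3.5, 1.0),  # north-west cell
    (0.6, 3.9, 2.0),  # same cell, deeper
    (4.0, 0.0, 0.3),  # south-east corner, on the boundary
    (5.0, 5.0, 9.0),  # outside the bounding box, skipped
]
grid = rasterize(points, bounds=(0.0, 0.0, 4.0, 4.0), size=4)
for row in grid:
    print(row)
```

The grid size fixes the spatial resolution: each cell covers the bounding-box extent divided by `size`, so the same 1024×1024 grid corresponds to different ground resolutions for study areas of different extent.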
  Ablation Study Presentation
  
  The paper mentions ablation studies in the supplementary material but doesn't adequately summarize key findings in the main text.
  
  Example: Lines 341-342 on page 16 mention "extensive ablation studies" but don't present the key insights derived from these experiments.
  
  Recommendation: Include a summary table of ablation study results in the main text, highlighting the contribution of each novel component to overall performance.
  
  Response:
  Thank you for this helpful suggestion. In the revised manuscript, we will include a summary table of the ablation study results in the main text, highlighting the performance impact of each novel component. This will provide a clearer understanding of their individual contributions and key insights from the experiments currently detailed in the supplementary material.
  Figure Clarity and Interpretation
  
  Several figures are complex and difficult to interpret, with insufficient explanation in captions and text.
  
  Example: Figure 7 compares model predictions across different approaches, but the subtle differences between models are difficult to discern with the chosen color scale and presentation format.
  
  Recommendation: Improve figure clarity through better color scales, simplified presentations, or additional explanatory elements like difference maps to highlight where each model performs better or worse.
  
  Response:
  Thank you for pointing this out. In the revised manuscript, we will improve the clarity of key figures, particularly Figure 7, by adopting more perceptually distinct color scales and adding explanatory elements to better highlight variations between model predictions. We will also enhance the figure captions and in-text descriptions to guide interpretation and ensure that the visual comparisons are more accessible and informative.
  Summary Assessment
  
  This manuscript presents valuable research on deep learning for coastal flood prediction, with promising results that could significantly advance the field. However, major revisions are needed to address issues related to model architecture presentation, computational efficiency analysis, uncertainty quantification, data imbalance handling, and real-world application context.
  
  With these improvements, the paper has the potential to make a significant contribution to both the technical literature on deep learning for environmental modeling and practical coastal planning applications.
  
  Response:
  We sincerely thank the reviewer for their encouraging assessment and constructive feedback. We appreciate the recognition of the potential impact of our work and fully acknowledge the areas that require improvement. In response, we plan to undertake substantial revisions to enhance the clarity of the model architecture presentation, provide a comprehensive computational efficiency analysis, incorporate or discuss uncertainty quantification, clarify our approach to data imbalance handling, and better contextualize the results for real-world coastal planning applications. We are confident that these revisions will strengthen the manuscript and align it more closely with the expectations of both technical and applied research communities.
  
  Citation: https://doi.org/10.5194/egusphere-2025-838-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (05 Jun 2025) by Lelys Bravo de Guenni

AR by Bilal Hassan on behalf of the Authors (24 Jul 2025): Author's response | Author's tracked changes | Manuscript

ED: Referee Nomination & Report Request started (11 Aug 2025) by Lelys Bravo de Guenni

RR by Anonymous Referee #2 (03 Sep 2025)

Suggestions for revision or reasons for rejection

The authors have made substantial improvements to the manuscript and have adequately addressed the majority of concerns raised in the first round. The computational efficiency analysis and uncertainty quantification represent significant enhancements. The following minor issues should be addressed for final acceptance:

Specific Minor Revision Comments
1. Data Imbalance Performance Analysis
While the authors explain their theoretical approach to handling data imbalance, they should provide a brief quantitative breakdown showing model performance specifically on flooded vs non-flooded pixels. A simple confusion matrix or performance metrics stratified by flood/no-flood would suffice.
2. Uncertainty Quantification Practical Guidance
The deep ensemble approach is well-implemented, but the authors should add 1-2 sentences providing practical guidance on uncertainty threshold interpretation for coastal planners (e.g., "uncertainty values above X suggest areas requiring additional hydrodynamic validation").
3. Figure 4 Clarity
The simplified architecture diagram may be too abstract. Consider adding labels for the novel components (MARX, SEE blocks) or a brief legend to help readers connect the simplified version to the detailed supplementary figure.
4. Model Limitations Discussion
Add a brief paragraph discussing model limitations, particularly regarding generalizability to coastal environments beyond the two study areas and data requirements for new regions.
5. Mathematical Notation Verification
Ensure the standardized notation described in the response letter is actually implemented consistently throughout the supplementary material.

These targeted revisions will enhance clarity and practical applicability without requiring fundamental changes to the methodology or results.

RR by Anonymous Referee #3 (07 Nov 2025)

Suggestions for revision or reasons for rejection

The authors developed a machine learning model designed to rapidly predict coastal flooding under varying sea-level rise (SLR) projections and shoreline adaptation scenarios, positioning it as a potential surrogate for traditional hydrodynamic models. Overall, the manuscript is well-structured and sufficiently clear for readers. The authors have also addressed the comments from two reviewers during the first round of review. However, my primary concern pertains to the significance of this study.
While the machine learning model offers the advantage of high computational efficiency compared to hydrodynamic models, this efficiency is typically most beneficial for short-term predictions, such as those related to storm surges induced by tropical cyclones. The focus of this study, however, is primarily on assessing the impacts of future SLR scenarios, which are generally characterized by slow processes that do not necessitate extremely high computational efficiency. Consequently, I believe that the significance and necessity of this research are considerably diminished. The value of this research would be greatly enhanced if it could be demonstrated that the machine learning model is also applicable for short-term scenario predictions. Conversely, for future SLR scenarios, the accuracy of the predictions becomes paramount. However, it is important to note that the accuracy of the machine learning model is inherently limited by the accuracy of the hydrodynamic model from which it was trained. Therefore, the accuracy of the hydrodynamic model directly influences the performance of the machine learning model. As such, the significance of this study should be clearly justified and thoroughly discussed before its publication.

ED: Publish subject to revisions (further review by editor and referees) (07 Nov 2025) by Lelys Bravo de Guenni

AR by Bilal Hassan on behalf of the Authors (18 Dec 2025): Author's response | Author's tracked changes | Manuscript

ED: Publish as is (12 Jan 2026) by Lelys Bravo de Guenni

AR by Bilal Hassan on behalf of the Authors (23 Jan 2026) Manuscript

Journal article(s) based on this preprint

11 Mar 2026

Climate adaptation-aware flood prediction for coastal cities using Deep Learning

Bilal Hassan, Areg Karapetyan, Aaron Chung Hin Chow, and Samer Madanat

Hydrol. Earth Syst. Sci., 30, 1333–1358, https://doi.org/10.5194/hess-30-1333-2026, 2026

Supplement

https://doi.org/10.5194/egusphere-2025-838-supplement

Viewed

Total article views: 4,487 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
2,727	1,675	85	4,487	218	72	144

Views and downloads (calculated since 27 Mar 2025)

Month	HTML	PDF	XML	Total
Mar 2025	98	34	4	136
Apr 2025	168	18	10	196
May 2025	146	90	6	242
Jun 2025	78	88	12	178
Jul 2025	70	106	0	176
Aug 2025	296	78	2	376
Sep 2025	1,216	56	4	1,276
Oct 2025	78	62	4	144
Nov 2025	60	108	4	172
Dec 2025	138	308	8	454
Jan 2026	116	166	8	290
Feb 2026	68	126	8	202
Mar 2026	134	238	12	384
Apr 2026	46	179	2	227
May 2026	15	18	1	34

Latest update: 24 May 2026

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

In this research, we developed an AI-driven framework that rapidly predicts floods in coastal areas, considering various shoreline protection strategies and different sea-level rise scenarios. By combining data from two coastal cities, our lightweight model delivers near real-time flood projections under various adaptation strategies. This approach can guide policymakers in designing effective defenses, ultimately promoting safer coastal communities and infrastructure.

