the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Rapid SpatioTemporal Flood Modelling via HydraulicsBased Graph Neural Networks
Abstract. Numerical modelling is a reliable tool for flood simulations, but accurate solutions are computationally expensive. In the recent years, researchers have explored datadriven methodologies based on neural networks to overcome this limitation. However, most models are used only for a specific case study and disregard the dynamic evolution of the flood wave. This limits their generalizability to topographies that the model was not trained on and in timedependent applications. In this paper, we introduce SWEGNN, a hydraulicsinspired surrogate model based on Graph Neural Networks (GNN) that can be used for rapid spatiotemporal flood modelling. The model exploits the analogy between finite volume methods, used to solve the shallow water equations (SWE), and GNNs. For a computational mesh, we create a graph by considering finitevolume cells as nodes and adjacent cells as connected by edges. The inputs are determined by the topographical properties of the domain and the initial hydraulic conditions. The GNN then determines how fluxes are exchanged between cells via a learned local function. We overcome the timestep constraints by stacking multiple GNN layers, which expand the considered space instead of increasing the time resolution. We also propose a multistepahead loss function along with a curriculum learning strategy to improve the stability and performance. We validate this approach using a dataset of twodimensional dike breach flood simulations on randomlygenerated digital elevation models, generated with a highfidelity numerical solver. The SWEGNN model predicts the spatiotemporal evolution of the flood for unseen topographies with a mean average error in time of 0.04 m for water depths and 0.004 m^{2}/s for unit discharges. Moreover, it generalizes well to unseen breach locations, bigger domains, and over longer periods of time, outperforming other deep learning models. On top of this, SWEGNN has a computational speedup of up to two orders of magnitude faster than the numerical solver. Our framework opens the doors to a new approach for replacing numerical solvers in timesensitive applications with spatiallydependant uncertainties.

Notice on discussion status
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint
(9481 KB)

The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(9481 KB)  Metadata XML
 BibTeX
 EndNote
 Final revised paper
Journal article(s) based on this preprint
Interactive discussion
Status: closed

RC1: 'Comment on egusphere2023284', Anonymous Referee #1, 22 Mar 2023
This paper introduces a novel machine learning algorithm for the simulation of 2D surface flooding. The algorithm takes inspiration from scientific computing techniques to develop the machine learning architecture, to arrive at a setup that provides a timedynamic and spatially distributed simulation, and a speedup in the order of factor 100 compared to a numerical simulation model. To my knowledge, this is the first application of this kind of methodology for flood problems. A number of questions therefore remain unsolved, and the considered case examples do not yet reflect all complexities that we encounter in the real world. On the other hand, this work is likely to inspire a variety of derivative works in hydrology, where the potential impact of these techniques so far is not really recognized. Because this work is a frontrunner, I have a number of suggestions that are mainly aimed at making the work accessible to a broader audience. I think these can be adressed in a minor revision, and I recommend the paper for publication, subject to these modifications.
Main comments:
1. Limitations  and important feature of numerical methods is that they (mostly) preserve e.g. mass and momentum. As far as I can see, this is not the case for the algorithm proposed here, and it is not straightforward to see how this can be implemented in the encoder/decoder architecture. This should be clearly mentioned as a limitation.
2. A major challenge with this type of model is actually implementation. For example, efficient data pipelines for custom graph operators are not a straightforward problem. I would strongly suggest a section or appendix summarizing the main computational challenges and how you suggest to adress them.
3. Along the same lines, I appreciate that the authors have tried to keep the Methdology description generic. However, this also makes it very hard to read in some places. I think you could greatly help the readers with an Appendix that includes a detailed variant of Figure 2, where you include equations 5 to 10, and where hyperparameters (G, p, etc.) reflect that actual values used in your implementation.
4. Regarding the hyperparameters, I strongly miss a table that summarizes the final set of hyperparameters. These are now spread out through the paper. In addition, the complexity of the different MLPs used throughout the methodology is currently entirely unclear.
5. The benchmark models are introduced in a few lines of text, and not easily understood. Also here, I suggest including an appendix that details these.Detailed comments:
line 106: a is already used as symbol for area, use v for velocity?
Figure 2:
top part of the figure
mention what are the static inputs and the outputs in your work in the figure
include a recursive arrow from output 1 back to dynamic inputs, to make it clear that the model prediction is recycled
bottom part of the figure
include the following captions above MLP, GNN and MLP: "process individual nodes", "process neighborhood of each node", "process individual nodes"
consider referencing the equation numbers inside the figure, to make it clear which illustration relates to what step
include j in one of the orange circles
edge feature encodings should receive and output arrow from the MLPs as well. As a whole, maybe this figure would be easier to understand if you distinguish h_si, h_di (why not h_dy?) and eps_ij separately (three sets of arrows)
caption
h_si and h_di are not clear from the caption (and explained somewhere much later in the main text)
line 138: explain that input features and hyperparameters will be explained in section 3.2
line 145: Ut−p:t are the dynamic node features > Ut−p:t are the dynamic node features (hydraulic variables) for time steps tp to t
line 148: define I_epsilon
line 154: explain that G is a hyperparameter
Equation 7: At this point it would be really good to know how terrain differences come into play, and you don't explain it until Section 3.2. I think a short explanation is needed here, because it actually hinders the understanding of the methodology
line 184: is the activation function only applied on the final graph layer?
line 191: Do the neural networks in the graph layers include bias terms? You refer to sparsity in multiple places in the paper, but you never explain it and why it is relevanmt (a large area of the image is 0 and does not need to be processed?)
Eq. 12: it is confusing that u is not the same vector as when introducing the SWE. I have no good suggestion for improving this though.
line 212: define unit normal vector (probably already needed in the context of Fig. 2)
Section 3.3: Consider merging this with 4.2. Searching for the actual parameters used in training is yet another unnecessary complication
Algorithm 1: Nice!
line 266: I don't understand why H is now fixed (previously, you introduced the curriculum). Is this the maximum of H considered?
Section 4.3: I think it would be interesting to see some illustrations of how e.g. MAE evolves over simulation time. This would help us understand better if the method is stable
line 298: I'm not sure what propagation rule you refer to. As mentioned above, these model variations need to be documented better.
Figure 5: Show units also on the legends of the difference plots
Figure 7+8: From these figures, speedup in the order of factor 100 seems more realistic to me (not 600 as mentioned somewhere in the paper)Citation: https://doi.org/10.5194/egusphere2023284RC1  AC1: 'Reply on RC1', Roberto Bentivoglio, 20 Apr 2023

RC2: 'The geometry of flooding', Daniel Klotz, 19 Jun 2023
This paper presents a MLbased approach for twodimensional hydrodynamic modeling of floodings. In my eyes, the design takes inspiration from Geometric Deep Learning to wisely choose inductive biases so that the presented approach is aligned with knowledge from physicalprinciples (e.g., interactions follow along a gradient) and numerical modeling (e.g., the depth of the network compensates for time resolution). The result is a graphbased approach that is two magnitudes faster than a comparable numerical model and its performance is as good or better as other baselines — at least in modelland. The ideas are novel, the evaluation is good, and the exposition is clear. I only have small nitpicks and hope that the paper will be published as soon.General CommentsAblation Study. I have two problems with the ablation. First: I think that the major part of the Copernicus audience will (sadly) not be familiar with the term and I would thus propose to introduce what an "ablation study" is and what you do there as part of your experimental setup (e.g., after Section 4.3). Second: I will emphasize that I very, very much appreciate that the study did ablations. Ablations have become one of the primary tools in ML research, and I believe that many of the current Deep Learning applications are sourly missing them. Still, I want to attest that the presented ablations are rather unusual. An ablation usually refers to an experiment that removes part of the network. Here, on the other hand, the authors did ablate the loss and the algorithm. I am not sure if it is necessary to deal with this minor idiosyncrasy, but I personally would not call this an ablation at all, but rather an exploration of the importance of hyperparameter choice (I do admit that both are very similar in intent). On the other hand, if the authors like their ablationframing, I would like to suggest to also include an ablation that checks what happens if one reduces the model itself (e.g., removing the activations in equation 8 PRELU > RELU > no activation; or reducing the inductive bias by not using the hh part in equation 7).Future work on speeding the model further up. I think a discussion on further speed ups of the modeling pipeline would be good. Is a twomagnitude speed up already so much that its not worth pursuing even faster models? Can we still expect speedups? How important is the tradeoff between the temporal resolution and the speed mentioned in L. 260? Etc. Here is, for example, a direction I spontaneously see (not saying that this should be included; just for providing inspiration): In theory one can directly output multiple (all) needed timesteps from the network. This would likely speed the process up considerably, since no recurrence over time (as shown, e.g., in Figure 3) would be required anymore. The price one pays then is that this "concurrent output" approach can only model the inundation for a finite time horizon (as a matter of fact, it will always model exactly to that horizon no matter what). One can also think of “inbetween solutions' ', in which the model ingests and outputs chunks of time. This is not so easy with graphs, and does break the nice alignment with the physical conception of the problem.Specific CommentsL. 16. I think I know what the authors want to say here, but in general I think that it is not clear what "bigger domains" and "longer periods" of time mean here. Bigger and longer than what?L. 71. Maybe it would be good to say "two orders of magnitude" instead of "up to 600 times speedups" to align the contribution with the abstract.L.121. I would suggest to remove the phrasing "... wellknown 'curse of dimensionality', ...", because (1) it might not be as well known as you think, and perhaps more importantly (2) the term "curse of dimensionality" refers to a plethora of phenomena and thus readers might associate something different with it. Instead, I would propose to write something direct like "For MLPs the number of parameters increases exponentially with ..."L. 126. Bronstein (2021) as a sole reference is probably a bit unfitting here, since LeCun and Bengio (1995) already discussed the importance of shared weights in CNNS (according to ideas lined out by LeCun in 1989). References: LeCun, Y., & Bengio, Y. (1995). Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10), 1995. LeCun, Y. (1989). Generalization and network design strategies. Connectionism in perspective, 19(143155), 18.L. 131. The word "avoid" might be misleading here, as they just don't include these specific physical inductive biases. Maybe use "include" instead.L. 167f. "Either" here would suggest that both nodefeatures need to be nonzero. I don't see that. Is this a formulation thing or am I missing something?L. 174. More a thought than a suggestion: For me the W(l) is like an additional MLP layer (that maps to R1) without an activation. It might thus also be interesting to do an ablation on W(l).L. 191. Can you clarify this sentence? I understand that you don't want areas with constant values in it, but in my opinion the sentence is formulated a bit vaguely.Algorithm 1. I think you should at least initialize the weighting coefficients and the CurriculumSteps variable since they are used so explicitly (technically the other variables would need to be chosen too, but for them I think it's less crucial).L. 266. This should refer to the ablation study to show that the choice of H=8 is not arbitrary.L. 267f. The sentence is a bit peculiar and maybe I missed it: I see that discharge is weighted with 3, but what is the weighting factor for the water depth (I assume 30)?L. 277. MAE: I guess technically this is the mean of the mean absolute errors per variable, since you calculate the error in u_o (i.e., the Mean of the L1 for each hydraulic variable) and not in u (i.e. the L1 norm over the vector that spans all hydraulic variables). For me your choice makes more sense anyways. More importantly: What is of interest to me here, would be to see how the MAE changes if you include the weighting factors of the loss in the evaluation and get an weighted MAE (so that water depth is more important).Citation: https://doi.org/
10.5194/egusphere2023284RC2  AC2: 'Reply on RC2', Roberto Bentivoglio, 27 Jun 2023
Interactive discussion
Status: closed

RC1: 'Comment on egusphere2023284', Anonymous Referee #1, 22 Mar 2023
This paper introduces a novel machine learning algorithm for the simulation of 2D surface flooding. The algorithm takes inspiration from scientific computing techniques to develop the machine learning architecture, to arrive at a setup that provides a timedynamic and spatially distributed simulation, and a speedup in the order of factor 100 compared to a numerical simulation model. To my knowledge, this is the first application of this kind of methodology for flood problems. A number of questions therefore remain unsolved, and the considered case examples do not yet reflect all complexities that we encounter in the real world. On the other hand, this work is likely to inspire a variety of derivative works in hydrology, where the potential impact of these techniques so far is not really recognized. Because this work is a frontrunner, I have a number of suggestions that are mainly aimed at making the work accessible to a broader audience. I think these can be adressed in a minor revision, and I recommend the paper for publication, subject to these modifications.
Main comments:
1. Limitations  and important feature of numerical methods is that they (mostly) preserve e.g. mass and momentum. As far as I can see, this is not the case for the algorithm proposed here, and it is not straightforward to see how this can be implemented in the encoder/decoder architecture. This should be clearly mentioned as a limitation.
2. A major challenge with this type of model is actually implementation. For example, efficient data pipelines for custom graph operators are not a straightforward problem. I would strongly suggest a section or appendix summarizing the main computational challenges and how you suggest to adress them.
3. Along the same lines, I appreciate that the authors have tried to keep the Methdology description generic. However, this also makes it very hard to read in some places. I think you could greatly help the readers with an Appendix that includes a detailed variant of Figure 2, where you include equations 5 to 10, and where hyperparameters (G, p, etc.) reflect that actual values used in your implementation.
4. Regarding the hyperparameters, I strongly miss a table that summarizes the final set of hyperparameters. These are now spread out through the paper. In addition, the complexity of the different MLPs used throughout the methodology is currently entirely unclear.
5. The benchmark models are introduced in a few lines of text, and not easily understood. Also here, I suggest including an appendix that details these.Detailed comments:
line 106: a is already used as symbol for area, use v for velocity?
Figure 2:
top part of the figure
mention what are the static inputs and the outputs in your work in the figure
include a recursive arrow from output 1 back to dynamic inputs, to make it clear that the model prediction is recycled
bottom part of the figure
include the following captions above MLP, GNN and MLP: "process individual nodes", "process neighborhood of each node", "process individual nodes"
consider referencing the equation numbers inside the figure, to make it clear which illustration relates to what step
include j in one of the orange circles
edge feature encodings should receive and output arrow from the MLPs as well. As a whole, maybe this figure would be easier to understand if you distinguish h_si, h_di (why not h_dy?) and eps_ij separately (three sets of arrows)
caption
h_si and h_di are not clear from the caption (and explained somewhere much later in the main text)
line 138: explain that input features and hyperparameters will be explained in section 3.2
line 145: Ut−p:t are the dynamic node features > Ut−p:t are the dynamic node features (hydraulic variables) for time steps tp to t
line 148: define I_epsilon
line 154: explain that G is a hyperparameter
Equation 7: At this point it would be really good to know how terrain differences come into play, and you don't explain it until Section 3.2. I think a short explanation is needed here, because it actually hinders the understanding of the methodology
line 184: is the activation function only applied on the final graph layer?
line 191: Do the neural networks in the graph layers include bias terms? You refer to sparsity in multiple places in the paper, but you never explain it and why it is relevanmt (a large area of the image is 0 and does not need to be processed?)
Eq. 12: it is confusing that u is not the same vector as when introducing the SWE. I have no good suggestion for improving this though.
line 212: define unit normal vector (probably already needed in the context of Fig. 2)
Section 3.3: Consider merging this with 4.2. Searching for the actual parameters used in training is yet another unnecessary complication
Algorithm 1: Nice!
line 266: I don't understand why H is now fixed (previously, you introduced the curriculum). Is this the maximum of H considered?
Section 4.3: I think it would be interesting to see some illustrations of how e.g. MAE evolves over simulation time. This would help us understand better if the method is stable
line 298: I'm not sure what propagation rule you refer to. As mentioned above, these model variations need to be documented better.
Figure 5: Show units also on the legends of the difference plots
Figure 7+8: From these figures, speedup in the order of factor 100 seems more realistic to me (not 600 as mentioned somewhere in the paper)Citation: https://doi.org/10.5194/egusphere2023284RC1  AC1: 'Reply on RC1', Roberto Bentivoglio, 20 Apr 2023

RC2: 'The geometry of flooding', Daniel Klotz, 19 Jun 2023
This paper presents a MLbased approach for twodimensional hydrodynamic modeling of floodings. In my eyes, the design takes inspiration from Geometric Deep Learning to wisely choose inductive biases so that the presented approach is aligned with knowledge from physicalprinciples (e.g., interactions follow along a gradient) and numerical modeling (e.g., the depth of the network compensates for time resolution). The result is a graphbased approach that is two magnitudes faster than a comparable numerical model and its performance is as good or better as other baselines — at least in modelland. The ideas are novel, the evaluation is good, and the exposition is clear. I only have small nitpicks and hope that the paper will be published as soon.General CommentsAblation Study. I have two problems with the ablation. First: I think that the major part of the Copernicus audience will (sadly) not be familiar with the term and I would thus propose to introduce what an "ablation study" is and what you do there as part of your experimental setup (e.g., after Section 4.3). Second: I will emphasize that I very, very much appreciate that the study did ablations. Ablations have become one of the primary tools in ML research, and I believe that many of the current Deep Learning applications are sourly missing them. Still, I want to attest that the presented ablations are rather unusual. An ablation usually refers to an experiment that removes part of the network. Here, on the other hand, the authors did ablate the loss and the algorithm. I am not sure if it is necessary to deal with this minor idiosyncrasy, but I personally would not call this an ablation at all, but rather an exploration of the importance of hyperparameter choice (I do admit that both are very similar in intent). On the other hand, if the authors like their ablationframing, I would like to suggest to also include an ablation that checks what happens if one reduces the model itself (e.g., removing the activations in equation 8 PRELU > RELU > no activation; or reducing the inductive bias by not using the hh part in equation 7).Future work on speeding the model further up. I think a discussion on further speed ups of the modeling pipeline would be good. Is a twomagnitude speed up already so much that its not worth pursuing even faster models? Can we still expect speedups? How important is the tradeoff between the temporal resolution and the speed mentioned in L. 260? Etc. Here is, for example, a direction I spontaneously see (not saying that this should be included; just for providing inspiration): In theory one can directly output multiple (all) needed timesteps from the network. This would likely speed the process up considerably, since no recurrence over time (as shown, e.g., in Figure 3) would be required anymore. The price one pays then is that this "concurrent output" approach can only model the inundation for a finite time horizon (as a matter of fact, it will always model exactly to that horizon no matter what). One can also think of “inbetween solutions' ', in which the model ingests and outputs chunks of time. This is not so easy with graphs, and does break the nice alignment with the physical conception of the problem.Specific CommentsL. 16. I think I know what the authors want to say here, but in general I think that it is not clear what "bigger domains" and "longer periods" of time mean here. Bigger and longer than what?L. 71. Maybe it would be good to say "two orders of magnitude" instead of "up to 600 times speedups" to align the contribution with the abstract.L.121. I would suggest to remove the phrasing "... wellknown 'curse of dimensionality', ...", because (1) it might not be as well known as you think, and perhaps more importantly (2) the term "curse of dimensionality" refers to a plethora of phenomena and thus readers might associate something different with it. Instead, I would propose to write something direct like "For MLPs the number of parameters increases exponentially with ..."L. 126. Bronstein (2021) as a sole reference is probably a bit unfitting here, since LeCun and Bengio (1995) already discussed the importance of shared weights in CNNS (according to ideas lined out by LeCun in 1989). References: LeCun, Y., & Bengio, Y. (1995). Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10), 1995. LeCun, Y. (1989). Generalization and network design strategies. Connectionism in perspective, 19(143155), 18.L. 131. The word "avoid" might be misleading here, as they just don't include these specific physical inductive biases. Maybe use "include" instead.L. 167f. "Either" here would suggest that both nodefeatures need to be nonzero. I don't see that. Is this a formulation thing or am I missing something?L. 174. More a thought than a suggestion: For me the W(l) is like an additional MLP layer (that maps to R1) without an activation. It might thus also be interesting to do an ablation on W(l).L. 191. Can you clarify this sentence? I understand that you don't want areas with constant values in it, but in my opinion the sentence is formulated a bit vaguely.Algorithm 1. I think you should at least initialize the weighting coefficients and the CurriculumSteps variable since they are used so explicitly (technically the other variables would need to be chosen too, but for them I think it's less crucial).L. 266. This should refer to the ablation study to show that the choice of H=8 is not arbitrary.L. 267f. The sentence is a bit peculiar and maybe I missed it: I see that discharge is weighted with 3, but what is the weighting factor for the water depth (I assume 30)?L. 277. MAE: I guess technically this is the mean of the mean absolute errors per variable, since you calculate the error in u_o (i.e., the Mean of the L1 for each hydraulic variable) and not in u (i.e. the L1 norm over the vector that spans all hydraulic variables). For me your choice makes more sense anyways. More importantly: What is of interest to me here, would be to see how the MAE changes if you include the weighting factors of the loss in the evaluation and get an weighted MAE (so that water depth is more important).Citation: https://doi.org/
10.5194/egusphere2023284RC2  AC2: 'Reply on RC2', Roberto Bentivoglio, 27 Jun 2023
Peer review completion
Journal article(s) based on this preprint
Data sets
Raw datasets for paper "Rapid SpatioTemporal Flood Modelling via HydraulicsBased Graph Neural Networks" Roberto Bentivoglio and Ron Bruijns https://doi.org/10.5281/zenodo.7639233
Video supplement
Video simulations for paper "Rapid SpatioTemporal Flood Modelling via HydraulicsBased Graph Neural Networks" Roberto Bentivoglio https://doi.org/10.5281/zenodo.7652663
Viewed
HTML  XML  Total  BibTeX  EndNote  

1,132  665  25  1,822  28  16 
 HTML: 1,132
 PDF: 665
 XML: 25
 Total: 1,822
 BibTeX: 28
 EndNote: 16
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1
Cited
3 citations as recorded by crossref.
 Rapid spatiotemporal flood modelling via hydraulicsbased graph neural networks R. Bentivoglio et al. 10.5194/hess2742272023
 Real time probabilistic inundation forecasts using a LSTM neural network F. Hop et al. 10.1016/j.jhydrol.2024.131082
 LSTMbased autoencoder models for realtime quality control of wastewater treatment sensor data S. Seshan et al. 10.2166/hydro.2024.167
Roberto Bentivoglio
Elvin Isufi
Sebastiaan Nicolas Jonkman
Riccardo Taormina
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(9481 KB)  Metadata XML