This work is distributed under the Creative Commons Attribution 4.0 License.
Role of the water balance constraint in the long short-term memory network: large-sample tests of rainfall-runoff prediction
Abstract. While deep learning (DL) models are effective in rainfall-runoff modelling, their dependence on data and lack of physical mechanisms can limit their use in hydrology. As there is as yet no consensus on incorporating the fundamental water balance into DL models, this paper presents an in-depth investigation of the effects of a water balance constraint on the long short-term memory (LSTM) network. Specifically, based on the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset, the LSTM and its architecturally mass-conserving variant (MC-LSTM) are trained basin-wise to provide rainfall-runoff predictions, and the robustness of the LSTM and MC-LSTM against data sparsity, random parameter initialization and contrasting climate conditions is then assessed across the contiguous United States. The large-sample tests show that the water balance constraint evidently improves the robustness of the basin-wise trained LSTM. On the one hand, as the amount of training data increases from 1 year to 15 years, incorporating the water balance constraint into the LSTM network decreases the sensitivity from 95.0 % to 32.7 %. On the other hand, the water balance constraint contributes to the stability of the LSTM for 450 (85 %) basins when only 3 years of training data are available. Meanwhile, the water balance constraint improves the transferability of the LSTM from the driest years to the wettest years for 318 (67 %) basins. Overall, the in-depth investigations of this paper provide insights into the use of DL models for rainfall-runoff modelling.
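For context, the architectural idea behind the MC-LSTM can be sketched briefly. The following is a minimal, illustrative PyTorch sketch in the spirit of the mass-conserving LSTM of Hoedt et al. (2021), not the authors' implementation; the class name, gate parameterization and single mass input are assumptions made here for illustration.

```python
# Minimal sketch of a mass-conserving LSTM cell (after Hoedt et al., 2021).
# Illustrative only: names, dimensions and gate conditioning are assumptions.
import torch
import torch.nn as nn


class MCLSTMCell(nn.Module):
    """Cell states store mass (water); each step conserves it exactly:
    c_new.sum() + m_out.sum() == c.sum() + mass_in by construction."""

    def __init__(self, n_aux: int, n_cells: int):
        super().__init__()
        self.n_cells = n_cells
        # Gates are conditioned on auxiliary (non-mass) inputs and the
        # normalised cell state, not on the mass input itself.
        self.input_gate = nn.Linear(n_aux + n_cells, n_cells)
        self.redistribution = nn.Linear(n_aux + n_cells, n_cells ** 2)
        self.output_gate = nn.Linear(n_aux + n_cells, n_cells)

    def forward(self, mass_in, aux, c):
        # mass_in: (batch, 1) non-negative mass input, e.g. precipitation
        # aux:     (batch, n_aux) auxiliary inputs, e.g. temperature
        # c:       (batch, n_cells) mass currently stored in the cells
        c_norm = c / (c.sum(dim=-1, keepdim=True) + 1e-8)
        features = torch.cat([aux, c_norm], dim=-1)

        # softmax -> the input gate sums to 1, so ALL incoming mass is stored
        i = torch.softmax(self.input_gate(features), dim=-1)
        # row-stochastic matrix -> internal redistribution conserves mass
        r = torch.softmax(
            self.redistribution(features).view(-1, self.n_cells, self.n_cells),
            dim=-1,
        )
        # sigmoid -> each cell can release at most the mass it holds
        o = torch.sigmoid(self.output_gate(features))

        m_total = torch.matmul(c.unsqueeze(1), r).squeeze(1) + i * mass_in
        m_out = o * m_total      # mass leaving the system (runoff + ET sink)
        c_new = m_total - m_out  # mass retained in storage
        return m_out, c_new
```

In contrast to the standard LSTM, whose cell state is unconstrained, every unit of precipitation entering this cell either remains in storage or leaves as output, which is the water balance constraint examined in the paper.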
Status: closed
- RC1: 'Comment on egusphere-2023-2841', Anonymous Referee #1, 03 Feb 2024
In this paper, four experiments were conducted to assess the performance of three types of rainfall-runoff models: the long short-term memory (LSTM) network, the mass-conserving LSTM (MC-LSTM), and the conceptual model EXP-HYDRO. The document is well organized, and the results are presented and discussed clearly. Nevertheless, certain critical issues arise in the formulation and execution of the study:
- The primary justification for the study is based on the authors' assertion that "there is yet no consensus on the effects of the water balance constraint on the use of the LSTM network." To support this claim, the authors cite the works of Cai et al. (2022) and Wang et al. (2023), which report higher accuracy with MC-LSTM than with LSTM. However, Cai et al. (2022) does not use LSTM networks, and Wang et al. (2023) applies physically informed LSTMs for a different purpose and at a scale unrelated to the rainfall-runoff models of Frame et al. (2022) and Frame et al. (2023). Consequently, the comparison appears unfair, involving different processes, evaluated at different scales, and with significantly different model instances.
- Considering the preceding point, the objective and main conclusion of the work are not clear. While each of the four experiments is thoroughly described, there appears to be a lack of novelty. For instance, the results of experiments 1 and 2 are somewhat expected and have already been published by other authors using diverse datasets.
- As highlighted by Kratzert et al. (2024, https://doi.org/10.5194/hess-2023-275), the training of effective rainfall-runoff LSTM models requires the use of multiple basins. However, in this study, several LSTM models are trained with single-basin data. Consequently, suboptimal LSTM models are likely obtained, and the superior performance demonstrated by MC-LSTM may be a consequence of these suboptimal models.
- It is unclear why the authors opted for the hyperparameters listed in Table 1. Moreover, it is essential to consider how the results of their experiments might be influenced by alternative hyperparameters. Could the enhanced performance of MC-LSTM be subject to change with different values of model hyperparameters?
Citation: https://doi.org/10.5194/egusphere-2023-2841-RC1
- AC1: 'Reply on RC1', Tongtiegang Zhao, 28 Feb 2024
- RC2: 'Comment on egusphere-2023-2841', Anonymous Referee #2, 06 Feb 2024
General Comments
The authors pursue a clear objective: investigating the robustness of LSTM and MC-LSTM against data sparsity, their stability against random parameter initialization, and their transferability under different climatic conditions. The paper is in my opinion highly relevant given the current developments in using AI in hydrology; it is generally well structured, easy to read and understand, and compact without missing relevant information. I therefore believe the manuscript is well suited for publication in HESS.
Some comments/suggestions that I believe would improve the manuscript, and that should be addressed before final publication, are the following:
- It is clearly stated and shown in the latest publications of the Kratzert/Nearing group that the full potential of LSTM applications is achieved when training the LSTM on a large number of diverse catchments, including static and dynamic catchment features (see also the most recent contribution: https://eartharxiv.org/repository/view/6363/). I would at least like to see a discussion of this topic and how it relates to the presented work.
- The results of Figure 2, for example, can be interpreted as the LSTMs being "better" than the EXP-HYDRO model. However, as noted by Beven (2020, https://doi.org/10.1002/hyp.13805), roughly 50 % of the catchments still show KGE values below 0.6 (see the KGE sketch after this list), in my opinion indicating serious problems in the modelling that lie outside the model structure and calibration procedure.
- I believe the statement in L374-375 is not supported by the experimental design of the paper – no LSTM model is trained simultaneously on many catchments here, so the statement needs to be modified – or I may have misread sections 2/3.
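For reference, the Kling-Gupta efficiency (KGE; Gupta et al., 2009) referred to above combines correlation, a variability ratio and a bias ratio, with KGE = 1 for a perfect simulation. A minimal NumPy sketch, illustrative only and not the paper's evaluation code:

```python
# Kling-Gupta efficiency (Gupta et al., 2009); illustrative sketch only.
import numpy as np


def kge(sim: np.ndarray, obs: np.ndarray) -> float:
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```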
Specific/technical Comments
I would like to make the following minor comments/suggestions:
- L51: “On” instead of “One”
- L53: the mass balance has already been introduced by Frame et al. (2023)
- L66: please define robustness as used here – in statistics it has a very specific meaning related to performance when a priori assumptions (e.g. normality) are violated
- L80: a small figure, as e.g. in Kratzert et al. (2018), to visualize the LSTM would not be bad; the equations are not intuitive, and it would also help in L98f to understand the implementation of the mass conservation (MC)
- L120: Equation 11 does not explain how M, ET, Q, Ps and Pr are calculated (see the water-balance sketch after this list) – this can also go into an appendix
- L125: 1-2 sentences on how EXP-HYDRO is wrapped into a DL architecture would be interesting
- L209: I do not think "maximumly" can be used – "to a maximum" instead!?
- L423: there are no further co-authors!
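For orientation, the terms listed in the comment on Equation 11 presumably enter a standard bucket-type water balance of the form used by models such as EXP-HYDRO; this is an assumption about the manuscript's notation, stated here only for readability:

$$\frac{\mathrm{d}M}{\mathrm{d}t} = P_s + P_r - ET - Q,$$

where $M$ is the catchment water storage, $P_s$ and $P_r$ are snowfall and rainfall, $ET$ is evapotranspiration and $Q$ is runoff.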
I feel the manuscript in general has the potential to be a valuable contribution to HESS; however, the questions and issues raised in the general comments would need to be addressed and discussed to a significant extent before final acceptance.
Citation: https://doi.org/10.5194/egusphere-2023-2841-RC2
- AC2: 'Reply on RC2', Tongtiegang Zhao, 28 Feb 2024
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 387 | 164 | 25 | 576 | 17 | 15 |