the Creative Commons Attribution 4.0 License.
Robustness of the long short-term memory network in rainfall-runoff prediction improved by the water balance constraint
Abstract. While the water balance constraint is fundamental to catchment hydrological models, there is yet no consensus on its role in the long short-term memory (LSTM) network. This paper is concentrated on the part that this constraint plays in the robustness of the LSTM network for rainfall-runoff prediction. Specifically, numerical experiments are devised to examine the robustness of the LSTM and its architecturally mass-conserving variant (MC-LSTM); and the Explainable Artificial Intelligence (XAI) is employed to interrogate how this constraint affects the robustness of the LSTM in learning rainfall-runoff relationships. Based on the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset, the LSTM, MC-LSTM and EXP-HYDRO models are trained under various amounts of training data and different seeds of parameter initialization over 531 catchments, leading to 95,580 (3×6×10×531) tests. Through large-sample tests, the results show that incorporating the water balance constraint into the LSTM improves the robustness, while the improvement tends to decrease as the amount of training data increases. Under 9 years’ training data, this constraint significantly enhances the robustness against data sparsity in 37 % (196 in 531) of the catchments and improves the robustness against parameter initialization in 73 % (386 in 531) of the catchments. In addition, it improves the robustness in learning rainfall-runoff relationships by increasing the median contribution of precipitation from 45.8 % to 47.3 %. These results point to the compensation effects between training data and process knowledge on the LSTM’s performance. Overall, the in-depth investigations facilitate insights into the use of the LSTM for rainfall-runoff prediction.
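The test count and the percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes that the factors in 3×6×10×531 stand for models, training-data amounts, initialization seeds and catchments, in that order; the variable names are illustrative only.

```python
# Experimental design as read from the abstract (the factor labels are an assumption).
n_models = 3          # LSTM, MC-LSTM, EXP-HYDRO
n_train_lengths = 6   # amounts of training data
n_seeds = 10          # seeds of parameter initialization
n_catchments = 531    # CAMELS catchments

n_tests = n_models * n_train_lengths * n_seeds * n_catchments
print(n_tests)

# Shares of catchments reported for 9 years of training data.
share_sparsity = 100 * 196 / n_catchments
share_init = 100 * 386 / n_catchments
print(round(share_sparsity), round(share_init))
```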
This preprint has been withdrawn.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2024-1449', Anonymous Referee #1, 10 Jun 2024
Having evaluated the earlier version/submission of the manuscript, I would like to state that the authors have largely improved the manuscript and clarified most of my concerns and comments. In particular, I like the addition of the IG analysis, which gives nice additional insight into the functioning of LSTMs and the role of the constraint in this context. So, overall, I believe this manuscript provides an interesting and important piece of research from which many readers might profit in their own work.
I have a few questions and comments that should be addressed in a revised version of the manuscript before publication:
- L15: robustness should briefly be defined, also in the abstract
- L18: The meaning of the sentence starting “In addition, …” is not understandable in this formulation; please reformulate.
- L68: Reference for XAI?
- L77: should be “…is passed…”
- L167: Why using catchments with high KGE – please add a reason for that
- L312: should be: “robust”
Citation: https://doi.org/10.5194/egusphere-2024-1449-RC1
AC1: 'Reply on RC1', Tongtiegang Zhao, 30 Jun 2024
Comment 1:
Having evaluated the earlier version/submission of the manuscript, I would like to state that the authors have largely improved the manuscript and clarified most of my concerns and comments. In particular, I like the addition of the IG analysis, which gives nice additional insight into the functioning of LSTMs and the role of the constraint in this context. So, overall, I believe this manuscript provides an interesting and important piece of research from which many readers might profit in their own work.
Response 1:
We appreciate the positive comments. In the last round of revision, we have made great effort to thoroughly improve the whole paper by following the insightful and constructive comments.
Comment 2:
I have a few questions and comments that should be addressed in a revised version of the manuscript before publication.
Response 2:
We shall keep on improving the paper.
Comment 3:
L15: robustness should briefly be defined, also in the abstract
Response 3:
The definition that “the robustness of hydrological models, which is the ability to behave consistently under different conditions, is critical for hydrological forecasting” shall be added to the revision.
Comment 4:
L18: The meaning of the sentence starting “In addition, …” is not understandable in this formulation; please reformulate.
Response 4:
The robustness in learning rainfall-runoff relationships is assessed by the range of variation in the contributions of input features under different amounts of training data. When the amount of training data decreases from 15 years to only 1 year, the degradation in the contribution of precipitation to runoff is much larger for the LSTM than for the MC-LSTM. The results indicate that the water balance constraint enhances the robustness in learning rainfall-runoff relationships. The improvement is smaller with more training data. However, in order to avoid giving readers the impression that we were deliberately overselling the results, the slight improvement of the robustness in learning rainfall-runoff relationships with 9 years’ training data is presented in the abstract.
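The measure described in this response can be sketched as follows. This is a minimal illustration, assuming that a feature's contribution is its share of the total absolute attribution; the function names and the attribution values are hypothetical and are not results from the paper.

```python
import numpy as np

def percent_contributions(attributions):
    """Convert per-feature attributions into percentage shares of the total absolute attribution."""
    magnitudes = np.abs(np.asarray(attributions, dtype=float))
    return 100.0 * magnitudes / magnitudes.sum()

def contribution_range(shares_by_length):
    """Range (max minus min) of each feature's share across training-data amounts."""
    shares = np.asarray(shares_by_length, dtype=float)
    return shares.max(axis=0) - shares.min(axis=0)

# Hypothetical attributions for illustration only.
# Rows: two training-data amounts; columns: precipitation and two other features.
model_a = [percent_contributions(row) for row in [[2, 3, 5], [5, 3, 2]]]
model_b = [percent_contributions(row) for row in [[4, 3, 3], [5, 3, 2]]]

range_a = contribution_range(model_a)
range_b = contribution_range(model_b)
print(range_a, range_b)
```

Under this reading, the model with the smaller range for precipitation is the one whose learned rainfall-runoff relationship changes less as training data become scarce.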
Comment 5:
L68: Reference for XAI?
Response 5:
Thank you very much for the constructive comment. The two important references on XAI, i.e., Alejandro et al. (2020) and Adadi et al. (2018), will be added to the revision.
Comment 6:
L77: should be “…is passed…”
Response 6:
Thank you for spotting the typo. We shall correct it accordingly.
Comment 7:
L167: Why using catchments with high KGE – please add a reason for that
Response 7:
Considering that the predictive performance is critical for extracting meaningful information from machine learning models (Murdoch et al., 2019; Jiang et al., 2022), this paper focuses on catchments with relatively high KGE. Specifically, following Sun et al. (2021), the attention is paid to the top 50 case study catchments in terms of the KGE value.
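The selection step mentioned in this response can be sketched as a simple ranking. The helper below is hypothetical and assumes that one KGE value per catchment is already available; it is not the authors' code.

```python
import numpy as np

def top_k_catchments(catchment_ids, kge_values, k=50):
    """Return the ids of the k catchments with the highest KGE, best first."""
    kge = np.asarray(kge_values, dtype=float)
    order = np.argsort(kge)[::-1][:k]  # indices sorted by descending KGE
    return [catchment_ids[i] for i in order]

# Hypothetical ids and KGE values for illustration only.
selected = top_k_catchments(["a", "b", "c", "d"], [0.2, 0.9, 0.5, 0.7], k=2)
print(selected)
```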
Comment 8:
L312: should be: “robust”
Response 8:
Thank you for spotting the typo. We shall correct it accordingly.
References:
Adadi, A. and Berrada, M.: Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI), IEEE ACCESS, 6, 52138–52160, https://doi.org/10.1109/ACCESS.2018.2870052, 2018.
Alejandro, B. A., Natalia, D.-R., Javier, D. S., Adrien, B., Siham, T., Alberto, B., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., and Herrera, F.: Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, INFORM FUSION, 58, 82–115, https://doi.org/10.1016/j.inffus.2019.12.012, 2020.
Jiang, S., Zheng, Y., Wang, C., and Babovic, V.: Uncovering flooding mechanisms across the contiguous United States through interpretive deep learning on representative catchments, Water Resour. Res., 58, https://doi.org/10.1029/2021WR030185, 2022.
Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl, R., and Yu, B.: Definitions, methods, and applications in interpretable machine learning, P NATL ACAD SCI USA, 116, 22071–22080, https://doi.org/10.1073/pnas.1900654116, 2019.
Sun, A. Y., Jiang, P., Mudunuru, M. K., and Chen, X.: Explore Spatio‐Temporal Learning of Large Sample Hydrology Using Graph Neural Networks, Water Resour. Res., 57, e2021WR030394, https://doi.org/10.1029/2021WR030394, 2021.
Citation: https://doi.org/10.5194/egusphere-2024-1449-AC1
-
RC2: 'Comment on egusphere-2024-1449', Anonymous Referee #2, 24 Jun 2024
This manuscript presents a study that attempts to assess the effects of mass conservation constraints on the robustness of LSTM neural networks at a local scale. The study adopts the definition of robustness from Manure et al. (2023), according to which “robustness is the ability to perform consistently across varying conditions.” Based on this definition, the study examines:
- Robustness against data sparsity
- Robustness against parameter initialization
- Robustness in learning rainfall-runoff relationships
I find that this study lacks novelty and that the questions it attempts to answer do not represent a substantial contribution to scientific progress. The manuscript's presentation and clarity could be significantly improved, and some of the results and conclusions are not well justified. The following are some points that I think the authors should consider to improve their contribution:
- The English usage should be reviewed.
- The structure of the paper gives the impression that the authors are merely testing three modeling frameworks and comparing their results. Large portions of the text simply describe their figures, and often the conclusions are case-specific, thus only valid for a selected group of catchments.
- In the section “Robustness against data sparsity,” the conclusions suggest that under certain circumstances, mass conservation constraints can enhance the accuracy of LSTM models. Beyond indicating that more data generally benefits unconstrained models, the study does not provide further information. It would be interesting to identify in which watersheds MC constraints enhance predictions. Additionally, there is a methodological inconsistency as the authors focus their analysis on the mean of KGE values rather than the range or standard deviation of KGE values. Thus, instead of studying robustness, they are analyzing accuracy against data sparsity.
- In the section “Robustness against parameter initialization,” it is indicated that the robustness of LSTM (measured as the standard deviation of KGE values) improves (i.e., standard deviation values decrease) when more data is available for model training. However, this result alone is not very useful. A model's prediction can be consistent but inaccurate. In other words, the standard deviation of the KGE values for a catchment could be small (indicating high robustness), yet those KGE values could still be very poor. Thus, robustness alone is not a very informative metric.
- The section “Robustness in learning rainfall-runoff relationships” could benefit from clearer figures. In this section, the statement “These results indicate that water balance constraint enhances the robustness in learning rainfall-runoff relationships” is not sufficiently supported. The extent to which a model output is explained by one or multiple variables does not indicate whether the model's ability to perform consistently across varying conditions is enhanced. These are two different questions: one is “To what extent do input variables contribute to a model prediction?” and the other is “Does the model prediction vary significantly if it is a function of more variables?”
Citation: https://doi.org/10.5194/egusphere-2024-1449-RC2
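The two quantities debated in this comment, the standard deviation of KGE across seeds and the range of mean KGE across training-data amounts, can be sketched as below. The KGE here follows the original Kling-Gupta formulation, which is an assumption since the discussion does not state the variant used; the function names and numbers are hypothetical.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency (original formulation): 1 is a perfect score."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def spread_across_seeds(kge_by_seed):
    """Standard deviation of KGE over initialization seeds (smaller means more consistent)."""
    return float(np.std(kge_by_seed))

def range_across_lengths(kge_table):
    """kge_table[i, j] is the KGE for training length i and seed j.
    Returns the range of the seed-mean KGE across training lengths."""
    mean_by_length = np.mean(np.asarray(kge_table, dtype=float), axis=1)
    return float(mean_by_length.max() - mean_by_length.min())

# Hypothetical values for illustration only.
obs = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
perfect = kge(obs, obs)
consistent_but_poor = spread_across_seeds([0.10, 0.10, 0.10])
sparsity_range = range_across_lengths([[0.2, 0.4], [0.6, 0.8]])
print(perfect, consistent_but_poor, sparsity_range)
```

The `consistent_but_poor` case illustrates the referee's point: a spread of zero says nothing about whether the KGE values themselves are acceptable, so the spread is best read alongside the mean.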
AC2: 'Reply on RC2', Tongtiegang Zhao, 30 Jun 2024
Comment 1:
This manuscript presents a study that attempts to assess the effects of mass conservation constraints on the robustness of LSTM neural networks at a local scale. The study adopts the definition of robustness from Manure et al. (2023), according to which “robustness is the ability to perform consistently across varying conditions.” Based on this definition, the study examines:
- Robustness against data sparsity
- Robustness against parameter initialization
- Robustness in learning rainfall-runoff relationships
Response 1:
Thank you very much for the brief summary of the paper. This paper is built upon a previous submission entitled “Role of the water balance constraint in the long short-term memory network: large-sample tests of rainfall-runoff prediction”. The two reviewers raised insightful and constructive comments for improving the paper. Accordingly, we have made great effort to thoroughly improve the whole paper in the last round of revision.
Below please find point-by-point responses to the review comments.
Comment 2:
I find that this study lacks novelty and that the questions it attempts to answer do not represent a substantial contribution to scientific progress.
Response 2:
Thank you for the comment. It is noted that the robustness against data sparsity and parameter initialization is a critical issue for deep learning (DL) models (Cai et al., 2022; Kratzert et al., 2024). In particular for safety-critical applications such as flood prediction, the robustness of DL models against parameter initialization plays an essential part in their trustworthiness to users (Hemachandra et al., 2023).
When revising the paper, we have conducted a literature survey to identify the contribution of the paper. It is found that the effect of the water balance constraint has been tested by considering extreme events (Frame et al., 2022), data sparsity at the grid scale (Li et al., 2024) and changing climate (Wi and Steinschneider, 2024). In the meantime, the effect of the water balance constraint on the robustness of the long short-term memory network (LSTM) against data sparsity and parameter initialization is yet to be investigated. Therefore, in this paper, the attention is paid to how the water balance constraint contributes to the robustness of the LSTM for the purpose of rainfall-runoff prediction.
Comment 3:
The manuscript's presentation and clarity could be significantly improved, and some of the results and conclusions are not well justified.
Response 3:
We shall try our best to improve the presentation and clarity of the manuscript. In the meantime, we shall perform additional numerical experiments to refine the results.
Comment 4:
The following are some points that I think the authors should consider to improve their contribution.
Response 4:
Thank you very much for the constructive comments. By following the comments, we shall keep on improving the paper.
Comment 5:
The English usage should be reviewed.
Response 5:
We are sorry for any confusion. We shall have the paper proofread by a native speaker when revising the paper.
Comment 6:
The structure of the paper gives the impression that the authors are merely testing three modeling frameworks and comparing their results.
Response 6:
Thank you for the comment and we agree on the concern. It is noted that these three experiments play a critical part in examining the robustness of the long short-term memory network for rainfall-runoff prediction. More importantly, the integrated gradient method is utilized to quantify the contributions of input features of precipitation, solar radiation, vapor pressure, maximum temperature, minimum temperature and day length.
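For readers unfamiliar with the integrated gradients method, the idea can be sketched on a toy model with an analytic gradient. This is not the authors' LSTM setup: the quadratic model, its weights, the three features and the zero baseline are all hypothetical, chosen so that the result can be checked by hand.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, n_steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum along
    the straight path from the baseline to the input."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    alphas = (np.arange(n_steps) + 0.5) / n_steps        # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # points along the path
    avg_grad = np.mean([grad_fn(p) for p in path], axis=0)
    return (x - baseline) * avg_grad

# Toy differentiable "model" standing in for a trained network.
weights = np.array([0.5, 0.2, 0.3])

def model(x):
    return float(np.sum(weights * x ** 2))

def model_grad(x):
    return 2.0 * weights * x

x = np.array([2.0, 1.0, 1.0])
baseline = np.zeros(3)
attributions = integrated_gradients(model_grad, x, baseline)
print(attributions, attributions.sum(), model(x) - model(baseline))
```

The attributions sum to the difference between the model output at the input and at the baseline, which is the completeness property that makes per-feature contributions comparable.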
It took considerable effort to reach the current experimental design, as shown in Figure 1 of the submission and the supplement of this reply. We hope that the value of the experimental design can be recognized.
Comment 7:
Large portions of the text simply describe their figures and often the conclusions are case-specific, thus only valid for a selected group of catchments.
Response 7:
Thank you. We have made an effort to elaborate on the results presented in the figures. The figures point to the contribution of the water balance constraint to the LSTM’s robustness.
As to the comment about “a selected group of catchments”, it is noted that the CAMELS dataset, which covers 531 catchments under various hydroclimatic conditions across the contiguous United States, is one of the most comprehensive datasets in the field of hydrology.
Comment 8:
In the section “Robustness against data sparsity,” the conclusions suggest that under certain circumstances, mass conservation constraints can enhance the accuracy of LSTM models. Beyond indicating that more data generally benefits unconstrained models, the study does not provide further information.
Response 8:
Thank you for the constructive comment. The first experiment across all the 531 catchments is carried out to showcase the effect of the water balance constraint under 1, 3, 6, 9, 12 and 15 years’ training data. The main findings are that incorporating the water balance constraint into the LSTM improves the robustness and that the improvement tends to decrease as the amount of training data increases. These results, which are straightforward, can serve as a useful guide on the amount of training data that is needed for rainfall-runoff prediction.
Comment 9:
It would be interesting to identify in which watersheds MC constraints enhance predictions.
Response 9:
Thank you very much for the insightful comment. We shall devise further experiments to elaborate on this issue.
Comment 10:
Additionally, there is a methodological inconsistency as the authors focus their analysis on the mean of KGE values rather than the range or standard deviation of KGE values. Thus, instead of studying robustness, they are analyzing accuracy against data sparsity.
Response 10:
Thank you. It is pointed out that the robustness against data sparsity is estimated by the range of variation in the accuracy under different amounts of training data. It means that the robustness can be assessed by the extent of degradation in the performance of models when faced with the reduction of the amount of training data. Therefore, we focus on the variation range of the mean of KGE values across different amounts of training data, which is consistent with the methodology.
Comment 11:
In the section “Robustness against parameter initialization,” it is indicated that the robustness of LSTM (measured as the standard deviation of KGE values) improves (i.e., standard deviation values decrease) when more data is available for model training. However, this result alone is not very useful. A model's prediction can be consistent but inaccurate. In other words, the standard deviation of the KGE values for a catchment could be small (indicating high robustness), yet those KGE values could still be very poor. Thus, robustness alone is not a very informative metric.
Response 11:
Thank you. In some safety-critical applications, such as flood predictions, the robustness of DL models against parameter initialization plays an essential part in the trustworthiness of model outputs to users (Hemachandra et al., 2023). Besides, the meaningful point is that this study found two methods to reduce the uncertainty caused by the random initialization of model parameters, that is, increasing the amount of training data and incorporating domain knowledge into DL models. This result is useful for operational applications of DL models.
Comment 12:
The section “Robustness in learning rainfall-runoff relationships” could benefit from clearer figures. In this section, the statement “These results indicate that water balance constraint enhances the robustness in learning rainfall-runoff relationships” is not sufficiently supported. The extent to which a model output is explained by one or multiple variables does not indicate whether the model's ability to perform consistently across varying conditions is enhanced. These are two different questions: one is “To what extent do input variables contribute to a model prediction?” and the other is “Does the model prediction vary significantly if it is a function of more variables?”
Response 12:
Thank you for the detailed comment. We agree that it is not surprising that the LSTM networks achieve the highest accuracy when there are 15 years’ training data. By setting different amounts of training data, the robustness in learning rainfall-runoff relationships is assessed by the range of variation in contributions of input features. It is meant to examine the robustness when faced with limited training data, i.e., “to what extent does the water balance constraint help the LSTM model to learn the rainfall-runoff relationships consistently under the conditions of data sparsity?” When revising the paper, we shall devise additional experiments to investigate these two different questions.
References
Cai, H., Liu, S., Shi, H., Zhou, Z., Jiang, S., and Babovic, V.: Toward improved lumped groundwater level predictions at catchment scale: mutual integration of water balance mechanism and deep learning method, J. Hydrol., 613, 128495, https://doi.org/10.1016/j.jhydrol.2022.128495, 2022.
Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, https://doi.org/10.5194/hess-26-3377-2022, 2022.
Hemachandra, A., Dai, Z., Singh, J., Ng, S.-K., and Low, B. K. H.: Training-Free Neural Active Learning with Initialization-Robustness Guarantees, in: Proceedings of the 40th International Conference on Machine Learning, International Conference on Machine Learning, 12931–12971, 2023.
Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train an LSTM on a single basin, Hydrol. Earth Syst. Sci. Discuss., 2024, 1–19, https://doi.org/10.5194/hess-2023-275, 2024.
Li, L., Dai, Y., Wei, Z., Shangguan, W., Zhang, Y., Wei, N., and Li, Q.: Enforcing Water Balance in Multitask Deep Learning Models for Hydrological Forecasting, J. Hydrometeorol., 25, 89–103, https://doi.org/10.1175/JHM-D-23-0073.1, 2024.
Read, J. S., Jia, X., Willard, J., Appling, A. P., Zwart, J. A., Oliver, S. K., Karpatne, A., Hansen, G. J. A., Hanson, P. C., Watkins, W., Steinbach, M., and Kumar, V.: Process-guided deep learning predictions of lake water temperature, Water Resour. Res., 55, 9173–9190, https://doi.org/10.1029/2019WR024922, 2019.
Sun, A. Y., Jiang, P., Mudunuru, M. K., and Chen, X.: Explore Spatio‐Temporal Learning of Large Sample Hydrology Using Graph Neural Networks, Water Resour. Res., 57, e2021WR030394, https://doi.org/10.1029/2021WR030394, 2021.
Wang, Y., Wang, W., Ma, Z., Zhao, M., Li, W., Hou, X., Li, J., Ye, F., and Ma, W.: A deep learning approach based on physical constraints for predicting soil moisture in unsaturated zones, Water Resour. Res., 59, e2023WR035194, https://doi.org/10.1029/2023WR035194, 2023.
Wi, S. and Steinschneider, S.: On the need for physical constraints in deep learning rainfall–runoff projections under climate change: a sensitivity analysis to warming and shifts in potential evapotranspiration, Hydrol. Earth Syst. Sci., 28, 479–503, https://doi.org/10.5194/hess-28-479-2024, 2024.
AC3: 'Reply on RC2', Tongtiegang Zhao, 18 Jul 2024
We are grateful to you for the critical comments on the revised manuscript. Given that there are extensive uses of the LSTM, we believe that the robustness plays an important part in the LSTM’s practical applications.
In addition to the detailed responses to the comments, we have devised two new experiments to examine the robustness of the LSTM:
Experiment 4 assesses the robustness against contrasting climate conditions. Specifically, two scenarios are conducted to examine the transferability of the three models between the wettest and the driest hydrological years, which are identified from the total precipitation.
Experiment 5 investigates the relationships between robustness and catchment attributes. Specifically, the catchments in which the water balance constraint significantly improves the robustness are identified. The relationships between the improvement of robustness and the catchment characteristics are analyzed.
As shown by Figure 1 in the supplement, there are in total 5 experiments on the robustness. The findings through the experiments are expected to lay the basis for the LSTM-based rainfall-runoff prediction.
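The year selection described for Experiment 4 can be sketched as below. This is only an illustration: it assumes a hydrological year that starts on 1 October and is labelled by the calendar year in which it ends, a convention the reply does not specify, and the function names and synthetic precipitation values are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

def hydrological_year(day, start_month=10):
    """Label a date by the hydrological year it belongs to (assumed to start on 1 October)."""
    return day.year + 1 if day.month >= start_month else day.year

def wettest_and_driest_years(dates, precipitation):
    """Sum daily precipitation per hydrological year and return (wettest, driest).
    Incomplete years at either end of the record should be removed beforehand."""
    totals = defaultdict(float)
    for day, amount in zip(dates, precipitation):
        totals[hydrological_year(day)] += amount
    wettest = max(totals, key=totals.get)
    driest = min(totals, key=totals.get)
    return wettest, driest

# Synthetic daily record covering three complete hydrological years (illustration only).
start, end = date(2000, 10, 1), date(2003, 9, 30)
dates = [start + timedelta(days=i) for i in range((end - start).days + 1)]
daily_rate = {2001: 1.0, 2002: 3.0, 2003: 2.0}  # hypothetical mm per day
precipitation = [daily_rate[hydrological_year(d)] for d in dates]

wettest, driest = wettest_and_driest_years(dates, precipitation)
print(wettest, driest)
```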
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2024-1449', Anonymous Referee #1, 10 Jun 2024
Having evaluated the earlier version/submission of the manuscript, I would like to state that the authors have largely improved the manuscript and clarified most of my concerns and comments. In particular I like the addition of the IG analysis given nice additional insight into the functioning of LSTMs and the role of constrain in this context. So, overall I belief this manuscript provides an interesting and important peace of research from which many readers might profit in their own work.
I have a few questions and comments that should be addressed in a revised version of the manuscript before publication.:
- L15: robustness should briefly be defined, also in the abstract
- L18: The sentence starting “In addition, …” the meaning is not understanding in this formulation, please reformulate.
- L68: Reference for XAI?
- L77: should be “…is passed…”
- L167: Why using catchments with high KGE – please add a reason for that
- L312: should be :”robust”
Citation: https://doi.org/10.5194/egusphere-2024-1449-RC1 -
AC1: 'Reply on RC1', Tongtiegang Zhao, 30 Jun 2024
Comment 1:
Having evaluated the earlier version/submission of the manuscript, I would like to state that the authors have largely improved the manuscript and clarified most of my concerns and comments. In particular, I like the addition of the IG analysis given nice additional insight into the functioning of LSTMs and the role of constrain in this context. So, overall I belief this manuscript provides an interesting and important piece of research from which many readers might profit in their own work.
Response 1:
We appreciate the positive comments. In the last round of revision, we have made great effort to thoroughly improve the whole paper by following the insightful and constructive comments.
Comment 2:
I have a few questions and comments that should be addressed in a revised version of the manuscript before publication.
Response 2:
We shall keep on improving the paper.
Comment 3:
L15: robustness should briefly be defined, also in the abstract
Response 3:
The definition that “the robustness of hydrological models, which is the ability to behave consistently under different conditions, is critical for hydrological forecasting” shall be added to the revision.
Comment 4:
L18: The sentence starting “In addition, …” the meaning is not understanding in this formulation, please reformulate.
Response 4:
The robustness in learning rainfall-runoff relationships is assessed by the range of variation in contributions of input features under different amounts of training data. When the amount of training data decreases from 15 years to only 1 year, the extent of degradation in the contributions of precipitation to the runoff of the LSTM is much bigger than that of the MC-LSTM. The results indicate that the water balance constraint enhances the robustness in learning rainfall-runoff relationships. The improvement is less with more training data. However, in order to avoid giving readers the impression that we were deliberately overselling the results, the slight improvement of the robustness in learning rainfall-runoff relationship with 9 years’ training data are presented in the abstract.
Comment 5:
L68: Reference for XAI?
Response 5:
Thank you very much for the constructive comment. The two important references on XAI, i.e., Alejandro et al. (2020) and Adadi et al. (2018), will be added to the revision.
Comment 6:
L77: should be “…is passed…”
Response 6:
Thank you for spotting the typo. We shall correct it accordingly.
Comment 7:
L167: Why using catchments with high KGE – please add a reason for that
Response 7:
Considering that the predictive performance is critical for extracting meaningful information from machine learning models (Murdoch et al., 2019; Jiang et al., 2022), this paper focuses on catchments with relatively high KGE. Specifically, following Sun et al. (2021), the attention is paid to the top 50 case study catchments in terms of the KGE value.
Comment 8:
L312: should be: “robust”
Response 8:
Thank you for spotting the typo. We shall correct it accordingly.
References:
Adadi, A., Berrada, M., Adadi, A., Berrada, M., Adadi, A., and Berrada, M.: Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI), IEEE ACCESS, 6, 52138–52160, https://doi.org/10.1109/ACCESS.2018.2870052, 2018.
Alejandro, B. A., Natalia, D.-R., Javier, D. S., Adrien, B., Siham, T., Alberto, B., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., and Herrera, F.: Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, INFORM FUSION, 58, 82–115, https://doi.org/10.1016/j.inffus.2019.12.012, 2020.
Jiang, S., Zheng, Y., Wang, C., and Babovic, V.: Uncovering flooding mechanisms across the contiguous United States through interpretive deep learning on representative catchments, Water Resour. Res., 58, https://doi.org/10.1029/2021WR030185, 2022.
Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl, R., and Yu, B.: Definitions, methods, and applications in interpretable machine learning, P NATL ACAD SCI USA, 116, 22071–22080, https://doi.org/10.1073/pnas.1900654116, 2019.
Sun, A. Y., Jiang, P., Mudunuru, M. K., and Chen, X.: Explore Spatio‐Temporal Learning of Large Sample Hydrology Using Graph Neural Networks, Water Resour. Res., 57, e2021WR030394, https://doi.org/10.1029/2021WR030394, 2021.
Citation: https://doi.org/10.5194/egusphere-2024-1449-AC1
-
RC2: 'Comment on egusphere-2024-1449', Anonymous Referee #2, 24 Jun 2024
This manuscript presents a study that attempts to assess the effects of mass conservation constraints on the robustness of LSTM neural networks at a local scale. The study adopts the definition of robustness from Manure et al. (2023), according to which “robustness is the ability to perform consistently across varying conditions.” Based on this definition, the study examines:
- Robustness against data sparsity
- Robustness against parameter initialization
- Robustness in learning rainfall-runoff relationships
I find that this study lacks novelty and that the questions it attempts to answer do not represent a substantial contribution to scientific progress. The manuscript's presentation and clarity could be significantly improved, and some of the results and conclusions are not well justified. The following are some points that I think the authors should consider to improve their contribution:
- The English usage should be reviewed.
- The structure of the paper gives the impression that the authors are merely testing three modeling frameworks and comparing their results. Large portions of the text simply describe their figures, and often the conclusions are case-specific, thus only valid for a selected group of catchments.
- In the section “Robustness against data sparsity,” the conclusions suggest that under certain circumstances, mass conservation constraints can enhance the accuracy of LSTM models. Beyond indicating that more data generally benefits unconstrained models, the study does not provide further information. It would be interesting to identify in which watersheds MC constraints enhance predictions. Additionally, there is a methodological inconsistency as the authors focus their analysis on the mean of KGE values rather than the range or standard deviation of KGE values. Thus, instead of studying robustness, they are analyzing accuracy against data sparsity.
- In the section “Robustness against parameter initialization,” it is indicated that the robustness of LSTM (measured as the standard deviation of KGE values) improves (i.e., standard deviation values decrease) when more data is available for model training. However, this result alone is not very useful. A model's prediction can be consistent but inaccurate. In other words, the standard deviation of the KGE values for a catchment could be small (indicating high robustness), yet those KGE values could still be very poor. Thus, robustness alone is not a very informative metric.
- The section “Robustness in learning rainfall-runoff relationships” could benefit from clearer figures. In this section, the statement “These results indicate that water balance constraint enhances the robustness in learning rainfall-runoff relationships” is not sufficiently supported. The extent to which a model output is explained by one or multiple variables does not indicate whether the model's ability to perform consistently across varying conditions is enhanced. These are two different questions: one is “To what extent do input variables contribute to a model prediction?” and the other is “Does the model prediction vary significantly if it is a function of more variables?”
Citation: https://doi.org/10.5194/egusphere-2024-1449-RC2
AC2: 'Reply on RC2', Tongtiegang Zhao, 30 Jun 2024
Comment 1:
This manuscript presents a study that attempts to assess the effects of mass conservation constraints on the robustness of LSTM neural networks at a local scale. The study adopts the definition of robustness from Manure et al. (2023), according to which “robustness is the ability to perform consistently across varying conditions.” Based on this definition, the study examines:
- Robustness against data sparsity
- Robustness against parameter initialization
- Robustness in learning rainfall-runoff relationships
Response 1:
Thank you very much for the brief summary of the paper. This paper is built upon a previous submission entitled “Role of the water balance constraint in the long short-term memory network: large-sample tests of rainfall-runoff prediction”. The two reviewers raised insightful and constructive comments for improving the paper. Accordingly, we have made great effort to thoroughly improve the whole paper in the last round of revision.
Please find point-by-point responses to the review comments below.
Comment 2:
I find that this study lacks novelty and that the questions it attempts to answer do not represent a substantial contribution to scientific progress.
Response 2:
Thank you for the comment. It is noted that the robustness against data sparsity and parameter initialization is a critical issue for deep learning (DL) models (Cai et al., 2022; Kratzert et al., 2024). In particular for safety-critical applications such as flood prediction, the robustness of DL models against parameter initialization plays an essential part in their trustworthiness to users (Hemachandra et al., 2023).
When revising the paper, we conducted a literature survey to identify its contribution. The effect of the water balance constraint has been tested under extreme events (Frame et al., 2022), data sparsity at the grid scale (Li et al., 2024) and a changing climate (Wi and Steinschneider, 2024). However, its effect on the robustness of the long short-term memory (LSTM) network against data sparsity and parameter initialization is yet to be investigated. Therefore, this paper focuses on how the water balance constraint contributes to the robustness of the LSTM for rainfall-runoff prediction.
Comment 3:
The manuscript's presentation and clarity could be significantly improved, and some of the results and conclusions are not well justified.
Response 3:
We shall try our best to improve the presentation and clarity of the manuscript. In addition, we shall perform additional numerical experiments to refine the results.
Comment 4:
The following are some points that I think the authors should consider to improve their contribution.
Response 4:
Thank you very much for the constructive comments. Following them, we shall continue to improve the paper.
Comment 5:
The English usage should be reviewed.
Response 5:
We apologize for any confusion. We shall have the paper proofread by a native speaker when revising it.
Comment 6:
The structure of the paper gives the impression that the authors are merely testing three modeling frameworks and comparing their results.
Response 6:
Thank you for the comment; we acknowledge the concern. These three experiments play a critical part in examining the robustness of the long short-term memory network for rainfall-runoff prediction. More importantly, the integrated gradients method is used to quantify the contributions of the input features: precipitation, solar radiation, vapor pressure, maximum temperature, minimum temperature and day length.
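For readers unfamiliar with the attribution method mentioned here, the following is a minimal sketch of integrated gradients, not the code used in the study. It assumes a toy two-input function with a hand-written gradient in place of a trained LSTM, an all-zero baseline, and a midpoint Riemann sum for the path integral. The names `integrated_gradients`, `toy_model` and `toy_grad` are hypothetical, and normalising absolute attributions into shares is only one possible way to express a percentage contribution.

```python
def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn(z) must return the gradient of the model output at input z.
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        frac = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        z = [b + frac * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_fn(z)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Attribution_i = (x_i - baseline_i) * path-averaged gradient_i
    return [(xi - b) * a for xi, b, a in zip(x, baseline, avg_grad)]


# Hypothetical toy "model" (not a hydrological model): q = p^2 + 0.5 * p * t_max
def toy_model(z):
    p, t_max = z
    return p ** 2 + 0.5 * p * t_max


def toy_grad(z):
    p, t_max = z
    return [2 * p + 0.5 * t_max, 0.5 * p]


x = [3.0, 2.0]          # illustrative input: [precipitation, max temperature]
baseline = [0.0, 0.0]   # assumed all-zero baseline
attr = integrated_gradients(toy_grad, x, baseline)

# One possible normalisation: share of the total absolute attribution.
total = sum(abs(a) for a in attr)
share = [abs(a) / total for a in attr]
```

A useful sanity check on any implementation is the completeness property: the attributions should sum to the difference between the model output at the input and at the baseline.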
It took considerable effort to arrive at the current experimental design, as shown in Figure 1 of the submission and in the supplement to this reply. We hope that the value of the experimental design can be recognized.
Comment 7:
Large portions of the text simply describe their figures and often the conclusions are case-specific, thus only valid for a selected group of catchments.
Response 7:
Thank you. We have made an effort to elaborate on the results presented in the figures, which point to the contribution of the water balance constraint to the LSTM’s robustness.
Regarding the comment about “a selected group of catchments”, the CAMELS dataset, which covers 531 catchments under various hydroclimatic conditions across the contiguous United States, is one of the most comprehensive datasets in hydrology.
Comment 8:
In the section “Robustness against data sparsity,” the conclusions suggest that under certain circumstances, mass conservation constraints can enhance the accuracy of LSTM models. Beyond indicating that more data generally benefits unconstrained models, the study does not provide further information.
Response 8:
Thank you for the constructive comment. The first experiment, covering all 531 catchments, showcases the effect of the water balance constraint under 1, 3, 6, 9, 12 and 15 years of training data. The main findings are that incorporating the water balance constraint into the LSTM improves the robustness and that the improvement tends to decrease as the amount of training data increases. These straightforward results can serve as a useful guide to the amount of training data needed for rainfall-runoff prediction.
Comment 9:
It would be interesting to identify in which watersheds MC constraints enhance predictions.
Response 9:
Thank you very much for the insightful comment. We shall devise further experiments to elaborate on this issue.
Comment 10:
Additionally, there is a methodological inconsistency as the authors focus their analysis on the mean of KGE values rather than the range or standard deviation of KGE values. Thus, instead of studying robustness, they are analyzing accuracy against data sparsity.
Response 10:
Thank you. The robustness against data sparsity is estimated by the range of variation in accuracy under different amounts of training data. That is, robustness is assessed by the extent to which model performance degrades when the amount of training data is reduced. Therefore, we focus on the range of variation of the mean KGE values across different amounts of training data, which is consistent with the methodology.
Comment 11:
In the section “Robustness against parameter initialization,” it is indicated that the robustness of LSTM (measured as the standard deviation of KGE values) improves (i.e., standard deviation values decrease) when more data is available for model training. However, this result alone is not very useful. A model's prediction can be consistent but inaccurate. In other words, the standard deviation of the KGE values for a catchment could be small (indicating high robustness), yet those KGE values could still be very poor. Thus, robustness alone is not a very informative metric.
Response 11:
Thank you. In some safety-critical applications, such as flood prediction, the robustness of DL models against parameter initialization plays an essential part in the trustworthiness of model outputs to users (Hemachandra et al., 2023). In addition, this study identifies two ways to reduce the uncertainty caused by the random initialization of model parameters: increasing the amount of training data and incorporating domain knowledge into DL models. This result is useful for operational applications of DL models.
Comment 12:
The section “Robustness in learning rainfall-runoff relationships” could benefit from clearer figures. In this section, the statement “These results indicate that water balance constraint enhances the robustness in learning rainfall-runoff relationships” is not sufficiently supported. The extent to which a model output is explained by one or multiple variables does not indicate whether the model's ability to perform consistently across varying conditions is enhanced. These are two different questions: one is “To what extent do input variables contribute to a model prediction?” and the other is “Does the model prediction vary significantly if it is a function of more variables?”
Response 12:
Thank you for the detailed comment. We agree that it is not surprising that the LSTM networks achieve the highest accuracy when there are 15 years of training data. By setting different amounts of training data, the robustness in learning rainfall-runoff relationships is assessed by the range of variation in the contributions of input features. It is meant to examine the robustness under limited training data, i.e., “to what extent does the water balance constraint help the LSTM model to learn the rainfall-runoff relationships consistently under conditions of data sparsity?” When revising the paper, we shall devise additional experiments to investigate these two different questions.
References
Cai, H., Liu, S., Shi, H., Zhou, Z., Jiang, S., and Babovic, V.: Toward improved lumped groundwater level predictions at catchment scale: mutual integration of water balance mechanism and deep learning method, J. Hydrol., 613, 128495, https://doi.org/10.1016/j.jhydrol.2022.128495, 2022.
Frame, J. M., Kratzert, F., Klotz, D., Gauch, M., Shalev, G., Gilon, O., Qualls, L. M., Gupta, H. V., and Nearing, G. S.: Deep learning rainfall–runoff predictions of extreme events, Hydrol. Earth Syst. Sci., 26, 3377–3392, https://doi.org/10.5194/hess-26-3377-2022, 2022.
Hemachandra, A., Dai, Z., Singh, J., Ng, S.-K., and Low, B. K. H.: Training-Free Neural Active Learning with Initialization-Robustness Guarantees, in: Proceedings of the 40th International Conference on Machine Learning, International Conference on Machine Learning, 12931–12971, 2023.
Kratzert, F., Gauch, M., Klotz, D., and Nearing, G.: HESS Opinions: Never train an LSTM on a single basin, Hydrol. Earth Syst. Sci. Discuss., 2024, 1–19, https://doi.org/10.5194/hess-2023-275, 2024.
Li, L., Dai, Y., Wei, Z., Shangguan, W., Zhang, Y., Wei, N., and Li, Q.: Enforcing Water Balance in Multitask Deep Learning Models for Hydrological Forecasting, J. Hydrometeorol., 25, 89–103, https://doi.org/10.1175/JHM-D-23-0073.1, 2024.
Read, J. S., Jia, X., Willard, J., Appling, A. P., Zwart, J. A., Oliver, S. K., Karpatne, A., Hansen, G. J. A., Hanson, P. C., Watkins, W., Steinbach, M., and Kumar, V.: Process-guided deep learning predictions of lake water temperature, Water Resour. Res., 55, 9173–9190, https://doi.org/10.1029/2019WR024922, 2019.
Sun, A. Y., Jiang, P., Mudunuru, M. K., and Chen, X.: Explore Spatio‐Temporal Learning of Large Sample Hydrology Using Graph Neural Networks, Water Resour. Res., 57, e2021WR030394, https://doi.org/10.1029/2021WR030394, 2021.
Wang, Y., Wang, W., Ma, Z., Zhao, M., Li, W., Hou, X., Li, J., Ye, F., and Ma, W.: A deep learning approach based on physical constraints for predicting soil moisture in unsaturated zones, Water Resour. Res., 59, e2023WR035194, https://doi.org/10.1029/2023WR035194, 2023.
Wi, S. and Steinschneider, S.: On the need for physical constraints in deep learning rainfall–runoff projections under climate change: a sensitivity analysis to warming and shifts in potential evapotranspiration, Hydrol. Earth Syst. Sci., 28, 479–503, https://doi.org/10.5194/hess-28-479-2024, 2024.
AC3: 'Reply on RC2', Tongtiegang Zhao, 18 Jul 2024
We are grateful to you for the critical comments on the revised manuscript. Given the extensive use of the LSTM, we believe that robustness plays an important part in its practical applications.
In addition to the detailed responses to the comments, we have devised two new experiments to examine the robustness of the LSTM:
Experiment 4 assesses the robustness against contrasting climate conditions. Specifically, two scenarios are used to examine the transferability of the three models between the wettest and the driest hydrological years, which are identified from the total precipitation.
Experiment 5 investigates the relationships between robustness and catchment attributes. Specifically, the catchments in which the water balance constraint significantly improves the robustness are identified, and the relationships between the improvement in robustness and catchment characteristics are analyzed.
As shown in Figure 1 of the supplement, there are 5 experiments on robustness in total. The findings of these experiments are expected to lay the basis for LSTM-based rainfall-runoff prediction.
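To make the two robustness measures discussed in the replies above concrete, the sketch below computes them for a single catchment. It is an illustration under stated assumptions, not the code used in the study. The KGE follows the standard form with correlation, variability ratio and bias ratio. The use of the population standard deviation, the function names and the example KGE values are all hypothetical choices made for this sketch.

```python
import math
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = mean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)  # linear correlation
    alpha = sd_s / sd_o      # variability ratio
    beta = mu_s / mu_o       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def spread_across_seeds(kge_by_seed):
    """Robustness against parameter initialization:
    standard deviation of KGE across seeds (smaller is more robust)."""
    return pstdev(kge_by_seed)


def range_across_training_lengths(kge_by_years):
    """Robustness against data sparsity:
    range of the seed-averaged KGE across training lengths (smaller is more robust).

    kge_by_years maps a training length in years to the KGE values of all seeds.
    """
    means = [mean(values) for values in kge_by_years.values()]
    return max(means) - min(means)


# Hypothetical KGE values for one catchment, made up purely for illustration.
example = {1: [0.40, 0.50, 0.45], 9: [0.70, 0.72, 0.71], 15: [0.75, 0.75, 0.75]}
sparsity_range = range_across_training_lengths(example)
seed_spread_1yr = spread_across_seeds(example[1])
```

Reporting the seed-averaged KGE alongside both measures addresses the referee's point that a small spread can coincide with poor accuracy.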