Towards Interpretable LSTM-based Modelling of Hydrological Systems
Luis Andres De la Fuente
Mohammad Reza Ehsani
Hoshin Vijai Gupta
Laura E. Condon
Abstract. Several studies have demonstrated the ability of Long Short-Term Memory (LSTM) machine-learning-based modelling to outperform traditional spatially lumped process-based modelling approaches for streamflow prediction. However, due mainly to the structural complexity of the LSTM network (which includes gating operations and sequential processing of the data), difficulties can arise when interpreting the internal processes and weights of the model.
Here, we propose and test a modification of the LSTM architecture that represents internal system processes in a manner analogous to a hydrological reservoir. Our architecture, called HydroLSTM, simulates behaviours inherent in a dynamic system, such as the sequential updating of a Markovian storage. Specifically, we modify how data are fed to the new representation to facilitate simultaneous access to past lagged inputs, thereby explicitly acknowledging the importance of trends and patterns in the data.
We compare the performance of the HydroLSTM and LSTM architectures using data from 10 hydro-climatically varied catchments, and further examine how the new architecture exploits the information in lagged inputs for 588 catchments across the USA. The HydroLSTM-based models require fewer cell states to obtain performance similar to that of their LSTM-based counterparts. Further, the weight patterns associated with the lagged input variables are interpretable and consistent with regional hydroclimatic characteristics (snowmelt-dominated, recent-rainfall-dominated, and historical-rainfall-dominated). These findings illustrate how the hydrological interpretability of LSTM-based models can be enhanced by appropriate architectural modifications that are physically and conceptually consistent with our understanding of the system.
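The preprint gives the full formulation of HydroLSTM. Purely as an illustration of the idea sketched above (a single gated storage, updated from a simultaneous window of lagged inputs), a minimal PyTorch-style cell might look like the following; the class name, the gate structure, and the n_lags parameter are our assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class HydroCell(nn.Module):
    """Illustrative single-storage gated cell. The state s plays the role of a
    hydrological reservoir, updated from a flattened window of lagged inputs.
    A sketch of the idea only, not the authors' exact HydroLSTM formulation."""

    def __init__(self, n_inputs: int, n_lags: int):
        super().__init__()
        n_feat = n_inputs * n_lags           # all lags are seen simultaneously
        self.input_gate = nn.Linear(n_feat, 1)
        self.forget_gate = nn.Linear(n_feat, 1)
        self.candidate = nn.Linear(n_feat, 1)
        self.output_gate = nn.Linear(n_feat, 1)

    def forward(self, x_lagged: torch.Tensor, s_prev: torch.Tensor):
        # x_lagged: (batch, n_inputs * n_lags) window of past forcings
        # s_prev:   (batch, 1) previous storage (the Markovian state)
        i = torch.sigmoid(self.input_gate(x_lagged))   # inflow fraction
        f = torch.sigmoid(self.forget_gate(x_lagged))  # retained-storage fraction
        c = torch.tanh(self.candidate(x_lagged))       # candidate inflow
        s = f * s_prev + i * c                         # reservoir-like state update
        q = torch.sigmoid(self.output_gate(x_lagged)) * torch.tanh(s)  # outflow
        return q, s
```

Because there is a single state and the lagged inputs enter the gates directly, the learned weights on each lag can be inspected in the way the abstract describes.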
Status: open (until 08 Jul 2023)
-
CC1: 'Comment on egusphere-2023-666', Grey Nearing, 22 Apr 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-666/egusphere-2023-666-CC1-supplement.pdf
-
AC1: 'Reply on CC1', Luis De La Fuente, 29 Apr 2023
We would like to thank Grey Nearing for reviewing and commenting on our paper. We find this discussion very interesting and feel that it will enrich the final version of the paper.
-
RC1: 'Comment on egusphere-2023-666', Tadd Bindas, 30 May 2023
Hello,
Thank you for the lovely preprint. I enjoyed reading your work and offer the following suggestions. I believe the paper should be reconsidered for HESS after major revisions, and I look forward to reading the next submission.
Best,
Tadd Bindas
- Does the paper address relevant scientific questions within the scope of HESS?
- Yes.
- Does the paper present novel concepts, ideas, tools, or data?
- The concept proposed by their HydroLSTM model is novel. The authors are looking to add more interpretability to the LSTM architecture and to obtain similar results with fewer cell states using the HydroLSTM code they developed.
- Are substantial conclusions reached?
- I’m not sure. As a summary of my understanding of the paper: the results of their first experiment show that a simplified LSTM framework (similar to our understanding of a reservoir) can use a single cell to learn a relationship from the input forcings. The second experiment shows how their model performs when compared against observations from 588 CAMELS basins.
- My confusion arises with how the authors train their HydroLSTM and LSTM in the experiments of Sections 5 and 6. From what I’ve read, and understood from talks at conferences, LSTM models should be trained using data from all basins, then tested at individual sites using either a PUB or PUR approach, or evaluated with a median NSE/KGE metric over all catchments. I do not believe the authors are doing this; thus, I am curious whether training their HydroLSTM and LSTM models on all catchments would show the same results.
- Are the scientific methods and assumptions valid and clearly outlined?
- Yes. Table 1 does a good job of showing the similarities between the storage and LSTM equations (the analogy is sketched just after this checklist).
- Are the results sufficient to support the interpretations and conclusions?
- I believe more work needs to be done to validate the conclusion that HydroLSTM provides comparable performance with LSTM, but with added interpretability. A PUR or PUB experiment to see how a HydroLSTM trained on all CAMELS basins performs would be appreciated.
- Is the description of experiments and calculations sufficiently complete and precise to allow their reproduction by fellow scientists (traceability of results)?
- Almost. I still need clarification on the model training procedure.
- Do the authors give proper credit to related work and clearly indicate their own new/original contribution?
- Yes
- Does the title clearly reflect the contents of the paper?
- Yes
- Does the abstract provide a concise and complete summary?
- Yes
- Is the overall presentation well-structured and clear?
- Yes
- Is the language fluent and precise?
- Yes
- Are mathematical formulae, symbols, abbreviations, and units correctly defined and used?
- It would be appreciated to italicize all equations when in-line. It was hard to read/locate them amongst the text. There are also some repeated variable names (See the comments for an example).
- Should any parts of the paper (text, formulae, figures, tables) be clarified, reduced, combined, or eliminated?
- The model training could be a little clearer (Similar to the above comment).
- Are the number and quality of references appropriate?
- Yes
- Is the amount and quality of supplementary material appropriate?
- Yes
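For readers without the preprint at hand, the storage/LSTM analogy the checklist refers to can be sketched as follows; the notation is the standard LSTM one, and the pairing with a discrete linear reservoir is a paraphrase of the paper's Table 1, not a quotation of it.

```latex
% Discrete linear reservoir: storage retains a fraction of its previous
% value and adds the new inflow; outflow is proportional to storage.
S_t = (1 - k)\, S_{t-1} + I_t, \qquad Q_t = k\, S_t

% LSTM cell state: the forget gate f_t plays the role of (1 - k), the
% input gate i_t scales the candidate "inflow" \tilde{c}_t, and the
% output gate o_t releases a fraction of the (squashed) state.
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad
h_t = o_t \odot \tanh(c_t)
```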
Major Comments:
- Can you italicize all in-line variables and equations? It’s hard to determine which parts of the text describe equations/LSTM properties. In some cases, I’ve had to reread a paragraph multiple times to search for an equation I missed.
- (Line 245) Are any static attributes used in model training?
- (Lines 252-259) I suggest swapping Calibration, Selection, and Evaluation periods with the training, validation, and testing periods within the parentheses. It looks like you are using the train, validation, and test verbiage throughout the paper, and only referring to calibration, selection, and evaluation periods once (Line 427) after being defined.
- (Section 5.1) Would it be possible to include a PUR analysis rather than a 10-basin (PUB) holdout? So, rather than having two basins from each region, you would test on all gages within a snowmelt-dominant or recent-rainfall-dominant region (a sketch of such a holdout follows this list of major comments). I believe this study would benefit from comparing how each LSTM performs on regions not included in the training set. This analysis would strengthen the claim that HydroLSTM has model performance similar to the LSTM, but with heightened interpretability.
- (Line 310) How many total catchments were included in the training period? It is mentioned in Section 6.1, but not in 5.1. Is it just one catchment?
- (Line 428) From my understanding of the literature, the best-performing LSTM models use forcings and attributes from all basins as inputs. For example, if there are 588 catchments, all catchments would be included in the training set; testing would then be done on all catchments to determine a median KGE. Is training HydroLSTM on all catchments, or using basin attributes, something you have explored? Is the optimal lag-memory hyperparameter the reason for not having an LSTM trained on the entire CAMELS set? More explanation would be appreciated.
- (Section 6) Is it possible to add a comparison against an LSTM applied to a large sample of catchments?
- (Section 6) Is it possible to add a PUB comparison to this section?
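To make the spatial-holdout protocol requested in the comments on Sections 5.1 and 6 concrete, a region-wise (PUR-style) split could be set up as in the following sketch; the basin IDs, region labels, and the commented-out train_model/evaluate helpers are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical catchment-to-region mapping (IDs and labels are placeholders).
regions = {
    "01013500": "snowmelt",
    "02177000": "recent_rainfall",
    "08178880": "historical_rainfall",
    # ... one entry per CAMELS catchment
}

def pur_splits(regions):
    """PUR (prediction in ungauged regions): hold out one whole region at a
    time and train on all catchments from the remaining regions."""
    by_region = defaultdict(list)
    for basin, region in regions.items():
        by_region[region].append(basin)
    for held_out, test_basins in by_region.items():
        train_basins = [b for r, basins in by_region.items()
                        if r != held_out for b in basins]
        yield held_out, train_basins, test_basins

for held_out, train_basins, test_basins in pur_splits(regions):
    # model = train_model(train_basins)      # hypothetical training helper
    # scores = evaluate(model, test_basins)  # e.g. median KGE over the holdout
    print(f"hold out {held_out}: train on {len(train_basins)} basins, "
          f"test on {len(test_basins)}")
```

Training on the remaining regions and reporting a median KGE over the held-out basins would directly test whether HydroLSTM's performance generalizes spatially.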
Minor Comments:
- (Affiliations) The s in the United States is cut off
- (Line 58) Is Expected Gradient supposed to be capitalized?
- (Line 107, Line 120) The symbol for the output gate, and the time-constant value, are both o. This could lead to some confusion.
- (Line 130-135) Physical state and informational state don’t need to be italicized.
- (Table 1) Are the brackets supposed to be facing outward? (ex: o = ]0,1[)
- (Line 212) Typo. There needs to be a space in “Wand” (it should read “W and”).
- (Line 253) I believe you mean “Commonly referred to as Training, Validation, and Testing.” You used evaluation twice in this part.
- (Line 286) You didn’t establish what a testing period is (see earlier comment for Line 253). Testing should be replaced with “Evaluation.”
- (Line 304) The header “5 Experiment 1” reads weird. Maybe change to “5 First Experiment?”
- (Figure 3) It may be clearer to the reader that rows are the Catchment Studied if you put the gage number on the row’s y-axis in bold above “Cells.”
- (Line 417) Same as the above comment. Maybe replace this with 6 Second Experiment. The section title reads weird.
- (Line 439) There is an unnecessary space before “However”
Citation: https://doi.org/10.5194/egusphere-2023-666-RC1