the Creative Commons Attribution 4.0 License.
Toward Routing River Water in Land Surface Models with Recurrent Neural Networks
Abstract. Machine learning is playing an increasing role in hydrology, supplementing or replacing physics-based models. One notable example is the use of recurrent neural networks (RNNs) for forecasting streamflow given observed precipitation and geographic characteristics. Training of such a model over the continental United States has demonstrated that a single set of model parameters can be used across independent catchments, and that RNNs can outperform physics-based models. In this work, we take a next step and study the performance of RNNs for river routing in land surface models (LSMs). Instead of observed precipitation, the LSM-RNN uses instantaneous runoff calculated from physics-based models as an input. We train the model with data from river basins spanning the globe and test it in streamflow hindcasts. The model demonstrates skill at generalization across basins (predicting streamflow in unseen catchments) and across time (predicting streamflow during years not used in training). We compare the predictions from the LSM-RNN to an existing physics-based model calibrated with a similar dataset and find that the LSM-RNN outperforms the physics-based model. Our results give further evidence that RNNs are effective for global streamflow prediction from runoff inputs and motivate the development of complete routing models that can capture nested sub-basin connections.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-1206', Anonymous Referee #1, 22 Jul 2024
Dear Authors,
In this paper, LSTM networks are trained at two levels (the US and more globally) to simulate streamflow. The paper investigates: (a) the potential benefits of using surface and subsurface discharge information instead of precipitation (the most common input variable in rainfall-runoff modeling in almost all previous studies) for representing a water-routing model, and (b) its generalization performance at both temporal and spatial scales. I find the study's concept of replacing precipitation in LSTM inputs with runoff-related variables to be interesting and the paper is overall well written. However, I believe that some major revisions are necessary before it can be considered for publication.
Introduction – The various concepts of the context are explained well and are interesting. However, the review of existing literature related to these concepts is almost completely missing. A good introduction should include key concepts of the problem at hand, i.e. water routing (which is addressed in the current version). It should then review what has been done so far (both classical and AI-based methods) related to these concepts, thereby revealing the current gap(s) to which your paper would contribute (this is poorly addressed in the current version). Although this section could be more engaging by concisely explaining water routing ideas in physics-based models.
Methods – Important LSTM training details are missing. For example, the loss function, with its full definition, is a crucial element of the optimization algorithm and should be presented in the Methods section, not in the Metrics section (and it is not sufficient to refer the reader to another work for its definition). The optimization algorithm and the LSTM architecture are also completely missing here and throughout the paper.
Benchmark – In its current form, the comparison with LISFLOOD is not fully justified in my opinion, as the existing LISFLOOD simulations were conducted under a different setup that seems to be unknown to the authors. Such simulations involve several subtleties that need to be carefully managed; otherwise, any conclusions drawn would be biased. Why don’t the authors conduct these simulations themselves under controlled conditions corresponding to their LSTM experiments?
Results (and Discussion) – The analysis of the results does not appear to be sufficiently in-depth, particularly in relation to the few previous regionalization studies using LSTMs over the US continent. There should be a thorough discussion comparing the findings of this study with those from previous research to highlight the contributions and significance of your work, or, to explain any potential divergence from their results (this is missing in the current version).
The Use of the Term "Forecast" – Based on the content, this paper is not about forecasting but rather about prediction (simulation). This error should be corrected (this mistake does not appear in the Conclusion, where it correctly states: “We have successfully trained and validated an LSTM for the task of predicting streamflow from runoff worldwide”).
MINOR COMMENTS
- PDF Version – The PDF version of the paper did not include line numbers, which made it very impractical for review.
- P.2, Introduction – The phrase “common ungauged basins” in the sentence “This indicates that information in large-scale hydrological datasets is sufficient for generalization tasks, especially to the common ungauged basins (Nearing et al., 2021)” needs clarification. What does “common ungauged basins” mean in this context?
- Table 1 – Please specify the range for each level presented in the table.
- Page 9, Line 6 – Remove "mean" in “mean squared error,” as the term does not include any averaging.
- Appendix B – What are the tested values for each of the three hyperparameters? This information is important and concise enough to be included in the main text.
- Figure 5 – Place the legend above the subplots as it applies to both of them.
- Figure 6 – (Maybe) Place the labels (a), (b), (c), (d) inside the respective subplots to save space.
- P. 16 – It would be interesting if you could present the top 4-5 attributes that, according to your results, show a relationship with model performance. For instance, in which regions are variables like “karst percent cover” and “groundwater table depth” explanatory to some degree in terms of model performance?
- Figure 8 (caption) – Provide the definition of the aridity index both in the caption and in the text. For instance, you mention "Drier regions" (i.e., regions with lower aridity index),” but it is natural to expect that the higher the index of a region, the more the climate lacks effective moisture. Additionally, the following sentence in the caption should be stated more carefully: “There is a tendency for worse scores for smaller aridity indexes (i.e., drier basins),” since at the same range of aridity indices, there are basins with good NSE values. Also, state what each point represents in the figure.
- P.18 – “However, it is not clear if this increase in performance is due to a change in the LSTM model.” How is Nearing et al.’s LSTM model different from yours? This is an example of studies that should be included in the literature review. The provided context and highlighted differences can then be used as an element of result analysis in your discussion section.
- P.18 – The equivalency between gauged and time-split configuration, as well as between ungauged and basin-split, should be mentioned at their first introduction. Additionally, consider using the terms "gauged" and "ungauged" instead of "time-split" and "basin-split," as these are far more intuitive.
- P.18 – The conclusion “suggesting that drier regions pose unique challenges for the LSTM model” is incorrect. As mentioned above, many of your basins with good NSEs fall within the same aridity interval. Please revise this conclusion to reflect the actual results.
- P.19 – Remove the parentheses around “Hoedt et al., 20211.”
- I find the mass balance analysis interesting. You may consider placing it inside the main body of the paper.
Citation: https://doi.org/10.5194/egusphere-2024-1206-RC1
AC1: 'Reply on RC1', Mauricio Lima, 12 Sep 2024
RC2: 'Comment on egusphere-2024-1206', Anonymous Referee #2, 04 Oct 2024
Lima et al. (2024) build a machine learning method based on Recurrent Neural Networks (RNNs) for river routing in global-scale Land Surface Models (LSMs). Their LSM-RNN uses surface and sub-surface runoff from ERA5-Land reanalysis (the LSM within ERA5-Land is ECLand, formerly HTESSEL) – this method is considered a hybrid approach that blends output from a physical-based LSM with ML-based river routing. Lima et al. show their hybrid LSM-RNN outperforms a fully physical-based method, GloFAS-ERA5 (HTESSEL + LISFLOOD river routing if using version 2, needs to be confirmed, see below), when evaluated against a global network of river discharge observations.
This paper is an interesting and worthwhile contribution to the ongoing body of work on global-scale hydrological modelling. Recent work (i.e. Nearing et al., 2024) has demonstrated strong performance of a fully-ML based model compared to physical-based LISFLOOD model used in GloFAS version 4. Lima et al. (2024) demonstrates a hybrid approach is also valuable. This is particularly interesting as nearly all climate and Numerical Weather Prediction (NWP) models produce runoff as a standard output, but not river discharge. The LSM-RNN method by Lima et al. seems to perform even better than physical-based river routing, and is likely more computationally efficient (not demonstrated here, but efficiency of ML methods is well demonstrated elsewhere). Even for NWP models that have a physical-based river routing scheme, such as ECMWF’s ECLand LSM model that uses CaMa-Flood (Boussetta et al., 2021), an ML-based river routing method could provide benefits.
Overall, I recommend this paper to be published, but my main comment is that there is first some work to do to increase the level of methodological detail. It would be difficult for others to fully understand and apply the method as it is currently presented in this paper. Please find my other main and minor points below.
Main points
- Pg3, bullet iv: Which version of GloFAS is used in the paper? Version 2 and previous were forced by runoff from the ECMWF land surface model ECLand (formerly HTESSEL) with river discharge produced by the LISFLOOD river routing scheme, whereas from version 3, the full LISFLOOD hydrological model was used. This is quite important given the premise of the benchmarking here. If using version 2 (forced with HTESSEL + LISFLOOD river routing) then this work benchmarks only the river routing part, but if GloFAS version 3 or newer, then GloFAS is driven by a full LISFLOOD hydrological model (forced with e.g. precipitation and temperature from ERA5, rather than runoff). See the GloFAS version system changes and associated documentation for details: https://confluence.ecmwf.int/display/CEMS/GloFAS+versioning+system
- Again on Pg 10, line 5-6: If GloFAS v3 onwards, then it uses the fully physical-based LISFLOOD model, if using GloFAS v2 it uses runoff from ECMWF ECLand + the LISFLOOD river runoff scheme.
- Pg 4, 2nd line from bottom: GloFAS is forced with ERA5, not ERA5-Land. While the differences are not strong, ERA5-Land does show better performance for hydrological modelling (see Muñoz-Sabater et al. (2021) for a hydrological benchmark on GloFAS with both). These differences impact the benchmarking in Lima et al. (2024), and the details and differences in experimental set-up must be qualified.
- Pg 8, last para, line 5-6: This seems to be where the detail of the ML model as used in this paper is outlined. It is confined to Appendix B, with details found in code uploaded to Zenodo. While I strongly support and compliment the authors for uploading their code, I still think there is insufficient detail and consideration of limitations explained within the main manuscript. The method is very difficult to repeat without more detail explained to the reader. Figure 3 outlines the RNN, but where are the assumptions/limitations for this particular river routing application, and where is the detail on the model training method, time periods, temporal resolution used here, etc.?
- Pg 10; line 7: what do you mean by “hindcasts”; hindcast is used in the forecast literature to mean running forecasts for past dates. However, I do not believe you are running forecasts here.
- Pg 16; line 6: In the performance of models for different geographical regions in the world, it’s important to mention that the LSM within ERA5/Land is forced with precipitation from the Numerical Weather Prediction (NWP) model used within ERA5. NWP models fundamentally struggle to capture precipitation in the tropics, and this will impact the results here. See for example Lavers et al. (2022).
- Table A1: Key details missing. Which dataset does each variable come from? What time period and temporal resolution are used?
Minor points
- Pg2, para 1, line 5: please change “regions is” to “regions are”.
- Pg2, para2, line 5: suggest changing “forming the streamflow” to “forming streamflow”.
- Pg 3, bullet iv: Change “run operationally by the European Copernicus program” to something like, “GloFAS, the European Union Copernicus Emergency Management Service (CEMS) global flood forecasting system run operationally at the European Centre for Medium-Range Weather Forecasts (ECMWF)”.
- Pg8, line 2: missing the description of “Xs”
- Figure 3: should be ?
- Pg 9, line 7-9 (I think, there are no line numbers included!): An NSE > 0 or KGE > ~-0.41 is not “good”. The interpretation is that the model is performing better than a mean flow benchmark. This is what NSE=0 or KGE=~-0.41 means. It shows your model provides some level of skill beyond a very naïve mean flow benchmark. This should not be confused as “good”. But great that you use the ~-0.41 threshold for the KGE, I agree with this!
- Pg 14, Sect. 3.3, line 1: please change “present some simulated” to “present simulated”.
- Pg 18; last line: please change “doesn’t” to “does not”.
- Pg 19; line 2: You say routing in the LSM component in climate models – but it’s much wider than that. NWP models have an LSM, so this work is also relevant for short, medium and longer range hydrological forecasting using runoff from land surface models within weather models. It has a much wider impact than just the climate models!
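The point above about NSE = 0 and KGE ≈ -0.41 can be checked numerically. The sketch below assumes the standard Nash-Sutcliffe definition and the original three-component Kling-Gupta form (correlation, variability ratio, bias ratio); the manuscript's exact metric variants are not stated here, so this is illustrative only. The observation values are hypothetical, and treating the correlation as 0 for a constant simulation is a convention adopted for this example.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / (variance of obs about its mean)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / ss_obs


def kge(obs, sim):
    """Kling-Gupta efficiency (original three-component form, assumed here)."""
    n = len(obs)
    mu_o = sum(obs) / n
    mu_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    # Assumption: correlation is set to 0 when a series has zero variance,
    # which is the case for a constant (mean-flow) simulation.
    r = cov / (sd_o * sd_s) if sd_o > 0 and sd_s > 0 else 0.0
    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed discharge values, for illustration only.
obs = [1.0, 3.0, 2.0, 6.0, 4.0, 2.0]
# Naive benchmark: predict the observed mean flow at every time step.
mean_flow = [sum(obs) / len(obs)] * len(obs)

print("NSE of mean-flow benchmark:", nse(obs, mean_flow))
print("KGE of mean-flow benchmark:", kge(obs, mean_flow))
print("1 - sqrt(2):", 1 - math.sqrt(2))
```

Under these assumptions the mean-flow benchmark scores NSE = 0 and KGE = 1 - √2, so values just above those thresholds indicate skill beyond the benchmark rather than a good simulation.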
References
Boussetta, S., Balsamo, G., Arduini, G., Dutra, E., McNorton, J., Choulga, M., Agustí-Panareda, A., Beljaars, A., Wedi, N., Muñoz-Sabater, J., de Rosnay, P., Sandu, I., Hadade, I., Carver, G., Mazzetti, C., Prudhomme, C., Yamazaki, D., and Zsoter, E.: ECLand: The ECMWF Land Surface Modelling System, Atmosphere, 12, 723, https://doi.org/10.3390/atmos12060723, 2021.
Lavers, D. A., Simmons, A., Vamborg, F., and Rodwell, M. J.: An evaluation of ERA5 precipitation for climate monitoring, Quarterly Journal of the Royal Meteorological Society, 148, 3152–3165, https://doi.org/10.1002/qj.4351, 2022.
Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., Boussetta, S., Choulga, M., Harrigan, S., Hersbach, H., Martens, B., Miralles, D. G., Piles, M., Rodríguez-Fernández, N. J., Zsoter, E., Buontempo, C., and Thépaut, J.-N.: ERA5-Land: a state-of-the-art global reanalysis dataset for land applications, Earth System Science Data, 13, 4349–4383, https://doi.org/10.5194/essd-13-4349-2021, 2021.
Nearing, G., Cohen, D., Dube, V., Gauch, M., Gilon, O., Harrigan, S., Hassidim, A., Klotz, D., Kratzert, F., Metzger, A., Nevo, S., Pappenberger, F., Prudhomme, C., Shalev, G., Shenzis, S., Tekalign, T. Y., Weitzner, D., and Matias, Y.: Global prediction of extreme floods in ungauged watersheds, Nature, 627, 559–563, https://doi.org/10.1038/s41586-024-07145-1, 2024.
Citation: https://doi.org/10.5194/egusphere-2024-1206-RC2
AC2: 'Reply on RC2', Mauricio Lima, 22 Oct 2024