the Creative Commons Attribution 4.0 License.
Improving Streamflow Simulation through Machine Learning-Powered Data Integration and Its Implications for Forecasting in the Western U.S.
Abstract. Accurate streamflow forecasts are crucial but remain challenging for the arid Western United States (U.S.). Recently, machine learning methods such as long short-term memory (LSTM) have exhibited high accuracy in streamflow simulation and strong abilities to integrate observations to enhance performance. This study evaluated an LSTM-based data integration approach that incorporates streamflow (Q) and snow water equivalent (SWE) observations to improve streamflow estimations across different lag times (1–10 days, 1–6 months) and timescales (daily and monthly) over hundreds of basins in the Western U.S. Integrating Q at the daily scale provided the greatest improvements, increasing the median Kling-Gupta Efficiency (KGE) of 646 basins from 0.80 to 0.96 when integrating 1-day lagged Q, and remaining at 0.89 even with a 10-day lag. Integrating Q at the monthly scale also enhanced streamflow estimations, though to a lesser extent than at the daily scale, with the median KGE rising from 0.80 to 0.86 when integrating 1-month lagged streamflow. The next most notable improvement resulted from integrating SWE at the monthly scale, where the median KGE improved to 0.86 when integrating 1-month lagged SWE. Furthermore, SWE integration showed greater benefits at the monthly scale in snow-dominated basins during snowmelt season, which was beneficial for spring-summer flow estimations. However, integrating SWE at the daily scale did not show improvements. These results highlight the potential of this LSTM-based data integration approach for both short-term and long-term streamflow forecasting due to its performance, automation and efficiency.
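The abstract reports skill as the Kling-Gupta Efficiency (KGE). For readers unfamiliar with the metric, the following is a minimal sketch of how KGE is commonly computed, assuming the original Gupta et al. (2009) formulation (correlation, variability ratio, and bias ratio). The abstract does not say which KGE variant the authors used, so treat this as illustrative; the function name `kge` is hypothetical.

```python
import math
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta Efficiency, assuming the Gupta et al. (2009) form.

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where
    r is the Pearson correlation between sim and obs, alpha is the
    ratio of standard deviations, and beta is the ratio of means.
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have equal length of at least 2")

    mu_sim, mu_obs = mean(sim), mean(obs)
    sd_sim, sd_obs = pstdev(sim), pstdev(obs)
    if sd_sim == 0 or sd_obs == 0 or mu_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean obs")

    # Population covariance, consistent with pstdev above.
    cov = sum((s - mu_sim) * (o - mu_obs) for s, o in zip(sim, obs)) / len(obs)

    r = cov / (sd_sim * sd_obs)   # timing / dynamics
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mu_sim / mu_obs        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


observed = [1.0, 2.0, 3.0, 4.0, 5.0]
doubled = [2.0 * q for q in observed]
print(kge(observed, observed))  # a perfect simulation scores 1
print(kge(doubled, observed))   # perfect timing, but variance and mean are doubled
```

A perfect simulation gives KGE = 1; in the second call the correlation is perfect but both the variability and bias ratios equal 2, so the score falls to 1 − √2.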
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-1708', Anonymous Referee #1, 02 Jun 2025
This article presents a robust LSTM-based data integration framework for improving streamflow simulation in the Western U.S., through integrating lagged streamflow and SWE observations across daily and monthly timescales. The paper is well-structured, the experiments are comprehensive, and the findings are practically significant. However, several aspects require further clarification and refinement. General comments are as follows:
1. Is there any reference or justification for the criteria used to select snow-dominated basins?
2. In the model input processing, the three types of inputs, which include forcings, attributes and lagged observations, have different dimensionalities. How are these inputs aligned in terms of dimensions before being fed into the LSTM model? Please clarify the specific preprocessing or embedding strategies used to ensure compatibility across these input types.
3. The meaning of Equation (2) is unclear. Does this formulation represent single-step or multi-step prediction? Are the input variables provided in a sliding window? When estimating streamflow at the current time step, are lagged forcings also included, or are only the current forcings used as inputs?
4. Why is the mean of six model simulations used for the model evaluation? How was the number six determined, and can this sample size ensure the representativeness and stability of the evaluation results? Please clarify the rationale.
5. Figure 3 shows that streamflow estimations in several basins exhibit very low or even zero KGE values under different models and temporal scales. Please discuss the possible reasons for such poor model performance in these specific basins.
6. Please provide a more detailed explanation of the statement: “The compaction of FHV was less pronounced than that of FLV, likely due to the shorter timescales of peak flows and their lower dependence on memory compared to low flows.”
7. For the monthly-scale experiments, the paper provides no analysis or discussion of the KGE spatial patterns over the Western U.S., apart from the evaluation for April to July. Please supplement the corresponding analysis.
8. The paper attributes the limited benefits from daily SWE integration to the prevalence of zero SWE values or potential data quality issues. However, it lacks an in-depth analysis of the error structure of the SWE dataset and its influence on model performance. It is recommended to supplement the current findings with additional analyses using higher-quality SWE datasets and to further investigate this hypothesis to provide stronger support for the explanation.
9. In the paragraph around line 285, it is generally expected that integrating lagged SWE data during the snowmelt seasons should bring certain benefits to snow-dominated regions. However, the paper reports that KGE improvements are minimal and RB performance is even worse when evaluated over all regions, which may lead to biased conclusions. It is recommended to conduct this analysis specifically for snow-dominated regions.
10. The streamflow simulations in the paper are conducted using observed forcings rather than predicted forcings. However, in an operational forecasting mode, predicted forcings are used. Therefore, when the proposed method is applied in forecasting mode, the claimed enhancements, such as improving daily streamflow forecasts up to 10 days in advance or monthly forecasts up to six months in advance, cannot be guaranteed.
11. In addition to the explanation provided around line 319, another possible reason for the observed phenomenon is that integrating lagged SWE performs poorly in rain-dominated regions, which may lower the overall performance when evaluated across all basins. It is recommended to compare the performance of integrating lagged Q and SWE specifically within snow-dominated regions, and also conduct a comparative analysis within rain-dominated regions.
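To make the questions in comments 2 and 3 concrete, the sketch below shows one common convention for aligning inputs of different dimensionality in LSTM streamflow models: static attributes are repeated at every time step, and the observation is shifted by the lag before being concatenated with the current forcings. This is an assumption for illustration only, not a description of the authors' preprocessing; the helper name `build_inputs` and the `fill` placeholder for the first `lag` steps are hypothetical.

```python
def build_inputs(forcings, attributes, obs, lag, fill=0.0):
    """Assemble one feature vector per time step (hypothetical helper).

    forcings:   list of per-step forcing vectors, shape [T][F]
    attributes: static basin attributes, shape [A], repeated at every step
    obs:        observed series (e.g. Q or SWE), shape [T]
    lag:        number of steps by which the observation is delayed
    fill:       placeholder used where no lagged observation exists yet
    """
    if len(forcings) != len(obs):
        raise ValueError("forcings and obs must cover the same time steps")
    if lag < 1:
        raise ValueError("lag must be at least 1 so no same-step observation leaks in")

    rows = []
    for t, forcing_t in enumerate(forcings):
        # Only observations at least `lag` steps old are visible at step t.
        lagged = obs[t - lag] if t >= lag else fill
        rows.append(list(forcing_t) + list(attributes) + [lagged])
    return rows


forcings = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # e.g. precipitation, temperature
attributes = [0.5]                                   # e.g. one static basin attribute
streamflow = [100.0, 200.0, 300.0]

for row in build_inputs(forcings, attributes, streamflow, lag=1):
    print(row)
```

In this layout each row has F + A + 1 features, so the sequence fed to the LSTM is a single [T][F + A + 1] array. Whether the manuscript uses current forcings only, a sliding window, or a learned embedding is exactly what the comments above ask the authors to state.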
Specific comments:
(1) Why is Δ|RV−1| used in Figure 8(c) instead of directly showing Δ|RV| values?
(2) Please clearly specify which months are defined as the accumulation season and which are defined as the snowmelt season.
(3) It is recommended to include representative case studies of individual basins in the results section, such as time series plots, rather than relying solely on statistical boxplots.
(4) The results throughout the paper are presented primarily through figures. It is recommended to include data tables to provide a more quantitative presentation of the results.
Citation: https://doi.org/10.5194/egusphere-2025-1708-RC1
- RC2: 'Comment on egusphere-2025-1708', Anonymous Referee #2, 23 Jun 2025