Improving Streamflow Simulation through Machine Learning-Powered Data Integration and Its Implications for Forecasting in the Western U.S.

Yang, Yuan; Pan, Ming; Feng, Dapeng; Xiao, Mu; Dixon, Taylor; Hartman, Robert; Shen, Chaopeng; Song, Yalan; Sengupta, Agniv; Monache, Luca Delle; Ralph, F. Martin

doi:10.5194/egusphere-2025-1708

Preprints

https://doi.org/10.5194/egusphere-2025-1708

30 Apr 2025

Abstract. Accurate streamflow forecasts are crucial but remain challenging for the arid Western United States (U.S.). Recently, machine learning methods such as long short-term memory (LSTM) have exhibited high accuracy in streamflow simulation and strong abilities to integrate observations to enhance performance. This study evaluated an LSTM-based data integration approach that incorporates streamflow (Q) and snow water equivalent (SWE) observations to improve streamflow estimations across different lag times (1–10 days, 1–6 months) and timescales (daily and monthly) over hundreds of basins in the Western U.S. Integrating Q at the daily scale provided the greatest improvements, increasing the median Kling-Gupta Efficiency (KGE) of 646 basins from 0.80 to 0.96 when integrating 1-day lagged Q, and remaining at 0.89 even with a 10-day lag. Integrating Q at the monthly scale also enhanced streamflow estimations, though to a lesser extent than at the daily scale, with the median KGE rising from 0.80 to 0.86 when integrating 1-month lagged streamflow. The next most notable improvement resulted from integrating SWE at the monthly scale, where the median KGE improved to 0.86 when integrating 1-month lagged SWE. Furthermore, SWE integration showed greater benefits at the monthly scale in snow-dominated basins during snowmelt season, which was beneficial for spring-summer flow estimations. However, integrating SWE at the daily scale did not show improvements. These results highlight the potential of this LSTM-based data integration approach for both short-term and long-term streamflow forecasting due to its performance, automation and efficiency.
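For readers unfamiliar with the metric, the Kling-Gupta Efficiency (KGE) reported above can be sketched as follows. This is a minimal illustration that assumes the original 2009 formulation (linear correlation, variability ratio, and bias ratio); the abstract does not state which KGE variant the authors used, and the function name `kge` is hypothetical.

```python
import math


def kge(sim, obs):
    """Kling-Gupta Efficiency, 2009 form (assumed variant, not confirmed by the source).

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
    r     : Pearson correlation between simulated and observed flow
    alpha : ratio of standard deviations (sim / obs)
    beta  : ratio of means (sim / obs)
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have equal length of at least 2")

    n = len(obs)
    mu_s = sum(sim) / n
    mu_o = sum(obs) / n

    # Population (divide-by-n) moments; the choice cancels out in r and alpha.
    var_s = sum((s - mu_s) ** 2 for s in sim) / n
    var_o = sum((o - mu_o) ** 2 for o in obs) / n
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n

    sd_s = math.sqrt(var_s)
    sd_o = math.sqrt(var_o)
    if sd_s == 0 or sd_o == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean observations")

    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o
    beta = mu_s / mu_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect simulation gives KGE = 1. A simulation that is exactly twice the observations keeps r = 1 but has alpha = beta = 2, so its KGE is 1 − √2, which shows how bias and variability errors lower the score even when the timing is perfect.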

Received: 10 Apr 2025 – Discussion started: 30 Apr 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

21 Oct 2025

Improving streamflow simulation through machine learning-powered data integration and its potential for forecasting in the Western U.S.

Yuan Yang, Ming Pan, Dapeng Feng, Mu Xiao, Taylor Dixon, Robert Hartman, Chaopeng Shen, Yalan Song, Agniv Sengupta, Luca Delle Monache, and F. Martin Ralph

Hydrol. Earth Syst. Sci., 29, 5453–5476, https://doi.org/10.5194/hess-29-5453-2025, 2025

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-1708', Anonymous Referee #1, 02 Jun 2025

This article presents a robust LSTM-based data integration framework for improving streamflow simulation in the Western U.S. by integrating lagged streamflow and SWE observations across daily and monthly timescales. The paper is well-structured, the experiments are comprehensive, and the findings are practically significant. However, several aspects require further clarification and refinement. General comments are as follows:

1. Is there any reference or justification for the criteria used to select snow-dominated basins?

2. In the model input processing, the three types of inputs, which include forcings, attributes and lagged observations, have different dimensionalities. How are these inputs aligned in terms of dimensions before being fed into the LSTM model? Please clarify the specific preprocessing or embedding strategies used to ensure compatibility across these input types.

3. The meaning of Equation (2) is unclear. Does this formulation represent single-step or multi-step prediction? Are the input variables provided in a sliding window? When estimating streamflow at the current time step, are lagged forcings also included, or are only the current forcings used as inputs?

4. Why is the mean of six model simulations used for the model evaluation? How was the number six determined, and can this sample size ensure the representativeness and stability of the evaluation results? Please clarify the rationale.

5. Figure 3 shows that streamflow estimations in several basins exhibit very low or even zero KGE values under different models and temporal scales. Please discuss the possible reasons for such poor model performance in these specific basins.

6. Please provide a more detailed explanation of the statement: “The compaction of FHV was less pronounced than that of FLV, likely due to the shorter timescales of peak flows and their lower dependence on memory compared to low flows.”

7. For the monthly-scale experiments, the paper provides no analysis or discussion of the KGE spatial patterns over the Western U.S.; it only presents an evaluation for April to July. Please supplement the corresponding analysis.

8. The paper attributes the limited benefits from daily SWE integration to the prevalence of zero SWE values or potential data quality issues. However, it lacks an in-depth analysis of the error structure of the SWE dataset and its influence on model performance. It is recommended to supplement the current findings with additional analyses using higher-quality SWE datasets and to further investigate this hypothesis to provide stronger support for the explanation.

9. In the paragraph around line 285, it is generally expected that integrating lagged SWE data during the snowmelt seasons should bring certain benefits to snow-dominated regions. However, the paper reports that KGE improvements are minimal and RB performance is even worse when evaluated over all regions, which may lead to biased conclusions. It is recommended to conduct this analysis specifically for snow-dominated regions.

10. The streamflow simulations in the paper are conducted using observed forcings rather than predicted forcings. However, in an operational forecasting mode, predicted forcings are used. Therefore, when applying the proposed method in a forecasting mode, the claimed enhancements, such as improving daily streamflow forecasts up to 10 days in advance or monthly forecasts up to six months in advance, cannot be guaranteed.

11. In addition to the explanation provided around line 319, another possible reason for the observed phenomenon is that integrating lagged SWE performs poorly in rain-dominated regions, which may lower the overall performance when evaluated across all basins. It is recommended to compare the performance of integrating lagged Q and SWE specifically within snow-dominated regions, and also conduct a comparative analysis within rain-dominated regions.

Specific comments:

(1) Why is Δ|RV−1| used in Figure 8(c) instead of directly showing Δ|RV| values?

(2) Please clearly specify which months are defined as the accumulation season and which are defined as the snowmelt season.

(3) It is recommended to include representative case studies of individual basins in the results section, such as time series plots, rather than relying solely on statistical boxplots.

(4) The results throughout the paper are presented primarily through figures. It is recommended to include data tables to provide a more quantitative presentation of the results.

Citation: https://doi.org/10.5194/egusphere-2025-1708-RC1
- AC1: 'Reply on RC1', Yuan Yang, 04 Aug 2025
  
  Thank you for your comments. Please find our responses in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1708-AC1
RC2:
'Comment on egusphere-2025-1708', Anonymous Referee #2, 23 Jun 2025

Please see the attached PDF.

Citation: https://doi.org/10.5194/egusphere-2025-1708-RC2
- AC2: 'Reply on RC2', Yuan Yang, 04 Aug 2025
  
  Thank you for your comments. Please find our responses in the attached PDF file.
  
  Citation: https://doi.org/10.5194/egusphere-2025-1708-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (08 Aug 2025) by Xing Yuan

AR by Yuan Yang on behalf of the Authors (16 Aug 2025): Author's response, Author's tracked changes, Manuscript

ED: Referee Nomination & Report Request started (18 Aug 2025) by Xing Yuan

RR by Anonymous Referee #2 (26 Aug 2025)

RR by Anonymous Referee #1 (02 Sep 2025)

ED: Publish as is (04 Sep 2025) by Xing Yuan

AR by Yuan Yang on behalf of the Authors (09 Sep 2025)

Viewed

Total article views: 3,643 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
2,711	840	92	3,643	116	193

Views and downloads (calculated since 30 Apr 2025)

Month	HTML	PDF	XML	Total
Apr 2025	66	2	2	70
May 2025	334	102	16	452
Jun 2025	192	70	14	276
Jul 2025	130	64	4	198
Aug 2025	350	90	12	452
Sep 2025	1,174	100	8	1,282
Oct 2025	86	66	2	154
Nov 2025	90	96	6	192
Dec 2025	50	48	4	102
Jan 2026	66	58	10	134
Feb 2026	40	40	2	82
Mar 2026	70	72	6	148
Apr 2026	31	14	0	45
May 2026	16	9	3	28
Jun 2026	12	5	1	18
Jul 2026	4	4	2	10

Viewed (geographical distribution)

Total article views: 3,640 (including HTML, PDF, and XML), thereof 3,640 with geography defined and 0 with unknown origin.

Latest update: 20 Jul 2026

Short summary

We explore a machine learning-based data integration method that integrates streamflow (Q) and snow water equivalent (SWE) to improve streamflow estimates at various lag times (1–10 days, 1–6 months) and timescales (daily and monthly) over Western U.S. basins. Benefits rank as: integrating Q at the daily scale > Q at the monthly scale > SWE at the monthly scale > SWE at the daily scale. Results highlight the method’s potential for short- and long-term streamflow forecasting in the Western U.S.
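The idea of feeding lagged observations to a sequence model can be illustrated with a small sketch. It assumes one common convention in LSTM-based hydrology, not confirmed by this page as the authors' method: static basin attributes are repeated at every time step, and the observation from `lag` steps earlier is appended as an extra input feature, with a fill value where it is unavailable. The function name `build_inputs`, its parameters, and the fill strategy are all hypothetical.

```python
def build_inputs(forcings, attributes, obs, lag, fill=0.0):
    """Assemble per-time-step input vectors with one lagged observation.

    forcings   : list of per-step lists of meteorological forcings
    attributes : list of static basin attributes (repeated at every step)
    obs        : list of observations (e.g. Q or SWE); None marks a gap
    lag        : how many steps back the integrated observation is taken
    fill       : value used when the lagged observation is missing (assumption)
    """
    if lag < 1:
        raise ValueError("lag must be at least 1 step")
    if len(forcings) != len(obs):
        raise ValueError("forcings and obs must cover the same time steps")

    rows = []
    for t, step_forcings in enumerate(forcings):
        # The observation at step t - lag is the newest one available at step t.
        lagged = obs[t - lag] if t >= lag else None
        if lagged is None:
            lagged = fill
        rows.append(list(step_forcings) + list(attributes) + [lagged])
    return rows
```

With `lag=1` on daily data, each step sees yesterday's observation; with monthly data and `lag=1`, it sees last month's. Every row has the same length, so the result can be stacked into the (time, feature) array that a sequence model expects.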

