https://doi.org/10.5194/egusphere-2025-4244
30 Sep 2025
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Testing data assimilation strategies to enhance short-range AI-based discharge forecasts

Bob E. Saint-Fleur, Eric Gaume, Florian Surmont, Nicolas Akil, and Dominique Theriez

Abstract. Effective discharge forecasts are essential in operational hydrology. Their accuracy, particularly at short lead times, is generally improved by integrating recently measured discharges through data assimilation (DA) procedures. Recent studies have demonstrated the effectiveness of deep learning (DL) approaches for rainfall-runoff (RR) modeling, particularly Long Short-Term Memory (LSTM) networks, which outperform traditional approaches. However, most of these studies do not include DA procedures, which may limit their operational forecast performance. This study proposes and evaluates three DA strategies that incorporate discharge information from past observed discharges, from the forecast discharges of a pre-trained benchmark model (BM), or from both. The proposed strategies, based on a Multilayer Perceptron (MLP) orchestrator, include: (1) the integration of recent observed discharges, (2) the integration of both recent discharge observations and pre-trained BM forecasts, and (3) the post-processing of BM forecast errors. Experiments are conducted on the CAMELS-US dataset with two established benchmark models: the trained LSTM model from Kratzert et al. (2019) and the conceptual Sacramento Soil Moisture Accounting (SAC-SMA) model from Newman et al. (2017), covering both machine learning and conceptual RR simulation approaches. Lead times of 1, 3, and 7 days, covering short- and mid-term horizons, are considered. The approaches are evaluated in two forecast frameworks: (1) perfect meteorological forecasts over the forecasting lead time and (2) highly uncertain ensemble meteorological forecasts. The two frameworks yield contrasting outcomes. Under the perfect forecast framework, the application of DA leads to substantial improvements in forecast performance, although the magnitude of these gains depends on the initial performance of the BM and on the forecasting lead time. Improvements are consistently significant for the SAC-SMA cases, while for the LSTM cases, gains are observed mainly for basins where the LSTM initially underperforms. However, the ensemble forecast evaluation yields unexpected results: the performance ranking of the tested models changes markedly compared to the perfect forecast framework. The LSTM model, in particular, appears penalized by the unreliability of its forecast ensembles, specifically their under-dispersion, meaning that its predictions are insufficiently responsive to meteorological forcing over the forecast lead time. This finding underscores the importance of ensuring reliable ensemble dispersion for the efficient operational deployment of AI-based hydrological forecasts.
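
The three DA strategies listed in the abstract differ mainly in what is fed to the MLP and what it is asked to predict. The sketch below is only an illustration of that distinction, not the authors' implementation: the function names, the flat feature layout, and the clipping of corrected discharge at zero are all assumptions, and the preprint may also pass meteorological forcing or other inputs that are omitted here.

```python
def inputs_obs_only(q_obs_recent):
    """Strategy 1 (assumed layout): features are the recent observed discharges."""
    return [float(q) for q in q_obs_recent]


def inputs_obs_plus_bm(q_obs_recent, bm_forecast):
    """Strategy 2 (assumed layout): recent observations followed by BM forecasts."""
    return [float(q) for q in q_obs_recent] + [float(q) for q in bm_forecast]


def error_target(q_obs_future, bm_forecast):
    """Strategy 3 training target: the BM forecast error (observed minus forecast)."""
    return [float(o) - float(f) for o, f in zip(q_obs_future, bm_forecast)]


def corrected_forecast(bm_forecast, predicted_error):
    """Strategy 3 at forecast time: add the predicted error back to the BM forecast.

    Clipping at zero is an assumption made here to keep discharge non-negative.
    """
    return [max(float(f) + float(e), 0.0) for f, e in zip(bm_forecast, predicted_error)]
```

For example, `inputs_obs_plus_bm([1.0, 2.0, 3.0], [2.5])` gives a four-element feature vector, and `corrected_forecast([2.5], [0.5])` shifts the BM forecast by the error the MLP predicted.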

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 11 Nov 2025)


Data sets

Data (raw and processed) to "Testing data assimilation strategies to enhance short-range AI-based discharge forecasts" Bob E. Saint-Fleur and Eric Gaume https://doi.org/10.5281/zenodo.16944643

Model code and software

AI_Operational_HydroForecast Bob E. Saint-Fleur https://gitlab.univ-eiffel.fr/bob.saint-fleur/ai_operational_hydroforecast#

Interactive computing environment

AI_Operational_HydroForecast Bob E. Saint-Fleur https://gitlab.univ-eiffel.fr/bob.saint-fleur/ai_operational_hydroforecast#

Latest update: 02 Oct 2025
Short summary
This paper emphasizes the need to account for operational constraints when developing discharge forecast models. Using an open-access hydrology dataset (CAMELS-US), two established rainfall-runoff models (LSTM and SAC-SMA), and a multilayer perceptron for implementation, we evaluate the importance of data assimilation, persistence, and ensemble analysis under various scenarios. Results show that DA is crucial and that model performance can drop sharply from idealized to operational conditions.
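
One common way to check whether a forecast ensemble is under-dispersed, as the abstract reports for the LSTM, is to compare the average ensemble spread with the error of the ensemble mean. The sketch below shows that generic spread-to-error ratio; it is not taken from the preprint, which may rely on a different reliability diagnostic, and the function name and data layout are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def spread_error_ratio(ensembles, observations):
    """Ratio of mean ensemble spread to RMSE of the ensemble mean.

    ensembles: one list of member forecasts per forecast date (at least 2 members each).
    observations: the observed discharge for each forecast date.
    Values well below 1 suggest an under-dispersed ensemble.
    Assumes the ensemble mean is not a perfect forecast on every date (non-zero RMSE).
    """
    spread_sq = [variance(members) for members in ensembles]
    error_sq = [(mean(members) - obs) ** 2 for members, obs in zip(ensembles, observations)]
    return sqrt(mean(spread_sq)) / sqrt(mean(error_sq))
```

With tightly clustered members that all miss the observation, for instance `spread_error_ratio([[1.0, 1.1, 0.9], [2.0, 2.1, 1.9]], [2.0, 3.0])`, the ratio comes out far below 1, which is the signature of under-dispersion.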