https://doi.org/10.2139/ssrn.6737296
23 Jun 2026
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Exploring uncertainties across the modelling chain in machine-learning-based streamflow forecasting

Luka Vinokić, Milan Dotlić, Ana Samac, Veljko Prodanović, Slobodan Kolaković, and Milan Stojković

Abstract. Operational streamflow forecasts underpin flood preparedness and reservoir operations, yet their utility is often constrained by predictive uncertainty that is poorly characterized and poorly attributed to its sources. In machine-learning-based forecasting, uncertainty is frequently omitted or reported as a single aggregate output, leaving it unclear which parts of the end-to-end forecasting chain drive overconfidence and forecast degradation, particularly with increasing lead time. In this work, we develop an end-to-end uncertainty decomposition framework for operational streamflow forecasting that attributes predictive uncertainty across meteorological forcing choice, feature design, model architecture, hyperparameter optimization, and training variability, evaluated across multi-day horizons. The decomposition reveals a systematic, horizon-dependent shift in dominant uncertainty sources, with forcing-related contributions increasing with lead time while model-structure and feature choices remain influential at shorter horizons. During high-flow events, predictive intervals remain essential because pipeline heterogeneity can bias the central estimate even when ensemble dispersion widens appropriately. Tuning contributes little to the uncertainty budget but strongly affects compute–skill trade-offs, with Bayesian optimization delivering the most favorable cost–benefit performance under the tested constraints. Together, these results provide actionable guidance for operational freshwater management, showing where investment yields the largest reliability gains: model design at short lead times and forcing quality at longer lead times. This guidance can reduce the risk of costly or unsafe decisions in flood preparedness, reservoir operation, and other critical decision-making contexts in water management.
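
The abstract does not state how the decomposition is computed. As a minimal sketch of one common approach, and not necessarily the authors' method, the Python below applies an ANOVA-style main-effects partition to a balanced, full-factorial ensemble of forecasts at a single lead time. The function name `main_effect_shares`, the factor names, the level labels, and all numbers are hypothetical and synthetic; none are taken from the paper.

```python
from itertools import product
from statistics import fmean, pvariance


def main_effect_shares(forecasts, factor_names):
    """Split ensemble variance into per-factor main effects.

    forecasts: dict mapping a tuple of factor levels (one entry per name in
        factor_names, same order) to a forecast value at one lead time.
        Assumes a balanced, full-factorial design (every combination present).
    Returns a dict of variance fractions, plus an "interactions" remainder.
    """
    total = pvariance(list(forecasts.values()))
    shares = {}
    for i, name in enumerate(factor_names):
        # Group forecasts by the level of factor i and average each group.
        groups = {}
        for key, value in forecasts.items():
            groups.setdefault(key[i], []).append(value)
        level_means = [fmean(group) for group in groups.values()]
        # Main effect: variance of the level means around the grand mean.
        shares[name] = pvariance(level_means) / total if total > 0 else 0.0
    # Whatever the main effects do not explain is attributed to interactions.
    shares["interactions"] = (1.0 - sum(shares.values())) if total > 0 else 0.0
    return shares


# Synthetic, purely additive toy ensemble (hypothetical effects in m^3/s).
forcing_effect = {"forcing_A": -3.0, "forcing_B": 3.0}
architecture_effect = {"arch_A": -1.0, "arch_B": 1.0}
seed_effect = {"seed_0": -0.5, "seed_1": 0.5}

toy = {
    (f, a, s): 100.0 + forcing_effect[f] + architecture_effect[a] + seed_effect[s]
    for f, a, s in product(forcing_effect, architecture_effect, seed_effect)
}

shares = main_effect_shares(toy, ["forcing", "architecture", "seed"])
for source, fraction in shares.items():
    print(f"{source}: {fraction:.3f}")
```

Repeating such a calculation for each lead time would give a horizon-dependent uncertainty budget of the kind the abstract describes. Because the toy data are additive, the interaction remainder is zero here; in a real ensemble it need not be.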


Status: open (until 04 Aug 2026)

Latest update: 23 Jun 2026
Short summary
Reliable streamflow forecasts require knowing not just what the model predicts, but how uncertain that prediction is and why. This study shows that the dominant source of uncertainty in machine-learning-based forecasts shifts with lead time: model design dominates at short horizons, while weather forecast quality takes over beyond day two. Ensemble mean forecasts can mislead during floods; predictive intervals remain reliable. Bayesian optimization offers the best accuracy-to-cost ratio among the tuning strategies tested.