BiasCast: Learning and adjusting real time biases from meteorological forecasts to enhance runoff predictions
Abstract. The use of deep learning models in hydrology is becoming an ever more prevalent application in operational flood forecasting. Such operational systems face performance degradation when transitioning from high quality reanalysis to meteorological forecast data with lower accuracy. This study investigates training strategies and Long Short-Term Memory network architectures to mitigate forecast-induced bias in maximum daily discharge predictions using the Extended LamaH-CE dataset and a subset of 451 basins. We systematically evaluated cross-domain generalization, transfer learning approaches, Encoder–Decoder LSTMs, Sequential Forecast LSTMs, and the role of input embeddings and integrating past discharge observations. The results show that domain shifts between reanalysis and forecast data lead to substantial skill loss, with median Nash–Sutcliffe Efficiency decreasing from 0.58 to 0.33. Among the tested strategies, the Sequential Forecast LSTM demonstrated the most stable improvements, achieving a median NSE of 0.63. Integrating recent discharge observations further enhanced performance, raising median NSE to 0.71 and surpassing even the reanalysis-driven baseline. In contrast, integrating archived forecasts or using more complex input embeddings did not yield consistent benefits and in some cases degraded model stability. These findings highlight the value of training strategies that allow models to directly learn bias correction during forecast transitions and emphasize the operational potential of combining sequential processing with near real-time discharge observations.
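For readers less familiar with the headline metric, the following is a minimal sketch of how a per-basin Nash–Sutcliffe Efficiency and its median across basins are commonly computed. It assumes the standard NSE definition (one minus the ratio of squared errors to the variance of the observations about their mean) and is not taken from the manuscript; the function names `nse` and `median_nse` and the toy discharge values are hypothetical and purely illustrative.

```python
from statistics import median


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency for one basin (standard definition, assumed here)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between simulation and observation.
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Spread of the observations about their own mean.
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - sse / spread


def median_nse(basins):
    """Median NSE over a mapping of basin id -> (observed, simulated)."""
    return median([nse(obs, sim) for obs, sim in basins.values()])


# Toy data only: three hypothetical basins sharing one observed series.
toy_basins = {
    "A": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]),  # perfect simulation
    "B": ([1.0, 2.0, 3.0, 4.0], [2.5, 2.5, 2.5, 2.5]),  # predicts the observed mean
    "C": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]),  # one peak overestimated
}
toy_median = median_nse(toy_basins)
```

Under this definition a simulation that always returns the observed mean scores zero, which is why median values such as those quoted in the abstract are read relative to that implicit mean-flow reference.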
This manuscript addresses the challenge of deploying machine-learning hydrological models in operational forecasting by explicitly considering the domain shift between reanalysis and forecast meteorological inputs. The authors explore alternative training strategies and LSTM architectures to improve 1-day streamflow forecasts. The results suggest that architectures combining a hindcast phase driven by reanalysis data with a forecast phase driven by forecast data provide the greatest performance gains. The study tackles an important problem, presents interesting results, and is well structured. Some additional analysis and clarifications would further strengthen the interpretation of the experiments and results.
General comments
Specific comments
Technical corrections