https://doi.org/10.5194/egusphere-2026-2909
16 Jun 2026
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Do reservoir-influenced gauges need explicit consideration in machine learning models? A case study with Hydra-LSTM

Karan Ruparell, Dai Yamazaki, Kieran Hunt, Hannah Cloke, Christel Prudhomme, Florian Pappenberger, and Matthew Chantry

Abstract. Reservoirs fundamentally alter downstream river flow regimes, decoupling discharge from natural meteorological forcing and challenging standard hydrological prediction. While data-driven models, such as Long Short-Term Memory (LSTM) networks, show promise in regulated catchments, it remains unclear how the composition of training data across natural and regulated rivers influences model generalisability and behaviour. In this study, we investigate how the presence or absence of reservoir-influenced catchments in training data affects model performance across different flow regimes and alters the physical drivers the models learn to rely on. Using carefully matched subsets of the CAMELS-GB dataset, we trained separate specialist LSTMs (reservoir and non-reservoir), a pooled Full LSTM, and a multi-headed Hydra-LSTM to investigate whether explicit architectural specialisation offers any advantage over pooled training alone. Models were evaluated on held-out test gauges using standard performance metrics and gradient importance analysis to interpret feature reliance. Our results demonstrate that exposure to reservoir-influenced catchments during training is essential. Models trained exclusively on natural catchments consistently overestimate the mean and variance of regulated flows. Conversely, training exclusively on reservoir-influenced data degrades performance on non-reservoir-influenced rivers (KGE reduction of ≥ 0.1) and places importance primarily on anthropogenic static features, such as abstraction rates, at the expense of precipitation drivers. A single Full LSTM trained on combined data matched the performance of both specialist models in their respective domains, implicitly switching its feature reliance between regimes.
The Hydra-LSTM performed comparably to the Full LSTM throughout, indicating that the shared body may act as a regulariser limiting over-specialisation, but that explicit architectural specialisation provides no further benefit under these conditions. We conclude that pooling training data across regimes is a highly effective strategy for general-purpose modelling. However, case studies highlight a fundamental limitation: purely meteorological inputs remain insufficient for predicting flows in heavily managed single-purpose reservoirs, where unobserved human operational decisions dominate the hydrograph.
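For readers unfamiliar with the KGE (Kling-Gupta efficiency) quoted above, the following is a minimal sketch of how the score and its three components can be computed. It assumes the original 2009 formulation; the abstract does not state which KGE variant the authors used, and the function name `kge` and the toy flow values are purely illustrative, not taken from the study.

```python
import math
from statistics import fmean, pstdev


def kge(sim, obs):
    """Return (KGE, r, alpha, beta) for simulated vs observed flows.

    Assumes the 2009 formulation:
        KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
    where r is the linear correlation, alpha the ratio of standard
    deviations (sim / obs) and beta the ratio of means (sim / obs).
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have equal length >= 2")

    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    if mu_o == 0 or sd_o == 0 or sd_s == 0:
        raise ValueError("KGE is undefined for zero mean or zero variance")

    # Population covariance, consistent with pstdev above.
    cov = fmean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)
    alpha = sd_s / sd_o  # > 1 means simulated variability is too high
    beta = mu_s / mu_o   # > 1 means simulated mean flow is too high

    score = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return score, r, alpha, beta


# Illustrative toy series (not data from the study): a simulation with
# perfect timing that doubles every observed flow.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0 * q for q in observed]
score, r, alpha, beta = kge(simulated, observed)
print(f"KGE={score:.3f} r={r:.3f} alpha={alpha:.3f} beta={beta:.3f}")
```

In this toy case the correlation is perfect but both alpha and beta are about 2, so the score falls to roughly 1 − √2. Inspecting alpha and beta separately is one way to see the kind of mean and variance overestimation the abstract describes for models trained only on natural catchments.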

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 28 Jul 2026)

Latest update: 16 Jun 2026
Short summary
Reservoirs change how rivers behave, making them harder to predict. We tested whether machine learning models can learn these effects by training on river sites with and without upstream reservoirs across Great Britain. Models without reservoir training overestimated regulated flows, while reservoir-only models performed poorly on natural rivers. Training on both performed well everywhere. All models failed at heavily managed reservoirs where human decisions dominate river flow.