Preprints
https://doi.org/10.5194/egusphere-2025-5201
07 Nov 2025
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Hybrid models generalize better to warmer climate conditions than process-based and purely data-driven models

Jan P. Bohl, Raul R. Wood, Corinna Frank, Paul C. Astagneau, Jonas Peters, and Manuela I. Brunner

Abstract. Deep-learning-based rainfall-runoff models, in particular long short-term memory networks (LSTMs), have been shown to outperform traditional hydrological models at various tasks, both when used as purely data-driven models and when combined with process-based models in a hybrid setting. These tasks include predictions in ungauged basins (PUB) and regions (PUR), which have traditionally been challenging for conceptual hydrological models. While the spatial generalizability of deep-learning-based models has received considerable attention, it is less clear how they generalize to unseen and warmer climate conditions, i.e. how suitable these models are for hydrological climate impact studies. To address this research gap, we assess the ability of three types of models, (1) fully data-driven (LSTM), (2) conceptual (Hydrologiska Byråns Vattenbalansavdelning (HBV)), and (3) hybrid (LSTM-HBV), to simulate streamflow under conditions warmer than those used to train the models by running a differential split sample test. That is, we trained the models using data from the historical period 1960–1990 and evaluated them on data from both this period and the warmer period 2000–2023. We find that LSTMs, while being the most accurate during the 1960–1990 period, have inferior generalizability to the warm period compared to the hybrid and conceptual models. In addition, we show that when generalizing to the warm period, hybrid models have similar accuracy to LSTMs, regardless of whether the entire streamflow distribution or extreme events such as floods and droughts are considered. However, for snow-dominated catchments, all models suffer from similar reductions in accuracy when simulating streamflow under unseen climate conditions and the LSTM is the most accurate model for all periods. A detailed look at the snowmelt simulations of the hybrid and conceptual models suggests that better process representation might be needed to accurately capture the dynamics of snowmelt and snow accumulation processes, which are highly sensitive to changes in temperature. We conclude that the hybrid models effectively combine the high accuracy of LSTMs when predicting in ungauged basins with the good generalizability of conceptual hydrological models under changes in climate. This makes them a suitable choice for hydrological climate change impact assessments, particularly in ungauged basins.
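
The differential split sample test described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the record layout (date, forcing, observed streamflow), and the `simulate` callable standing in for an already trained model are all hypothetical. The Nash-Sutcliffe efficiency (NSE) is used here only as an assumed example metric, since the abstract does not name the scores used.

```python
from datetime import date


def nse(obs, sim):
    """Nash-Sutcliffe efficiency (assumed metric): 1 is a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / variance


def select_period(records, start_year, end_year):
    """Keep records whose date falls within [start_year, end_year]."""
    return [r for r in records if start_year <= r[0].year <= end_year]


def differential_split_sample(records, simulate,
                              train=(1960, 1990), test=(2000, 2023)):
    """Score a trained model on the training period and on a warmer,
    unseen period, and report the loss in skill between the two.

    records:  list of (date, forcing, observed_streamflow) tuples
    simulate: hypothetical stand-in for a trained model, forcing -> flow
    """
    scores = {}
    for name, (start, end) in {"train": train, "test": test}.items():
        subset = select_period(records, start, end)
        obs = [q for _, _, q in subset]
        sim = [simulate(forcing) for _, forcing, _ in subset]
        scores[name] = nse(obs, sim)
    # A larger drop indicates poorer generalization to the warmer period.
    scores["drop"] = scores["train"] - scores["test"]
    return scores
```

In this sketch, a model that generalizes well shows a small `drop`, while a model that is accurate only under the training climate shows a high `train` score and a large `drop`.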

Competing interests: At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 19 Dec 2025)


Data sets

E-OBS Daily Gridded Meteorological Data for Europe from 1950 to Present Derived from in-Situ Observations Copernicus Climate Change Service, Climate Data Store https://doi.org/10.24381/cds.151d3ec6

SPASS - new gridded climatological snow datasets for Switzerland C. Marty et al. https://www.doi.org/10.16904/envidat.580

SNOWGRID Klima v2.1 GeoSphere Austria https://doi.org/10.60669/fsxx-6977

Model code and software

Caravan - A global community dataset for large-sample hydrology F. Kratzert https://github.com/kratzert/Caravan/

Analyzing the generalization capabilities of hybrid hydrological models for extrapolation to extreme events E. Acuna Espinoza https://doi.org/10.5281/zenodo.14191623

Latest update: 07 Nov 2025
Short summary
To assess climate impacts on streamflow, we need models that can predict streamflow under future conditions. This study compares three model types: data-driven (LSTM), conceptual (HBV), and hybrid (LSTM-HBV). LSTMs perform best overall, but HBV and hybrid models generalize better to warmer climates. Hybrid models are a promising tool for climate impact assessments, combining the accuracy of LSTMs with the better generalizability of traditional models. In snowy regions, all models struggle to generalize.