EGUsphere

Copernicus Publications

Göttingen, Germany

10.5194/egusphere-2026-1965

Better data or better architecture? Improving deep-learning-based prediction in ungauged basins

Benedikt Heudorfer ¹ (https://orcid.org/0000-0001-7801-9375)

Hoshin Gupta ²

Alexander Dolich ³ (https://orcid.org/0000-0003-4160-6765)

Ralf Loritz (https://orcid.org/0000-0002-0540-6478)

Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research – Atmospheric Trace Gases and Remote Sensing, Karlsruhe, Germany

Department of Hydrology and Atmospheric Sciences, The University of Arizona, Tucson, AZ, USA

Karlsruhe Institute of Technology (KIT), Institute for Water and Environment, Karlsruhe, Germany

20 04 2026

2026 1 30

2026

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

This article is available from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1965/

The full text article is available as a PDF file from https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1965/egusphere-2026-1965.pdf

Large-sample hydrology has recently been driven by two key developments: first, the introduction of hydrological benchmark datasets such as CAMELS-US and CARAVAN, and second, the emergence of deep‑learning modelling frameworks, particularly LSTM‑based regional models, which have demonstrated performance on par with, and in some cases exceeding, that of process-based models for streamflow prediction in gauged and ungauged settings. Building on these developments, we investigate whether (i) further enhanced LSTM architectures, (ii) new sets of static features, or (iii) a combination of both enable us to significantly improve Predictions in Ungauged Basins (PUB). We evaluate a state-of-the-art regional LSTM model (base LSTM) against embedded (EMB-LSTM) and cross‑attention enhanced (CA-LSTM) variants, in combination with a suite of newly applied static features, namely MODIS surface reflectance bands, ALPHAEARTH embeddings, DEM-, meteorology- and catchment coordinate-derived auxiliary aggregates, and conventional CAMELS attributes. We tested these model-and-data combinations in pseudo‑ungauged 5‑fold cross‑validation across the 531 CAMELS‑US catchments. Model performance was quantified by the Nash‑Sutcliffe Efficiency (NSE), while latent‑space complexity was assessed via the Shannon effective rank (erank). Results show that the quality of static features is more important than architectural improvements. ALPHAEARTH embeddings attained the highest median NSE, but only in combination with auxiliary static feature data (ALPHAEARTH<sub>plus</sub>). Architectural refinements yielded only modest improvements. Among the architectures, the relatively simple EMB-LSTM, which allowed the LSTM layer to better ingest the ALPHAEARTH<sub>plus</sub> static features, performed best. With this combination, we achieved a median performance of NSE = 0.726, significantly improving on the state-of-the-art PUB performance (NSE = 0.69) for the CAMELS-US dataset.
Auxiliary analysis indicates that further improvement is possible when MODIS bands are added to the model as additional dynamic features. In conclusion, our study indicates that, broadly speaking, (a) better data is more important than better architecture, (b) better architecture is necessary only to accommodate better data, (c) the single-layer LSTM remains the most suitable core model as of now, and (d) the Shannon effective rank of the latent space is a useful diagnostic for linking improved PUB performance to an improved latent hydrological representation inside the model. Overall, this highlights the need for improved measurement‑derived descriptor datasets, especially for soil and geology.
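The two diagnostics named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: NSE as one minus the ratio of squared model error to the variance of the observations, and the effective rank as the exponential of the Shannon entropy of the normalised singular values of a matrix. The function names, the NaN handling, and the choice of which latent matrix to analyse are assumptions made for illustration and are not taken from the study.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency of simulated vs. observed streamflow.

    Assumption: time steps where either series is NaN are dropped.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~np.isnan(obs) & ~np.isnan(sim)
    obs, sim = obs[valid], sim[valid]
    # 1 = perfect fit, 0 = no better than predicting the observed mean.
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def effective_rank(matrix):
    """Shannon effective rank: exp(entropy of normalised singular values).

    Assumption: `matrix` is a non-zero 2-D array, e.g. hypothetical latent
    states arranged as (samples, hidden units).
    """
    s = np.linalg.svd(np.asarray(matrix, dtype=float), compute_uv=False)
    p = s / s.sum()          # singular values as a probability distribution
    p = p[p > 0]             # 0 * log(0) is treated as 0
    return float(np.exp(-np.sum(p * np.log(p))))
```

Under these definitions, a matrix with k equally strong independent directions has an effective rank of k, while a matrix dominated by a single direction has an effective rank close to 1.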