Better data or better architecture? Improving deep-learning-based prediction in ungauged basins

Heudorfer, Benedikt; Gupta, Hoshin; Dolich, Alexander; Loritz, Ralf

doi:10.5194/egusphere-2026-1965

Preprints

https://doi.org/10.5194/egusphere-2026-1965

20 Apr 2026

Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Abstract. Large-sample hydrology has recently been driven by two key developments: first, the introduction of hydrological benchmark datasets such as CAMELS-US and CARAVAN, and second, the emergence of deep-learning modelling frameworks, particularly LSTM-based regional models, which have demonstrated performance on par with, and in some cases exceeding, that of process-based models for streamflow prediction in gauged and ungauged settings. Building on these developments, we investigate whether (i) further enhanced LSTM architectures, (ii) new sets of static features, or (iii) a combination of both enable us to significantly improve Predictions in Ungauged Basins (PUB). We evaluate a state-of-the-art regional LSTM model (base LSTM) against embedded (EMB-LSTM) and cross-attention enhanced (CA-LSTM) variants, in combination with a suite of newly applied static features, namely MODIS surface reflectance bands, ALPHAEARTH embeddings, DEM-, meteorology- and catchment coordinate-derived auxiliary aggregates, and conventional CAMELS attributes. We tested these model-and-data combinations in pseudo-ungauged 5-fold cross-validation across the 531 CAMELS-US catchments. Model performance was quantified by the Nash-Sutcliffe Efficiency (NSE), while latent-space complexity was assessed via the Shannon effective rank (erank). Results show that the quality of static features is more important than architectural improvements. ALPHAEARTH embeddings attained the highest median NSE, but only in combination with auxiliary static feature data (ALPHAEARTH_plus). Architectural refinements yielded only modest improvements. Among the architectures, the relatively simple EMB-LSTM, which allowed the LSTM layer to better ingest ALPHAEARTH_plus static features, outperformed the others. With this combination, we achieved a median performance of NSE = 0.726, significantly improving the state-of-the-art PUB performance (NSE = 0.69) for the CAMELS-US dataset. Auxiliary analysis indicates that further improvement is possible when adding MODIS bands as additional dynamic features to the model. In conclusion, our study indicates that, broadly speaking, (a) better data is more important than better architecture, (b) better architecture is necessary only to accommodate better data, (c) the single-layer LSTM remains the most suitable core model as of now, and (d) the Shannon effective rank complexity of the latent space is a useful diagnostic for linking improved PUB performance to improved quality of latent hydrological representation inside the model. Overall, this highlights the need for improved measurement-derived descriptor datasets, especially for soil and geology.
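The evaluation setup named in the abstract (NSE, Shannon effective rank, and basin-wise folds for pseudo-ungauged cross-validation) can be sketched as follows. This is a minimal illustration and not the authors' code. The function names `nse`, `effective_rank`, and `basin_folds` are hypothetical. The NSE follows its standard definition. The effective rank is taken here as the exponential of the Shannon entropy of the L1-normalised singular values; whether the study centres or otherwise preprocesses the latent matrix first is not stated in the abstract, so the sketch uses the matrix as given. The random, seeded fold assignment is likewise an assumption.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations (NaNs skipped)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    mask = ~np.isnan(obs) & ~np.isnan(sim)
    obs, sim = obs[mask], sim[mask]
    denom = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - np.sum((sim - obs) ** 2) / denom)


def effective_rank(latent, eps=1e-12):
    """Shannon effective rank: exp(entropy of normalised singular values).

    `latent` is a 2-D array (samples x latent dimensions). Assumption: no
    centring or scaling is applied before the SVD.
    """
    s = np.linalg.svd(np.asarray(latent, dtype=float), compute_uv=False)
    p = s / (s.sum() + eps)   # singular values as a probability-like vector
    p = p[p > 0]              # drop exact zeros so log() is defined
    return float(np.exp(-np.sum(p * np.log(p))))


def basin_folds(basin_ids, k=5, seed=0):
    """Split basins (not time steps) into k disjoint folds.

    Each fold is held out once, so every basin is predicted by a model
    that never saw it during training (pseudo-ungauged setting).
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(basin_ids))
    return [fold.tolist() for fold in np.array_split(shuffled, k)]


if __name__ == "__main__":
    folds = basin_folds(range(531), k=5)
    print([len(f) for f in folds])            # fold sizes for 531 basins
    print(nse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # mean predictor scores 0
    print(effective_rank(np.eye(4)))          # four equal directions
```

Splitting by basin rather than by time is what makes the held-out catchments behave as ungauged: a median NSE over basins would then be computed from the per-basin scores of whichever fold held each basin out.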

Received: 07 Apr 2026 – Discussion started: 20 Apr 2026

Competing interests: At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 26 Jun 2026)

CC1: 'Comment on egusphere-2026-1965', John Ding, 25 May 2026

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1965/egusphere-2026-1965-CC1-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2026-1965-CC1

Viewed

Total article views: 708 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
465	220	23	708	14	19

Views and downloads (calculated since 20 Apr 2026)

Month	HTML	PDF	XML	Total
Apr 2026	304	160	15	479
May 2026	161	60	8	229

Viewed (geographical distribution)

Total article views: 708 (including HTML, PDF, and XML) Thereof 708 with geography defined and 0 with unknown origin.

Latest update: 31 May 2026

Short summary

For most rivers, water level is not measured, which makes flood prediction difficult, although certain models can still provide predictions. We test whether better models or better data improve such predictions in the United States. Results show that better data (measured by satellites) improves predictions more than better model designs, and that simple models often work best. We also show that better measurement of landscape information is needed.

