https://doi.org/10.5194/egusphere-2025-4048
03 Dec 2025
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Strategies for Incorporating Static Features into Global Deep Learning Models

Tanja Liesch and Marc Ohmer

Abstract. Global deep learning (DL) models are increasingly used in hydrology and hydrogeology to model time series data across multiple sites simultaneously. To account for site-specific behavior, static input features are commonly included in these models. Although the method of integration of static features into model architectures can influence performance, this aspect is seldom systematically evaluated. In this study, we systematically compare four strategies for incorporating static features into a global DL model for groundwater level prediction, including approaches commonly used in water science (repetition, concatenation) and two adopted from related disciplines (attention, conditional initialization). The models are evaluated using a large-scale groundwater dataset from Germany, tested under both in-sample (temporal generalization) and out-of-sample (spatiotemporal generalization) settings, and with both environmental and time-series-derived static features.

Our results show that all integration methods perform rather similarly in terms of average metrics, though their performance varies across wells and settings. The repetition approach achieves slightly better overall performance but is computationally inefficient due to the redundant replication of static features. Therefore, it may be worthwhile to explore alternative integration strategies that can offer comparable results at lower computational cost. Importantly, the choice of integration method is less critical than the quality of the static features themselves. These findings underscore the importance of careful feature selection and provide practical guidance for the design of global deep learning models in hydrologic applications.
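To make the difference between the integration strategies concrete, the following is a minimal NumPy sketch of three of the four approaches named in the abstract (repetition, concatenation, conditional initialization). It is not the authors' implementation: the tiny tanh recurrent encoder, the random weights, the helper name `rnn_encode`, and all dimensions are assumptions chosen purely for illustration. The attention variant is omitted because the abstract does not specify its form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper):
# T time steps, D dynamic features, S static features, H hidden units.
T, D, S, H = 52, 3, 5, 8

def rnn_encode(x, w_in, w_rec, h0=None):
    """Toy tanh RNN standing in for the sequence encoder; returns the last hidden state."""
    h = np.zeros(w_rec.shape[0]) if h0 is None else h0
    for x_t in x:
        h = np.tanh(x_t @ w_in + h @ w_rec)
    return h

dynamic = rng.normal(size=(T, D))  # time-varying forcings for one well
static = rng.normal(size=S)        # time-invariant site features for the same well

# 1) Repetition: copy the static vector onto every time step of the input sequence.
repeated = np.concatenate([dynamic, np.tile(static, (T, 1))], axis=1)  # shape (T, D + S)
h_rep = rnn_encode(repeated, rng.normal(size=(D + S, H)), 0.1 * rng.normal(size=(H, H)))

# 2) Concatenation: encode only the dynamic inputs, then append the static
#    vector once to the encoded representation before the output head.
w_in = rng.normal(size=(D, H))
w_rec = 0.1 * rng.normal(size=(H, H))
h_dyn = rnn_encode(dynamic, w_in, w_rec)
head_input = np.concatenate([h_dyn, static])  # shape (H + S,)

# 3) Conditional initialization: map the static vector to the initial hidden
#    state, so the encoder starts from a site-specific state.
w_init = rng.normal(size=(S, H))
h_cond = rnn_encode(dynamic, w_in, w_rec, h0=np.tanh(static @ w_init))

# Repetition stores the static values T times; the other two store them once.
print("values fed with repetition:", repeated.size)
print("values fed otherwise:", dynamic.size + static.size)
```

In this sketch the redundancy of repetition shows up directly in the input size: the static vector is stored once per time step, whereas concatenation and conditional initialization each use it a single time per sample.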

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 14 Jan 2026)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CC1: 'Comment on egusphere-2025-4048', Willem Zaadnoordijk, 04 Dec 2025

Data sets

Groundwater level time series, meteorological forcings and static feature dataset for 667 wells in Germany. Tanja Liesch and Marc Ohmer. https://zenodo.org/records/16601180

Model code and software

GitHub Repository for "Strategies for Incorporating Static Features into Global Deep Learning Models". Tanja Liesch. https://github.com/KITHydrogeology/dynamic_static


Latest update: 04 Dec 2025
Short summary
We studied how to add site information to deep learning models that predict groundwater levels at many wells at once. Using data from Germany, we compared four simple ways to combine time-varying weather with time-invariant site characteristics. All methods gave similar average accuracy. Repeating site data at each time step was slightly best but used more computer power. The quality of site information mattered more than the method, a finding that can guide future model design.