Strategies for Incorporating Static Features into Global Deep Learning Models
Abstract. Global deep learning (DL) models are increasingly used in hydrology and hydrogeology to model time series data across multiple sites simultaneously. To account for site-specific behavior, static input features are commonly included in these models. Although the method of integration of static features into model architectures can influence performance, this aspect is seldom systematically evaluated. In this study, we systematically compare four strategies for incorporating static features into a global DL model for groundwater level prediction, including approaches commonly used in water science (repetition, concatenation) and two adopted from related disciplines (attention, conditional initialization). The models are evaluated using a large-scale groundwater dataset from Germany, tested under both in-sample (temporal generalization) and out-of-sample (spatiotemporal generalization) settings, and with both environmental and time-series-derived static features.
Our results show that all integration methods perform rather similarly in terms of average metrics, though their performance varies across wells and settings. The repetition approach achieves slightly better overall performance but is computationally inefficient due to the redundant replication of static features. Therefore, it may be worthwhile to explore alternative integration strategies that can offer comparable results with lower computational cost. Importantly, the choice of integration method is less critical than the quality of the static features themselves. These findings underscore the importance of careful feature selection and provide practical guidance for the design of global deep learning models in hydrologic applications.
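The redundancy attributed to the repetition approach can be illustrated with a minimal sketch. It is not the authors' implementation. The sequence length `T`, the feature counts `D` and `S`, and the helper names are hypothetical values chosen for illustration. It also assumes that the alternative strategies pass the static vector to the model only once per sequence.

```python
# Hypothetical sizes: T time steps, D dynamic features, S static features.
T, D, S = 52, 3, 10


def repeat_static(dynamic, static):
    """Repetition: append the same static vector to every time step."""
    return [step + static for step in dynamic]


def count_values(sequence):
    """Count the scalar values in a list of per-time-step feature lists."""
    return sum(len(step) for step in sequence)


dynamic = [[0.0] * D for _ in range(T)]  # placeholder dynamic inputs
static = [1.0] * S                       # placeholder static features

repeated = repeat_static(dynamic, static)

# Repetition feeds T * (D + S) values through the sequence model.
repetition_size = count_values(repeated)

# Passing the static vector once (assumed for the other strategies)
# needs only T * D + S values.
separate_size = count_values(dynamic) + len(static)

print(repetition_size, separate_size)
```

With these assumed sizes the repeated input holds roughly four times as many values as the dynamic sequence plus a single copy of the static vector, and the gap widens as the sequence gets longer or the static vector gets larger.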
I would suggest including the purpose of the models in the title, e.g. by adding "for groundwater levels in Germany".
Best wishes, Willem Zaadnoordijk