https://doi.org/10.5194/egusphere-2026-3267
19 Jun 2026
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Exploring the generalisation ability and interpretability of Long Short-Term Memory (LSTM) networks for large-sample groundwater level predictions

Qidong Fang, Mostaquimur Rahman, Thorsten Wagener, and Francesca Pianosi

Abstract. Deep Learning (DL) models, particularly Long Short-Term Memory (LSTM) networks, have shown similar or even superior performance to process-based models in estimating streamflow, particularly at ungauged locations. However, their ability to extrapolate groundwater levels across time and space is less well understood, as relatively few studies have addressed this issue so far. Here, we exploit the unique availability of a large-sample dataset of groundwater level observations across England to help fill this gap. We configured two LSTM model variants: one using static environmental attributes (LSTM_ENV) and one using random integers as unique identifiers of places (LSTM_RND). Both models were trained using data from 636 stations over the period 1971-2014 and tested over 2015-2019 at both the training stations (in-sample test) and at 341 unseen stations (out-of-sample test). Our results indicate that the two configurations achieved comparable performance in the in-sample test, but that their performance diverged significantly at unseen stations. To put the LSTM models’ performance into context, we also compared them to the performance of a process-based surface-groundwater model at 124 unseen stations. We found that both models effectively capture temporal fluctuations but struggle to accurately reproduce the mean and variability of the water table depth. This systematic bias frequently resulted in negative Nash-Sutcliffe Efficiency (NSE) values despite high temporal correlation, suggesting that evaluating LSTM performance using NSE alone can be misleading. We also found that the LSTM_ENV model performs better at stations characterised by higher specific yield and transmissivity, and that it mostly uses meteorological input features (e.g. precipitation) and topographic features (e.g. elevation and height above nearest drainage) to make predictions at unseen stations.
These findings highlight the potential of LSTMs for regional groundwater level predictions and the value of interpretability tools for understanding how such models achieve their performance and whether the environmental features used are informative.
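The remark that NSE can be negative despite high temporal correlation can be illustrated with a small synthetic example. The sketch below is not taken from the authors' code: the series, the 2 m offset, the amplitude factor and the helper names `nse` and `pearson_r` are all assumptions chosen for illustration. It uses the standard definition NSE = 1 - sum((sim - obs)^2) / sum((obs - mean(obs))^2).

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def pearson_r(obs, sim):
    """Pearson correlation between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])

# Hypothetical monthly water table depth (m): 5 years of a seasonal cycle.
t = np.arange(60)
obs = 10.0 + np.sin(2 * np.pi * t / 12)

# Case 1 (assumed): timing is perfect, but the mean is off by 2 m.
sim_bias = obs + 2.0

# Case 2 (assumed): mean is correct, but the amplitude is three times too large.
sim_var = 10.0 + 3.0 * np.sin(2 * np.pi * t / 12)

r_bias, nse_bias = pearson_r(obs, sim_bias), nse(obs, sim_bias)
r_var, nse_var = pearson_r(obs, sim_var), nse(obs, sim_var)

print(f"mean bias:      r = {r_bias:.2f}, NSE = {nse_bias:.2f}")
print(f"amplitude bias: r = {r_var:.2f}, NSE = {nse_var:.2f}")
```

In both synthetic cases the correlation is perfect while NSE is well below zero, because NSE penalises errors in the mean and in the variability that correlation ignores. Reporting correlation, mean bias and variability error alongside NSE separates these effects.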

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 31 Jul 2026)


Model code and software

LSTM_groundwater_modelling Qidong Fang https://github.com/QidongFang1203/LSTM_groundwater_modelling

Latest update: 19 Jun 2026
Short summary
It is unclear whether deep learning models can predict groundwater levels at places without measurements using attributes of those places. Our deep learning model captured temporal variation well, especially in more responsive aquifers, similarly to a process-based model. Interpretation tools showed that meteorological and environmental information about each place helped predictions at unseen wells. We highlight the potential of deep learning models for regional groundwater level predictions.