Exploring the generalisation ability and interpretability of Long Short-Term Memory (LSTM) networks for large-sample groundwater level predictions

Fang, Qidong; Rahman, Mostaquimur; Wagener, Thorsten; Pianosi, Francesca

doi:10.5194/egusphere-2026-3267

Preprints

https://doi.org/10.5194/egusphere-2026-3267

19 Jun 2026

Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Qidong Fang, Mostaquimur Rahman, Thorsten Wagener, and Francesca Pianosi

Abstract. Deep Learning (DL) models, particularly Long Short-Term Memory (LSTM) networks, have shown similar or even superior performance to process-based models in estimating streamflow, particularly at ungauged locations. However, their ability to extrapolate groundwater levels across time and space is less well understood, as relatively few studies have addressed this issue so far. Here, we exploit the unique availability of a large-sample dataset of groundwater level observations across England to help fill this gap. We configured two LSTM model variants: one using static environmental attributes (LSTM_ENV) and one using random integers as unique identifiers of places (LSTM_RND). Both models were trained using data from 636 stations over the period 1971-2014 and tested over 2015-2019 at both the training stations (in-sample test) and 341 unseen stations (out-of-sample test). Our results indicate that the two configurations achieved comparable performance in the in-sample test, but that their performance diverged significantly at unseen stations. To put the LSTM models’ performance into context, we also compared it to that of a process-based surface-groundwater model at 124 unseen stations. We found that both models effectively capture temporal fluctuations but struggle to accurately reproduce the mean and variability of the water table depth. This systematic bias frequently resulted in negative Nash-Sutcliffe efficiency (NSE) values despite high temporal correlation, suggesting that evaluating LSTM performance using NSE alone can be misleading. We also found that the LSTM_ENV model performs better at stations characterised by higher specific yield and transmissivity, and that it mostly uses meteorological input features (e.g. precipitation) and topographic features (e.g. elevation and height above nearest drainage) to make predictions at unseen stations. These findings highlight the potential of LSTMs for regional groundwater level predictions and the value of interpretability tools for understanding how such models achieve their performance and whether the environmental features used are informative.
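The abstract's point that NSE can be negative despite high temporal correlation can be illustrated with a small synthetic example. The sketch below is not taken from the paper or its code repository: the series, the 3 m offset and the halved amplitude are invented purely for illustration, and the function names `nse` and `pearson_r` are hypothetical helpers. It assumes the standard definition of NSE (one minus the ratio of squared errors to the variance of the observations about their mean).

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def pearson_r(obs, sim):
    """Pearson correlation between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])


# Synthetic monthly water table depth (illustrative values only, in metres).
t = np.arange(60)                                # five years of monthly steps
obs = 10.0 + 1.0 * np.sin(2 * np.pi * t / 12)    # "observed" seasonal signal

# A simulation with perfect timing but a biased mean (+3 m)
# and underestimated variability (half the amplitude).
sim = 13.0 + 0.5 * np.sin(2 * np.pi * t / 12)

r = pearson_r(obs, sim)
score = nse(obs, sim)
print(f"correlation = {r:.3f}, NSE = {score:.2f}")
```

In this constructed case the simulated series tracks the timing of the observations perfectly, so the correlation is essentially 1, yet the offset in the mean dominates the squared error and pushes NSE far below zero. Reporting correlation (or a bias and variability term) alongside NSE separates these effects.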

Received: 05 Jun 2026 – Discussion started: 19 Jun 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 11 Aug 2026)

RC1: 'Comment on egusphere-2026-3267', Benedikt Heudorfer, 23 Jul 2026

The study presents the first (or one of the first) global/regional deep-learning-based groundwater model for the UK. The results are very well worked out throughout, the language and presentation are excellent, and the content represents a valuable contribution to the scientific literature. I do not have major comments and propose publication after the minor comments (below) are addressed.

Citation: https://doi.org/10.5194/egusphere-2026-3267-RC1

Supplement

https://doi.org/10.5194/egusphere-2026-3267-supplement

Model code and software

LSTM_groundwater_modelling Qidong Fang https://github.com/QidongFang1203/LSTM_groundwater_modelling

Short summary

It is unclear whether deep learning models can predict groundwater levels at places without measurements using attributes of those places. Our deep learning model captured temporal variation well, especially in more responsive aquifers, similarly to a process-based model. Interpretation tools showed that meteorological and environmental information about each place helped predictions at unseen wells. We highlight the potential of deep learning models for regional groundwater level predictions.
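One common way to give a sequence model information about a place is to repeat the place's static attributes at every time step and concatenate them with the time-varying meteorological inputs. The sketch below shows that construction for the two variants named in the abstract. It is an assumption about how such inputs could be assembled, not a description of the authors' implementation: the array sizes, the identifier range and the helper `build_inputs` are all hypothetical, and the data are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_stations, n_steps = 4, 365
n_dynamic, n_static = 3, 5

# Placeholder time-varying inputs (e.g. meteorological forcing) per station.
dynamic = rng.normal(size=(n_stations, n_steps, n_dynamic))

# LSTM_ENV-style static inputs: environmental attributes of each station.
static_env = rng.normal(size=(n_stations, n_static))

# LSTM_RND-style static inputs: one unique random integer per station.
# The range 0..999 is an arbitrary assumption.
static_rnd = rng.choice(1000, size=n_stations, replace=False)
static_rnd = static_rnd.reshape(-1, 1).astype(float)


def build_inputs(dynamic, static):
    """Tile static features over time and append them to the dynamic ones."""
    n_t = dynamic.shape[1]
    tiled = np.repeat(static[:, None, :], n_t, axis=1)  # (stations, time, static)
    return np.concatenate([dynamic, tiled], axis=-1)


x_env = build_inputs(dynamic, static_env)
x_rnd = build_inputs(dynamic, static_rnd)
print(x_env.shape, x_rnd.shape)
```

Under this construction the two variants differ only in what the static columns contain. Attributes can be supplied for a station never seen in training, whereas a random identifier carries no transferable information about a new place.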

