Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning

Husic, Admin; Hammond, John; Price, Adam N.; Roundy, Joshua K.

doi:https://doi.org/10.5194/egusphere-2024-3235

Abstract. Large-scale hydrologic models are increasingly being developed for operational use in the forecasting and planning of water resources. However, the predictive strength of such models depends on how well they resolve various functions of catchment hydrology, which are influenced by gradients in climate, topography, soils, and land use. Most assessments of these hydrologic models have been limited to traditional statistical approaches. The rise of machine learning techniques can provide novel insights into identifying process deficiencies in large-scale hydrologic models. In this study, we train a random forest model to predict the Kling-Gupta Efficiency (KGE) of National Water Model (NWM) and National Hydrologic Model (NHM) predictions for 4,383 streamgages across the conterminous United States. Thereafter, we explain the local and global controls that 48 catchment attributes exert on KGE prediction using interpretable Shapley values. Overall, we find that soil water content is the most impactful feature controlling successful model performance, suggesting that soil water storage is difficult for hydrologic models to resolve, particularly for arid locations. We identify non-linear thresholds beyond which predictive performance decreases for NWM and NHM. For example, soil water content less than 210 mm, precipitation less than 900 mm/yr, road density greater than 5 km/km², and lake area greater than 10 % of catchment area contributed to lower KGE values. These results suggest that improving how these influential processes are represented could yield the largest increases in the predictive performance of NWM and NHM. This study demonstrates the utility of interrogating process-based models using data-driven techniques, an approach that has broad applicability and potential for improving the next generation of large-scale hydrologic models.
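
For readers unfamiliar with the target metric, the following is a minimal sketch of how a KGE score could be computed for one streamgage. It assumes the commonly used 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of standard deviations, and beta the ratio of means between simulated and observed flow. The abstract does not state which KGE variant the study uses, so the formulation, the function name `kling_gupta_efficiency`, and the sample values are illustrative assumptions rather than the authors' implementation.

```python
import math
from statistics import fmean, pstdev


def kling_gupta_efficiency(simulated, observed):
    """Hypothetical helper: KGE for paired simulated/observed flow series.

    Assumes the 2009 formulation; other KGE variants differ in the
    variability term.
    """
    if len(simulated) != len(observed) or len(observed) < 2:
        raise ValueError("series must have equal length and at least two values")

    mean_sim, mean_obs = fmean(simulated), fmean(observed)
    std_sim, std_obs = pstdev(simulated), pstdev(observed)
    if std_sim == 0 or std_obs == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean observations")

    # Population covariance, matching the population standard deviations above
    covariance = fmean(
        [(s - mean_sim) * (o - mean_obs) for s, o in zip(simulated, observed)]
    )

    r = covariance / (std_sim * std_obs)  # timing and shape agreement
    alpha = std_sim / std_obs             # variability ratio
    beta = mean_sim / mean_obs            # bias ratio

    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


# Made-up flow values, for illustration only
observed_flow = [1.0, 2.0, 3.0, 4.0]
perfect_kge = kling_gupta_efficiency(observed_flow, observed_flow)
doubled_kge = kling_gupta_efficiency([2.0 * q for q in observed_flow], observed_flow)
```

A perfect simulation scores 1. A simulation that doubles every observed value keeps r = 1 but has alpha = beta = 2, giving 1 - sqrt(2), or about -0.41, which shows how bias and variability errors lower KGE even when timing is perfect.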

Received: 16 Oct 2024 – Discussion started: 13 Nov 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 7251 KB)

Supplement (9423 KB)
