Preprints
https://doi.org/10.5194/egusphere-2024-3235
13 Nov 2024

Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning

Admin Husic, John Hammond, Adam N. Price, and Joshua K. Roundy

Abstract. Large-scale hydrologic models are increasingly being developed for operational use in the forecasting and planning of water resources. However, the predictive strength of such models depends on how well they resolve various functions of catchment hydrology, which are influenced by gradients in climate, topography, soils, and land use. Most assessments of these hydrologic models have been limited to traditional statistical approaches. The rise of machine learning techniques can provide novel insights into identifying process deficiencies in large-scale hydrologic models. In this study, we train a random forest model to predict the Kling-Gupta Efficiency (KGE) of National Water Model (NWM) and National Hydrologic Model (NHM) predictions for 4,383 streamgages across the conterminous United States. Thereafter, we explain the local and global controls that 48 catchment attributes exert on KGE prediction using interpretable Shapley values. Overall, we find that soil water content is the most impactful feature controlling successful model performance, suggesting that soil water storage is difficult for hydrologic models to resolve, particularly for arid locations. We identify non-linear thresholds beyond which predictive performance decreases for NWM and NHM. For example, soil water content less than 210 mm, precipitation less than 900 mm/yr, road density greater than 5 km/km², and lake area percent greater than 10 % contributed to lower KGE values. These results suggest that improvements in how these influential processes are represented could result in the largest increases in predictive performance of NWM and NHM. This study demonstrates the utility of interrogating process-based models using data-driven techniques, which has broad applicability and potential for improving the next generation of large-scale hydrologic models.
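
For readers unfamiliar with the performance metric named in the abstract, the following is a minimal sketch of how a Kling-Gupta Efficiency can be computed for one streamgage. It assumes the original three-component formulation (correlation, variability ratio, bias ratio); the abstract does not state which KGE variant the authors used, and the function name `kling_gupta_efficiency` and the sample flows are illustrative only.

```python
import math
from statistics import fmean, pstdev


def kling_gupta_efficiency(sim, obs):
    """Return the KGE of simulated versus observed streamflow.

    Assumed form: KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2).
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have the same length (>= 2)")

    mu_s, mu_o = fmean(sim), fmean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    if sd_s == 0 or sd_o == 0 or mu_o == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean observations")

    # Pearson correlation between the simulated and observed series
    cov = fmean((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs))
    r = cov / (sd_s * sd_o)

    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical flows: a simulation that doubles every observed value
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 4.0, 6.0, 8.0]
print(kling_gupta_efficiency(observed, observed))   # perfect match
print(kling_gupta_efficiency(simulated, observed))  # penalized for bias and variability
```

A value of 1 indicates a perfect match. In the doubled example the correlation is perfect, so the entire penalty comes from the variability and bias terms.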

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Journal article(s) based on this preprint

17 Sep 2025
Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning
Admin Husic, John Hammond, Adam N. Price, and Joshua K. Roundy
Hydrol. Earth Syst. Sci., 29, 4457–4472, https://doi.org/10.5194/hess-29-4457-2025, 2025

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2024-3235', Jonathan Frame, 13 Jan 2025
    • AC1: 'Reply on RC1', Admin Husic, 25 Feb 2025
  • RC2: 'Comment on egusphere-2024-3235', Anonymous Referee #2, 16 Jan 2025
    • AC2: 'Reply on RC2', Admin Husic, 25 Feb 2025

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
ED: Publish subject to minor revisions (further review by editor) (10 Mar 2025) by Frederiek Sperna Weiland
AR by Admin Husic on behalf of the Authors (14 Mar 2025)  Author's response   Author's tracked changes   Manuscript 
ED: Publish subject to minor revisions (review by editor) (18 May 2025) by Frederiek Sperna Weiland
AR by Admin Husic on behalf of the Authors (21 May 2025)  Author's response   Author's tracked changes   Manuscript 
ED: Publish as is (17 Jul 2025) by Frederiek Sperna Weiland
AR by Admin Husic on behalf of the Authors (17 Jul 2025)  Manuscript 

Viewed

Total article views: 803 (including HTML, PDF, and XML)
  • HTML: 669
  • PDF: 112
  • XML: 22
  • Total: 803
  • Supplement: 34
  • BibTeX: 19
  • EndNote: 32
Views and downloads are calculated since 13 Nov 2024.

Viewed (geographical distribution)

Total article views: 739 (including HTML, PDF, and XML), of which 739 have a defined geography and 0 are of unknown origin.
Latest update: 17 Sep 2025

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
We used explainable machine learning to evaluate the accuracy of two continental-scale hydrologic models. We analyzed a suite of catchment attributes and found that soil water content had the biggest impact on model performance, especially in dry areas. Key thresholds for variables like precipitation and road density were identified, which could guide future improvements in these models. Our findings highlight the potential of data-driven methods to inform process-based models.