Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning

Admin Husic; John Christopher Hammond; Adam N. Price; Joshua Roundy

doi:10.5194/hess-29-4457-2025

Hydrology and Earth System Sciences

Abstract

Large-scale hydrologic models are increasingly being developed for operational use in the forecasting and planning of water resources. However, the predictive strength of such models depends on how well they resolve various functions of catchment hydrology, which are influenced by gradients in climate, topography, soils, and land use. Most assessments of hydrologic model uncertainty have been limited to traditional statistical methods. Here, we present a proof-of-concept approach that uses interpretable machine learning techniques to provide post hoc assessment of model sensitivity and process deficiency in hydrologic models. We train a random forest model to predict the Kling–Gupta efficiency (KGE) of National Water Model (NWM) and National Hydrologic Model (NHM) streamflow predictions for 4383 stream gauges in the conterminous United States. Thereafter, we explain the local and global controls that 48 catchment attributes exert on KGE prediction using interpretable Shapley values. Overall, we find that soil water content is the most impactful feature controlling successful model performance, suggesting that soil water storage is difficult for hydrologic models to resolve, particularly for arid locations. We identify nonlinear thresholds beyond which predictive performance decreases for NWM and NHM. For example, soil water content less than 210 mm, precipitation less than 900 mm yr⁻¹, road density greater than 5 km km⁻², and lake area percent greater than 10 % contributed to lower KGE values. These results suggest that improvements in how these influential processes are represented could result in the largest increases in NWM and NHM predictive performance. This study demonstrates the utility of interrogating process-based models using data-driven techniques, which has broad applicability and potential for improving the next generation of large-scale hydrologic models.
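As a point of reference for the metric named above, the following is a minimal sketch of the Kling-Gupta efficiency in its original 2009 form, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation between simulated and observed streamflow, alpha is the ratio of their standard deviations, and beta is the ratio of their means. Which KGE variant the study used is not stated in the abstract, so this form is an assumption, and the function name `kling_gupta_efficiency` is illustrative only.

```python
import math


def kling_gupta_efficiency(simulated, observed):
    """Return the KGE (2009 form) for paired simulated and observed flows.

    Assumption: the abstract does not say which KGE variant was used,
    so this sketch follows the original three-component definition.
    """
    if len(simulated) != len(observed) or len(observed) < 2:
        raise ValueError("series must have equal length of at least 2")

    n = len(observed)
    mean_sim = sum(simulated) / n
    mean_obs = sum(observed) / n

    # Population variances and covariance (the 1/n factors cancel in r and alpha).
    var_sim = sum((s - mean_sim) ** 2 for s in simulated) / n
    var_obs = sum((o - mean_obs) ** 2 for o in observed) / n
    cov = sum((s - mean_sim) * (o - mean_obs)
              for s, o in zip(simulated, observed)) / n

    if var_sim == 0 or var_obs == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero mean flow")

    r = cov / math.sqrt(var_sim * var_obs)   # correlation component
    alpha = math.sqrt(var_sim / var_obs)     # variability ratio
    beta = mean_sim / mean_obs               # bias ratio

    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A KGE of 1 indicates a perfect match between simulated and observed flows; values fall as correlation, variability, or bias depart from their ideal value of 1.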

Suggested Citation

Husic, A., Hammond, J.C., Price, A.N., and Roundy, J., 2025, Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning: Hydrology and Earth System Sciences, v. 29, p. 4457-4472, https://doi.org/10.5194/hess-29-4457-2025.

Additional publication details
Publication type	Article
Publication Subtype	Journal Article
Title	Interrogating process deficiencies in large-scale hydrologic models with interpretable machine learning
Series title	Hydrology and Earth System Sciences
DOI	10.5194/hess-29-4457-2025
Volume	29
Publication Date	September 17, 2025
Year Published	2025
Language	English
Publisher	Copernicus Publications
Contributing office(s)	Maryland-Delaware-District of Columbia Water Science Center
Description	16 p.
First page	4457
Last page	4472
Country	United States
Other Geospatial	conterminous United States