Process diagnostics of snowmelt runoff in global hydrological models: Part I – Model evaluation from the perspective of robustness
Abstract. Accurate simulation of snowmelt runoff (SMR) is critical for water resource management. However, despite the abundance of global hydrological models, little is known about their SMR performance. This study first presents a comprehensive evaluation of SMR across 15 state-of-the-art large-scale models and runoff products, focusing on their biases in three first-order indices: the total volume (Qsum), peak flow (Qmax), and centroid timing (CTQ) of runoff in the snowmelt period. Then, using 1,513 snow-dominated basins that span a range of basin complexity, we propose a novel model robustness metric that quantifies how model performance changes with basin complexity. Our results reveal that (1) most models underestimate Qsum and Qmax and predict CTQ too early, with biases particularly pronounced in regions such as the western United States, northern Europe, and northeastern China; (2) model biases systematically increase with basin complexity: CTQ bias is most sensitive to increasing mean elevation and topographic variability, whereas Qsum and Qmax biases are mainly shaped by mean elevation and the diversity of vegetation types in the basin; and (3) in the robustness assessment, observation-constrained runoff products perform best, followed by the ISIMIP3a and ISIMIP2a models. Overall, global hydrological models simulate SMR better than land surface models. Notably, land surface models perform substantially better for CTQ than for Qsum or Qmax, highlighting their structural advantage in capturing melt timing relative to runoff magnitude. This study provides a benchmark for SMR evaluation and a new framework for assessing model performance under basin complexity, offering crucial insights for future model development and uncertainty reduction.
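As an illustration of the three first-order indices named above, the following minimal Python sketch computes Qsum, Qmax, and CTQ for one snowmelt period and compares a simulated series against an observed one. It rests on several assumptions that the abstract does not state: runoff is given as a daily series already clipped to the snowmelt period, CTQ is taken as the flow-weighted mean day, and biases are expressed as relative differences for Qsum and Qmax and as a difference in days for CTQ. The function names `smr_indices` and `smr_biases` and the input values are hypothetical and synthetic; they are not the study's code or data.

```python
def smr_indices(q, days=None):
    """Return Qsum, Qmax and CTQ for one snowmelt period.

    q    : daily runoff values within the snowmelt period (assumed input)
    days : day index for each value; defaults to 1..len(q)
    """
    q = [float(v) for v in q]
    if days is None:
        days = range(1, len(q) + 1)
    qsum = sum(q)
    if qsum <= 0:
        raise ValueError("total runoff must be positive to define CTQ")
    qmax = max(q)
    # Assumed definition: centroid timing as the flow-weighted mean day.
    ctq = sum(d * v for d, v in zip(days, q)) / qsum
    return {"Qsum": qsum, "Qmax": qmax, "CTQ": ctq}


def smr_biases(q_sim, q_obs):
    """Bias of simulated against observed indices (assumed bias definitions)."""
    sim = smr_indices(q_sim)
    obs = smr_indices(q_obs)
    return {
        # Negative values mean the simulation underestimates the index.
        "Qsum_rel_bias": (sim["Qsum"] - obs["Qsum"]) / obs["Qsum"],
        "Qmax_rel_bias": (sim["Qmax"] - obs["Qmax"]) / obs["Qmax"],
        # Negative values mean the simulated centroid occurs too early.
        "CTQ_bias_days": sim["CTQ"] - obs["CTQ"],
    }


# Synthetic five-day example, for illustration only.
q_obs = [1.0, 2.0, 4.0, 2.0, 1.0]
q_sim = [2.0, 3.0, 2.0, 1.0, 0.0]
biases = smr_biases(q_sim, q_obs)
print(biases)
```

In this synthetic case all three biases are negative, which corresponds to the pattern described in result (1): lower total volume, lower peak flow, and an earlier centroid than observed.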