https://doi.org/10.5194/egusphere-2025-6212
27 Dec 2025
Status: this preprint is open for discussion and under review for Hydrology and Earth System Sciences (HESS).

Introducing the Model Fidelity Metric (MFM) for robust and diagnostic land surface model evaluation

Zezhen Wu, Zhongwang Wei, Xingjie Lu, Nan Wei, Lu Li, Shupeng Zhang, Hua Yuan, Shaofeng Liu, and Yongjiu Dai

Abstract. The accurate evaluation of Land Surface Models (LSMs) is fundamental to their development and application. However, standard metrics such as the Nash-Sutcliffe Efficiency (NSE) and Kling-Gupta Efficiency (KGE) have well-documented shortcomings. They rely on moment-based statistics (mean, variance, and correlation), which are poorly suited to land surface modelling data that are typically non-normal and skewed. These metrics can mislead through error compensation, instability when variability is low, and the conflation of magnitude and phase errors, leading to inaccurate model assessments. To address these flaws, we propose the Model Fidelity Metric (MFM), a novel evaluation framework built on robust statistics and information theory. MFM integrates three orthogonal dimensions of model performance within a Euclidean framework: 1) Accuracy, measured by the robust Normalized Mean Absolute p-Error (NMAEp) and penalized for timing errors via a Phase Penalty Factor (PPF); 2) Variability, quantified using the information-theoretic Scaled and Unscaled Shannon Entropy differences (SUSE); and 3) Distribution Similarity, assessed non-parametrically using the Percentage of Histogram Intersection (PHI). We evaluated MFM against traditional metrics using targeted synthetic experiments and the large-sample CAMELS dataset. Our results demonstrate that MFM provides a more authentic and reliable assessment of model fidelity. MFM proved immune to the error compensation effects that mislead KGE and remained stable in low-variability scenarios where NSE and KGE fail. Furthermore, MFM provides superior diagnostic capabilities by decoupling phase and magnitude errors and decomposing performance into its core components. This work highlights the need to move beyond traditional moment-based metrics. We advocate adopting robust, diagnostic frameworks such as MFM to support the development of more trustworthy LSMs.
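For readers who want to experiment with the baseline metrics named in the abstract, the following is a minimal sketch and not the authors' implementation. The function names are hypothetical. NSE and KGE follow their standard published definitions (KGE in its 2009 form). The function histogram_intersection is a generic normalized histogram overlap that is only assumed to resemble the idea behind PHI. The preprint's own definitions of NMAEp, PPF, SUSE, PHI, and their Euclidean combination are not given on this page and are not reproduced here.

```python
import numpy as np


def nse(sim, obs):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # The denominator tends to zero when the observations barely vary,
    # which is the low-variability instability the abstract refers to.
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(sim, obs):
    """Kling-Gupta Efficiency (2009 form) from correlation, variability ratio, bias ratio."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


def histogram_intersection(sim, obs, bins=20):
    """Generic overlap of two normalized histograms on shared bin edges.

    Assumption: illustrates the idea of a histogram-intersection score;
    it is not claimed to match the preprint's PHI definition.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    lo = min(sim.min(), obs.min())
    hi = max(sim.max(), obs.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(sim, bins=edges)
    q, _ = np.histogram(obs, bins=edges)
    # Sum of bin-wise minima of the two relative-frequency histograms.
    return float(np.minimum(p / p.sum(), q / q.sum()).sum())


if __name__ == "__main__":
    obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    sim = obs + 0.5  # constant offset: correlation and variability stay perfect
    print("NSE:", nse(sim, obs))
    print("KGE:", kge(sim, obs))
    print("Histogram overlap:", histogram_intersection(sim, obs))
```

In this sketch all three scores equal 1 for a perfect simulation, and NSE equals 0 when the simulation is simply the observed mean. Because NSE divides by the observed variance, a nearly constant observed series can turn small errors into very large negative scores.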

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 07 Feb 2026)

Latest update: 27 Dec 2025
Short summary
Accurately evaluating land surface models is crucial for reliable climate forecasts and water resource management. We propose a new evaluation metric that avoids some flaws of traditional metrics by focusing on accuracy, variability, and pattern similarity. This work offers a more reliable alternative for evaluating land surface models, supporting better decisions in land surface model development.