https://doi.org/10.5194/egusphere-2026-2288
01 Jun 2026
Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

An ESMValTool-based framework for sanity checks, physical consistency and climate fidelity during model development – ICONEval v1.0

Axel Lauer, Manuel Schlund, Lisa Bock, Birgit Hassler, Gunnar Behrens, Bettina Gier, Lukas Lindenlaub, Stephan Lorenz, Jan-Hendrik Malles, Wolfgang A. Müller, Trang van Pham, Katja Weigel, Guang Zeng, and Veronika Eyring

Abstract. Continuous evaluation and performance monitoring during the development of Earth System Models (ESMs) are essential to identify potential problems early, such as unrealistic behavior of climate-relevant quantities, insufficient skill in reproducing the observed basic climate state, or violations of physical laws. The latter is particularly important for the emerging class of hybrid machine learning (ML) enhanced ESMs, where data-driven components are integrated with physics-based model formulations. ESMs used for projections of future climate continue to increase in complexity and resolution. Efficient and user-friendly tools such as the Earth System Model Evaluation Tool (ESMValTool) can therefore greatly support the assessment of a model. So far, ESMValTool has focused primarily on providing a broad collection of community-developed evaluation diagnostics and recipes, allowing users to perform a large variety of rather detailed assessments across different domains. A main application of the tool has been the assessment of multiple ESMs, in particular those participating in the Coupled Model Intercomparison Project (CMIP). Here, we introduce ICONEval, an open-source evaluation framework using ESMValTool that complements existing capabilities by enabling rapid, reproducible, and physically informed assessments of model performance, including during development. ICONEval provides efficient parallel processing of ESMValTool recipes and can generate HTML summary reports, making it easy to automate and visualize the evaluation and monitoring of performance during model development. The new capabilities are grouped into three complementary categories: (1) sanity checks, (2) physical consistency checks, and (3) climate fidelity diagnostics. The sanity checks assess whether global mean values of climate-relevant variables are within the bounds derived from observational and reanalysis datasets. The physical consistency checks aim to identify potential violations of constraints imposed by fundamental physics, such as conservation of total air mass, realistic variability of atmospheric water vapor with temperature, or the temperature dependence of the cloud ice fraction. The climate fidelity diagnostics assess important climate variables from different ESM components (atmosphere, ocean, and land). Here, we demonstrate this extension of the ESMValTool capabilities by applying the new diagnostics to a historical simulation performed with the ICON-XPP model as an illustrative example. The three-step assessment presented here can be used efficiently to compare different model configurations or versions, for example when testing new or updated parameterizations, including hybrid ML-enhanced (MLe) ESMs, and also supports emerging community benchmarking standards such as ClimateBench.
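
As a rough illustration of the first category, a sanity check of the kind described above can be sketched as an area-weighted global mean compared against a lower and an upper bound. This is a minimal sketch and not the ICONEval or ESMValTool implementation: the function names `global_mean` and `within_bounds`, the regular latitude-longitude grid with cosine-latitude weights, and the numerical bounds are all assumptions made for illustration only.

```python
import math


def global_mean(field, lats_deg):
    """Area-weighted mean of a 2-D field on a regular lat-lon grid.

    `field` is a list of rows, one row of longitude values per latitude.
    Each row is weighted by cos(latitude), a common approximation of the
    grid-cell area on a regular grid (an assumption of this sketch).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(field, lats_deg):
        weight = math.cos(math.radians(lat))
        weighted_sum += weight * sum(row) / len(row)
        weight_total += weight
    return weighted_sum / weight_total


def within_bounds(value, lower, upper):
    """Return True if the value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


# Toy near-surface temperature field in K on three latitude bands.
lats = [-60.0, 0.0, 60.0]
tas = [[275.0, 277.0], [299.0, 301.0], [279.0, 281.0]]

mean_tas = global_mean(tas, lats)
# Hypothetical bounds; in practice they would be derived from
# observational and reanalysis datasets.
passed = within_bounds(mean_tas, 285.0, 291.0)
```

With cosine-latitude weighting, the equatorial band counts twice as much as each band at 60 degrees, so the toy field passes the hypothetical bounds even though two of the three bands lie below the lower bound.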

Competing interests: At least one of the (co-)authors is a member of the editorial board of Geoscientific Model Development.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 27 Jul 2026)


Data sets

An ESMValTool-based framework for sanity checks, physical consistency and climate fidelity during model development – ICONEval v1.0 Axel Lauer, Manuel Schlund, Lisa Bock, Birgit Hassler, Gunnar Behrens, Bettina Gier, Lukas Lindenlaub, Stephan Lorenz, Jan-Hendrik Malles, Wolfgang A. Müller, Trang v. Pham, Katja Weigel, Guang Zeng, and Veronika Eyring https://doi.org/10.5281/zenodo.19664576

Model code and software

Earth System Model Evaluation Tool (ESMValTool) Andela, Bouwe; Broetz, Bjoern; de Mora, Lee; Drost, Niels; Eyring, Veronika; Koldunov, Nikolay; Lauer, Axel; Mueller, Benjamin; Predoi, Valeriu; Righi, Mattia; Schlund, Manuel; Vegas-Regidor, Javier; Zimmermann, Klaus; Adeniyi, Kemisola; Castellani, Giulia; Arnone, Enrico; Bellprat, Omar; Berg, Peter; Billows, Chris; Blockley, Ed; Bock, Lisa; Bodas-Salcedo, Alejandro; Caron, Louis-Philippe; Carvalhais, Nuno; Cionni, Irene; Cortesi, Nicola; Corti, Susanna; Crezee, Bas; Davin, Edouard Leopold; Davini, Paolo; Deser, Clara; Diblen, Faruk; Docquier, David; Dreyer, Laura; Ehbrecht, Carsten; Earnshaw, Paul; Geddes, Theo; Gier, Bettina; Gillett, Ed; Gonzalez-Reviriego, Nube; Goodman, Paul; Hagemann, Stefan; Hall, Sophie; Hardacre, Catherine; von Hardenberg, Jost; Hassler, Birgit; Heuer, Helge; Hogan, Emma; Hunter, Alasdair; Kadow, Christopher; Kindermann, Stephan; Koirala, Sujan; Kuehbacher, Birgit; Lledó, Llorenç; Lejeune, Quentin; Lembo, Valerio; Little, Bill; Loosveldt-Tomas, Saskia; Lorenz, Ruth; Lovato, Tomas; Lucarini, Valerio; Malinina, Elizaveta; Massonnet, François; Mohr, Christian Wilhelm; Amarjiit, Pandde; Parsons, Naomi; Pérez-Zanón, Núria; Phillips, Adam; Proft, Max; Russell, Joellen; Sandstad, Marit; Sellar, Alistair; Senftleben, Daniel; Serva, Federico; Sillmann, Jana; Stacke, Tobias; Storkey, Dave; Swaminathan, Ranjini; Tomkins, Katherine; Torralba, Verónica; Weigel, Katja; Sarauer, Ellen; Schulze, Kirsten; Roberts, Charles; Kalverla, Peter; Alidoost, Sarah; Verhoeven, Stefan; Vreede, Barbara; Smeets, Stef; Soares Siqueira, Abel; Kazeroni, Rémi; Potter, Jerry; Winterstein, Franziska; Beucher, Romain; Kraft, Jeremy; Ruhe, Lukas; Bonnet, Pauline; Munday, Gregory; Chun, Felicity; Ellis, Hannah https://doi.org/10.5281/zenodo.3401363

ICONEval Schlund, Manuel; Bock, Lisa https://doi.org/10.5281/zenodo.18937450

Latest update: 01 Jun 2026
Short summary
ICONEval is a new framework that helps researchers continuously monitor and check their climate models during the development phase. It builds on the ESMValTool software package to run tests verifying that the results are plausible, that they follow physical laws, and that the model has sufficient skill in reproducing the observed climate. An important aim is to make it easier to spot model errors early, for example when implementing new model components that use machine learning.