This work is distributed under the Creative Commons Attribution 4.0 License.
Optimisation of ICON-CLM for the EURO-CORDEX domain: developments, sensitivities, tuning
Abstract. Optimising model performance to reduce model biases is a challenging task in global and regional climate modelling, and is especially relevant for free-running climate change simulations. This challenge is addressed in the present study through a systematic regional climate model (RCM) tuning strategy using a novel methodology, which includes an iterative update of the reference configuration and combines expert judgement with objective tuning using a Linear Meta-Model optimisation (LiMMo) to derive an optimised model configuration. We applied this methodology to the regional climate model ICON-CLM setup over Europe at 12 km grid size (EURO-CORDEX domain) in order to reduce, e.g., the overestimation of incoming solar radiation and the underestimation of 2-m temperature. During this process, the sensitivity of the model to changes of 29 model parameters and their physical consistency were tested and investigated. Comparing the results of optimisation by expert judgement with LiMMo showed that the latter not only confirms the expert judgement focusing on a priori known highly sensitive parameters, but additionally, it allows a model configuration fine-tuning with an explicit control over the tuning process and makes parameter combinations more efficient. With reference to the default ICON numerical weather prediction (NWP) configuration, the model optimisation yielded significant improvements for a real climate-mode simulation use case. For example, biases in incoming shortwave radiation could be reduced by 30 % and latent heat flux biases by 15 % by tuning cloud parameters in combination with surface flux parameters. Furthermore, the new configuration could only be reached by using revised external datasets, including transient aerosols. Based on the community-based coordinated parameter tuning, we recommend an ICON-CLM model configuration for the EURO-CORDEX domain that is already being used for the downscaling of global CMIP6 simulations.
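The core idea of a linear meta-model tuning step as described in the abstract can be illustrated with a minimal sketch. All parameter sensitivities, bias values, and metric weights below are hypothetical placeholders for illustration only; they are not values from the paper or from ICON-CLM.

```python
# Hedged sketch of a linear meta-model (LiMMo-style) tuning step.
# All numbers are illustrative assumptions, not ICON-CLM values.
import numpy as np

# One-at-a-time perturbation experiments give the sensitivity of each
# scored bias (rows) to each tuning parameter (columns), with parameters
# normalised to their plausible range [-1, 1].
S = np.array([
    [ 0.8, -0.3],   # d(SW-down bias)   / d(param_1, param_2)
    [-0.2,  0.5],   # d(latent-heat bias)/ d(param_1, param_2)
])
b0 = np.array([12.0, -6.0])   # biases of the reference configuration
w = np.array([1.0, 0.5])      # expert-chosen metric weights

# Linear meta-model: b(x) ~ b0 + S @ x.  Minimise the weighted squared
# bias over parameter offsets x via least squares, then clip to stay
# inside the sampled parameter range.
W = np.diag(np.sqrt(w))
x_opt, *_ = np.linalg.lstsq(W @ S, -W @ b0, rcond=None)
x_opt = np.clip(x_opt, -1.0, 1.0)

b_new = b0 + S @ x_opt        # predicted biases of the tuned setup
print(x_opt, b_new)
```

Note that even in this toy version the "objective" step depends on expert input: the choice of parameters, metrics, and weights `w` is made by humans, which is precisely the entanglement the referee comments on below.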
Status: open (until 05 Apr 2026)
- RC1: 'Comment on egusphere-2025-4726', Anonymous Referee #1, 16 Feb 2026
- RC2: 'Comment on egusphere-2025-4726', Gregory Elsaesser, 12 Mar 2026
Review of Geyer et al. “Optimisation of ICON-CLM for the EURO-CORDEX domain: developments, sensitivities, tuning”
Summary and Overall Comments:
The authors describe their approach for improving ICON-CLM (for a regional domain), with a particular focus on integrating more objective tuning efforts into the model optimization approach (which is exciting to see). Up to now, a more automated tuning method has not been used, and this addition is valuable and in line with broader community efforts to make tuning more objective. With respect to the model, the authors are particularly aiming to reduce the overestimation of incoming SW radiation and too-low 2-m temperature biases as part of their overall optimization efforts. The paper is mostly OK as written (I recommend somewhere between minor and major revision), and I hesitate to recommend more as it is already long (with ~20 figures and ~10 tables). However, I do think it is worthwhile to clarify throughout exactly how the authors view or partition tuning into "expert" and "objective" components, because even objective tuning requires decisions on parameters and metrics provided by experts (the 'computer' itself does not decide on parameters and metrics, so the results are human-dependent too). I think of tuning from the perspective of advances in automation and AI enabling more parameters and metrics to be considered (making the whole process more efficient), but with experts always involved in deciding which parameters to tune and which metrics to tune against -- I do not think the two can be fully separated, which the paper seems to imply as a new approach (although perhaps that is accidental, and only minor wording changes are needed throughout?). Additional general comments/questions are laid out below.
General Comments:
1.
Figure 1 is very helpful, but it contains quite a bit of text and acronyms, and I am still not certain I understand the formal steps for arriving at a tuning that balances expert judgment and objective tuning. Existing automated tuning (or auto-calibration) efforts are not necessarily trying to silo the two. Are the authors proposing a new way of merging the two aspects of tuning that current autotuning or autocalibration efforts are not considering? At present, autotuning still implies expert involvement (because someone has to choose the parameters and the metrics they compare against) -- thus, the comment around line 892, and elsewhere, that "LiMMo tuning cannot achieve the same model quality as expert tuning" sort of assumes that LiMMo is determining things on its own; but if you coded up the same features that the experts are looking at, couldn't LiMMo do better? I do not fully understand the implied extent of separation between experts and objective tuning referenced throughout.
2.
In the tuning strategy, it is mentioned that evaluating the tuning effects of combined parameter changes involves experts -- how exactly is this done?
3.
How is structural error in the model accounted for? How is it ensured that parameter tuning is not compensating for structural errors or is a possible compensation not a concern in their modeling efforts/goals?
4.
For the observations used in the tuning, is there any concern about observational uncertainty and its role in the parameter tuning process? Are all the observations, many of which are retrievals, “truthful” enough or would accounting for uncertainty in the process influence the parameter settings?
5.
How sensitive are results to the weighting chosen? (referencing the weights in Table 2)
6.
The authors note the value of a new score (ScoPi) but only an in-press (I think) publication is provided. This should be laid out in the paper as it seems a key point when scoring across (or accounting for) very different metrics, or at least, key equations and text relevant to this need to be provided in a supplement.
7.
This paper has a lot of acronyms that were hard to keep track of and may not always have been defined at first use; please check that all acronyms are defined in the revision.
Addressing some of my general comments above would only add length to the paper, and I wonder whether some parts of the existing manuscript could be moved to a supplementary section.
Specific comments:
Line 3 in abstract: the RCM acronym is not defined yet; I assume it is "regional climate modelling", as introduced in the first line.
Line 8: The following text/sentence is confusing to me. Additionally, it is not obvious what a “more efficient parameter combination” is -- can the sentence below be rewritten more clearly?
“Comparing the results of optimisation by expert judgement with LiMMo showed that the latter not only confirms the expert judgement focusing on a priori known highly sensitive parameters, but additionally, it allows a model configuration fine-tuning with an explicit control over the tuning process and makes parameter combinations more efficient.”
Line 13: what does it mean to say the new “configuration could only be reached by using revised datasets” – isn’t the new configuration related to optimized parameter combinations?
Line 15: The paper suggests an iteration between expert and objective tuning efforts, which to me does not exactly suggest “community-based coordinated parameter tuning” as written in the abstract. What does community-based mean as written here? Is this the appropriate wording?
Line 193: “further fine-tune the” should be “further fine-tuning of the”
Line 216: are the first two of the three parameterizations really not as uncertain? Perhaps the first, but detrainment of water into an anvil ties more to the time tendency of anvil area (the purpose of prognostic parameterizations) than to anvil area (A) itself. Are there really no uncertain parameters in these routines?
Lines 345-350: it is written: "…in previous studies devoted to objective calibration, Monte Carlo sampling was used, but they note that complexity associated with a large parameter state space limited the number of parameters to 7-8." This statement about Monte Carlo efforts is a bit out of date and is beginning to change, especially with the use of ML model emulators that enable a tremendous speed increase; see the NASA GISS E3 autocalibration work (40 parameters) and MCMC use, as well as the discussion in the Carslaw et al. (2026) opinion piece in EGUsphere. High-resolution modeling might not quite be ready until emulators of such models are available, which I understand, but Monte Carlo efforts are definitely expanding into autotuning, with AI-based model emulators enabling order-of-magnitude speed-ups.
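The emulator argument in this comment can be made concrete with a minimal sketch: a cheap surrogate is fit to a modest number of expensive model runs, and the Monte Carlo search then runs on the surrogate. The "model", parameter count, and sample sizes below are illustrative stand-ins, not ICON-CLM or the GISS E3 setup.

```python
# Hedged sketch of emulator-accelerated Monte Carlo tuning.
# The scored "model" is a synthetic toy, not a real climate model.
import numpy as np

rng = np.random.default_rng(0)
n_params = 40  # same order as the autocalibration work cited above

def expensive_model_score(x):
    # Stand-in for a full model run scored against observations;
    # the optimum of this toy score sits at x = 0.3 in every dimension.
    return float(np.sum((x - 0.3) ** 2))

# A few hundred training runs of the real model would be affordable;
# hundreds of thousands would not.
X_train = rng.uniform(-1, 1, size=(400, n_params))
y_train = np.array([expensive_model_score(x) for x in X_train])

# Simple surrogate: intercept + linear + per-parameter quadratic terms,
# fit by least squares.
def features(X):
    return np.hstack([np.ones((len(X), 1)), X, X ** 2])

coef, *_ = np.linalg.lstsq(features(X_train), y_train, rcond=None)

# Monte Carlo on the emulator is cheap enough for 10^5 candidate
# parameter vectors; keep the one the surrogate scores best.
X_mc = rng.uniform(-1, 1, size=(100_000, n_params))
best = X_mc[np.argmin(features(X_mc) @ coef)]
print(expensive_model_score(best))
```

The design choice the sketch illustrates: the expensive model is only run for the training set, while the dense sampling happens on the emulator, which is what makes 40-parameter Monte Carlo searches tractable.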
Line 362: "…this minimum may be on the boundary of the constrained region for certain parameters." This is likely to happen also because so few parameters are tuned (the full suite of uncertain parameters is very large, many tens to a hundred), as well as because of structural error in the model. How is structural error accounted for? (see the more general point above).
Line 415 and 429 and many other lines: the Geyer (submitted) paper -- perhaps available now? The authors mention use of a novel metric (ScoPi) but there are no details, and the paper cannot be found; information must therefore be provided to substantiate the claimed great advantage of the new ScoPi score.
Signed,
Greg Elsaesser, GISS
Citation: https://doi.org/10.5194/egusphere-2025-4726-RC2