the Creative Commons Attribution 4.0 License.
UKCM2-LL: a new low-resolution GC5 configuration with constrained climate sensitivity – methodology and development
Abstract. The Global Coupled model version 5 (GC5) of the Met Office Unified Model incorporates substantial developments across its model components and shows improved performance for a range of applications. However, it exhibits a very high effective climate sensitivity (EffCS) of 6.7 K and an excessive rise in recent global-mean surface air temperatures (GMSAT), limiting its suitability for some climate applications. This motivated the development of an alternative GC5-based configuration with EffCS constrained to lie within the IPCC Sixth Assessment Report “very likely” range, improved simulation of historical temperatures, and acceptable climatological performance. This new configuration, called UKCM2-LL, will form part of the UK’s submission to CMIP7.
We describe a two-stage methodology used to develop UKCM2-LL. First, a 503-member atmosphere-only perturbed parameter ensemble (PPE) of GC5 variants was used to train statistical emulators that predict climatological performance metrics and atmosphere-only feedbacks. These emulators were then used to generate 41 candidate configurations predicted to have substantially reduced EffCS relative to GC5. Second, coupled preindustrial control and abrupt 4×CO₂ experiments were used to evaluate the candidates against large-scale climatological metrics and to diagnose EffCS, progressively narrowing the set of viable configurations through an expert-led evaluation process. Final fine-tuning of a single candidate was performed manually using the coupled experiments and was informed by parameter sensitivity information derived from the PPE. The resulting UKCM2-LL configuration has an EffCS of 3.6 K and exhibits an improved simulation of historical temperatures relative to GC5. However, achieving this lower EffCS required a degradation in climatological performance, reflecting a structural constraint of the GC5 model.
This work demonstrates the value of PPE-based approaches for systematically exploring such structural constraints, and for parameter tuning during model development. We discuss potential improvements to the methodology and consider the implications of explicitly constraining climate sensitivity for future model development and multi-model ensemble diversity.
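As a rough illustration of how EffCS can be diagnosed from an abrupt 4×CO₂ experiment, the sketch below uses a Gregory-style regression of the top-of-atmosphere net flux anomaly on the surface temperature anomaly. This is a common approach and is assumed here; the abstract does not state the exact procedure used for UKCM2-LL. The function name and the input values are hypothetical, and the numbers are synthetic, noise-free values rather than model output.

```python
def gregory_effcs(delta_t, delta_n):
    """Estimate EffCS (K) from an abrupt-4xCO2 run by linear regression.

    delta_t: annual global-mean surface air temperature anomalies (K),
             abrupt-4xCO2 minus preindustrial control.
    delta_n: annual global-mean TOA net downward flux anomalies (W m-2).
    Returns (effcs, feedback_parameter, forcing_4x).
    """
    n = len(delta_t)
    if n != len(delta_n) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_t = sum(delta_t) / n
    mean_n = sum(delta_n) / n
    sxx = sum((t - mean_t) ** 2 for t in delta_t)
    sxy = sum((t - mean_t) * (f - mean_n) for t, f in zip(delta_t, delta_n))
    slope = sxy / sxx                      # feedback parameter (W m-2 K-1)
    intercept = mean_n - slope * mean_t    # effective 4xCO2 forcing (W m-2)
    # The x-intercept is the equilibrium warming for 4xCO2;
    # halving it gives the value per CO2 doubling (an assumed convention).
    effcs = -intercept / slope / 2.0
    return effcs, slope, intercept


# Synthetic example (not model output): forcing 8.0 W m-2, feedback -1.25 W m-2 K-1.
temps = [1.0, 2.0, 3.0, 4.0, 5.0]
fluxes = [8.0 - 1.25 * t for t in temps]
effcs, lam, forcing = gregory_effcs(temps, fluxes)
```

In practice the regression is sensitive to the number of years used and to curvature in the flux-temperature relationship, so the convention chosen should be reported alongside the EffCS value.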
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-1676', Anonymous Referee #1, 02 Jun 2026
- RC2: 'Comment on egusphere-2026-1676', Anonymous Referee #2, 17 Jun 2026
This manuscript documents the development of UKCM2-LL, a new low-resolution GC5-based coupled climate model configuration intended for the UK’s CMIP7 contribution. The main motivation is that the lower-resolution version of GC5 has an excessively high effective climate sensitivity, EffCS = 6.7 K, and an overly rapid recent historical warming trend, which limits its suitability for long-timescale climate applications, decadal prediction, and Earth-system modelling. The authors develop UKCM2-LL by tuning GC5 parameters to reduce EffCS into the IPCC AR6 “very likely” range while attempting to retain acceptable climatological performance. The final configuration achieves EffCS = 3.6 K, although with some degradation in climatological performance relative to GC5. It uses a large 503-member atmosphere-only perturbed-parameter ensemble, statistical emulators, AMIP and AMIP-future-4K experiments, and a smaller set of coupled piControl and abrupt-4×CO₂ simulations to identify and refine candidate configurations. The study is valuable because it makes explicit the trade-off between reducing climate sensitivity and preserving present-day climatological performance in this model family. However, several aspects of the methodology require clearer justification, especially the relationship between the low-resolution and higher-resolution/operational configurations, the adequacy and transparency of the evaluation metrics, and the very short coupled spin-up used during candidate screening.
Major comments
1. Consistency between high- and low-resolution configurations: A central issue is that the manuscript focuses exclusively on the low-resolution N96ORCA1 configuration, while the broader GC5 model development is also linked to operational/NWP improvements and higher-resolution applications. The paper states that both this study and the companion evaluation focus only on lower-resolution simulations.
If the low-resolution configuration is intended only as an independent CMIP7 low-resolution model, this should be stated more explicitly. However, if the low-resolution model is intended to serve as a computationally efficient surrogate for the higher-resolution or operational version of GC5, then the manuscript needs to demonstrate that the two configurations behave similarly in the relevant aspects.
2. Evaluation metrics: A very large number of evaluation metrics could be used. At first glance, the metrics used in this study look reasonable. However, I would like to see more justification of why these particular metrics are sufficient.
3. 44-yr spin-up: This is rather short. If the configuration is simply an extension of the high-resolution model, that is acceptable. This issue is connected to comment 1 regarding how one should view this low-resolution version. There also appears to be some drift, and I am not sure whether this could affect the interpretation of the results.
Citation: https://doi.org/10.5194/egusphere-2026-1676-RC2
Data sets
Supplementary material for manuscript "UKCM2-LL: a new low-resolution GC5 configuration with constrained climate sensitivity – methodology and development": Data John W. Rostron and David M. H. Sexton https://doi.org/10.5281/zenodo.19267353
Model code and software
Supplementary material for manuscript "UKCM2-LL: a new low-resolution GC5 configuration with constrained climate sensitivity – methodology and development": Scripts John W. Rostron https://doi.org/10.5281/zenodo.19205160
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 379 | 119 | 24 | 522 | 22 | 25 |
Review of “UKCM2-LL: a new low-resolution GC5 configuration with constrained climate sensitivity – methodology and development”
The manuscript documents the process by which a new configuration of the Met Office Unified Model was assembled in order to have an effective climate sensitivity (EffCS) within the IPCC’s likely range of values. The authors utilize a series of steps relying on amip-like simulations to form a PPE, which is run in pairs of present-day and plus-4K warming conditions to create a relationship between errors in variable fields of interest and the feedback parameter. Next, they use an emulator to probe the parameter space for suitable candidate configurations to test in coupled mode. After they narrow these down, an expert-based fine-tuning procedure is done to arrive at the final solution.
I appreciate the authors taking the time to put together a manuscript like this that provides a description of their procedure. As more modeling centers embark on auto-tuning or other tuning methods for purpose-driven applications, it will be interesting to see how the community builds upon and adapts these efforts over time. I particularly enjoyed the discussion in section 5 related to the authors’ impressions of their own methods, as well as the deeper questions facing the community. That said, there are some minor areas where this manuscript could be improved, which I have outlined below.
Lines 182-186: the authors mention using the modal and median parameter values. Looking at Rostron et al., 2025, it seems to rely on the prior parameter distributions, but how are these defined? The “flat top” nature is presumably expert judgement weighing in on the likely range as a subset of the total range, I imagine, but this isn’t clear here. Is there a citation that can be given and/or a quick synopsis on how these prior distributions are defined? For clarity, it would be helpful to report what the median and modal values being tested are.
Related to the variables in Table A1: how was it decided which variables should be assessed annually and which should be separated into DJF and JJA? For example, both SWCRE and LWCRE should follow monsoon precipitation, but only SWCRE and precipitation are assessed seasonally. Perhaps this is a planned analysis for the companion paper by Bodas-Salcedo et al., but was anything learned from this work about potential structural errors in the model’s ability to simulate these fields?
The solar bug fix is described as having “been tested previously” and “found to have minimal impacts on simulation results,” but Fig. B1-B3 show large trends during the adjustment period for that change. Are those trends meant to be interpreted as continuations of the trends from the previous changes?
Why not include UKCM2 on Figure 1? Why not provide the RMSE values for the metrics going into Eamip for GC5 and UKCM2? It would be valuable to see how the criteria selected for the initial PPE perform after all of the fine-tuning. I imagine the follow-up paper will cover many of these features, but it would be valuable to at least get a hint of how this model performs relative to its starting point, especially if claims like “its performance is competitive with CMIP6 models across a wide range of variables” are being made (L680-681). We can piece together which dot is the proto-UKCM2 by comparing Figs 1 and 6, so we know that Eamip is 1.4x that of GC5. Maybe just the global RMSE values for each variable would be sufficient here, leaving the follow-up work to focus on the spatial patterns.
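To make the request for global RMSE values concrete, the sketch below shows one conventional way to compute an area-weighted RMSE on a regular latitude-longitude grid, using cosine-of-latitude weights. It is an assumption that the manuscript's metrics are computed this way; the function name, grid, and field values are hypothetical and purely illustrative.

```python
import math


def area_weighted_rmse(model, obs, lats_deg):
    """Area-weighted RMSE between two fields on a regular lat-lon grid.

    model, obs: nested lists indexed as [lat][lon], in the same units.
    lats_deg:   latitude of each row's grid-cell centre, in degrees.
    Each cell is weighted by cos(latitude), which approximates its area.
    """
    weighted_sq_err = 0.0
    total_weight = 0.0
    for row_model, row_obs, lat in zip(model, obs, lats_deg):
        weight = math.cos(math.radians(lat))
        for m, o in zip(row_model, row_obs):
            weighted_sq_err += weight * (m - o) ** 2
            total_weight += weight
    return math.sqrt(weighted_sq_err / total_weight)


# Illustrative 3x2 grid with a uniform bias of 2 units (not real data).
lats = [-45.0, 0.0, 45.0]
obs_field = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
model_field = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
rmse = area_weighted_rmse(model_field, obs_field, lats)
```

Reporting such a value per variable for both GC5 and UKCM2 would allow a like-for-like comparison without needing the full spatial maps.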
It would be valuable to future research efforts to give a complete list of the 37 “most influential” parameters used in the Monte Carlo sampling, if possible. Knowing which parameters the feedback parameter is sensitive to would be valuable for designing future PPEs.
Line 235: “with at least of 174” -> “with at least 174”
Line 313: please specify your method of computing λamip
Line 527: “less stringent criteria…” can you provide examples?
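Regarding the Line 313 comment, one definition the authors might confirm or contrast with is the simple ratio of the change in global-mean top-of-atmosphere net flux to the change in global-mean surface air temperature between paired amip and amip-future-4K runs. The sketch below assumes that definition and a net-downward sign convention; the function name and the input numbers are hypothetical, and the manuscript may use a different method.

```python
def amip_feedback_parameter(n_amip, n_4k, t_amip, t_4k):
    """Atmosphere-only feedback parameter (W m-2 K-1) from a paired experiment.

    n_amip, n_4k: time-mean global-mean TOA net downward flux (W m-2)
                  in the amip and amip-future-4K runs.
    t_amip, t_4k: time-mean global-mean surface air temperature (K)
                  in the same two runs.
    With flux defined as net downward, a negative value is stabilising.
    """
    delta_t = t_4k - t_amip
    if delta_t == 0:
        raise ValueError("temperature change must be non-zero")
    return (n_4k - n_amip) / delta_t


# Illustrative values only (not taken from the manuscript).
lambda_amip = amip_feedback_parameter(n_amip=0.5, n_4k=-4.0,
                                      t_amip=287.0, t_4k=291.5)
```

Stating whether the denominator is the surface air temperature change or the imposed SST change, and which averaging period is used, would remove the ambiguity.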