the Creative Commons Attribution 4.0 License.
Using a Gaussian Process Emulator to approximate the climate response patterns to greenhouse gas and aerosol forcings
Abstract. We present a Gaussian process emulator for estimating fast surface temperature response patterns to a range of different climate forcing agents, including both long-lived greenhouse gases and short-lived pollutants such as aerosols. This emulator is trained on simulations driven by perturbations to emissions (for short-lived pollutants) and concentrations (for long-lived greenhouse gases) using a full-complexity global climate model and predicts the response in the first five years after the forcing, at a small fraction of the computational cost. We outline the emulator design, including the choice of pollutant perturbations and the input space covered by the training data. We show that the emulator performs well in most regions of the chosen input space, except under very large aerosol perturbations. A global sensitivity analysis is carried out to characterize and understand emission-response relationships for each pollutant. We find similar large-scale patterns of sensitivity to aerosol pollutants released in different regions. Finally, we demonstrate how this type of emulator could be used in policy-relevant studies to predict fast adjustments of regional climate to changes in anthropogenic emissions for a given scenario. This establishes a basis for rapid climate change projection, without the need for computationally expensive climate model simulations, and increases the number of climate change scenarios that can be explored simultaneously.
Status: open (until 02 Mar 2026)
- RC1: 'Comment on egusphere-2025-6046', Anonymous Referee #1, 25 Feb 2026
- RC2: 'Comment on egusphere-2025-6046', Anonymous Referee #2, 25 Feb 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-6046/egusphere-2025-6046-RC2-supplement.pdf
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 190 | 130 | 26 | 346 | 43 | 35 | 32 |
There is a real need for emulators that can capture the spatial patterns of the response to GHGs and short-lived climate forcers, and there is work in this manuscript that moves us towards that goal. However, I am afraid I need to recommend rejection of the manuscript in its current form. The simulations used to train the emulator are not fit for the often-implied purpose, and there is confusing messaging throughout the manuscript about the suitability of the emulator for approximating the climate responses seen in the SSPs. This is an emulator of the fast temperature response to sudden changes in forcing, which we do not expect to capture the slow climate responses seen in the SSPs, yet there are parallels drawn with the SSPs and implications made about direct policy-relevance throughout the manuscript. There are some acknowledgements that this emulator is in fact only a step towards this, and a rewrite of the manuscript that presents it as the proof of concept that it is would be more appropriate.
For policy implications, the focus on fast responses is a key limitation of the study, and it needs to be clearer in the title and conclusions that this is an emulator of fast temperature responses only. This is very different to what many readers will understand from the ‘climate response’ currently in the title (for many, this will be decadal- to centennial-scale responses), and may well be markedly different from the long-term response in many regions. Since the regional aerosol perturbations used in the short simulations produced to train the emulator are based on regional perturbations also used in long equilibrium experiments with similar scalings, it would be helpful to see a comparison of the fast and slow responses to regional aerosol perturbations to help the reader to better understand the limitations of trying to apply an emulator of the fast response to real-world or CMIP-style coupled-transient scenarios. The patterns and magnitude are different, as can be seen in existing literature, so the claims throughout the manuscript of ‘policy-relevance’ and the emulator as a ‘basis for rapid climate change projection’ must be toned down, and must appear alongside appropriate caveats.
The emulator derives most of its variance in the temperature response from variance in the CO2 forcing, which is to be expected from the design of the training experiments, and is not representative of the real-world responses. In the real world, we have seen strong temperature responses to aerosol, which are not captured by this emulator, nor do we expect them to be. Most of the aerosol perturbations considered in this work are fast responses to SO2 changes. The fast temperature response to SO2 is a fraction of the magnitude of the slow response that we see in long coupled simulations and the real world, and it has a markedly different spatial pattern (see, e.g., Figure 2 in https://journals.ametsoc.org/view/journals/clim/31/11/jcli-d-17-0439.1.xml ). The design of the training experiments means that this emulator will always underestimate the importance of aerosol relative to CO2 for any real-world applications, and that it will also fail to capture the pattern of the real-world aerosol response, as this differs markedly from the pattern of the fast temperature response. As the training dataset uses a series of 5-yr coupled simulations, the emulator is also not showing us a clean version of the fast response to the forcings considered, as it is conflating them with responses across multiple timescales and internal variability (although an attempt was made to address the latter by having an ensemble of short simulations for each case).
Putting aside the caveats of focusing on the fast response to SO2 perturbations, and of using 5-yr simulations to do this, the choice of SO2 perturbation regions and the magnitude of the SO2 perturbations themselves, which are scaled to be large enough to generate a clear signal useful for emulator training, both seem sensible. Exploration of BB OC/BC is more limited, using only a tropics-wide training perturbation, and is likely to be missing key regional patterns in the response as a result. The choice of a Gaussian Process approach seems sensible, and the emulator does appear to represent the GCM responses well. However, the design of the training dataset means that it is not an appropriate tool for emulating temperature responses in the SSPs.
Specific comments follow.
Title: Study is only trying to emulate the fast temperature response. This must be specified in the title in place of ‘climate’.
Introduction: It would be helpful to have a sentence in the introduction explaining the policy-relevance of the fast temperature response if that framing is retained. How often does the real-world/slow response look like the fast response? For SO2 in particular, there are large differences that should be addressed. It would be better to bring out the narrative that this emulator is an interim step, which provides proof of concept, than to try to force a narrative of policy-relevance.
L32: Since the perturbation regions have been used in previous studies (listed on L147), can you include a figure that compares the spatial pattern of the fast and slow responses to your regional aerosol perturbations, and also a figure comparing your fast responses to those calculated in a traditional fixed SST framework?
L59: For Gaussian process representation of regional temperature responses, see also: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2025JH000741
L74: As you say on L70, emulating the fast response lays the foundations for emulating the long-term climate response, but emulating the fast response itself has rather limited policy application, whether calculated from fixed-SST simulations or from the first few years following a step perturbation in a fully coupled simulation.
L161: You’ve taken care to perturb SO2 in multiple regions, but then only apply one tropics-wide perturbation for BB OC/BC. Because of the nature of the response to tropical absorbing aerosol, the response to tropical BB is likely to be strongly dependent on the longitude of the perturbation (e.g. https://www.nature.com/articles/s41558-022-01415-4 ). It would be interesting and worthwhile to see if there are distinct responses to an Asia-Pacific and an America-Atlantic-Africa perturbation. There is an argument made here that unrealistically large BB OC/BC forcings will be required for small-scale forcing, but, due to the physical mechanisms involved in the response, you might actually find larger responses when splitting the tropical band in two in this way. Westervelt et al. (2020) perturb South American and African biomass burning emissions in 3 GCMs, by similar amounts to those in this study, and find significant regional and global temperature responses in all models (https://acp.copernicus.org/articles/20/3009/2020/ ). Besides, in the paragraph beginning on L184, you describe using unrealistically large SO2 perturbations to ensure that you see a signal suitable for training the emulator. Why is it acceptable to do this for SO2, but not for BB OC/BC?
L268: ‘Most test simulations are reasonably well predicted by the emulator and show patterns of warming or cooling generally in the correct spatial location, similar to Figure 1’. Looking at this figure, I see that the emulator does not correctly capture the sign of the response over some very populous regions (Europe, the Arabian Peninsula) and has absolute errors in excess of 1 standard deviation over China, North America, and the Amazon basin. By eye, there is clearly a reasonable pattern correlation between the GCM and emulator responses, but some of the regional differences are large. How does one decide when the emulator is good enough? The example in Figure 1 has an incorrect sign and a large absolute error for Europe, and appears to be suggesting a warming for a 5x European SO2 experiment, which seems unphysical to me, despite the moderately high CO2 in the experiment.
L277: If your points are more likely to fall outside the accuracy threshold in the Northern Hemisphere, does that mean the emulator is particularly struggling with the response to SO2?
Figure 2: It would be helpful to see these regions on a map. Are they the same as the emission regions? Since one of the more useful features of this emulator is its ability to capture the spatial pattern of the response, it would be interesting and useful to see a similar evaluation of this that assesses the emulated pattern. Even just a pattern correlation between the true and predicted responses would be helpful. I’m certainly more curious about how well the pattern of the response compares to the training data. This figure implies that it is well captured since there is a strong linear relationship in all the regions shown, but it would be nice to have an objective metric. The relationships shown in Figure 2 are stronger than I expected based on Figure 1 and the Supplementary figures.
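One possible form of the objective metric requested here is an area-weighted (centred) pattern correlation between the GCM and emulated maps. The sketch below is only an illustration under stated assumptions: it assumes both fields sit on the same regular latitude-longitude grid, uses cos(latitude) as an approximate area weight, and runs on synthetic data. The function name `pattern_correlation` and the example fields are hypothetical and are not taken from the manuscript.

```python
import numpy as np


def pattern_correlation(gcm, emulated, lat_deg):
    """Area-weighted centred Pearson correlation between two (lat, lon) maps."""
    gcm = np.asarray(gcm, dtype=float)
    emulated = np.asarray(emulated, dtype=float)
    # cos(latitude) approximates relative grid-cell area on a regular lat-lon grid
    w = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))[:, None]
    w = np.broadcast_to(w, gcm.shape)
    w = w / w.sum()
    # remove the area-weighted mean so that only the spatial pattern is compared
    a = gcm - (w * gcm).sum()
    b = emulated - (w * emulated).sum()
    cov = (w * a * b).sum()
    return cov / np.sqrt((w * a**2).sum() * (w * b**2).sum())


# Synthetic stand-ins for a GCM response map and an emulator prediction (assumed data)
lat = np.linspace(-87.5, 87.5, 36)
lon = np.linspace(0.0, 355.0, 72)
rng = np.random.default_rng(0)
truth = np.cos(np.deg2rad(lat))[:, None] * np.sin(np.deg2rad(lon))[None, :]
pred = truth + 0.3 * rng.standard_normal(truth.shape)

print("identical maps:", pattern_correlation(truth, truth, lat))
print("noisy prediction:", pattern_correlation(truth, pred, lat))
```

Reporting such a value for each test simulation, alongside the regional-mean scatter in Figure 2, would separate errors in the magnitude of the response from errors in its spatial pattern.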
L370: Typo on this line – unnecessary ‘)’
L389: Doesn’t Westervelt et al. 2020 include South America and Africa cases? Since you point the reader to studies with an extratropical focus, it would be helpful also to highlight those that do include regional tropical perturbations, e.g. https://acp.copernicus.org/articles/23/3575/2023/ , https://acp.copernicus.org/articles/20/3009/2020/
Figures 4 and 5: I like what you are trying to do with these figures, and with Figure 5 especially, but I am not quite sure how to interpret them. Your maximum CO2 case is a doubling of present-day CO2, so essentially you have a 5-yr simulation where you have jumped from the present day to 2100 in SSP5-8.5, where we have a more rapid increase in GHG emissions than under current policy. Minimum CO2 is a pre-industrial value, so we are essentially thinking about the variance in the temperature between a pre-industrial climate and something 4-6 °C warmer (I can’t remember where HadGEM sits in this range, but I assume it is towards the upper end). This is then compared to regional aerosol perturbations of various scalings. Both figures show us, essentially, CO2 contributing to most of the variance in most regions. Aerosol contributes to local variance in the emission region, but is essentially swamped by the CO2-driven variance. In the SSPs, most of the aerosol emission uncertainty is seen before 2050, with the pathways largely converging after that, so we would, in the real world, expect most of the aerosol-driven contributions to variance to be seen before 2050, with the CO2-driven variance in the response across different pathways becoming larger with time. This is a somewhat long-winded way of saying that I expect CO2 to contribute most of the variance in most of the regions in this experiment design because the range of CO2 concentrations considered is very large, but I think this is actively unhelpful if you are presenting the emulator as a policy-relevant tool, because we do not expect to see CO2 emissions and aerosol emissions changing across the ranges tested on the same kind of timescales in the real world. In reality, aerosols are likely more important for variance in the response in the near future, while CO2 will become steadily more important out to 2100.
I’m also surprised that, even in the regions with very large aerosol emission ranges applied, aerosols are only contributing around 5% of the variance. Aerosol has offset around a third of GHG-driven warming over the industrial era, with some large temperature responses over the main emission regions that are also on the order of a third of the GHG-driven response. The CO2 range in these simulations is to then double again on top of this historical change, so if aerosol offset around a third of the warming from 282-400ppm, we might expect it to offset around one fifteenth of the warming from 282-834ppm, or around 5%. But the aerosol ranges being considered are between 0 and 2 to 7x the present-day values, so why don’t we expect something of the order 10-30% globally, and around 30% regionally? Why is the aerosol contribution so small in these figures? The ranges considered all go from 0 to some regional scaling, so it is not a case of the aerosol effect buffering as we go to higher scalings, as the aerosol reduction, which should produce a larger temperature response for a given emission change, is also considered. Is this simply a reflection of the small magnitude of the fast response to SO2 compared to the magnitude of the slow response? Or a reflection of the emulator not capturing the magnitude of the regional responses to the aerosol? Or are the colourbars in Figure 5 saturating, so that those dark blue patches are actually showing us something more like 10-20%?
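The dependence of the variance shares on the sampled input ranges can be illustrated with a toy additive model. The sketch below is not the emulator or the sensitivity analysis used in the manuscript: the response coefficients, the 0 to 5x SO2 scaling range, and the narrow 400-450 ppm CO2 range are all hypothetical values chosen purely for illustration, and the function name `variance_shares` is an assumption. Only the 282 ppm and 834 ppm end points come from the discussion above.

```python
import numpy as np

# Hypothetical illustrative coefficients (NOT taken from the manuscript or any GCM)
CO2_REF_PPM = 282.0                 # pre-industrial concentration quoted above
WARMING_PER_DOUBLING_K = 1.0        # assumed fast warming per CO2 doubling
RESPONSE_PER_SO2_SCALING_K = -0.1   # assumed fast cooling per unit SO2 emission scaling


def variance_shares(co2_range_ppm, so2_scaling_range, n=200_000, seed=0):
    """Return (CO2 share, SO2 share) of output variance for a toy additive model.

    Inputs are sampled independently and uniformly over the given ranges, so
    each share is the variance of one term divided by the total variance.
    """
    rng = np.random.default_rng(seed)
    co2 = rng.uniform(co2_range_ppm[0], co2_range_ppm[1], n)
    so2 = rng.uniform(so2_scaling_range[0], so2_scaling_range[1], n)
    t_co2 = WARMING_PER_DOUBLING_K * np.log2(co2 / CO2_REF_PPM)
    t_so2 = RESPONSE_PER_SO2_SCALING_K * so2
    total = np.var(t_co2 + t_so2)
    return np.var(t_co2) / total, np.var(t_so2) / total


# Wide CO2 range (pre-industrial to doubled present day) versus a narrow, near-term-like range
wide = variance_shares((282.0, 834.0), (0.0, 5.0))
narrow = variance_shares((400.0, 450.0), (0.0, 5.0))
print("wide CO2 range   (CO2 share, SO2 share):", wide)
print("narrow CO2 range (CO2 share, SO2 share):", narrow)
```

With the same aerosol range, narrowing the CO2 range raises the aerosol share of the variance in this toy setting, which is the sense in which the variance maps reflect the experiment design as much as the physics.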
Figure 5: Regardless of your thoughts on my above comments, I think you need to use nonlinear colour scales in this figure so that you’re not getting so much saturation in those blue colours.
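As one possible way to implement a nonlinear colour scale, a power-law normalisation stretches the low end of a variance-fraction colourbar. This is a minimal sketch assuming the figure is drawn with matplotlib, which the manuscript does not state; the function name `plot_variance_share`, the `gamma=0.5` exponent, and the colour map are illustrative choices, not the authors' settings.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import PowerNorm

# gamma < 1 expands small values, so a 5% and a 20% share map to visibly different colours
norm = PowerNorm(gamma=0.5, vmin=0.0, vmax=1.0)


def plot_variance_share(share, path):
    """Save a map of a 2-D variance-fraction array (values in [0, 1]) to `path`."""
    fig, ax = plt.subplots()
    mesh = ax.pcolormesh(share, cmap="viridis", norm=norm)
    fig.colorbar(mesh, ax=ax, label="fraction of variance")
    fig.savefig(path)
    plt.close(fig)


# Position on the colourbar (0 to 1) of two small variance shares
print("5% share  ->", float(norm(0.05)))
print("20% share ->", float(norm(0.20)))
```

On a linear scale these two shares sit 0.15 apart on the colourbar; the power-law normalisation spreads them further, which would help distinguish a 5% aerosol contribution from a 10-20% one.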
Figure 6: I appreciate the note at L465 that the ‘emulator does not predict the transient response to the SSP scenarios between present day and 2050, but rather it estimates the short-term climate response to an abrupt jump in emissions’. However, this comes soon after the text at L487: ‘we now explore how the emulator could be used to estimate the response to mid-century (2050) conditions from the Shared Socioeconomic Pathways (SSPs), which project socio-economic changes and their associated emissions into the future’, which the emulator cannot do. As Figure 6 is presented, it looks like you are using the emulator to predict the transient response in the SSPs. The titles of the panels are ‘predicted response for SSP1 and SSP3’. This is not what they show, nor can it be what they show. The figure caption needs to clearly state that this is a fast temperature response to an instantaneous time slice of emissions representative of 2050 emissions, and the panel titles also need to be clear that this is a fast response. If the emulator can be used to estimate the responses seen in SSP1-1.9 and SSP3-7.0 then we really need to see the current maps in Figure 6 alongside actual SSP1-1.9 and SSP3-7.0 projections for a period centred on 2050 from HadGEM3-GC3.1. For a policy-relevant emulator of the spatial temperature response to various forcings, surely this is the most exciting test, and it’s not shown. However, I am not convinced that the emulator can be used for this, in which case this figure is, at best, misleading without clear caveats.
L472 and L514: are there any policy-related studies that would consider the fast response in isolation?
L515: ‘These projections demonstrate how the emulator can be used to predict the spatial response of surface temperature to a specific scenario featuring emission changes from different pollutants at a fraction of the cost of the complete GCM.’ I strongly disagree with this statement. The response of surface temperature in specific scenarios is a slow response to changes in forcing, and I do not expect it to be well captured by an emulator of 5-year responses to a sudden change in forcing. An emulator that could be used to explore a wider range of scenarios than we can afford with GCMs would be a valuable tool, but this emulator is a step towards that, not the finished article.