An EOF-Based Emulator of Means and Covariances of Monthly Climate Fields

Geogdzhayev, Gosha; Souza, Andre N.; Flierl, Glenn R.; Ferrari, Raffaele

doi:10.5194/egusphere-2025-3768

Preprints

18 Aug 2025

Abstract. Fast emulators of comprehensive climate models are often used to explore the impact of anthropogenic emissions on future climate. A new approach to emulators is introduced that predicts means and covariances of monthly averaged climate variables. The emulator is trained with output from a state-of-the-art climate model and serves as a good first-order representation for the evolution of spatially resolved climate variables and their variability. For illustrative purposes, the emulator is applied to predict changes in the mean and variability of monthly values of both temperature and relative humidity as a function of global mean temperature changes. However, the approach can be applied to any other variable of interest.

Received: 03 Aug 2025 – Discussion started: 18 Aug 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on egusphere-2025-3768', Anonymous Referee #1, 19 Aug 2025
This manuscript proposes a change of basis method to project spatially explicit means and covariances of monthly climate variables as a function of global average temperature change. There’s nothing wrong I can see in the method and it is an approach to emulation I have not seen before. But it almost reads as not having been developed in close collaboration with a user group and I think suffers some serious shortcomings because of that. Ultimately, I am left asking ‘who is this actually for, specifically? Who is going to pick this up, generate values, and use them? How specifically will they be used?’ I'm confident the authors have something in mind for this, but I think the manuscript would benefit from a more explicit treatment of this question.
Major issues
What is the use case for spatially resolved means and covariances of monthly variables? I’m not trying to be glib, I’m coming at this from the perspective of an impact modeler where I need time series of daily or monthly values of variables. I can’t use the statistics you highlight reconstructing in Section 4.2 and Section 5 + appendices. Am I missing something? Do you generate the time series and just demonstrate on statistics (Figs. 6-8, E1) because those are more critical for validation? If so, I think having at least an example plot in Appendix E showing an actual time series generated with this method is key. You may also want to consider extending Appendix E and moving it explicitly into the main body of the manuscript.
If I have to plug into another emulator like DiffESM to get daily values, why do I need this? A skim of the DiffESM paper shows they only need monthly averages and not covariances and other, simpler methods can give monthly averages. The STITCHES approach can generate a decent sized ensemble of time series of multiple variables jointly just from global temperature.

The method seems to apply well to any individual gridded variables (demonstrated in the manuscript with surface temperature and surface relative humidity) but, unless I read it wrong, this doesn’t extend to coherent joint emulation of multiple variables, right? There are certainly some impact models that only need temperature or relative humidity or precipitation, but many need all of those variables coherently together. I’m thinking of hydrology models especially. Can this method handle that? If not, what could be downsides of using independently generated time series of temperature, relative humidity, and precipitation together?

The authors are clear an ESM needs to have provided a sufficiently large collection of runs to train from, but how large is large?
You touch on the implications of this in your final paragraph but I think this needs to be expanded. Like most emulation techniques, this approach targets a single ESM. But many studies using outputs from emulation, say to study projections of a novel scenario, are also concerned with multi-model uncertainty, i.e. they would want multiple emulators each trained on a different ESM. Does the collection of ESMs that provided ‘enough’ training data have uncertainty characteristics at all similar to those of the full collection of models? I don’t have any way of even roughly guessing because I don’t know what the extent of ESMs providing enough data to be individually emulated is.

Missing relevant citations for exclusively emulators of the class trained to extend ESMs to arbitrary future scenarios:
Bassetti et al was actually published nearly a year ago https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2023MS004194

Tebaldi, C., Snyder, A., and Dorheim, K.: STITCHES: creating new scenarios of climate model output by stitching together pieces of existing simulations, Earth Syst. Dynam., 13, 1557–1609, https://doi.org/10.5194/esd-13-1557-2022, 2022.

Quilcaille et al https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2022GL099012

and Quilcaille et al https://esd.copernicus.org/articles/14/1333/2023/
Citation: https://doi.org/10.5194/egusphere-2025-3768-RC1
- AC1: 'Reply on RC1', Gosha Geogdzhayev, 14 Oct 2025
  
  We thank the referee for their insightful questions and useful suggestions. We agree that it is important to carefully consider use cases when designing emulation methods. We hope that the responses below help to clarify our vision for the use of this method.
  Before getting to detailed responses to each specific question, we want to make some broader comments. The main contribution of this manuscript is to introduce a new emulator architecture that addresses important limitations of many existing emulators. (a) The emulator predicts means and variances of climate variables as a function of global mean surface temperature. This is essential to quantify if trends in means are significant and emerge out of the natural variability. The time of emergence of a climate signal, which our emulator can easily be used to quantify, is typically defined as the time when the change in the mean of a climate variable exceeds the natural variability defined as one or two standard deviations of that variable. (b) The emulator is capable of handling multiple variables by either emulating them separately or computing their cross-variable correlations. Both approaches may be useful in different use cases, and we will edit the manuscript to emphasize the possibility of both. Indeed, there are many downstream applications in which the co-occurrence of two variables, for example high temperature and relative humidity, is of particular interest. (c) The emulator architecture is computationally very cheap and can be easily implemented in interactive APIs like the preliminary demo developed by our group at MIT (http://eddies.mit.edu:8080/bc3/enr-gm-demo.html ; note that this is an alternative version that emulates precipitation instead of RH).
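The time-of-emergence definition in point (a) can be illustrated with a minimal sketch. This is not the authors' code: the function name, the threshold multiplier `k`, and all numbers are hypothetical, and "emergence" is taken here to be the first year in which the change in the mean exceeds `k` standard deviations.

```python
def time_of_emergence(years, mean_change, std, k=2.0):
    """Return the first year in which |mean change| > k * std, or None.

    years, mean_change and std are equal-length sequences; k is the number
    of standard deviations used as the natural-variability threshold
    (the reply mentions one or two).
    """
    for year, dm, s in zip(years, mean_change, std):
        if abs(dm) > k * s:
            return year
    return None


# Hypothetical illustrative values, not emulator output.
years = list(range(2020, 2101, 10))
mean_change = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6]
std = [0.5] * len(years)

toe_two_sigma = time_of_emergence(years, mean_change, std, k=2.0)
toe_one_sigma = time_of_emergence(years, mean_change, std, k=1.0)
```

Stricter conventions (for example, requiring the signal to stay above the threshold afterwards) are possible; the sketch only shows why both a mean and a variance estimate are needed.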
  We chose to apply the emulator to monthly mean variables as a useful introductory example to the model architecture. Applying our emulator to daily variables would be straightforward except for requiring more input data. We do illustrate an application to maximum daily temperatures in a companion paper that we are about to submit. This companion manuscript is more focused on applications rather than the methodology. We found that including a comprehensive description of methodology and applications in a single manuscript would make it way too long.
  Here are more detailed responses to the specific questions.
  1. A monthly emulator that accounts for covariances provides crucial information when considering climate planning, adaptation, and education. Climate information at monthly resolution is generally of interest to planners, as it can be used to assess the large-scale effects of climate change on, for example, agriculture and water management. Notably, drought prediction is often conducted using monthly-resolution data [1]. Monthly averages are also key to capturing changes in the seasonal cycle, which may have significant agricultural impacts.
  Emulating variance is important for providing a first-order estimate of uncertainty to monthly projections. Variance cannot be emulated pointwise without accounting for the full covariance structure of the EOF coefficients. Of course, one could try to perform a "pattern scaling" exercise directly on the pointwise variance, but then this would miss the variance associated with spatial averages. In our revised manuscript, we will make sure to expand our discussion of the applications of emulated monthly data with the associated covariances.
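To make the argument about spatial averages concrete, here is a small NumPy sketch under stated assumptions: a field is written as x = E a with orthonormal EOFs in the columns of `E`, and `C_a` is the covariance of the EOF coefficients. The two-cell region, the matrices, and the function names are hypothetical and are not taken from the manuscript.

```python
import numpy as np

# Hypothetical 2-cell "region" with two orthonormal EOFs (columns of E).
E = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
# Hypothetical covariance of the EOF coefficients, with an off-diagonal term.
C_a = np.array([[1.0, 0.3], [0.3, 0.5]])


def pointwise_variance(E, C_a):
    """Variance at each grid cell, i.e. the diagonal of E C_a E^T."""
    return np.einsum("im,mn,in->i", E, C_a, E)


def regional_mean_variance(E, C_a, weights):
    """Variance of the weighted spatial average w^T x, where x = E a."""
    p = E.T @ weights  # project the averaging weights onto the EOFs
    return float(p @ C_a @ p)


weights = np.array([0.5, 0.5])  # equal-weight average over the two cells

var_cells = pointwise_variance(E, C_a)
# Dropping the off-diagonal entries of C_a changes the pointwise variance.
var_cells_diag_only = pointwise_variance(E, np.diag(np.diag(C_a)))
var_region = regional_mean_variance(E, C_a, weights)
# Treating the cells as independent ignores their spatial covariance.
var_region_naive = float(weights**2 @ var_cells)
```

In this toy case the pointwise variances depend on the off-diagonal entry of `C_a`, and the variance of the regional mean differs from the value obtained by combining pointwise variances as if the cells were independent.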
  One specific example of the utility of emulating monthly variance is illustrated in Appendix E, which showcases how our emulator can be used to study differences in the impact of an arbitrary climate change scenario over a specific region--NW India. This would not be possible without accurate quantification of the variance over that region, which requires a model for the full covariance structure of the EOF coefficients. In addition to what is discussed in the manuscript, our emulator can be used to further explore, e.g., questions of how soon a particular effect of climate change emerges from natural variability (time of emergence). In this particular case study, for example, the effect of climate change only becomes significant in NW India around 2070. This would be impossible to quantify without accounting for the variance. In our revision of the manuscript, we will move this case study section into the main text and expand it with a more detailed discussion of climate change detection.
  It is important to emphasize that we do not generate timeseries, but rather aim to emulate statistics directly for the ease of use in a variety of cases. We believe that in the majority of use cases for monthly data, it is the statistics that are of interest, not the exact trajectories of emulated "ensemble members" (since none will match the actual future exactly). Importantly, our framework allows for the seamless generation of statistics across various spatial scales using the same mechanism.
  For an impact modeler with an interest in daily values, coupling with a downstream emulator such as DiffESM is indeed necessary. This is, in fact, a missing piece of the DiffESM framework: the monthly data it is conditioned on must come from another emulator (such as ours). An advantage of the DiffESM emulator is that it can be conditioned on the monthly average of a single ESM ensemble member, as opposed to the ensemble-mean monthly value. This allows DiffESM to respond differently, e.g., to abnormally warm or cold years. The emulator that DiffESM is conditioned on must therefore supply an ensemble of monthly values, for which the emulation of variance is necessary. Finally, while the DiffESM approach does not use monthly covariances directly, one could imagine a probabilistic framework similar to DiffESM that takes in means and variances of monthly averages and downscales them (using, e.g., a diffusion model) to daily timescales. Such an approach would benefit from a continuous probabilistic representation of the climate; it is possible that daily values conditioned on both means and variances would be more accurate than those conditioned solely on means. In other words, including the variance of monthly values adds information for downstream applications.
  Lastly, we would like to emphasize that the simplicity of our method (as compared to emulators based on neural networks) makes it easy to run on modest hardware, without requiring GPUs. Our method should thus be thought of as a natural extension of the ethos of pattern scaling, which can cheaply emulate ESM behavior. Our approach marries this spirit of simplicity with the ability to directly model variance and coherent spatial structure. This same simplicity makes it an excellent tool for climate education. Monthly data are easier to understand intuitively than annual averages (which are too coarse) or daily data (which cannot be easily summarized without averaging). Regional monthly-resolution projections of climate change effects can thus be an effective tool in communicating the impacts of climate change, and this communication is further augmented by the addition of uncertainty (variance) to the projections. We have been using our emulator, coupled with a visualization tool developed by Prof. Glenn Flierl (linked above), to teach about climate change at the Cambridge Science Festivals.
  2. While in this manuscript we indeed choose to emulate the two variables (temperature and relative humidity) separately, the method can be applied to the joint emulation of two or more variables without much modification. Capturing correlations of one variable at different spatial locations (which is done in the manuscript) is no different than capturing correlations among variables. One would need only to redefine the EOF basis as one that encompasses modes of both variables. In defining this joint basis, it is important to be mindful of the resulting ranking of the EOFs, which may be sub-optimal (in terms of variance captured) for some of the variables. However, this would not typically have a big effect, since the EOF modes serve primarily as the projection basis for dimensionality reduction. For the purposes of demonstrating the emulator approach, we judged that illustrating its application to separate statistics was sufficient. We will revise the manuscript to clarify the possibility of joint emulation and describe explicitly how it would be done.
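One way such a joint basis could be built is sketched below. This is an assumption-laden illustration, not the authors' implementation: each variable is divided by its overall standard deviation so that units do not dominate the ranking (a choice made here, not stated in the reply), the scaled fields are concatenated along the spatial axis, and the EOFs are the right singular vectors of the stacked matrix. The data are synthetic and the function name is hypothetical.

```python
import numpy as np


def joint_eof_basis(fields, n_modes):
    """Compute a joint EOF basis for several variables.

    fields: dict mapping variable name -> array of shape (n_samples, n_grid).
    Returns (basis, singular_values, scales, stacked), where basis has shape
    (total grid size, n_modes) and stacked is the scaled, concatenated data.
    """
    # Assumed normalization: one scalar scale per variable.
    scales = {name: float(np.std(data)) for name, data in fields.items()}
    stacked = np.concatenate(
        [data / scales[name] for name, data in fields.items()], axis=1
    )
    _, sing, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:n_modes].T, sing[:n_modes], scales, stacked


# Synthetic stand-ins for monthly temperature (K) and relative humidity (%).
rng = np.random.default_rng(0)
fields = {
    "temperature": 288.0 + 10.0 * rng.standard_normal((120, 30)),
    "relative_humidity": 60.0 + 15.0 * rng.standard_normal((120, 30)),
}

basis, sing, scales, stacked = joint_eof_basis(fields, n_modes=4)
coeffs = stacked @ basis                  # joint EOF coefficients per sample
coeff_cov = np.cov(coeffs, rowvar=False)  # carries cross-variable covariance
```

Because every mode spans both variables, the covariance of the joint coefficients encodes cross-variable as well as spatial relationships. As the reply notes, the mode ranking may be sub-optimal for an individual variable.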
  3. The question of how many ensemble members is "enough" is somewhat ill-posed. The answer depends on the exact variables being considered, the application being targeted, and the particular ESM being emulated. Essentially, one must be satisfied that the ESM training data contains sufficient ensemble members to accurately represent the full amplitude of the model's internal variability. This issue is more quantitatively addressed in a manuscript published by some of the co-authors on this manuscript: Lutjens et al. (The Impact of Internal Variability on Benchmarking Deep Learning Climate Emulators, JAMES, 2025). The conclusion of that work is that between 5 and 10 ensemble members is a reasonable minimum to estimate internal variability, at least for temperature. In the CMIP6 archive used here, 4 models pass this bar, each with fairly different representations of future climate (CanESM, for example, is often regarded as a "hot" model). We will include references and discussions of this point in the revised manuscript.
  4. We thank the referee for the suggested citations and will incorporate them into the revised manuscript.
  Works cited: [1] Mishra, A. K. and V. P. Singh. "A review of drought concepts." J. Hydrology, 2010.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3768-AC1
RC2: 'Comment on egusphere-2025-3768', Anonymous Referee #2, 01 Oct 2025

The authors use large-ensemble historical and scenario simulations of a single climate model to train a regression model that predicts means and covariances of 2-m temperature and relative humidity for each month as a quadratic function of global-mean (and ensemble-mean) temperature. The model trained on one scenario successfully approximates the means and covariances of the fields of interest over the other two scenarios.
This is a reasonable strategy and implementation. I do have a relatively minor comment though that I would like to clarify. The authors apply the EOF decomposition to the full historical data, rather than anomalies, as is usually done. This is fine for the purposes of the data compression (the leading EOF would essentially give you the mean field and the trailing EOFs would be close to the EOFs of anomalies). Such choices are convenient sometimes (for example, when you use EOFs as a basis to project dynamical equations on, where you need the mean state orthogonal to the basis of anomalies), but I am not sure this choice is super-convenient or most economical for the present purposes. I would compute instead the standard EOFs of the historical period (after subtracting the mean) [better yet - ensemble EOFs - why use the single realization?], use pattern scaling for projecting the mean into the future (quadratic regression on global mean at each grid point), and then linear regression of EOF amplitudes (or quadratic regression of EOF variances) on global mean temperature to project the covariance matrices. Since EOFs diagonalize covariance matrices, you would still get the positive-definite covariances as long as your projected EOF amplitudes remain positive. I think it is highly likely that this much simpler training procedure (which assumes a still diagonal future covariance matrix in the basis of historical EOFs) will give essentially the same results as a more complex method used by the authors - but, of course, I'd be willing and happy to hear the authors' opinion and discuss further!
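The referee's simpler alternative can be sketched as follows. This is a hypothetical illustration of the suggestion, not the method of the manuscript: per-mode variances are fitted as a quadratic function of global mean temperature `T`, and the covariance at a new `T` is rebuilt as E diag(variances) E^T, which assumes the covariance stays diagonal in the historical EOF basis. All names and data are synthetic.

```python
import numpy as np


def fit_quadratic(T, y):
    """Least-squares fit of y ~ c0 + c1*T + c2*T**2; y is (n,) or (n, k)."""
    design = np.stack([np.ones_like(T), T, T**2], axis=1)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_quadratic(coef, T_new):
    """Evaluate the fitted quadratic at a scalar temperature T_new."""
    return np.array([1.0, T_new, T_new**2]) @ coef


def diagonal_covariance(E, variances):
    """Rebuild the grid-space covariance E diag(variances) E^T."""
    if np.any(variances <= 0):
        raise ValueError("projected EOF variances must stay positive")
    return (E * variances) @ E.T


rng = np.random.default_rng(1)
n_grid, n_modes = 20, 3
E, _ = np.linalg.qr(rng.standard_normal((n_grid, n_modes)))  # orthonormal EOFs

# Synthetic training data: variances that are exactly quadratic in T.
T_train = np.linspace(0.0, 4.0, 20)
base = np.array([3.0, 1.5, 0.5])
var_train = base * (1.0 + 0.1 * T_train + 0.01 * T_train**2)[:, None]

coef = fit_quadratic(T_train, var_train)
var_pred = predict_quadratic(coef, 2.5)
cov_pred = diagonal_covariance(E, var_pred)
```

With fewer modes than grid cells the rebuilt grid-space matrix is positive semidefinite rather than strictly positive definite; it is positive definite only within the retained EOF subspace.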

Citation: https://doi.org/10.5194/egusphere-2025-3768-RC2
- AC2: 'Reply on RC2', Gosha Geogdzhayev, 14 Oct 2025
  
  We thank the referee for their detailed comments. As noted by the reviewer, the primary goal of the EOF decomposition in our work is data compression / dimensionality reduction. Computing EOFs on a single realization of the historical period is a choice motivated by the focus on dimensionality reduction to reduce the computational burden of the emulator. Throughout this work, we sought to make the emulator as lightweight and straightforward as possible, prioritizing ease of use and interpretability. In our revised manuscript, we will clarify the reasoning behind our choices.
  In our formulation, the stationarity or intra-ensemble generalizability of EOFs are not necessary preconditions for the emulation to be valid. The EOF choice you suggest for the emulator design would instead require some of these assumptions to hold, if we understand your suggestion correctly. In particular, an issue arises with generalizability, since EOF amplitudes are only guaranteed to be bi-orthogonal over the dataset they are computed from. The EOF basis functions are, of course, orthogonal in perpetuity since these only involve spatial integrals. The main issue for orthogonality to hold in general is the orthogonality of the EOF amplitudes, which involve temporal/ensemble averaging. For example, although the EOF amplitudes are orthogonal over the entire historical period for which they are computed, if we look at their monthly covariance as given by an ensemble average, this is no longer the case. This issue might be somewhat alleviated by, say, calculating the EOFs over the full range of the training data (historical and SSP5-8.5 scenario, full ensemble), but this would significantly increase the computational cost of training, as described above, without guaranteeing full generalizability. Lastly, computing the EOFs over the historical period, when the statistics are approximately stationary, provides a more robust basis than computing the EOFs on future scenarios where the statistics depend strongly on the particular set of simulations being considered and may not extrapolate well to unseen new scenarios. As for the decomposition optimization for the covariance, to our knowledge, it is the only method that guarantees that the covariance matrices are positive definite, a key requirement for the problem to be well-posed.
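The orthogonality point can be checked numerically with a short sketch. It uses synthetic data and hypothetical names, and it computes EOFs from the full (non-anomaly) data via an SVD. Over the full dataset the amplitudes are mutually orthogonal, but over a subset (here every 12th sample, standing in for a single calendar month) their second-moment matrix is no longer diagonal.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_grid = 240, 12

# Synthetic spatially correlated data: rows are samples, columns grid cells.
mixing = rng.standard_normal((n_grid, n_grid))
X = rng.standard_normal((n_samples, n_grid)) @ mixing

# EOFs are the right singular vectors; amplitudes are projections onto them.
_, _, vt = np.linalg.svd(X, full_matrices=False)
E = vt.T
A = X @ E


def max_offdiag(M):
    """Largest absolute off-diagonal entry of a square matrix."""
    return float(np.max(np.abs(M - np.diag(np.diag(M)))))


# Second moments of the amplitudes over the full dataset: diagonal.
moment_full = A.T @ A / n_samples
# Same quantity over a subset of samples: generally not diagonal.
A_subset = A[::12]
moment_subset = A_subset.T @ A_subset / len(A_subset)

offdiag_full = max_offdiag(moment_full)
offdiag_subset = max_offdiag(moment_subset)
```

The basis `E` stays orthonormal in both cases; only the amplitude statistics lose their diagonal structure, which is why a full covariance model for the coefficients is needed on subsets such as individual months.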
  On a slightly different note, one possible way to amend our methodology in the spirit of your suggestion (of using anomalies rather than the full variables) would be to assume that the correlation matrix stays constant and instead model only the EOF variances. This is possible, and indeed an early version of the emulator made this assumption, but we found that this assumption was not justified for the full range of SSP scenarios being considered. In our revised manuscript, we will expand further on the choice of modeling the covariance vs. correlation matrices.
  More generally, we believe that the method presented has the advantage of being straightforward, computationally light, and sufficiently accurate for the outlined purposes. Instead of splitting the system into separate components by separating out the mean trend of the data and emulating the variability, we emulate in a unified manner a dimensionally-reduced version of the system, then reconstruct it using the inverse of our chosen dimensionality reduction technique. This approach guarantees that the emulation method is maximally suited for generalization.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3768-AC2

Data sets

GaussianEarth Gosha Geogdzhayev and Andre N. Souza https://github.com/sandreza/GaussianEarth

Viewed

Total article views: 1,187 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,080	81	26	1,187	33	33

Views and downloads (calculated since 18 Aug 2025)

Month	HTML	PDF	XML	Total
Aug 2025	212	24	4	240
Sep 2025	758	20	7	785
Oct 2025	88	21	11	120
Nov 2025	22	16	4	42

Latest update: 22 Nov 2025

Short summary

Climate models serve as good guesses of how humans affect the climate, but they cannot explore all possible future scenarios of interest. We develop a method that can serve as a fast and cheap stand-in to evaluate likely changes in variables like surface temperature and relative humidity at a regional scale in arbitrary future climates. Crucially, our method captures relationships between different geographic areas and predicts both average values and likely ranges using a unified framework.

