Various ways of using Empirical Orthogonal Functions for Climate Model evaluation

Benestad, Rasmus E.; Mezghani, Abdelkader; Lutz, Julia; Dobler, Andreas; Parding, Kajsa M.; Landgren, Oskar A.

doi:https://doi.org/10.5194/egusphere-2022-1385

Preprints

08 Feb 2023

Abstract. We present a framework for evaluating multi-model ensembles based on common empirical orthogonal functions ('common EOFs') that emphasise salient features connected to spatio-temporal covariance structures embedded in large climate data volumes. This framework enables the extraction of the most pronounced spatial patterns of coherent variability within the joint data set and provides a set of weights for each model in terms of principal components which refer to exactly the same set of spatial patterns of covariance. In other words, common EOFs provide a means for extracting information from large volumes of data. Moreover, they can provide an objective basis for evaluation that places more emphasis on the ensemble as a whole than traditional evaluation methods, which tend to focus on individual models. Our demonstration of the capability of common EOFs reveals a statistically significant improvement of the sixth generation of the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project (CMIP6) simulations over the previous generation (CMIP5) in terms of their ability to reproduce the mean seasonal cycle in surface air temperature, precipitation, and mean sea-level pressure over the Nordic countries. The leading common EOF principal component for annually/seasonally aggregated temperature, precipitation and pressure statistics suggests that their simulated interannual variability is generally consistent with that seen in the ERA5 reanalysis. We also demonstrate how common EOFs can be used to analyse whether CMIP ensembles reproduce the observed trends over the historical period 1959–2021, and the results suggest that the trend statistics provided by both CMIP5 RCP4.5 and CMIP6 SSP245 are consistent with observed trends. An interesting finding is also that the leading common EOF principal component for annually/seasonally aggregated statistics seems to be approximately normally distributed, which is useful information about the multi-model ensemble data.
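
The core idea of the abstract, one shared set of spatial patterns with a separate principal-component segment for each data set, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `common_eofs`, the synthetic fields, and the choice to remove each data set's own time mean before stacking are assumptions made purely for illustration.

```python
import numpy as np

def common_eofs(fields, n_modes=2):
    """Hypothetical helper: fields is a list of (n_time_i, n_space) arrays on one shared grid."""
    # Assumption: anomalies are taken relative to each data set's own time mean.
    anomalies = [f - f.mean(axis=0) for f in fields]
    # Stack along the time axis; the spatial dimension is common to all data sets.
    joint = np.concatenate(anomalies, axis=0)
    u, s, vt = np.linalg.svd(joint, full_matrices=False)
    eofs = vt[:n_modes]                        # shared spatial patterns, (n_modes, n_space)
    pcs_joint = u[:, :n_modes] * s[:n_modes]   # joint principal components
    # Cut the joint PCs back into one segment per data set.
    split_points = np.cumsum([a.shape[0] for a in anomalies])[:-1]
    pcs = np.split(pcs_joint, split_points, axis=0)
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return eofs, pcs, explained

# Synthetic stand-ins for three simulations sharing one spatial pattern plus noise.
rng = np.random.default_rng(0)
pattern = np.sin(np.linspace(0, np.pi, 30))
fields = [np.outer(rng.normal(size=40), pattern) + 0.1 * rng.normal(size=(40, 30))
          for _ in range(3)]
eofs, pcs, explained = common_eofs(fields)
```

Because every segment in `pcs` refers to the same rows of `eofs`, the segments can be compared directly across data sets, which is the property the evaluation framework relies on.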

Received: 02 Dec 2022 – Discussion started: 08 Feb 2023

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

26 May 2023

Various ways of using empirical orthogonal functions for climate model evaluation

Rasmus E. Benestad, Abdelkader Mezghani, Julia Lutz, Andreas Dobler, Kajsa M. Parding, and Oskar A. Landgren

Geosci. Model Dev., 16, 2899–2913, https://doi.org/10.5194/gmd-16-2899-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-1385', Abdel Hannachi, 05 Mar 2023

Review of "Various ways of using empirical orthogonal functions for climate model evaluation" by Benestad et al., submitted to Geoscientific Model Development.

The paper discusses a particular version of common EOFs with application to a large number (75) of GCM runs from CMIP5 and CMIP6 simulations. The paper highlights the benefits of applying common EOFs in model evaluation, and points to its use in related topics such as empirical-statistical downscaling. The authors show in particular that all the models capture the mean seasonal cycle and interannual variability of precipitation, sea-level pressure and surface air temperature reasonably well and that CMIP6 shows some improvement over CMIP5.

Recommendation
--------------------------
The authors are to be congratulated for this huge effort to apply common EOFs to a large database of CMIP5 and CMIP6 simulations. I support its publication in Geoscientific Model Development subject to some minor changes.

Major Comments
-------------------------
The SVD-based common EOFs method used in the paper is akin to the combined EOFs (e.g., Navarra and Simoncini 2010), where the different datasets are packed into one single large array, which is then analysed via SVD. Of course, the difference is in the way the data block matrices are arranged in the large array. The result is a set of individual eigenelements (i.e. EOFs in S-mode as in Barnett (1998) and also here, or PCs in T-mode as in combined PCA; see, e.g., Jolliffe (2002)) associated with corresponding eigenvalues.

The original common EOFs method, as first presented by Flury (1984, 1986) (see also Hannachi (2021) for earlier literature), analyses different covariance matrices, for which one common EOF has different explained variances depending on the data (or model run). Clearly this version gives more degrees of freedom to the common EOFs compared to the one defined by Barnett (1998) or here, where one common EOF has one single explained variance for all the models' simulations. One benefit of the former is that these eigenvalues, for one given common EOF, can be used to weight the different models, and can be used in various other ways, e.g., to get the models' climatology. In addition, it overcomes the issue of scaling in the different datasets.

Of course, I must say that the SVD-based common EOFs method (Barnett 1998 and the present manuscript) is computationally much faster and is convenient for application to a large number of GCM runs, as in this paper. I think these points, together with the above references highlighting the historical context of common EOFs/PCs, should be included in the revised version.
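
The distinction drawn above between a single explained variance per common EOF and a separate variance per data set can be made concrete with a small NumPy sketch. It is only a diagnostic on the stacked-SVD result, not Flury's common principal components algorithm, and the synthetic "models" with amplitudes 0.5, 1.0 and 2.0 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_time, n_space = 50, 20
pattern = np.cos(np.linspace(0, np.pi, n_space))
amplitudes = [0.5, 1.0, 2.0]  # hypothetical: three "models" with different variability
fields = [np.outer(a * rng.normal(size=n_time), pattern)
          + 0.05 * rng.normal(size=(n_time, n_space)) for a in amplitudes]

# Stack the anomalies along time and decompose the joint array once.
anomalies = [f - f.mean(axis=0) for f in fields]
joint = np.concatenate(anomalies, axis=0)
u, s, vt = np.linalg.svd(joint, full_matrices=False)
leading_eof = vt[0]

# The joint SVD yields one singular value for the leading pattern ...
joint_variance = s[0] ** 2 / (joint.shape[0] - 1)
# ... but each data set has its own variance when projected onto that pattern.
per_model_variance = [np.var(a @ leading_eof, ddof=1) for a in anomalies]
```

The per-data-set variances sum (after scaling by `n_time - 1`) to the squared leading singular value, so the single joint eigenvalue can be read as a pooled quantity, whereas the individual values carry the kind of model-specific information the reviewer describes.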

In Hannachi et al. (2022), the references we mentioned are more related to climate research. Some other references (e.g., Barnett 1999) were missing because the search engine did not find them, as they do not mention common EOFs/PCs in the title. In any case, the first time common EOFs/PCs were mentioned was in Flury (1984).

Minor Comments
-------------------------
Pg 3 - Please change TAS to SAT (surface air temperature) and PSL to SLP (sea level pressure) on page 3 and elsewhere.
Pg 3, l71: 'vector' --> 'value'
Pg 6, near l171, l175 - repetition.
Pg 5, top panel: y-axis label: add "Relative (or scaled) rank".
Pg 8, l240 - ensemble spread cannot be normal - could be truncated normal perhaps.
Fig. 8, I presume there is one value per model, right? Is it the global mean of the climatology?
Fig. 9, top left and bottom panels, units: °C/yr

References
----------------
Flury, B. N.: Common principal components in k groups. J. Am. Stat. Assoc., 1984.
Flury, B. N., and W. Gautschi: An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J. Sci. Stat. Comput., 1986.
Hannachi, A.: Pattern Identification and Data Mining in Weather and Climate. Springer Nature, 2021.
Jolliffe, I. T.: Principal Component Analysis, Second Edition. Springer Nature, 2002.
Navarra, A., and V. Simoncini: A Guide to Empirical Orthogonal Functions for Climate Data Analysis. Springer, 2010.

Citation: https://doi.org/10.5194/egusphere-2022-1385-RC1
- AC1: 'Reply on RC1', Rasmus Benestad, 22 Mar 2023
  
  Thanks for your comments which will be useful for the revision of our paper. A more detailed response is provided in the attached PDF-file.
  
  Citation: https://doi.org/10.5194/egusphere-2022-1385-AC1
RC2: 'Comment on egusphere-2022-1385', Anonymous Referee #2, 23 Mar 2023

Review of "Various ways of using empirical orthogonal functions for climate model evaluation" by Benestad et al., submitted to Geoscientific Model Development

The manuscript presents a very valuable extension to the traditional evaluation framework for climate projections, which evaluates an ensemble as a whole as opposed to evaluating a single projection. This is a much-needed shift in perspective that responds to the already decades-old practice of assessing climate-related questions by means of ensembles.
In particular, the authors propose to use the EOF machinery to produce a decomposition of an entire ensemble, where every projection along with a reference dataset is characterized by its PCs in a common EOF space. The manuscript also comprises a number of illustrative examples for the application of the proposed procedure to both an ensemble of CMIP5 and of CMIP6 projections.

Recommendation
I recommend the manuscript for publication in Geoscientific Model Development, conditional on an in-depth discussion of some methodological questions.

Major comments
The presented EOF technique does not pursue the same target as the original common EOFs introduced by Flury (1984) and recently applied by Hannachi et al. (2022). These common EOFs find a set of basis vectors that approximately maximizes the variance within a number of datasets (each from one projection) simultaneously. The EOFs proposed by Benestad et al. find the exact EOFs that maximize the mixture of within variances (within each projection) and between cross-covariances (cross-covariance between all pairs of projections) of the combined dataset. As these are two different kinds of optimization, I would suggest finding a new name for the presented technique.
Besides, it would surely be very helpful to discuss the implications of this difference for the meaning of the resulting EOFs. The technique could further be contrasted with multivariate EOFs, where all monthly values of any one projection are stacked in the space direction instead of the time direction, as was done by Sanderson et al. (2015) to represent the similarity between projections in a joint multidimensional space, and/or with one of the many tensor decompositions, where the space-time matrices of the individual projections are stacked along a third (model) direction, and components are found which present the main variations along those three directions (e.g. Cichocki et al., 2015). Aside from that, there exists a vast and slightly chaotic literature on multigroup/multiblock/multitable PCA methods, which is also concerned with contriving ways to combine datasets of differing origin in one joint PCA.
The discussion of the presented applications seems, at least in my opinion, very much centered on the “easy” cases, where the variance is represented overwhelmingly by the first EOF and the reference lies well in the middle of the ensemble (seasonal cycles, interannual variability of TAS and PSL, and TAS trend). The authors could take the more complicated situation in the interannual variation of PR as an opportunity to elaborate further on the inclusion of more than one PC in the subsequent evaluation, and on the possibility and consequences of the reference not falling inside the ensemble.
I further suggest considering a variation of the proposed method: to include only the ensemble of projections in the EOF analysis, and to project the reference, of which there may be more than one, onto the EOFs afterwards. It would appear to me a “cleaner” approach not to mix projections and references. On the other hand, in such vast ensembles as used here, the exclusion of the reanalysis will hardly make any appreciable difference.
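
The variation suggested above, computing the EOFs from the ensemble alone and projecting the reference onto them afterwards, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the manuscript: the helper `synthetic_field`, the grid size, and the two retained modes are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_time, n_space, n_models = 30, 25, 5
pattern = np.sin(np.linspace(0, 2 * np.pi, n_space))

def synthetic_field():
    # Hypothetical stand-in for one simulation or reanalysis on a shared grid.
    return (np.outer(rng.normal(size=n_time), pattern)
            + 0.1 * rng.normal(size=(n_time, n_space)))

ensemble = [synthetic_field() for _ in range(n_models)]
reference = synthetic_field()

# EOFs come from the ensemble only; the reference plays no part in the SVD.
ens_anom = np.concatenate([f - f.mean(axis=0) for f in ensemble], axis=0)
_, s, vt = np.linalg.svd(ens_anom, full_matrices=False)
eofs = vt[:2]

# The EOFs are orthonormal, so the projection is a plain matrix product.
ref_anom = reference - reference.mean(axis=0)
ref_pcs = ref_anom @ eofs.T

# Fraction of the reference variance captured by the retained ensemble EOFs.
captured = np.sum(ref_pcs ** 2) / np.sum(ref_anom ** 2)
```

A low value of `captured` would indicate that the reference varies in directions the ensemble patterns do not span, which is one way to quantify the case of the reference "not falling inside the ensemble".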
Sanderson, B.M., Knutti, R., Caldwell, P. (2015): A representative democracy to reduce interdependency in a multimodel ensemble. Journal of Climate, 28(13), pp. 5171-5194
Cichocki, A., Mandic, D., De Lathauwer, L., ... (2015): Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, 32(2), 7038247, pp. 145-163
Minor comments
Line 235 ff.: Is there any evidence that supports the proposition of independence between ensemble members? The QQ-plot suggests Gaussian distribution, but to my knowledge the QQ-plot gives no hint concerning independence. Furthermore, this would contradict the widely-recognized notion of strong interdependence between climate models.
PC plots in all figures: I find it difficult to distinguish the black curve (reanalysis) against the background of the dark blue curves (CMIP6).

Citation: https://doi.org/10.5194/egusphere-2022-1385-RC2
- CC1: 'Reply on RC2', Rasmus Benestad, 30 Mar 2023
  
  Thanks for these comments. We will base our revisions on them and have responded to them in the attached PDF.
  
  Citation: https://doi.org/10.5194/egusphere-2022-1385-CC1
- CC2: 'Reply on RC2', Rasmus Benestad, 30 Mar 2023
  
  Thanks for these comments. We will base our revisions on them and have responded to them in the attached PDF.
  
  Citation: https://doi.org/10.5194/egusphere-2022-1385-CC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Rasmus Benestad on behalf of the Authors (13 Apr 2023): Author's response, Author's tracked changes, Manuscript

ED: Publish as is (27 Apr 2023) by Axel Lauer

AR by Rasmus Benestad on behalf of the Authors (02 May 2023): Manuscript

Video supplement

common EOFs for evaluation of geophysical data and global climate models. Rasmus Benestad https://www.youtube.com/watch?v=32mtHHAoq6k

A brief presentation of common EOFs in R-studio Rasmus Benestad https://www.youtube.com/watch?v=E01hthVL9pY

Viewed

Total article views: 784 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
582	183	19	784	9	8

Views and downloads (calculated since 08 Feb 2023)

Month	HTML	PDF	XML	Total
Feb 2023	243	91	7	341
Mar 2023	177	48	12	237
Apr 2023	115	27	0	142
May 2023	47	17	0	64
Jun 2023	0
Jul 2023	0
Aug 2023	0
Sep 2023	0
Oct 2023	0
Nov 2023	0
Dec 2023	0
Jan 2024	0

Viewed (geographical distribution)

Total article views: 780 (including HTML, PDF, and XML) Thereof 780 with geography defined and 0 with unknown origin.

Latest update: 10 Jan 2024

Short summary

The mathematical method known as 'common EOFs' is not widely used within the climate research community, but it offers innovative ways of evaluating climate models. We show how common EOFs can be used to evaluate large ensembles of global climate model simulations and distil information about their ability to reproduce salient features of the regional climate. They can be regarded as a kind of machine learning (ML) for dealing with "big data".

