the Creative Commons Attribution 4.0 License.
Various ways of using Empirical Orthogonal Functions for Climate Model evaluation
Rasmus E. Benestad
Abdelkader Mezghani
Julia Lutz
Andreas Dobler
Kajsa M. Parding
Oskar A. Landgren
Abstract. We present a framework for evaluating multi-model ensembles based on common empirical orthogonal functions ('common EOFs') that emphasise salient features connected to spatio-temporal covariance structures embedded in large climate data volumes. In other words, this framework enables the extraction of the most pronounced spatial patterns of coherent variability within the joint data set and provides a set of weights for each model in terms of principal components which refer to exactly the same set of spatial patterns of covariance. Common EOFs thus provide a means for extracting information from large volumes of data. Moreover, they can provide an objective basis for evaluation that places more emphasis on the ensemble as a whole than traditional evaluation methods, which tend to focus on individual models. Our demonstration of the capability of common EOFs reveals a statistically significant improvement of the sixth generation of the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project (CMIP6) simulations over the previous generation (CMIP5) in terms of their ability to reproduce the mean seasonal cycle in surface air temperature, precipitation, and mean sea-level pressure over the Nordic countries. The leading common EOF principal component for annually/seasonally aggregated temperature, precipitation and pressure statistics suggests that their simulated interannual variability is generally consistent with that seen in the ERA5 reanalysis. We also demonstrate how common EOFs can be used to analyse whether CMIP ensembles reproduce the observed trends over the historical period 1959–2021, and the results suggest that the trend statistics provided by both CMIP5 RCP4.5 and CMIP6 SSP245 are consistent with observed trends.
An interesting finding is also that the leading common EOF principal component for annually/seasonally aggregated statistics seems to be approximately normally distributed, which is useful information about the multi-model ensemble data.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2022-1385', Abdel Hannachi, 05 Mar 2023
Review of "Various ways of using empirical orthogonal functions for climate model evaluation" by Benestad et al., submitted to Geoscientific Model Development.
The paper discusses a particular version of common EOFs with application to a large number (75) of GCM runs from CMIP5 and CMIP6 simulations. The paper highlights the benefits of applying common EOFs in model evaluation, and points to their use in related topics such as empirical-statistical downscaling. The authors show in particular that all the models capture the mean seasonal cycle and interannual variability of precipitation, sea-level pressure and surface air temperature reasonably well and that CMIP6 shows some improvement over CMIP5.
Recommendation
The authors are to be congratulated for this huge effort to apply common EOFs to a large database of CMIP5 and CMIP6 simulations. I support its publication in Geoscientific Model Development subject to some minor changes.
Major Comments
The SVD-based common EOFs method used in the paper is akin to the combined EOFs (e.g., Navarra and Simoncini 2010), where the different datasets are packed into one single large array, which is then analysed via SVD. The difference, of course, lies in the way the data block matrices are arranged in the large array. The result is a set of individual eigenelements (i.e. EOFs in S-mode as in Barnett (1998) and also here, or PCs in T-mode as in combined PCA, see, e.g., Jolliffe (2002)) associated with corresponding eigenvalues.
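As an illustration of the arrangement described in this comment, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes that all data sets sit on the same grid, that each is stored as a (time, space) array, and that each data set's own time mean is removed first; the function name `common_eofs` and the synthetic "models" are hypothetical.

```python
import numpy as np

def common_eofs(datasets, n_modes=3):
    """Common EOFs of several (time, space) fields defined on one grid.

    The fields are concatenated along the time axis and decomposed once with
    an SVD, so every data set shares the same spatial patterns.
    """
    # Assumption: remove each data set's own time mean to work with anomalies.
    anomalies = [x - x.mean(axis=0, keepdims=True) for x in datasets]
    lengths = [a.shape[0] for a in anomalies]
    stacked = np.concatenate(anomalies, axis=0)       # (sum of times, space)

    # Economy-size SVD: stacked = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)

    eofs = vt[:n_modes]                               # shared spatial patterns
    pcs = u[:, :n_modes] * s[:n_modes]                # joint time coefficients
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)     # one fraction per mode

    # Cut the joint principal components back into one segment per data set.
    splits = np.cumsum(lengths)[:-1]
    return eofs, np.split(pcs, splits, axis=0), explained


rng = np.random.default_rng(0)
pattern = rng.standard_normal(20)
pattern /= np.linalg.norm(pattern)
# Three hypothetical "models": one pattern, different amplitudes, plus noise.
models = [
    amp * rng.standard_normal((40, 1)) * pattern
    + 0.1 * rng.standard_normal((40, 20))
    for amp in (1.0, 2.0, 3.0)
]
eofs, pcs, explained = common_eofs(models, n_modes=2)
```

Because every segment in `pcs` refers to the same rows of `eofs`, statistics of the segments (for instance their standard deviations) can be compared directly across data sets.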
The original common EOFs method, as first presented by Flury (1984, 1986; see also Hannachi (2021) for earlier literature), analyses different covariance matrices, for which one common EOF has different explained variances depending on the data (or model run). Clearly this version gives more degrees of freedom to the common EOFs compared to the one defined by Barnett (1998) or here, where one common EOF has one single explained variance for all the models' simulations. One benefit of the former is that these eigenvalues for one given common EOF can be used to weight the different models, and can be used in various other ways, e.g., to get the models' climatology. In addition, it overcomes the issue of scaling in the different datasets.
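The idea that one shared pattern can carry a different variance in each model can be sketched as follows. This is not Flury's algorithm, which estimates the common basis from the individual covariance matrices; the sketch only assumes that a shared orthonormal basis is already available and computes the variance of each data set along it. The function name `variance_per_model` and the synthetic data are hypothetical.

```python
import numpy as np

def variance_per_model(datasets, eofs):
    """Variance of each data set along each shared spatial pattern.

    datasets: list of (time, space) arrays.
    eofs: (modes, space) array with orthonormal rows.
    Returns an array of shape (number of data sets, modes).
    """
    out = []
    for x in datasets:
        anomalies = x - x.mean(axis=0, keepdims=True)
        coefficients = anomalies @ eofs.T       # projection onto shared basis
        out.append(coefficients.var(axis=0, ddof=1))
    return np.array(out)


rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.standard_normal((10, 2)))  # orthonormal columns
eofs = basis.T
# Hypothetical models sharing both patterns but differing in the amplitude
# of the first one.
models = [
    (rng.standard_normal((500, 2)) * np.array([amp, 1.0])) @ eofs
    for amp in (1.0, 2.0, 4.0)
]
variances = variance_per_model(models, eofs)
```

Each row of `variances` plays the role of a model-specific set of eigenvalues for the same patterns, which is the kind of quantity that could serve as a basis for weighting.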
Of course, I must say that the SVD-based common EOFs (Barnett 1998 and the present manuscript) are computationally much faster and convenient for application with a large number of GCM runs, as in this paper. I think these points, with the above references highlighting the historical context of common EOFs/PCs, should be included in the revised version.
In Hannachi et al. (2022), the references we mentioned are more related to climate research. Some other references (e.g., Barnett 1999) were missing because the search engine did not find them, as they do not mention common EOFs/PCs in the title. In any case, the first time common EOFs/PCs were mentioned was in Flury (1984).
Minor Comments
Pg 3 - Please change TAS to SAT (surface air temperature) and PSL to SLP (sea level pressure) on page 3 and elsewhere.
Pg 3, l71: 'vector' -> 'value'
Pg 6, near l171, l175 - repetition.
Pg 5, top panel: y-axis label: add "Relative (or scaled) rank".
Pg 8, l240 - ensemble spread cannot be normal - could be truncated normal perhaps.
Fig. 8, I presume there is one value per model, right? Is it the global mean of the climatology?
Fig. 9, top left and bottom panels, units: °C/yr
References

Flury, B. N.: Common principal components in k groups. J. Am. Stat. Assoc., 1984.
Flury, B. N., and W. Gautschi: An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J. Sci. Stat. Comput., 1986.
Hannachi, A.: Pattern Identification and Data Mining in Weather and Climate. Springer Nature, 2021.
Jolliffe, I. T.: Principal Component Analysis, Second Edition. Springer Nature, 2002.
Navarra, A., and V. Simoncini: A Guide to Empirical Orthogonal Functions for Climate Data Analysis. Springer, 2010.
Citation: https://doi.org/10.5194/egusphere-2022-1385-RC1
AC1: 'Reply on RC1', Rasmus Benestad, 22 Mar 2023

RC2: 'Comment on egusphere-2022-1385', Anonymous Referee #2, 23 Mar 2023
Review of "Various ways of using empirical orthogonal functions for climate model evaluation" by Benestad et al., submitted to Geoscientific Model Development
The manuscript presents a very valuable extension to the traditional evaluation framework for climate projections, which evaluates an ensemble as a whole as opposed to the evaluation of single projections. This is a much-needed shift in perspective that responds to the already decades-old practice of assessing climate-related questions by means of ensembles.
In particular, the authors propose to use the EOF machinery to produce a decomposition of an entire ensemble, where every projection along with a reference dataset is characterized by its PCs in a common EOF space. The manuscript also comprises a number of illustrative examples for the application of the proposed procedure to both an ensemble of CMIP5 and of CMIP6 projections.
Recommendation
I recommend the manuscript for publication in Geoscientific Model Development conditional on an in-depth discussion of some methodological questions.
Major comments
The presented EOF technique does not pursue the same target as the original common EOFs introduced by Flury (1984) and recently applied by Hannachi et al. (2022). These common EOFs find a set of basis vectors that approximately maximizes the variance within a number of datasets (each from one projection) simultaneously. The EOFs proposed by Benestad et al. find the exact EOFs that maximize the mixture of within-variances (within each projection) and between cross-covariances (cross-covariance between all pairs of projections) of the combined dataset. As these are two different kinds of optimization, I would suggest finding a new name for the presented technique.
Besides, it would surely be very helpful to discuss the implications of this difference for the meaning of the resulting EOFs. The technique could further be contrasted with multivariate EOFs, where all monthly values of any one projection are stacked in the space direction instead of the time direction, as was done by Sanderson et al. (2015) to represent the similarity between projections in a joint multidimensional space, and/or with one of the many tensor decompositions, where the space-time matrices of the individual projections are stacked along a third (model) direction, and components are found which represent the main variations along those three directions (e.g. Cichocki et al., 2015). Aside from that, there exists a vast and slightly chaotic literature on multi-group/multi-block/multi-table PCA methods, which is also concerned with contriving ways to combine datasets of differing origin in one joint PCA.
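The three arrangements contrasted in this comment differ only in how the per-model (time, space) matrices are combined before decomposition. The sketch below shows the resulting array shapes with NumPy; the dimensions and random fields are hypothetical placeholders, and the comments give one possible reading of each arrangement rather than a description of the cited papers' procedures.

```python
import numpy as np

n_models, n_time, n_space = 4, 120, 50
rng = np.random.default_rng(2)
fields = [rng.standard_normal((n_time, n_space)) for _ in range(n_models)]

# Time-direction stacking: one shared set of spatial patterns and one
# segment of principal components per model.
time_stacked = np.concatenate(fields, axis=0)    # (n_models * n_time, n_space)

# Space-direction stacking: one shared time coefficient per mode and one
# spatial pattern per model; the time steps must be comparable across
# models, e.g. calendar months of a climatology.
space_stacked = np.concatenate(fields, axis=1)   # (n_time, n_models * n_space)

# Third-order array, the starting point for a tensor decomposition along
# the model, time and space directions.
tensor = np.stack(fields, axis=0)                # (n_models, n_time, n_space)
```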
The discussion of the presented applications seems, at least in my opinion, very much centered on the “easy” cases, where the variance is represented overwhelmingly by the first EOF and the reference lies well in the middle of the ensemble (seasonal cycles, interannual variability of TAS and PSL, and TAS trend). The authors could take the more complicated situation in the interannual variation of PR as an opportunity to elaborate further on the inclusion of more than one PC in the subsequent evaluation, and on the possibility and consequences of the reference not falling inside the ensemble.
I further suggest considering a variation of the proposed method: to include only the ensemble of projections in the EOF analysis, and to project the reference, which might also be more than one reference, onto the EOFs afterwards. It would appear to me a "cleaner" approach not to mix projections and references. On the other hand, in such vast ensembles as used here, the exclusion of the reanalysis will hardly make any appreciable difference.
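The variation suggested here can be sketched as follows, under the assumptions that the reference shares the grid of the ensemble and that anomalies are taken with respect to each data set's own time mean. The function name `project_reference` and the diagnostic `captured` are hypothetical additions for illustration; `captured` is the fraction of the reference variance represented by the ensemble-derived patterns, which gives one rough indication of whether the reference falls inside the space spanned by the ensemble.

```python
import numpy as np

def project_reference(ensemble, reference, n_modes=2):
    """EOFs from the ensemble only; the reference is projected afterwards.

    ensemble: list of (time, space) arrays; reference: (time_ref, space).
    """
    anomalies = [x - x.mean(axis=0, keepdims=True) for x in ensemble]
    stacked = np.concatenate(anomalies, axis=0)
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    eofs = vt[:n_modes]                               # orthonormal rows

    ref_anomalies = reference - reference.mean(axis=0, keepdims=True)
    ref_pcs = ref_anomalies @ eofs.T                  # least-squares coefficients
    # Share of the reference variance captured by the ensemble-derived EOFs.
    captured = np.sum(ref_pcs ** 2) / np.sum(ref_anomalies ** 2)
    return eofs, ref_pcs, captured


def synthetic(rng, pattern, n_time=60, noise=0.05):
    """Hypothetical field: one pattern with random amplitude plus noise."""
    signal = rng.standard_normal((n_time, 1)) * pattern
    return signal + noise * rng.standard_normal((n_time, pattern.size))


rng = np.random.default_rng(3)
pattern = np.zeros(30)
pattern[:15] = 1 / np.sqrt(15)
other = np.zeros(30)
other[15:] = 1 / np.sqrt(15)

ensemble = [synthetic(rng, pattern) for _ in range(5)]
_, _, captured_like = project_reference(ensemble, synthetic(rng, pattern), 1)
_, _, captured_unlike = project_reference(ensemble, synthetic(rng, other), 1)
```

In this synthetic case `captured_like` is close to one and `captured_unlike` is close to zero, because the second reference varies along a pattern that the ensemble does not contain.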
Sanderson, B.M., Knutti, R., Caldwell, P. (2015): A representative democracy to reduce interdependency in a multimodel ensemble. Journal of Climate, 28(13), pp. 5171-5194
Cichocki, A., Mandic, D., De Lathauwer, L., ... (2015): Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, 32(2), 7038247, pp. 145-163
Minor comments
Line 235 ff.: Is there any evidence that supports the proposition of independence between ensemble members? The Q-Q plot suggests a Gaussian distribution, but to my knowledge the Q-Q plot gives no hint concerning independence. Furthermore, this would contradict the widely recognized notion of strong interdependence between climate models.
PC plots in all figures: I find it difficult to distinguish the black curve (reanalysis) against the background of the dark blue curves (CMIP6).
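The first minor comment, that a Q-Q plot can look Gaussian without the ensemble members being independent, can be illustrated with synthetic numbers. The sketch below is an assumption-laden toy example, not an analysis of the CMIP data: it builds a hypothetical ensemble of 75 members that share a common random component, then compares the straightness of the Q-Q plot for one draw of the ensemble with the correlation between two members across many draws. The function name `qq_correlation` and the value of `rho` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def qq_correlation(sample):
    """Correlation between sorted values and standard normal quantiles.

    Values near 1 correspond to a nearly straight Q-Q plot.
    """
    n = len(sample)
    probs = (np.arange(1, n + 1) - 0.5) / n
    quantiles = np.array([NormalDist().inv_cdf(p) for p in probs])
    return np.corrcoef(np.sort(sample), quantiles)[0, 1]


rng = np.random.default_rng(4)
n_members, n_draws, rho = 75, 2000, 0.6
# Hypothetical ensemble: every member contains the same shared component,
# so any two members are correlated with coefficient rho across draws.
shared = rng.standard_normal((n_draws, 1))
own = rng.standard_normal((n_draws, n_members))
members = np.sqrt(rho) * shared + np.sqrt(1 - rho) * own

straightness = qq_correlation(members[0])        # one draw of the ensemble
dependence = np.corrcoef(members[:, 0], members[:, 1])[0, 1]
```

Here `straightness` is close to one although `dependence` is clearly non-zero, so a straight Q-Q plot of a single set of ensemble values says nothing about independence between members.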
Citation: https://doi.org/10.5194/egusphere-2022-1385-RC2
CC1: 'Reply on RC2', Rasmus Benestad, 30 Mar 2023
CC2: 'Reply on RC2', Rasmus Benestad, 30 Mar 2023
Video supplement
common EOFs for evaluation of geophysical data and global climate models, Rasmus Benestad, https://www.youtube.com/watch?v=32mtHHAoq6k
A brief presentation of common EOFs in Rstudio, Rasmus Benestad, https://www.youtube.com/watch?v=E01hthVL9pY