the Creative Commons Attribution 4.0 License.
On the spatial calibration of imperfect climate models
Abstract. The calibration of Earth System Model parameters is subject to data, time and computational constraints. The high dimensionality of this calibration problem, combined with errors which arise from model structural assumptions, makes it impossible to find model versions fully consistent with historical observations, with the potential for multiple plausible configurations which make different trade-offs between skill in different variables or spatial regions. In this study, we lay out a formalism for making different assumptions about how ensemble variability in a Perturbed Physics Ensemble (PPE) relates to model error, proposing an empirical but practical solution for finding diverse near-optimal solutions. We argue that the number of effective degrees of freedom in model performance response to parameter input (the 'parametric component') is, in fact, relatively small, illustrating why manual calibration is often able to find near-optimal solutions. Comparison with a perturbed initial condition ensemble reveals that internal variability associated with this parametric component of model error is negligible. Finally, there is a potential for comparably performing parameter configurations making different trade-offs in model errors. These alternative configurations can inform model development and could potentially lead to significantly different future climate evolution.
Status: closed

RC1: 'Comment on egusphere20232269', Anonymous Referee #1, 13 Nov 2023

AC1: 'Reply on RC1', Saloua Peatier, 08 Feb 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere20232269/egusphere20232269AC1supplement.pdf

RC2: 'Comment on egusphere20232269', Anonymous Referee #2, 19 Nov 2023

AC2: 'Reply on RC2', Saloua Peatier, 08 Feb 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere20232269/egusphere20232269AC2supplement.pdf

RC3: 'Comment on egusphere20232269', Anonymous Referee #3, 20 Nov 2023
The paper deals with the problem of finding good calibrations of a climate model based on a finite number of simulations with different parameters. A methodology based on a PCA of an ensemble of simulations is developed, in which the model error is split into a parametric component and a non-parametric component. The methodology is applied to a perturbed physics ensemble from an atmospheric model. I am not an expert in calibration but find that the paper includes some valuable insights. The paper is very technical and, unfortunately, in some places not very clearly described and therefore hard to follow.
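For readers unfamiliar with this kind of decomposition, the following is a minimal sketch of the general idea only, not the authors' code, and the paper's exact definitions may differ. It assumes that the error of each ensemble member is projected onto the leading q EOFs of the ensemble anomalies (the part that lies in the space spanned by ensemble variability), with the orthogonal residual taken as the non-parametric part. The function name `split_error`, the array shapes, and the synthetic data are hypothetical.

```python
import numpy as np

def split_error(ensemble, obs, q):
    """Split each member's error into a part inside the span of the
    leading q EOFs and an orthogonal residual (illustrative only)."""
    mean = ensemble.mean(axis=0)
    anomalies = ensemble - mean
    # Rows of vt are orthonormal spatial patterns (the EOFs).
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:q]                          # shape (q, n_grid)
    error = ensemble - obs                 # shape (n_members, n_grid)
    parametric = error @ eofs.T @ eofs     # projection onto the EOF span
    nonparametric = error - parametric     # part the truncated basis cannot represent
    return parametric, nonparametric

# Synthetic stand-in data; sizes are assumptions chosen for illustration.
rng = np.random.default_rng(0)
n_members, n_grid = 102, 500
ensemble = rng.normal(size=(n_members, n_grid))
obs = rng.normal(size=n_grid)

par, nonpar = split_error(ensemble, obs, q=10)
```

Because the EOFs are orthonormal, the two parts are orthogonal for every member and add up to the full error, so their squared norms sum to the squared total error.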
General comments:
1) The Introduction is rather terse and could be more welcoming for a broader audience. It would be nice if the authors would define the concept 'emulator'. It seems from the examples that this is understood here as a statistical tool to interpolate between the parameter values for which climate model experiments are available. However, it is my understanding that emulators can also be simple physical models.
It would also be an improvement if the authors would discuss which fields are generally used in the calibration of climate models. Is 'calibration' the same process as what is often referred to as 'tuning'?
Where does the word 'spatial' in the title come from? I guess the method is rather general although you apply it here to spatial fields.
2) It seems to be assumed that observations are perfect. This is not the case and often there are also errors originating from the finite sampling: the estimated climatology in both observations and models depends on the length of the time series. How will these errors impact your results?
3) The emulator used in this study seems to be linear multiple regression. Furthermore, the methodology of the analysis is based on EOF/PC analyses. But other emulators are nonlinear. The output of such emulators is therefore not limited to the linear space spanned by the EOFs. So how would your analyses change with a nonlinear emulator and how would it change your conclusions?
4) The analytical derivations can in places be hard to follow. I would suggest that a notation is used that differentiates between scalars, vectors, and matrices.
5) I am also confused about the optimisation described in sections 2.4 and 2.5. As far as I can see you don't apply a minimizing method but generate a lot of emulations with different parameters. Then you find a set of parameters with error smaller than the error from a reference model (what is this?). And then you prune that set to a much smaller set (2.5). Is this correct? So the members of the set you find are then not local minima?
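The procedure as read in comments 3 and 5 (a linear regression emulator, many sampled parameter vectors, a filter against a reference error, then pruning) can be sketched as follows. This is an illustration of that reading under stated assumptions, not the authors' implementation: the data are synthetic, the sizes are placeholders, the reference error is arbitrarily taken as the median member error, candidates are drawn uniformly at random, and the pruning step uses greedy farthest-point selection, which is a swapped-in technique chosen here for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)
n_members, n_params, n_modes = 102, 30, 5    # illustrative sizes (assumptions)

# Synthetic stand-in for a PPE: parameters in [0, 1] and PC coefficients per member.
theta = rng.uniform(size=(n_members, n_params))
true_map = rng.normal(size=(n_params, n_modes))
pcs = theta @ true_map + 0.1 * rng.normal(size=(n_members, n_modes))
obs_pcs = np.full(n_params, 0.5) @ true_map  # hypothetical observations in PC space

# Multilinear regression emulator: PC coefficients as a linear function of parameters.
design = np.hstack([np.ones((n_members, 1)), theta])
coef, *_ = np.linalg.lstsq(design, pcs, rcond=None)

def emulate(params):
    """Predict PC coefficients for an array of parameter vectors."""
    return np.hstack([np.ones((len(params), 1)), params]) @ coef

# Reference error: here the median error over the ensemble (an assumption).
threshold = np.median(np.linalg.norm(pcs - obs_pcs, axis=1))

# Sample many candidates and keep those the emulator rates better than the reference.
candidates = rng.uniform(size=(20000, n_params))
cand_err = np.linalg.norm(emulate(candidates) - obs_pcs, axis=1)
mask = cand_err < threshold
kept, kept_err = candidates[mask], cand_err[mask]

def prune_diverse(points, first, k):
    """Greedy farthest-point selection of k mutually distant rows of points."""
    k = min(k, len(points))
    chosen = [first]
    dmin = np.linalg.norm(points - points[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dmin))           # point farthest from all chosen so far
        chosen.append(nxt)
        dmin = np.minimum(dmin, np.linalg.norm(points - points[nxt], axis=1))
    return chosen

# Start from the lowest-error candidate, then add the most distant acceptable ones.
selected = prune_diverse(kept, int(np.argmin(kept_err)), k=5)
```

Nothing in this sketch performs a descent, so the selected parameter vectors are simply acceptable samples spread across parameter space; they need not be local minima of the emulated error, which is the point raised in comment 5.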
Specific comments:
l43: This is the first time PPE is used in the main text. It should be spelled out here.
l64: Salter et al. should not be in parentheses.
l85: So n=102 and each \Theta_i is a vector of length 30? As mentioned above it would be helpful if the notation separated vectors from scalars.
l106: \mu seems to be on the left-hand side in Eq. 105, so does r_f include \mu here?
Eqs. 6 and 7: c_f > c_y ?
Eq. 14: So this is multilinear regression. I am again confused about notation. Should \Theta_i just be the vector \Theta? Theta_i is the vector of parameters used in the i'th climate model?
l174: What is a reference model? Is there any reason to assume that this is better than any randomly selected model from the ensemble?
Section 3, beginning: You consider the SAT from 3 years but what is more precisely used here? The annual means?
l211: Section ??
In Fig. 1 left: It is not clear to me what the hatched regions show. Is the non-parametric error only shown for q=102?
l232: We follow section 14? Eq. 14?
l239: It is not clear to me what in-sample refers to here.
l257: I think this is the first time in the paper GMMIP is mentioned. What is it?
l283: What does LHS mean?
Fig. 3: The caption should also describe the plots along the diagonal.
Citation: https://doi.org/10.5194/egusphere20232269RC3 
AC3: 'Reply on RC3', Saloua Peatier, 08 Feb 2024
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere20232269/egusphere20232269AC3supplement.pdf