the Creative Commons Attribution 4.0 License.
Monte Carlo Drift Correction – Quantifying the Drift Uncertainty of Global Climate Models
Abstract. Global climate models are susceptible to drift, causing spurious trends in output variables. Drift is often corrected using data from a control simulation. However, internal climate variability within the control simulation introduces uncertainty to the drift correction process. To quantify this drift uncertainty, we develop a probabilistic technique: Monte Carlo drift correction (MCDC). MCDC involves random sampling of the control time series. We apply MCDC to an ensemble of global climate models from the Coupled Model Intercomparison Project Phase 6 (CMIP6). We find that drift correction partially addresses a problem related to drift: energy non-conservation. Nevertheless, the energy balance of several models remains suspect. We quantify the drift uncertainty of global quantities associated with energy balance and thermal expansion of the ocean. When correcting drift in a cumulatively-integrated energy flux, we find that it is preferable to integrate the flux before correcting the trend: an alternative method would be to correct the bias before integrating the flux, but this alternative method amplifies the drift uncertainty by up to an order of magnitude. We find that drift uncertainty is often smaller than other sources of uncertainty: for thermosteric sea-level rise projections for the 2090s, ensemble-mean drift uncertainty (9 mm) is an order of magnitude smaller than scenario uncertainty (138 mm) and model uncertainty (98 mm). However, drift uncertainty may dominate time series that have weak trends: for historical thermosteric sea-level rise since the 1850s, ensemble-mean drift uncertainty is 15 mm, which is of comparable magnitude to the impact of omitting volcanic forcing in control simulations. Therefore, drift uncertainty may influence comparisons between historical simulations and observation-based estimates of thermosteric sea-level rise.
When evaluating and analysing global climate model data that are susceptible to drift, researchers should consider drift uncertainty.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2022-1515', Damien Irving, 16 Mar 2023
# General comments
In general, I think this manuscript makes a valuable contribution to the literature. It introduces a concept (internal drift uncertainty arising from internal climate variability within model control simulations) that is typically overlooked by authors working with climate model variables that are prone to drift (i.e. those influenced by the deep ocean). The main result, that drift uncertainty can be relatively large in comparison to forced trends in historical simulations, is important, and the authors put forward a useful method (Monte Carlo Drift Correction) for quantifying/checking the size of drift uncertainty.
Some other minor results are also interesting and well worth documenting:
1. The results the authors present regarding how the fraction of excess energy absorbed by the ocean and the expansion efficiency of heat behave in control simulations before and after drift correction also add a little to the existing literature on energy conservation in CMIP models (Hobbs et al 2016; Irving et al 2021).
2. The authors point out that it is preferable to integrate fluxes before correcting the trend (as opposed to correcting the bias before integrating the flux) which is something other papers do (e.g. Irving et al 2019; https://doi.org/10.1029/2019GL082015) but don't necessarily explain why they make that methodological choice.
# Specific comments
The authors acknowledge in the manuscript (line 415 and elsewhere) that a limitation of their study is that they ignore branch time metadata, which could be used to reduce uncertainty by allowing for higher-order polynomials to be fitted. I agree with the authors that in some cases the branch time metadata is either not available or incorrect, but more often than not branch time metadata is available/correct, and where it isn't, it can usually be estimated. For instance, Irving et al (2019; https://doi.org/10.1029/2019GL082015) analyse an ensemble of CMIP models and say the following: "We obtained a drift estimate by fitting a cubic polynomial to the full control time series... The time period in the control simulation that parallels the forced simulation was then identified using the branch time information provided in the file metadata, so that the correct segment of the cubic polynomial could be subtracted from the forced simulation. For models with erroneous metadata, the branch time was estimated via visual inspection of the globally integrated OHC timeseries."
I strongly encourage the authors to follow the lead of Irving et al (2019) by attempting to verify model branch times by plotting a variable such as globally integrated OHC. Since essentially all models have a fairly large drift in globally integrated OHC, if you plot the control and forced experiment time series (using branch time information to line up the respective time axes) it's usually pretty easy to see if the first value of the forced experiment does in fact branch off the control experiment at the time the metadata says it does. If it doesn't, it's usually pretty easy to approximately estimate where the branch point actually is. Following this procedure, I'd be surprised if there were many models for which the branch time couldn't be verified as correct or sufficiently estimated. This would allow the authors to overcome some of the main limitations of their study.
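The procedure quoted above from Irving et al (2019) can be illustrated with a minimal sketch. It assumes annual, globally integrated values on a regular time axis; the function name `dedrift_with_branch_time` and the `branch_index` argument (the position in the control series at which the forced run branched) are hypothetical names chosen for illustration and are not taken from the manuscript or from Irving et al.

```python
import numpy as np


def dedrift_with_branch_time(control, forced, branch_index, degree=3):
    """Subtract control-run drift from a forced run (illustrative sketch).

    control      : 1-D array of annual values from the control simulation
    forced       : 1-D array of annual values from the forced simulation
    branch_index : hypothetical index into `control` where the forced run branched
    degree       : order of the polynomial fitted to the full control series
    """
    control = np.asarray(control, dtype=float)
    forced = np.asarray(forced, dtype=float)

    # Fit the drift polynomial to the full control time series.
    t_control = np.arange(control.size)
    fit = np.polynomial.Polynomial.fit(t_control, control, degree)

    # Evaluate the fit over the control years that parallel the forced run.
    t_parallel = branch_index + np.arange(forced.size)
    drift = fit(t_parallel)

    # Remove the drift relative to its value at the branch point, so the
    # first forced value is left unchanged by the correction.
    return forced - (drift - drift[0])
```

If the forced run extends beyond the end of the control run, the polynomial is extrapolated, which should be treated with caution for higher-order fits.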
# Technical corrections
Irving et al "2020" is quoted throughout the paper but the actual publication year of that paper is 2021: https://doi.org/10.1175/JCLI-D-20-0281.1
Citation: https://doi.org/10.5194/egusphere-2022-1515-RC1
RC2: 'Comment on egusphere-2022-1515', Anonymous Referee #2, 19 Jun 2023
The authors build on previous papers that considered model drift in coupled climate models, and in particular the potential role of internal variability in affecting drift corrections. To that end they propose a Monte Carlo method of drift estimation, and compare drift estimates using time derivatives/fluxes vs time integrals/state variables. The paper is quite clearly written (although I found the paragraph structure, with almost each sentence being a separate paragraph, to be quite jarring), and the figures are mostly clear. However, there are some significant technical/methodological issues that the authors will need to either address or justify before I could recommend publication.
I have attached a marked-up PDF, but my significant comments are:
1) Terminology: the authors need to be a little careful claiming that the climate models are not energy conserving, which isn't really true. There are errors in the models' energy budgets, which previous papers have termed 'leaks', but because the models ARE energy-conserving these leaks are generally absorbed by the ocean.
2) Method: this is my biggest issue. The authors split a model control run into 150-year segments and calculate the spurious trends (i.e. drift) for each segment. That's an entirely appropriate way of looking at century-scale internal variability. However, for their Monte Carlo method they then create a parametric Gaussian distribution of the drift, based on the statistical error of each 150-year segment. That uncertainty is then included in their estimates of uncertainty in the overall drift, since they are extrapolating that 150-year estimate (with statistical error) to the full 1100-year control run. This spuriously magnifies the actual uncertainty, because the statistical error in a trend calculated from the full 1100-year series would be much less than that from a 150-year series. Almost any reasonable analysis would use the whole control run, and not a 10% subset, for drift correction.
3) Justification of Method: the use of 150-year estimates rather than the full control is not adequately justified. The only real justification that's given is related to issues with an unknown branch time in forced model experiments. That argument isn't relevant here, where there's an a priori assumption that the drift is a constant, and it's also untrue that branch time is inherently unknowable. It is correct that in CMIP5 some published branch times were incorrect, but the correct ones were able to be inferred and indeed were publicly posted on the CMIP5 errata. I'm not aware of any such issue in CMIP6.
4) As a counterpoint to using time subsets, the authors could still use a Monte Carlo method of dealing with internal variability by calculating the trend across the full control run, and considering the standard error from that estimate.
5) One of the authors' main conclusions is that drift estimated from time integrals has less uncertainty than from a time derivative, because time derivatives are inherently more noisy and so have higher standard error. This is certainly true, but since the use of subset periods spuriously elevates the uncertainty, this effect is magnified. Figures 3b and f show that there really isn't any difference when you use the entire control run (beyond the fact that one is estimated using an average, and the other using least squares optimisation).
In summary, I think the authors need to make a much clearer argument for using temporal subsets rather than the full series, and also need to consider very carefully how uncertainty from a subset should be extrapolated to the whole series.
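The contrast drawn in comments 2) and 4) can be made concrete with a minimal sketch. It is not the authors' MCDC implementation: it uses a synthetic 1100-year control series with a constant drift plus white noise (real control runs are autocorrelated, which would change the numbers), and the names `trend_and_standard_error`, `control`, and `drift_samples` are hypothetical. Under the white-noise assumption the standard error of a least-squares trend scales as n^(-3/2), so a 150-year segment is expected to give a standard error roughly (1100/150)^(3/2), or about 20, times larger than the full series.

```python
import numpy as np


def trend_and_standard_error(series):
    """Ordinary least-squares trend and its standard error (white-noise residuals assumed)."""
    y = np.asarray(series, dtype=float)
    t = np.arange(y.size, dtype=float)
    t_c = t - t.mean()
    slope = np.sum(t_c * (y - y.mean())) / np.sum(t_c**2)
    resid = (y - y.mean()) - slope * t_c
    sigma2 = np.sum(resid**2) / (y.size - 2)  # residual variance
    return slope, np.sqrt(sigma2 / np.sum(t_c**2))


rng = np.random.default_rng(0)

# Synthetic control run: constant drift of 0.01 units/yr plus unit white noise.
control = 0.01 * np.arange(1100) + rng.normal(0.0, 1.0, 1100)

# Trend and standard error from the full control run (comment 4).
slope_full, se_full = trend_and_standard_error(control)

# Trend and standard error from non-overlapping 150-year segments (comment 2).
segment_results = [
    trend_and_standard_error(control[i:i + 150])
    for i in range(0, control.size - 150 + 1, 150)
]
se_segment_mean = np.mean([se for _, se in segment_results])

# Monte Carlo drift samples drawn from the full-run estimate and its standard error.
drift_samples = rng.normal(slope_full, se_full, size=1000)

print(f"full run:  trend={slope_full:.4f}, standard error={se_full:.2e}")
print(f"segments:  mean standard error={se_segment_mean:.2e}")
```

Each entry in `drift_samples` could then be multiplied by the time axis of a forced simulation and subtracted from it, giving an ensemble of drift-corrected series whose spread reflects the full-run standard error rather than the segment-level one.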
 AC1: 'Response to referee comments', Benjamin Grandey, 07 Aug 2023
Data sets
d22amcdc: Analysis Code for "Monte Carlo Drift Correction – Quantifying the Drift Uncertainty of Global Climate Models" Benjamin S. Grandey https://doi.org/10.5281/zenodo.7488335
Viewed
 HTML: 387
 PDF: 114
 XML: 18
 Total: 519
 Supplement: 51
 BibTeX: 6
 EndNote: 7
Cited
1 citation as recorded by Crossref.
Benjamin S. Grandey
Zhi Yang Koh
Dhrubajyoti Samanta
Benjamin P. Horton
Justin Dauwels
Lock Yue Chew