Monte Carlo Drift Correction – Quantifying the Drift Uncertainty of Global Climate Models

Grandey, Benjamin S.; Koh, Zhi Yang; Samanta, Dhrubajyoti; Horton, Benjamin P.; Dauwels, Justin; Chew, Lock Yue

doi:https://doi.org/10.5194/egusphere-2022-1515

Preprints

14 Feb 2023

Abstract. Global climate models are susceptible to drift, causing spurious trends in output variables. Drift is often corrected using data from a control simulation. However, internal climate variability within the control simulation introduces uncertainty to the drift correction process. To quantify this drift uncertainty, we develop a probabilistic technique: Monte Carlo drift correction (MCDC). MCDC involves random sampling of the control time series. We apply MCDC to an ensemble of global climate models from the Coupled Model Intercomparison Project Phase 6 (CMIP6). We find that drift correction partially addresses a problem related to drift: energy non-conservation. Nevertheless, the energy balance of several models remains suspect. We quantify the drift uncertainty of global quantities associated with energy balance and thermal expansion of the ocean. When correcting drift in a cumulatively-integrated energy flux, we find that it is preferable to integrate the flux before correcting the trend: an alternative method would be to correct the bias before integrating the flux, but this alternative method amplifies the drift uncertainty by up to an order of magnitude. We find that drift uncertainty is often smaller than other sources of uncertainty: for thermosteric sea-level rise projections for the 2090s, ensemble-mean drift uncertainty (9 mm) is an order of magnitude smaller than scenario uncertainty (138 mm) and model uncertainty (98 mm). However, drift uncertainty may dominate time series that have weak trends: for historical thermosteric sea-level rise since the 1850s, ensemble-mean drift uncertainty is 15 mm, which is of comparable magnitude to the impact of omitting volcanic forcing in control simulations. Therefore, drift uncertainty may influence comparisons between historical simulations and observation-based estimates of thermosteric sea-level rise. 
When evaluating and analysing global climate model data that are susceptible to drift, researchers should consider drift uncertainty.
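
The random-sampling idea behind MCDC can be illustrated with a minimal sketch. This is not the authors' implementation (their analysis code is archived on Zenodo); it assumes a linear drift model, contiguous 150-step segments, and a synthetic control series made of a constant drift plus white noise. The function names `linear_slope` and `sample_drift_slopes` and all numerical values are hypothetical.

```python
import random
import statistics

def linear_slope(y):
    """Ordinary least-squares slope of y against its index (per time step)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def sample_drift_slopes(control, segment_len, n_samples, rng):
    """Fit a slope to each of n_samples randomly placed contiguous segments."""
    slopes = []
    for _ in range(n_samples):
        start = rng.randrange(0, len(control) - segment_len + 1)
        slopes.append(linear_slope(control[start:start + segment_len]))
    return slopes

# Synthetic control run (assumption): constant drift plus white noise.
rng = random.Random(0)
true_drift = 0.05
control = [true_drift * t + rng.gauss(0.0, 1.0) for t in range(1000)]

slopes = sample_drift_slopes(control, segment_len=150, n_samples=500, rng=rng)
drift_estimate = statistics.mean(slopes)     # central drift estimate
drift_uncertainty = statistics.stdev(slopes) # spread across the samples
```

Each sampled slope could then be subtracted from a forced simulation to give one drift-corrected realisation, so that the spread across realisations reflects the drift uncertainty.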

Received: 28 Dec 2022 – Discussion started: 14 Feb 2023

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint

16 Nov 2023

Monte Carlo drift correction – quantifying the drift uncertainty of global climate models

Benjamin S. Grandey, Zhi Yang Koh, Dhrubajyoti Samanta, Benjamin P. Horton, Justin Dauwels, and Lock Yue Chew

Geosci. Model Dev., 16, 6593–6608, https://doi.org/10.5194/gmd-16-6593-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-1515', Damien Irving, 16 Mar 2023

# General comments
In general, I think this manuscript makes a valuable contribution to the literature. It introduces a concept - internal drift uncertainty arising from internal climate variability within model control simulations - that is typically overlooked by authors working with climate model variables that are prone to drift (i.e. those influenced by the deep ocean). The main result - that drift uncertainty can be relatively large in comparison to forced trends in historical simulations - is important and the authors put forward a useful method (Monte Carlo Drift Correction) for quantifying/checking the size of drift uncertainty.
Some other minor results are also interesting and well worth documenting:
1. The results the authors present regarding how the fraction of excess energy absorbed by the ocean and the expansion efficiency of heat behave in control simulations before and after drift correction also add a little to the existing literature on energy conservation in CMIP models (Hobbs et al 2016; Irving et al 2021).
2. The authors point out that it is preferable to integrate fluxes before correcting the trend (as opposed to correcting the bias before integrating the flux) which is something other papers do (e.g. Irving et al 2019; https://doi.org/10.1029/2019GL082015) but don't necessarily explain why they make that methodological choice.
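
For readers unfamiliar with the two orderings mentioned in point 2, the following is a minimal sketch under simplifying assumptions: unit time steps (so integration is a cumulative sum), a linear drift model, and noise-free synthetic fluxes. The function names are hypothetical and the code is not taken from the manuscript or from Irving et al.

```python
from itertools import accumulate

def ols_slope(y):
    """Ordinary least-squares slope of y against its index."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def integrate_then_correct_trend(forced_flux, control_flux):
    """Integrate both fluxes, then remove the trend of the integrated control."""
    forced_cum = list(accumulate(forced_flux))
    control_cum = list(accumulate(control_flux))
    drift_rate = ols_slope(control_cum)
    return [v - drift_rate * (t + 1) for t, v in enumerate(forced_cum)]

def correct_bias_then_integrate(forced_flux, control_flux):
    """Remove the mean control flux (the bias), then integrate the forced flux."""
    bias = sum(control_flux) / len(control_flux)
    return list(accumulate(f - bias for f in forced_flux))

# Noise-free example (assumption): spurious flux of 0.3 plus a forced signal of 0.1.
control_flux = [0.3] * 200
forced_flux = [0.3 + 0.1] * 50
a = integrate_then_correct_trend(forced_flux, control_flux)
b = correct_bias_then_integrate(forced_flux, control_flux)
```

Without noise the two orderings agree. With internal variability in the control, the drift estimate comes from a least-squares slope in the first case and from a time mean in the second, and the manuscript reports that the first ordering gives the smaller drift uncertainty.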
# Specific comments
The authors acknowledge in the manuscript (line 415 and elsewhere) that a limitation of their study is that they ignore branch time metadata, which could be used to reduce uncertainty by allowing for higher order polynomials to be fitted. I agree with the authors that in some cases the branch time metadata is either not available or incorrect, but more often than not branch time metadata is available/correct and where it isn't it can usually be estimated. For instance, Irving et al (2019; https://doi.org/10.1029/2019GL082015) analyse an ensemble of CMIP models and say the following: "We obtained a drift estimate by fitting a cubic polynomial to the full control time series... The time period in the control simulation that parallels the forced simulation was then identified using the branch time information provided in the file metadata, so that the correct segment of the cubic polynomial could be subtracted from the forced simulation. For models with erroneous metadata, the branch time was estimated via visual inspection of the globally integrated OHC timeseries."
I strongly encourage the authors to follow the lead of Irving et al (2019) by attempting to verify model branch times by plotting a variable such as globally integrated OHC. Since essentially all models have a fairly large drift in globally integrated OHC, if you plot the control and forced experiment time series (using branch time information to line up the respective time axes) it's usually pretty easy to see if the first value of the forced experiment does in fact branch off the control experiment at the time the metadata says it does. If it doesn't, it's usually pretty easy to approximately estimate where the branch point actually is. Following this procedure I'd be surprised if there were many models for which the branch time couldn't be verified as correct or sufficiently estimated. This would allow the authors to overcome some of the main limitations of their study.
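
The procedure quoted from Irving et al (2019) can be sketched as follows. This is an illustration only, assuming NumPy, annual time steps, a smooth and monotonic synthetic OHC-like control series, and a branch point expressed as an index into the control run. The helper `estimate_branch_index` is a crude hypothetical stand-in for the visual inspection described above, not a method from either paper.

```python
import numpy as np

def cubic_drift_correct(control, forced, branch_index):
    """Fit a cubic to the full control series and subtract the segment
    that parallels the forced simulation."""
    t_control = np.arange(len(control))
    coeffs = np.polyfit(t_control, control, deg=3)
    t_parallel = branch_index + np.arange(len(forced))
    return forced - np.polyval(coeffs, t_parallel)

def estimate_branch_index(control, forced_first_value):
    """Control index whose value is closest to the first forced value.
    Only meaningful when the drift is monotonic and large relative to noise."""
    return int(np.argmin(np.abs(control - forced_first_value)))

# Synthetic example (assumption): cubic drift in the control, linear forced signal.
t = np.arange(500.0)
control = 100.0 + 0.02 * t + 1e-5 * t**2 - 1e-8 * t**3
signal = 0.01 * np.arange(150)
forced = control[200:350] + signal

branch_index = estimate_branch_index(control, forced[0])
corrected = cubic_drift_correct(control, forced, branch_index)
```

In this noise-free case the corrected series is the forced signal relative to the control state. With real model output, an estimate obtained this way would still need to be checked by plotting the two series together.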
# Technical corrections
Irving et al "2020" is quoted throughout the paper but the actual publication year of that paper is 2021: https://doi.org/10.1175/JCLI-D-20-0281.1

Citation: https://doi.org/10.5194/egusphere-2022-1515-RC1
RC2: 'Comment on egusphere-2022-1515', Anonymous Referee #2, 19 Jun 2023

The authors build on previous papers that considered model drift in coupled climate models, and in particular the potential role of internal variability in affecting drift corrections. To that end they propose a Monte Carlo method of drift estimation, and compare drift estimates using time derivative/flux vs time-integrals/state variables. The paper is quite clearly written (although I found the paragraph structure - with almost each sentence being a separate paragraph - to be quite jarring), and the figures are mostly clear. However, there are some significant technical/methodological issues that the authors will need to either address or justify before I could recommend publication.
I have attached a marked-up PDF, but my significant comments are:
1) terminology - the authors need to be a little careful claiming that the climate models are not energy conserving, which isn't really true. There are errors in the models' energy budgets - which previous papers have termed 'leaks' - but because the models ARE energy-conserving these leaks are generally absorbed by the ocean.
2) Method - this is my biggest issue. The authors split a model control into 150-year segments and calculate the spurious trends (i.e. drift) for each segment. That's an entirely appropriate way of looking at century-scale internal variability. However, for their Monte Carlo method they then create a parametric Gaussian distribution of the drift, based on the statistical error of each 150-year segment. That uncertainty is then included in their estimates of uncertainty in the overall drift, since they extrapolate that 150-year estimate (with statistical error) to the full 1100-year control run. This spuriously magnifies the actual uncertainty, because the statistical error in a trend calculated from the full 1100-year series would be much less than that from a 150-year series. Almost any reasonable analysis would use the whole control run, and not a 10% subset, for drift correction.
3) Justification of Method - the use of 150-year estimates rather than the full control is not adequately justified. The only real justification that's given is related to issues with an unknown branch time in forced model experiments. That argument isn't relevant here where there's an a priori assumption that the drift is a constant, and it's also untrue that branch time is inherently unknowable. It is correct that in CMIP5 some published branch times were incorrect, but the correct ones were able to be inferred and indeed were publicly posted on the CMIP5 errata. I'm not aware of any such issue in CMIP6.
4) As a counterpoint to using time subsets, the authors could still use a Monte Carlo method of dealing with internal variability by calculating the trend across the full control run, and considering the standard error from that estimate.
5) One of the authors' main conclusions is that drift estimated from time-integrals has less uncertainty than from a time-derivative, because time-derivatives are inherently more noisy and so have higher standard error. This is certainly true, but since the use of subset periods spuriously elevates the uncertainty, this effect is magnified. Figures 3b and f show that there really isn't any difference when you use the entire control run (beyond the fact that one is estimated using an average, and the other using least squares optimisation).
In summary, I think the authors need to make a much clearer argument for using temporal subsets rather than the full series, and also need to consider very carefully how uncertainty from a subset should be extrapolated to the whole series.
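
Points 2 and 4 can be made concrete with a small sketch. It assumes a synthetic 1100-step control series with constant drift and white noise, and the textbook ordinary least-squares standard error, which is only valid for uncorrelated residuals (autocorrelation in real control runs would increase it). The function name and all values are hypothetical.

```python
import math
import random

def ols_trend_with_se(y):
    """Least-squares slope of y against its index, with its standard error."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) / sxx
    intercept = y_mean - slope * t_mean
    rss = sum((v - (intercept + slope * t)) ** 2 for t, v in enumerate(y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

# Synthetic control run (assumption): constant drift plus white noise.
rng = random.Random(1)
control = [0.02 * t + rng.gauss(0.0, 1.0) for t in range(1100)]

slope_full, se_full = ols_trend_with_se(control)       # whole control run
slope_seg, se_seg = ols_trend_with_se(control[:150])   # one 150-step segment

# Monte Carlo drift samples based on the full-run trend and its standard error.
drift_samples = [rng.gauss(slope_full, se_full) for _ in range(1000)]
```

For white noise the standard error of a trend scales roughly as n to the power -1.5, so the 150-step estimate is about 20 times less certain than the 1100-step estimate, which is the magnification the referee describes.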

Citation: https://doi.org/10.5194/egusphere-2022-1515-RC2
AC1: 'Response to referee comments', Benjamin Grandey, 07 Aug 2023

We thank the two referees for their constructive critique of our manuscript. The attached PDF contains a response to each comment.

Citation: https://doi.org/10.5194/egusphere-2022-1515-AC1

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Benjamin Grandey on behalf of the Authors (07 Aug 2023)

ED: Publish subject to minor revisions (review by editor) (05 Sep 2023) by Sergey Gromov

Dear Benjamin S. Grandey and co-authors,

Thank you for submitting the revised version of the manuscript. I sincerely apologise for an exceptionally long review process caused mainly by a great difficulty of finding the reviewers for your study (I have never had more than two dozen declines while editing for GMD) and overcommitment of the latter which added up to the delay.

I am generally satisfied with the review process and your replies to the reviewers’ comments. Should only the latter have been addressed, I would be happy to continue with the publication “as is”. However, I notice considerable changes in the methodology (i.e., the “agnostic MCDC”) introduced which would normally trigger me to send the manuscript out for another round of reviews, which I would like to spare us from by offering myself for a round of discussion (luckily we are allowed to do that in GMD).

The reason here is that I see the same criticism as was earlier brought by both reviewers regarding the “mixed” use of estimates of short-term drift samples. In the agnostic MCDC – by combining linear, quadratic and cubic fits in one MC statistic – you may spuriously increase the final uncertainty estimate, as obviously one or two of the fit models are inferior. Why not test each of the three models separately and select the one that yields the best fit (for a given ESM)? Combining the three also makes little physical sense – would you not expect the underlying process to be a mere linear, quadratic or cubic function of time (through whatever, perhaps unidentified, reason in the model build)?
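
One way to test the three fit models separately is sketched below. Because a higher-degree polynomial never fits worse in a least-squares sense, the sketch ranks the models with the Bayesian information criterion (BIC) rather than the raw residuals. The choice of BIC, the synthetic quadratic drift and the function name are assumptions made for illustration, not something proposed in the editor's comment or the manuscript.

```python
import numpy as np

def bic_for_degree(t, y, deg):
    """BIC of a least-squares polynomial fit, assuming Gaussian residuals."""
    coeffs = np.polyfit(t, y, deg)
    rss = float(np.sum((y - np.polyval(coeffs, t)) ** 2))
    n, k = len(y), deg + 1
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic control series (assumption): quadratic drift plus white noise.
rng = np.random.default_rng(0)
t = np.arange(500.0)
y = 0.01 * t + 2e-5 * t**2 + rng.normal(0.0, 0.5, t.size)

bics = {deg: bic_for_degree(t, y, deg) for deg in (1, 2, 3)}
best_degree = min(bics, key=bics.get)  # lowest BIC is preferred
```

The selected degree would then be used on its own for that model's drift correction, instead of pooling all three fits into one statistic.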

I believe the point of combining the three fit models in one statistic has to be justified in the revised manuscript. Alternatively, using the best-fitting model will not require such justification. Lastly, why are only the quadratic and cubic fits considered as alternatives to the linear one? As the underlying functional relation for drift temporal evolution is not known, you could use a general “exponential” form (e.g., c+a*t^p, with c, a and p being the fitting parameters) which will reduce the estimates to the two general cases: classic linear (p=1) and non-linear (p<>1, c representing whatever accumulating hidden unbalanced component of the system prior to the branch time). In my view, this would be the most sensible way to study whatever non-linear option for the estimate, whilst keeping the “traditional” (read: expected from the first-principle ΔH-ΔE-ΔZ relation) option for comparison as well.
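
A fit of the form c+a*t^p could be sketched as below. For simplicity the sketch scans a grid of candidate exponents p and solves for c and a by linear least squares at each one; this grid search is a substitute chosen for illustration, not a method named in the comment, and a nonlinear optimiser would be the more usual choice. The function name, grid and noise-free synthetic series are assumptions.

```python
import numpy as np

def fit_power_law(t, y, p_grid):
    """Return (c, a, p) minimising the residual sum of squares of c + a * t**p,
    scanning p over p_grid and solving for c and a by linear least squares."""
    best = None
    for p in p_grid:
        design = np.column_stack([np.ones_like(t), t ** p])
        (c, a), *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - (c + a * t ** p)) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(c), float(a), float(p))
    return best[1], best[2], best[3]

# Noise-free synthetic drift (assumption): c = 2.0, a = 0.5, p = 0.5.
t = np.arange(1.0, 501.0)
y = 2.0 + 0.5 * t ** 0.5

p_grid = np.arange(0.25, 3.01, 0.25)  # includes the linear case p = 1
c_hat, a_hat, p_hat = fit_power_law(t, y, p_grid)
```

Comparing the residuals at p=1 with those at the best-fitting p gives the linear versus non-linear comparison described in the comment.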

In addition to this general comment, I have outlined a few specific ones below.

I am looking forward to your reply and the revised version of the manuscript.

Sincerely yours,
S. Gromov

Specific comments

L115 Consider revising the sentence (wordiness)
LL200-201 Looking at Fig.3, I note that the agnostic-method drift uncertainty is “approximately twice as large as the ensemble median” for more than one model
L405 Are there no uncertainty estimates available for η at all?
L438 I understand that XX will be replaced by a given no. in the future?
Fig.2, left and centre columns: What is the essence of presenting the fit at times prior to the branch time? I am not sure that this anyhow quantifies anything sensibly
Fig.2, caption: Please use “plotted alongside the uncorrected control time series” only once, in the sentence preceding explication of panels (a), (b) and (c)

AR by Benjamin Grandey on behalf of the Authors (18 Sep 2023)

ED: Publish as is (03 Oct 2023) by Sergey Gromov

AR by Benjamin Grandey on behalf of the Authors (04 Oct 2023)

Supplement

https://doi.org/10.5194/egusphere-2022-1515-supplement

Data sets

d22a-mcdc: Analysis Code for "Monte Carlo Drift Correction – Quantifying the Drift Uncertainty of Global Climate Models" Benjamin S. Grandey https://doi.org/10.5281/zenodo.7488335

Viewed

Total article views: 519 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
387	114	18	519	51	6	7

Views and downloads (calculated since 14 Feb 2023)

Month	HTML	PDF	XML	Total
Feb 2023	118	31	5	154
Mar 2023	90	13	4	107
Apr 2023	44	8	1	53
May 2023	19	3	0	22
Jun 2023	22	10	3	35
Jul 2023	21	11	1	33
Aug 2023	25	11	2	38
Sep 2023	21	10	0	31
Oct 2023	20	10	1	31
Nov 2023	7	7	1	15
Dec 2023	0
Jan 2024	0

Viewed (geographical distribution)

Total article views: 508 (including HTML, PDF, and XML) Thereof 508 with geography defined and 0 with unknown origin.

Cited

Latest update: 26 Jan 2024

Short summary

Global climate models are susceptible to spurious trends known as drift. Fortunately, drift can be corrected when analysing data produced by the models. To explore the uncertainty associated with drift correction, we develop a new method: Monte Carlo drift correction. For historical simulations of thermosteric sea-level rise, drift uncertainty is relatively large. When analysing data susceptible to drift, researchers should consider drift uncertainty.


1 citation as recorded by Crossref.