Standardising the "Gregory method" for calculating equilibrium climate sensitivity
Abstract. The equilibrium climate sensitivity (ECS) – the equilibrium global mean temperature response to a doubling of atmospheric CO2 – is a high-profile metric for quantifying the Earth system’s response to human-induced climate change. A widely applied approach to estimating the ECS is the ‘Gregory method’ (Gregory et al., 2004), which uses an ordinary least squares (OLS) regression between the net radiative flux and surface air temperature anomalies from a 150-year experiment in which atmospheric CO2 concentrations are quadrupled. The ECS is determined at the point where the net radiative flux reaches zero, i.e. where the system is back in equilibrium. This method has been used to compare ECS estimates across the CMIP5 and CMIP6 ensembles and will likely be a key diagnostic for CMIP7. Despite its widespread application, there is little consistency or transparency between studies in how the climate model data is processed prior to the regression, leading to potential discrepancies in ECS estimates. We identify 20 alternative data processing pathways, which differ in their choices of global mean weighting, annual mean weighting, anomaly calculation method, and linear regression fit. Using 41 CMIP6 models, we systematically assess the impact of these choices on ECS estimates. While the inter-model ECS range is insensitive to the data processing pathway, individual models exhibit notable differences. Approximating a model’s native grid cell area with the cosine of the latitude can decrease the ECS by 11 %, and some anomaly calculation methods can introduce spurious temporal correlations in the processed data. Beyond data processing choices, we also evaluate an alternative linear regression method – total least squares (TLS) – which appears to have a more statistically robust basis than OLS. However, for consistency with previous literature, and given that physical reasoning suggests that TLS may further reduce the ECS compared to OLS, i.e. worsen a known bias in the Gregory method, we do not feel there is sufficient clarity to recommend a transition to TLS in all cases. To improve reproducibility and comparability in future studies, we recommend a standardised Gregory method: weighting the global mean by cell area, weighting the annual mean by the number of days per month, and calculating anomalies by first applying a rolling average to the piControl time series and then subtracting it from the CO2 quadrupling experiment. This approach implicitly accounts for model drift while reducing noise in the data to best meet the pre-conditions of the linear regression. While CMIP6 results for the multi-model mean ECS appear robust to these processing choices, similar assumptions may not hold for CMIP7, underscoring the need for standardised data preparation in future climate sensitivity assessments.
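To make the processing steps named in the abstract concrete, the following is a minimal Python sketch of one possible pathway: area-weighted global mean, day-weighted annual mean, anomalies relative to a rolling-mean piControl, and a linear fit by either OLS or TLS. It is an illustration under stated assumptions, not the code used in this study. The function names, the 21-year rolling window, the no-leap calendar, and the synthetic data are all hypothetical choices made for the example. The sketch also assumes the common convention of halving the temperature intercept of the quadrupling experiment to obtain a per-doubling value.

```python
import numpy as np

# Assumption: a no-leap (365-day) calendar; real model calendars differ.
DAYS_PER_MONTH = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])


def global_mean(field, cell_area):
    """Area-weighted mean over the last two (lat, lon) axes."""
    field = np.asarray(field, dtype=float)
    cell_area = np.asarray(cell_area, dtype=float)
    return (field * cell_area).sum(axis=(-2, -1)) / cell_area.sum()


def annual_mean(monthly):
    """Day-weighted annual means of a 1-D monthly series (length a multiple of 12)."""
    years = np.asarray(monthly, dtype=float).reshape(-1, 12)
    return years @ DAYS_PER_MONTH / DAYS_PER_MONTH.sum()


def rolling_mean(series, window=21):
    """Centred rolling mean; near the ends the available shorter window is used."""
    series = np.asarray(series, dtype=float)
    half = window // 2
    return np.array([series[max(0, i - half): i + half + 1].mean()
                     for i in range(series.size)])


def anomaly(abrupt_annual, picontrol_annual, window=21):
    """Subtract the smoothed, time-aligned piControl from the quadrupling run."""
    return np.asarray(abrupt_annual, dtype=float) - rolling_mean(picontrol_annual, window)


def fit_line(x, y, method="ols"):
    """Return (slope, intercept) of y against x using OLS or TLS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if method == "ols":
        slope, intercept = np.polyfit(x, y, 1)
    elif method == "tls":
        # TLS via SVD of the centred data: the right singular vector with the
        # smallest singular value is normal to the best-fit line.
        # Note: TLS results depend on the relative scaling of x and y.
        centred = np.column_stack([x - x.mean(), y - y.mean()])
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        a, b = vt[-1]
        slope = -a / b
        intercept = y.mean() - slope * x.mean()
    else:
        raise ValueError(f"unknown method: {method}")
    return slope, intercept


def gregory_ecs(delta_t, delta_n, method="ols"):
    """Fit N = F + lambda * T and return (ECS, lambda, F).

    The temperature at N = 0 is halved to convert the 4xCO2 response
    into a per-doubling estimate (an assumed convention in this sketch).
    """
    slope, intercept = fit_line(delta_t, delta_n, method)
    return -intercept / slope / 2.0, slope, intercept


# Synthetic, noise-free anomalies purely for illustration (not model output).
years = np.arange(1, 151)
delta_t = 6.0 * (1.0 - np.exp(-years / 40.0))  # K
delta_n = 7.4 - 1.0 * delta_t                  # W m-2
ecs_ols, lam, forcing = gregory_ecs(delta_t, delta_n, method="ols")
ecs_tls, _, _ = gregory_ecs(delta_t, delta_n, method="tls")
print(ecs_ols, ecs_tls)
```

Because the synthetic data lie exactly on a line, OLS and TLS agree here; with noisy model output the two fits generally differ, which is the sensitivity the abstract refers to.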