the Creative Commons Attribution 4.0 License.
Constraining net long-term climate feedback from satellite-observed internal variability possible by mid-2030s
Abstract. Observing climate feedbacks to long-term global warming, crucial climate regulators, is not feasible within the observational record. However, linking them to top-of-the-atmosphere flux variations in response to surface temperature fluctuations (internal variability feedbacks) is a viable approach. Here, we explore the use of this method of relating internal variability to forced climate feedbacks in models and applying the resulting relationship to observations to constrain forced climate feedbacks. Our findings reveal strong longwave and shortwave feedback relationships in models during the 14-year overlap with the CERES observational record. Yet, due to the weaker relationship between internal variability and forced climate longwave feedbacks, the net feedback relationship remains weak, even over longer periods extending beyond the CERES record. However, after about half a century, this relationship strengthens primarily due to a reinforcement of the relationship between internal variability and forced climate shortwave feedbacks. We therefore explore merging the satellite records with reanalysis to establish an extended data record. The resulting constraint suggests a stronger negative forced climate net feedback than the model's distribution and an equilibrium climate sensitivity of about 2.5 K (2.14 K to 3.07 K, 5–95 % confidence intervals). Nevertheless, for example biogeochemical climate feedbacks, inactive on short time scales, and also not represented in most models, may lead to climate sensitivity being underestimated by this method. Also, continuous satellite observations until at least the mid-2030s are necessary for using a purely observed estimate of the net internal variability feedback in constraining net forced climate feedback and, consequently, climate sensitivity.
Status: final response (author comments only)

RC1: 'Comment', Anonymous Referee #1, 04 Jun 2024
Using CMIP6 model simulations, the authors derive an emergent constraint that relates feedback from internal variability (IV) to forced feedback. They show that there are statistically significant relationships across models between components of IV feedback and forced feedback: a strong relationship for SW, a weaker relationship for LW, and an even weaker, but still significant and meaningful relationship for the net feedback. Using this relationship and combining it with observed internal variability, they show that more observations are needed in order to use this finding to actually constrain ECS. As an alternative to waiting for more satellite data, the authors extend the satellite record back in time by applying a correction model to reanalysis radiative fluxes, and thus constrain ECS to 2.5 K [90 % CI: 2.14 – 3.07 K], which is lower than current estimates from models, the IPCC, or Sherwood et al. 2020. Future satellite observations could be integrated into their method to update the estimates, and the authors quantify the quality of the constraint as a function of the number of observed years.
The paper is well-written, well-presented, clearly states its goal and provides evidence to support the claims, guiding the reader through the argumentation. The statistical methods are sound and used in an appropriate way. While I do have a long list of comments and questions, I want to stress that I enjoyed reading the paper and consider it a beneficial addition to the research on feedbacks and climate sensitivity. I have one main point to raise in criticism of this paper, which I will present in the following.
My main comment can be summarized as “What about the pattern effect?”
From the methods section, I understand that the feedback parameters are calculated as differential feedback parameters (referring to Rugenstein and Armour 2021, https://doi.org/10.1029/2021GL092983, please confirm if this interpretation is correct). All feedback parameters are estimated as the slope of N(T). For λ_ab, which time period is used for the regression? The full 150 years? We know that λ_ab changes considerably over time, both over the 150-year period (which is accounted for if the full 150 years are used for the regression), but also after this (e.g. Rugenstein et al. 2020, https://doi.org/10.1029/2019GL083898). According to that paper, ECS estimated from the 150-year span is an underestimate of the true ECS by 17 % in models. Would this affect the ECS estimate that the paper gives?
Further uncertainties may arise when leaving the model world. The historical simulations, which are used to compute λ_it, are not capable of reproducing the observed SST patterns (e.g. Wills et al. 2022, https://doi.org/10.1029/2022GL100011). It is currently debated if the observed pattern of strong Western Pacific warming will continue or switch to stronger warming in the Eastern Pacific. This uncertainty implies enormous uncertainty for ECS (Alessi and Rugenstein 2023, https://doi.org/10.1029/2023GL105795). The point that I’m trying to make with these explanations is that it may very well be that the connection between λ_it and λ_ab is very different in the real world and models. While models produce El Niño-like patterns both in the present and future, the real world has warmed more La Niña-like until now, and we don’t know how it will continue. Since these patterns are tightly linked to λ_ab, the model results may not be applicable to the real world. This would be a major problem for the emergent constraint that the paper develops, because an implicit assumption of the emergent constraint approach is that the statistical relationship that is found in the models is applicable to reality.
I would like to ask the authors to discuss this uncertainty. In particular, do you think it affects the ECS range that is determined? If yes, how? If no, why not? If the authors agree that this could add substantial uncertainty, I propose mentioning this also in the last part of the abstract, which currently suggests that all uncertainties (except for the biogeochemical feedback) are accounted for in the 5 – 95 % CI.
In addition, I have other comments:

l. 12, 21, 335 – 337: What biogeochemical processes does this refer to? Can you specify? I wonder if they are relevant for ECS, as the carbon cycle does not matter for this concept of fixed CO2 concentration, and vegetation changes are not included in the definition of ECS.

l. 82 paragraph: As mentioned before, please state which years are used for the regression of λ_ab;

l. 83 – 84: Is there a particular reason for subtracting the control state? I wonder, because a constant shouldn’t affect the slope estimate. It wouldn’t hurt the calculation, but I’m curious.

l. 125: It is not immediately clear to me what was done here by “randomly permuting”. Were the R and T time series randomly matched (e.g. R from model 1 realization 1 and T from model 2 realization 1), and were the feedback parameters subsequently computed from these randomly matched time series? Am I right in assuming that only complete time series were permuted, not individual values in the time series?

Fig. 2: I am not sure that Fig. 2 is really needed. To me as a reader, the only relevant information is the likelihood of obtaining the correlations by chance, which is mentioned in the text; the full distribution is not so interesting, and the differences between the blue, red, and black lines are anyway hard to grasp. While I take no issue with this figure, I believe that it could be removed without loss of information; however, I would like to see the likelihood to obtain the correlations for the net feedback parameter by chance in the text, I only found this information for LW and SW

l. 150: Given that the first term is 0.43 and the last one is 0.72, does that mean that the internal SW feedback outperforms the internal LW feedback as a predictor for the forced LW feedback (by having a strong anticorrelation)? I find that interesting.

l. 174 – 175: So if the SW is the strongest contributor, that means that it comes down to clouds (unsurprisingly). Do you think the poor model representation of clouds is a problem for that?

l. 182: Models have no measurement uncertainty, but EBAF does. Is the uncertainty that arises from the satellite measurements (and also from the temperature data, but I assume that will be less important) taken into account? Would it affect the estimate of ECS or is it too small to make a difference? When combining the measurements from CERES and ERBE, is it problematic that the satellite changes, e.g., are there inconsistencies or steps?

l. 187: The values are almost all well below 1 %. Doesn’t that mean that fewer years might also be enough, if we think that, e.g., 5 % would be sufficient?

Fig. 3 caption: Unclear what is meant by “n – 2014”, what is n here? Should I read it as “n to 2014” or “n minus 2014”?

l. 194 – 195: The suggested approach here is to wait for new satellite observations, but by then we will also have longer historical simulations. Can’t we just run your analysis on the historical simulations again in 14 years, circumventing the whole problem of using the emergent relationship from one period with observations from another? It’s still an interesting question to ask, but I don’t see the practical necessity to use the “old” emergent relationship 14 years from now

l. 206 – 216: This seems to be in disagreement with the results of Fig. 4 (d). In Fig. 4 (d) you show that when taking at least 40 years, it doesn’t matter which period one picks, λ_it will always be the same. So λ_it does not depend on the chosen period if the period is long enough. λ_ab obviously doesn’t depend on the chosen period either. So how can the relationship between λ_it and λ_ab depend on the chosen period (that’s what I read from Fig. 4 a and b)? I have a hard time reconciling this. In addition, Gregory and Andrews 2016 (https://doi.org/10.1002/2016GL068406) show that historical feedback has varied quite a bit, although they use shorter than 40-year periods for their regression.

Fig. 4 (a) and (b). How can the starting year be 1980 and higher for 51-year periods?

Does it surprise you that the relationship between λ_it and λ_ab varies strongly in time?

l. 250 – 252 and Fig. 5 (a): +/- 2 W/m^2 seems not negligible compared to interannual variability of global-mean TOA flux, which I would expect to vary by less than 10 W/m^2. How can it be that the correlation with CERES-ERBE is still so high (0.99)? It means that 98 % of the variance of the ERA5 feedback parameter is explained by CERES-ERBE, so only 2 % is left for the error, which seems low given that the error gets up to +/- 2 W/m^2.

l. 277 – 279: I don’t understand the method here. A probability density function of which quantity? What values are sampled from this distribution? I had expected one value for λ_it from ERA5, obtained from regressing over the 40year period, not a whole distribution. What am I missing? This seems like a central point of the paper and maybe deserves another sentence or two to clarify the method.

Is there a reason for presenting the results from this analysis as small insets in Fig. 1? It seems like one of the main outcomes of this paper is hidden in a small inset. If showing it in Fig. 1, I would prefer the y-axes of the main plot and the inset to be aligned.

l. 296 – 301: The list of limitations seems short. In addition to my questions about the pattern effect potentially limiting the results of this study, I think it may be beneficial to discuss further limitations. In particular, the emergent relationship is obtained from model simulations using models, hoping that this relationship would translate to the real world. However, most models that contribute to this relationship simulate λ_it values way outside the observed range (see Fig. 1 f). Could this limit the results?
Minor comments:

l. 72 – 75: the half-sentence “incorporating a more extensive…” appears twice

l. 161: The use of the word “assuming” makes sense here, but made me stumble, because it sounds like it’s a prerequisite to run the hypothesis, when it’s actually rather the null hypothesis; “testing for” or something similar would have been clearer to me
Citation: https://doi.org/10.5194/egusphere-2024-1559-RC1

CC1: 'Comment on egusphere-2024-1559: Regression dilution and use of an inappropriate CO2 forcing measure greatly inflate estimated ECS', Nicholas Lewis, 28 Jun 2024
This is an interesting and useful study. However, I see two significant technical shortcomings in the authors' derivation of their emergent-constraint-based estimates of net long-term climate feedback and of equilibrium climate sensitivity (ECS). The first of these two shortcomings biases the climate feedback estimate, weakening its central value by 32%, and together the two shortcomings bias the ECS estimate upwards by some 70%.
1. The authors derive an emergent constraint on net long-term climate feedback (λ_{ab}) with a median value of –1.56 Wm^{−2}K^{−1} from a linear regression fit between net internal variability feedback (λ_{it}) and λ_{ab}, as shown in Figure 1(f). I cannot see that the regression method used for this purpose is explicitly stated, but it appears to be standard ordinary least squares (OLS) regression of λ_{ab} on λ_{it}. Such OLS regression, using data points and CERES-ERBE-ERA5 (observational) λ_{it} of –1.28 Wm^{−2}K^{−1} digitized from Figure 1(f), yields a –1.56 Wm^{−2}K^{−1} central estimate for λ_{ab}, identical to that given in line 287.
OLS regression assumes that the regressor variable is error-free; if it is not then the regression slope will be biased towards zero ("regression dilution"). There is little uncertainty in the regressee variable, λ_{ab}, due to the high level of effective radiative forcing (ERF), and hence large changes in planetary net radiative balance (N) and surface temperature anomaly (ΔT), involved in abrupt-4xCO2 simulations by atmosphere-ocean global climate models (GCMs). However, there is significant uncertainty in the regressor variable, λ_{it}, as shown by the horizontal error bars in Figure 1(f). Hence OLS regression of λ_{ab} on λ_{it} is unsuitable and will give a slope estimate biased towards zero.
However, as there is little uncertainty in λ_{ab}, OLS regression will give an almost unbiased estimation if the regressor and regressee variables are switched, with λ_{it} regressed on λ_{ab}.
Doing so gives a regression fit estimate of λ_{it} = 0.187 + 0.637 λ_{ab}, which on rearranging implies λ_{ab} = 1.570 λ_{it} – 0.294. The central estimate of λ_{ab}, based on the observed λ_{it} estimate of –1.28 Wm^{−2}K^{−1}, is then –2.30 Wm^{−2}K^{−1}.
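The rearrangement can be checked with a few lines of arithmetic. The sketch below only uses the coefficients and the observed λ_{it} quoted in this comment; the variable names are illustrative.

```python
# Fit quoted in the comment: lambda_it = a + b * lambda_ab
a, b = 0.187, 0.637

# Rearranged: lambda_ab = (lambda_it - a) / b = slope * lambda_it + intercept
slope = 1.0 / b
intercept = -a / b

# Observational lambda_it quoted in the comment, in W m^-2 K^-1
lambda_it_obs = -1.28
lambda_ab_est = slope * lambda_it_obs + intercept

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"lambda_ab = {lambda_ab_est:.2f} W m^-2 K^-1")
```

To the quoted precision this reproduces the slope of 1.570, the intercept of –0.294 and the central estimate of –2.30 Wm^{−2}K^{−1}.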
In summary, the authors should use λ_{ab} rather than λ_{it} as the regressor, in order to avoid significant bias in the estimated linear fit between them, and should adopt the resulting observationally constrained λ_{ab} estimate of –2.30 Wm^{−2}K^{−1} in place of their –1.56 Wm^{−2}K^{−1} estimate, which is seriously biased by regression dilution.
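The regression dilution effect described in point 1 can be illustrated with synthetic data. The following is a minimal sketch, not the authors' or the commenter's calculation: the true slope, the error magnitudes and the sample size are all assumed values chosen only to make the effect visible, and the sample is far larger than any real model ensemble.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ensemble with a known linear relation y = true_slope * x_true.
# All numbers here are illustrative assumptions, not CMIP6 values.
n_models = 5000
true_slope = 1.5
x_true = rng.normal(-1.0, 0.5, n_models)
y = true_slope * x_true + rng.normal(0.0, 0.05, n_models)  # small error in y
x_obs = x_true + rng.normal(0.0, 0.5, n_models)            # large error in x

# Direct OLS of y on the noisy regressor: slope is attenuated toward zero
slope_direct = np.polyfit(x_obs, y, 1)[0]

# Reverse OLS (noisy x regressed on nearly error-free y), then inverted
slope_reverse = 1.0 / np.polyfit(y, x_obs, 1)[0]

print(f"true slope     = {true_slope}")
print(f"direct OLS     = {slope_direct:.2f}")
print(f"inverted fit   = {slope_reverse:.2f}")
```

With equal variance in the true regressor and in its error, the direct slope is expected to shrink to roughly half the true value, while the inverted reverse fit stays close to it because the error in y is small.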
2. The standard estimate of a GCM's ECS corresponds to the ΔT at which N = 0 when extending an OLS regression linear fit of annual mean N on ΔT over 150 years of its abrupt-4xCO2 simulation. That ΔT mathematically equals minus the slope of the regression fit line, –λ_{ab}, divided into the value of N where the fit line intersects the ΔT = 0 axis (F_{4x_reg150}). It follows that the standard estimate is ECS = F_{4x_reg150} / –λ_{ab}.
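The regression described here can be sketched on synthetic data. The example assumes a perfectly linear response with made-up forcing, feedback and noise values, so it shows only the mechanics of the fit (slope, intercept, and the ΔT at which N = 0), not the curvature found in real abrupt-4xCO2 simulations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical linear response N = F + lam * dT; values are assumed, not from any model
F_true, lam_true = 7.0, -1.0          # W m^-2 and W m^-2 K^-1
dT = np.linspace(1.0, 6.5, 150)       # 150 "annual means" of warming, in K
N = F_true + lam_true * dT + rng.normal(0.0, 0.3, dT.size)

# OLS fit of N on dT: the slope plays the role of lambda_ab,
# the intercept at dT = 0 plays the role of F_4x_reg150
lam_fit, F_fit = np.polyfit(dT, N, 1)

# dT at which the fit line reaches N = 0 (4xCO2 value, before scaling to 2xCO2)
dT_at_N0 = F_fit / -lam_fit

print(f"slope = {lam_fit:.2f}, intercept = {F_fit:.2f}, dT(N=0) = {dT_at_N0:.2f} K")
```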
For almost all GCMs, F_{4x_reg150} is significantly lower than the actual ERF from quadrupled CO_{2}, as estimated from fixed SST simulations with a correction for land surface warming (F_{4x_SSTts}). The reason for this is simple. Net feedback is generally higher in the early part of 150-year abrupt-4xCO2 simulations than in the much more numerous subsequent years, which have a dominant influence on linear estimation using OLS. As a result, in the earliest part of the N versus ΔT plot the regression fit lies below the actual N values, most significantly when ΔT = 0 at the start (taking F_{4x_SSTts} as the best estimate of the actual value of N at that point). As the standard estimate of ECS in GCMs (before scaling from 4x to 2x CO_{2}) is F_{4x_reg150} / –λ_{ab}, ECS estimated as F_{4x_SSTts} / –λ_{ab} will be biased upwards.
It follows that deriving ECS, as the authors do for their median ECS estimate of 2.5 K, by dividing an observationally constrained GCM-based estimate of –λ_{ab} into estimated F_{2x_SSTts} will significantly overestimate ECS. This point is illustrated and more fully explained in sections 4.1 and S1 of Lewis (2023). Although using F_{2x_SSTts} rather than F_{2x_reg150} as the numerator when using estimated λ_{ab} in the denominator is not uncommon when estimating ECS, doing so is unjustifiable: it is mathematically incorrect and causes significant overestimation of ECS.
On average, F_{4x_reg150} is 16% lower than F_{4x_SSTts} in the 17 CMIP6 models for which Smith et al (2020) were able to derive F_{4x_SSTts} (see their Table S1 ERFreg150 and ERF_ts values). Those 17 models have an average F_{4x_SSTts} of 8.41 Wm^{−2}. This should be converted to a value for a doubling of CO_{2}, F_{2x_SSTts}, by dividing F_{4x_SSTts} by 2.10, per the formula in Meinshausen et al. (2020) that was adopted in IPCC AR6, rather than using the popular but inaccurate method of simply halving the 4x CO2 ERF. Doing so gives a F_{2x_SSTts} value of 4.01 Wm^{−2} for the Smith et al (2020) CMIP6 mean. That is close to the 3.93 Wm^{−2} value of F_{2x_SSTts} derived in AR6 and used in the manuscript. By contrast, F_{2x_reg150}, the similarly converted value of F_{4x_reg150}, is only 3.37 Wm^{−2} for the 17 Smith et al (2020) models – almost identical to the 3.35 Wm^{−2} average that I calculate for a larger set of 30 CMIP6 models.
The 2.10 ratio of quadrupled to doubled CO_{2} ERF is derived using detailed line-by-line radiation code, not simplified GCM radiation code, but the radiation code in CMIP6 models should be more accurate than that in earlier GCM generations. Moreover, I compute a similar (marginally higher) average 4x to 2x CO_{2} ERF ratio for the five GCMs analysed in Rugenstein et al (2020) with data from both abrupt-4xCO2 and abrupt-2xCO2 simulations, with the ratio being 2.10 for the only CMIP6 GCM included (CNRM-CM6). (That is based on estimating ERF by regression of N on ΔT over the first ten years after the abrupt CO_{2} increase, which provides a reasonable proxy for F_{4x_SSTts} in the Smith et al (2020) set of abrupt-4xCO2 simulations.)
It follows that the authors should revise their ECS estimation formula to ECS = (F_{4x_reg150} / 2.10) / –λ_{ab}, using the average regression-derived F_{4x_reg150} for the set of CMIP6 models used to constrain λ_{ab}. If that F_{2x_reg150} were the same as the 3.35 Wm^{−2} that I calculated for 30 CMIP6 models, then the revised median ECS estimate, based on the corrected central λ_{ab} estimate of –2.30 Wm^{−2}K^{−1}, would be 3.35 / 2.30 = 1.46 K. If the IPCC AR6 assessment of ECS is correct, then such a low ECS estimate may be considered unlikely to be accurate. If so, the correct, and important, conclusion to draw would then be that the relationship between λ_{ab} and λ_{it} in CMIP6 models does not provide a reliable emergent constraint on ECS.
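The headline figures in this comment can be reproduced from the values it quotes. The short sketch below uses only those quoted numbers; the variable names are illustrative, and the "manuscript" ECS is simply the quoted forcing divided by the quoted feedback.

```python
# Values quoted in the manuscript and in this comment
F2x_SSTts = 3.93    # W m^-2, AR6 value used in the manuscript
F2x_reg150 = 3.35   # W m^-2, commenter's 30-model CMIP6 average
lam_ols = -1.56     # W m^-2 K^-1, manuscript's constrained lambda_ab
lam_rev = -2.30     # W m^-2 K^-1, lambda_ab from the reversed regression

ecs_manuscript = F2x_SSTts / -lam_ols   # about 2.5 K
ecs_revised = F2x_reg150 / -lam_rev     # about 1.46 K

# Relative weakening of the feedback estimate and inflation of ECS
feedback_weakening = 1.0 - lam_ols / lam_rev
ecs_inflation = ecs_manuscript / ecs_revised - 1.0

print(f"ECS (manuscript inputs) = {ecs_manuscript:.2f} K")
print(f"ECS (revised inputs)    = {ecs_revised:.2f} K")
print(f"feedback weakened by {feedback_weakening:.0%}, ECS inflated by {ecs_inflation:.0%}")
```

These ratios are consistent with the 32% and "some 70%" figures stated at the start of the comment.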
The relationship of "true" ECS to that derived from regression over 150 years after a CO_{2} increase
Linear regression of N on ΔT over the first 150 years of an abrupt-4xCO2 (or abrupt-2xCO2) simulation, with ECS taken as the N = 0 intercept of the fit (ECS_{reg150}), is the standard method for estimating the ECS of GCMs. Moreover, it is usual for observationally constrained non-paleoclimate ECS studies to estimate that or another effective climate sensitivity measure. But, as noted in the Comment by Anonymous Referee #1, in GCMs the actual (true) ECS, as estimated from ultra-long abrupt CO_{2} increase forced simulations, generally exceeds ECS_{reg150}. However, the 17% mean excess (for abrupt-4xCO2 simulations) stated in the paper they cited, Rugenstein et al (2020), includes the FAMOUS model, which appears to be near to runaway at quadrupled CO_{2} – it warms almost four times as much as for doubled CO_{2}. On a forcing-adjusted basis (dividing abrupt-4xCO2 warming by 2.10), its 4x to 2x CO2 ECS ratio is 1.86, while for all the other models with both simulations that ratio lies in the range 1.00 to 1.10.
The average excess of estimated actual ECS over ECS_{reg150} in the Rugenstein et al (2020) models excluding the outlier FAMOUS abrupt-4xCO2 simulation is 13.6%. The average ratio for the superset of those models included in Dunne et al (2020), ex FAMOUS, is almost identical.
Moreover, if true equilibrium ECS is to be estimated, it should logically be from ultra-long abrupt-2xCO2 simulations, as the definition of ECS is for a doubling, not a quadrupling, of preindustrial carbon dioxide concentration. The average ratio, for the Rugenstein et al (2020) models, of estimated true ECS for 2x CO2 to ECS_{reg150} derived by dividing abrupt-4xCO2 data by 2.10, is slightly lower at 1.11x (or 1.06x when including FAMOUS).
Nicholas Lewis
Independent climate scientist
References
Dunne, JP et al (2020): Comparison of equilibrium climate sensitivity estimates from slab ocean, 150-year, and longer simulations. Geophys Res Lett, 47, e2020GL088852. https://doi.org/10.1029/2020GL088852
Lewis, N (2023): Objectively combining climate sensitivity evidence. Climate Dynamics, 60, 3139–3165. https://doi.org/10.1007/s00382-022-06468-x
Meinshausen, M et al (2020): The shared socio-economic pathway (SSP) greenhouse gas concentrations and their extensions to 2500. Geosci Model Dev, 13, 3571–3605. https://doi.org/10.5194/gmd-13-3571-2020
Rugenstein, M et al (2020): Equilibrium climate sensitivity estimated by equilibrating climate models. Geophys Res Lett, 47, e2019GL083898. https://doi.org/10.1029/2019GL083898
Smith, CJ et al (2020): Effective radiative forcing and adjustments in CMIP6 models. Atmos Chem Phys, 20, 9591–9618. https://doi.org/10.5194/acp-20-9591-2020
Citation: https://doi.org/10.5194/egusphere-2024-1559-CC1
RC2: 'Comment on egusphere-2024-1559', Anonymous Referee #2, 01 Jul 2024
The authors investigate the relationship between internal variability feedbacks and forced climate feedbacks across a range of CMIP6 models. They explore the feasibility of using this relationship, along with observed internal variability feedback estimates derived using CERES, to establish an emergent constraint on Equilibrium Climate Sensitivity (ECS). The authors find a robust relationship between internal variability and forced feedbacks, particularly for shortwave and longwave components, whereas the relationship seen for the net feedback is weaker. To address this, the authors explore how the relationship strengthens over longer time periods (50 years). To provide an estimated constraint on ECS, the authors combine satellite observations with a reanalysis dataset to provide an observed estimate of internal variability feedbacks. However, in order to provide a constraint based on observations only, continuous satellite observations until the mid-2030s would be necessary.
I found this paper enjoyable to read and I believe it would be a useful addition to the literature in this field. I have one major comment and a number of minor comments.
Major Comment:
The authors suggest that the relationship between internal variability feedback and forced feedback could be used as an emergent constraint on ECS. However, the utility of this relationship could be challenged were there to be a bias in modelled estimates of internal variability feedbacks compared to observations.
Armour et al. (2024) investigated the relationship between historical temperature trends and ECS, showing that this relationship was not suitable for use as an emergent constraint due to a known systematic bias in modelled estimates of historical temperature trends.
They show that since coupled climate models do not simulate observed temperature patterns, modelled historical warming was systematically warmer compared to observations.
Would the results of Armour et al. (2024) impact the conclusions reached in this analysis? Would this suggest that there may be a systematic bias between observed and modelled internal variability feedbacks due to different SST patterns?
For example, if AOGCMs are biased in their simulation of SST patterns and feedbacks due to internal variability, then it is plausible that this biases the emergent constraint with long-term feedbacks proposed here.
Either way, I would expect some discussion on the limitations and potential for biases in the results.
Minor Comments:
Line 23 – Aren’t the changes in top-of-atmosphere flux in response to surface temperature changes how we often define feedbacks in general (not just internal variability feedbacks)? Could the definition of internal variability feedbacks and forced climate feedbacks be more explicitly defined?
Line 50 (and in the introduction in general) – I think it might be beneficial to formally define somewhere in the introduction what is meant by forced feedback and internal variability feedback.
Line 69 – Could the historical and amip experiments used be more clearly defined?
Line 82 – “TOA flux anomalies” – Could the authors write “R” given they have shortened surface temperature anomalies to “T”.
Line 84 – Could the authors expand on how the historical members are detrended?
Line 85 – “various” – Could the authors be a bit more specific?
Line 91 – “the transformed datasets” – It isn’t completely clear what datasets are being referred to here.
Line 81 Paragraph – I think in general, if this paragraph could be rewritten to be much more thorough with the details it would help. Further questions I am left with are… How is F calculated in order to calculate the lambda in the historical experiments? Is the idea that the internal variability feedbacks have no forcing affecting them? And if so, how is this achieved given there is forcing over the historical period? Is this why the time series have been detrended? If the detrending has had a linear trend removed, given the historical temperature and flux time series are often highly nonlinear, is this definitely an appropriate method? Would that leave an imprint of the forcing still in the time series?
I think this paragraph was the main unclear bit in the paper for me.
I was also curious about whether this is a different method compared to that used in Uribe et al. 2022? (i.e. the use of GLS). Has that contributed to the slight change in the results?
Line 125 – This isn’t completely clear what has been done here. How were the datasets randomly permuted?
Figure 3 – All text in Figure 3 is very small and smaller details are rather hard to read. Could the authors increase the font size and perhaps consider rearranging the subplot to help make it more readable.
Figures in general – Although the other figures are not as hard to read as Figure 3, some may benefit from larger text. Figure 5 is an example of this. Figure 5c also has the legend partially obscured by some of the lines in the plot.
Very Minor Comments:
Line 1 – “, crucial climate regulators,” – I would remove this or restructure the sentence as it is a little unclear whether the authors are describing the act of observing climate feedbacks or the climate feedbacks themselves. Obviously it is the latter, but I kept reading it as crucial climate regulators is not feasible. – This is a very very minor comment, but I think it would help make the first line of the abstract more punchy.
Line 85 – “However” – This doesn’t seem like quite the right word, “however” introduces a statement that contrasts or seems to contradict something. Would saying instead “Here, it is crucial…”. Again, a very minor comment, but it stood out to me.
Line 91 – “a OLS” –> “an OLS”.
Armour et al. (2024): https://doi.org/10.1073/pnas.2312093121
Citation: https://doi.org/10.5194/egusphere-2024-1559-RC2