Likely breaks in cloud cover retrievals complicate attribution of the trend in the Earth Energy Imbalance
Abstract. There is a broad scientific consensus that the Earth is warming due to anthropogenic emissions of greenhouse gases (GHG). Increasing GHGs decrease the outgoing longwave radiation (OLR) at the top of the atmosphere (TOA). Since climate change is driven by the Earth Energy Imbalance (EEI), it is crucial to have accurate estimates of the TOA radiative fluxes and to identify the factors that drive the observed trend in EEI. In this research, we examined satellite-measured TOA radiative fluxes. In accordance with other studies, we found a substantial increase in the absorbed solar radiation (ASR) and a smaller increase in OLR since 2000, which indicates that increased ASR played an important role in recent global warming. We derived a statistical model that quantifies the contribution of different factors to the observed trends in ASR and OLR. We found that assessing the contribution of trends in clouds is complicated by inhomogeneities in retrieved clear-sky fluxes and the underlying cloud datasets. A formal break detection algorithm strongly suggests the existence of breaks, especially in low cloud cover. The cloud effect on ASR is therefore relatively hard to estimate, but it is likely a major cause of the increase in ASR. OLR can be reproduced more accurately from cloud cover, temperature and water vapour changes, but the expected decrease due to GHG was not found. We conclude that the inhomogeneities detected in this study warrant further study, as they impact the attribution of the trend in EEI.
Status: closed (peer review stopped)
RC1: 'Comment on egusphere-2025-418', Anonymous Referee #1, 07 Mar 2025
Overall take
The paper attempts to attribute changes in Outgoing Longwave Radiation (OLR) and Absorbed Solar Radiation (ASR), which when combined represent the Earth's Energy Imbalance (EEI), to changes in clouds, Water Vapor + Temperature, GHG, Aerosols + Surface Albedo. A statistical tool is used to diagnose whether the "residual" (GHG, Aerosols, Albedo) is heterogeneous in time, namely whether distinct periods exist where the influence changes sign or accelerates/decelerates rapidly. So-called "breaks" are found, but there is little interpretation or speculation of why they are there (it is admittedly difficult to pin down the reasons). I believe there is a serious misinterpretation of one of the critical datasets, no clear picture emerges, and the conclusion highlighted by the abstract (the role of low cloud changes) is not supported by the presentation, in my opinion. I thought the paper would discuss the challenges of putting together and reconciling long time series of disparate data, but it focuses on the well-observed post-2000 period (where we have high-quality CERES and MODIS); so one has to wonder why other, lower quality data have to be brought in. Ultimately, I think that this paper instead of shedding light on what's going on with OLR/ASR/EEI variations adds confusion with its questionable methodology and interpretation, and I thus can't recommend publication in anything close to its current form.
Below, I provide some specific comments in case the paper will be considered for publication:
Major points
-- Conclusion about low clouds (Lines 9-10), but not reiterated in the conclusions. If this is a major finding of the paper, any relevant results and figures can't be in the Appendix.
-- Lines 91-93. I think there's a major misunderstanding here. First of all the authors talk about cloud data in CERES SSF1deg above and then in these lines they invoke a CERES EBAF document. I think this statement draws from the DQS document p. 19, Table 7-1. What this part of the document addresses is how to convert MODIS narrowband radiances (within the CERES footprint) to broad band radiances so that a clear-sky flux at the sub-footprint level can be calculated from the clear-sky MODIS pixels. So C5 and C61 refer to radiance (L1b data) not cloud retrievals (Table 7-1 caption mentions fluxes). C5 and C61 cloud retrievals are performed by the GSFC-based Atmospheric Discipline MODIS team. The CERES project has its own cloud retrievals which are included in various CERES products. Invariably, when a retrieval algorithm changes, all the data is being reprocessed with the same algorithm: you don't use one algorithm for one part of the time series and another for another period. I think this major misinterpretation of different cloud retrievals for different periods unfortunately propagates throughout the paper.
-- Now, actual discontinuities in cloud retrievals, despite processing with the same algorithm, can happen because of other events. See for example https://atmosphere-imager.gsfc.nasa.gov/issues/cloud
-- Why was the break detection algorithm not applied to the Appendix cloud cover time series? That would've been interesting!
-- Cloud_cci, ERA5, and HIRS do not appear in sections 5 and 6. Why even include them in the paper?
-- Lines 11-12, decrease of OLR due to GHG increase. Well, there are a lot of negative trends in Table 3. Does this say something about deficiencies in the methodology? Clear-sky OLR does not have to go down if temperature goes up (the Earth warms to get rid of the excess heat).
-- There is no single plot of EEI = OLR+ASR. Wouldn't that give some clues about the quality of the datasets?
-- ISCCP is a notoriously inhomogeneous cloud dataset because of changes in the satellite fleet and instruments throughout time. But this is not mentioned.
Somewhat minor points
-- Figs. 1-4: Except for Fig. 1, pre-2000, it is hard to see what's going on. Perhaps smoothed versions (running mean) are in order? What's going on with ISCCP ASR circa 1984?
-- What's the physical meaning of the specific values of c? Are these values suitable for the anomaly values of this paper? Or is it a normalized value. They seem so arbitrary. So if I was looking at anomalies as high as 100 in the time series, how would those c values change? In general I found subsection 3.2 a bit too long and confusing. Are these details really needed? I think the reader has to take a leap of faith here and assume that the authors are using BEAST correctly.
-- Constant incoming solar assumption. Given the small magnitude of ASR anomalies, can you really make that assumption? I guess in a relative comparison of different ASR anomalies it's OK, but for absolute anomaly magnitudes, it is a bit precarious to make this assumption.
-- Lines 106-109: Why are you mentioning this? Does it matter for the analysis that follows?
-- Lines 98-99. The (radiative) tops of middle and high clouds lie between and above those levels, not the entire clouds.
-- Lines 225-226. This sentence basically says: OLRall-sky = OLRclr + (OLRall-sky - OLRclr). What is the point here? The two OLRclr's are not the same?
-- So in Fig. 5, residual = GHG. -0.27 Wm-2decade-1 is the value of the red bar in Fig. 5b, and of the slope of the red line in Fig. 5c, correct? In Fig. 6 residual = other. These things could be explained more clearly.
-- Lines 152-154. This sentence is unclear to me.
-- Lines 83-84. This sentence doesn't make sense.
Citation: https://doi.org/10.5194/egusphere-2025-418-RC1
AC1: 'Reply on RC1', Jippe Hoogeveen, 16 Apr 2025
First of all, thank you for your review! Below, we copied the review and we added our replies to your individual points.
Overall take
The paper attempts to attribute changes in Outgoing Longwave Radiation (OLR) and Absorbed Solar Radiation (ASR), which when combined represent the Earth's Energy Imbalance (EEI), to changes in clouds, Water Vapor + Temperature, GHG, Aerosols + Surface Albedo. A statistical tool is used to diagnose whether the "residual" (GHG, Aerosols, Albedo) is heterogeneous in time, namely whether distinct periods exist where the influence changes sign or accelerates/decelerates rapidly. So-called "breaks" are found, but there is little interpretation or speculation of why they are there (it is admittedly difficult to pin down the reasons). I believe there is a serious misinterpretation of one of the critical datasets, no clear picture emerges, and the conclusion highlighted by the abstract (the role of low cloud changes) is not supported by the presentation, in my opinion. I thought the paper would discuss the challenges of putting together and reconciling long time series of disparate data, but it focuses on the well-observed post-2000 period (where we have high-quality CERES and MODIS); so one has to wonder why other, lower quality data have to be brought in. Ultimately, I think that this paper instead of shedding light on what's going on with OLR/ASR/EEI variations adds confusion with its questionable methodology and interpretation, and I thus can't recommend publication in anything close to its current form.
Below, I provide some specific comments in case the paper will be considered for publication:
Major points
-- Conclusion about low clouds (Lines 9-10), but not reiterated in the conclusions. If this is a major finding of the paper, any relevant results and figures can't be in the Appendix.
Reply:
We agree that this is not a major point of the paper and that the specific statement on low clouds should not be included in the abstract. We propose to change this line to: ‘A formal break detection algorithm strongly suggests the existence of breaks in the timeseries of observed minus modelled TOA fluxes, in particular for ASR.’
-- Lines 91-93. I think there's a major misunderstanding here. First of all the authors talk about cloud data in CERES SSF1deg above and then in these lines they invoke a CERES EBAF document. I think this statement draws from the DQS document p. 19, Table 7-1. What this part of the document addresses is how to convert MODIS narrowband radiances (within the CERES footprint) to broad band radiances so that a clear-sky flux at the sub-footprint level can be calculated from the clear-sky MODIS pixels. So C5 and C61 refer to radiance (L1b data) not cloud retrievals (Table 7-1 caption mentions fluxes). C5 and C61 cloud retrievals are performed by the GSFC-based Atmospheric Discipline MODIS team. The CERES project has its own cloud retrievals which are included in various CERES products. Invariably, when a retrieval algorithm changes, all the data is being reprocessed with the same algorithm: you don't use one algorithm for one part of the time series and another for another period. I think this major misinterpretation of different cloud retrievals for different periods unfortunately propagates throughout the paper.
Reply: Although our description of the inhomogeneities in the manuscript was not completely accurate, we do believe that there are possible inhomogeneities in both CERES clear-sky OLR/ASR and CERES SSF cloud data, both related to the transition from MODIS C5 to C6.1 radiances. With reference to the relevant CERES documents, we explain below what we think could cause breaks in CERES clear-sky fluxes and CERES SSF cloud properties.
For CERES clear-sky TOA fluxes, the second bullet point on page 19 and Table 7-1 in the data quality summary of EBAF states that: "For EBAF Ed4.2, the MODIS or VIIRS narrowband to broadband coefficients are based on the collection (C) number that is consistent with imager version used in the CERES record. MODIS C5 coefficients were applied between 2000 and February 2016, MODIS C6.1 coefficients were applied between March 2016 and March 2022, and NOAA20 VIIRS C2.1 coefficients are applied beginning in April 2022." This part explains how a clear-sky flux is computed in mostly-overcast regions, as also summarized by the reviewer: clear-sky MODIS narrowband radiances are converted to broadband radiances and subsequently to fluxes. Until February 2016, MODIS C5 coefficients are used, but from March 2016 onwards MODIS C6.1 coefficients are used (we did not include the latest data from NOAA20 in our analysis). We suggest that this might result in a break in clear-sky fluxes, although we definitely do not exclude other causes.
In the paper, we will change lines 83-84 to: "For cloudy CERES footprints, TOA radiances calculated from clear-sky MODIS pixels within the footprint are used to determine clear-sky CERES fluxes. Coefficients for the conversion of MODIS narrowband radiances to broadband radiances are based on MODIS Collection 5 (C5) until February 2016, while from March 2016 onwards they are based on MODIS Collection 6.1 (C6.1) (CERES, 2024)."
For CERES SSF, the third bullet point on page 10 of the data quality summary of CERES SSF1deg states that: "The ED4A MODIS cloud properties are based upon Collection 5 through February 2016 and upon Collection 6.1 from March 2016 onwards." We suggest that this might cause an inhomogeneity in the cloud measurements of CERES SSF. Potential differences in cloud masks before and after February 2016 might also affect the CERES EBAF clear-sky fluxes through a different selection of clear-sky MODIS pixels. In the paper, we will change lines 85-87 to: "The CERES Single Scanner Footprint (SSF, version SSF1deg Ed4A) product includes cloud cover retrieved from the polar satellites Terra, Aqua, S-NPP and NOAA-20 (Doelling et al., 2013). For this paper, we focused on Terra and Aqua, since the other two satellites do not cover a long enough period. The relevant cloud property retrievals are based on MODIS C5 until February 2016 and MODIS C6.1 from March 2016 onwards (CERES, 2023). We averaged the measurements from Terra and Aqua."
Maybe the confusion comes from the fact that the L1B data were reprocessed entirely with MODIS C6.1, but the CERES SSF data were not. The data quality summary of EBAF in the sixth bullet point on page 20 states this: "The entire MODIS C6.1 L1B record was reprocessed; however, the CERES project only reprocessed the SSF1deg and SYN1deg records using MODIS C6.1 between February 2016 and March 2018."
-- Now, actual discontinuities in cloud retrievals, despite processing with the same algorithm, can happen because of other events. See for example https://atmosphere-imager.gsfc.nasa.gov/issues/cloud
Reply: This is acknowledged and we do not intend to exclude potential other causes for breaks in the time series. In the revised manuscript, specifically when discussing breaks between February and March 2016, we will use more careful formulations and leave more room for other potential causes of discontinuities.
-- Why was the break detection algorithm not applied to the Appendix cloud cover time series? That would've been interesting!
Reply: BEAST assumes that a continuous time series contains only a trend (and noise). However, the cloud cover time series most likely contain large natural variation, which makes it harder to estimate break points in them directly. In contrast, clear-sky fluxes contain less natural variation, especially since we also remove the temperature effect from OLR. We did try detecting breaks directly in cloud datasets in Figure A7 by examining the difference between the CLARA-A3 and CERES SSF cloud cover, for which natural variation should largely cancel out. Here, a few other possible inhomogeneities were detected, which might be related to changes in the collection of satellites used for CLARA-A3. These were strongest in low cloud cover, which is why this was mentioned in the main text.
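In symbols (our notation, not text from the paper): because both records sample largely the same true cloud field, the difference series

```latex
d_t = C^{\mathrm{CLARA}}_t - C^{\mathrm{SSF}}_t
```

cancels most of the shared natural variability, so any step that remains in d_t points to an inhomogeneity in one of the two records rather than to a real cloud change.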
-- Cloud_cci, ERA5, and HIRS do not appear in sections 5 and 6. Why even include them in the paper?
Reply: We added them to section 4 to illustrate that reliable OLR and ASR estimates are hard to achieve before 2000.
-- Lines 11-12, decrease of OLR due to GHG increase. Well, there are a lot of negative trends in Table 3. Does this say something about deficiencies in the methodology? Clear-sky OLR does not have to go down if temperature goes up (the Earth warms to get rid of the excess heat).
Reply: The trends in Table 3 represent the trend in OLR after the cloud, temperature and water vapour effects have been removed. Here we mainly expect to find the GHG effect, so negative trends are expected. However, stronger negative trends are reported in the scientific literature (see the discussion in lines 386-390). A final conclusion on the cause of this difference was not reached.
-- There is no single plot of EEI = OLR+ASR. Wouldn't that give some clues about the quality of the datasets?
Reply: EEI is a very important variable, but we doubt that this would provide additional insight: we do not expect factors in OLR and ASR to cancel out in EEI, so plotting EEI does not make inhomogeneities easier to detect.
-- ISCCP is a notoriously inhomogeneous cloud dataset because of changes in the satellite fleet and instruments throughout time. But this is not mentioned.
Reply: We agree that ISCCP is rather inhomogeneous. This was also mentioned in the conclusions, but we will add the following line at the end of Section 2.2: ‘It has been noted that due to varying spatial coverage of and instruments onboard the satellites underlying ISCCP, the data record is relatively inhomogeneous and less suited for trend analyses (Evan et al., 2007; Devasthale and Karlsson, 2023).’.
Somewhat minor points
-- Figs. 1-4: Except for Fig. 1, pre-2000, it is hard to see what's going on. Perhaps smoothed versions (running mean) are in order? What's going on with ISCCP ASR circa 1984?
Reply: We agree that the figures are not easy to read. However, we do not believe that plotting only the running means would be the best thing to do, because it is also interesting to see whether the monthly variations agree between the different datasets. We propose to plot the monthly values with thinner lines and add the running means in bold.
Regarding the anomalies of ISCCP ASR in 1984: we do not know what causes this.
-- What's the physical meaning of the specific values of c? Are these values suitable for the anomaly values of this paper? Or is it a normalized value. They seem so arbitrary. So if I was looking at anomalies as high as 100 in the time series, how would those c values change? In general I found subsection 3.2 a bit too long and confusing. Are these details really needed? I think the reader has to take a leap of faith here and assume that the authors are using BEAST correctly.
Reply: The value c is the ratio between the size of the break and the standard deviation of the noise. We will state this more clearly in section 3.2. The idea of this section is to give a feeling for how strongly BEAST responds to inhomogeneities and whether it is able to pinpoint the exact date. We believe that this will help readers to interpret the results of BEAST obtained for real data further on.
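As an illustration of c (a minimal sketch, not the code used for the paper; the Rbeast Python package and all numerical values here are our assumptions), one can insert a step of size c times the noise standard deviation into white noise and ask BEAST for trend changepoints:

```python
import numpy as np
import Rbeast as rb  # Python wrapper of BEAST, assumed installed from PyPI

rng = np.random.default_rng(0)
n, c, sigma = 240, 1.5, 1.0          # 20 years of monthly anomalies; break of size c*sigma
y = rng.normal(0.0, sigma, n)        # trend-free noise series
y[n // 2:] += c * sigma              # step change halfway through the series

o = rb.beast(y, season='none')       # trend-only model, as for deseasonalised anomalies
rb.print(o)                          # posterior number and location of trend changepoints
```

Rerunning this for several values of c shows how the detection probability and the timing accuracy degrade as the break shrinks relative to the noise, which is essentially what Section 3.2 tabulates.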
-- Constant incoming solar assumption. Given the small magnitude of ASR anomalies, can you really make that assumption? I guess in a relative comparison of different ASR anomalies it's OK, but for absolute anomaly magnitudes, it is a bit precarious to make this assumption.
Reply: The solar irradiance has only rather small variations (up to 0.4 W m^-2 over the entire period). This can be important if one is interested in small ASR anomalies. The CERES measurements provide the reflected shortwave radiation, so to obtain the actual ASR, the incoming solar irradiance anomalies have to be added. However, we are interested in determining the cloud cover effect on ASR and finding inhomogeneities in this effect, and for that purpose it is best to remove the solar irradiance effect from ASR again. Since most solar irradiance is absorbed (about 70%), this means subtracting 70% of the solar irradiance anomalies from ASR (only the remaining 30% of these anomalies shows up in the CERES RSF measurements). Hence, only small anomalies caused by solar irradiance remain in our analyses (about 0.1 W m^-2), and we believe it is safe to ignore this.
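For reference, this arithmetic can be written compactly (a sketch in our own notation, with α ≈ 0.3 the planetary albedo, S the incoming solar radiation and δ denoting anomalies):

```latex
\mathrm{ASR} = S - \mathrm{RSF}, \qquad
\delta\mathrm{RSF}_{\odot} \approx \alpha\,\delta S \approx 0.3\,\delta S, \qquad
\delta\mathrm{ASR}_{\odot} \approx (1-\alpha)\,\delta S \approx 0.7\,\delta S .
```

With |δS| up to 0.4 W m^-2, the solar part left in the reflected flux is about 0.3 × 0.4 ≈ 0.1 W m^-2, which is the magnitude quoted above.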
-- Lines 106-109: Why are you mentioning this? Does it matter for the analysis that follows?
Reply: We mention this because it gives a possible explanation for why the CLARA-A3 cloud data can have an inhomogeneity in 2003. We will add it to the conclusions of our paper.
-- Lines 98-99. The (radiative) tops of middle and high clouds lie between and above those levels, not the entire clouds.
Reply: Thank you for pointing that out! We will adjust it.
-- Lines 225-226. This sentence basically says: OLRall-sky = OLRclr + (OLRall-sky - OLRclr). What is the point here? The two OLRclr's are not the same?
Reply: This sentence illustrates how we estimate the cloud effect and the combined temperature and water vapour effect on OLR when using CERES clear-sky fluxes for the cloud effect. The idea is that we explain OLRclr with temperature and water vapour, which gives an estimate of the temperature and water vapour effect. Next, we use OLRall-sky - OLRclr for the cloud effect, and we add the two up for the total effect. We will improve the description of this reasoning in the paper.
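In our notation (a sketch, not a quotation from the manuscript), the estimate reads

```latex
\mathrm{OLR}_{\mathrm{all}}
\approx
\underbrace{\hat{a} + \hat{b}\,T}_{\text{T + WV effect, fitted to } \mathrm{OLR}_{\mathrm{clr}}}
+
\underbrace{\bigl(\mathrm{OLR}_{\mathrm{all}} - \mathrm{OLR}_{\mathrm{clr}}\bigr)}_{\text{cloud effect}}
+ \varepsilon ,
```

where the regression of OLRclr on temperature T supplies the first term and ε is the residual that is tested for breaks. The two OLRclr terms are indeed the same; the point of the sentence is the split of the right-hand side into interpretable parts.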
-- So in Fig. 5, residual = GHG. -0.27 Wm-2decade-1 is the value of the red bar in Fig. 5b, and of the slope of the red line in Fig. 5c, correct? In Fig. 6 residual = other. These things could be explained more clearly.
Reply: Indeed, the values in Fig. 5(b) are the trends of the different drivers of OLR. Other is the residual, so the value Other (GHG) in 5(b) is the trend of Residual in 5(c). This works the same way in Fig. 6, where Other (SA + AER) in 6(b) is the trend of Residual in 6(c). We will rename 'Other' to 'Residual' to clarify this.
-- Lines 152-154. This sentence is unclear to me.
Reply: In the regression, we take the cloud cover at different heights (and, for OLR, the global temperature) as explanatory variables to determine their effect. However, we expect that OLR and ASR are also affected by factors that the regression does not take into account. These factors (such as GHG for OLR) are typically very monotone and almost linear, which makes it hard to estimate their effect using linear regression. Hence, we add a trend to the regression model as an explanatory variable to absorb such factors. This trend is then not added to the model of, for example, Fig. 5(a); its only function is to take this kind of factor into account, so that no other variable tries to do so. We will add this clarification in the paper.
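A minimal sketch of this setup (synthetic placeholder data and variable names, not the paper's; statsmodels is assumed available):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 276                                                  # e.g. 23 years of monthly anomalies
low_cc, mid_cc, high_cc, temp = rng.normal(size=(4, n))  # placeholder explanatory series
trend = np.arange(n) / 120.0                             # linear trend, in decades
olr = 0.5*low_cc + 0.8*high_cc + 1.5*temp - 0.3*trend + rng.normal(0.0, 0.2, n)

X = sm.add_constant(np.column_stack([low_cc, mid_cc, high_cc, temp, trend]))
fit = sm.OLS(olr, X).fit()

# Reconstruct the modelled OLR *without* the trend term: the trend column only
# absorbs slow, near-linear drivers (e.g. GHG), so that the physical regressors
# are not forced to mimic them.
olr_model = fit.predict(X) - fit.params[-1] * trend
residual = olr - olr_model   # the series that is subsequently tested for breaks
```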
-- Lines 83-84. This sentence doesn't make sense.
Reply: Yes, this is an error, thank you for pointing it out! We will correct it.
References
CERES: CERES_SSF1deg-Hour/Day/Month_Ed4A Data Quality Summary, Version 2, NASA, 45pp, https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_SSF1deg_Ed4A_DQS.pdf, 2023.
CERES: CERES_EBAF_Ed4.2 and Ed4.2.1 Data Quality Summary, Version 4, NASA, 30pp, https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_EBAF_Ed4.2_DQS.pdf, 2024.
Evan, A.T., Heidinger, A.K., and Vimont, D.J.: Arguments against a physical long-term trend in global ISCCP cloud amounts. Geophys. Res. Lett., 34, https://doi.org/10.1029/2006GL028083, 2007.
Devasthale, A. and Karlsson, K.-G.: Decadal Stability and Trends in the Global Cloud Amount and Cloud Top Temperature in the Satellite-Based Climate Data Records. Remote Sensing, 15(15), 3819. https://doi.org/10.3390/rs15153819, 2023.
Citation: https://doi.org/10.5194/egusphere-2025-418-AC1
RC2: 'Comment on egusphere-2025-418', Anonymous Referee #3, 13 May 2025
The authors attempt to identify and explain “break points” in radiation flux satellite data records by first quantifying and removing most physical drivers of the trend caused by changes in temperature, water vapor or clouds, and then calculating break points on the remaining residual variability, which would likely be from GHG or aerosol forcings. These break points may be an artifact of the dataset algorithm or a physical behavior. While attempts to explain trends and identify uncertainty in retrievals are important, I do not feel the methods and results presented here are appropriate for this task. I found the methods to be unclear and not well justified given that the linear regression model often does not capture the variability of the total radiation. This complicates interpretation of the break points. The proposed model also gives quite different trend breakdowns of the different driving factors across datasets, despite the authors focusing on a post-2000 period where the overall radiation trends agree quite well between the datasets. In the results, the authors seemed to arbitrarily select or filter out break points (phrases like “...so we keep the break point there”) and then often give only surface-level explanations (sometimes incorrect) of potential retrieval artifacts without assessing whether the break points are capturing something physical. In some cases they give no explanation, which would often be fine if they ruled out explanations, but they don’t detail what they investigated before declaring it unexplainable. Break points and potential dataset retrieval issues are an important and often understudied topic, so I encourage the authors to rethink their paper, possibly using more established attribution techniques or a smaller, more manageable number of datasets that will allow them to dig further into the details. But in its current form, I cannot recommend this paper for acceptance and suggest the journal consider a rejection with the possibility for a re-submission in the future.

Line 40-41: The Dentener et al. timeseries is capturing Effective Radiative Forcing while the Loeb, Kramer, etc. estimates are just capturing the instantaneous radiative forcing, which does not include stratospheric or tropospheric adjustments to the CO2 and thus is smaller than the ERF. This is likely the primary cause of the discrepancy mentioned in these lines.

Line 82-84: CERES-EBAF 4.2 now has two types of clear-sky flux: the partial sky (c) product and the total sky (t) product. The authors should note which product they are referring to in these lines and which product they will use for the remainder of the paper.

Section 3.1: This section tries to describe the methods used to isolate the temperature, water vapor and cloud contributions to the total radiation. I found this section to be unclear, too vague and often not particularly well-justified.

- Applying linear regression in this manner can’t guarantee an isolation of each contributing factor as cleanly as other well-established methods such as PRP or radiative kernels. Why not use those methods to isolate the contributions from T, WV and clouds?

- This section just refers to using the “temperature data of UAH” to represent temperature and water vapor contributions to the radiation. It is not clear what temperatures from the UAH dataset were used (mid-trop? Near surface?). More detail is necessary. There are also stable, well-tested direct measurements of moisture the authors could have used, such as from AIRS, HIRS, etc rather than using a proxy. Some explanation for why UAH temperature was used instead of those actual water vapor options is warranted.

- Line 146/147 says “Global averages of temperature and clouds are regionally weighted”. Does this mean the authors start with global-mean values and then distribute that regionally based on weights? This was unclear and, if so, would seem to add unnecessary uncertainty since regionally-resolved data is available from these products.

- Generally it seems the authors are deriving cloud contributions by scaling clear-sky fluxes by cloud fraction. There needs to be some evidence or citations showing that this proportionality assumption is valid. I commend that the authors acknowledge this assumption could be a source of uncertainty in the conclusion sections, but since it is the foundation of this paper, making those acknowledgments is not enough.

- For CERES applications, the effects of clouds are diagnosed by using cloud radiative effect. However this can be misleading as CRE includes cloud masking effects from other variables, especially for the longwave. The authors should account for this either directly or indirectly in their analysis.

Section 3.2 – I appreciate that the authors evaluated the uncertainties of the BEAST method. It would be helpful if they added a conclusion sentence or two to this section about what we should take away from this analysis and keep in mind as we analyze the results in the rest of the paper. For instance, given that the breaks were placed a bit earlier than index 121 (Table 2), should we assume breaks will be biased early when applied to the actual observations too?

General Fig 1-4: Given the improved agreement between datasets after 2000, it begs the question to what extent are most of these datasets dependent on CERES? Is CERES brought into the algorithm of these other datasets? Or are corrections made in order to better align with CERES? Or is the improved agreement around the time CERES began just a coincidence? Some discussion of this would be useful.

Line 235-238: The authors try to point to artificial causes of the breaks in the CERES record here, but neither makes sense given the information provided. First, it is not clear which CERES product is being used in Figure 5, but if it's an observed flux rather than a computed flux, mention of the GEOS analysis dataset is not relevant. Additionally, the authors surmise a CERES break in 2016 may come from a change in MODIS cloud product version, but CERES does not use the operational MODIS algorithms. And, regardless, when those MODIS algorithms are updated, the entire record is reprocessed rather than transitioned in the middle of the timeseries. A break point in 2016 would most likely come from the large El Nino that occurred rather than the artificial cause that was mentioned. And while one may assume El Nino signals would be removed from the residual timeseries by subtracting out temperature, moisture and cloud effects, that is only true if the model worked perfectly. If not, as seems to be the case here, breaks due to ENSO or other sources of variability are a possible explanation.

Line 245: This sentence may be true but its purpose is not clear and it seems unnecessary as currently written.

Line 251: Again, MODIS algorithm version change means the entire record gets reprocessed, so this explanation doesn’t apply here.

Line 253-255: The July 2001 break is a full year before the switch from Terra-only to Terra+Aqua, as noted. There seems to be no clear justification to assume the addition of Aqua is relevant to this 2001 break. If some justification exists, the authors should state it.

Line 275-276 and 290: The authors state they can’t explain the break points in the ISCCP data, but it is not clear what they did in an attempt to understand them. Did they investigate whether these changes fall within expectations of natural variability vs a forced response? Did they review the ISCCP ATBD to identify potential dataset changes or switches in satellite source? ISCCP is a stitching over many difficult-to-calibrate instruments.

Line 297: Correlation may not be a sufficient metric for skill. In 9a, the anomalies may be going in the right direction but clearly they are too small and fail to capture the magnitude of the variability compared to the CERES results shown before this. Given that the model did not work for ISCCP either, I begin to wonder if the methodology applied to capture cloud effects is not appropriate, but it is hard for me to judge because the method description is not yet clear, as noted in my comments above.

Line 304-306: Ignoring pre-2003 is not justified well enough. The authors seem to assume an inhomogeneity signifies an artifact in the dataset processing, when it could be something physical or fall within expected dataset uncertainties.

Line 315-317: How would low cloud changes explain the break point if cloud effects have been removed in this residual calculation? I understand the cloud fields differ between this dataset and SSF, but so do the radiative fluxes that accompany each dataset.

Citation: https://doi.org/10.5194/egusphere-2025-418-RC2
AC2: 'Reply on RC2', Jippe Hoogeveen, 25 Jun 2025
First of all, thank you for the review! It appears that the main criticism concerns the attribution method, in particular the fact that linear regression is used instead of other techniques. We agree that our method might not produce a perfect attribution. However, this is not really the goal of the paper. Instead, the paper focuses on how break points can impact attribution of the trends in EEI. To achieve this goal, we perform a relatively simple but adequate attribution and test the residuals for inhomogeneities. Below, we copied your review and added our replies.
Review: The authors attempt to identify and explain “break points” in radiation flux satellite data records by first quantifying and removing most physical drivers of the trend caused by changes in temperature, water vapor or clouds, and then calculating break points on the remaining residual variability, which would likely be from GHG or aerosol forcings. These break points may be an artifact of the dataset algorithm or a physical behavior. While attempts to explain trends and identify uncertainty in retrievals are important, I do not feel the methods and results presented here are appropriate for this task. I found the methods to be unclear and not well justified given that the linear regression model often does not capture the variability of the total radiation. This complicates interpretation of the break points.
Reply: We agree that more sophisticated attribution methods are available, but since the main aim of this paper is to study the impact of break points, it is not necessary to use the most advanced attribution methods.
Your main criticism is that the linear regression model has less variance than the observed values. However, for most of our models, the monthly variation is captured rather well, with correlation coefficients around 0.7 to 0.8. This suggests that the regression model works well. It is natural that the model has less variance than the observations, because, generally, the observations are the model plus extra noise, where the noise can also be caused by factors not taken into account in the model.
We believe that the main improvements in the analysis can be achieved by including more explanatory factors, such as cloud optical depth. However, this makes it more difficult to determine the underlying cause of an inhomogeneity. Hence, we tried to keep the model as simple as possible, also because we do not aim to produce the perfect attribution.
Review: The proposed model also gives quite different trend breakdowns of the different driving factors across datasets, despite the authors focusing on a post-2000 period where the overall radiation trends agree quite well between the datasets.
Reply: There is indeed some difference in attribution between the different datasets, especially for ASR. For OLR, the CERES clear-sky analysis differs from the other analyses, but the others are all rather similar. However, we do believe that this is not caused by the regression model, but instead by differences between the cloud datasets, partly because of inhomogeneities. For example, the two cloud datasets CLARA-A3 and CERES SSF differ in the low cloud cover trend by about 0.2% per decade, most likely because of inhomogeneities in CLARA-A3. This seems to be the main reason for the different attribution of ASR between these datasets. It also illustrates the importance of identifying inhomogeneities in cloud datasets before a correct attribution can be made.
Review: In the results, the authors seemed to arbitrarily select or filter out break points (phrases like “...so we keep the break point there”) and then often give only surface-level explanations (sometimes incorrect) of potential retrieval artifacts without assessing whether the break points are capturing something physical. In some cases they give no explanation, which would often be fine if they ruled out explanations, but they don’t detail what they investigated before declaring it unexplainable. Break points and potential dataset retrieval issues are an important and often understudied topic, so I encourage the authors to rethink their paper, possibly using more established attribution techniques or a smaller, more manageable number of datasets that will allow them to dig further into the details. But in its current form, I cannot recommend this paper for acceptance and suggest the journal consider a rejection with the possibility for a re-submission in the future.
Reply: We generally tested the significance of break points using BEAST and reviewed the metadata for potential causes. When we were unable to find a potential cause in the metadata, we reported that we do not know an explanation.
Review: Line 40-41: The Dentener et al. timeseries is capturing Effective Radiative Forcing while the Loeb, Kramer, etc. estimates are just capturing the instantaneous radiative forcing, which does not include stratospheric or tropospheric adjustments to the CO2 and thus is smaller than the ERF. This is likely the primary cause of the discrepancy mentioned in these lines.
Reply: Thank you for pointing that out. This is indeed a difference that needs to be mentioned.
Review: Line 82-84: CERES-EBAF 4.2 now has two types of clear-sky flux: the partial sky (c) product and the total sky (t) product. The authors should note which product they are referring to in these lines and which product they will use for the remainder of the paper.
Reply: We used the partial sky (c) product.
Review: Section 3.1: This section tries to describe the methods used to isolate the temperature, water vapor and cloud contributions to the total radiation. I found this section to be unclear, too vague and often not particularly well-justified.
Review: -Applying linear regression in this manner can’t guarantee an isolation of each contributing factor as cleanly as other well-established methods such as PRP or radiative kernels. Why not use those methods to isolate the contributions from T, WV and clouds?
Reply: As explained before, the paper focuses on how inhomogeneities can impact attribution and does not aim to perform the perfect attribution. We believe that linear regression is suitable because it should theoretically find the correct coefficients and is free of (model) biases. Linear regression only struggles if two explanatory variables are highly correlated (such as T and WV), which is why we used T as an explanatory variable for both factors.
Review: -This section just refers to using the “temperature data of UAH” to represent temperature and water vapor contributions to the radiation. It is not clear what temperatures from the UAH dataset were used (mid-trop? Near surface?). More detail is necessary. There are also stable, well-tested direct measurements of moisture the authors could have used, such as from AIRS, HIRS, etc rather than using a proxy. Some explanation for why UAH temperature was used instead of those actual water vapor options is warranted.
Reply: We used the lower-troposphere temperature, which should indeed have been mentioned. Temperature and water vapour are closely correlated, so linear regression has difficulty handling them correctly. In particular, it has difficulty disentangling the temperature and water vapour effects, but the sum of both effects can be determined rather well. Hence, we use one explanatory variable (temperature) to approximate this combined effect. The paper does not focus on finding the best possible attribution, so it is not really important that the temperature and water vapour effects are not separated. It is only important that the sum of both effects is (approximately) removed from the all-sky flux, because that improves the detection of break points.
Review: -Line 146/147 says “Global averages of temperature and clouds are regionally weighted”. Does this mean the authors start with global-mean values and then distribute that regionally based on weights? This was unclear and, if so, would seem to add unnecessary uncertainty since regionally-resolved data is available from these products.
Reply: This sentence explains that we calculate regional weighting factors and construct a global average by weighting the regional data with these factors. For example, to explain all-sky ASR, we weight the regional cloud cover with the average clear-sky ASR over that region.
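A minimal sketch of such a weighting (synthetic placeholder fields and names, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(2)
n_months, n_lat, n_lon = 276, 90, 180
lat = np.linspace(-89, 89, n_lat)
cloud_cover = rng.normal(size=(n_months, n_lat, n_lon))      # cloud-cover anomalies
clr_asr = 240.0 + 60.0 * np.cos(np.deg2rad(lat))[:, None] * np.ones(n_lon)  # stand-in clear-sky ASR climatology

w = np.cos(np.deg2rad(lat))[:, None] * clr_asr               # area x clear-sky-ASR weights
w /= w.sum()                                                 # normalise the weights

# One radiatively weighted "global mean" cloud-cover value per month,
# used as the regression predictor for the all-sky flux:
cc_weighted = np.tensordot(cloud_cover, w, axes=([1, 2], [0, 1]))
```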
Review: -Generally it seems the authors are deriving cloud contributions by scaling clear-sky fluxes by cloud fraction. There needs to be some evidence or citations showing that this proportionality assumption is valid. I commend that the authors acknowledge this assumption could be a source of uncertainty in the conclusion sections, but since it is the foundation of this paper, making those acknowledgments is not enough.
Reply: We agree that, if one wants to make a perfect attribution, this can have an impact. However, we doubt that the weighting has a large impact, and since we do not aim for a perfect attribution, we believe that this is good enough. We can test this further by trying other weighting coefficients.
Review: -For CERES applications, the effects of clouds are diagnosed by using cloud radiative effect. However this can be misleading as CRE includes cloud masking effects from other variables, especially for the longwave. The authors should account for this either directly or indirectly in their analysis.
Reply: We agree that other factors can have an impact on CRE too, especially for OLR. The regression can take the effect of temperature and water vapour into account, as these are the most important factors. This is also visible in the results, where the temperature effect on CERES clear-sky OLR is much smaller than expected, due to cloud masking. For ASR, we do not believe that the impact of other factors is too large for our purposes.
Review: Section 3.2 – I appreciate that the authors evaluated the uncertainties of the BEAST method. It would be helpful if they added a conclusion sentence or two to this section about what we should take away from this analysis and keep in mind as we analyze the results in the rest of the paper. For instance, given that the breaks were placed a bit earlier than index 121 (Table 2), should we assume breaks will be biased early when applied to the actual observations too?
Reply: We did add some of these remarks after the analysis: (i) if BEAST reports a probability larger than 0.2, then there is likely an inhomogeneity, and (ii) BEAST has trouble finding the exact location. However, we agree that this can be made clearer.
Review: General Fig 1-4: Given the improved agreement between datasets after 2000, it begs the question to what extent are most of these datasets dependent on CERES? Is CERES brought into the algorithm of these other datasets? Or are corrections made in order to better align with CERES? Or is the improved agreement around the time CERES began just a coincidence? Some discussion of this would be useful.
Reply: As far as we know, the other datasets are not tuned to match CERES. There are dependencies: for example, CLARA-A3 uses CERES data for converting radiances to fluxes, but these conversions are constant over the entire period. Instead, we believe that the improved agreement from 2000 onward is to a large extent caused by the increasing number of satellites.
Review: Line 235-238: The authors try to point to artificial causes of the breaks in the CERES record here, but neither makes sense given the information provided. First, it is not clear which CERES product is being used in Figure 5, but if it's an observed flux rather than a computed flux, mention of the GEOS analysis dataset is not relevant. Additionally, the authors surmise a CERES break in 2016 may come from a change in MODIS cloud product version, but CERES does not use the operational MODIS algorithms. And, regardless, when those MODIS algorithms are updated, the entire record is reprocessed rather than transitioned in the middle of the timeseries. A break point in 2016 would most likely come from the large El Nino that occurred rather than the artificial cause that was mentioned. And while one may assume El Nino signals would be removed from the residual timeseries by subtracting out temperature, moisture and cloud effects, that is only true if the model worked perfectly. If not, as seems to be the case here, breaks due to ENSO or other sources of variability are a possible explanation.
Reply: The first break point, in January 2008, concerns the clear-sky OLR from EBAF. In CERES EBAF 2.8, there was a known inhomogeneity in January 2008 caused by a transition from GEOS-4 to GEOS-5.2.1 (see Loeb et al., 2018: "Clouds and the Earth's Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) Top-of-Atmosphere (TOA) Edition-4.0 Data Product", J. Climate, 31(2)). This transition caused an inhomogeneity in scene identification and hence in clear-sky OLR. This was corrected in EBAF 4.0 by using GEOS 5.4.1 over the entire period. Nonetheless, we still detect an inhomogeneity in January 2008 in the clear-sky OLR.
The other break points concern the MODIS algorithm. We agree that our explanation was not entirely correct (as was also noted by the first reviewer), but we still believe that the transition in MODIS version can cause an inhomogeneity.
First of all, MODIS clear-sky radiance measurements are incorporated into CERES clear-sky fluxes. This is done when the CERES pixel is cloudy but some of the smaller MODIS pixels are clear (as MODIS has a higher resolution than CERES). If such a situation occurs, the MODIS narrowband radiance measurements are converted to broadband clear-sky fluxes. Until February 2016, MODIS C5 coefficients were used, but from March 2016, MODIS C6.1 coefficients are used. This is also mentioned in the data quality summary of EBAF at the sixth bullet point in Section 3.5 on page 12.
Secondly, the MODIS cloud properties also transitioned from MODIS C5 to MODIS C6.1 in March 2016. This can impact the clear-sky flux through scene identification. The data quality summary of CERES SSF mentions this as well, in Section 3.0 at the fifth bullet point under Clouds on page 10.
The confusion most likely comes from the fact that the L1B data was entirely reprocessed with MODIS C6.1, but the L3 data was not (see the data quality summary of EBAF).
Review: Line 245: This sentence may be true but its purpose is not clear and it seems unnecessary as currently written.
Reply: The purpose of this sentence is to illustrate that break points can have a significant impact on the total residual trend and hence on the attribution.
Review: Line 251: Again, MODIS algorithm version change means the entire record gets reprocessed, so this explanation doesn’t apply here.
Reply: See our previous reply.
Review: Line 253-255: The July 2001 break is a full year before the switch from Terra-only to Terra+Aqua, as noted. There seems to be no clear justification to assume the addition of Aqua is relevant to this 2001 break. If some justification exists, the authors should state it.
Reply: In Section 3.2, we show that BEAST does not always correctly locate the break point: especially for smaller break points, it can be off by more than a year. Hence, we added the switch from Terra-only to Terra + Aqua as possible cause.
Review: Line 275-276 and 290: The authors state they can’t explain the break points in the ISCCP data, but it is not clear what they did in an attempt to understand them. Did they investigate whether these changes fall within expectations of natural variability vs a forced response? Did they review the ISCCP ATBD to identify potential dataset changes or switches in satellite source? ISCCP is a stitching over many difficult-to-calibrate instruments.
Reply: We reviewed the ISCCP ATBD and found no clear cause there.
Review: Line 297: Correlation may not be a sufficient metric for skill. In 9a, the anomalies may be going in the right direction but clearly they are too small and fail to capture the magnitude of the variability compared to the CERES results shown before this. Given that the model did not work for ISCCP either, I begin to wonder if the methodology applied to capture cloud effects is not appropriate, but it is hard for me to judge because the method description is not yet clear, as noted in my comments above.
Reply: We do not agree with this comment. First of all, Fig. 9a clearly shows that the model captures most of the monthly variations and is not simply increasing along with the observations. It is natural that the observations have more variation than the model. After all, the observations are the model plus noise (or factors not taken into account by the model). Hence, as the noise is uncorrelated with the model, the variance of the observations is larger than the variance of the model. This is also visible in Section 5.1: the CERES clear-sky OLR model also has a smaller variance than the all-sky OLR observations, similar to the CLARA-A3 model, even though the CERES clear-sky model uses clear-sky OLR observations to estimate the cloud effect instead of linear regression.
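The variance argument can be stated in one line (standard least-squares identities, our notation):

```latex
y = \hat{y} + \varepsilon,\;\; \operatorname{Cov}(\hat{y},\varepsilon)=0
\;\Rightarrow\;
\operatorname{Var}(y) = \operatorname{Var}(\hat{y}) + \operatorname{Var}(\varepsilon),
\qquad
r^{2} = \frac{\operatorname{Var}(\hat{y})}{\operatorname{Var}(y)} .
```

With r around 0.7 to 0.8, the model accounts for roughly 50 to 65% of the observed monthly variance, so its amplitude is necessarily smaller than that of the observations.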
We agree that the regression model based on ISCCP is not that good and in particular the monthly variations are poorly represented in the model, which is also clearly visible in Figures 7(a) and 8(a). We do acknowledge this in the corresponding text and it is also visible in the lower correlation metric. The reason is mainly that the cloud cover data of ISCCP is rather inhomogeneous and hence, the regression model cannot explain OLR and ASR well. For CLARA-A3 and especially CERES SSF, the regression model seems to work much better as the monthly variations are captured quite well.
Review: Line 304-306: Ignoring pre-2003 is not justified well enough. The authors seem to assume an inhomogeneity signifies an artifact in the dataset processing, when it could be something physical or fall within expected dataset uncertainties.
Reply: Especially for OLR, the values before 2003 are not comparable to the values after. This is also signalled by BEAST. It is also clearly visible in the cloud data of CLARA-A3 itself. In Figure A2 in the appendix, we show that especially the high cloud cover seems to be inhomogeneous: around 2000, all monthly values are lower than almost everything after 2003, and from approximately 2001 to 2003, all monthly values are higher than almost everything after 2003. Similar results are visible for the middle cloud cover and total cloud cover. It is very unlikely that this is natural variability. It may be related to switching between channels 3A and 3B on the NOAA satellites.
Review: Line 315-317: How would low cloud changes explain the break point if cloud effects have been removed in this residual calculation? I understand the cloud fields differ between this dataset and SSF, but so do the radiative fluxes that accompany each dataset.
Reply: We always use the same TOA all-sky OLR and ASR (namely from CERES EBAF). Hence, if there is a jump in the cloud cover, this jump will be subtracted from the (homogeneous) OLR or ASR, which results in a jump in the residuals. We test the residuals for homogeneity, so if there is a break point there, this most likely means that there is a break point in the cloud cover data used to correct for the cloud effect.
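Schematically (our notation): if the residual is formed as the observed flux minus the fitted cloud term, a step of size Δ in the cloud cover record at time t0 maps directly into a step in the residual:

```latex
r_t = F^{\mathrm{obs}}_t - \hat{\beta}\,C_t
\qquad\Rightarrow\qquad
C_t \to C_t + \Delta \;\;(t \ge t_0)
\;\Longrightarrow\;
r_t \to r_t - \hat{\beta}\,\Delta \;\;(t \ge t_0).
```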
Citation: https://doi.org/10.5194/egusphere-2025-418-AC2
EC1: 'Comment on egusphere-2025-418', Ivy Tan, 14 May 2025
Dear Jippe J.A. Hoogeveen,
Thank you for submitting the manuscript titled "Likely breaks in cloud cover retrievals complicate attribution of the trend in the Earth Energy Imbalance". I have now received two reviews of your manuscript. Both reviewers express concerns about the suitability of the methods used in the study, highlighting issues related to the clarity and justification of the datasets and methods employed. Concerns about the interpretation of the results were also raised.
Based on these reviews and my own assessment of the manuscript, I unfortunately cannot consider this manuscript for publication in Atmospheric Chemistry and Physics. I encourage you to consider addressing the reviewers' feedback and resubmitting your work in the future. Thank you for your interest in Atmospheric Chemistry and Physics.
Sincerely,
Ivy
Citation: https://doi.org/10.5194/egusphere-2025-418-EC1
Status: closed (peer review stopped)
-
RC1: 'Comment on egusphere-2025-418', Anonymous Referee #1, 07 Mar 2025
Overall take
The paper attempts to attribute changes in Outgoing Longwave Radiation (OLR) and Absorbed Solar Radiation (ASR), which when combined represent the Earth's Energy Imbalance (EEI) to changes in clouds, Water Vapor + Temperature, GHG, Aerosols + Surface Albedo. A statistical tool is used to diagnose whether the "residual" (GHG, Aerosols, Albedo) is heterogeneous in time, namely whether distinct periods exist where the influence changes sign or accelerates/decelerates rapidly. So-called "breaks" are found, but there is little interpretation or speculation of why they are there (it is admittedly difficult to pin down the reasons). I believe there is a serious misinterpretation of one of the critical datasets, no clear picture emerges, and the conclusion highlighted by the abstract (the role of low cloud changes) is not supported by the presentation, in my opinion. I thought the paper would discuss the challenges of putting together and reconciling long time series of disparate data, but it focuses on the well-observed post-2000 period (where we have high-quality CERES and MODIS); so one has to wonder why other, lower quality data have to be brought in. Ultimately, I think that this paper instead of shedding light on what's going on with OLR/ASR/EEI variations adds confusion with its questionable methodology and interpretation, and I thus can't recommend publication in anything close to its current form.
Below, I provide some specific comments in case the paper will be considered for publication:
Major points
-- Conclusion about low clouds Lines 9-10, but not reiterated in the conclusions. If this is a major finding of the paper any relevant results and figures can't be in the Appendix.
-- Lines 91-93. I think there's a major misunderstanding here. First of all the authors talk about cloud data in CERES SSF1deg above and then in these lines they invoke a CERES EBAF document. I think this statement draws from the DQS document p. 19, Table 7-1. What this part of the document addresses is how to convert MODIS narrowband radiances (within the CERES footprint) to broad band radiances so that a clear-sky flux at the sub-footprint level can be calculated from the clear-sky MODIS pixels. So C5 and C61 refer to radiance (L1b data) not cloud retrievals (Table 7-1 caption mentions fluxes). C5 and C61 cloud retrievals are performed by the GSFC-based Atmospheric Discipline MODIS team. The CERES project has its own cloud retrievals which are included in various CERES products. Invariably, when a retrieval algorithm changes, all the data is being reprocessed with the same algorithm: you don't use one algorithm for one part of the time series and another for another period. I think this major misinterpretation of different cloud retrievals for different periods unfortunately propagates throughout the paper.
-- Now, actual discontinuities in cloud retrievals, despite processing with the same algorithm, can happen because of other events. See for example https://atmosphere-imager.gsfc.nasa.gov/issues/cloud
-- Why was the break detection algorithm not applied to the Appendix cloud cover time series? That would've been interesting!
-- Cloud_cci, ERA5, and HIRS do not appear in sections 5 and 6. Why even include them in the paper?
-- Lines 11-12, decrease of OLR due to GHG increase. Well, there are a lot of negative trends in Table 3. Does this say something about deficiencies in the methodology? Clear-sky OLR does not have to go down if temperature goes up (the Earth warms to get rid of the excess heat).
-- There is no single plot of EEI = OLR+ASR. Wouldn't that give some clues about the quality of the datasets?
-- ISCCP is a notoriously inhomogeneous cloud dataset because of changes in the satellite fleet and instruments throughout time. But this is not mentioned.
Somewhat minor points
-- Figs. 1-4: Except for Fig. 1, pre-2000, it is hard to see what's going on. Perhaps smoothed versions (running mean) are in order? What's going on with ISCCP ASR circa 1984?
-- What's the physical meaning of the specific values of c? Are these values suitable for the anomaly values of this paper? Or is it a normalized value. They seem so arbitrary. So if I was looking at anomalies as high as 100 in the time series, how would those c values change? In general I found subsection 3.2 a bit too long and confusing. Are these details really needed? I think the reader has to take a leap of faith here and assume that the authors are using BEAST correctly.
-- Constant incoming solar assumption. Given the small magnitude of ASR anomalies, can you really make that assumption? I guess in a relative comparison of different ASR anomalies it's OK, but for absolute anomaly magnitudes, it is a bit precarious to make this assumption.
-- Lines 106-109: Why are you mentioning this? Does it matter for the analysis that follows?
-- Lines 98-99. The (radiative) tops of middle and high clouds lie between and above those levels, not the entire clouds.
-- Lines 225-226. This sentence basically says: OLRall-sky = OLRclr + (OLRall-sky - OLRclr). What is the point here? The two OLRclr's are not the same?
-- So in Fig. 5, residual = GHG. -0.27 Wm-2decade-1 is the value of the red bar in Fig. 5b, and of the slope of the red line in Fig. 5c, correct? In Fig. 6 residual = other. These things could be explained more clearly.
-- Lines 152-154. This sentence is unclear to me.
-- Lines 83-84. This sentence doesn't make sense.
Citation: https://doi.org/10.5194/egusphere-2025-418-RC1
AC1: 'Reply on RC1', Jippe Hoogeveen, 16 Apr 2025
First of all, thank you for your review! Below, we copied the review and we added our replies to your individual points.
Overall take
The paper attempts to attribute changes in Outgoing Longwave Radiation (OLR) and Absorbed Solar Radiation (ASR), which when combined represent the Earth's Energy Imbalance (EEI), to changes in clouds, Water Vapor + Temperature, GHG, and Aerosols + Surface Albedo. A statistical tool is used to diagnose whether the "residual" (GHG, Aerosols, Albedo) is heterogeneous in time, namely whether distinct periods exist where the influence changes sign or accelerates/decelerates rapidly. So-called "breaks" are found, but there is little interpretation or speculation of why they are there (it is admittedly difficult to pin down the reasons). I believe there is a serious misinterpretation of one of the critical datasets, no clear picture emerges, and the conclusion highlighted by the abstract (the role of low cloud changes) is not supported by the presentation, in my opinion. I thought the paper would discuss the challenges of putting together and reconciling long time series of disparate data, but it focuses on the well-observed post-2000 period (where we have high-quality CERES and MODIS); so one has to wonder why other, lower-quality data have to be brought in. Ultimately, I think that this paper, instead of shedding light on what's going on with OLR/ASR/EEI variations, adds confusion with its questionable methodology and interpretation, and I thus can't recommend publication in anything close to its current form.
Below, I provide some specific comments in case the paper will be considered for publication:
Major points
-- The conclusion about low clouds (Lines 9-10) is not reiterated in the conclusions. If this is a major finding of the paper, any relevant results and figures can't be in the Appendix.
Reply:
We agree that this is not a major point of the paper and that the specific statement on low clouds should not be included in the abstract. We propose to change this line to: ‘A formal break detection algorithm strongly suggests the existence of breaks in the timeseries of observed minus modelled TOA fluxes, in particular for ASR.’
-- Lines 91-93. I think there's a major misunderstanding here. First of all, the authors talk about cloud data in CERES SSF1deg above and then in these lines they invoke a CERES EBAF document. I think this statement draws from the DQS document p. 19, Table 7-1. What this part of the document addresses is how to convert MODIS narrowband radiances (within the CERES footprint) to broadband radiances so that a clear-sky flux at the sub-footprint level can be calculated from the clear-sky MODIS pixels. So C5 and C6.1 refer to radiance (L1b data), not cloud retrievals (Table 7-1 caption mentions fluxes). C5 and C6.1 cloud retrievals are performed by the GSFC-based Atmospheric Discipline MODIS team. The CERES project has its own cloud retrievals which are included in various CERES products. Invariably, when a retrieval algorithm changes, all the data are reprocessed with the same algorithm: you don't use one algorithm for one part of the time series and another for another period. I think this major misinterpretation of different cloud retrievals for different periods unfortunately propagates throughout the paper.
Reply: Although our description of the inhomogeneities in the manuscript was not completely accurate, we do believe that there are possible inhomogeneities in both CERES clear-sky OLR/ASR and CERES SSF cloud data, both related to the transition from MODIS C5 to C6.1 radiances. With reference to the relevant CERES documents, we explain below what we think could cause breaks in CERES clear-sky fluxes and CERES SSF cloud properties.
For CERES clear-sky TOA fluxes, the second bullet point on page 19 and Table 7-1 in the data quality summary of EBAF states that: "For EBAF Ed4.2, the MODIS or VIIRS narrowband to broadband coefficients are based on the collection (C) number that is consistent with imager version used in the CERES record. MODIS C5 coefficients were applied between 2000 and February 2016, MODIS C6.1 coefficients were applied between March 2016 and March 2022, and NOAA20 VIIRS C2.1 coefficients are applied beginning in April 2022." This part explains how a clear-sky flux is computed in mostly-overcast regions, as also summarized by the reviewer: clear-sky MODIS narrowband radiances are converted to broadband radiances and subsequently to fluxes. Until February 2016, MODIS C5 coefficients are used, but from March 2016 onwards MODIS C6.1 coefficients are used (we did not include the latest data from NOAA20 in our analysis). We suggest that this might result in a break in clear-sky fluxes, although we definitely do not exclude other causes.
In the paper, we will change lines 83-84 to: "For cloudy CERES footprints, TOA radiances calculated from clear-sky MODIS pixels within the footprint are used to determine clear-sky CERES fluxes. Coefficients for the conversion of MODIS narrowband radiances to broadband radiances are based on MODIS Collection 5 (C5) until February 2016, while from March 2016 onwards they are based on MODIS Collection 6.1 (C6.1) (CERES, 2024)."
For CERES SSF, the third bullet point on page 10 of the data quality summary of CERES SSF1deg states that: "The ED4A MODIS cloud properties are based upon Collection 5 through February 2016 and upon Collection 6.1 from March 2016 onwards." We suggest that this might cause an inhomogeneity in the cloud measurements of CERES SSF. Potential differences in cloud masks before and after this transition might also affect the CERES EBAF clear-sky fluxes through a different selection of clear-sky MODIS pixels. In the paper, we will change lines 85-87 to: "The CERES Single Scanner Footprint (SSF, version SSF1deg Ed4A) product includes cloud cover retrieved from the polar satellites Terra, Aqua, S-NPP and NOAA-20 (Doelling et al., 2013). For this paper, we focused on Terra and Aqua, since the other two satellites do not cover a long enough period. The relevant cloud property retrievals are based on MODIS C5 until February 2016 and MODIS C6.1 from March 2016 onwards (CERES, 2023). We averaged the measurements from Terra and Aqua."
Maybe the confusion comes from the fact that the L1B data were reprocessed entirely with MODIS C6.1, but the CERES SSF data were not. The data quality summary of EBAF in the sixth bullet point on page 20 states this: "The entire MODIS C6.1 L1B record was reprocessed; however, the CERES project only reprocessed the SSF1deg and SYN1deg records using MODIS C6.1 between February 2016 and March 2018."
-- Now, actual discontinuities in cloud retrievals, despite processing with the same algorithm, can happen because of other events. See for example https://atmosphere-imager.gsfc.nasa.gov/issues/cloud
Reply: This is acknowledged and we do not intend to exclude potential other causes for breaks in the time series. In the revised manuscript, specifically when discussing breaks between February and March 2016, we will use more careful formulations and leave more room for other potential causes of discontinuities.
-- Why was the break detection algorithm not applied to the Appendix cloud cover time series? That would've been interesting!
Reply: BEAST assumes that a continuous time series contains only a trend (and noise). However, the cloud cover time series most likely contain large natural variations, which makes it harder to estimate break points in them directly. In contrast, clear-sky fluxes contain less natural variation, especially since we also remove the temperature effect from OLR. We did try detecting breaks directly in cloud datasets in Figure A7 by examining the difference between the CLARA-A3 and CERES SSF cloud cover, for which natural variation should largely cancel out. Here, a few possible other inhomogeneities were detected, which might be related to variations in the collection of satellites used for CLARA-A3. These were strongest in low cloud cover, which is why this was mentioned in the main text.
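A minimal sketch of this difference-series check on synthetic data (the Rbeast package is assumed here as the BEAST implementation; the series, break size and break date are invented):

```python
# Break detection on the difference of two cloud-cover series, so that
# natural variability shared by both datasets largely cancels out.
# Synthetic data; assumes the Rbeast package (pip install Rbeast).
import numpy as np
import Rbeast as rb

rng = np.random.default_rng(0)
n = 240                               # 20 years of monthly anomalies
common = rng.normal(0.0, 0.8, n)      # natural variability seen by both datasets
a = common + rng.normal(0.0, 0.2, n)  # dataset A, homogeneous
b = common + rng.normal(0.0, 0.2, n)  # dataset B ...
b[150:] += 0.5                        # ... with an artificial jump added

diff = a - b                          # shared variability cancels in the difference
out = rb.beast(diff, season='none')   # trend-only model, no seasonal component
rb.print(out)                         # lists trend change points and probabilities
```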
-- Cloud_cci, ERA5, and HIRS do not appear in sections 5 and 6. Why even include them in the paper?
Reply: We added them to section 4 to illustrate that reliable OLR and ASR estimates are hard to achieve before 2000.
-- Lines 11-12, decrease of OLR due to GHG increase. Well, there are a lot of negative trends in Table 3. Does this say something about deficiencies in the methodology? Clear-sky OLR does not have to go down if temperature goes up (the Earth warms to get rid of the excess heat).
Reply: The trends in Table 3 represent the trend in OLR after the cloud, temperature and water vapour effects have been removed. Here we mainly expected to find the GHG effect, so negative trends are expected. However, stronger negative trends are reported in the scientific literature (see the discussion in lines 386-390). A final conclusion on the cause of this difference was not reached.
-- There is no single plot of EEI = OLR+ASR. Wouldn't that give some clues about the quality of the datasets?
Reply: EEI is a very important variable, but we doubt that this would provide additional insight: we do not expect factors in OLR and ASR to cancel out in EEI, so plotting EEI does not make inhomogeneities easier to detect.
-- ISCCP is a notoriously inhomogeneous cloud dataset because of changes in the satellite fleet and instruments throughout time. But this is not mentioned.
Reply: We agree that ISCCP is rather inhomogeneous. This was also mentioned in the conclusions, but we will add the following line at the end of Section 2.2: ‘It has been noted that due to varying spatial coverage of and instruments onboard the satellites underlying ISCCP, the data record is relatively inhomogeneous and less suited for trend analyses (Evan et al., 2007; Devasthale and Karlsson, 2023).’.
Somewhat minor points
-- Figs. 1-4: Except for Fig. 1, pre-2000, it is hard to see what's going on. Perhaps smoothed versions (running mean) are in order? What's going on with ISCCP ASR circa 1984?
Reply: We agree that the figures are not easy to read. However, we do not believe that plotting only the running means would be the best option, because it is also interesting to see whether the monthly variations match up between the different datasets. We propose to plot the monthly values with thinner lines and add the running means in bold.
Regarding the anomalies of ISCCP ASR in 1984: we do not know what causes this.
-- What's the physical meaning of the specific values of c? Are these values suitable for the anomaly values of this paper? Or is it a normalized value? They seem so arbitrary. So if I was looking at anomalies as high as 100 in the time series, how would those c values change? In general I found subsection 3.2 a bit too long and confusing. Are these details really needed? I think the reader has to take a leap of faith here and assume that the authors are using BEAST correctly.
Reply: The value c indicates the ratio between the size of the break and the standard deviation of the series. We will state this more clearly in Section 3.2. The idea of this section is to give a sense of how strongly BEAST responds to inhomogeneities and whether it is able to pinpoint the exact date. We believe that this will help readers interpret the results that BEAST obtains for real data further on.
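To make the meaning of c concrete: it is scale-free, so rescaling the anomalies rescales the break by the same factor and detectability is unchanged. A toy example with invented numbers:

```python
# c = (break size) / (standard deviation of the noise), a scale-free ratio.
# Anomalies of order 100 with a break of 50 give the same c = 0.5 (and the
# same detectability for BEAST) as this unit-scale series with a break of 0.5.
import numpy as np

rng = np.random.default_rng(1)
sigma, c, n = 1.0, 0.5, 240
y = rng.normal(0.0, sigma, n)
y[n // 2:] += c * sigma               # break of c standard deviations at mid-series

y_scaled = 100.0 * y                  # 100x larger anomalies, 100x larger break
print(c * sigma, c * 100.0 * sigma)   # break sizes differ, but c is 0.5 in both
```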
-- Constant incoming solar assumption. Given the small magnitude of ASR anomalies, can you really make that assumption? I guess in a relative comparison of different ASR anomalies it's OK, but for absolute anomaly magnitudes, it is a bit precarious to make this assumption.
Reply: The solar irradiance has only rather small variations (up to 0.4 Wm^-2 over the entire period). This can be important if one is interested in small ASR anomalies. The CERES measurements provide the reflected shortwave radiation, so to obtain the actual ASR, the incoming solar irradiance anomalies have to be added. However, we are interested in determining the cloud cover effect on ASR and finding inhomogeneities in this effect, and for that purpose it is best to remove the solar irradiance effect from ASR again. Since most solar irradiance is absorbed (about 70%), this means that 70% of the solar irradiance anomalies have to be subtracted from ASR again (so only 30% of these anomalies remain in the CERES RSF (reflected shortwave) measurements). Hence, only small anomalies caused by solar irradiance remain in our analyses (about 0.1 Wm^-2), and we believe it is safe to ignore this.
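Spelled out with the round numbers used above (approximate values; 0.7 is the absorbed fraction of incoming solar radiation):

```python
# The solar-correction arithmetic with the approximate numbers quoted above
# (global-mean anomalies, W m^-2).
incoming_anom = 0.4                   # peak anomaly in incoming solar irradiance
absorbed_frac = 0.7                   # fraction of incoming solar that is absorbed

in_asr = absorbed_frac * incoming_anom          # ~0.28: solar part of an ASR anomaly
in_rsf = (1 - absorbed_frac) * incoming_anom    # ~0.12: part left in the reflected flux
print(in_asr, in_rsf)
```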
-- Lines 106-109: Why are you mentioning this? Does it matter for the analysis that follows?
Reply: We mention this because it gives a possible explanation for why the CLARA-A3 cloud data can have an inhomogeneity in 2003. We will add it to the conclusions of our paper.
-- Lines 98-99. The (radiative) tops of middle and high clouds lie between and above those levels, not the entire clouds.
Reply: Thank you for pointing that out! We will adjust it.
-- Lines 225-226. This sentence basically says: OLRall-sky = OLRclr + (OLRall-sky - OLRclr). What is the point here? The two OLRclr's are not the same?
Reply: This sentence tries to illustrate how we estimate the cloud effect and the combined temperature and water vapour effect on OLR when using CERES clear-sky fluxes for the cloud effect. The idea is that we try to explain OLRclr with temperature and water vapour, which gives an estimate for the temperature and water vapour effect. Next, we use OLRall-sky - OLRclr for the cloud effect, and we add the two up for the total effect. We will improve the description of this reasoning in the paper.
-- So in Fig. 5, residual = GHG. -0.27 Wm-2decade-1 is the value of the red bar in Fig. 5b, and of the slope of the red line in Fig. 5c, correct? In Fig. 6 residual = other. These things could be explained more clearly.
Reply: Indeed, the values in Fig. 5(b) are the trends of the different drivers of OLR. 'Other' is the residual, so the value 'Other (GHG)' in 5(b) is the trend of 'Residual' in 5(c). This works the same way in Fig. 6, where 'Other (SA + AER)' in 6(b) is the trend of 'Residual' in 6(c). We will rename 'Other' to 'Residual' to clarify this.
-- Lines 152-154. This sentence is unclear to me.
Reply: In the regression, we take the cloud cover at different heights (and, for OLR, the global temperature) as explanatory variables to determine their effects. However, we expect that OLR and ASR are also impacted by factors that are not taken into account by the regression. These factors (such as GHG for OLR) are typically very monotonic and almost linear, which makes it hard to estimate their effect using linear regression. Hence, we add a linear trend as an explanatory variable to the regression model to absorb such factors. This trend is then not added to the model shown in, for example, Fig. 5(a); its only function is to account for this kind of factor, so that no other variable tries to do so. We will add this clarification in the paper.
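A minimal sketch of this setup with hypothetical monthly global-mean anomaly vectors (the exact regressors in the paper may differ):

```python
# Regression of OLR anomalies on cloud cover at three heights and temperature,
# plus a linear trend column whose only role is to absorb slowly varying
# factors (such as GHG) that the other regressors cannot represent.
# All inputs are hypothetical monthly global-mean anomaly vectors.
import numpy as np

def fit_olr(olr, low, mid, high, temp):
    n = olr.size
    trend = np.arange(n) / 120.0                      # decades since start
    X = np.column_stack([low, mid, high, temp, trend, np.ones(n)])
    coef, *_ = np.linalg.lstsq(X, olr, rcond=None)
    fitted = X[:, :4] @ coef[:4] + coef[5]            # model without the trend term
    residual = olr - fitted                           # trend + noise; tested with BEAST
    return coef, residual
```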
-- Lines 83-84. This sentence doesn't make sense.
Reply: Yes, this is an error, thank you for pointing it out! We will correct it.
References
CERES: CERES_SSF1deg-Hour/Day/Month_Ed4A Data Quality Summary, Version 2, NASA, 45pp, https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_SSF1deg_Ed4A_DQS.pdf, 2023.
CERES: CERES_EBAF_Ed4.2 and Ed4.2.1 Data Quality Summary, Version 4, NASA, 30pp, https://ceres.larc.nasa.gov/documents/DQ_summaries/CERES_EBAF_Ed4.2_DQS.pdf, 2024.
Evan, A.T., Heidinger, A.K., and Vimont, D.J.: Arguments against a physical long-term trend in global ISCCP cloud amounts. Geophys. Res. Lett., 34, https://doi.org/10.1029/2006GL028083, 2007.
Devasthale, A. and Karlsson, K.-G.: Decadal Stability and Trends in the Global Cloud Amount and Cloud Top Temperature in the Satellite-Based Climate Data Records. Remote Sensing, 15(15), 3819. https://doi.org/10.3390/rs15153819, 2023.
Citation: https://doi.org/10.5194/egusphere-2025-418-AC1
RC2: 'Comment on egusphere-2025-418', Anonymous Referee #3, 13 May 2025
The authors attempt to identify and explain “break points” in radiation flux satellite data records by first quantifying and removing most physical drivers of the trend caused by changes in temperature, water vapor or clouds, and then calculating break points on the remaining residual variability, which would likely be from GHG or aerosol forcings. These break points may be an artifact of the dataset algorithm or a physical behavior. While attempts to explain trends and identify uncertainty in retrievals are important, I do not feel the methods and results presented here are appropriate for this task. I found the methods to be unclear and not well justified given that the linear regression model often does not capture the variability of the total radiation. This complicates interpretation of the break points. The proposed model also gives quite different trend breakdowns of the different driving factors across datasets, despite the authors focusing on a post-2000 period where the overall radiation trends agree quite well between the datasets. In the results, the authors seemed to arbitrarily select or filter out break points (phrases like “...so we keep the break point there”) and then often give only surface-level explanations (sometimes incorrect) of potential retrieval artifacts without assessing whether the break points are capturing something physical. In some cases they give no explanation, which would often be fine if they ruled out explanations, but they don’t detail what they investigated before declaring it unexplainable. Break points and potential dataset retrieval issues are an important and often understudied topic, so I encourage the authors to rethink their paper, possibly using more established attribution techniques or a smaller, more manageable number of datasets that will allow them to dig further into the details. But in its current form, I cannot recommend this paper for acceptance and suggest the journal consider a rejection with the possibility for a re-submission in the future.
Line 40-41: The Dentener et al. timeseries is capturing Effective Radiative Forcing while the Loeb, Kramer, etc. estimates are just capturing the instantaneous radiative forcing, which does not include stratospheric or tropospheric adjustments to the CO2 and thus is smaller than the ERF. This is likely the primary cause of the discrepancy mentioned in these lines.
Line 82-84: CERES-EBAF 4.2 now has two types of clear-sky flux: the partial sky (c) product and the total sky (t) product. The authors should note which product they are referring to in these lines and which product they will use for the remainder of the paper.
Section 3.1: This section tries to describe the methods used to isolate the temperature, water vapor and cloud contributions to the total radiation. I found this section to be unclear, too vague and often not particularly well-justified.
- Applying linear regression in this manner can't guarantee an isolation of each contributing factor as cleanly as other well-established methods such as PRP or radiative kernels. Why not use those methods to isolate the contributions from T, WV and clouds?
- This section just refers to using the “temperature data of UAH” to represent temperature and water vapor contributions to the radiation. It is not clear what temperatures from the UAH dataset were used (mid-trop? Near surface?). More detail is necessary. There are also stable, well-tested direct measurements of moisture the authors could have used, such as from AIRS, HIRS, etc rather than using a proxy. Some explanation for why UAH temperature was used instead of those actual water vapor options is warranted.
- Line 146/147 says “Global averages of temperature and clouds are regionally weighted”. Does this mean the authors start with global-mean values and then distribute that regionally based on weights? This was unclear and, if so, would seem to add unnecessary uncertainty since regionally-resolved data is available from these products.
- Generally it seems the authors are deriving cloud contributions by scaling clear-sky fluxes by cloud fraction. There needs to be some evidence or citations showing that this proportionality assumption is valid. I commend that the authors acknowledge this assumption could be a source of uncertainty in the conclusion sections, but since it is the foundation of this paper, making those acknowledgments is not enough.
- For CERES applications, the effects of clouds are diagnosed by using cloud radiative effect. However this can be misleading as CRE includes cloud masking effects from other variables, especially for the longwave. The authors should account for this either directly or indirectly in their analysis.
Section 3.2 – I appreciate that the authors evaluated the uncertainties of the BEAST method. It would be helpful if they added a conclusion sentence or two to this section about what we should take away from this analysis and keep in mind as we analyze the results in the rest of the paper. For instance, given that the breaks were placed a bit earlier than index 121 (Table 2), should we assume breaks will be biased early when applied to the actual observations too?
General Fig 1-4: Given the improved agreement between datasets after 2000, it begs the question: to what extent are most of these datasets dependent on CERES? Is CERES brought into the algorithm of these other datasets? Or are corrections made in order to better align with CERES? Or is the improved agreement around the time CERES began just a coincidence? Some discussion of this would be useful.
Line 235-238: The authors try to point to artificial causes of the breaks in the CERES record here, but neither makes sense given the information provided. First, it is not clear which CERES product is being used in Figure 5, but if it's an observed flux rather than a computed flux, mention of the GEOS analysis dataset is not relevant. Additionally, the authors surmise a CERES break in 2016 may come from a change in MODIS cloud product version, but CERES does not use the operational MODIS algorithms. And, regardless, when those MODIS algorithms are updated, the entire record is reprocessed rather than transitioned in the middle of the timeseries. A break point in 2016 would most likely come from the large El Nino that occurred rather than the artificial cause that was mentioned. And while one may assume El Nino signals would be removed from the residual timeseries by subtracting out temperature, moisture and cloud effects, that is only true if the model worked perfectly. If not, as seems to be the case here, breaks due to ENSO or other sources of variability are a possible explanation.
Line 245: This sentence may be true but its purpose is not clear and it seems unnecessary as currently written.
Line 251: Again, a MODIS algorithm version change means the entire record gets reprocessed, so this explanation doesn't apply here.
Line 253-255: The July 2001 break is a full year before the switch from Terra-only to Terra+Aqua, as noted. There seems to be no clear justification to assume the addition of Aqua is relevant to this 2001 break. If some justification exists, the authors should state it.
Line 275-276 and 290: The authors state they can't explain the break points in the ISCCP data, but it is not clear what they did in an attempt to understand them. Did they investigate whether these changes fall within expectations of natural variability vs a forced response? Did they review the ISCCP ATBD to identify potential dataset changes or switches in satellite source? ISCCP is a stitching over many difficult-to-calibrate instruments.
Line 297: Correlation may not be a sufficient metric for skill. In 9a, the anomalies may be going in the right direction but clearly they are too small and fail to capture the magnitude of the variability compared to the CERES results shown before this. Given that the model did not work for ISCCP either, I begin to wonder if the methodology applied to capture cloud effects is not appropriate, but it is hard for me to judge because the method description is not yet clear, as noted in my comments above.
Line 304-306: Ignoring pre-2003 is not justified well enough. The authors seem to assume an inhomogeneity signifies an artifact in the dataset processing, when it could be something physical or fall within expected dataset uncertainties.
Line 315-317: How would low cloud changes explain the break point if cloud effects have been removed in this residual calculation? I understand the cloud fields differ between this dataset and SSF, but so do the radiative fluxes that accompany each dataset.
Citation: https://doi.org/10.5194/egusphere-2025-418-RC2
AC2: 'Reply on RC2', Jippe Hoogeveen, 25 Jun 2025
First of all, thank you for the review! It appears that the main criticism is on the attribution method, in particular the fact that linear regression is used instead of other techniques. We agree that our method might not produce a perfect attribution. However, this is not really the goal of the paper. Instead, the paper focuses on how break points can impact attribution of the trends in EEI. To achieve this goal, we perform a relatively simple but adequate attribution and test the residuals for inhomogeneities. Below, we copied your review and added our replies.
Review: The authors attempt to identify and explain “break points” in radiation flux satellite data records by first quantifying and removing most physical drivers of the trend caused by changes in temperature, water vapor or clouds, and then calculating break points on the remaining residual variability, which would likely be from GHG or aerosol forcings. These break points may be an artifact of the dataset algorithm or a physical behavior. While attempts to explain trends and identify uncertainty in retrievals are important, I do not feel the methods and results presented here are appropriate for this task. I found the methods to be unclear and not well justified given that the linear regression model often does not capture the variability of the total radiation. This complicates interpretation of the break points.
Reply: We agree that more sophisticated attribution methods are available, but since the main aim of this paper is to study the impact of break points, it is not necessary to use the most advanced attribution methods.
Your main criticism is that the linear regression model has less variance than the observed values. However, for most of our models, the monthly variation is captured rather well, with correlation coefficients around 0.7 to 0.8. This suggests that the regression model works well. It is natural that the model has less variance than the observations, because, generally, the observations are the model plus extra noise, where the noise can also be caused by factors not taken into account in the model.
We believe that the main improvements in the analysis can be achieved by including more explanatory factors, such as cloud optical depth. However, this makes it more difficult to determine the underlying cause of an inhomogeneity. Hence, we tried to keep the model as simple as possible, also because we do not aim to produce the perfect attribution.
Review: The proposed model also gives quite different trend breakdowns of the different driving factors across datasets, despite the authors focusing on a post-2000 period where the overall radiation trends agree quite well between the datasets.
Reply: There is indeed some difference in attribution between the different datasets, especially for ASR. For OLR, the CERES clear-sky differs from the other analyses, but the others are all rather similar. However, we do believe that this is not caused by the regression model, but instead by differences between the cloud datasets, partly because of inhomogeneities. For example, the two cloud datasets CLARA-A3 and CERES SSF differ in the low cloud cover trend by about 0.2% per decade, most likely because of inhomogeneities in CLARA-A3. This seems the main reason for the different attribution of ASR between these datasets. This also illustrates the importance of identifying inhomogeneities in cloud datasets, before a correct attribution can be made.
Review: In the results, the authors seemed to arbitrarily select or filter out break points (phrases like “...so we keep the break point there”) and then often give only surface-level explanations (sometimes incorrect) of potential retrieval artifacts without assessing whether the break points are capturing something physical. In some cases they give no explanation, which would often be fine if they ruled out explanations, but they don’t detail what they investigated before declaring it unexplainable. Break points and potential dataset retrieval issues are an important and often understudied topic, so I encourage the authors to rethink their paper, possibly using more established attribution techniques or a smaller, more manageable number of datasets that will allow them to dig further into the details. But in its current form, I cannot recommend this paper for acceptance and suggest the journal consider a rejection with the possibility for a re-submission in the future.
Reply: We generally tested the significance of break points using BEAST and reviewed the metadata for potential causes. When we were unable to find a potential cause in the metadata, we reported that we do not know of an explanation.
Review: Line 40-41: The Dentener et al. timeseries is capturing Effective Radiative Forcing while the Loeb, Kramer, etc. estimates are just capturing the instantaneous radiative forcing, which does not include stratospheric or tropospheric adjustments to the CO2 and thus is smaller than the ERF. This is likely the primary cause of the discrepancy mentioned in these lines.
Reply: Thank you for pointing that out. This is indeed a difference that needs to be mentioned.
Review: Line 82-84: CERES-EBAF 4.2 now has two types of clear-sky flux: the partial sky (c) product and the total sky (t) product. The authors should note which product they are referring to in these lines and which product they will use for the remainder of the paper.
Reply: We used the partial sky (c) product.
Review: Section 3.1: This section tries to describe the methods used to isolate the temperature, water vapor and cloud contributions to the total radiation. I found this section to be unclear, too vague and often not particularly well-justified
-Applying linear regression in this manner can’t guarantee an isolation of each contributing factor as cleanly as other well-established methods such as PRP or radiative kernels. Why not use those methods to isolate the contributions from T, WV and clouds?
Reply: As explained before, the paper focuses on how inhomogeneities can impact attribution and does not aim to perform the perfect attribution. We believe that linear regression is suitable because it should theoretically find the correct coefficients and is free of (model) biases. Linear regression only struggles if two explanatory variables are highly correlated (such as T and WV), which is why we used T as an explanatory variable for both factors.
Review: -This section just refers to using the “temperature data of UAH” to represent temperature and water vapor contributions to the radiation. It is not clear what temperatures from the UAH dataset were used (mid-trop? Near surface?). More detail is necessary. There are also stable, well-tested direct measurements of moisture the authors could have used, such as from AIRS, HIRS, etc rather than using a proxy. Some explanation for why UAH temperature was used instead of those actual water vapor options is warranted.
Reply: We used the lower-troposphere temperature, which should indeed have been mentioned. Temperature and water vapour are closely correlated, so linear regression has difficulty handling them correctly. In particular, it has difficulty disentangling the temperature and water vapour effects, but the sum of both effects can be determined rather well. Hence, we use one explanatory variable (temperature) to approximate this combined effect. The paper does not focus on finding the best possible attribution, so it is not really important that the temperature and water vapour effects are not separated. It is only important that the sum of both effects is (approximately) removed from the all-sky flux, because that improves the detection of break points.
Review: -Line 146/147 says “Global averages of temperature and clouds are regionally weighted”. Does this mean the authors start with global-mean values and then distribute that regionally based on weights? This was unclear and, if so, would seem to add unnecessary uncertainty since regionally-resolved data is available from these products
Reply: This sentence explains that we calculate regional weighting factors and construct a global average by weighting the regional data with these factors. For example, to explain all-sky ASR, we weight the regional cloud cover with the average clear-sky ASR over that region.
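Our reading of this weighting, as a sketch (array shapes and names are hypothetical: cloud holds monthly regional cloud-cover anomalies, clearsky_asr a regional clear-sky ASR climatology):

```python
# Build one global predictor by averaging regional cloud-cover anomalies with
# weights proportional to the regional mean clear-sky ASR (and grid-cell area),
# since cloud changes matter most where the clear-sky flux is large.
import numpy as np

def weighted_global_mean(cloud, clearsky_asr, lat):
    # cloud: (time, nlat, nlon); clearsky_asr: (nlat, nlon); lat: (nlat,) degrees
    area = np.cos(np.deg2rad(lat))[:, None]           # grid-cell area factor
    w = clearsky_asr * area                           # combined regional weights
    return (cloud * w).sum(axis=(1, 2)) / w.sum()     # one value per month
```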
Review: -Generally it seems the authors are deriving cloud contributions by scaling clear-sky fluxes by cloud fraction. There needs to be some evidence or citations showing that this proportionality assumption is valid. I commend that the authors acknowledge this assumption could be a source of uncertainty in the conclusion sections, but since it is the foundation of this paper, making those acknowledgments is not enough.
Reply: We agree that, if one wants to make a perfect attribution, this can have an impact. However, we doubt that the weighting has a large impact, and because we do not focus on finding a perfect attribution, we believe that this is good enough. We can test this further by trying other weighting coefficients.
Review: -For CERES applications, the effects of clouds is diagnosed by using cloud radiative effect. However this can be misleading as CRE includes cloud masking effects from other variables, especially for the longwave. The authors should account for this either directly or indirectly in their analysis.
Reply: We agree that other factors can have an impact on CRE too, especially for OLR. The regression can take the effect of temperature and water vapour into account, as these are the most important factors. This is also visible in the results, where the temperature effect on CERES clear-sky OLR is much smaller than expected, due to cloud masking. For ASR, we do not believe that the impact of other factors is too large for our purposes.
Review: Section 3.2 – I appreciate that the authors evaluated the uncertainties of the BEAST method. It would be helpful if they added a conclusion sentence or two to this section about what we should take away from this analysis and keep in mind as we analyze the results in the rest of the paper. For instance, given that the breaks were placed a bit earlier than index 121 (Table 2), should we assume breaks will be biased early when applied to the actual observations too?
Reply: We did add some of these remarks after the analysis: (i) if BEAST reports a probability larger than 0.2, then there is likely an inhomogeneity and (ii) BEAST has trouble in finding the exact location. However, we agree that it can be clearer.
Review: General Fig 1-4: Given the improved agreement between datasets after 2000, it begs the question to what extent are most of these datasets dependent on CERES? Is CERES brought into the algorithm of these other datasets? Or are corrections made in order to better align with CERES? Or is the improved agreement around the time CERES began just a coincidence? Some discussion of this would be useful.
Reply: As far as we know, the other datasets are not tuned to match CERES. There are dependencies: for example, CLARA-A3 uses CERES data for converting radiances to fluxes, but these conversions are constant over the entire period. Instead, we believe that the improved agreement from 2000 onward is to a large extent caused by the increasing number of satellites.
Review: Line 235-238: The authors try to point to artificial causes of the breaks in the CERES record here, but neither makes sense given the information provided. First, it is not clear which CERES product is being used in Figure 5, but if it's an observed flux rather than a computed flux, mention of the GEOS analysis dataset is not relevant. Additionally, the authors surmise a CERES break in 2016 may come from a change in MODIS cloud product version, but CERES does not use the operational MODIS algorithms. And, regardless, when those MODIS algorithms are updated, the entire record is reprocessed rather than transitioned in the middle of the timeseries. A break point in 2016 would most likely come from the large El Nino that occurred rather than the artificial cause that was mentioned. And while one may assume El Nino signals would be removed from the residual timeseries by subtracting out temperature, moisture and cloud effects, that is only true if the model worked perfectly. If not, as seems to be the case here, breaks due to ENSO or other sources of variability are a possible explanation.
Reply: The first break point, in January 2008, concerns the clear-sky OLR from EBAF. In CERES EBAF 2.8, there was a known inhomogeneity in January 2008 caused by a transition from GEOS 4 to GEOS 5.2.1 (see Clouds and the Earth's Radiant Energy System (CERES) Energy Balanced and Filled (EBAF) Top-of-Atmosphere (TOA) Edition-4.0 Data Product, Journal of Climate, Volume 31, Issue 2, 2018). This transition caused an inhomogeneity in scene identification and hence in clear-sky OLR. This was corrected in EBAF 4.0 by using GEOS 5.4.1 over the entire period. Nonetheless, we still detect an inhomogeneity in January 2008 in the clear-sky OLR.
The other break points concern the MODIS algorithm. We agree that our explanation was not entirely correct (as was also noted by the first reviewer), but we still believe that the transition in MODIS collection version can cause an inhomogeneity.
First of all, MODIS clear-sky radiance measurements are added to CERES clear-sky fluxes. This is done when the CERES pixel is cloudy but some of the smaller MODIS pixels are clear (as MODIS has higher resolution than CERES). If such a situation occurs, then the MODIS narrowband radiance measurements are converted to broadband clear-sky fluxes. Until February 2016, MODIS C5 coefficients were used, but from March 2016, MODIS C6.1 coefficients are used. This is also mentioned in the data quality summary of EBAF at the sixth bullet point in Section 3.5 on page 12 (see Clouds and the Earth's Radiant Energy System).
Secondly, the MODIS cloud properties also transitioned from C5 to C6.1 in March 2016. This can impact the clear-sky flux through scene identification. The data quality summary of CERES SSF mentions this as well, in Section 3.0, at the fifth bullet point under Clouds on page 10 (see Clouds and the Earth's Radiant Energy System).
The confusion most likely comes from the fact that the L1B data were entirely reprocessed with MODIS C6.1, but the L3 data were not (see the data quality summary of EBAF).
Review: Line 245: This sentence may be true but its purpose is not clear and it seems unnecessary as currently written.
Reply: The purpose of this sentence is to illustrate that break points can have a significant impact on the total residual trend and hence on the attribution.
Review: Line 251: Again, MODIS algorithm version change means the entire record gets reprocessed, so this explanation doesn’t apply here.
Reply: See our previous reply.
Review: Line 253-255: The July 2001 break is a full year before the switch from Terra-only to Terra+Aqua, as noted. There seems to be no clear justification to assume the addition of Aqua is relevant to this 2001 break. If some justification exists, the authors should state it.
Reply: In Section 3.2, we show that BEAST does not always locate the break point correctly: especially for smaller break points, it can be off by more than a year. Hence, we listed the switch from Terra-only to Terra + Aqua as a possible cause.
Review: Line 275-276 and 290: The authors state they can’t explain the break points in the ISCCP data, but it is not clear what they did in an attempt to understand them. Did they investigate whether these changes fall within expectations of natural variability vs a forced response? Did they review the ISCCP ATBD to identify potential dataset changes or switches in satellite source? ISCCP is a stitching over many difficult-to-calibrate instruments.
Reply: We reviewed the ISCCP ATBD and found no clear cause there.
Review: Line 297: Correlation may not be a sufficient metric for skill. In 9a, the anomalies may be going in the right direction but clearly they are too small and fail to capture the magnitude of the variability compared to the CERES results shown before this. Given that the model did not work for ISCCP either, I begin to wonder if the methodology applied to capture cloud effects is not appropriate, but it is hard for me to judge because the method description is not yet clear, as noted in my comments above.
Reply: We do not agree with this comment. First of all, Fig. 9a clearly shows that the model captures most of the monthly variations and is not simply increasing along with the observations. It is natural that the observations have more variation than the model. After all, the observations are the model plus noise (or factors not taken into account by the model). Hence, as the noise is uncorrelated with the model, the variance of the observations is larger than the variance of the model. This is also visible in Section 5.1: the CERES clear-sky OLR model also has smaller variance than the all-sky OLR observations, similar to the CLARA-A3 model, even though the CERES clear-sky uses clear-sky OLR observations to estimate the cloud effect, instead of linear regression.
We agree that the regression model based on ISCCP is not that good and that, in particular, the monthly variations are poorly represented in the model, which is also clearly visible in Figures 7(a) and 8(a). We acknowledge this in the corresponding text and it is also visible in the lower correlation metric. The reason is mainly that the cloud cover data of ISCCP is rather inhomogeneous, and hence the regression model cannot explain OLR and ASR well. For CLARA-A3 and especially CERES SSF, the regression model seems to work much better, as the monthly variations are captured quite well.
Review: Line 304-306: Ignoring pre-2003 is not justified well enough. The authors seem to assume an inhomogeneity signifies an artifact in the dataset processing, when it could be something physical or fall within expected dataset uncertainties.
Reply: Especially for OLR, the values before 2003 are incomparable to the values after. This is also signalled by BEAST. It is also clearly visible in the cloud data of CLARA-A3 itself. In Figure A2 in the appendix, we show that especially the high cloud cover seems to be inhomogeneous: around 2000, all monthly values are lower than almost everything after 2003 and from approximately 2001 to 2003, all monthly values are higher than almost everything after 2003. Similar results are visible for the middle cloud cover and total cloud cover. It is very unlikely that this is natural variability. It may be related to switching between channels 3A and 3B on the NOAA satellites.
Review: Line 315-317: How would low cloud changes explain the break point if cloud effects have been removed in this residual calculation? I understand the cloud fields differ between this dataset and SSF, but so do the radiative fluxes that accompany each dataset.
Reply: We always use the same TOA all-sky OLR and ASR (namely from CERES EBAF). Hence, if there is a jump in the cloud cover, then this jump will be subtracted from the (homogeneous) OLR or ASR, which results in a jump in the residuals. We test the residuals for homogeneity, so if there is a break point there, this most likely means that there is a break point in the cloud cover data used to correct for the cloud effect.
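A toy demonstration of this mechanism with synthetic series (all numbers are invented):

```python
# A jump in the cloud-cover record alone ends up as a jump in the residual
# (observed minus modelled flux), even though the flux record is homogeneous.
import numpy as np

rng = np.random.default_rng(2)
n = 240
cloud_true = rng.normal(0.0, 1.0, n)               # true cloud-cover anomalies
asr = -0.9 * cloud_true + rng.normal(0.0, 0.3, n)  # homogeneous flux record

cloud_obs = cloud_true.copy()
cloud_obs[120:] += 1.0                             # retrieval break in cloud data only

coef = np.polyfit(cloud_obs, asr, 1)               # regress flux on broken cloud record
residual = asr - np.polyval(coef, cloud_obs)
print(residual[:120].mean(), residual[120:].mean())  # a step shows up in the residuals
```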
Citation: https://doi.org/10.5194/egusphere-2025-418-AC2
EC1: 'Comment on egusphere-2025-418', Ivy Tan, 14 May 2025
Dear Jippe J.A. Hoogeveen,
Thank you for submitting the manuscript titled "Likely breaks in cloud cover retrievals complicate attribution of the trend in the Earth Energy Imbalance". I have now received two reviews of your manuscript. Both reviewers express concerns about the suitability of the methods used in the study, highlighting issues related to the clarity and justification of the datasets and methods employed. Concerns about the interpretation of the results were also raised.
Based on these reviews and my own assessment of the manuscript, I unfortunately cannot consider this manuscript for publication in Atmospheric Chemistry and Physics. I encourage you to consider addressing the reviewers' feedback and resubmitting your work in the future. Thank you for your interest in Atmospheric Chemistry and Physics.
Sincerely,
Ivy
Citation: https://doi.org/10.5194/egusphere-2025-418-EC1