the Creative Commons Attribution 4.0 License.
Detectability of forced trends in stratospheric ozone
Abstract. The continued monitoring of the ozone layer and its long-term evolution leans on comparative studies of merged satellite records. Such records present unique challenges due to differences in sampling, coverage, and retrieval algorithms between observing platforms, leading to discrepancies in trend calculations. Here we examine the effects of optimal estimation retrieval algorithms on vertically resolved ozone trends, using one merged record as an example. We find errors as large as 1 % per decade and displacements in trend profile features of as much as 6 km altitude due to the vertical redistribution of information by averaging kernels. Furthermore, we show that averaging kernels tend to increase the length of record needed to determine whether vertically resolved trend estimates are distinguishable from natural variability with good statistical confidence. We conclude that trend uncertainties may be underestimated, in part because averaging kernels misrepresent decadal to multi-decadal internal variability, and in part because the removal of known modes of variability from the observed record can yield residual errors. The study provides a framework to reconcile differences between observing platforms, and highlights the need for caution when using merged satellite records to quantify trends and their uncertainties.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2627', Anonymous Referee #1, 14 Sep 2024
The manuscript uses ozone time series from the SBUV satellites along with model simulations from the Earth System Chemistry Climate Model (ESM) and the Chemistry Climate Model Intercomparison (CCMI) to investigate long-term trends in ozone and compare them with ozone variability from the model simulations. There are two major results: 1) Wide averaging kernels of observations like SBUV mix information from different vertical levels. This can shift and distort vertical trend profiles. 2) Uncertainty estimates are necessary to determine if a trend is significant compared to natural variability. Again, wide averaging kernels combine information from different altitude levels, and this tends to result in underestimated variability. One example is the ozone variations associated with the QBO. These are important for trend estimation in the atmosphere, but are reduced and smeared out in SBUV data. This tends to result in errors when accounting for the QBO, and in incorrect uncertainty estimates. Overall this is important information. The paper is well written and deserves publication in ACP.
There are a number of points that should be improved, though:
Abstract and other places in the text: The authors point out a number of problems with merged satellite records (sampling, calibration, instrumental differences, ...). I find this misleading, because the manuscript does not account for any of these "merging" issues. The only issue addressed here is the SBUV averaging kernels. So I don't think the merging issues should be mentioned in the abstract. The first two sentences should be dropped. In line 4, "one merged" should be replaced by "the SBUV MOD". In line 11 "merged satellite records" should be replaced by "records from instruments with wide averaging kernels".
Line 16/17: "continued recovery" I think this should be "beginning recovery". This also applies to other places where "recovery" is mentioned. We are just at the beginning of ozone recovery. We are far from "recovered" and, as explained in the paper, we are also far from significant recovery in many regions of the atmosphere.
Lines 20,22: delete "lower" and add "in the tropics" after "abundances". The main branch of the Brewer Dobson circulation transports ozone rich air in the mid- and upper stratosphere from the tropics to the extra-tropics. Enhanced upwelling in the tropics is decreasing ozone in the tropical lower stratosphere.
Line 29: suggest to replace "newly detected" by "recent illegal"
Line 30: "increasingly large" is too strong. I would say "possibly increasing"
Line 32: explicitly add "e.g. Hunga Tonga-Hunga Ha'apai in 2022".
Lines 46-47: Reword. You are not addressing merging challenges, you are only addressing the effects of wide averaging kernels.
Figure 1: Please explain why the power density of the ESM4 historical runs in the 2 to 20 year range is lower in the top panel and larger in the bottom panel. I guess it is due to applying the SBUV averaging kernel in the top panel. I think this needs to be said / clarified in caption and text. It looks like the averaging kernels reduce variability. Also change the text in the legend in the top panel e.g. to ESM4 historical@SBUV resolution. It needs to be different from ESM4 historical in the legend of the lower panel.
Line 98: better to say "pre-industrial simulations" instead of "these simulations"
Line 134: I would start a new paragraph after NOAA. It should also be pointed out here that SBUV-MOD and SBUV from NOAA have wide averaging kernels and use the same nadir-viewing satellite data. On the other hand, GOZCARDS, SWOOSH, and the other data sets use LIMB and occultation instruments, which have much finer altitude resolution.
Figure 2: It would be good to have another panel showing the a-priori ozone profile and the two profiles, in addition to the panel showing the deviation of the two profiles from the a-priori.
Line 230: for clarification, after "70 hpa", add "from the model simulations"? As shown later the model QBO is quite different from the "real" QBO.
Line 233/234: "see NOAA ... June 2024" Again, I am assuming you are using the ENSO of the model simulations, not the "real" ENSO. So, while the NOAA page is a good reference, it is kind of misleading here. Please move, reword / clarify.
Line 275: It would be helpful to explain skewness and kurtosis a bit more here. What you are saying is that the residuals are often not normally distributed, with distributions leaning to the left (skewness greater than 0.5), and distributions that are narrower than a normal distribution (positive excess kurtosis).
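To make the two statistics in this comment concrete, the following is a minimal Python sketch that assumes the standard moment-based definitions; the estimator actually used in the manuscript is not stated here. The function name `moment_stats` and the synthetic samples are illustrative only. Under these definitions, positive skewness indicates a longer right-hand tail, and positive excess kurtosis indicates heavier tails together with a sharper central peak than a normal distribution.

```python
import numpy as np

def moment_stats(residuals):
    """Return (skewness, excess kurtosis) from standardized central moments."""
    r = np.asarray(residuals, dtype=float)
    d = r - r.mean()
    m2 = np.mean(d ** 2)                       # population variance
    skew = np.mean(d ** 3) / m2 ** 1.5         # third standardized moment
    excess_kurt = np.mean(d ** 4) / m2 ** 2 - 3.0  # zero for a normal distribution
    return skew, excess_kurt

rng = np.random.default_rng(0)
normal_like = rng.normal(size=100_000)         # synthetic, roughly Gaussian residuals
right_tailed = rng.exponential(size=100_000)   # synthetic, strongly right-tailed residuals

print("normal-like :", moment_stats(normal_like))
print("right-tailed:", moment_stats(right_tailed))
```

For the normal-like sample both values should sit close to zero, while the right-tailed sample should be clearly positive in both.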
Line 294: would be helpful to add "(e.g. the red curve in Fig. 4c)" after "earlier", and " (the black curve in Fig. 4c)" after "itself".
Figure 5: I find it difficult to see much in panel a.) I think it would be better to show here the ratio (standard deviation)/(average values), i.e. the relative standard deviation, e.g. as percent. The overall ozone distribution (average values) will be well known to the readers. The relative standard deviation (or variability) in percent will be much better to compare, e.g. to trends which are also in percent per decade. If the authors don't want to change panel a.) they should add another panel with the relative standard deviation.
Line 314: add "CCMI" before modeled? I assume you are talking about trends from CCMI here.
Line 353: "sampling and retrieval". The way I understand it from section 2.3 you are using SBUV-MOD monthly zonal mean data. I assume you are also using monthly zonal mean data from the model simulations, but without accounting for the specific times and locations of the individual SBUV measurements. Am I correct? Are you dropping polar night data? My guess would be that your model sampling is the same in both hemispheres / polar caps, so "sampling" differences should not play a role here. You only see differences due to the retrieval / averaging kernels, which mixes and redistributes stuff from different altitudes. But, in my understanding, you do not look at sampling differences, i.e. differences due to the specific times and locations of the SBUV observations. So delete "sampling and".
Section 4.3: What you have done is applied averaging kernels and then done trends (avk -> trend). An interesting question to me is whether doing trends first and then applying the averaging kernels to the trend profile (trend -> avk) would give the same result. For the mean this should be the case, because both averaging kernels and trend derivation are linear operations on the underlying data. Not sure what it means for the uncertainties though.
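The linearity argument in this comment can be checked numerically. The sketch below is not the manuscript's procedure: it assumes the usual smoothing relation x_smoothed = x_a + A (x - x_a), an a priori profile x_a that does not change in time, and an ordinary least-squares trend fitted independently at each level. The matrix `A`, the grid sizes, and the synthetic data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_time, n_lev = 240, 8                     # hypothetical: 20 years of months, 8 layers
t = np.arange(n_time) / 120.0              # time in decades

# Hypothetical tridiagonal smoothing matrix standing in for an averaging kernel.
A = 0.5 * np.eye(n_lev) + 0.25 * np.eye(n_lev, k=1) + 0.25 * np.eye(n_lev, k=-1)

x_a = np.linspace(5.0, 40.0, n_lev)        # time-invariant a priori profile
true_trend = np.linspace(-1.0, 2.0, n_lev)
x = x_a + np.outer(t, true_trend) + rng.normal(scale=0.5, size=(n_time, n_lev))

def ols_trend(series):
    """Least-squares slope of each column of `series` against time t."""
    tc = t - t.mean()
    return tc @ (series - series.mean(axis=0)) / (tc @ tc)

# Route 1 (avk -> trend): smooth every profile, then fit the trends.
x_smoothed = x_a + (x - x_a) @ A.T
trend_of_smoothed = ols_trend(x_smoothed)

# Route 2 (trend -> avk): fit trends at full resolution, then smooth the trend profile.
smoothed_trend = A @ ols_trend(x)

print("max difference:", np.max(np.abs(trend_of_smoothed - smoothed_trend)))
```

Under these assumptions the two routes agree to rounding error. The equality concerns the point estimates only; per-level uncertainties depend on the covariance between levels, which this sketch does not address.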
Figure 7: not sure what the difference between these three panels is. Are you just assuming three different trend profiles? What is the difference between the left panel and the middle panel? Please explain.
Figure 10: Please put a label / title on each of the three panels. Top panel is 1.6 to 1 hPa, middle panel 25 to 16 hPa, bottom panel is total column.
Line 438: replace "the total column" by "some ozone column metrics"?
Line 442: "modelled climate" instead of "climate"
Line 443: "large" How large? Give numbers. Overall, the changes in uncertainty / significance don't seem to be very large for SBUV (maybe 0.1 or 0.2 % per decade for trend uncertainty according to Fig. 9, a few years according to Fig. 10). They should be smaller to negligible for the LIMB satellites which have much better altitude resolution. Also in lines 6 to 8 in the abstract, it would be good to give some numbers.
Line 450, and also discussion of Fig. 7: You might want to refer to Fig. 3-10 of WMO 2022, which shows the latitude altitude distribution of ozone trends from various satellite records. SBUV-MOD is shown in the top left panel of that Figure. You can very clearly see that the peak of upper stratospheric trends is shifted downwards to about 10 hPa in the SBUV-MOD record, and that SBUV-MOD trends are reduced in the 2 to 3 hPa region.
Line 457: "which has been large in recent years". I would say "which can be large". Compared to Pinatubo in 1991, or El-Chichon in 1982/83, most recent volcanic eruptions, even Hunga Tonga, have only had a small influence on stratospheric ozone.
Citation: https://doi.org/10.5194/egusphere-2024-2627-RC1
RC2: 'some major revisions needed', Anonymous Referee #2, 16 Sep 2024
Summary:
This paper reports on the effect of averaging kernels from nadir-type
merged ozone timeseries, here SBUV Mod, on the detectability of
long-term ozone trends. Using CCM model data, the uncertainty in trend
detection due to internal variability as a function of the record
length was investigated. As stated in this study, the uncertainties in
the representativeness of modeled internal variability for
observations is rather large so that the results from this study
should be considered with some caution.
The main issue with the averaging kernels from nadir observations is
that they are quite broad (low vertical resolution) and, in the lower
stratosphere, asymmetric, so that trends at a given altitude have
contributions, in some cases major, from other altitudes.
Overall the paper is well written. My major concern is that neither
the abstract nor the introduction clearly state what the scope of this
study is and that the results are limited only to a particular subset
of satellite ozone profile measurements. One should not use the term
retrieval as a synonym for averaging kernels (AK). AKs are used in the
retrievals but they are inherent to nadir observation geometry in the
UV. So regardless of the algorithm used, all AKs are similar for the
same observation geometry. Also a clear distinction should be made
early on that this study focuses on nadir-derived ozone timeseries and
that other datasets based upon limb observations have narrower AKs and
are not investigated here.
In many figures panels are missing titles. Adding titles would make the figures
more readable even without checking the figure caption. This applies to
Figs. 1, 3, 4, 5, 6, and 10.
Specifics:
Paper title: The paper title is quite unspecific. I also think that it
is not clear what is meant with "forced" here. I strongly suggest to
add "nadir-derived", like: "Detectability of trends in nadir-derived
stratospheric ozone".
l.2: In this paper the focus is on the impact of averaging kernels,
which are inherent to the observation type rather than the retrieval
algorithm (see comments above). Retrieval algorithms can have
different settings that may also impact trends (e.g. uncertainties due
to temperature dependence of ozone cross-sections). Do not use
retrieval algorithm as a synonym for averaging kernels (check also the
entire paper).
l.4: mention here the use of SBUV MOD and that the data is based upon
satellite nadir measurements.
l.22: "to decrease lower stratospheric ozone abundances". This is
limited to the tropics, a corresponding increase in extratropical
ozone is expected.
l.42: "disagreement". Probably better to say that "models,
observations, and trends are highly uncertain"
l.42: see also WMO (2022, p. 165) which recommends to be very cautious
with trends derived from reanalyses.
l.45: A bit more explanation on the role of internal variability is
needed here (see for instance, doi:10.1038/s41612-023-00389-0). Does
the determination of internal variability not require an ensemble of
model runs rather than using a single (or two) model runs as done in this
paper? Please discuss.
l.46: the first sentence in this paragraph is misleading as it
suggests that all these aspects are considered in this study (see
earlier comments).
l. 108: add some references to the SBUV MOD data, e.g.
doi:10.5194/acp-17-14695-2017.
l. 133: one should mention here that all LOTUS datasets except for
SBUV are derived from limb observations with narrower and less
asymmetric averaging kernels (see earlier comment).
l.150: What is the vertical sampling of ESM4.1? State it here or
earlier when describing model data. Also it may be helpful to provide
some typical numbers for the vertical resolution of nadir-type ozone profiles
(see, for instance, 10.5194/amt-14-6057-2021), here or before when
describing SBUV MOD.
l. 169: this has nothing to do with retrieval quality as this is an inherent
physical property from the retrieval using nadir observations (see comments
above). Maybe "limits" is better.
l. 227 (Eq.) earlier you mentioned that the LOTUS regression on the six
merged datasets includes AO/AAO, and/or NAO. Not important here?
l. 253: replace "retrieval" by "data".
l. 263: Regarding ozone, the following paper is also relevant here:
doi:10.1029/98JD00995.
l.275: The distribution with s and k should be shown in Fig. 4.
l.345: AK-adjusted is better than "retrieval-adjusted"
l.356: errors cannot be negative, but trends can be. The standard deviation
is not unexpected to peak near the ozone peak.
l.381: "... this analysis shows why trends should be analyzed as
vertical profiles rather than at individual vertical levels." I do not
understand what is meant here. Do you mean that trends should not be
evaluated at a single altitude level, but for all levels? Does this
make sense? Please clarify.
l.410: Of the Limb datasets only two use MLS (GOZCARDS, SWOOSH),
rather say that the LOTUS mean trends are heavily weighted by trends
from the higher vertically resolved merged limb datasets.
Figure 9: Probably legend is wrong (no shading for "undetectable").
It seems that the "undetectable" and "undetectable according
to SBUV" are very similar and differ only by a few tenths of a
percent/dec for most altitudes. I think this should be mentioned.
l.435: Regarding Antarctic column ozone recovery, add some references here.
l. 444: emphasize in item 1, that this is only true for nadir-derived
ozone profiles.
l. 457: "large in recent years". Are you referring to Hunga-Tonga?
Please specify the large events.
l. 462: add some references dealing with size and timing of the ozone hole.
Fig. 4: Both pre-industrial and 500-year runs are labeled ESM4 in the
panels. Use different abbreviations for each run.
Figure 5a: More common unit for profiles is DU/km (equivalent to
number density). Clarify here.
Fig. 6: add the corresponding SBUV ozone profile to the right panel.
Fig. 7. What are the different conditions between the three panels?
different zonal bands like in Fig. 8?, but averaging kernels from 42.5
degs only used for the synthetic data? Note that averaging kernels are
solar zenith angle (SZA) dependent and SZAs of SBUV measurements are
different in the tropics and higher latitudes.
Citation: https://doi.org/10.5194/egusphere-2024-2627-RC2
RC3: 'Comment on egusphere-2024-2627', Anonymous Referee #3, 24 Sep 2024
The paper describes analysis using synthetic data from a global model to evaluate the impact that broad averaging kernels have on deriving vertically resolved ozone trends from satellite observations. The topic is important because of society’s need to determine the timing of ozone recovery as ODSs decline and to assess the ozone response to greenhouse gases. The paper is focused on one specific aspect of the problem but this aspect is generally relevant because many trend estimates rely on SBUV ozone profile timeseries stitched together from multiple platforms. The results will be of interest to those who compile and/or interpret trend estimates and to everyone who wants to know the limitations of published trends.
The paper is clearly motivated and well written. The authors thoroughly describe the caveats to their analysis. I recommend some revisions before final publication, as itemized below.
General comments:
- The units for the vertically resolved ozone from the model data (DU/layer) make it difficult to compare the simulated ozone with measurements or other models without knowledge of the model vertical grid. Even with knowledge of this, calculations would be needed. Can you show the ozone for these plots in the more conventional units such as ppm or number density? This applies to Figure 4 and Figure 5a.
- It was not clear what the advantage is of interpolating the model profiles to the SBUV grid before applying the averaging kernels (lines 188-189). Doesn’t this already remove some of the information about vertical structure that you are trying to identify in your study?
- Section 3.2.1 is hard to follow. Variables are defined (b, y, etc.) but the equation is not given. Since the final paragraph of this subsection appears to be key to the results that follow, it is important that it be clear. For example, do you compare the two emergence estimates y or y*? I could not tell which was identified as the time of emergence in Figure 5b-5d.
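The two-step operation questioned in the second general comment (interpolate to a coarser grid, then apply averaging kernels) can be sketched as follows. This is an illustration under stated assumptions, not the manuscript's processing chain: the grids, the profiles, and the Gaussian kernel rows with a 10 km full width at half maximum are all hypothetical, and real SBUV kernels are not Gaussian and, as Referee #2 notes, are asymmetric in the lower stratosphere. The smoothing step uses the usual relation x_smoothed = x_a + A (x - x_a).

```python
import numpy as np

# All grids, profiles, and kernel widths below are hypothetical.
z_fine = np.arange(10.0, 51.0, 1.0)      # km, stand-in for model levels
z_coarse = np.arange(10.0, 51.0, 4.0)    # km, stand-in for the retrieval grid

# A priori profile and a "true" profile with a narrow bump centred at 30 km.
prior_fine = 8.0 * np.exp(-0.5 * ((z_fine - 25.0) / 8.0) ** 2)
truth_fine = prior_fine + 2.0 * np.exp(-0.5 * ((z_fine - 30.0) / 2.0) ** 2)

# Step 1: interpolate both profiles onto the coarse grid.
prior = np.interp(z_coarse, z_fine, prior_fine)
truth = np.interp(z_coarse, z_fine, truth_fine)

# Step 2: build Gaussian kernel rows and normalise each row to sum to one.
fwhm_km = 10.0
sigma = fwhm_km / (2.0 * np.sqrt(2.0 * np.log(2.0)))
A = np.exp(-0.5 * ((z_coarse[:, None] - z_coarse[None, :]) / sigma) ** 2)
A /= A.sum(axis=1, keepdims=True)

# Step 3: apply the kernels to the departure from the a priori.
smoothed = prior + A @ (truth - prior)

truth_pert = truth - prior
smoothed_pert = smoothed - prior
for z, p, s in zip(z_coarse, truth_pert, smoothed_pert):
    print(f"{z:5.1f} km  truth {p:6.3f}  smoothed {s:6.3f}")
```

In this toy setup the bump is weakened at its centre and spread to neighbouring levels by step 3, while step 1 has already sampled a 2 km wide feature on a 4 km grid, which is the kind of information loss the comment asks about.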
Minor comments:
- (line 98) By “optimistic” do you mean too low?
- (line 287) The reminder that one should not over-interpret crossing an arbitrary threshold is appropriate; I’m glad to see it mentioned.
- (line 320) Maybe I missed it but I think this is the first mention that the time to emergence depends on the magnitude of the trend. This is intuitively known but perhaps should be included in the introduction as one of the factors limiting trend detection.
- At line 381, you state “Altogether, this analysis shows why trends should be analyzed as vertical profiles rather than at individual vertical levels.” This is a good summary of the results shown in Figure 7 but is not quantitative. How do you decide whether that criterion has been met? For example, in Figure 8a the time to emergence is detectable over part of the profile but not all of it. However the text indicates that the trend in the upper levels is identified as detectable.
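The dependence of detection time on trend magnitude raised in the minor comment on line 320 can be illustrated with a deliberately simple criterion. The sketch assumes annual data, white noise with a known standard deviation, an ordinary least-squares slope, and detection when the trend exceeds `z` standard errors; the emergence definition in Section 3.2.1 of the manuscript may well differ. The function name `years_to_detect` and all numbers are hypothetical.

```python
import math

def years_to_detect(trend_per_year, noise_std, z=2.0, max_years=500):
    """Smallest record length n (years) with |trend| >= z * OLS standard error.

    Assumes n equally spaced annual values and white noise of known
    standard deviation, so the standard error is noise_std / sqrt(Sxx).
    """
    for n in range(3, max_years + 1):
        sxx = n * (n * n - 1) / 12.0       # sum of squared centred time values
        if abs(trend_per_year) >= z * noise_std / math.sqrt(sxx):
            return n
    return None                            # not detectable within max_years

# Hypothetical trends (percent per year) against 2 percent interannual noise.
for trend in (0.05, 0.1, 0.2, 0.4):
    print(f"trend {trend:4.2f} %/yr -> {years_to_detect(trend, noise_std=2.0)} years")
```

Because Sxx grows roughly as n cubed, the required record length under these assumptions scales approximately with (noise / trend) to the power 2/3, so halving the trend lengthens the record needed by a factor of about 1.6 rather than 2. Autocorrelated noise would lengthen these times further.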
Citation: https://doi.org/10.5194/egusphere-2024-2627-RC3