Quality Assessment of the AC SAF GOME-2 gridded ozone profile data records
Abstract. One of the objectives of the Atmospheric Composition SAF is to produce satellite-derived monthly mean data records that are valuable for operational, scientific, and other applications. One of these data records is the gridded global GOME-2A/B/C ozone profile data set presented in this paper. This data record covers the period 2007–2024 and consists of ozone partial columns on a 0.25 degree × 0.25 degree grid with 40 vertical layers, with the associated (averaged) averaging kernel (AK) and the a priori needed to use the data in other applications. This paper presents the GOME-2 instrument, the (level-2) ozone profile retrieval method and the subsequent gridding procedure used to generate the level-3 product. We discuss the methodology for averaging AKs in latitude bands and demonstrate that the principal structural features are preserved. We provide a description of the balloon sounding, lidar, FTIR and microwave radiometers validation data and methods, and then perform a quality assessment of the gridded ozone profile product through comparison with these independent data sources for the tropics, mid-latitude and polar latitude bands and in three vertical regions: the Troposphere, and the Lower and Upper Stratosphere. Detailed analyses of absolute and relative differences are provided for each region and height range. The results demonstrate a high level of consistency across the three GOME-2 instruments (with GOME-2A used only up to 2018), underscoring the reliability of the GOME-2 constellation for long-term ozone monitoring. In the troposphere, all three sensors tend to slightly overestimate ozone, with absolute differences (AD) of roughly +1 to +3 DU in mid-latitudes and up to +7.5 DU in the tropics for Metop-C. In lower stratosphere, all sensors show a small negative bias, typically between −3 and −7 DU (RD ≈ −3 % to −7 %), corresponding to a modest underestimation of ozone concentrations. 
In the upper stratosphere, biases are minimal across all sensors, with absolute difference values, close to zero (−0.1 to −0.4 DU) and extremely low variability (STDEV ≈ 0.1–0.2 DU). These findings underscore the reliability of the GOME-2 constellation for long-term ozone analyses and the potential for merged multi-sensor time series without significant inter-calibration artifacts suitable for climate and atmospheric research.
General comments:
Tuinder et al. provide a first full quality assessment of the GOME-2A/B/C Level-3 ozone profile datasets developed within EUMETSAT’s Atmospheric Composition Satellite Application Facility, making use of ozonesonde, lidar, FTIR, and microwave radiometer observations as reference data. This work fits the scope of Atmospheric Measurement Techniques, but requires clarifications in terms of terminology, methodology, and reference data selection before it can be considered for publication. Moreover, in several places the work is not positioned within the existing literature.
Specific comments:
Lines 4-5: Please add that the gridded uncertainty or covariance matrix is also stored, if this is the case. The user community would at least expect it.
References are missing for several statements made in the first two paragraphs of the introduction, and for the Metop satellites and GOME-2 instruments in lines 41-44.
The importance of tropospheric ozone (monitoring) is only mentioned in line 40, almost as a side note. The authors could better motivate their tropospheric ozone column assessment.
If the authors aim at demonstrating the importance of the GOME-2 instrument for atmospheric ozone monitoring in lines 49-62, it seems remarkable that RAL’s GOME-2 ozone profile retrieval within ESA’s Ozone CCI is missing.
In Section 2, references seem to be missing for the offset on line 89, Optimal Estimation on line 92, an assimilated ozone field on line 108 (different from the climatology, which is referred to earlier), and the remarks in Table 1.
Section 2.3 on averaging kernels is too vague in terms of terminology and methodology to be clear. The authors frequently mix (also throughout the text) “averaging kernels” with the “averaging kernel matrix” that combines all retrieval-layer-specific averaging kernels (as the AK matrix rows, uncommonly called “weight(s) curves” in Fig. 1). The authors should be more specific in lines 119-121: What does “the averaging kernel [matrix] cannot be inverted” mean, as one could in theory easily use a pseudo-inverse? In addition, why can a reference data set with a high vertical resolution easily be compared to a satellite data set, and not the other way around? The averaging kernel smoothing approach that is applied later in the text should be explained with reference to Eq. (1). The authors should also state what happens to the lower-resolution FTIR data, which are the result of a sort of retrieval themselves and hence come with averaging kernels as well. Is AK smoothing then applied to the satellite data only or bi-directionally, as suggested in the literature?
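To illustrate the level of explicitness requested here, the following minimal sketch assumes the commonly used smoothing form x_s = x_a + A (x_ref − x_a), with the reference profile already regridded to the retrieval layers. Whether this matches the manuscript's Eq. (1) is for the authors to confirm. The function name and the 3-layer numbers are hypothetical and purely illustrative.

```python
import numpy as np

def smooth_with_ak(x_ref, x_apriori, ak_matrix):
    """Smooth a reference profile with an AK matrix: x_a + A (x_ref - x_a).

    Assumes x_ref is already on the retrieval layers (hypothetical helper).
    """
    x_ref = np.asarray(x_ref, dtype=float)
    x_apriori = np.asarray(x_apriori, dtype=float)
    ak_matrix = np.asarray(ak_matrix, dtype=float)
    return x_apriori + ak_matrix @ (x_ref - x_apriori)

# Toy 3-layer example; partial columns in DU, values are made up.
ak = np.array([[0.5, 0.3, 0.0],
               [0.2, 0.6, 0.2],
               [0.0, 0.3, 0.5]])
x_a = np.array([10.0, 20.0, 30.0])
x_ref = np.array([12.0, 18.0, 33.0])

x_smoothed = smooth_with_ak(x_ref, x_a, ak)

# A Moore-Penrose pseudo-inverse can always be computed numerically,
# which is why the statement "cannot be inverted" needs qualification.
ak_pinv = np.linalg.pinv(ak)
```

The sketch only shows that the pseudo-inverse exists numerically. Whether applying it to satellite profiles is meaningful, given the limited information content, is the point the authors should discuss.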
Lines 128-129: Do you mean that each retrieved pixel is divided into 10 by 10 km sub-pixels whose centres need to be within a grid cell to be included? This is not fully clear.
Line 147 refers to the retrieval (co)variance (matrix?) and error (profile?), while these are not introduced in Section 2.
Lines 185-186 and Figure 2: Only the AK matrix dispersion (in terms of interpercentiles) at a specific location is shown, while upon constructing averaged averaging kernels for entire latitude bands, one would rather be interested in the AK matrix dispersion within each entire band. Showing this for all 18 bands might be too much for the main text, but should be considered as a supplement, especially given the known non-triviality of applying mean averaging kernels, as discussed in the work of von Clarmann and Glatthor (https://doi.org/10.5194/amt-12-5155-2019), which the authors should introduce in their analysis.
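A band-wide dispersion diagnostic of the kind suggested above could be computed along the following lines. This is only a sketch: it assumes all Level-2 AK matrices of one latitude band are stacked into an array of shape (n_retrievals, n_layers, n_layers), and the 16th to 84th percentile range is an arbitrary choice, not necessarily the interpercentile range used in Fig. 2. All names are hypothetical.

```python
import numpy as np

def ak_band_statistics(ak_stack, lower=16.0, upper=84.0):
    """Element-wise mean and interpercentile spread of a stack of AK matrices.

    ak_stack: array of shape (n_retrievals, n_layers, n_layers).
    Returns (mean_ak, spread), each of shape (n_layers, n_layers).
    """
    ak_stack = np.asarray(ak_stack, dtype=float)
    mean_ak = ak_stack.mean(axis=0)
    p_lo, p_hi = np.percentile(ak_stack, [lower, upper], axis=0)
    return mean_ak, p_hi - p_lo

# Synthetic example: 200 noisy copies of a made-up 4-layer AK matrix.
rng = np.random.default_rng(0)
base_ak = 0.5 * np.eye(4) + 0.1
stack = base_ak + 0.05 * rng.standard_normal((200, 4, 4))

mean_ak, spread = ak_band_statistics(stack)
```

Plotting `spread` next to `mean_ak` for each band would show where the mean AK matrix is least representative of the individual retrievals.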
Fig. 3: Plot titles, legends, and axis labels should be made briefer and clearer. The plotting colours do not seem to correspond to what is indicated in the legend, assuming that M01 to M03 correspond to Metop-A to Metop-C, respectively. It should be clarified in the text how the retrieval degrees of freedom are obtained and why two modes appear in the DFS of GOME-2A before 2013.
Given the coarse (effective) resolution of FTIR observations, one would expect Table 3 to include FTIR observation specifics.
Section 4.1: Only ~20 ozonesonde sites are used for validation, while the EVDC (also hosted at NILU, including WOUDC and NADIR data) covers ~60 ozonesonde stations providing data in the period under study. Moreover, within the TOAR-II HEGIFTOM working group, the data of ~40 ozonesonde stations have been homogenised for long-term analyses. The authors should hence justify or correct their (limited) ozonesonde data selection.
Lines 268-269: The authors’ phrasing is too brief to be clear here. What does it mean to apply Eq. 1 if that does not contain a reference profile? The term “(AK) retrieved ground-based” data, used here and throughout the text, is very confusing. Some ground-based reference data are retrievals in their own right. What the authors refer to are “ground-based data corrected for vertical smoothing difference errors”, which is commonly abbreviated as “smoothed ground-based data”, not “retrieved”.
Sections 5 and 6: It is unclear why the authors leave a gap in the vertical extent of their analysis, as seen from the unmatched altitude ranges in the second and third columns of Table 4. Moreover, Sections 5 and 6 no longer refer to FTIR data. Where are they in the analysis? Is this only as part of Table 5, based on all reference stations shown in Fig. 4? The authors should provide an indication of the consistency of the validation results between the different reference data sources. Finally, the authors should discuss their findings with respect to existing GOME-2 ozone profile retrieval products and uncertainties, both their own Level-2 input and third-party Level-2/3 data.
Fig. 5: The plot layout in Figs. 6 and 7 is much clearer than in Fig. 5, which seems to have pairwise identical plots in the first column and lacks the L3 a priori profiles (green lines).
Fig. 8: Why is the tropospheric column not included? Please provide an indication of the corresponding relative differences. Given that L3 data are considered, global maps of differences over the entire time series could be very insightful too.
Data references: The data availability statement should also include the ground-based reference data. FTIR data are not mentioned in the author contributions. Only the NDACC network is acknowledged, while other network sources appear in the text.
Technical corrections:
Throughout the text, acronyms should be spelled out upon first usage, e.g., SAF, GOME, FTIR, DU, and STDEV in the abstract.
Abstract: Lines 13 and 18 repeat the same statement.
Line 15: “In the lower…”
Line 36: “ter-molecular interactions” looks like a typo.
Adding references to the subsequent manuscript sections in the last paragraph of the introduction would improve readability.
Line 87: “ozone partial column density values (in DU)” contains a contradiction between the quantity and the units.
Use appropriate quantities and symbols in Eqs. 2, 6, 7, 8, 9, 10, 11, and 13.
Does “seen” in Table 2 refer to the number of sub-pixels being included?
Where only layer numbers are mentioned, like in line 174, please add the corresponding altitude or pressure. The coloured legend in the central panel of Fig. 1 is insufficiently clear to leave it to that.
Quantify the “relatively small differences” of line 182.
Caption of Fig. 1: Does “at 45° N” mean for the band going from 40° to 50° N?
It would be helpful to add the diagonals in the colour plots of Figs. 1 and 2 to guide the eye.
Line 223: Wrong usage of tildes.
Line 240: “profile” instead of “spectrum”?
Lines 254-255: Do you mean comparing mean reference data with the Level-3 grid cell that overlaps with the station location?
Captions of Figs. 6 and 7: Explain the meaning of r in the second column.
Line 357: In “use of merged Metop GOME-2 datasets” it is unclear whether these are already available to the community.
The references do not appear in alphabetical order.