This work is distributed under the Creative Commons Attribution 4.0 License.
Ozone stratospheric trends from regional Bayesian composite of ground-based partial columns
Abstract. Large uncertainties and variability prevent the detection of statistically significant ozone trends from individual ground-based instruments in the lower stratosphere. Existing merging studies typically operate on satellite-based data records, by latitude band. This study derives correlation-based regional composites of ground-based time series to reduce trend uncertainties.
We address fundamental heterogeneities resulting from grouping individually homogenized ground-based datasets to enable robust merging. Uneven temporal and vertical resolutions of five ozone measurement techniques (ozonesondes, FTIR, Dobson Umkehr, lidar and microwave radiometers) are handled by integrating monthly mean ozone profiles into two sets of four independent partial columns. Spatial heterogeneity is resolved by defining coherent regions using the Copernicus Atmosphere Monitoring Service (CAMS) reanalysis. Regional time series are merged by the BAyeSian Integrated and Consolidated (BASIC) algorithm, adapted to account for propagated measurement uncertainties and for the agreement between individual time series via Principal Component Analysis (PCA). Trends for the 2000–2024 period are then estimated by multiple linear regression using the LOTUS model.
We compare BASIC with a conventional weighted mean. While the weighted mean fails to capture variability during periods of low instrument consensus, BASIC produces a more representative time series by robustly handling outliers. Accordingly, BASIC reduces average uncertainties of the trend estimates by 15.3 % relative to the weighted-mean approach. Our results confirm robust positive trends in the upper stratosphere and show predominantly negative significant regional trends in the middle and lower stratosphere. This study establishes a consolidated, global ground-based reference to be used for comparison with global satellite-based ozone trends.
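For readers unfamiliar with the weighted-mean baseline the abstract compares against, a minimal sketch of an inverse-variance weighted composite of overlapping instrument records is given below. This is a generic illustration, not the authors' implementation; the function name and the NaN-for-missing-data convention are assumptions.

```python
import numpy as np

def weighted_mean_composite(series, sigma):
    """Inverse-variance weighted mean across instruments, per month.

    series : (n_instruments, n_months) array, NaN where no data
    sigma  : matching array of 1-sigma measurement uncertainties
    Returns the composite time series and its propagated 1-sigma uncertainty.
    """
    w = np.where(np.isnan(series), 0.0, 1.0 / sigma**2)   # zero weight for gaps
    x = np.where(np.isnan(series), 0.0, series)
    wsum = w.sum(axis=0)
    safe = np.where(wsum > 0, wsum, np.nan)               # NaN when no instrument reports
    return (w * x).sum(axis=0) / safe, 1.0 / np.sqrt(safe)
```

Such a composite follows whichever instruments happen to report in a given month, which is why, as stated above, it can misrepresent variability during periods of low instrument consensus.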
Competing interests: At least one of the (co-)authors is a member of the editorial board of Atmospheric Chemistry and Physics.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-113', Anonymous Referee #1, 06 Apr 2026
- RC2: 'Comment on egusphere-2026-113', Anonymous Referee #2, 07 Apr 2026
Review of Mirallie et al., "Ozone stratospheric trends from regional Bayesian composite of ground-based partial columns"
The submitted manuscript evaluates 2000-2024 trends in ozone at several defined layers of the atmosphere, from the troposphere to the upper stratosphere, using data from long-running ground-based measurement stations. Merged composites are formed from the records of different stations using a Bayesian method, and trends are then calculated for different heights and regions using multiple linear regression, closely following the LOTUS model. Finally, the derived trends are compared with those derived in other recent work, and the differences are discussed.
Overall this is a very thorough and clearly explained analysis, and in my opinion very suitable for publication in ACP once the minor comments below have been addressed.
General comments
My main suggestion is that the authors should spend a little more time on the motivation for what they're doing. The Bayesian merger of multiple instrument types and locations together is a much more abstract construction than, for example, a single station record, a zonal mean, or a trend estimated from one type of instrument only. In many places the data sets show significant divergences from one another in various ways (e.g. different magnitudes of annual cycles or extremes, varying biases between instruments), so the question I am interested in is what exactly is represented by this composite? (That is, not how you constructed it, but how the reader should think about it and what it actually represents.)
Secondly, it seems like an assumption to me that short-term correlations between sites would necessarily imply similar causes and values of trends, most particularly in the case shown in Figure 5 when the northern hemisphere mid-latitudes and southern hemisphere subtropics are well correlated (but not the higher latitude southern midlatitudes). It seems unlikely to me that the trends in these regions would have influences sufficiently in common to make forming a composite physically meaningful. This point should also be justified.
I very much appreciated the clear explanations of the methodology and the comparisons to other methods, for example as shown in Figures 6 and 10. While some reviewers would possibly object to the inclusion of this material, I think it is very helpful to readers at the present time while these techniques are still fairly new in the community. It was very good that you included more conventional weighted-mean results for comparison in many places.
The decision to only use data from HEGIFTOM ozonesondes is slightly surprising to me. Looking at the project map (https://hegiftom.meteo.be/datasets/ozonesondes), the geographic coverage of your study could be very usefully extended if you included the other ozonesonde sites (e.g. East Asia, the Southern Hemisphere). While I can see that there is a motivation to take advantage of the presumably greater degree of homogenization that has been carried out in HEGIFTOM, in your analysis there are points where there appear to be unresolved inhomogeneities in the different ozonesonde records anyway (such as at lines 451-456 and 579-582), so it seems HEGIFTOM can only provide a partial solution.
Specific comments
Lines 50-65 The abstract seems too strident in its statements. Is it really true that no ground-based station anywhere in the world shows a statistically significant ozone trend in the lower stratosphere?
Lines 64-66 This statement doesn't seem supported by figures 21 and 22. The negative trends are only in very specific regions and most of them are not significant. The one very clear result seems to be the negative trend in the tropical middle stratosphere based mainly on ozonesondes. The generally negative trends seem hard to reconcile with Figure 3 (b) of Sofieva et al. 2025 but I see that in your conclusions you say you will be looking at total ozone in further work.
Lines 78-79 If co-located instruments can show significantly different trends, doesn't this call into question the validity of combining them in any way at all? Shouldn't these differences be accounted for first?
Line 80 "large discrepancies" in what?
Lines 89-91 If it is "critical" to use the different co-ordinate systems of Millan et al. as you say, then shouldn't you also be using co-ordinates relative to the tropopause and the jets?
Lines 96-97 The way the sentence is written implies these specific satellite instruments had a particularly limited lifetime compared to other instruments not listed – is that what you meant?
Lines 97-99 You say the resolution of some of them is "sufficient" – do you mean that for the others it is not sufficient?
Lines 103-107 I would like you to be clearer here about what exactly you're hoping to achieve. Is the motivation to determine representative trends over an area rather than single points? Is it to remove the influence of very short scale spatial variability? Is it to hope instrumental drifts cancel out in a composite?
Lines 124-128 You should state here that for ozonesondes, though, you use HEGIFTOM rather than NDACC, and that the two networks only partly overlap.
Lines 132-133 This statement is too strong. The homogenization attempts to account for and correct the jumps and drifts but in reality is unlikely to completely accomplish this. (If it did it would make your life easier!) Your assumption that the differences are now due to random errors and spatial offsets is a reasonable methodological approach but you can't possibly be sure that this is so.
Lines 148-149 I don't see that forming monthly means "resolves" this problem. For ozonesondes there is the problem of very limited sampling rates, but for Lidars and FTIR it's more the need for clear skies which introduces a bias. I think some more discussion would be beneficial here.
Line 150 Personally I find the diagram to be very clear and helpful.
Line 160 I can't make much sense of this sentence unfortunately. Do you mean that the proxies used in ozone trend analyses are usually representative of large areas and low frequency variability?
Line 165 A very minor point, but if it's "standard" and "alternative" why are they called "o" and "a" and not "s" and "a"?
Line 170 You don't specifically explain that you're trying to allow for the decreasing height of the tropopause with increasing latitude by "stepping" the boundary. This should be added to line 168. Even so, you would only get this right in a climatological sense, not for every single day of the record, does that matter?
Line 207-209 Does it matter that two of the locations are not using the latest version? Wouldn’t this also be an inhomogeneity?
Line 232 – The need for "quasi cloud free conditions" must introduce a significant bias, at least in the lower stratosphere and the UTLS?
Line 269 "should resume by March 2026" – hopefully by now it has restarted?
Lines 180-270 One thing missing from all the instrument discussions though is a statement about the possibility of long-term drifts. There are a large number of instrument specialists among the co-authors who could comment for the different instrument types.
Lines 288-315 This is all R and not R-squared, right? I am surprised you don’t see any negative correlations?
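The distinction the reviewer raises here matters because R, unlike R-squared, keeps the sign of the relationship, so anti-correlated records would show up as negative values. A generic sketch (not the manuscript's code; the function name and missing-data convention are assumptions) of a signed Pearson R over the overlapping months of two records:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation R (not R-squared) between two monthly series,
    ignoring months where either record is missing (NaN)."""
    ok = ~np.isnan(x) & ~np.isnan(y)
    xa = x[ok] - x[ok].mean()
    ya = y[ok] - y[ok].mean()
    return (xa @ ya) / np.sqrt((xa @ xa) * (ya @ ya))
```

Squaring this value discards the sign, which is why a map of R-squared could never reveal anti-correlated regions even if they existed.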
Figure 5 – As earlier, I am intrigued by the fact that there is a high correlation between Zugspitze and the southern hemisphere subtropics. Could you please comment on this? To me this suggests a weakness of the correlation method, because I don't think the short-term variability and long-term variability would be caused by the same processes across such distinct latitude bands.
Lines 320-326 If I understand correctly, the method assumes that there is a constant offset caused either by distance on the ground or two different instrument types, but do you have any reason to think it is a constant value over time?
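One simple diagnostic for the question raised here — whether an inter-instrument offset is really constant in time — is to estimate the offset in successive windows and look for a drift. The sketch below is a generic illustration, not the authors' method; the function name and 60-month window are assumptions.

```python
import numpy as np

def offset_drift(a, b, window=60):
    """Median pairwise difference (a - b) in consecutive windows of
    `window` months. A systematic change in these medians over time
    would indicate that the offset between the two records is not
    constant, violating the constant-offset assumption."""
    d = a - b
    meds = []
    for i in range(0, len(d) - window + 1, window):
        chunk = d[i:i + window]
        chunk = chunk[~np.isnan(chunk)]        # drop months missing in either record
        meds.append(np.median(chunk) if chunk.size else np.nan)
    return np.array(meds)
```

For two records differing only by a fixed bias, every windowed median equals that bias; a trend in the medians would flag a time-varying offset such as an instrumental drift.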
Lines 342-345 This is an interesting point, because if the outlier is completely real but caused by spatial variability, is it right to exclude it from the mean of the area? Later I notice you use the term "regional consensus".
Line 346 Is this method actually in Rodgers?
Lines 360-381 This is an extremely helpful discussion and figure, and I commend you for including it.
Lines 418-427 You should give a specific source for these datasets for reproducibility.
Lines 421-427 I found these third and fourth bullet points difficult to follow. You don't motivate why you changed from ENSO in LOTUS to the NAO. Could you have used both? Then for Lauder you do use ENSO even though its latitude is comparable to some of the NH areas. Then you say that you used it "with" GLOSSAC sAOD but don't explain what you did here at all. The implication is you don't have an aerosol proxy for the northern hemisphere? There have been significant eruptions in the 2000-2024 time period so this seems odd.
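For context on the regression setup being discussed: a LOTUS-style fit is ordinary multiple linear regression of monthly ozone anomalies on a linear trend term plus proxy time series (e.g. solar flux, QBO, ENSO or NAO, stratospheric AOD). The sketch below is a minimal generic version, not the LOTUS code itself; the function name and inputs are assumptions.

```python
import numpy as np

def lotus_style_trend(y, t, proxies):
    """Ordinary least-squares fit of a linear trend plus proxy terms.

    y       : monthly ozone anomalies, shape (n,)
    t       : time in decades, shape (n,)
    proxies : (n, k) matrix of normalized proxy time series
    Returns the trend (per decade) and its 1-sigma standard error.
    """
    X = np.column_stack([np.ones_like(t), t, proxies])   # intercept, trend, proxies
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])
```

Swapping ENSO for NAO, or adding a GloSSAC sAOD term, simply changes the columns of `proxies`; this is why the reviewer asks whether both indices, and an aerosol proxy for the Northern Hemisphere, could have been included.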
Lines 446-450 (and Figure 7) Some of the stations clearly have much larger annual cycles than others and also display extremes of much greater amplitude. Therefore, I wonder whether it is really valid to merge them into a 'consensus' time series and expect the trends to be physically meaningful.
Lines 451-459 The fact that the Legionowo "(and other)" ozonesondes appear first higher than the other instruments and later lower seems concerning to me, when the main goal is looking for trends. Does this call into question the validity of combining data from different instrument types? Could you comment on this please.
Line 476-477 You say BASIC can "retrieve the common geophysical signal" – this implies that there exists a "common geophysical signal" and the rest of the variability is noise? Can you expand on this please?
Lines 515-519 It's very helpful that you've plotted the BASIC results with the weighted mean and the individual stations.
Lines 545-546 The one sigma is in the darker blue, not lighter, it looks to me.
Lines 545-556 It seems a very big assumption to me that it is meaningful to compare trends in such widely diverse locations, even if the monthly correlation is reasonably high.
Lines 579-582 This is another example where ozonesonde seem to be differing from other instrument records. Should you exclude stations suffering from the "drop-off"?
Technical corrections
Lines 100-101 "data desert in Salawitch" doesn't make sense, please re-word
Lines 147-148 Please reword this sentence – "is not expected to contribute" to what?
Line 170 In the second table the pressure values don't match in the first column between aMS and UTLS.
Line 289 "Following Weatherhead" would be better wording than "According to Weatherhead".
Line 322 A better wording would be "following Ball et al ." rather than "following Ball's approach"
Line 499 "who" -> "which"
Line 563 For me, the expression "In a nutshell" is too informal for a scientific paper in a journal, I suggest something like "in summary"
Citation: https://doi.org/10.5194/egusphere-2026-113-RC2
I've prepared my review as a separate supplementary PDF document.