14 Apr 2023
Status: this preprint is open for discussion and under review for Atmospheric Chemistry and Physics (ACP).

Stratospheric ozone trends and attribution over 1984–2020 using ordinary and regularised multivariate regression models

Yajuan Li, Sandip S. Dhomse, Martyn P. Chipperfield, Wuhu Feng, Jianchun Bian, Yuan Xia, and Dong Guo

Abstract. Accurate quantification of long-term trends in stratospheric ozone can be challenging because the estimates are sensitive to natural variability, the quality of the observational datasets, non-linear changes in forcing processes, and the statistical methodology. Multivariate linear regression (MLR) is the most commonly used tool for ozone trend analysis; however, the complex coupling among atmospheric processes can make it prone to over-fitting or multi-collinearity issues when the conventional Ordinary Least Squares (OLS) setting is used. To overcome this issue, we adopt a regularised (Ridge) regression method to estimate ozone trends and quantify the influence of individual processes. Here, we use the Stratospheric Water and OzOne Satellite Homogenized (SWOOSH) merged data set (v2.7) to derive stratospheric ozone profile trends for the period 1984–2020. Besides SWOOSH, we also analyse a machine-learning-based, satellite-corrected, gap-free global stratospheric ozone profile dataset from a chemical transport model (ML-TOMCAT), and output from two chemical transport model (TOMCAT) simulations forced with the ECMWF reanalyses ERA-Interim and ERA5.

With Ridge regression, the stratospheric ozone profile trends from SWOOSH data show smaller declines during 1984–1997 than with OLS, with the largest differences in the lowermost stratosphere (> 4 % per decade at 100 hPa). Upper stratospheric ozone has increased since 1998, with a maximum (~2 % per decade near 2 hPa) in local winter at mid-latitudes. Negative trends with large uncertainties are observed in the lower stratosphere, and they are most pronounced in the tropics. The largest differences in post-1998 trend estimates between the OLS and Ridge regression methods appear in the tropical lower stratosphere (~7 % per decade difference at 100 hPa). Ozone variations associated with natural processes such as the quasi-biennial oscillation (QBO), solar variability, the El Niño–Southern Oscillation (ENSO), the Arctic Oscillation (AO) and the Antarctic Oscillation (AAO) also indicate that Ridge regression coefficients are somewhat smaller and less variable than the OLS-based estimates. Additionally, ML-TOMCAT-based trend estimates are consistent with those from the SWOOSH data set. Finally, we argue that the large differences between the satellite-based data and model simulations confirm that there are still large uncertainties in ozone trend estimates, especially in the lower stratosphere, and that caution is needed when discussing results if the explanatory variables used are correlated.
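
The contrast between OLS and Ridge regression described in the abstract can be illustrated with a minimal sketch on synthetic data. This is not the authors' code or configuration: the two proxy series, the noise levels, the 28-month period, and the regularisation strength `alpha = 1.0` are all assumptions chosen only to show how two nearly collinear explanatory variables affect the fitted coefficients. The helper names `solve`, `fit` and `centre` are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(X, y, alpha=0.0):
    """Minimise ||y - X b||^2 + alpha * ||b||^2; alpha = 0 gives OLS."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        XtX[i][i] += alpha  # the ridge penalty is added to the diagonal
    return solve(XtX, Xty)


def centre(values):
    """Subtract the mean so that no intercept term is needed."""
    m = sum(values) / len(values)
    return [v - m for v in values]


random.seed(0)
n = 120  # ten years of monthly samples (synthetic, assumed)

# Two nearly collinear proxies: x2 is x1 plus a small perturbation.
x1 = centre([math.sin(2 * math.pi * t / 28) for t in range(n)])
x2 = centre([v + random.gauss(0, 0.05) for v in x1])

# Synthetic "ozone anomaly": depends on x1 only (true coefficients 1 and 0).
y = centre([v + random.gauss(0, 0.3) for v in x1])
X = [[a, b] for a, b in zip(x1, x2)]

b_ols = fit(X, y)               # ordinary least squares
b_ridge = fit(X, y, alpha=1.0)  # ridge with an assumed penalty strength

print("OLS coefficients:  ", b_ols)
print("Ridge coefficients:", b_ridge)
```

For any positive `alpha`, the Ridge coefficient vector has a smaller norm than the OLS one, which is in line with the abstract's remark that Ridge coefficients are somewhat smaller and less variable. How the regularisation strength should be chosen for real ozone records is not covered by this sketch.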

Status: open (until 02 Jun 2023)

Latest update: 29 May 2023
Short summary
For the first time, a regularised multivariate regression model is used to estimate stratospheric ozone trends. Regularised regression avoids the over-fitting that arises from correlation among explanatory variables. We demonstrate that there are considerable differences between satellite-based and chemical-model-based ozone trends, highlighting large uncertainties in our understanding of ozone variability, and we argue that caution is needed when interpreting results obtained with different methods and data sets.