Deciphering the impacts of meteorology on surface ozone variability in eastern China using explainable machine learning models
Abstract. Understanding how meteorology influences surface ozone variability is critical for interpreting trends and designing effective air quality policies. This study employs explainable machine learning (XML) with SHapley Additive exPlanations (SHAP) to interpret daily ozone variations from 2013 to 2023 across three major regions in eastern China: North China Plain (NCP), Yangtze River Delta (YRD), and Pearl River Delta (PRD). An ensemble of five machine learning models (LightGBM, XGBoost, CatBoost, Random Forest, and Extra Trees) is trained using 14 meteorological variables and two temporal indicators. XML reveals nonlinear, region-specific ozone-meteorology relationships that are broadly consistent with physical understanding, while differences in SHAP attributions across algorithms highlight structural uncertainty arising from multicollinearity among input variables. We use SHAP-derived contributions to attribute warm-season ozone trends to meteorological versus non-meteorological drivers. Before 2019, ozone increases are mainly associated with the temporal proxy for non-meteorological influences (e.g., emission changes), whereas after 2019 meteorological variability dominates regional ozone trends. Exploiting the additive nature of SHAP, we develop a de-weathering framework that partitions daily ozone into a SHAP-based climatological baseline and a meteorology-induced ozone anomaly (MOA). Across all three regions, the magnitude of positive MOA events increases over 2013–2023, while their frequency and duration show no significant trends, indicating a strengthening meteorological amplification of pollution episodes rather than more frequent events. Our results demonstrate both the utility and limitations of XML for disentangling meteorological drivers of ozone pollution and provide new constraints on how meteorology shapes surface ozone under China’s clean air actions.
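The additive partition behind the de-weathering framework can be illustrated with a minimal sketch. It is not the study's implementation: the study uses an ensemble of tree-based models, whereas this sketch swaps in an ordinary least-squares model, for which SHAP values have a simple closed form when features are treated as independent (each attribution is the coefficient times the feature's deviation from its mean). The data are synthetic, and the variable names (`temperature`, `humidity`, `time_index`), the coefficients, and the choice to define the baseline as the base value plus the temporal-proxy contribution are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days = 365

# Synthetic inputs (hypothetical): two meteorological variables and one
# temporal indicator standing in for non-meteorological influences.
temperature = 20 + 8 * rng.standard_normal(n_days)
humidity = 60 + 15 * rng.standard_normal(n_days)
time_index = np.arange(n_days, dtype=float)

X = np.column_stack([temperature, humidity, time_index])
met_cols = [0, 1]   # meteorological features
time_cols = [2]     # temporal proxy

# Synthetic "observed" ozone with noise (made-up coefficients).
ozone = (40 + 2.0 * temperature - 0.3 * humidity
         + 0.02 * time_index + 5 * rng.standard_normal(n_days))

# Fit a linear model with intercept by least squares.
A = np.column_stack([np.ones(n_days), X])
coef, *_ = np.linalg.lstsq(A, ozone, rcond=None)
intercept, weights = coef[0], coef[1:]
pred = A @ coef

# Closed-form SHAP values for a linear model with independent features:
# phi[i, j] = w[j] * (x[i, j] - mean(x[:, j])).
x_mean = X.mean(axis=0)
shap_values = (X - x_mean) * weights
base_value = intercept + x_mean @ weights  # equals the mean prediction

# Additive partition (assumed definition): the meteorology-induced ozone
# anomaly (MOA) is the sum of meteorological SHAP values; the baseline is
# the base value plus the temporal-proxy contribution.
moa = shap_values[:, met_cols].sum(axis=1)
baseline = base_value + shap_values[:, time_cols].sum(axis=1)

# Additivity: baseline and MOA reconstruct the model prediction exactly.
assert np.allclose(baseline + moa, pred)
```

Because SHAP values sum to the prediction minus the base value, any grouping of features yields an exact partition of the modelled ozone; positive values of `moa` mark days on which meteorology raises the modelled ozone above the baseline. With correlated inputs, as the abstract notes, the split among individual variables depends on the algorithm, so grouped sums are the more robust quantity.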