the Creative Commons Attribution 4.0 License.
Explainable Machine Learning diagnosis of Ozone Formation Sensitivity in China: Spatiotemporal Evolution and Driver Attribution
Abstract. Accurate diagnosis of ozone (O3) formation sensitivity (OFS) is crucial for effective control strategies, but a long-term, observation-based, interpretable assessment disentangling the roles of meteorology and emissions at the national scale is lacking. This study integrates OMI tropospheric columns of nitrogen dioxide (NO2) and formaldehyde (HCHO) from 2005 to 2023, using the HCHO/NO2 ratio (FNR) as a proxy to track the spatiotemporal evolution of OFS in China. We develop an explainable machine learning framework coupling Random Forest (RF) and SHapley Additive exPlanations (SHAP) to quantify the contributions of meteorology and emissions at regional scales. Our findings reveal a policy-driven phase reversal in OFS: from 2005 to 2012, rising NO2 columns shifted much of China from NOx-limited to VOC-limited or transitional regimes. Post-2013, the Clean Air Actions led to a decline in NO2 and a modest increase in HCHO, triggering a nationwide return to NOx-limited conditions, especially in eastern China. Regionally, the Sichuan Basin (SCB) remained NOx-limited, the Pearl River Delta (PRD) transitioned rapidly to NOx-limited, and the Beijing-Tianjin-Hebei (BTH), Yangtze River Delta (YRD), and Fenwei Plain (FWP) showed gradual shifts from VOC- to NOx-limited regimes. SHAP analysis identifies temperature and surface shortwave radiation as dominant meteorological drivers, while emission patterns vary regionally: non-methane volatile organic compounds (NMVOCs) dominate in BTH, NOx in PRD, and carbon monoxide (CO) amplifies radical cycling in FWP, YRD, and SCB. These results support a “climate-dominated, emission-modulated” framework for OFS restructuring, offering a transferable diagnostic tool for differentiated O3 control strategies.
Status: open (until 28 Jan 2026)
- RC1: 'Comment on egusphere-2025-5732', Anonymous Referee #2, 06 Jan 2026
- RC2: 'Comment on egusphere-2025-5732', Anonymous Referee #1, 07 Jan 2026
The manuscript tackles a central issue in China’s “post-PM₂.₅ era”: how ozone formation sensitivity (OFS) evolves under the combined influence of emission controls and climate variability. The long-term, national-scale framework based on satellite precursors (OMI NO₂ and HCHO, 2005–2023), together with the integration of an indicator approach (FNR = HCHO/NO₂) and explainable machine learning (RF–SHAP), is compelling and potentially valuable for informing differentiated ozone-control strategies. Overall, the study is clearly structured and well-written. Methods are technically robust. The topic and findings are highly relevant to atmospheric chemistry and broadly align with the scope and scientific standards of Atmospheric Chemistry and Physics.
Nevertheless, several issues should be addressed before the manuscript can be considered suitable for publication in ACP.
Specific comments
- Using policy issuance/implementation (e.g., 2013) as the breakpoint for phase division is reasonable. However, incorporating a formal change-point analysis (e.g., Pettitt test or a Bayesian change-point method) would substantially strengthen the argument by demonstrating whether NO₂, FNR, and/or the regime area fractions exhibit statistically significant structural shifts around 2013. This would make the “policy-driven phase reversal” claim more robust, more publishable, and less open to challenge.
- FNR thresholds (VOC-limited < 1; NOₓ-limited > 2) may vary across seasons, regions, and chemical environments. I recommend adding a clearer statement and discussion, preferably in the Conclusions, on the uncertainty and potential variability of these thresholds, and how such variability might influence regime classification and inferred trends.
- While the manuscript cites relevant literature, a more explicit comparison with previous findings would better contextualize the novelty and contribution of this work. For example, how do the identified meteorological drivers and their relative importance compare with those reported in other regions with similar climatic conditions or under comparable emission-control trajectories?
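The change-point check suggested in the first specific comment could be sketched as follows. This is a minimal pure-Python Pettitt test offered as an illustration, not the authors' method: the function name `pettitt_test`, the commonly used approximation p ≈ 2 exp(−6K² / (n³ + n²)), and the synthetic annual series are all assumptions, and none of the numbers come from the manuscript.

```python
import math


def pettitt_test(series):
    """Pettitt test for a single change point in a 1-D series.

    Returns (change_index, K, approx_p), where change_index is the 0-based
    index of the last sample before the detected shift.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two samples")

    best_t, best_u = 1, 0
    for t in range(1, n):
        # U_t sums the signs of (x_i - x_j) over all i < t <= j.
        u = 0
        for i in range(t):
            for j in range(t, n):
                a, b = series[i], series[j]
                u += (a > b) - (a < b)
        if abs(u) > abs(best_u):
            best_t, best_u = t, u

    k = abs(best_u)
    # Approximate two-sided significance probability.
    p = min(1.0, 2.0 * math.exp(-6.0 * k * k / (n ** 3 + n ** 2)))
    return best_t - 1, k, p


if __name__ == "__main__":
    # Synthetic (hypothetical) annual means for 2005-2023 with a step
    # after 2012; these are NOT values from the manuscript.
    years = list(range(2005, 2024))
    values = [10.1, 10.4, 9.8, 10.6, 10.2, 10.9, 10.5, 10.7,
              6.1, 5.8, 6.3, 5.9, 6.0, 6.2, 5.7, 6.4, 5.6, 6.05, 5.95]
    idx, k, p = pettitt_test(values)
    print("last year before shift:", years[idx], "K =", k, "p ~", round(p, 4))
```

Applied to annual-mean NO₂, FNR, or regime area fractions, a change point near 2012/2013 with a small p-value would support the chosen breakpoint. A strong monotonic trend within either phase can displace the detected point, so a check on detrended series may be worth adding.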
Technical comments
- The manuscript describes filtering HCHO pixels with cloud fraction > 30% and anomalous values (> 1.0 × 10¹⁷ molec cm⁻²), followed by a 3 × 3 moving average. However, it does not specify how cloud fraction is obtained (e.g., from an OMI cloud mask/cloud product) or how “anomalous values” are defined (e.g., statistical outliers versus physically implausible retrievals). In addition, resampling HCHO (0.1° × 0.1°) to the NO₂ grid (0.25° × 0.25°) may introduce spatial-averaging biases; please clarify the resampling method (e.g., nearest neighbor, bilinear interpolation, area-weighted averaging). These details are important for reproducibility and for interpreting the robustness of the derived trends and regimes.
- Figure 2d shows high HCHO values over western China, including the Qinghai-Tibet Plateau. Is this an artifact of the satellite retrieval? If not, what is the likely source? In addition, the contour colors of the inset panel are not consistent with those of the main figure. Please clarify.
- The caption of Figure 2 states 2013–2020, whereas the analysis elsewhere emphasizes 2013–2023. Please revise to ensure consistency and confirm that the stated MK–Sen time window matches the actual analysis period.
- Please standardize the unit notation for NO2 and HCHO columns (e.g., “molec cm-2” or “molecules cm-2”) throughout the manuscript and clarify at first mention that FNR is dimensionless.
- Providing “last access” information (e.g., “last access: 6 September 2025”) is helpful. I suggest also reporting the product version and/or DOI wherever applicable and ensuring consistent formatting and completeness across the manuscript (Methods, Data availability, and any Supplement).
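To make the resampling question in the first technical comment concrete, one of the options named there, area-weighted averaging, might look like the sketch below. It assumes both grids share the same origin and are regular in degrees, ignores the cos(latitude) variation of cell area, and treats `None` as a missing pixel; the function names `overlap_1d` and `regrid_area_weighted` are hypothetical, and nothing here is confirmed to be the authors' procedure.

```python
def overlap_1d(n_src, src_step, n_dst, dst_step):
    """weights[k][i] = overlap length between destination cell k and source cell i."""
    weights = []
    for k in range(n_dst):
        lo, hi = k * dst_step, (k + 1) * dst_step
        weights.append([
            max(0.0, min(hi, (i + 1) * src_step) - max(lo, i * src_step))
            for i in range(n_src)
        ])
    return weights


def regrid_area_weighted(field, src_step, dst_step):
    """Average a 2-D field (list of rows) onto a coarser grid.

    Each destination cell is the overlap-area-weighted mean of the valid
    source pixels it covers; cells with no valid pixel become None.
    """
    n_lat, n_lon = len(field), len(field[0])
    # Number of whole destination cells that fit in the source extent.
    m_lat = int(n_lat * src_step / dst_step + 1e-9)
    m_lon = int(n_lon * src_step / dst_step + 1e-9)
    w_lat = overlap_1d(n_lat, src_step, m_lat, dst_step)
    w_lon = overlap_1d(n_lon, src_step, m_lon, dst_step)

    out = []
    for k in range(m_lat):
        row = []
        for l in range(m_lon):
            num = den = 0.0
            for i in range(n_lat):
                for j in range(n_lon):
                    w = w_lat[k][i] * w_lon[l][j]
                    v = field[i][j]
                    if w > 0.0 and v is not None:
                        num += w * v
                        den += w
            row.append(num / den if den > 0.0 else None)
        out.append(row)
    return out


if __name__ == "__main__":
    # A 5 x 5 block of 0.1-degree pixels maps onto 2 x 2 cells of 0.25 degrees.
    hcho = [[float(col) for col in range(5)] for _ in range(5)]
    print(regrid_area_weighted(hcho, 0.1, 0.25))
```

Because 0.25° is not an integer multiple of 0.1°, every third source pixel is split between two destination cells, which is why the choice between nearest-neighbor, bilinear, and area-weighted resampling can matter for the derived FNR.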
Citation: https://doi.org/10.5194/egusphere-2025-5732-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 146 | 92 | 14 | 252 | 27 | 28 | 26 |
Review: Explainable Machine Learning diagnosis of Ozone Formation Sensitivity in China: Spatiotemporal Evolution and Driver Attribution
Summary:
This paper develops an explainable classification machine learning model with FNR-divided ozone photochemical regimes as a label to quantify the impact of meteorology and emissions on the ozone formation regimes (VOC-limited, NOx-limited, and transitional regimes). The authors provide a comprehensive assessment of the spatiotemporal evolution, seasonality, and the COVID-19 lockdown response of OFS over China, and reveal an apparent two-stage regime shift during 2005-2024. However, the core methodological logic is not fully convincing. Because the regimes are derived solely from the satellite HCHO/NO2 ratio and a prescribed threshold, they do not explicitly encode meteorological effects. ML attribution therefore quantifies drivers of an FNR-based classification proxy rather than providing a physically grounded diagnosis of OFS. This disconnect weakens the manuscript’s ability to address the key gap stated in the Introduction regarding meteorological impacts on OFS. Moreover, FNR thresholds are known to be region-dependent. Applying uniform national thresholds potentially introduces non-negligible uncertainty, affecting OFS analysis. Therefore, the authors should clarify the conceptual rationale of this framework and demonstrate robustness to threshold/label uncertainty before it can be considered for publication.
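The robustness check requested here could start from a simple threshold sweep, sketched below under stated assumptions. The default cutoffs (VOC-limited for FNR < 1, NOx-limited for FNR > 2) are the ones quoted in the discussion; the alternative threshold pairs, the function names, and the sample FNR values are hypothetical and serve only to perturb the defaults.

```python
def classify_regime(fnr, voc_max=1.0, nox_min=2.0):
    """Label one FNR (HCHO/NO2 column ratio, dimensionless) value."""
    if fnr < voc_max:
        return "VOC-limited"
    if fnr > nox_min:
        return "NOx-limited"
    return "transitional"


def regime_fractions(fnr_values, voc_max=1.0, nox_min=2.0):
    """Fraction of grid cells falling in each regime for one threshold pair."""
    counts = {"VOC-limited": 0, "transitional": 0, "NOx-limited": 0}
    for fnr in fnr_values:
        counts[classify_regime(fnr, voc_max, nox_min)] += 1
    total = len(fnr_values)
    return {name: count / total for name, count in counts.items()}


def threshold_sensitivity(fnr_values, threshold_pairs):
    """Regime fractions for several (voc_max, nox_min) pairs."""
    return {
        pair: regime_fractions(fnr_values, pair[0], pair[1])
        for pair in threshold_pairs
    }


if __name__ == "__main__":
    # Hypothetical FNR values for a handful of grid cells.
    fnr_cells = [0.5, 0.9, 1.0, 1.2, 1.8, 2.5, 3.0, 4.0]
    # First pair: thresholds quoted in the discussion; the others are
    # arbitrary perturbations for illustration only.
    pairs = [(1.0, 2.0), (1.5, 2.5), (2.0, 4.0)]
    for pair, fractions in threshold_sensitivity(fnr_cells, pairs).items():
        print(pair, fractions)
```

If the regime area fractions and their trends change materially across plausible threshold pairs, the labels used to train the classifier inherit that uncertainty, which is the point the robustness analysis would need to quantify.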
Specific comments: