the Creative Commons Attribution 4.0 License.
Meteorological influence on surface ozone trends in China: Assessing uncertainties caused by multi-dataset and multi-method
Abstract. China has witnessed notable increases in surface ozone (O3) concentrations since 2013, with meteorology identified as a critical driver. However, meteorological contributions vary with different meteorological datasets and analytical methods, and their uncertainties remain unassessed. This study leveraged decadal observational O3 records (2013–2022) across China, revealing intensified nationwide O3 pollution with increasing O3 trends of 0.79–1.31 ppb yr–1 during four seasons. We gave special focus on uncertainties of meteorology-driven O3 trends by using diverse meteorological datasets (ERA5, MERRA2, FNL) and diverse analytical methods (Multiple Linear Regression, Random Forest, GEOS-Chem model). A useful statistic (coefficient of variation, CV) was adopted as an uncertainty quantification metric. For multi-dataset analysis, models driven by different meteorological datasets exhibited the maximum meteorology-driven O3 trend (+0.55 ppb yr–1, multi-dataset mean) with the highest consistency (CV=0.25) in spring. The FNL-driven model always obtained larger trends compared to ERA5 and MERRA2, which could be attributed to inability to accurately evaluate planetary boundary layer height in FNL dataset. For multi-method analysis, three methods demonstrated optimal consistency in winter (CV=0.40) and the worst consistency in summer (CV=2.00). The meteorology-driven O3 trends obtained from GEOS-Chem model were almost smaller than those obtained by other two methods, partly resulting from higher simulated O3 values before 2018. Overall, all analyses driven by diverse meteorological datasets and analytical methods drew a robust conclusion that meteorological conditions almost boosted O3 increases during all seasons; the uncertainties caused by different analytical methods were larger than those caused by diverse meteorological datasets.
Status: open (until 30 Jun 2025)
RC1: 'Comment on egusphere-2025-1880', Anonymous Referee #1, 14 Jun 2025
This study presents an analysis of the meteorological drivers of surface ozone (O3) trends in China from 2013 to 2022, based on an observational dataset of ozone and various supporting analyses, including statistical analyses, simple machine-learning and chemical transport modeling using GEOS-Chem.
The authors highlight the role of meteorological conditions in driving seasonal and regional ozone increases, and use these analyses to begin a discussion of the uncertainties arising from applying these different supporting datasets. The paper will be of interest for those using such large-scale observational datasets to isolate the drivers of air quality trends and may be of interest to policymakers. The use of a consistent metric is interesting. The paper represents a significant effort in gathering and providing an interesting high-level analysis of different ways to analyse the data.
The main result of the study is to assess the consistency between approaches using a coefficient of variation (CV) metric, in which higher CVs indicate lower consistency of meteorologically driven O3 trends derived from different datasets or methods. Initially this is used as a comparator between datasets, but towards the end of the MS the authors use it more quantitatively, with thresholds of 0.5 and 1.0 being applied to indicate consistency. How were these numbers chosen? What do they mean?
What other metrics could be used for comparison? Most time is spent discussing an analysis using meteorological reanalyses, with the ML and CTM work in a supporting role as challenger methods to the MLR analysis.
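To make the question about the thresholds concrete, here is a minimal sketch of how a CV responds to the size of the mean trend. It assumes the CV is the sample standard deviation divided by the absolute mean; the manuscript's exact definition may differ. The function name and the trend values are hypothetical and are not taken from the manuscript.

```python
import statistics


def coefficient_of_variation(trends):
    """Return sample standard deviation divided by |mean| (assumed definition)."""
    mean = statistics.mean(trends)
    if mean == 0:
        raise ValueError("CV is undefined when the mean trend is zero")
    return statistics.stdev(trends) / abs(mean)


# Hypothetical meteorology-driven trends (ppb/yr) from three datasets.
# Both sets have the same absolute spread (sample std of 0.05 ppb/yr).
large_mean_trends = [0.50, 0.55, 0.60]
small_mean_trends = [0.00, 0.05, 0.10]

cv_large_mean = coefficient_of_variation(large_mean_trends)
cv_small_mean = coefficient_of_variation(small_mean_trends)

print(f"CV with large mean trend: {cv_large_mean:.2f}")
print(f"CV with small mean trend: {cv_small_mean:.2f}")
```

In this sketch the two sets differ only in their mean, yet the CV changes by an order of magnitude. A fixed threshold such as 0.5 or 1.0 may therefore reflect how close the mean trend is to zero as much as the agreement between datasets or methods.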
In section 2, the methods used are described. In the regression-based statistical analysis, the authors first use a time-series filter to retrieve trends in ozone and other fields, and then an MLR-based model to derive the drivers of these trends. I was not able to find further details of the method used as it is in a separate publication that is incorrectly referenced.
The ML study is perhaps the least well justified - six of the predictors are proxies for time, with a further six (pressure, temp, wind speed, RH and PBLH) being deemed sufficient to capture the meteorological drivers of ozone. I have reservations about this approach because the RF model is trained on MDA8 O3 concentrations. Are the authors satisfied that this model is sufficiently accurate that it can be used for attribution of drivers and yield confident results? If so, what is the justification? What is the basis for explaining 50% of the variance to be a threshold for inclusion? I'd like to see more here, particularly the basis for exclusion of e.g. trends in emissions or atmospheric composition which may be drivers. It would seem much more appropriate if they had used RF to predict the recovered LT O3 trend and then used the meteorological data as predictors for the trend. L167 specifies how MDA8 was calculated, but needs much more detail on how the trends were computed.
The use of GEOS-Chem is interesting and the experiment is well-conceived, and the model is well validated in the supporting information. No information on the extraction of the trend data from the GC experiments is given, and this should be included in the main MS. The MERRA2 reanalysis was used to drive the CTM. Given the scope of the MS, why just one reanalysis? It seems that there's an opportunity here to expand the analysis of the uncertainty in the GC trend on meteorological product, and it is certainly necessary to discuss how the lack of independence of the GC and MLR(MERRA2) results affects the analysis in this paper.
Section 3.1 details the results, and leaves some questions unanswered. Please include a discussion of what the analysis says about which are the main drivers, etc. At present, this discussion is more of a comparison with other findings. In fact, the authors note that most of the outcomes are already published elsewhere (L214-L223), which reinforces the need for novel analysis in this section. I believe the MS would be improved by reporting drivers of the trends, particularly as Section 3.2 lumps all these drivers together as the meteorological impact on the MDA8 O3 trends. Maybe a figure showing the contribution of each driver would be useful here.
Section 3.2 addresses the consistency of the MLR results across different reanalyses. I don't understand why the uncertainty in the derived trends is not included here. Could it not be calculated? I suggest it's included, not least to visually assess the consistency/difference between calculated trends and support the CV analysis. If it can be calculated, please add it as an error bar to the figure.
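As an illustration of the kind of trend uncertainty meant here, the sketch below fits an ordinary least-squares line to an annual series and returns the slope with its standard error. It is not the authors' method: the function name and the input values are hypothetical, and the calculation assumes independent residuals, which autocorrelated ozone series may violate.

```python
import math


def trend_with_stderr(years, values):
    """Return the OLS slope and its standard error for values against years."""
    n = len(years)
    if n < 3:
        raise ValueError("at least three points are needed for a standard error")
    x_mean = sum(years) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in years)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, values))
    # Standard error of the slope with n - 2 degrees of freedom.
    stderr = math.sqrt(sse / (n - 2) / sxx)
    return slope, stderr


# Hypothetical annual meteorology-driven O3 component (ppb), 2013 to 2022.
years = list(range(2013, 2023))
values = [0.0, 0.6, 0.9, 1.8, 1.9, 2.7, 2.8, 3.9, 4.1, 4.4]

slope, stderr = trend_with_stderr(years, values)
print(f"trend = {slope:.2f} +/- {stderr:.2f} ppb/yr (one standard error)")
```

The standard error could be drawn as the error bar for each dataset. A confidence interval would additionally need a Student's t multiplier for n - 2 degrees of freedom.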
Section 3.3 confronts the MLR method with its challengers. Here the MS inter-compares the metrics and notes the difference across various domains. This provides a brief description of the uncertainty (ie spread) of results but stops short of providing a good assessment of the importance of individual drivers or in making broad recommendations as to which analysis is the most robust, reliable or useful. The analysis of the FNL results is interesting. My main concern here is with the ML/RF approach: it may be undermined by relatively low skill of the resulting model and resulting first-principles questions as to the robustness of these results - does a statistical model of relatively low skill permit us to say much about the drivers?
Overall, the MS has a number of positive qualities: the use of multi-dataset and multi-method approaches is welcome. The MS shows that the analysis is quite robust for some regions and some seasons, and so has some policy relevance.
The MS would be much improved if the analysis was extended to identify the drivers of what the authors call uncertainty, i.e. intermodel spread. At present, the MS doesn't give enough information on how the ML and GC data were used to compute trends, and whether the calculation was as statistically advanced as for the reanalysis data, separating processes at different timescales. In short, it is unclear whether the comparison is between similar quantities.
It would be interesting to discuss the limitations of working with reanalysis datasets, and indeed the relative strengths and weaknesses of ML and GC data in deriving trends for comparison with observations. The ML and MLR analysis would be stronger if additional chemical, meteorological and climate variables (e.g. solar radiation, soil moisture, vegetation cover, or climate indices like ENSO) were included to capture a fuller picture of ozone drivers, and if their role in driving uncertainty was quantified. Similarly, clustering techniques would be valuable to augment the region-based approach and would provide a better understanding of the similarity between stations.
To enhance the impact of the MS, in broad terms, I'd suggest that the authors provide more detailed justifications for their methods, expand the analysis to include additional variables and uncertainties, and focus on identifying the main drivers of ozone trends. By addressing these points, the value of the study would be increased for researchers and policymakers working to mitigate ozone pollution under changing meteorological conditions.
Finally, regarding data availability, the data do not conform to Copernicus policy, which states that "access to data is by depositing them (as well as related metadata) in FAIR-aligned reliable public data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions." This needs to be addressed by archiving the entire O3 dataset in Zenodo or a similar repository and assigning it a DOI.
Minor comments
L31 rapid not repaid
L266 'uncertainties caused by multi-model' is not clear. How are they caused? What is 'multi-model' in this context?
L296 interesting, but please add reasons why PBLH in FNL introduces these issues.
L300 should read 'for the whole of China'
Citation: https://doi.org/10.5194/egusphere-2025-1880-RC1
Viewed
HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
---|---|---|---|---|---|---|
123 | 18 | 6 | 147 | 7 | 4 | 4 |