the Creative Commons Attribution 4.0 License.
Recommendations on benchmarks for chemical transport model applications in China – Part 2: Ozone and Uncertainty Analysis
Abstract. Ground-level ozone (O3) has emerged as a significant air pollutant in China, attracting increasing attention from both the scientific community and policymakers. Chemical transport models (CTMs) serve as crucial tools in addressing O3 pollution, with frequent applications in predicting O3 concentrations, identifying source contributions, and formulating effective control strategies. The accuracy and reliability of the simulated O3 concentrations are typically assessed through model performance evaluation (MPE). However, the wide array of CTMs available, variations in input data, model setups, and other factors result in a broad range of simulated O3 concentration differences from observed values, highlighting the necessity for standardized benchmarks in O3 evaluation.
Built upon our previous work, this study conducted a thorough literature review of CTM applications simulating O3 in China from 2006 to 2021. 216 relevant articles out of a total of 667 reviewed were identified to extract quantitative MPE results and key model configurations. From our analysis, two sets of benchmark values for six commonly used MPE metrics are proposed for CTM applications in China, categorized into “goal” benchmarks representing optimal model performance and “criteria” benchmarks representing achievable model performance across a majority of studies. It is recommended that the normalized mean bias (NMB) for hourly O3 and daily 8-hr maximum O3 concentrations should ideally fall within ±15 % and ±10 %, respectively, to meet the “goal” benchmark. If the “criteria” benchmarks are to be met, the NMB should be within ±30 % and ±20 %, respectively. Moreover, uncertainties in O3 predictions due to uncertainties in various model inputs were quantified using the decoupled direct method (DDM) in a commonly used CTM. For the simulation period of June 2021, the total uncertainty of simulated O3 ranged 4–25 μg/m3, with anthropogenic volatile organic compound (AVOC) emissions contributing most to the uncertainty of O3 in coastal regions and O3 boundary conditions playing a dominant role in the northwest region. The proposed benchmarks for assessing simulated O3 concentrations, in conjunction with our previous studies on PM2.5 and other criteria air pollutants, represent a comprehensive and systematic effort to establish a model performance framework for CTM applications in China. These benchmarks aim to support the growing modeling community in China by offering a robust set of evaluation metrics and establishing a consistent evaluation methodology relative to the body of prior research, thereby helping to establish the credibility and reliability of their CTM applications. 
These statistical benchmarks need to be periodically updated as models advance and better inputs become available in the future.
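As a minimal sketch of how the NMB benchmarks quoted in the abstract might be applied, the snippet below computes NMB from paired model and observed values and labels the result. The function names, the dictionary layout, and the choice to treat the limits as inclusive are illustrative assumptions, not taken from the manuscript.

```python
# Thresholds (in percent) are the NMB values quoted in the abstract.
# The key names "hourly" and "mda8" are hypothetical labels for
# hourly O3 and daily 8-hr maximum O3.
NMB_BENCHMARKS = {
    "hourly": {"goal": 15.0, "criteria": 30.0},
    "mda8": {"goal": 10.0, "criteria": 20.0},
}


def normalized_mean_bias(model, obs):
    """Return NMB in percent: 100 * sum(model - obs) / sum(obs)."""
    if len(model) != len(obs) or len(obs) == 0:
        raise ValueError("model and obs must be non-empty and equal length")
    total_obs = sum(obs)
    if total_obs == 0:
        raise ValueError("sum of observations must be non-zero")
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / total_obs


def classify_nmb(nmb_percent, kind):
    """Label an NMB value as 'goal', 'criteria', or 'outside'.

    Assumption: limits are inclusive (|NMB| <= limit passes).
    """
    limits = NMB_BENCHMARKS[kind]
    magnitude = abs(nmb_percent)
    if magnitude <= limits["goal"]:
        return "goal"
    if magnitude <= limits["criteria"]:
        return "criteria"
    return "outside"


if __name__ == "__main__":
    obs = [100.0, 100.0]
    model = [112.0, 112.0]
    nmb = normalized_mean_bias(model, obs)
    print(nmb, classify_nmb(nmb, "hourly"), classify_nmb(nmb, "mda8"))
```

The same NMB value can meet the "goal" for hourly O3 yet only the "criteria" for daily 8-hr maximum O3, because the quoted limits for the latter are tighter.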
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2199', Anonymous Referee #1, 27 Oct 2024
Huang et al. present part 2 on proposing benchmarks for CTM applications in simulating ozone in China. The evaluation criteria are based on prior work by Emery et al. (2017), which may be tailored to the U.S. and Europe and not suitable for China, and the authors propose revised criteria and methodology for simulations focusing on China. The work is generally well written, though I have major concerns regarding some areas of the manuscript that need to be clarified before I can recommend this work for publication.
Major comments:
1. L59: "... which may not be suitable for China." Could the authors elaborate on why Emery et al.'s criteria are not suitable and the steps the authors propose for revising them? Is the range of simulated/observed values in China different from other regions? Are there differences in the chemical regimes controlling ozone in China? Differences in the input data uncertainty? Differences in model tuning targeting different regions?
2. L79 and Figure 1 pose "WRF-Chem" as a single model, which is not very accurate. WRF-Chem provides a very large number of chemical schemes (e.g., refer to User's Guide https://repository.library.noaa.gov/view/noaa/14945 Page 14-), ranging from simple RADM2 without aerosols with a dozen species to the MOZART chemical mechanism with hundreds of species, not to mention the different configurations of aerosols, photolysis, and underlying meteorology simulated by WRF. Different papers using different schemes of WRF-Chem are not comparable to each other. Fortunately, the authors do separate the studies by chemical mechanism later in the text (in "Choice of gas-phase chemical mechanism"). I would suggest that this separation be done earlier in the text and in Figure 1 to make it clear that individual chemical mechanisms available in WRF-Chem are evaluated separately and not grouped together. I would request that the supplement data in Table S1 be updated similarly to reflect the chemical mechanism in the WRF-Chem studies.
3. P92 - the authors convert mixing ratios to ug/m3 in the analysis. I understand this may be for consistency with the Chinese MEE observational data which is reported in ug/m3. I recall that there may be a temperature / pressure condition used by China MEE for use in the unit conversion to/from ug/m3 - can the authors confirm that 273.15K at 101.325 kPa is the one used (and possibly provide a reference)? This would affect the model to obs. comparisons and should be clarified.
4. L122 - "a uniform O3 concentration of 29 ppb was used as the initial and boundary conditions (BCs)". I have three questions here -
4.1. I assume 29 ppb is at the surface and there is a vertical profile applied to this? What does the vertical profile look like?
4.2. A 10-day spin-up from uniform initial conditions (and not previously spun-up distributions) of 29 ppb for simulating ozone seems very short. How was this chosen? That is shorter than the mean tropospheric lifetime of ozone (although it may be fine for the PBL) but I have concerns about the effects this may have for free tropospheric ozone and influences from that which may be important for East Asia.
4.3. Can the authors confirm that a uniform 29 ppb is used as the boundary conditions? For regional CTMs the transport from outside the domain, which ventilates the simulated region from the boundary conditions, can be quite important for the ozone distribution inside the simulated domain. Why were "realistic" boundary conditions from a global model not used here?
5. L209... Impact of grid spacing. I would suggest "horizontal resolution" here. The authors claim in L219 that "no clear trend was evident to indicate better model performances as grid spacing decreases." I understand there's further discussion later in this section, but this statement is potentially misleading when left unqualified: the comparison is not controlled for the same model, the same emissions, input data, etc. The authors state at the end of the section that "reducing grid spacing does not necessarily lead to improved model performance if the input data resolution (i.e., spatial resolution of the emissions) is not correspondingly high or well-matched." In my opinion, such an argument is better phrased as a caution about model configuration than as a conclusion. If flawed model configurations, where the input data resolution is insufficient for the model resolution, are analyzed, it is to be expected that improved model resolution may not provide the benefits modelers are looking for. At first glance the authors come close to presenting a "dangerous" argument that model resolution provides no benefits, only to say later that this holds only if the model is configured incorrectly!
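To illustrate why the reference state raised in major comment 3 matters for model-to-observation comparisons, here is a minimal sketch of an ideal-gas conversion from mixing ratio (ppb) to mass concentration (μg/m3). The function name and default arguments are illustrative assumptions; which reference temperature and pressure the observational data actually use is exactly the open question in that comment.

```python
R_GAS = 8.314462618  # universal gas constant, J mol^-1 K^-1
M_O3 = 47.998        # molar mass of ozone, g mol^-1


def ppb_to_ugm3(ppb, temperature_k=273.15, pressure_pa=101325.0,
                molar_mass=M_O3):
    """Convert a mixing ratio in ppb to ug/m3 assuming an ideal gas.

    The defaults (273.15 K, 101.325 kPa) are the conditions quoted in
    the referee comment; they are not confirmed as the ones used by
    the observation network.
    """
    # Molar volume of air at the chosen state, m3 per mol.
    molar_volume = R_GAS * temperature_k / pressure_pa
    # ppb -> mol of gas per mol of air -> mol per m3 -> g per m3 -> ug per m3.
    return ppb * 1e-9 / molar_volume * molar_mass * 1e6


if __name__ == "__main__":
    for temp in (273.15, 298.15):
        print(temp, round(ppb_to_ugm3(29.0, temperature_k=temp), 1))
```

Under these assumptions, 1 ppb of O3 corresponds to roughly 2.14 μg/m3 at 273.15 K and roughly 1.96 μg/m3 at 298.15 K, a difference of about 9 % that would feed directly into bias metrics.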
Specific comments:
- L78: GEOS-Chem is not an acronym - see https://geoschem.github.io/narrative.html.
- L115: delete "grid". What is the model top height?
- L116: What are the other configuration parameters of the WRF simulation providing the meteorology? e.g., PBL scheme, ...
- L119: Link for EDGAR is wrong, www.meicmodel.org is written here.
- L192: Would be helpful to define BTH, YRD, and PRD here for readers unfamiliar with the region terminology.
- L212 "i.e. GEOS-Chem" - GEOS-Chem can be used regionally. Many studies use GEOS-Chem nested for China dating back to Y.X. Wang et al. (2004).
Citation: https://doi.org/10.5194/egusphere-2024-2199-RC1
RC2: 'Comment on egusphere-2024-2199', Anonymous Referee #2, 03 Nov 2024
General Comments:
Given China's unique pollution characteristics, this study develops benchmarks to assess the accuracy of chemical transport models (CTMs) in simulating ground-level ozone (O₃) pollution in China. A systematic literature review was conducted on 216 studies from 2006 to 2021, covering five widely used CTMs (CMAQ, CAMx, GEOS-Chem, WRF-Chem, and NAQPMS) to establish region-specific benchmarks for O₃. The benchmarks are divided into “goal” values (optimal performance) and “criteria” values (achievable performance) for commonly used model performance evaluation (MPE) metrics, including mean bias (MB), normalized mean bias (NMB), root mean square error (RMSE), normalized mean error (NME), correlation coefficient (R), and index of agreement (IOA).
The study also conducts an uncertainty analysis using the decoupled direct method (DDM) with the CMAQ model, identifying key sources of uncertainty in O₃ predictions. Significant contributors to uncertainty include anthropogenic VOC emissions in urban regions and boundary conditions in rural areas. Spatial and seasonal patterns are noted, with regional differences in uncertainty and model accuracy. These benchmarks and uncertainty insights are intended to guide modelers in China, help standardize CTM applications for ozone and improve the reliability of model-based air quality assessments.
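For readers less familiar with the six MPE metrics named above, the following is a minimal sketch using their commonly cited textbook definitions. The function name and return layout are illustrative assumptions, and the manuscript's exact formulas (for example, the variant of IOA) may differ.

```python
import math


def mpe_metrics(model, obs):
    """Compute MB, NMB (%), RMSE, NME (%), R, and IOA for paired data.

    Definitions follow common usage and are assumed here, not taken
    from the manuscript.
    """
    n = len(obs)
    if n != len(model) or n < 2:
        raise ValueError("need at least two paired values")
    diffs = [m - o for m, o in zip(model, obs)]
    mean_obs = sum(obs) / n
    mean_mod = sum(model) / n
    sum_obs = sum(obs)

    mb = sum(diffs) / n
    nmb = 100.0 * sum(diffs) / sum_obs
    nme = 100.0 * sum(abs(d) for d in diffs) / sum_obs
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation coefficient.
    cov = sum((m - mean_mod) * (o - mean_obs) for m, o in zip(model, obs))
    var_mod = sum((m - mean_mod) ** 2 for m in model)
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    if var_mod == 0 or var_obs == 0:
        raise ValueError("R is undefined for constant series")
    r = cov / math.sqrt(var_mod * var_obs)

    # Index of agreement (original Willmott form).
    denom = sum((abs(m - mean_obs) + abs(o - mean_obs)) ** 2
                for m, o in zip(model, obs))
    ioa = 1.0 - sum(d * d for d in diffs) / denom

    return {"MB": mb, "NMB": nmb, "RMSE": rmse,
            "NME": nme, "R": r, "IOA": ioa}


if __name__ == "__main__":
    print(mpe_metrics([60.0, 110.0, 160.0], [50.0, 100.0, 150.0]))
```

Note that a constant offset between model and observations leaves R at 1 while MB, NMB, and RMSE are non-zero, which is one reason the metrics are reported together rather than individually.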
Specific Comments:
- While the study explains the need for China-specific benchmarks, a more detailed comparison with international standards (e.g., differences in precursor emissions and climatic impacts) could reinforce why global benchmarks are unsuitable. The authors may consider adding a more thorough analysis of how China’s unique pollution sources and climate contribute to differences in ozone formation and modeling challenges compared to North American or European settings. This could strengthen the rationale for the proposed region-specific benchmarks.
- This study could be strengthened further by considering higher-order sensitivities or additional metrics in future analyses, mainly to capture uncertainties in meteorological inputs and chemistry beyond first-order impacts.
- While the study provides benchmarks, it offers limited guidance on how modelers or policymakers might implement these benchmarks practically. The authors should consider a short section on practical applications of the benchmarks and uncertainty findings. For example, recommend how modelers could adjust their configurations to meet “goal” benchmarks and suggest ways policymakers might use these insights to set air quality standards or prioritize emissions reduction strategies.
Editorial/minor Comments
Lines 17-18: please consider replacing “other factors result in a broad range of simulated O3 concentration differences from observed values” with “other factors result in a broad range of differences between simulated and observed O3 concentrations.”
Line 37: please delete “their” from “their CTM applications” for conciseness.
Line 83: please replace “with a time range between 2006 and 2021” with “for studies published between 2006 and 2021.”
Line 109: please replace “uncertain analysis” with “uncertainty analysis.”
Line 151: replace “relatively less frequent” with “less frequent by comparison” for clarity.
Line 187 (Figure 3): please use the same y-axis scale for most graphs except R and IOA for easy comparison.
Line 207 (Figure 4): see the above comment for Figure 3.
Line 208 (Figure 5): see the above comment. Please replace “quantile distribution of O3 NMB values in different seasons” with “quantile distribution of O3 R and O3 NMB values in different seasons.”
Line 231 (Figure 6): see the above comment for Figure 3.
Lines 235-236: please fix broken lines.
Line 238: should insert a space for ChemistryMechanism.
Lines 245-246: there is no citation for “SAPRC22 mechanism”
Line 276: more metrics are presented in Table S6 but are not mentioned here.
Line 346: not sure how the uncertainty factor of 1.68 is derived.
Lines 354-355: replace “with a more evenly distributed spatial impact” with “but has a more evenly distributed spatial impact.”
Line 407+ (references): A link is provided for every reference, but some are visible (in blue and underlined) while some are not.
Citation: https://doi.org/10.5194/egusphere-2024-2199-RC2