the Creative Commons Attribution 4.0 License.
CLAQC v1.0 – Country Level Air Quality Calculator. An Empirical Modeling Approach
Abstract. The Country Level Air Quality Calculator (CLAQC) is an open-source modeling tool that utilizes national sectoral emissions and weather data to forecast monthly and annual concentrations of fine particulate matter (PM2.5) and tropospheric ozone (O3). CLAQC leverages the recent advancements in the CAMS system, employing CAMS global gridded emissions and CAMS reanalysis pollutant concentrations to improve the accuracy of its predictions. One of the notable strengths of CLAQC is its ability to provide country-specific and sectoral information. We have developed two methodological approaches, namely elastic net modeling and extreme gradient boosting regressor, that can effectively predict annual average concentrations for nearly all countries. Although both methods show good performance for the country's yearly average, the sectoral contributions are not robust enough for the elastic net models. The tool can simulate a vast range of policy scenarios and can be integrated into national policy assessment and optimization frameworks. Finally, we present a method selection framework for each country to optimize performance, and an online tool displaying model results.
Status: open (until 24 Aug 2024)
RC1: 'Comment on egusphere-2024-995', Anonymous Referee #1, 18 Jul 2024
General comment:
This article proposes two methodologies for assessing the impact of policy scenarios on monthly and annual pollutant concentrations (PM2.5 and O3) at country level on a global scale. The first approach is based on Elastic Net (EN) models and the second on machine learning (ML) models (XGBoost). Both methodologies use several input datasets (pollutant emissions, weather data, and concentration data) that are harmonized to a common grid of 0.5° x 0.5° from 2003 to 2021. Results show that EN models perform well for annual total exposure, but ML models are better for evaluating the contribution of individual emission sectors.
I would like to congratulate the authors of the paper for their interesting work. The article is very well written and clearly structured. The scientific topic is of great interest and the methodologies proposed in the paper are quite innovative compared with the published approaches. However, some clarifications still need to be made in the text, as highlighted in the specific comments, concerning the consideration of secondary inorganic aerosols in the models and the composition of the “Other” sector of emissions for PM2.5 estimations. Some sensitivity tests could also be performed to address the impact of these features on the final estimation, as well as the effect of the proportion of the train and test sets when applying the models.
Specific comments:
Page 1, line 33: The following change should be made: “fine particulate matter (particles with a diameter less than 2.5 µm, PM2.5)”
Page 2, line 4: The following change should be made: “chemistry-transport models (CTMs) are tools for calculating the impact of emissions on pollutant concentration levels”
Page 2, lines 7 to 10: The ACT tool (Air Control Toolbox, Colette et al., 2022) should be mentioned here. ACT is a surrogate model to explore mitigation scenarios in air quality forecasts. It is designed for estimation at the European level on a daily basis.
Colette, A., Rouïl, L., Meleux, F., Lemaire, V., and Raux, B.: Air Control Toolbox (ACT_v1.0): a flexible surrogate model to explore mitigation scenarios in air quality forecasts, Geosci. Model Dev., 15, 1441–1465, https://doi.org/10.5194/gmd-15-1441-2022, 2022.
Page 2, line 19: “The latter one is the most detailed, up-to-date reduced form air pollution model”, the following clarification should be made: “on a global scale”.
Page 3, line 9: Please explain further what you mean by “factors” and the impact of the monitoring of pollutant concentrations in ambient air. Perhaps, it should be mentioned that monitoring stations are used for air quality assessment and the lack of measurement points in an area is a strong constraint in that objective.
Page 3, line 13: “Global, gridded reanalysis data combine and harmonize satellite air pollution measurements with ground-level monitors.” The following clarification should be made: CTM estimates are also used.
Page 3, lines 24 and 25: The following clarification should be made: “the need to homogenize different grids in terms of spatial resolution”
Page 4, line 24: Does off-road transportation correspond to shipping and aviation? Could you please clarify?
Page 5, line 4: Are the natural emissions included in the “Other” sector? Could you please clarify?
Page 5, line 19: How did you manage to downscale the concentrations data to 0.5°? Please explain further.
Page 5, lines 20 and 21: I don’t understand why you must change the unit here. Aren't ECMWF data already in µg/m3? Please clarify.
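To illustrate why a unit change could be needed, the following is a minimal sketch of the two conversions that typically arise. It assumes, hypothetically, that a field is delivered either as a mass mixing ratio (kg/kg) or as a mass concentration in kg/m3; which case applies to the data used in the manuscript is exactly what the authors should state. The function names are illustrative and do not come from the paper.

```python
# Specific gas constant of dry air in J kg^-1 K^-1 (standard physical constant).
R_DRY_AIR = 287.05


def mixing_ratio_to_ug_m3(mmr_kg_per_kg, pressure_pa, temperature_k):
    """Convert a mass mixing ratio (kg/kg) to a mass concentration (ug/m3).

    Air density comes from the ideal gas law for dry air, which is an
    assumption of this sketch (humidity is ignored).
    """
    air_density = pressure_pa / (R_DRY_AIR * temperature_k)  # kg/m3
    return mmr_kg_per_kg * air_density * 1e9  # kg/m3 -> ug/m3


def kg_m3_to_ug_m3(conc_kg_per_m3):
    """Convert a mass concentration from kg/m3 to ug/m3 (pure scaling)."""
    return conc_kg_per_m3 * 1e9
```

For example, a mixing ratio of 1e-7 kg/kg at 101325 Pa and 288.15 K corresponds to roughly 122.5 µg/m3 under these assumptions.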
In addition, the main text refers to Figure 2.3, but the figure is numbered Figure 1. To make the figure easier to read, ticks should be added to the two colorbars so that they correspond to the tick labels in panels a) and b).
Figure 2 title: The expression “weighted by the population” should be mentioned in the title.
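For readers unfamiliar with the term, a population-weighted country average can be sketched as follows. This is a generic illustration of the weighting, not the authors' implementation; the function name and the two-cell example are hypothetical.

```python
def population_weighted_mean(concentrations, populations):
    """Average grid-cell concentrations, weighting each cell by its population."""
    if len(concentrations) != len(populations):
        raise ValueError("inputs must have the same length")
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


# Two hypothetical grid cells: 10 ug/m3 with 3000 people, 30 ug/m3 with 1000.
print(population_weighted_mean([10.0, 30.0], [3000, 1000]))  # 15.0
```

The unweighted mean of the same two cells would be 20 µg/m3, which is why the figure title should state the weighting explicitly.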
Page 7, lines 25 to 30: Not considering secondary aerosols in the model is a choice made for simplicity. Have sensitivity tests been carried out on the impact of this choice on the final estimation?
Page 9, line 13: Have sensitivity tests been carried out on the splitting of the training and test data sets? If not, these tests should be considered.
Page 9, line 13: It is not clear whether the whole period of the dataset (from 2003 to 2021) is used to train the model. Please clarify in the main text what the periods of the train set and the test set are. It is mentioned that the perturbations of emissions are applied to the last 5 years of data (page 9, line 27); is the model therefore trained on 2017 to 2021?
Page 10, line 23: If the Other sector includes natural emissions, it could have a significant impact on PM2.5 concentrations (from desert dust and sea salt). That may bias the estimate of the machine learning model. This sector should be considered.
Page 11, line 7: I’m a bit confused by the treatment of secondary inorganic aerosol formation in the model. It is explained in section 3.2: “It is crucial to understand that in situations where secondary reactions substantially affect the overall mass of PM2.5 within a country, our models are designed to omit these precursors from the list of predictors, thereby not reflecting a decrease in PM2.5 levels.” Please clarify whether the secondary inorganic aerosols are excluded or not.
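The kind of split sensitivity test suggested above could look like the sketch below. It is a hedged illustration only: a one-predictor least-squares line stands in for the EN and XGBoost models of the manuscript, and all function names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root-mean-square error of the fitted line on (xs, ys)."""
    errors = [(intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(errors) / len(errors)) ** 0.5


def split_sensitivity(xs, ys, train_fractions, seed=0):
    """Return the test RMSE obtained for each candidate train fraction."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)  # one fixed shuffle for all fractions
    results = {}
    for fraction in train_fractions:
        # Keep at least two training points and one test point.
        n_train = min(max(round(len(indices) * fraction), 2), len(indices) - 1)
        train, test = indices[:n_train], indices[n_train:]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        results[fraction] = rmse(
            [xs[i] for i in test], [ys[i] for i in test], intercept, slope
        )
    return results
```

If the test error changes little across, say, 50 %, 70 % and 90 % train fractions, the chosen split is unlikely to drive the reported performance.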
Page 11, line 29: Please clarify what you mean by: “We randomly split a gridded data set stratifying by grid cell. Hence, randomization occurs over the temporal dimension.”
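One possible reading of the quoted sentence is that every grid cell contributes the same share of its time steps to the test set, so that the random draw happens along the time axis within each cell. The sketch below illustrates that interpretation only; it is an assumption, not a description of the authors' code, and the record layout `(cell_id, time_index, value)` is hypothetical.

```python
import random
from collections import defaultdict


def split_stratified_by_cell(records, test_fraction=0.2, seed=0):
    """Split (cell_id, time_index, value) records so each cell is sampled equally.

    Within every grid cell, time steps are shuffled and a fixed fraction is
    held out, so randomization happens over the temporal dimension.
    """
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for record in records:
        by_cell[record[0]].append(record)

    train, test = [], []
    for cell_id in sorted(by_cell):
        rows = list(by_cell[cell_id])
        rng.shuffle(rows)
        n_test = round(len(rows) * test_fraction)
        test.extend(rows[:n_test])
        train.extend(rows[n_test:])
    return train, test
```

Under this reading every cell appears in both sets, which differs from a spatial hold-out where whole cells are withheld; the manuscript should say which of the two is meant.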
Page 11, line 31: Why is the splitting of the training and test data sets different from that of the EN model? As for the EN model, were sensitivity tests on the choice of the training / test sets carried out?
Page 12, line 2: Why not say “Emission scenarios” instead of “Stylized scenarios” in the title of section 4.1?
Page 12, line 5: Why are emission perturbations ranging to +60%, when we would expect policy scenarios to necessarily seek to reduce precursor emissions?
Page 13, line 4: Move Figure 6 into the main text. The panels of this figure are very small, and it is very difficult to read the figure correctly. It would be preferable to prepare one figure per model and per pollutant to be more readable.
Page 13, line 26: Move Figures 7 and 8 into the main text.
Page 13, line 28: Could you explain why models work better for O3 than for PM2.5?
Page 13, line 42: The following clarification should be made: DACCIWA is preferred to CAMS over Africa only. (same page 14, lines 1 and 2).
Page 13, lines 3 and 4: Move Figures 9 and 10 into the main text.
Page 14, line 10: What do you mean by “measurement error of unknown distribution”? Could you please clarify?
Page 14, line 39: Extending the approach to an ensemble of the models is a promising direction for improving the final estimate.
Citation: https://doi.org/10.5194/egusphere-2024-995-RC1