Identifying decadal trends in deweathered concentrations of criteria air pollutants in Canadian urban atmospheres with machine learning approaches

Yao, Xiaohong; Zhang, Leiming

doi:https://doi.org/10.5194/egusphere-2023-2968

AC1: 'Supplementary Information to egusphere-2023-2968', Leiming Zhang, 21 Dec 2023

RC1: 'Comment on egusphere-2023-2968', Anonymous Referee #1, 09 Jan 2024

# Reviewer comments on egusphere-2023-2968: Identifying decadal trends in deweathered concentrations of criteria air pollutants in Canadian urban atmospheres with machine learning approaches

## Overall comments

This manuscript presents a trend analysis of various air pollutants in 10 Canadian cities over a period of up to 30 years. The trend analysis controls for weather over the analysis period, and a second component of the data analysis explores outliers, namely wildfire events, and their influence on the observed trends. The results are as expected and in line with observations from other urban areas in developed economies. Namely, NOx (NO2 is focused on here), CO, and SO2 mole fractions have declined over the analysis period with CO and SO2 becoming less of an "optional issue" in modern times. The reduction of NOx (specifically NO) has however produced the urban O3 rebound where the reduction of NO has resulted in increasing or stable trends of O3. PM2.5 trends are more mixed because of location-specific features of primary emissions and secondary generation processes at local and regional scales. In this sense, the manuscript does not contribute much that is new to the literature with respect to processes or mechanisms, but the results are likely important to the Canadian cities in question. I defer to the editor to make a judgement on this novelty point. The manuscript has been well-written and constructed.

My general feedback is as follows. The authors have used two closely related methods to conduct their meteorological normalisation, and these two methods more or less produce the same result. I think it would be useful to add an explicit method comparison objective to the study and add a paragraph to the results or discussion section addressing the similarities and differences between the methods. The manuscript only considers mean slopes (as determined by Mann-Kendall tests) for the trend analysis when exploring the change over time for pollutants and locations. I would like to see these trend slopes plotted alongside the non-normalised and normalised concentrations, at least for one example. It should also be acknowledged and discussed further that collapsing a time series into a single mean slope value misses other changes over time which may also be important when considering the introduction of air quality management policies, especially immediately after a policy change. The manuscript lacks an in-depth site or location-specific interpretation of the results because the focus is placed on stating the trends observed. It seems that many of the anomalies, for example, Hamilton's SO2 (a port city), could be further explained by city-specific interpretation. I am not familiar with these cities, but the manuscript would be far stronger if more site and city-specific information were evaluated and added to the discussion. The authors have used an approach called the identical-percentile autocorrelation method to both determine outlier events (that are driven by wildfires) and their influence on the trend. I believe the text in the methods needs to be addressed further because I cannot follow clearly how and why this approach is conducted.

## Specific comments

Line 13. Why has (NO2 + O3) been used over Ox in the manuscript? Is this not a standard abbreviation used in the atmospheric sciences?

Line 16. Replace including with the: "...methods, the random forest algorithm..."

Line 61. Use the superscript notation for 60 ug m-3

Line 63. Stress that this is an issue for all areas of the world.

Line 82. Elemental carbon rather than element carbon.

Line 90. Ozone to O3.

Line 97. Remove "The" at the start of the sentence.

Line 100. Both Grange et al., 2018 and Grange and Carslaw, 2019 are usually cited together here.

Line 109. Ox?

Line 120. Should a method comparison objective be added regarding the two decision tree methods that are implemented?

Line 145. This logic might be questionable. If observations are not accessible for a site, using the nearest site might not be a good substitute because there could be a change of site type, generally a shift from urban background to urban traffic or vice-versa. If this were to happen, the time series no longer represents the same monitoring conditions. In a related question, why was the analysis not conducted at a site level? Were the time series generally not continuous among the cities for the analysis period?

Line 173. Was dew point an important variable for training and prediction? I would expect not if relative humidity and temperature were included. Was an importance analysis conducted?

Line 184. The ERA5 reanalysis global model product provides these additional variables and could be included in future analyses.

Line 201. These figures show that the two methods produce more or less the same result, which relates to my general comment above. The scatterplots show different observations because the two methods use different training and testing sets. This is probably worth a note in the caption.

Line 226. I think this section needs to be revised for simplicity. I do not understand how this approach isolates and quantifies the effect of the extreme events. Could a simple outlier test suffice?
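As one illustration of what a "simple outlier test" could look like, the sketch below flags values by a robust z-score built from the median and the median absolute deviation (MAD). This is not the identical-percentile autocorrelation method of the manuscript; the function name, the 3.5 threshold, and the synthetic PM2.5 values are assumptions made purely for illustration.

```python
from statistics import median

def flag_outliers_mad(values, threshold=3.5):
    """Return indices whose robust z-score exceeds the threshold (hypothetical helper)."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread, so the robust score is undefined
    # 0.6745 rescales the MAD so it is comparable to a standard deviation for normal data
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - centre) / mad) > threshold]

# Synthetic hourly PM2.5 values (ug m-3) with one smoke-like spike (made-up data)
pm25 = [8, 9, 10, 11, 12, 10, 9, 150]
flagged = flag_outliers_mad(pm25)
```

A test of this kind only marks unusually high values; attributing them to wildfires would still need independent evidence.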

Line 282. Was block bootstrapping used, or the Mann-Kendall trend test?
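For readers less familiar with the test named here, the following is a minimal sketch of the Mann-Kendall statistic with its normal approximation, together with the associated Theil-Sen slope. It assumes no tied values and no serial correlation (no tie or autocorrelation correction is applied). The function names and the synthetic annual means are hypothetical and do not reproduce the manuscript's calculations.

```python
from itertools import combinations
from math import erf, sqrt
from statistics import median

def mann_kendall(values):
    """Return (S, z, two-sided p) for the Mann-Kendall test, ignoring ties."""
    n = len(values)
    # Sum the signs of all later-minus-earlier differences
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(values, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return s, z, 2 * (1 - normal_cdf)

def sen_slope(times, values):
    """Median of all pairwise slopes (Theil-Sen estimator)."""
    return median((values[j] - values[i]) / (times[j] - times[i])
                  for i, j in combinations(range(len(values)), 2))

# Synthetic annual means declining by 0.5 units per year (made-up data)
years = list(range(2000, 2010))
annual_mean = [20 - 0.5 * k for k in range(10)]
s_stat, z_score, p_value = mann_kendall(annual_mean)
slope = sen_slope(years, annual_mean)
```

Block bootstrapping, by contrast, would resample contiguous blocks of the series to build an empirical distribution for the slope, which is why the two options are worth distinguishing in the text.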

Line 294. Please consider line plots for this type of plot. It is understandable however if points work better.

Line 311. How was 5% determined? To get a robust uncertainty measurement, a number of tests would need to be run and compared to a ground truth?

Line 315. Please consider plotting the values of the trends together too. This would give a good graphical comparison among all the different time series.

Line 357. Can you conclude CO and SO2 are no longer an issue across most of Canada's urban areas?

Line 402. Do you have an explanation of why the two closely related algorithms had a larger difference in this particular case?

Line 416. I am not familiar with Hamilton, Ontario, but a quick look shows this city is a port city. This feature would probably explain the observed anomalous SO2 behaviour when considering other cities.

Line 422 and 452. Ox?

Line 515. The same question as line 402, is there an explanation for this behaviour between the decision tree algorithms?

Line 530. Add "a": "Note that a large....". It would be useful to state that this indicates a high variability.

Line 552. How much higher?

Line 555. It sounds like high AQHI values are found in all seasons. Perhaps stating the frequency per season would be a clear way to present these differences.

Line 569. Replace "were" with "are".

Line 609. I like this section; it is an important component of the study, as this source of air pollutants will become more of an issue in the future. I would, however, like some clarity on the method so that it can be better understood how these conclusions were reached.

Line 643. Add this statement to the conclusions too.

Citation: https://doi.org/10.5194/egusphere-2023-2968-RC1

AC2: 'Response to Referee #1', Leiming Zhang, 26 Mar 2024

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2968/egusphere-2023-2968-AC2-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2023-2968-AC2

RC2: 'Comment on egusphere-2023-2968', Anonymous Referee #2, 31 Jan 2024

Overall Comments

This manuscript analyzes the long-term trends of air pollutant concentrations of NO2, SO2, CO, O3, (NO2+O3) and PM2.5 in 10 Canadian cities using observational concentrations and deweathered concentrations. The latter were generated with two machine learning methods. Correlation analysis was carried out to evaluate the association between concentrations and provincial level air emissions. Further, the impact of wildfire on PM2.5 air quality trend was investigated apparently by assuming all high concentrations were due to wildfires. The three datasets, the observed concentrations, and the deweathered concentrations by each of the two methods, yield similar trends in general. Therefore, the variation in weather conditions has a very small impact on the trends of annual mean concentrations, as expected. The long-term trends are consistent with results reported by other researchers with a few exceptions. The topic is relevant to the journal of ACP. The manuscript has great potential to advance our knowledge in terms of the effectiveness of existing control measures in Canada and the directions of future mitigation policies. However, more in-depth analysis could strengthen the scientific contributions of the manuscript. Further, there are quite a few clarification issues. These are listed in the section below.

Specific Comments

Abstract

Line 20, “on the time scale of 20 years or longer, the perturbation from varying weather conditions exerted a very minor influence on the decadal trends of original annual averages (within ±2%) in ~70% of the cases, and a moderate influence up to 16% of the original trends in the other 30% cases”. Those statistics were not presented in the main body. Further, leaping from the difference in the annual means between the observed and deweathered data to “influence on the decadal trends of original annual averages” could be problematic.

Introduction

A review of the “two popular machine-learning packages” (Line 167) should be presented, including the advantages over other methods, shortcomings or limitations, and applications in air quality studies.

Method

The training of some machine learning methods requires input of known values of the dependent or output variable to facilitate the learning. For example, observed O3 concentrations are required to train the model to predict O3 concentrations using O3 precursor concentrations. Kindly specify all input variables in the training stage as well as the input and output during the testing stage. If deweathered air pollutant concentrations are required in the training phase, provide the sources of such datasets.

Kindly clarify that the training and testing were conducted for each site for each pollutant. It would be useful to have the performance matrix presented.

Kindly clarify 1) whether the performance presented are the results of the training phase or testing phase, 2) whether the trends presented are based on the training datasets, testing datasets, or the entire dataset, for the original, RF algorithm, and the BRTs, respectively.

Fig 1 suggests that the observed concentrations were used in training and testing. There are two questions: 1) would a well trained and tested model be expected to predict concentrations with systematic and significant deviation from the observed values, and 2) when the predicted concentrations deviate systematically and significantly from the observed values, is the bias due to a) a poorly trained model, b) uncertainty of the model predictions, c) influence of some factors, such as weather conditions, or d) some combination of the causes listed in a)-c)?

Consequently, the use of regression slopes between the original and deweathered annual means to infer “perturbation from varying weather conditions” or “influence on the decadal trends of original annual averages” is debatable. As the authors pointed out, the predictions of both models carry large uncertainties, and the agreement between the two models could be poor at times (Line 399). Therefore, some statements in the Abstract could be rephrased. Nonetheless, the 95% confidence intervals (CI) of each of the three annual concentration slopes, i.e., observed, RF-deweathered, BRTs-deweathered, could be used to determine whether the slopes are statistically different at 95% CI, and if yes in some cases perhaps derive a bound of the influence of the weather conditions on the long-term trend of original annual averages, with caution.
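The comparison suggested here could be sketched as follows: compute a Theil-Sen slope with an approximate 95% confidence interval for each annual series, then check whether the intervals overlap. The interval uses a rank-based normal approximation without tie correction. The function names and the two synthetic series are assumptions for illustration only.

```python
from itertools import combinations
from math import sqrt
from statistics import median

def sen_slope_ci(times, values, z=1.96):
    """Theil-Sen slope with an approximate confidence interval (z=1.96 for 95%)."""
    n = len(values)
    slopes = sorted((values[j] - values[i]) / (times[j] - times[i])
                    for i, j in combinations(range(n), 2))
    n_slopes = len(slopes)
    # Standard deviation of the Mann-Kendall S statistic, ignoring ties
    sigma = sqrt(n * (n - 1) * (2 * n + 5) / 18)
    lower_idx = max(round((n_slopes - z * sigma) / 2) - 1, 0)
    upper_idx = min(round((n_slopes + z * sigma) / 2), n_slopes - 1)
    return median(slopes), slopes[lower_idx], slopes[upper_idx]

def intervals_overlap(ci_a, ci_b):
    """True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Synthetic annual means for one city (made-up data)
years = list(range(2000, 2012))
observed = [20.0, 19.1, 19.4, 18.2, 17.9, 17.1, 17.4, 16.0, 15.8, 15.1, 15.3, 14.2]
deweathered = [19.8, 19.3, 18.9, 18.4, 17.8, 17.3, 16.9, 16.3, 15.9, 15.4, 14.9, 14.4]

obs_slope, obs_low, obs_high = sen_slope_ci(years, observed)
dw_slope, dw_low, dw_high = sen_slope_ci(years, deweathered)
same_within_ci = intervals_overlap((obs_low, obs_high), (dw_low, dw_high))
```

Overlapping intervals do not by themselves show that two slopes are equal, so a check like this is a screening step rather than a formal test.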

Line 117, “To establish the relationship between air pollutants concentrations and emission reductions, the deweathered and original mixing ratios (or mass concentrations) of the air pollutants were correlated with the corresponding provincial-level emissions.” Did you mean the concentrations and emissions were correlated as seen in previous analysis, or correlation between concentrations and emissions was analyzed in this study? Kindly specify whether Pearson or Spearman correlation analysis was used and justify the method selection.
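To make the Pearson versus Spearman distinction concrete, the sketch below computes both coefficients for a monotonic but non-linear relationship. The emission and concentration values are synthetic and the helper names are hypothetical. Spearman is implemented here as Pearson applied to average ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Synthetic annual emissions (kt) and mean concentrations (ppb); made-up data
emissions = [100, 90, 80, 70, 60, 50]
concentration = [30.0, 22.0, 17.0, 14.0, 12.5, 12.0]
r_pearson = pearson(emissions, concentration)
r_spearman = spearman(emissions, concentration)
```

In this example the rank-based coefficient reaches 1 while the linear coefficient stays below it, which is the kind of difference a justification of the method choice would need to address.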

The use of “provincial-level emissions” (Line 120) in the analysis should be justified when there is a large variation of emission reductions among different cities in a province, such as Ontario. Note that the National Pollutant Release Inventory (National Pollutant Release Inventory - Canada.ca) could provide emission inventories at a smaller spatial scale.

Line 129, “2.1 Monitoring sites and data sources”. Kindly specify 1) the averaging time and unit of each pollutant. 2) when a large percentage of data is missing in a particular year, either pollutant concentrations or meteorological parameters, was that year considered in the trend analysis?

Line 189, “The testing datasets were different between the RF algorithm and the BRTs.” Kindly 1) justify the use of different testing datasets for the RF algorithm and the BRTs and the impact of this approach on the comparison of the performance of the two machine-learning methods. 2) whether the training datasets were the same or different between the RF algorithm and the BRTs.

Line 265, “100th percentile”, kindly explain this term, considering that the maximum concentration is still part of the sample. For example, the 90th percentile means 90% of the data points are below that value. Similarly, “95th-100th percentile PM2.5 mass concentration” (Line 519) needs explanation.
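One common convention, shown in the minimal sketch below, interpolates linearly between order statistics, so that the 100th percentile coincides with the sample maximum and a "95th-100th percentile" band is the set of values at or above the 95th percentile. Whether the manuscript follows this convention is not stated. The function name and the synthetic data are assumptions.

```python
def percentile(values, p):
    """Percentile with linear interpolation between order statistics (0 <= p <= 100)."""
    ordered = sorted(values)
    rank = p / 100 * (len(ordered) - 1)
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    fraction = rank - low
    return ordered[low] + fraction * (ordered[high] - ordered[low])

concentrations = list(range(1, 101))  # synthetic values 1..100
p100 = percentile(concentrations, 100)  # equals the sample maximum under this convention
p95 = percentile(concentrations, 95)
top_band = [c for c in concentrations if c >= p95]  # the "95th-100th percentile" band
```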

Line 282, kindly justify the use of “The M-K analysis” instead of other trend detection methods.

The attribution of all high PM2.5 concentrations to wildfire seems speculative or qualitative. The authors may want to clearly state the assumptions or use wildfire database and airmass directions to identify concentration data points under heavy influence of wildfires.

Line 576, “Thus, O3 data with mixing ratios lower and higher than 40 ppb were analyzed separately below, with the former case representing net O3 sinks occurring in the atmospheric boundary layer and the latter one representing net O3 sources occurring therein (Table 3).” Kindly provide citations to support the classification of net source or net sink, i.e., 39 ppb being net sink and 40 ppb being net source. Atmospheric reactions of O3 suggest that both production and consumption occur at urban centers and a short distance downwind. Further, O3 concentrations vary significantly with season and city. Perhaps O3 concentrations collected at a background site could be used instead. Alternatively, the use of city specific median value to obtain two concentration levels in each city could be considered instead of a fixed value of 40 ppb.
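The alternative proposed here, a city-specific median instead of a fixed 40 ppb cut-off, can be illustrated with a short sketch. The city labels, the O3 values, and the helper name are hypothetical.

```python
from statistics import median

def split_by_threshold(values, threshold):
    """Split values into those below the threshold and those at or above it."""
    below = [v for v in values if v < threshold]
    at_or_above = [v for v in values if v >= threshold]
    return below, at_or_above

# Synthetic O3 mixing ratios (ppb) for two hypothetical cities
o3_by_city = {
    "City A": [22, 28, 31, 35, 38, 41, 44, 52],
    "City B": [12, 15, 18, 21, 24, 27, 33, 45],
}

# Fixed threshold of 40 ppb, as in the manuscript text quoted above
fixed_split = {city: split_by_threshold(v, 40) for city, v in o3_by_city.items()}
# City-specific median threshold, as suggested in this comment
median_split = {city: split_by_threshold(v, median(v)) for city, v in o3_by_city.items()}
```

With a fixed cut-off the two groups can be very unbalanced in a low-O3 city, whereas the median split yields equal-sized groups by construction.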

Results

When reporting model performance in the main body, the units of the statistical metrics should be included when applicable.

When presenting r or R2 values, p-values should be included in the plots.

Line 403, “The increased uncertainties led to the difference between the RF-deweathered and original SO2 mixing ratios being up to 16% in Winnipeg.” Kindly clarify how these uncertainties were quantified for each pollutant in each city.

Discussion

This section is a mixture of method, results, and discussion. Consolidating all methods and results into their respective sections could improve the readability of the manuscript.

Conclusions

Limitations of the study could be included.

The reviewer did not find any conclusions on

1) “the perturbations from varying weather conditions on the observed mixing ratios” or on the long-term trends (Line 104)

2) whether the deweathered datasets yield any trends which are statistically different from that by the original dataset,

3) the benefit of employing two machine learning methods, and

4) whether the deweathering process is recommended in trend analysis of air quality.

Overall, the scientific contribution and policy implication of the manuscript could be strengthened by considering the following,

Incorporating Canadian perspectives; perhaps a map showing the locations of the 10 cities could aid the discussion of regional or transboundary inputs, if any.
Providing city specific information, such as site classification, major emission sources of each pollutant in each city, proximity to major point sources, emissions of such point sources from NPRI.
Tidying up the interpretation of statistical results.
Offering more reasoning, for example, whether the small influence of weather conditions on the 2-3 decades trends of air quality is expected and why; reasons of large discrepancy in emission trend and concentration trend, such as the decreasing trends in NO2 concentrations when NOx emissions were increasing during the same period (L352), and no trend in CO concentrations when CO emissions were increasing during the same period (L381).
Including more in-depth analysis, e.g., whether the deweathered datasets yield any trends which are significantly different from that by the original dataset and why, and whether the two deweathered datasets yield similar or different trends, and why.

Other Clarification Issues

Citations seem missing at times, e.g., L46, “CAAQS”; L132, NAPS; L282, The M-K analysis.

L95, “but most modeling results suffer from large uncertainties, which could exceed annual average changes of the simulated pollutants.” Kindly clarify. Did you mean, “but most modeling results suffer from large uncertainties which could exceed changes in annual means of the simulated pollutant concentrations”?

L99, “weather and/or meteorological conditions”, kindly specify weather conditions and meteorological conditions, or perhaps choose one.

L104, L14, L306, L309, L601, “the perturbations from varying weather conditions on the observed mixing ratios”, The reviewer is unsure about the “perturbations from varying weather conditions”, perhaps “perturbations due to varying weather conditions”, “perturbations from normal weather conditions”?

L105. “criteria air pollutants”. Kindly specify whether it is “some criteria air pollutants”, or “all criteria air pollutants”.

L130, the list of the 10 cities should include the provinces.

L145, kindly provide 1) a list of monitoring stations in each of the 10 cities that are within 1 km, and 2) a list of sites with one or more years of unfilled missing data. Occasionally, long-term monitoring stations are relocated within a city. Was that encountered in this study?

Line 147, “SO2, CO, NOx and PM2.5 emission data”, kindly specify 1) the reporting time period, e.g., annual or monthly, 2) the types of emissions included/excluded, such as residential wood burning and wildfires.

Line 158, the category of “AQHI between 4-6” should be provided.

Line 166, “2.2 Statistical analysis”, data sources of “meteorological parameters” are better placed in section 2.1 Monitoring sites and data sources.

Line 171, “(hour, day, weekday, week and month)”, kindly specify each parameter, e.g., hour (0-23), day (1-365 or 366), week (1-52), month (1-12).
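As an example of how such time variables and their ranges might be defined, the sketch below derives them from a timestamp with the Python standard library. The use of ISO week and ISO weekday numbering is an assumption; the manuscript may define these variables differently. Note that ISO weeks run to 53 in some years.

```python
from datetime import datetime

def time_features(ts):
    """Return the time variables for one timestamp (hypothetical helper)."""
    iso_year, iso_week, iso_weekday = ts.isocalendar()
    return {
        "hour": ts.hour,                        # 0-23
        "day_of_year": ts.timetuple().tm_yday,  # 1-365, or 1-366 in leap years
        "weekday": iso_weekday,                 # 1 (Monday) to 7 (Sunday)
        "week": iso_week,                       # ISO week, 1-52 or 1-53
        "month": ts.month,                      # 1-12
    }

# Last hour of a leap year, which exercises the upper end of each range
features = time_features(datetime(2020, 12, 31, 23))
```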

Line 185, “Nevertheless, good performance can still be achieved in the present study mainly because of multi-decade length of the datasets”. Kindly provide evidence that a large dataset would lead to a good performance or rephrase.

Some results are in the Method section, e.g., Line 201-215, which could be better placed in the Results section.

Some methodology descriptions are in the Results or Discussion sections, e.g., Line 563-577, which could be better placed in the Method section.

Line 292, “Fig. 3a and b show decadal variations in the original annual averages of NO2 mixing ratios...” The reviewer could not find “b” being NO2 concentrations. Similarly, not both “Fig. 3c, d” (Line 463) are PM2.5 concentrations.

Line 370, “Halifax (90-92%)…”, kindly clarify the meaning of the ranges.

Line 376 and other places including tables, the term “grand total and transportation emissions” is confusing. Kindly clarify whether there is one item, i.e., “grand total including transportation”, or two items, i.e., “grand total excluding transportation and transportation emissions”. Similarly, “total grand” in some tables.

Line 440, “The increased O3 mixing ratio values likely reflected the lower limit resulted from the reduced titration reaction between O3 and NO (Simon et al., 2015; Xing et al., 2015).” Kindly rephrase.

Line 555, seasonal averages, kindly specify which months are classified as each of the four seasons.

Line 669, kindly clarify whether 1) “decrease in NO2 during the last 2-3 decades varied by 37%-62%”, or “decrease in NO2 during the last 2-3 decades ranged 37%-62%”, and 2) “37%-62%” are among the three datasets or among the 10 cities.

Tables should be referenced when reporting results.

Fig 1, kindly clarify what is being predicted by the models, deweathered concentrations or observed concentrations. If the former, kindly clarify the source of observed deweathered concentrations. If the latter, kindly justify the need of those models.

“Fig. 2. Correlations between hourly PM2.5 concentration in a single year and its 22-year average in each hour in Edmonton.” Did you mean, “Fig. 2. Correlations between hourly PM2.5 concentration in a single year and 22-year average PM2.5 concentration in each hour of a year in Edmonton”? Furthermore, the reviewer could not find “percentile series” in the “Left column”.

“Fig. 4 Deweathered hourly mixing ratios of O3 (left column) and NO2+O3 (right column) at levels ≥40 ppb in five eastern Canadian cities.” These bar charts seem to suggest little variability among hourly concentrations within any of the years.

Fig 6. Kindly clarify the pollutant studied and which factor the “perturbation contribution” refers to.

Editorial Suggestions

Typos, syntax errors, and awkward word choices could be corrected. For example,

Line 44, “human health and the Environment”, maybe “human health and the environment”.

Line 62, “95% cities”, perhaps “95% of cities”.

Line 113, “accurately quantify”, suggest considering a more conservative term such as “better quantify”.

Line 130 and other places, “Quebec”, “Quebec City” could be more appropriate.

Line 163, “British Columbia Province”, could be replaced with “British Columbia” or “the province of British Columbia”. Similarly, “Alberta province” (Line 354).

Line 172, “ambient temperature”?

Line 192, “coefficient of determination (R²)” could be more appropriate.

Line 206, “reasonably well reproduced”, kindly rephrase.

Line 211, “good predictions”?

Line 292, “Fig. 3a and 3b”?

Line 342, “strong correlations”?

Line 366, “mixing ratios again the original ones varied from 0.97 to 1.03”, kindly rephrase.

Line 378, “nearly” could be replaced with “approximately”.

Line 399, “regional transport on the continental scale”, kindly rephrase.

Line 410, “large discrepancy”?

Line 595, “In the cases with O3 mixing ratios  40 ppb”?

Line 617, “dominantly contributed to the population-weighted exposure to PM2.5 in northern Canada (59%) and western Canada (18%)”. Kindly rephrase.

Line 678, “By only considering”?

Line 693, “caused AQHI to a level of above 10”, perhaps “elevated AQHI to a level of above 10”.

Citation: https://doi.org/10.5194/egusphere-2023-2968-RC2

AC3: 'Response to Referee #2', Leiming Zhang, 26 Mar 2024

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-2968/egusphere-2023-2968-AC3-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2023-2968-AC3


