the Creative Commons Attribution 4.0 License.
Operational chemical weather forecasting with the ECCC online Regional Air Quality Deterministic Prediction System version 023 (RAQDPS023) – Part 2: Multi-year prospective and retrospective performance evaluation
Abstract. The online version of the Regional Air Quality Deterministic Prediction System (RAQDPS) is a chemical weather forecast system that has been employed operationally by Environment and Climate Change Canada (ECCC) since 2009. It is run twice daily to produce 72 hour forecasts of hourly 10 km abundance fields of three key predictands, NO2, O3, and PM2.5 total mass, as well as other gas-phase chemical species, PM2.5 chemical components, and dry and wet deposition for Canada, the contiguous U.S., and northern Mexico. Version 023 of the RAQDPS (RAQDPS023) went into service at ECCC in December 2021 and was replaced by the RAQDPS025 in June 2024. A companion paper by Moran et al. (2025) describes the RAQDPS023 in detail. In this paper we present the results of a five-year performance evaluation of prospective and retrospective annual air quality (AQ) simulations made with the RAQDPS023. The annual simulations considered were the first year of RAQDPS023 forecasts in 2021/22 and four years of retrospective annual simulations for the 2013‒2016 period that used historical, year-specific emissions. Forecasts made by the RAQDPS-FW023, a duplicate operational system to the RAQDPS023 except for the addition of near-real-time (NRT) biomass burning (BB) emissions, were also evaluated for the 2021/22 period. A NRT measurement data set consisting of hourly NO2, O3, and PM2.5 surface measurements for Canada and the U.S. was used for the 2021/22 evaluation whereas a much more extensive set of air-chemistry and precipitation-chemistry measurements was used for the 2013‒2016 evaluations. Some evaluation results were also compared with results for the 2010‒2019 period for forecasts made by earlier operational versions of the RAQDPS and with evaluation results for several peer AQ forecast models. 
In addition to looking at a number of highly aggregated “headline” scores, many stratified analyses were also performed, including evaluations by network, season, month, hour of day, region, and land-use type. Consideration of simulations for multiple years with the same model and year-specific input emissions helped to identify systematic model errors by reducing the influence of year-to-year variations in meteorology, and a comprehensive evaluation for many species for 2013‒2016 supported by stratified analyses provided diagnostic insights that allowed the scientific basis for the RAQDPS023 forecasts to be assessed (i.e., “right answers for the right reasons?”). Although one confounding factor for this study was the sizable reduction in the emissions of some pollutants in North America that occurred from 2013 to 2021, it was found that the trends in AQ observations over this period agreed with the year-specific description of emissions used for the five annual simulations from a rank-ordered perspective.
While RAQDPS023 evaluation scores for hourly NO2 and O3 volume mixing ratio forecasts were found to be competitive with peer models and often met suggested performance benchmarks for the five simulation years, another key finding was that the RAQDPS023 forecasts consistently underpredicted hourly PM2.5 total mass concentrations for all months in 2021/22 and for the majority of months in 2013‒2016. The largest underpredictions occurred in summer and at rural stations whereas overpredictions often occurred in the cold season at urban stations. The model also missed the observed bimodality in monthly PM2.5 concentrations and exaggerated the observed diurnal variations in hourly PM2.5 concentrations. Additional evaluations with daily PM2.5 chemical composition measurements and daily gravimetric PM2.5 total mass measurements were also examined to better understand the hourly PM2.5 underpredictions. Consistent overpredictions of elemental carbon and sea salt concentrations and underpredictions of sulfate concentration were identified, but scores for predictions of daily gravimetric PM2.5 total mass were better than those for hourly PM2.5 total mass, directing attention to differences in measurement methods. SO2 and HNO3 levels were also found to be overpredicted in general while NH3 levels were underpredicted: these three gas-phase species are all PM2.5 precursors, which raises concerns about some process representations such as those for sulfur oxidation and gas-phase dry deposition. As well, springtime O3 levels were underpredicted while isoprene levels were consistently overpredicted. The impact of BB emissions on predictions of NO2, O3, and PM2.5 was also characterized in detail by comparing evaluation results for the 2021/22 RAQDPS023 and RAQDPS-FW023 forecasts. 
Negligible impact was found for monthly NO2 forecasts when BB emissions were included, but monthly O3 forecast scores were modestly improved and monthly PM2.5 forecast scores were markedly improved from July to September 2021, as were the summer and annual scores. Taken together, the results of this comprehensive multi-year evaluation point to a number of RAQDPS023 system components where improvements are desirable. These results also provide a strong benchmark against which to compare the performance of future versions of the RAQDPS.
Status: closed
- RC1: 'Comment on egusphere-2025-4324', Anonymous Referee #1, 21 Jan 2026
General comments:
This manuscript presents the evaluation of the operational chemical weather forecasting system of ECCC, which is described in the Part 1 paper. These consecutive presentations will be useful for understanding the current modeling performance. The authors conducted a comprehensive analysis of gases, aerosols, and deposition, and discussed the results well. As a critical concern, I am confused by the wording “RAQDPS023” throughout the manuscript. It would be better to clearly define “RAQDPS-FW023” and “RAQDPS-OP023” in the Abstract (lines 22 and 24) and in the Introduction, and then to use them consistently throughout the manuscript. Perhaps the wording “RAQDPS023” in line 22 should be “RAQDPS-FW023”. Although the wording “RAQDPS-OP023” was not introduced, it appears in Table 7, and I feel that such clear expressions are needed to follow the context of this manuscript. In addition to this clarification, I have several specific and technical comments as follows.
Specific comments:
- Lines 1271-1287 and Table 7: From this part, I can see that there was a comparison between operational and forecast results for 2021/22; however, I cannot find such comparisons in the following section 4.4 (Fig. 19 and other figures in the supplemental file). Excuse me if I missed this comparison, but could you clarify this point?
- Lines 1404-1457: It would be much better to explicitly state the reason for the worsened R values in the forecast case.
Technical points:
- Please check the subscripts for the chemical species throughout the manuscript.
- I am wondering whether the supplemental file should be hosted within this journal or in external storage (Zenodo repository). Could you confirm with the editorial office?
- Line 1828: Typo “PM3.5”.
Citation: https://doi.org/10.5194/egusphere-2025-4324-RC1
- AC1: 'Reply on RC1', Michael Moran, 02 Apr 2026
- RC2: 'Comment on egusphere-2025-4324', Anonymous Referee #2, 02 Mar 2026
The RAQDPS is one of the main North American chemical weather / air quality forecasting systems. Detailed documentation of both the modelling aspects and the scores from the comparison with observations at the surface is valuable for the large group of users of the forecasts, and is provided by Parts 1 and 2 of this article. Part 2 by Moran et al. provides a very comprehensive evaluation of the skill of the modelling system in predicting the main air pollutants relevant for health: ozone, NO2, and PM2.5. Apart from this, the authors compare to other available trace gas observations as well as to measurements of the chemical composition of the aerosols in air and in precipitation. This extra set of comparisons gives important additional information on the skill of the model concerning the emissions and chemical components that contribute to the formation of the main pollutants. The scores are compared to benchmarks for chemistry modelling and to other AQ prediction efforts, which puts the scores in perspective. In particular, section 4.4 is a highlight of the paper because it identifies processes that may be responsible for differences with the observations, and discusses potential improvements in the model. The authors are honest in documenting the comparisons, showing skill for ozone and NO2, but more mixed results for PM2.5 (Fig. 8, 11, 15).
The paper is a very complete reference to the skill of the RAQDPS model to simulate surface concentrations. It is well written, and has a good list of important references. At the same time it is also very long, and even interested / motivated readers may be hesitant to go through all the results. Nevertheless, the paper, together with part 1, serves as a complete reference for RAQDPS. As such I am in favour of publication after the authors have answered my more minor comments and questions.
Comments and questions:
l 63: "expanded rapidly over the last two decades". I found that the discussion of air quality forecasting systems worldwide, as provided in the introduction, gave a valuable historical overview of the relevant literature, but the references are not fully up to date and could be expanded to other countries/continents (e.g. China, Japan, Europe, global-scale forecasting). For instance, in Europe the regional air quality forecasting is organised by the Copernicus Atmosphere Monitoring Service (CAMS) and is described in a recent paper by Colette et al., https://doi.org/10.5194/gmd-18-6835-2025, which I propose to add to the references.
l 145: "dynamic; and probabilistic" evaluations are not discussed in this paragraph.
l 305: "Note also that neither system version considered some other types of natural emissions, .." This may be formulated more clearly.
l 356: "Two automated data filters .." Stations close to major roads or other big emitters will not be representative of a model grid cell of 10 km diameter. Is a site classification available for the stations in the US and Canada? Did the authors consider removing stations that will not give representative comparison results? (Urban vs rural statistics are discussed in line 475)
Sec 2.3: As far as I understand, AirNow is employing NO2 analysers which use molybdenum converters. The resulting NO2 measured values are known to overestimate NO2, because the technique is also sensitive to other NOy components which are also converted in the instrument. Please discuss this aspect and how it influences the NO2 evaluation.
Fig. 1. Looking at the figure it seems that the model values at the stations do not match the continental modelled distribution, especially for NO2 and PM, where model-at-station values seem to be higher than the background? This figure was confusing to me.
l 637: "NME scores" Should this be NMAE? Please use consistent acronyms.
l 656: "U.S. NOx emissions used for these forecasts may have been too low ". Are there other indications, e.g. from literature, that this is the case?
l 776: "The fact that annual NMB scores for gravimetric PM2.5 mass measurements are more positive than those for continuous PM2.5 measurements is thus surprising." After reading the paper these differences and comparisons remain unclear to me. Maybe a few sentences could be added to the concluding section to highlight this aspect once more, and how it impacts the interpretation of the evaluation.
l 838: "but these scores were confounded by the model’s inclusion of isoprene oxidation products in this lumped VOC species (Moran et al., 2025), suggesting that overpredictions should be expected." Is there an estimate of the ratio of pure isoprene to lumped "isoprene", e.g. is a factor of more than 3 for the model/obs ratio to be expected?
l 1226, Fig 19: This figure is a key result to document modelling improvements over time, showing improvements in O3 and NO2 scores, but not for PM2.5.
l 1234: "The 2010‒2019 seasonal R scores for PM2.5, on the other hand, are lower than those for NO2 and O3, are less cyclical seasonally, and do not show any improvement with time."
l 1256, wildfire: Important to improve the scores, especially PM2.5 in summer. Is it being considered to use RAQDPS-FW023 to replace RAQDPS023 as the operational system?
l 1442: Could the boundary conditions for ozone be partly responsible for ozone biases and east-west gradients? Please discuss.
l 1493: Wet deposition modelling is mentioned as a cause for low NH3. What about dry deposition, which is a dominant removal mechanism? Please discuss.
In section 4.4 there could be more attention to transport aspects, vertical mixing, and stability. The evaluation focusses on the surface but vertical profiles may have issues as well. A few remarks on this would be welcome.
Sec 4.4: Maybe good to add a remark on the injection height of emissions in this section. First layer vs. first two layers was discussed earlier.
Will RAQDPS-FW023 with biomass burning emissions replace the RAQDPS forecasting system in the future? Please discuss pros and cons.
Citation: https://doi.org/10.5194/egusphere-2025-4324-RC2
- AC2: 'Reply on RC2', Michael Moran, 02 Apr 2026
Data sets
Operational GEM-MACH model evaluation against air quality surface observation networks across Canada and the United States for 2013-16 and 2021/22 Alexandru Lupu and Michael D. Moran https://doi.org/10.5281/zenodo.16944371
RAQDPS023 Predicted 2013-2016 and 2021/22 Seasonal and Annual Dry, Wet, and Total Acidic Deposition Fields and Related Concentration Fields for North America Michael Moran and Verica Savic-Jovcic https://doi.org/10.5281/zenodo.16970403
Model code and software
Global Environmental Multiscale model‒Modelling Atmospheric CHemistry (GEM-MACH) version 3.1.0.0 Verica Savic-Jovcic et al. https://doi.org/10.5281/zenodo.15330612
Canadian Fire Emissions Prediction System (CFFEPS) v4.1 Kerry Anderson and Jack Chen https://doi.org/10.5281/zenodo.15305591
Version 5.1 package for the Global Environmental Multiscale (GEM) model (ECCC-ASTD-MRD/gem: 5.1.0) ECCC GEM Development Team, Environment and Climate Change Canada https://zenodo.org/records/17782580
Viewed
- HTML: 417
- PDF: 291
- XML: 38
- Total: 746
- BibTeX: 62
- EndNote: 68