Operational chemical weather forecasting with the ECCC online Regional Air Quality Deterministic Prediction System version 023 (RAQDPS023) – Part 2: Multi-year prospective and retrospective performance evaluation
Abstract. The online version of the Regional Air Quality Deterministic Prediction System (RAQDPS) is a chemical weather forecast system that has been employed operationally by Environment and Climate Change Canada (ECCC) since 2009. It is run twice daily to produce 72-hour forecasts of hourly 10 km abundance fields of three key predictands, NO2, O3, and PM2.5 total mass, as well as other gas-phase chemical species, PM2.5 chemical components, and dry and wet deposition for Canada, the contiguous U.S., and northern Mexico. Version 023 of the RAQDPS (RAQDPS023) went into service at ECCC in December 2021 and was replaced by the RAQDPS025 in June 2024. A companion paper by Moran et al. (2025) describes the RAQDPS023 in detail. In this paper we present the results of a five-year performance evaluation of prospective and retrospective annual air quality (AQ) simulations made with the RAQDPS023. The annual simulations considered were the first year of RAQDPS023 forecasts in 2021/22 and four years of retrospective annual simulations for the 2013‒2016 period that used historical, year-specific emissions. Forecasts made by the RAQDPS-FW023, a duplicate operational system to the RAQDPS023 except for the addition of near-real-time (NRT) biomass burning (BB) emissions, were also evaluated for the 2021/22 period. An NRT measurement data set consisting of hourly NO2, O3, and PM2.5 surface measurements for Canada and the U.S. was used for the 2021/22 evaluation whereas a much more extensive set of air-chemistry and precipitation-chemistry measurements was used for the 2013‒2016 evaluations. Some evaluation results were also compared with results for the 2010‒2019 period for forecasts made by earlier operational versions of the RAQDPS and with evaluation results for several peer AQ forecast models.
In addition to looking at a number of highly aggregated “headline” scores, many stratified analyses were also performed, including evaluations by network, season, month, hour of day, region, and land-use type. Consideration of simulations for multiple years with the same model and year-specific input emissions helped to identify systematic model errors by reducing the influence of year-to-year variations in meteorology, and a comprehensive evaluation for many species for 2013‒2016 supported by stratified analyses provided diagnostic insights that allowed the scientific basis for the RAQDPS023 forecasts to be assessed (i.e., “right answers for the right reasons?”). Although one confounding factor for this study was the sizable reduction in the emissions of some pollutants in North America that occurred from 2013 to 2021, it was found that the trends in AQ observations over this period agreed with the year-specific description of emissions used for the five annual simulations from a rank-ordered perspective.
While RAQDPS023 evaluation scores for hourly NO2 and O3 volume mixing ratio forecasts were found to be competitive with peer models and often met suggested performance benchmarks for the five simulation years, another key finding was that the RAQDPS023 forecasts consistently underpredicted hourly PM2.5 total mass concentrations for all months in 2021/22 and for the majority of months in 2013‒2016. The largest underpredictions occurred in summer and at rural stations whereas overpredictions often occurred in the cold season at urban stations. The model also missed the observed bimodality in monthly PM2.5 concentrations and exaggerated the observed diurnal variations in hourly PM2.5 concentrations. Additional evaluations with daily PM2.5 chemical composition measurements and daily gravimetric PM2.5 total mass measurements were also performed to better understand the hourly PM2.5 underpredictions. Consistent overpredictions of elemental carbon and sea salt concentrations and underpredictions of sulfate concentration were identified, but scores for predictions of daily gravimetric PM2.5 total mass were better than those for hourly PM2.5 total mass, directing attention to differences in measurement methods. SO2 and HNO3 levels were also found to be overpredicted in general while NH3 levels were underpredicted: these three gas-phase species are all PM2.5 precursors, which raises concerns about some process representations such as those for sulfur oxidation and gas-phase dry deposition. As well, springtime O3 levels were underpredicted while isoprene levels were consistently overpredicted. The impact of BB emissions on predictions of NO2, O3, and PM2.5 was also characterized in detail by comparing evaluation results for the 2021/22 RAQDPS023 and RAQDPS-FW023 forecasts.
Negligible impact was found for monthly NO2 forecasts when BB emissions were included, but monthly O3 forecast scores were modestly improved and monthly PM2.5 forecast scores were markedly improved from July to September 2021, as were the summer and annual scores. Taken together, the results of this comprehensive multi-year evaluation point to a number of RAQDPS023 system components where improvements are desirable. These results also provide a strong benchmark against which to compare the performance of future versions of the RAQDPS.
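
As an illustration of what a stratified evaluation of paired model and observation values can look like, the following minimal Python sketch computes the mean bias (MB) and normalized mean bias (NMB) per season. It is not the evaluation code used for the RAQDPS023: the abstract does not state which scores were computed, so the choice of MB and NMB, the record layout, the season mapping, and the sample values are all assumptions made here for illustration.

```python
from collections import defaultdict

# Assumed month-to-season mapping (meteorological seasons); hypothetical.
SEASON = {12: "DJF", 1: "DJF", 2: "DJF",
          3: "MAM", 4: "MAM", 5: "MAM",
          6: "JJA", 7: "JJA", 8: "JJA",
          9: "SON", 10: "SON", 11: "SON"}


def stratified_bias(pairs, key):
    """Group model/observation pairs by key(pair) and return
    {stratum: (n, mean_bias, normalized_mean_bias_percent)}."""
    sums = defaultdict(lambda: [0, 0.0, 0.0])  # n, sum(mod - obs), sum(obs)
    for p in pairs:
        acc = sums[key(p)]
        acc[0] += 1
        acc[1] += p["mod"] - p["obs"]
        acc[2] += p["obs"]
    return {
        stratum: (n, diff / n, 100.0 * diff / obs if obs else float("nan"))
        for stratum, (n, diff, obs) in sums.items()
    }


# Made-up paired values (e.g. PM2.5 in ug/m3); not real measurements.
records = [
    {"month": 1, "obs": 10.0, "mod": 12.0},
    {"month": 2, "obs": 10.0, "mod": 11.0},
    {"month": 7, "obs": 20.0, "mod": 15.0},
    {"month": 8, "obs": 20.0, "mod": 13.0},
]

by_season = stratified_bias(records, key=lambda p: SEASON[p["month"]])
for season, (n, mb, nmb) in sorted(by_season.items()):
    print(f"{season}: n={n} MB={mb:+.2f} NMB={nmb:+.1f}%")
```

Passing a different `key` function (for example one that returns a station's network, region, or land-use type) would stratify the same pairs along another dimension without changing the scoring code.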