Transferable Hourly Ozone Forecasting with Transformers
Abstract. We investigate the suitability of a transformer-based approach for air-quality forecasting, focusing on 4-day-ahead hourly predictions of surface ozone (O3). The study employs Google’s Temporal Fusion Transformer (TFT) to integrate meteorological predictors, historical pollutant observations, and static station metadata, using an open-source implementation with minimal domain-specific preprocessing. The analysis addresses two questions: (1) how efficiently a transformer model can be deployed for regional air-quality forecasting, and (2) how well the learned representations transfer across geophysically distinct regions.
Model performance is evaluated against the state-of-the-art regional chemical transport model ensemble forecast of the Copernicus Atmosphere Monitoring Service (CAMS), using observations from Germany. The TFT consistently achieves lower bias and higher forecast skill across all lead times. Suburban monitoring sites exhibit the highest skill relative to CAMS on both RMSE- and SMAPE-based metrics. Urban stations show moderate skill against the CAMS baseline, while rural stations show lower skill that nevertheless remains positive across the full 96 h forecast, with the strongest improvements observed at shorter lead times. Beyond day 1, performance increasingly stratifies by station type, with larger relative gains at urban and suburban sites and smaller but consistently positive skill at rural locations.
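To make the notion of skill relative to a baseline concrete, the following is a minimal sketch of RMSE- and SMAPE-based skill scores. It assumes the common definition skill = 1 - (model error / reference error), so that positive values mean the model outperforms the reference; the exact metric definitions used in the study are not given here and may differ. The function names and the ozone values are hypothetical and purely illustrative.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between paired forecast and observed values."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def smape(forecast, observed):
    """Symmetric mean absolute percentage error in percent.

    Uses the 2|f - o| / (|f| + |o|) variant (an assumption; other variants exist).
    Pairs where both values are zero contribute zero error.
    """
    terms = []
    for f, o in zip(forecast, observed):
        denom = abs(f) + abs(o)
        terms.append(0.0 if denom == 0 else 2.0 * abs(f - o) / denom)
    return 100.0 * sum(terms) / len(terms)


def skill_score(model_error, reference_error):
    """Assumed skill definition: 1 - model/reference (positive = model is better)."""
    return 1.0 - model_error / reference_error


# Made-up hourly O3 values (ug/m3); not data from the study.
observed = [40.0, 60.0, 80.0, 100.0]
tft_like = [42.0, 58.0, 83.0, 97.0]    # hypothetical model forecast
cams_like = [45.0, 54.0, 86.0, 92.0]   # hypothetical reference forecast

rmse_skill = skill_score(rmse(tft_like, observed), rmse(cams_like, observed))
smape_skill = skill_score(smape(tft_like, observed), smape(cams_like, observed))
```

Under this definition a skill of 0 means parity with the reference and 1 means a perfect forecast, which is why skill that is reduced but still positive continues to indicate an improvement over the baseline.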
Geographic transferability is assessed by adapting a model trained over Germany to South Korea by retraining region-specific metadata embeddings while preserving learned temporal representations. Forecast errors increase by only 5–10 %, indicating that the model captures meteorological drivers of O3 variability that generalize across contrasting anthropogenic and climatic regimes. Ablation experiments further demonstrate the robustness of the chosen experimental configuration for both forecasting performance and cross-region transferability.
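The adaptation strategy described above can be illustrated with a small PyTorch sketch. It is not the TFT and not the study's code: `ToyForecaster`, `prepare_for_transfer`, the layer sizes, and the station counts are all hypothetical. It assumes that "preserving learned temporal representations" is implemented by freezing the pretrained weights and that only a freshly initialized station embedding for the target region is trained.

```python
import torch
from torch import nn


class ToyForecaster(nn.Module):
    """Hypothetical stand-in for a TFT-like model (not the actual TFT)."""

    def __init__(self, n_stations, n_features, hidden=16, horizon=96):
        super().__init__()
        self.station_embedding = nn.Embedding(n_stations, hidden)  # static metadata
        self.temporal_encoder = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, x, station_id):
        _, (h, _) = self.temporal_encoder(x)          # h: (1, batch, hidden)
        context = h[-1] + self.station_embedding(station_id)
        return self.head(context)                     # (batch, horizon)


def prepare_for_transfer(model, n_target_stations):
    """Freeze all pretrained weights, then attach a new trainable embedding."""
    for param in model.parameters():
        param.requires_grad = False
    hidden = model.station_embedding.embedding_dim
    # A freshly constructed module has requires_grad=True by default.
    model.station_embedding = nn.Embedding(n_target_stations, hidden)
    return [p for p in model.parameters() if p.requires_grad]


# Station counts are arbitrary placeholders, not figures from the study.
source_model = ToyForecaster(n_stations=300, n_features=8)
trainable = prepare_for_transfer(source_model, n_target_stations=120)
optimizer = torch.optim.Adam(trainable, lr=1e-3)  # updates the embedding only
```

Passing only the trainable parameters to the optimizer keeps the temporal encoder and output head fixed, so any change in forecast error after adaptation can be attributed to the region-specific embedding.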