Spatio-temporal modeling of air pollutant concentrations in Germany using machine learning
Abstract. Machine learning (ML) models are becoming a meaningful tool for modeling air pollutant concentrations. ML models are capable of learning and modeling complex non-linear interactions between variables, and they require less computational effort than chemical transport models (CTMs). In this study, we used gradient boosted tree (GBT) and multi-layer perceptron (MLP; neural network) algorithms to model near-surface nitrogen dioxide (NO2) and ozone (O3) concentrations over Germany at 0.1 degree spatial resolution and daily intervals.
We trained the ML models using TROPOMI satellite column measurements combined with information on emission sources, air pollutant precursors and meteorology as feature variables. We found that the trained GBT models explained a major portion of the observed concentrations (R2 = 0.68–0.88, RMSE = 4.77–8.67 μg m-3 for NO2 and R2 = 0.74–0.92, RMSE = 8.53–13.2 μg m-3 for O3). The trained MLP models performed worse than the trained GBT models for both NO2 and O3 (R2 = 0.46–0.82 and R2 = 0.42–0.9, respectively).
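As a reading aid for the scores quoted above, the following minimal sketch shows how R2 and RMSE are commonly computed from paired observations and predictions. It assumes R2 is the coefficient of determination; the abstract does not say whether this or the squared correlation coefficient was used. The function names and sample values are hypothetical and are not taken from the study.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. ug m-3)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up station values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
print(round(r2_score(obs, pred), 3), round(rmse(obs, pred), 3))
```

A higher R2 together with a lower RMSE indicates a closer match to the observations, which is the sense in which the GBT and MLP scores above are compared.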
Our NO2 GBT model outperforms the CAMS model, a data-assimilated CTM, whereas our O3 GBT model slightly underperforms it. However, our NO2 and O3 ML models require less computational effort than CTMs. Therefore, we can analyze people’s exposure to near-surface NO2 and O3 with significantly less effort. During the study period (2018-04-30 to 2021-07-01), we found that around 36 % of the population lived in locations where the WHO NO2 limit was exceeded on more than 25 % of days, while 90 % of the population resided in areas where the WHO O3 limit was exceeded on more than 25 % of days. While metropolitan areas had high NO2 concentrations, rural areas, particularly in southern Germany, had high O3 concentrations.
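The exposure statistic above (the share of the population living where a limit is exceeded on more than 25 % of days) can be illustrated with a minimal sketch. It assumes daily modeled concentrations and a population count per grid cell. The function name, the limit value, and all numbers are hypothetical placeholders and do not represent the study's data or the actual WHO limits.

```python
def exposed_population_share(daily_conc, population, limit, min_day_fraction=0.25):
    """Fraction of the total population living in grid cells where `limit`
    is exceeded on more than `min_day_fraction` of the days.

    daily_conc: one list of daily concentrations per grid cell
    population: one population count per grid cell (same order)
    """
    exposed = 0.0
    for cell_series, cell_pop in zip(daily_conc, population):
        exceed_days = sum(1 for c in cell_series if c > limit)
        if exceed_days / len(cell_series) > min_day_fraction:
            exposed += cell_pop
    return exposed / sum(population)


# Three made-up grid cells with four days each; the limit is a placeholder.
conc = [
    [30.0, 28.0, 10.0, 12.0],  # exceeds on 2 of 4 days, so it counts
    [26.0, 10.0, 10.0, 10.0],  # exceeds on exactly 25 % of days, so it does not
    [5.0, 5.0, 5.0, 5.0],      # never exceeds
]
pop = [600, 300, 100]
print(exposed_population_share(conc, pop, limit=25.0))
```

Weighting by population rather than by area is what separates this statistic from a simple count of exceeding grid cells: a few densely populated urban cells can dominate the result.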
Furthermore, our ML models can be used to evaluate the effectiveness of mitigation policies. The GBT models reproduced the changes in near-surface NO2 and O3 concentrations over Germany during the 2020 COVID-19 lockdown period: compared to 2019 and after accounting for meteorology, near-surface NO2 decreased significantly (by 23±5.3 %) and near-surface O3 increased slightly (by 1±4.6 %) over ten major German metropolitan areas. Finally, our O3 GBT model is highly transferable to other countries, at least to neighboring countries and locations where no measurements are available (R2 = 0.87–0.94), whereas our NO2 GBT model is only moderately transferable (R2 = 0.32–0.64).