https://doi.org/10.5194/egusphere-2023-1531
10 Aug 2023

Diagnosing drivers of PM2.5 simulation biases from meteorology, chemical composition, and emission sources using an efficient machine learning method

Shuai Wang, Mengyuan Zhang, Yueqi Gao, Peng Wang, Qingyan Fu, and Hongliang Zhang

Abstract. Chemical transport models (CTMs) are widely used for air pollution modeling but suffer from significant biases due to uncertainties in simplified parameterizations, meteorological fields, and emission inventories. Accurate diagnosis of simulation biases is critical for improving models, interpreting results, and managing air quality efficiently, especially for simulations of fine particulate matter (PM2.5). In this study, an efficient method based on machine learning (ML) was designed to diagnose the drivers of Community Multiscale Air Quality (CMAQ) model biases in simulated PM2.5 concentrations from three perspectives: meteorology, chemical composition, and emission sources. The source-oriented CMAQ model was used to diagnose the influence of different emission sources on PM2.5 biases. The ML models showed good fitting ability, with a small performance gap between training and validation. The CMAQ model underestimated PM2.5 in 2019, with biases of -19.25 to -2.66 μg/m³, especially in winter and spring and during high-PM2.5 events. Among chemical components, secondary organic components contributed the most to PM2.5 simulation bias across regions and seasons (13.8–22.6 %). Relative humidity, cloud cover, and soil surface moisture were the main meteorological factors contributing to PM2.5 bias in the North China Plain, the Pearl River Delta, and the northwestern region, respectively. Both primary and secondary inorganic components from residential sources showed the largest contributions (12.05 % and 12.78 %), implying large uncertainties in this sector. ML-based methods provide a valuable complement to traditional mechanism-based methods for model improvement, with high efficiency and low reliance on prior information.
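The general idea of attributing a simulation bias to candidate drivers can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the ML algorithm, so a plain least-squares fit with permutation importance stands in for it here, and the data, coefficients, and the function name `permutation_contributions` are all hypothetical. The sketch fits a model of the bias (simulated minus observed PM2.5) on a few meteorological features, then reports each feature's share of the error increase caused by shuffling it.

```python
import numpy as np


def permutation_contributions(X, y, names, n_repeats=10, seed=0):
    """Return each feature's share (%) of the total permutation importance.

    Assumption: a linear least-squares model stands in for the ML model.
    """
    rng = np.random.default_rng(seed)
    A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    base_mse = np.mean((A @ coef - y) ** 2)

    increases = []
    for j in range(X.shape[1]):
        deltas = []
        for _ in range(n_repeats):
            shuffled = A.copy()
            # Shuffling one column breaks its link to the target.
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            deltas.append(np.mean((shuffled @ coef - y) ** 2) - base_mse)
        increases.append(max(float(np.mean(deltas)), 0.0))

    total = sum(increases)
    return {name: 100.0 * inc / total for name, inc in zip(names, increases)}


# Synthetic, hypothetical data: standardized meteorological features and a
# PM2.5 bias constructed so that relative humidity dominates.
rng = np.random.default_rng(42)
n_samples = 2000
names = ["relative_humidity", "cloud_cover", "soil_moisture"]
X = rng.normal(size=(n_samples, 3))
bias = (-5.0 - 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * X[:, 2]
        + rng.normal(scale=0.5, size=n_samples))

shares = permutation_contributions(X, bias, names)
ranked = sorted(shares, key=shares.get, reverse=True)
for name in ranked:
    print(f"{name}: {shares[name]:.1f} %")
```

Normalizing the importances to percentages mirrors how contributions are quoted in the abstract, but the numbers printed by this sketch come from synthetic data and carry no physical meaning.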


Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • CEC1: 'Comment on egusphere-2023-1531', Juan Antonio Añel, 05 Sep 2023
    • AC1: 'Reply on CEC1', Hongliang Zhang, 07 Oct 2023
  • RC1: 'Comment on egusphere-2023-1531', Anonymous Referee #1, 06 Sep 2023
    • AC2: 'Reply on RC1', Hongliang Zhang, 07 Oct 2023
  • RC2: 'Comment on egusphere-2023-1531', Anonymous Referee #2, 15 Sep 2023
    • AC3: 'Reply on RC2', Hongliang Zhang, 07 Oct 2023


Model code and software

Machine learning code and training datasets Shuai Wang https://zenodo.org/record/7907626


Viewed

Total article views: 539 (including HTML, PDF, and XML)
  • HTML: 370
  • PDF: 142
  • XML: 27
  • Total: 539
  • Supplement: 44
  • BibTeX: 17
  • EndNote: 21
Views and downloads calculated since 10 Aug 2023.

Viewed (geographical distribution)

Total article views: 522 (including HTML, PDF, and XML) Thereof 522 with geography defined and 0 with unknown origin.
Latest update: 27 Apr 2024
Short summary
Numerical models are widely used for air pollution modeling but suffer from significant biases. The machine learning model designed in this study is highly efficient at identifying such biases. Meteorology (relative humidity and cloud cover), chemical composition (secondary organic components and dust aerosol), and emission sources (residential activities) are diagnosed as the main drivers of bias in modeling PM2.5, a typical air pollutant. The results will help improve numerical models.