A Physics-Constrained Deep-Learning Framework based on Long-Term Remote-Sensing Data for Retrieving Vertical Distribution of PM2.5 Chemical Components
Abstract. The vertical distribution of PM2.5 chemical components is crucial for identifying the causes of atmospheric pollution and its impact on climate change and extreme weather. By integrating long-term lidar measurements, deep-learning algorithms and a physics-constrained optimization method, this paper presents a novel lidar-based retrieval framework to obtain vertical mass concentration profiles of PM2.5 chemical components for the first time. Identifiable components include sulfate (SO42-), nitrate (NO3-), ammonium (NH4+), organic matter (OM) and black carbon (BC), which extend beyond the component types that traditional remote-sensing retrievals can identify. A 1-year retrieved surface mass concentrations of these components closely aligned with the observations, with Pearson correlation coefficient values ranging from 0.91 to 0.98. The retrieval framework applied in varying non-training spatiotemporal scenarios also showed robust generalization capabilities. Tower and aircraft-based field campaigns indicate that the retrieved and observed vertical profiles of these components exhibited consistent patterns in mass concentrations and proportions. Subsequently, an explainable method was incorporated into the retrieval framework to quantify the multivariate driving effects on vertical profile retrieval. Results showed that the extinction coefficient and representative indicators within physiochemical processes contributed significantly to mass concentrations of these components. Finally, a dataset of vertical mass concentration profiles of these components over six years in a Chinese megacity was generated by the retrieval framework, revealing the dominant roles of OM and NO3- in PM2.5 throughout the entire boundary layer across all seasons. Through implementing clean air policies, the reduction rates of these components in the megacity exhibited the highest reduction rate of 0.17–0.82 µg m-3 a-1 occurring at an altitude of ~300 m. 
Our retrieval framework offers a novel approach for acquiring vertical profiles of PM2.5 chemical components, thereby providing a new perspective on elucidating the vertical evolution of atmospheric pollutants.
"A Physics-Constrained Deep-Learning Framework based on Long-Term Remote-Sensing Data for Retrieving Vertical Distribution of PM2.5 Chemical Components” by Li et al., proposes a novel method to retrieve concentrations of major aerosol components (sulfate, nitrate, ammonium, organic matter, and black carbon) from ground-based lidar extinction data combined with ERA5 meteorological reanalysis. The authors validate their retrieved concentrations against surface, airborne, and tower observations in the greater Beijing region, reporting generally high correlations. Given the importance of aerosols in Earth's radiative balance and air quality, developing methods that leverage lidar's high vertical resolution to determine constituent species concentrations is a valuable endeavor. However, the manuscript in its current form requires significant revisions to adequately describe the methodology and contextualize the results.
Major Points:
1. A fundamental issue is how PM2.5 concentrations are distinguished from larger particles using lidar extinction data. The lidar dataset presumably provides total aerosol extinction from particles of all sizes, yet the work presented here centers only on PM2.5. The authors do not explain how, or whether, contributions from larger particles (PM10, coarse mode, etc.) are excluded from the extinction signal. This represents a potentially significant source of error, particularly during dust events when coarse particles may dominate extinction.
2. The data processing section, especially the lidar data processing, lacks essential technical details. Key missing information includes lidar instrument specifications, data quality control procedures (e.g., cloud screening), lidar extinction retrieval algorithms/methods, and the methods for reconciling the different vertical resolutions between the lidar data (6 m) and the meteorological inputs.
3. The validation dataset is insufficient to support the broad conclusions presented. Aircraft validation comprises only four flights (limited to three different calendar months), while tower measurements span just 11 days across two time periods. This is of particular importance because the retrieved aerosol concentrations appear to show similar vertical distributions across different seasons. The limited validation prevents assessment of whether this method captures realistic atmospheric processes or simply learns scaling relationships under specific (mostly wintertime) meteorological conditions.
4. The manuscript would benefit from extensive editing to improve its readability. Many sentences are overly dense, and the model development section would be difficult for most readers to follow. The excessive number of figures (~75 figures/subplots) dilutes the presentation of key results.
5. The manuscript lacks an adequate discussion of the limitations of this retrieval technique. This would be essential for readers considering applying this method in different regions or with slightly different instruments.
6. The SHAP feature importance analysis raises some questions and methodological concerns. For example, why are specific humidity and relative humidity treated as independent? These are clearly related. The manuscript also lacks a discussion of how "geopotential" is defined and why it is so important. In addition, the combined SHAP value of extinction, relative humidity, and v-wind being under 50% strikes me as relatively low, considering that these variables are noted to determine the vertical structure and the chemical and physical processes (L410-411).
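To illustrate the dependence noted in point 6: once temperature and pressure are known (both are standard reanalysis fields), relative humidity follows from specific humidity. The sketch below is only an illustration under stated assumptions, not the authors' code. It assumes a Magnus-type approximation for saturation vapor pressure over liquid water (Bolton-style coefficients), and the function names are hypothetical.

```python
import math

EPSILON = 0.622  # ratio of molar masses, water vapor to dry air


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Approximate saturation vapor pressure over liquid water, in hPa.

    Assumption: Magnus-type fit with Bolton-style coefficients.
    """
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def relative_humidity_percent(q: float, temp_c: float, pressure_hpa: float) -> float:
    """Relative humidity (%) from specific humidity q (kg/kg), T (deg C), p (hPa)."""
    # Invert q = EPSILON * e / (p - (1 - EPSILON) * e) for the vapor pressure e.
    vapor_pressure = q * pressure_hpa / (EPSILON + (1.0 - EPSILON) * q)
    return 100.0 * vapor_pressure / saturation_vapor_pressure_hpa(temp_c)


if __name__ == "__main__":
    # Illustrative values only, not taken from the manuscript.
    for q in (0.002, 0.005, 0.008):
        rh = relative_humidity_percent(q, temp_c=10.0, pressure_hpa=1000.0)
        print(f"q = {q:.3f} kg/kg -> RH = {rh:.1f} %")
```

Because relative humidity is a deterministic function of specific humidity at fixed temperature and pressure, an attribution method such as SHAP may split credit between the two inputs, which is why treating them as independent drivers deserves justification.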
Minor Comments:
References:
Matus, A. V., Nowottnick, E. P., Yorks, J. E., & da Silva, A. M. (2025). Enhancing surface PM2.5 air quality estimates in GEOS using CATS lidar data. Earth and Space Science, 12, e2024EA004078. https://doi.org/10.1029/2024EA004078
Toth, T. D., Zhang, J., Vaughan, M. A., Reid, J. S., & Campbell, J. R. (2022). Retrieving particulate matter concentrations over the contiguous United States using CALIOP observations. Atmospheric Environment, 274, 118979. https://doi.org/10.1016/j.atmosenv.2022.118979