the Creative Commons Attribution 4.0 License.
Quantifying the driving factors of particulate matter variabilities in the Beijing-Tianjin-Hebei and Yangtze River Delta regions from 2015 to 2020 by machine learning approach
Abstract. Particulate matter (PM) pollution is a critical air quality challenge in China. This study quantifies meteorological versus anthropogenic contributions to PM variations in Beijing-Tianjin-Hebei (BTH) and Yangtze River Delta (YRD) (2015–2020) using ground observations, meteorological assimilated data, emission inventories, and a LightGBM model. Observations show significant PM2.5 and PM10 declines (e.g., BTH PM2.5: −0.07 ± 0.03 μg m⁻³ yr⁻¹; PM10: −0.11 ± 0.04 μg m⁻³ yr⁻¹). Model decomposition identifies anthropogenic emission reductions as the primary driver (PM2.5 decrease: 7.19–24.76 μg m⁻³; PM10 decrease: 0.40–27.12 μg m⁻³). Key meteorological drivers differ: 2-m specific humidity (QV2M), sea-level pressure (SLP), 2-m temperature (T2M), and 10-m meridional wind (V10M) collectively explain 15 % of PM2.5 variance; precipitation flux (PRECTOT) is critical for PM10. PM2.5 concentrations are primarily governed by PM10, CO, NO2, and SO2 (cumulative contribution 37.60 %), while PM10 variations center on PM2.5, interacting with NO2, CO, and SO2 (explaining 34 % variance). PM2.5 shows stronger correlation with CO than PM10 (regional difference +0.07–+0.08), linked to combustion/SOA. SO₂/NO₂ exhibit comparable PM correlations but divergent mechanisms: NO₂ with traffic/nitrate, SO₂ with stationary sources/sulfate, both via "co-emission-chemical transformation-meteorological synergy". Our research supports optimizing region-specific control strategies.
Status: open (until 03 Sep 2025)
RC1: 'Comment on egusphere-2025-2786', Anonymous Referee #1, 31 Jul 2025
This manuscript reports and interprets the reductions in ground-level PM2.5 and PM10 as observed by the air quality surveillance network in China during 2015-2020, using a machine learning approach to attribute these changes to drivers of emissions and meteorology. A key finding is that anthropogenic emissions are dominant in the observed changes. While I find the scope fits ACP well, I cannot recommend acceptance of this paper in its present form. The main concern is the severe lack of novelty in all aspects (data, method, and insights from the analysis) relative to a wealth of existing literature.
Main comments:
1) Method: the inclusion of concentrations of PM2.5 (PM10), SO2, NO2, O3 and CO in the machine learning (ML) model of PM10 (PM2.5) is very confusing (and inadequate in my opinion). The ultimate aim of this approach is to separate contributions from emissions and meteorology to the changes in PM2.5 and PM10. Meanwhile, these pollutant concentrations themselves are jointly determined by both factors. In Line 193-205, the authors fix emissions in 2015 in the trained ML model to separate the two contributions, so the variations and trends driven by these pollutant concentrations (and these variations are in the top-7 ranks according to their importance scores) are attributed to "meteorology", which is essentially incorrect.
2) There are many existing papers that used statistical and machine learning models to attribute the changes of air pollution in China into emission and meteorological contributions. I list several examples below.
a. https://acp.copernicus.org/articles/19/11031/2019/
b. https://www.sciencedirect.com/science/article/pii/S0160412023006347
c. https://acp.copernicus.org/articles/19/11303/2019/
d. https://pubs.acs.org/doi/full/10.1021/acs.est.2c06800
e. https://acp.copernicus.org/articles/21/9475/2021/
and many more. The method of this paper exhibits no significant improvement/novelty relative to the above papers. The data locations and time period are also well covered by these papers. Results and insights from this manuscript, without a process-based model or framework, are overall shallow based on the ML model alone. There are few novel insights or analyses compared to the above papers.
3) The section of "4. Discussions" introduces new analysis of the correlations of PM2.5/PM10 vs. the other observed concentrations of CO, NO2 and SO2. This piece emerges randomly and doesn't fit well within the story of machine-learning interpretation of PM trends. It is also unusual to introduce new results in the "Discussion" section.
Overall, the manuscript reads to me as a shallow analysis of air quality trends in China, a topic well covered in existing work. This work does not offer a substantial contribution beyond the existing literature.
Other comments:
1) Line 21: The PM2.5 and PM10 trends appear very small to me. Check if correct.
2) Line 31: Throughout the paper, there is little explanation of the so-called "co-emission-chemical transformation-meteorological synergy". Also, if this topic is not a core finding from the work, it might not be suitable in the abstract.
3) Line 51-57: I suggest moving these descriptions to follow the first introduction of PM2.5 and PM10 (Line 39).
4) Line 60-61: VOC is also a very important category of PM precursors.
5) Line 66-67: Secondary aerosols can be formed in both the boundary layer and free troposphere. I do not know the purpose of emphasizing "free atmosphere" here.
6) Line 75: The paper (Zhang et al. 2016) is not a "conventional linear modeling approach". Please cite adequate papers.
7) Line 103-104: Besides the table, please provide a map of these cities. Without a map it is very hard to locate them.
8) Line 114-115: The "reference state" of air pollutant measurements was at 273 K before September 2018, and at 298 K afterwards. Is this factor considered?
9) Line 118-119: Why was GEOS-FP chosen when more stable met fields (e.g., MERRA2) are available?
10) Line 135: "Paraffinic reactive primary emissions" is not a conventional term. Could you please change it to "VOC emissions" and list the VOC species you used?
11) Equations 1-5 and associated text: are these very conventionally accepted concepts really worth such detailed discussion in the main text?
12) Section 3.1: Again, these trends (<0.1 ug/m3/yr for most cases) appear too small to me according to my understanding of air quality changes in China.
13) Line 239: The scatter plots in Figure 2 have too many overlapping points, and should be converted to colored 2-D histogram density plots.
14) Figure 4: Why are the meteorology-driven changes overall opposite for PM2.5 (positive) and PM10 (negative)? What is the key parameter causing this?
15) Figures 6 and 7: Instead of showing the trends of these parameters, it might be more straightforward to support the analysis by showing the contributions of each parameter to PM2.5 and PM10 trends?
16) Line 343: Clarify if the "correlations" are calculated based on hourly or daily data.
17) Section 4: Based on these correlations alone, no conclusive argument can be made, as also indicated by the many conjectural statements in this section. I find it hard to understand the purpose of this section and this analysis.
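The trend-magnitude concern in comments 1 and 12 can be illustrated with a back-of-envelope check. The sketch below uses only two numbers quoted in the abstract (the BTH PM2.5 trend and the emission-driven PM2.5 decrease range) and assumes, purely for illustration, a linear trend over the five year-to-year steps between 2015 and 2020. The two quantities may not be defined on the same basis in the manuscript, so this is a plausibility check rather than a recalculation.

```python
# Back-of-envelope plausibility check of the trend magnitude quoted in the abstract.
# Assumption (not from the manuscript): the trend is linear over 2015-2020.

reported_slope = -0.07        # BTH PM2.5 trend, ug m^-3 yr^-1 (quoted in the abstract)
n_intervals = 2020 - 2015     # five year-to-year steps

# Total change implied by the reported slope if it held for the whole period
implied_change = reported_slope * n_intervals

# Emission-driven PM2.5 decrease range quoted in the abstract, ug m^-3
attributed_min, attributed_max = 7.19, 24.76

# How many times larger the attributed decrease is than the implied change
ratio_low = attributed_min / abs(implied_change)
ratio_high = attributed_max / abs(implied_change)

print(f"implied 2015-2020 change: {implied_change:.2f} ug m^-3")
print(f"attributed decrease is {ratio_low:.0f}x to {ratio_high:.0f}x larger")
```

Under these assumptions the reported slope implies a total change well below 1 μg m⁻³, while the quoted emission-driven decrease is tens of times larger, which is the kind of mismatch that would be consistent with a unit or decimal error.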
Citation: https://doi.org/10.5194/egusphere-2025-2786-RC1
RC2: 'Comment on egusphere-2025-2786', Anonymous Referee #2, 14 Aug 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-2786/egusphere-2025-2786-RC2-supplement.pdf
RC3: 'Comment on egusphere-2025-2786', Anonymous Referee #3, 15 Aug 2025
This manuscript builds a LightGBM framework that combines ground-monitor observations, reanalysis meteorology, and an emissions inventory (CEDS) to attribute 2015–2020 PM2.5/PM10 trends in the BTH and YRD regions to meteorology versus anthropogenic emissions. The topic is good. However, the current attribution design suffers from endogeneity, potential train–test leakage in validation, and limited uncertainty quantification; in addition, key claims rely on variable-importance metrics and trend magnitudes that need correction/clarification. I recommend major revision.
Main comments
1. Attribution mix-up (endogeneity). The authors keep co-pollutants (CO, NO₂, SO₂, PM) as predictors while only “freezing” emissions. Those pollutant levels already reflect emissions, so they leak emission information into the “meteorology-only” case and bias the split.
2. Cross-validation leakage. The authors state that ‘a 5-fold cross-validation framework was implemented: the full training dataset was randomly partitioned into five mutually exclusive subsets’. Random k-fold lets nearby days and the same cities appear in both train and test, inflating scores, so it is not sufficient to detect or avoid leakage in this study. I suggest using blocked CV: leave-one-year/season out; leave-one-city out; ideally both. Report R, RMSE/MAE, and bias for each scheme.
3. “Importance doesn’t mean variance explained.” The variable-importance analysis is not explained clearly in this study. Tree importance (gain/splits) isn’t “% of variation explained.” The authors should use SHAP or permutation importance and show partial-dependence (or ALE) plots. Reword claims to avoid “explains X% of variation.”
4. Trend numbers/units look off. Very small annual rates don’t match the multi-year drops shown. The authors should recheck units and decimals. Report both absolute (μg m⁻³ yr⁻¹) and relative (% yr⁻¹) trends with uncertainty, using a consistent method.
5. Inventory selection. The reason for choosing CEDS as the inventory is not clear. CEDS is a global inventory, and the city grids for China are not representative enough. I recommend comparing it with the MEIC inventory, which is a China-specific inventory.
6. Tone down causal claims. Linking the 2019–2020 drop mainly to policy may overstate causality, especially with COVID shocks. The authors should add a check excluding the year 2020.
7. Clarify the contribution. The time coverage is outdated. Specify what is new (e.g., data, model, scale, or attribution design) versus prior studies, cite those studies, and show how your results change or add value.
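The blocked cross-validation suggested in main comment 2 can be sketched as follows. This is a minimal standard-library illustration of leave-one-group-out splitting, not the authors' code: the function name `leave_one_group_out` and the sample labels are hypothetical, and in practice scikit-learn's `LeaveOneGroupOut` or `GroupKFold` splitters serve the same purpose.

```python
from typing import Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group.

    Every sample of the held-out group goes to the test set, so no year
    (or city) is ever split across train and test.
    """
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical sample labels: one entry per training row
years = [2015, 2015, 2016, 2016, 2017, 2017]
cities = ["A", "B", "A", "B", "A", "B"]

year_folds = list(leave_one_group_out(years))    # leave-one-year-out
city_folds = list(leave_one_group_out(cities))   # leave-one-city-out

for held_out, train_idx, test_idx in year_folds:
    print(f"held-out year {held_out}: train={train_idx} test={test_idx}")
```

Fitting and scoring the model once per fold, separately for the year-based and city-based schemes, gives the per-scheme R, RMSE/MAE, and bias that the comment asks for.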
Minor comments
- Clarify feature groups (meteorology vs. emissions/activity vs. concentrations) and which ones go into each counterfactual. State any lags.
- List LightGBM hyperparameters, seeds, data splits and more details.
- Check variable names/units (e.g., T2M is near-surface air temperature, not “maximum”). It is confusing that the paper gives ‘2-m temperature (T2M)’ at Line 24 but ‘2-m maximum air temperature (T2M)’ at Line 127.
- For city averages, give station counts, completeness rules, and weighting (simple mean vs. population/land-use weights).
- Figure 3 and relevant description: explain and show which “importance” metric you use; add SHAP/PD plots.
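To make the distinction between split-based tree importance and permutation importance concrete, here is a minimal, model-agnostic sketch. It is a hand-rolled standard-library illustration with a hypothetical toy predictor, not the LightGBM setup from the manuscript. The score for each feature is the mean increase in mean squared error after shuffling that feature's column.

```python
import random
from typing import Callable, List, Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(
    predict: Callable[[Sequence[float]], float],
    X: List[List[float]],
    y: Sequence[float],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical toy data: the target depends on feature 0 only
X = [[float(i), float((i * 7) % 5)] for i in range(20)]
y = [3.0 * row[0] for row in X]


def toy_predict(row: Sequence[float]) -> float:
    """Stand-in for a fitted model that ignores feature 1."""
    return 3.0 * row[0]


scores = permutation_importance(toy_predict, X, y)
print(scores)
```

The scores are in units of squared target error, not percent of variance, which is one reason importance values should not be reported as “explains X% of variation.”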
In summary, this study requires a more defensible attribution design, leakage-safe validation, and stronger uncertainty treatment before the conclusions can be considered robust.
Citation: https://doi.org/10.5194/egusphere-2025-2786-RC3
RC4: 'Comment on egusphere-2025-2786', Anonymous Referee #4, 15 Aug 2025
This manuscript presents an application of the LightGBM machine learning model to a multi-source dataset to quantify the respective contributions of meteorology and anthropogenic emissions to PM2.5 and PM10 variability in the BTH and YRD regions from 2015 to 2020. Accurately attributing these drivers is essential for formulating effective air-quality policies. However, there are several major concerns regarding the methodology, and the subsequent conclusions, which I believe need to be thoroughly addressed before the manuscript can be considered for publication.
Major Comments:
1. The method used to separate meteorological and emission contributions (Section 2.6), which involves fixing one set of variables to a baseline year (2015) while allowing others to vary, is a central component of the analysis. While this approach could be used in physically based models (e.g., CTMs), its application to a purely data-driven model like LightGBM warrants further discussion. Machine learning models learn non-linear relationships that are specific to the co-varying patterns present in the training data. Creating scenarios with combinations of variables that have not been observed historically (e.g., 2020 meteorology with 2015 emissions) may represent an out-of-distribution task. However, the model evaluation is based only on sample-based cross-validation. The performance and the physical interpretability of its output under such conditions could be uncertain. Furthermore, the prediction is based merely on instantaneous states, excluding the cumulative effects of previous moments. The authors are encouraged to provide further justification for this method’s suitability in an ML context, perhaps by citing literature where this technique has been validated for similar models or by conducting a sensitivity analysis to support the attribution results.
2. Several aspects of the machine learning implementation are confusing. The ML model includes PM10 (when predicting PM2.5) and vice versa among the feature set. Please discuss potential information leakage and quantify how much predictive skill derives from cross-pollutant auto-correlation versus direct meteorology/emission inputs. A sensitivity experiment retraining the model without inter-pollutant inputs would clarify the true drivers. The feature importance ranking presented in Fig. 3 indicates that the date is the most significant input variable. However, the methodology for constructing this feature is not detailed in either the manuscript or the supplement. Upon examining your code base at https://zenodo.org/records/16346573, I noted that treating temporal features, specifically using the date (e.g., YYYYMMDD) as a continuous numerical input, may present a significant methodological limitation. Tree-based models cannot inherently interpret the cyclical nature of time from a simple int/float representation. Consequently, this input feature could inadvertently act as an identifier for different days, potentially causing the model’s predictions to over-rely on training data from the same day. Given the absence of a temporal-based split in the model evaluation, the current performance metrics might be overestimated.
3. The reported magnitudes for the PM concentration trends seem unexpectedly low. For instance, a reported annual decline of −0.07±0.03 μg m−3 for the BTH region appears to be several orders of magnitude smaller than what would be derived from the absolute concentration changes observed over the study period in public reports. The authors are kindly requested to verify these calculations and confirm the units. Additionally, it is suggested that the time series line charts, including the fitted lines calculated from Section 2.5, be presented in the supplement, and that the authors clarify whether the trends in these regions remained stable throughout the 2015-2020 period, or if there were notable shifts at any point.
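The point in major comment 2 about feeding a raw YYYYMMDD integer to a tree model can be illustrated with a small sketch. It compares the integer gap between two consecutive days across a year boundary with their distance under a sine/cosine day-of-year encoding. The helper names `as_yyyymmdd` and `cyclical_day_of_year` are hypothetical, and the cyclical encoding is shown as one common alternative, not as the fix the referee prescribes.

```python
import calendar
import math
from datetime import date
from typing import Tuple


def as_yyyymmdd(d: date) -> int:
    """Date packed into a single integer, e.g. 20191231."""
    return d.year * 10000 + d.month * 100 + d.day


def cyclical_day_of_year(d: date) -> Tuple[float, float]:
    """Map the day of year onto the unit circle as (sin, cos)."""
    days_in_year = 366 if calendar.isleap(d.year) else 365
    angle = 2.0 * math.pi * (d.timetuple().tm_yday - 1) / days_in_year
    return math.sin(angle), math.cos(angle)


dec31 = date(2019, 12, 31)
jan01 = date(2020, 1, 1)

# Consecutive days look far apart as packed integers ...
int_gap = as_yyyymmdd(jan01) - as_yyyymmdd(dec31)

# ... but sit next to each other on the seasonal circle
cyc_gap = math.dist(cyclical_day_of_year(dec31), cyclical_day_of_year(jan01))

print(f"integer gap: {int_gap}, cyclical distance: {cyc_gap:.4f}")
```

This encoding captures seasonality only. It drops the year, so any long-term trend would need a separate feature, and it does not replace a temporally blocked evaluation.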
Other Comments:
Line 131-134: Please specify the temporal and spatial resolution of the CEDS emissions dataset. Also, explain why only NO was chosen rather than the full NOx.
Line 144: Random Forest is not a gradient-boosting method; it belongs to the Bagging family.
Line 158-162: Pearson’s R measures linear association and is not sufficient alone for nonlinear models like LightGBM. The coefficient of determination (R2) would more appropriately assess the model’s explanatory power in this context.
By the way, the provided source code reveals a critical error in calculating R2. The code r2_value = r2_score(test, y) reverses the required (y_true, y_pred) argument order for the sklearn.metrics.r2_score function (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html). The formula for R2 (1 − SSres/SStot) is dependent on the total sum of squares of the true values. Reversing the arguments changes this denominator to the total sum of squares of the predicted values, which is mathematically incorrect and yields a metric that is not R2.
Line 268: The x-axis of Figure 2 needs a clear unit label.
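The R2 argument-order issue raised above can be reproduced with a few lines of standard-library Python. The function below is a hand-written implementation of 1 − SSres/SStot that takes its arguments in the same (y_true, y_pred) order as `sklearn.metrics.r2_score`. The toy values are hypothetical and chosen only to make the asymmetry obvious.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    SS_tot is computed around the mean of y_true, so the result changes
    when the two arguments are swapped.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 2.0, 3.0, 3.0]

correct = r_squared(observed, predicted)   # (y_true, y_pred): intended order
swapped = r_squared(predicted, observed)   # arguments reversed

print(f"correct order: {correct:.2f}, swapped order: {swapped:.2f}")
```

The residual sum of squares is identical in both calls. Only the denominator changes, and because predictions here vary less than the observations, the swapped call divides by a smaller total sum of squares and returns a different value.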
Citation: https://doi.org/10.5194/egusphere-2025-2786-RC4