the Creative Commons Attribution 4.0 License.
Assessing the causal impact of the Chinese Spring Festival on PM2.5 air quality in Beijing-Tianjin-Hebei and surrounding region using a machine learning counterfactual modeling approach
Abstract. Acute short-term exposure to extremely high PM2.5 levels poses serious health risks. Culture-based human festival activities can significantly alter emission patterns, often leading to sharp yet understudied fluctuations in air quality. The Chinese Spring Festival (CSF), marked by large-scale family reunions and widespread use of fireworks, raises air pollution concerns. This effect is commonly quantified using receptor models or chemical transport models, but the relevant chemical component data and emission inventories are often lacking. This study presents a machine learning counterfactual approach to causally quantify PM2.5 changes associated with holiday activities. The results align well with traditional chemical composition-based estimates of fireworks contributions, highlighting the strong potential of using widely accessible routine monitoring data to quantify source contributions driven by specific interventions. Applied to the twenty-eight major cities in the Beijing-Tianjin-Hebei and surrounding area, one of the most polluted regions in China, the approach revealed an average PM2.5 reduction of 19.0 ± 17.5 μg/m3 during the CSF holiday period in 2025, with fireworks accounting for ≥35 % of the severely deteriorated first-day PM2.5 air quality and up to 89 % in Baoding. The approach offers a robust tool for evaluating holiday emissions and guiding air quality interventions.
Status: closed
-
RC1: 'Comment on egusphere-2025-4562', Anonymous Referee #1, 29 Oct 2025
The manuscript "Assessing the causal impact of the Chinese Spring Festival on PM2.5 air quality in Beijing-Tianjin-Hebei and surrounding region using a machine learning counterfactual modeling approach" by Yuan Li and team addresses an important and interesting topic: the influence of the Chinese Spring Festival (CSF) on regional PM2.5 concentrations, particularly the attribution of emissions to fireworks. The use of a machine learning counterfactual model is an interesting approach to isolate the festival's effect. However, the core conclusions regarding the high contribution of fireworks, especially at the regional scale, are based on data and methodological interpretations that lack sufficient resolution and rigor to justify the claim. Specifically, the analysis appears to conflate highly local, transient firework plumes with persistent regional emissions from industrial and urban sources. This weakness must be addressed before the manuscript can be considered for publication.
Major Comments:
- The manuscript interprets elevated chemical tracers (e.g., Al, Ba, Cu and K) as fireworks, equates the PMF-resolved “fireworks” factor with a very large fraction of city PM2.5 (e.g., fireworks reaching 137 µg m-3 and up to 81.8% at the peak), and then treats that as validation of the ML holiday-attributable signal. But trace metals from fireworks can adsorb or be scavenged onto pre-existing aerosol mass: a single atom or molecule of a tracer attaching to many aerosol mass units can cause PMF to label existing background aerosol as “firework-contaminated” even if the mass contribution from fireworks is small. This overlooks the critical aspect of aerosol mixing and coating. The presence of a fireworks marker on an aerosol particle only indicates that mixing or coating has occurred, not that the particle's mass is primarily sourced from fireworks. For instance, a persistent industrial black carbon (BC) or organic carbon particle, when mixed in the atmosphere with even a single molecule of Al/Na/Cu or K from fireworks, will register as carrying a 'firework' marker. In reality, the particle's overall mass and emission may still be dominated by the pre-existing industrial or alternative sources of the BC core. Would this not lead to an overestimation of the mass contribution of fireworks and bias the source apportionment? I think the authors should revise their interpretation to differentiate between a source signature (the tracer) and a mass contribution (the overall particle composition). One way to distinguish the two may be to examine the ratios Ba/Al, K/Ba, K/EC or K/BC and their co-variation with EC or BC, to see whether the tracers appear as spikes superimposed on otherwise continuous mass.
- Multi-platform observations of persistent BC and other aerosols in adjacent regions, such as Xuzhou and Suzhou, which share air-mass pathways with Shandong and Linyi, reveal substantial, continuous emissions from urban and small industrial sources (Tiwari et al., 2025). These background sources have far greater emission potential than transient firework events. The manuscript should reconcile its high firework attribution with these documented regional emissions, placing the CSF findings within the broader context of persistent regional aerosol loading. This also demands a clarification of the geographic scope of the conclusions. A localized, short-duration fireworks plume from Shandong is unlikely to represent a regional-scale PM2.5 driver across the entire Beijing-Tianjin-Hebei domain. The authors should specify whether their results describe a local fireworks effect or a regional influence and adjust the claims accordingly.
- The ML counterfactual approach is trained and evaluated on city-averaged PM2.5 (14 monitoring stations averaged for Hangzhou; Fig. 3 caption), but does the DN-PMF chemical composition used for validating the fireworks factor come from a single site (Wolongqiao)? Is the manuscript assuming that the single-site composition is representative of the whole-city PM2.5 composition when assessing the fireworks contribution? Please clarify and justify representativeness, if possible via spatial correlation of species across stations or multi-site composition if available. If the differences are large, the validation claim should be downweighted, or additional PMF runs may be necessary for other sites.
- The authors attribute the ML > DN-PMF magnitude difference partly to “unproportionally amplification effect due to midnight affect due to the midnight unfavorable dilution conditions (extremely low VC)”. This is plausible. However, if low VC multiplies existing aerosol mass, then fireworks mass should be amplified only in concentration, not in emission strength, if fireworks were purely local instantaneous injections. Clarifying the distinction between concentration amplification and inferred emission-strength amplification would make this section easier to follow.
- In Sect. 3.3-3.5 the authors compute fireworks contribution on the first day as “increase relative to counterfactual, under assumption that other sources remained unchanged.” This is a very strong assumption: other sources may decline (traffic reductions) or increase (localized cooking, residential heating). The manuscript needs to either: (a) quantify the extent of other source changes using available proxies (traffic counts, NO2/CO changes, DN-PMF vehicle factor), or (b) present their fireworks fraction as an upper/lower bound with explicit caveats. Right now the wording implies more certainty than justified.
- I think the claim that fireworks explain 68.8% of instantaneous PM2.5 needs more uncertainty reporting to be deemed robust. The DN-PMF uses 8 factors and labels Factor 1 as fireworks because Al/Ba/K peak. But PMF solutions can be non-unique and sensitive to input species and uncertainties. Reporting the PMF diagnostics would make the results more robust.
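Option (b) in the comment on the unchanged-sources assumption can be illustrated with a small sensitivity calculation. The sketch below is not the authors' method. It assumes a simple additive balance, observed = BAU × (1 + δ) + fireworks, where δ is a hypothetical fractional change in non-fireworks sources; the function names and all numbers are made up for illustration.

```python
def fireworks_fraction(observed, bau, other_change=0.0):
    """Fraction of observed PM2.5 attributed to fireworks.

    Illustrative assumption (not from the manuscript):
        observed = bau * (1 + other_change) + fireworks
    where other_change is the fractional change of all non-fireworks sources.
    """
    if observed <= 0:
        raise ValueError("observed must be positive")
    fireworks = observed - bau * (1.0 + other_change)
    # Clip to the physically meaningful range [0, 1].
    return min(max(fireworks / observed, 0.0), 1.0)


def fraction_bounds(observed, bau, change_low, change_high):
    """Lower and upper bound of the fireworks fraction for a range of other_change."""
    lower = fireworks_fraction(observed, bau, change_high)  # other sources grew
    upper = fireworks_fraction(observed, bau, change_low)   # other sources shrank
    return lower, upper


# Hypothetical peak-hour values in ug/m3 (not taken from the manuscript).
observed, bau = 200.0, 60.0
central = fireworks_fraction(observed, bau)            # other sources unchanged
low, high = fraction_bounds(observed, bau, -0.3, 0.2)  # -30 % to +20 %
print(f"central {central:.2f}, range {low:.2f} to {high:.2f}")
```

Reporting the range alongside the central value shows how strongly the fireworks share depends on the assumed behaviour of the other sources.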
Minor/Technical Comments:
- The Methods state “No strict hyper-parameter tuning requirements were imposed — either Bayesian optimization or grid search could be used” (Sect. 2.2). That is ambiguous. Please state what the authors actually used (grid values, CV folds, early stopping) and include the tuned hyperparameters in a supplement. Report learning curves and variable importance (SHAP or partial dependence) so readers see what drivers the model uses.
- Mixing gas-phase species (CO/NOX) with PM speciation should be echoed in the manuscript as a caveat: gas tracers and PM tracers respond to different source processes and lifetimes, so combining them without accounting for atmospheric chemistry and differential lifetimes could mislead source attribution (Li et al., 2025). The authors briefly note NO2/CO decline; they should deepen that discussion (e.g., use NO2 as independent corroboration of traffic reduction).
- Figure 1, Figure 2 and Figure 3 have missing references in the main text.
Reference:
Tiwari, P., Cohen, J.B., Lu, L. et al. Multi-platform observations and constraints reveal overlooked urban sources of black carbon in Xuzhou and Dhaka. Commun Earth Environ 6, 38 (2025). https://doi.org/10.1038/s43247-025-02012-x
Li, X., Cohen, J.B., Tiwari, P. et al. Space-based inversion reveals underestimated carbon monoxide emissions over Shanxi. Commun Earth Environ 6, 357 (2025). https://doi.org/10.1038/s43247-025-02301-5
Citation: https://doi.org/10.5194/egusphere-2025-4562-RC1
AC2: 'Reply on RC1', Qili Dai, 22 Jan 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4562/egusphere-2025-4562-AC2-supplement.pdf
-
RC2: 'Comment on egusphere-2025-4562', Anonymous Referee #2, 05 Nov 2025
The manuscript proposes a machine-learning counterfactual framework to estimate the impact of the Chinese Spring Festival (CSF) on PM2.5 in Hangzhou and the “2+26” cities. The study is timely and policy-relevant, with a clear intention to distinguish air-quality changes from emissions, and the manuscript is well organized and clearly presented. However, several aspects of causal ML practice, the temporal validation strategy, and issues of data representativeness need to be strengthened before publication.
Major Comments:
1. The manuscript frames its analysis within a causal framework, treating the Chinese Spring Festival (CSF) as a “treatment” and using the XGBoost model to predict a counterfactual business-as-usual (BAU) scenario. While this is a conceptually appropriate starting point, the current methodology does not yet meet a rigorous causal ML design. The CSF is a composite factor, bundling the effects of fireworks, altered traffic patterns, and changes in industrial/construction activity. This complexity challenges the core identification assumptions required for causal claims.
Furthermore, the analysis does not adequately address the potential influence of these assumptions, such as the inconsistent overlap in covariate distributions between festival and non-festival periods. Some features, like the lunar calendar day, are inherently confounded with the treatment, violating conditional independence. The study could be characterized as a causally inspired counterfactual prediction for BAU rather than a causal estimator under verified identification conditions. Hence, the authors may wish to reconsider the title and tone down the causal claims to avoid overstatement.
2. The current modeling approach, which relies on instantaneous covariates, does not account for the temporal auto-correlation inherent in air pollution. The concentration at any given time is also influenced by the emissions and meteorological conditions of previous periods. The choice of a random 80/20 split for model validation may introduce data leakage when evaluating the model performance. A blocked or rolling time-based cross-validation would be more appropriate here.
Separately, uncertainty quantification has been extensively discussed in ML-based atmospheric remote sensing, yet is not addressed in the present manuscript; providing calibrated predictive uncertainty would improve the interpretability of the results.
3. The abstract opens with acute short-term health risks from extremely high PM2.5, but the regional result emphasizes an average decrease of 19.0 ± 17.5 μg/m3 over the extended holiday period. These two statements are not contradictory but currently feel weakly connected.
Besides that, Section 3.2 (Hangzhou) explicitly reports large concurrent source changes (e.g., vehicles -31%; dust +2790%), yet Section 3.5 (“2+26”) estimates fireworks’ contribution “under the assumption that emissions from other sources remained unchanged.” The authors need to address this inconsistency or provide a sensitivity analysis under alternative assumptions.
4. Section 2.1 requires several clarifications. First, key details for the ERA5 dataset, including its temporal/spatial resolution and a reference link, should be provided in the manuscript or SI (Text S1/Table S1). To address the potential for reanalysis data to smooth over urban-scale extremes, a brief comparison of ERA5 variables against ground-station data would strengthen the analysis. Additionally, the usage of total precipitation (TP) needs to be explained; since it is an accumulated value, please describe any transformation performed to make it suitable for an hourly model. The specific parameters or a reference for Emanuel’s saturated vapor pressure formula should be included.
5. There is a spatial representativeness mismatch between the machine learning model, which uses a 14-site city average, and the DN-PMF analysis, which uses chemical data from a single site. This difference could introduce a bias, particularly for localized sources like fireworks. The authors could discuss this limitation and its potential impact on their findings. Given the team’s related work (e.g., Journal of Environmental Sciences), a brief comparison of the methodological advantages and efficiency gains relative to prior work would also help position the contribution.
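The blocked or rolling time-based validation suggested in comment 2 can be sketched as follows. This is a minimal, library-free illustration and not the authors' procedure: the function name, the fold count and the 24-sample gap are assumptions, and the samples are assumed to be sorted chronologically.

```python
def blocked_time_splits(n_samples, n_folds=5, gap=0):
    """Yield (train_idx, test_idx) index lists for rolling-origin validation.

    The series is cut into n_folds + 1 contiguous blocks. Fold k trains on
    everything before block k and tests on block k, leaving `gap` samples
    unused in between so autocorrelated neighbours do not leak into training.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    if not 0 <= gap < block:
        raise ValueError("gap must be non-negative and smaller than one block")
    for k in range(1, n_folds + 1):
        test_start = k * block
        # The last fold absorbs any remainder so every late sample is tested.
        test_end = n_samples if k == n_folds else test_start + block
        train_end = test_start - gap
        yield list(range(train_end)), list(range(test_start, test_end))


# Hypothetical example: 10 days of hourly data, 3 folds, 24 h gap.
splits = list(blocked_time_splits(240, n_folds=3, gap=24))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")
```

Unlike a random 80/20 split, every test block here lies strictly after its training data, so the reported skill is closer to what the model would achieve when predicting an unseen holiday period.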
Minor corrections:
- Line 22: twenty-eight -> 28
- Line 119: meterorology -> meteorology
- Line 164: A -> An
- Line 207: in the midnight of the New Year Eve -> at midnight on New Year’s Eve
- Line 244: Please add units for RMSE and MAE.
- Line 260: reliablity -> reliability
- Line 261: techique -> technique
- Line 323: deterioriation -> deterioration
- Table S1: Please use Pa (not pa) for pressure unit.
Citation: https://doi.org/10.5194/egusphere-2025-4562-RC2
AC1: 'Reply on RC2', Qili Dai, 22 Jan 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4562/egusphere-2025-4562-AC1-supplement.pdf
-
RC3: 'Comment on egusphere-2025-4562', Anonymous Referee #3, 09 Jan 2026
This study employs machine learning to evaluate the impact of firework displays on PM2.5 pollution during the Chinese Spring Festival. Overall, the manuscript is well-structured and concisely written. The findings offer valuable insights for the scientific management of PM2.5 pollution in China. Revisions are needed before consideration for publication.
P2, L41: “often marked by a decline in nitrogen oxide levels and a sharp increase in PM2.5 concentrations”. This statement requires supporting references.
P3, L74-81: (1) Please specify the instrumentation used for measuring chemical compositions, along with their limits of detection, accuracy, and precision. (2) Regarding the calculation of SOC: The OC/EC minimum ratio method assumes stable emission sources over a period, which is clearly not applicable to the drastic emission changes during the Spring Festival. The authors should re-evaluate the validity of this method.
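The concern about the OC/EC minimum ratio method can be made concrete with a short calculation. The sketch below uses the generic EC-tracer form, POC = EC × (OC/EC)_min and SOC = OC − POC; whether the manuscript applies exactly this form is an assumption, and the function name and concentrations are hypothetical.

```python
def soc_min_ratio(oc, ec):
    """Estimate SOC with the OC/EC minimum ratio (EC-tracer) approach.

    Primary OC is taken as EC times the smallest OC/EC ratio observed in the
    period, and SOC is the remainder of OC (floored at zero).
    """
    ratios = [o / e for o, e in zip(oc, ec) if e > 0]
    if not ratios:
        raise ValueError("need at least one sample with EC > 0")
    r_min = min(ratios)
    soc = [max(o - e * r_min, 0.0) for o, e in zip(oc, ec)]
    return r_min, soc


# Hypothetical hourly OC and EC in ug/m3. The third hour mimics an emission
# event whose primary OC/EC ratio differs from the rest of the period.
oc = [6.0, 8.0, 30.0]
ec = [3.0, 2.0, 5.0]
r_min, soc = soc_min_ratio(oc, ec)
print(r_min, soc)
```

Because a single (OC/EC)_min is applied to the whole period, any primary OC emitted at a higher ratio during the third hour is counted as SOC, which is the sensitivity the comment asks the authors to re-evaluate.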
P3, L88: Why was ERA5 reanalysis data used instead of locally measured meteorological data? Please justify this choice.
P 6-7, Line 180-181: Fireworks also release substantial amounts of potassium.
Wang Ying et al. The air pollution caused by the burning of fireworks during the lantern festival in Beijing, Atmospheric Environment, 417-431, 41, 2007.
Wang Wenhua et al. Chemical composition and morphology of PM2.5 in a rural valley during Chinese New Year’s Eve: Impact of firework/firecracker display, Atmospheric Environment, 120225, 318, 2024.
P8, L214: The authors report that fireworks contributed ~70% to PM2.5, which is an exceptionally high figure. How does this compare with previous studies? Were these fireworks discharged in the immediate vicinity of the monitoring sites? Furthermore, are firework bans implemented in this city?
P12-13, L329-335: Figure 5 is well-presented, but the accompanying description and discussion are too superficial. What explains the extreme disparity in firework contributions across different cities (>80% vs <10%)? Is this linked to local government bans? A more in-depth discussion is warranted.
Citation: https://doi.org/10.5194/egusphere-2025-4562-RC3
AC3: 'Reply on RC3', Qili Dai, 22 Jan 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4562/egusphere-2025-4562-AC3-supplement.pdf
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 682 | 0 | 6 | 688 | 0 | 0 |