the Creative Commons Attribution 4.0 License.
The radiative forcing of PM2.5 heavy pollution, its influencing factors and importance to precipitation during 2014–2023 in the Bohai Rim, China
Abstract. There were PM2.5 heavy pollution events in the Bohai Rim regions in China over the past decade, which can significantly affect radiative forcing (RF). However, the characteristics and influencing factors of RF on heavy pollution days, and its relative importance to precipitation remain unclear. This work combined ground-based and satellite observations and reanalysis data to investigate the RF characteristics of regional PM2.5 heavy pollution in the Bohai Rim regions during the fall and winter of 2014–2023. Additionally, the impact of meteorological vertical profiles on surface PM2.5 and pollution RF, and the importance of various factors to pollution RF and precipitation, were explored based on machine learning algorithms. The results showed that the RF on PM2.5 regional heavy pollution days can be up to approximately −70 W m⁻² at the surface, ±8 W m⁻² at top of atmosphere (TOA), and +80 W m⁻² in the atmosphere in clear-sky, with lower absolute values in all-sky. Low- to medium-altitude inversions of temperature (T) profiles in the boundary layer favored higher surface PM2.5 concentration, whereas isothermal stratification and medium- to high-altitude inversions corresponded to higher surface RF. Lower horizontal speeds and upward motion at low levels can induce higher surface PM2.5 and surface RF. Surface PM2.5 was the most important factor to surface and atmosphere RF in clear-sky, but V wind in high level (500 hPa) in all-sky. Moreover, pollution RFs in all-sky were as important as vertical winds to the total precipitation. Notably, there was considerable regional heterogeneity in the important factors affecting the RF and precipitation in the Bohai Rim regions.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-4464', Anonymous Referee #1, 19 Dec 2025
- RC2: 'Comment on egusphere-2025-4464', Anonymous Referee #3, 25 Dec 2025
This study integrates ground observations, satellite remote sensing, reanalysis data, and machine learning techniques to systematically analyze the radiative forcing (RF) characteristics, influencing factors, and their relative importance to precipitation during autumn and winter PM2.5 heavy pollution events in the Bohai Rim region (2014–2023). While the work is meaningful, the following issues need to be addressed:
1. Page 6, Line 135: The term “clean days” is used without a clear definition. Please specify the exact PM2.5 concentration threshold or criteria used to classify a day as “clean”.
2. Although machine learning algorithms are central to this study, the manuscript lacks quantitative performance evaluation. Please provide essential statistical metrics, such as the coefficient of determination (R²), RMSE, and MAE, for both the radiative forcing and precipitation prediction models to validate their accuracy.
3. Page 20, Line 455: The discussion regarding regional heterogeneity is currently superficial. The authors should elaborate on the underlying physical mechanisms driving these differences. Specifically, the analysis should link the results to sub-regional variations in surface types, emission intensities, and topographical features.
4. Please clarify whether data standardization was performed prior to K-means clustering. Given the disparate units and scales of the input variables, omitting standardization could significantly bias the clustering results towards variables with larger magnitudes. If standardization was not applied, a rigorous justification is required.
5. Data Quality and Pre-processing: The validity of applying machine learning to linearly interpolated CERES radiation data requires justification. Linear interpolation may introduce artifacts or smooth out extreme values, potentially biasing the training process of the machine learning models. The authors should discuss: (a) the extent of missing data prior to interpolation; (b) whether the spatial resolution of CERES data is sufficient to capture the local variability of pollution events in the study area; and (c) how these uncertainties affect the generalizability of the conclusions.
6. Section 2.4: The current description of K-means and Random Forest is too generic. This section should be condensed to focus on the implementation details specific to this study, such as hyperparameter settings, input feature selection, and the cross-validation strategy employed.
Citation: https://doi.org/10.5194/egusphere-2025-4464-RC2
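For reference, the statistics requested in point 2 (R², RMSE, and MAE) could be computed along the lines of the sketch below. It is a minimal, library-free illustration, not the authors' method: the function name `regression_metrics` and the sample RF values are hypothetical, and in practice the predictions would presumably come from held-out cross-validation folds of the Random Forest models.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        # R^2 is undefined when the observations have no variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical surface RF values (W m^-2), for illustration only.
observed = [-70.0, -50.0, -30.0, -10.0]
predicted = [-65.0, -55.0, -30.0, -10.0]
metrics = regression_metrics(observed, predicted)
print(metrics)  # r2 = 0.975, rmse ~ 3.54, mae = 2.5
```

Reporting the three values together is useful because R² alone hides the magnitude of the error in physical units, while RMSE and MAE together indicate whether a few large misses dominate.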
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 211 | 77 | 37 | 325 | 34 | 23 | 27 |
The authors present research on an interesting topic: the influence of meteorological parameters and aerosol concentrations measured at ground level on radiative forcing. They use up-to-date methods and data sources; however, in my opinion, the manuscript in its current form is not suitable for publication. Some points need to be explained or changed, and some require additional data analysis.
Moreover, the way the authors present their findings is chaotic and sometimes hard to follow. The figures are sometimes unreadable, captions are too small, etc. There is probably too much data presented in the manuscript; maybe moving some data to an appendix would help and allow readers to focus on the main scope of the manuscript.
The main issue of the manuscript, in my opinion, is how the authors “compare” in-situ data with satellite measurements. They use multisource data, which is fine; however, the relationship between them and the transition from in-situ measurement data to gridded data is unclear.
For instance, they use a monitoring network to identify high-concentration episodes, which is acceptable. Then, some gridded data are used in statistical analysis. What happens between in-situ and gridded data? How does TAP work? A brief description is needed in the manuscript. By the way, why do the authors use ERA5 data while the TAP website claims that meteorological data are combined with aerosol data? If ERA5 data are better, then what is the quality of TAP aerosol data?
The authors use PM2.5 data as a predictor in the Random Forest model. They claim it is a proxy for anthropogenic sources; however, elsewhere they discuss the influence of transport on local concentrations, and in another place, they state that columnar optical properties determine radiative forcing. So, is local PM2.5 a predictor of radiative forcing or not? Maybe it would be better to use emission inventories (TAP claims that one is incorporated) to estimate anthropogenic sources instead of PM2.5 concentrations? Moreover, in section 3.3, the authors again claim that PM2.5 is related to emissions while meteorological profiles reflect diffusion. I cannot agree: emissions together with diffusion factors influence PM2.5 concentrations. So, PM2.5 is not an independent variable, contrary to what is stated in the conclusion.
Another example concerns the “mean profiles” of meteorological parameters. It is written that the authors used profiles at over 11 stations. How are they representative for the 0.25° × 0.25° grid used in the Random Forest analysis? Is there any local orography favoring aerosol transport or accumulation in valleys? What about sea-land differences? I can understand that clustering is performed over land (land stations). It seems that clustering and Random Forest are independent. So, it should be explained somewhere why the authors perform such investigations.
Another question is: what are the profiles during the rest of the analyzed time, not only during high-concentration episodes? For example, temperature profile clusters 2 to 4 exhibit inversion from around 950 to 850 hPa. The frequency of occurrence during the investigated episodes is around 30–40%. What happens during the rest of the investigated period (autumn, winter)? Another issue is how “wind clusters” are presented. Figure 5a is completely unreadable. Maybe the authors will find another way to present changes in wind speed and direction with altitude?
Regarding mean clustered profiles, it would be interesting to connect temperature profiles with wind profiles. I suggest performing multidimensional cluster analysis to find meteorological situations favoring large PM2.5 concentrations—for example, inversion and low wind speed near the ground.
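The joint clustering suggested above could look roughly like the following sketch, which standardizes each variable before running K-means on combined temperature and wind features. Everything here is an assumption for illustration: the two features (inversion strength in K and near-surface wind speed in m/s), the sample values, the helper names `zscore_columns` and `kmeans`, and the deterministic seeding are hypothetical and do not reproduce the manuscript's procedure.

```python
import math


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) std."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column gets std 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)]
            for row in rows]


def kmeans(rows, k, n_iter=50):
    """Plain Lloyd's algorithm; the first k rows seed the centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(r, centroids[j])))
            for r in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical samples: (inversion strength in K, wind speed in m/s).
# Rows alternate between stagnant-inversion and well-ventilated cases.
samples = [
    [3.0, 1.0], [-1.0, 6.0],
    [2.5, 1.5], [-0.5, 7.0],
    [3.5, 0.8], [-1.5, 5.5],
]
standardized = zscore_columns(samples)
labels = kmeans(standardized, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Standardizing first matters because temperature and wind variables carry different units and ranges; without it, the Euclidean distance used by K-means is dominated by whichever variable has the larger numerical spread.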
One major weakness of the manuscript is the insufficient discussion of the Random Forest analysis. The authors should elaborate on why individual parameters influence radiative forcing, separately for the clear-sky and all-sky cases. What are the potential mechanisms? What is the influence of clouds? Furthermore, the land-sea aspect requires a more in-depth analysis, particularly regarding aerosol transport between sea and land—for example, the influence of wind speed and direction.