the Creative Commons Attribution 4.0 License.
Influencing Factors of Gas-Particle Distribution of Oxygenated Organic Molecules in Urban Atmosphere and its Deviation from Equilibrium Partitioning
Abstract. Gas-to-particle partitioning governs the fate of Oxygenated Organic Molecules (OOMs) and the formation of organic aerosols. We employed a FIGAERO-CIMS to measure gas-particle distribution of OOMs in a winter campaign in urban atmosphere. The observed gas to particle (G/P) ratios show a narrower range than the equilibrium G/P ratios predicted from saturation mass concentration C* and organic aerosol content. The difference between observed and equilibrium G/P ratios could be up to 10 orders of magnitude, depending on C* parameterization selection. Our random forest models identified relative humidity (RH), aerosol liquid water content (LWC), temperature and ozone as four influential factors driving the deviations of partitioning from equilibrium state. Random forest models with satisfactory performance were developed to predict the observed G/P ratios. Intrinsic molecule features far outweigh meteorological and chemical composition features in the model's predictions. For a given OOM species, particle chemical composition features including pH, RH, LWC, organic carbon, potassium and sulfate dominate over meteorological and gaseous chemical composition features in predicting the G/P ratios. We identified positive or negative effects, as well as the sensitive ranges, of these influential features using SHapley Additive exPlanations (SHAP) analysis and curve fitting with a generalized additive model (GAM). Our models found that temperature does not emerge as a significant factor influencing the observed G/P ratios, suggesting that other factors, most likely associated with particle composition, inhibit the gas/particle partitioning of OOMs in response to temperature change.
Status: open (until 28 Apr 2025)
RC1: 'Comment on egusphere-2025-229', Anonymous Referee #3, 01 Apr 2025
Wang et al. measured both gas- and particle-phase OOMs using FIGAERO-CIMS during a winter campaign in Wuhan. They derived gas-to-particle ratios (G/P) from the measured FIGAERO-CIMS signals and predicted the equilibrium G/P based on predicted OOM volatility. They further applied machine learning methods to reveal key factors associated with G/P. The manuscript aligns with the scope of Atmospheric Chemistry and Physics. However, clarification of the principal findings is recommended prior to publication. Specific comments are given below:
1) The machine learning analysis provides intriguing insights into G/P influencing factors. However, given potential limitations in the representativeness of the dataset and the ML methodology, the mechanistic interpretation of identified factors may not be entirely clear. I recommend expanded discussion in Section 3.2.2 to include detailed analysis of at least one parameter (e.g., temperature).
2) Please explain Eq. 4 with an emphasis on its underlying assumptions that are possibly violated in the real atmosphere. This clarification would aid the discussion on the influencing factors of the ratio of (G/P)obs to (G/P)eq.
3) Lines 179-181, Page 7. Please specify the data partitioning strategy for training and test sets and the measures to prevent model overfitting.
4) Fig. 1 and its relevant discussion. The method in Ren et al. (2022) provided the equilibrium pressure (Ceq) rather than the saturation pressure (C*), as the formula was fit to atmospheric aerosols (a mixture of many OOMs). Using Ceq instead of C* in Eq. 4 may introduce systematic biases. Could this be the reason for the observed discrepancy between (G/P)obs and (G/P)eq in Figs 1a and 1d?
5) Previous studies (e.g., Voliotis et al., 2021; Chen et al., 2024) have reported a narrow volatility range of OOMs retrieved using the partitioning method. It is not surprising to see a huge difference between the volatility obtained using different methods. Would it be possible to expand discussion on this finding?
6) As noted by the authors, atmospheric OOMs may not reach equilibrium between the gas and particle phases (e.g., Li et al., 2024). Could the machine learning features capture these non-equilibrium effects?
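For readers following comments 2 and 4, here is a minimal sketch of how an equilibrium G/P ratio and its deviation from an observation can be computed. It assumes the standard absorptive partitioning relation, in which the particle-phase fraction is 1/(1 + C*/C_OA), so that (G/P)eq = C*/C_OA. Eq. 4 of the manuscript is not reproduced on this page, so whether it has exactly this form is an assumption. The function names and numerical values are hypothetical.

```python
import math


def gp_equilibrium(c_star, c_oa):
    """Equilibrium gas-to-particle ratio under absorptive partitioning.

    Assumes (G/P)eq = C* / C_OA with both quantities in ug m^-3.
    This form is an assumption; the manuscript's Eq. 4 is not shown here.
    """
    if c_star <= 0 or c_oa <= 0:
        raise ValueError("c_star and c_oa must be positive")
    return c_star / c_oa


def log_deviation(gp_obs, gp_eq):
    """Return log10((G/P)obs / (G/P)eq); 0 means the observation matches equilibrium."""
    return math.log10(gp_obs / gp_eq)


# Hypothetical values, for illustration only.
gp_eq = gp_equilibrium(c_star=1e-3, c_oa=10.0)  # a low-volatility compound
print(gp_eq)
print(log_deviation(1.0, gp_eq))  # orders of magnitude above equilibrium
```

Under this assumed form C* enters linearly, so a bias of several orders of magnitude in the volatility parameterization shifts the computed deviation by the same number of orders, which is the concern raised in comment 4.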
References
[1] Voliotis, A., Wang, Y., Shao, Y., Du, M., Bannan, T. J., Percival, C. J., Pandis, S. N., Alfarra, M. R., and McFiggans, G.: Exploring the composition and volatility of secondary organic aerosols in mixed anthropogenic and biogenic precursor systems, Atmos. Chem. Phys., 21, 14251–14273, 10.5194/acp-21-14251-2021, 2021.
[2] Chen, W., Hu, W., Tao, Z., Cai, Y., Cai, M., Zhu, M., et al.: Quantitative characterization of the volatility distribution of organic aerosols in a polluted urban area: Intercomparison between thermodenuder and molecular measurements. J. Geophys. Res. Atmos., 129, e2023JD040284, 10.1029/2023JD040284, 2024.
[3] Li, Y., Cai, R., Yin, R., Li, X., Yuan, Y., An, Z., Guo, J., Stolzenburg, D., Kulmala, M., and Jiang, J.: A kinetic partitioning method for simulating the condensation mass flux of organic vapors in a wide volatility range, J. Aerosol Sci., 180, 106400, 10.1016/j.jaerosci.2024.106400, 2024.
Citation: https://doi.org/10.5194/egusphere-2025-229-RC1
RC2: 'Comment on egusphere-2025-229', Anonymous Referee #1, 03 Apr 2025
The manuscript by Wang et al. presents a well-executed study that integrates FIGAERO-CIMS measurements with a machine learning approach to understand the gas-particle partitioning of OOMs in urban environments. They identified key influencing factors such as relative humidity, liquid water content, and particle-phase composition. The authors showed how these factors affect the deviations from equilibrium partitioning, contributing significant knowledge to atmospheric aerosol science. I recommend publication after the authors address the following questions.
- It would be better to add more details about the sampling site’s characteristics, such as location and distances to pollution sources (e.g., traffic or industrial areas), and discuss how these factors influence OOM partitioning.
- The authors use different C* parameterizations to estimate the equilibrium G/P ratios and compare them with the observed G/P ratios. However, these different C* parameterizations may introduce significant uncertainty. Please provide a more detailed discussion of the advantages and limitations of each parameterization and explain why these specific methods were chosen.
- The authors mentioned that the model is based on winter data and limited to specific OOM species. It is recommended that the authors clearly discuss the model’s limitations, such as whether certain OOM types (e.g., highly volatile organics) were underrepresented, and how seasonal variations (e.g., summer heat or rainy season humidity) might affect model predictions. The authors should address these limitations and suggest future improvements, such as expanding the dataset to include different seasons or OOM species.
- In the feature selection process, the authors identify essential features using the random forest model and explain them with SHAP analysis. It is recommended that the authors provide more statistical justification for the feature selection, such as significance tests or correlation analysis, to validate the importance of these features. Additionally, the SHAP interpretation could be enhanced by including quantitative analysis of how each feature’s variation impacts the G/P ratio, not just the ranking of feature importance. Specifically, sensitivity curves for different features across ranges could visually show their contribution to the model output.
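The sensitivity curves requested in the last comment can be sketched as follows. This is a partial-dependence-style calculation, which is a simpler technique than the SHAP and GAM analysis described in the abstract and is not the authors' method. The `toy_model` function stands in for a fitted random forest, and all feature names and values are hypothetical.

```python
def sensitivity_curve(predict, rows, feature, grid):
    """Average model output as one feature is swept over a grid.

    predict: callable taking a dict of feature values and returning a float
    rows:    list of dicts, one per observed sample
    feature: name of the feature to vary
    grid:    values assigned to that feature in turn
    """
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)  # copy so the original sample is untouched
            modified[feature] = value
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


# Hypothetical stand-in for a fitted model of log10(G/P).
def toy_model(x):
    return 1.0 - 0.02 * x["rh"] + 0.1 * x["temp"]


samples = [{"rh": 40.0, "temp": 5.0}, {"rh": 80.0, "temp": 10.0}]
for rh, mean_pred in sensitivity_curve(toy_model, samples, "rh", [30.0, 60.0, 90.0]):
    print(rh, round(mean_pred, 3))
```

Plotting the returned pairs for each feature would show the direction and sensitive range of its effect. When features are strongly correlated, such curves can evaluate the model at unrealistic feature combinations and should be read with caution.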
Citation: https://doi.org/10.5194/egusphere-2025-229-RC2
RC3: 'Comment on egusphere-2025-229', Anonymous Referee #4, 03 Apr 2025
Referee comment
General comments
Wang et al. conducted a measurement campaign using a FIGAERO-CIMS with iodide ionization to detect OOMs in Wuhan during the winter of 2022-2023. The deployed instrument was able to record concentrations of compounds in both the gas and particle phases. Using these data, the authors developed several random forest models to predict the gas-to-particle partitioning of the detected compounds. Models were also constructed to study the discrepancy between observed and modelled partitioning ratios. The significance of the features in each model was further analyzed in an attempt to elucidate the physicochemical properties affecting the underlying processes.
The manuscript fits the scope of Atmospheric Chemistry and Physics and presents valuable new knowledge. However, there are a number of concerns regarding the analysis presented. Further clarifications are therefore recommended before final publication.
Specific comments
- I suggest the authors provide some further explanation of why the selected target variable was chosen. This would help readers know what to expect from the analysis. What are the benefits of predicting the gas-to-particle partitioning rather than, e.g., the absolute particle-phase concentration?
- Given the data-first approach chosen, the quality of the dataset is important. Given that the dataset represents a period of one month in winter, how representative is it of varying conditions? Are several different meteorological conditions or weather patterns included in the analysed dataset, and what is their significance for the analysis? I encourage the authors to give a brief overview of the importance of varying conditions, and perhaps to provide time series of common meteorological parameters, such as temperature and wind speed and direction, in the supplement as an overview for the reader.
- Given the (it seems) limited scope of the dataset, how general are the conclusions of this work?
- In general, how certain is the measured partitioning ratio? For example, it is stated that 26.8% of particle mass is detected as fragments. Does this not bias the G/P ratio towards the gas phase for compounds that fragment? What about other losses in the FIGAERO system, or fragments that are not detected by the ionization scheme used?
- Also, the authors select the data based on peaks being relatively dominant on their masses, and with substantial concentration in the particle phase (lines 104-105). This excludes compounds with very small concentrations in either the particle or gas phase. This, in turn, leads to a narrow range of G/P values, as compounds predominantly in either phase are filtered out. This is observed (Fig. 1), and should be commented on more.
- On the comparison with previous C* parametrisations: the Mohr et al. parametrisation is still based on SIMPOL (through Tröstl et al.), although with an increased contribution from OOH groups. This is not clear in the text. Kurtén et al. (2016, 10.1021/acs.jpca.6b02196) and subsequent publications show that for HOM-type compounds, SIMPOL predicts too steep a dependence of C* on, e.g., oxygen content and molar mass. Also, Peräkylä et al. do not use particle-phase concentrations, and there is no assumption of equilibrium. In contrast, Priestley et al. assume equilibrium.
- The authors compare the observed G/P with equilibrium G/P from other studies. When this comparison is presented to the reader, the reason behind the comparison is not clear, given that these are very different quantities. As the authors themselves state, OOMs rarely achieve equilibrium partitioning in the free atmosphere. I suggest the authors clearly motivate why the comparison is being made.
- In Figure 1 the error bars denote the range of observations, but systematic errors (such as those mentioned in comment 4) are not mentioned. Although these may not be that relevant for the model, they may affect the absolute comparison presented here.
- My understanding is that the Ren et al. comparison is based on the parametrization derived from thermal desorption temperatures. It seems that the authors could also have derived some volatility estimate from their own data, since it was collected using the FIGAERO. Why is this not presented for further comparison?
- Do the authors believe they mostly observed OOMs close to equilibrium partitioning? Would this still be the case during summer, when the changes in precursors and oxidants are presumably faster?
- On line 255 the authors state that they observed significant fluctuations in the diurnal variation of the observed G/P. In the referenced Figure 2, the concentrations do not appear to vary much relative to the mean, and the diurnal patterns look mostly random. I do not understand what the authors mean by this statement.
- Since the diurnal variations of temperature and organic aerosol concentration are highlighted as the only factors influencing the diurnal variation of equilibrium G/P, I would like to see time series and/or diurnal plots of these parameters.
- The authors claim that the importance of pH is due to enhanced partitioning of acidic OOMs. This argument relies on the assumption that the partitioning ratio is close to equilibrium for a significant fraction of the observations, and that the observed G/P is mostly determined by factors shifting the equilibrium partitioning, which has not been shown in the manuscript. Could there be other reasons, such as a common source for gas-phase OOMs and more acidic particulate matter, e.g. sulfuric acid?
- Line 338: “prohibited” should probably be changed to “inhibited” or another word.
- The authors hypothesize that elevated O3 leads to depletion of OOMs in the gas phase. Does O3 not also contribute to OOM formation? Can particle-phase OOMs not react with O3?
- The models identified RH, LWC, O3 and temperature as influential factors driving the deviation between observed and equilibrium G/P. Are these truly influential, or do they serve as proxies for the diurnal pattern present mostly in the equilibrium G/P?
- I suggest adding the term “Random forest” or “Machine learning” to the title, since the paper mostly focuses on using these methods to study the partitioning. There are also few definitive conclusions about the influencing factors and processes, with the main conclusion being the importance of particle-phase composition and processes.
- Results from the same measurement campaign have already been published by Wang et al. (2024, 10.1021/acsestair.4c00076). This is completely fine, but it would be good to mention this clearly in the manuscript.
- The authors use a large number of explanatory variables in their models. Many of these are correlated with, or even derived from, each other. Examples include the O/C ratio with the oxidation state of carbon, and RH, sulfate and potassium concentrations with LWC and pH from ISORROPIA II. This leads to unreasonable conclusions, such as that O:C and oxidation state have opposite effects on the G/P ratios. The authors should comment on problems with multicollinearity.
- Also, the supporting measurements (such as OC and aerosol composition) are poorly described in the methods.
- Add a reference for Eq. (4).
- Data availability: I strongly recommend that the data from this study be made openly available.
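The multicollinearity concern raised above can be illustrated with a simple pairwise correlation screen. This is a minimal sketch of one common diagnostic, not a procedure taken from the manuscript. The carbon oxidation state is computed with the commonly used approximation OSc ≈ 2 O:C − H:C, and all input values, feature names and the 0.9 threshold are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for feature pairs with |r| >= threshold."""
    names = sorted(features)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(features[a], features[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged


# Hypothetical molecular ratios; OSc is derived from them, so it is not independent.
o_to_c = [0.3, 0.5, 0.7, 0.9, 1.1]
h_to_c = [1.6, 1.5, 1.5, 1.4, 1.3]
features = {
    "O:C": o_to_c,
    "OSc": [2 * o - h for o, h in zip(o_to_c, h_to_c)],
    "temp": [4.0, 9.0, 2.0, 7.0, 5.0],
}
for a, b, r in correlated_pairs(features):
    print(a, b, round(r, 3))
```

In this toy input only the O:C and OSc pair is flagged. When two features carry nearly the same information, a tree-based model can split importance between them arbitrarily, so opposite-signed attributions for such a pair are not necessarily physically meaningful.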
Citation: https://doi.org/10.5194/egusphere-2025-229-RC3