the Creative Commons Attribution 4.0 License.
Development and implementation of SOMA: A Secondary Organic Module for Aerosol integration in high-resolution air quality simulations
Abstract. Secondary Organic Aerosols (SOAs) are formed following oxidation of Volatile Organic Compounds (VOCs) in the atmosphere and have a significant contribution to fine particulate matter concentrations. Understanding SOA formation is crucial, particularly in urban environments, where various emission sources contribute across different time scales. To decipher SOA formation dynamics, this study introduces SOMA (Secondary Organic Module for Aerosol) embedded in air quality modelling. SOMA considers VOC oxidation with OH using species concentrations, exposure duration, NOx levels and SOA yields as inputs, the latter obtained from the GECKO-A model. A total of 113 experiments are gathered from literature, involving four different VOC species (α-pinene, isoprene, limonene, and toluene), to produce correction factors depending on ozone (O3) levels, relative humidity (RH), and temperature (T). SOMA was linked to CFD modelling and was used to characterise the dispersion of toluene SOA emissions from traffic in a heavily trafficked area in Augsburg, Germany. The dispersion model was used to simulate pollutant recirculation in the examined area using a novel approach by combining both local road traffic emissions and background sources. SOA formation from toluene was examined over a 12-h period. The results indicated that background SOA constituted 21–53 % of the identified SOA mass. After 7 hours, the influence of background SOA on modelled concentrations became negligible due to precursor consumption and dilution. The combination of high-resolution pollution maps generated by CFD and atmospheric chemistry involving SOA formation enhances the air quality modelling capabilities and can provide valuable information to the scientific community.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-193', Anonymous Referee #1, 29 Jun 2025
- RC2: 'Comment on egusphere-2025-193', Anonymous Referee #2, 31 Jul 2025
This paper presents a novel model named SOMA, which represents Secondary Organic Aerosol (SOA) formation in a way that the authors claim is suitable for use in complex atmospheric chemistry models and capable of including correction factors for various inputs that can influence SOA formation, including ozone levels, relative humidity, and temperature. The authors have so far created a version of this model capable of representing four key SOA-forming compounds based on laboratory experiments: alpha-pinene, isoprene, limonene, and toluene. The authors then apply this model in a case study to characterize toluene SOA production from traffic sources in Augsburg, Germany, using a recirculating model framework (essentially a periodic boundary condition) which enables them to represent their system and boundary conditions with a single model. Using this framework, the authors are able to conclude (for a simplified system of toluene aerosols) that background sources contribute 21-53% of SOA over the course of the day, and that the influence of any given background signal degrades to nothing over the course of 7 hours due to precursor consumption and dilution.
The paper advances two interesting ideas, and deserves consideration. However, it also needs significant revisions of both text and ideas before it should be published.
Main Points
SOMA vs VBS
The authors make a strong claim that a main advantage of their model vis a vis a more traditional volatility basis set (VBS) framework lies in the ability of SOMA to simulate complex chemistry, while the main advantage SOMA has over more complex semi-explicit chemical mechanism models lies in its reduced computational costs.
I do not entirely agree with the authors’ framing here, specifically regarding the comparison of SOMA and VBS-type techniques, and in fact I would foreground an entirely different strength of their model. It does not seem to me that there is any process represented in SOMA that could not be represented with a sufficiently complex VBS framework. The authors highlight the role of ozone levels, relative humidity, and temperature, but all of these could, in principle, be represented in VBS frameworks as they already exist. In my view, the main advantages of this model over the VBS framework are:
- the simplicity of the calculations
- Where a VBS might require tracers representing 3-6 volatility bins per SOA precursor, SOMA applies parameterized gamma and lambda terms to calculate an overall yield. As the authors move beyond 4 SOA precursors, the enormous advantages of this construction over the VBS framework will become even more obvious.
- the transparency of the assumptions and
- the interpretability of the intermediate steps.
- SOMA generates an SOA yield term! This is immediately comprehensible to any experimentalist and can be readily understood by any reader or other modeler in a manner that the complex partitioning theory of a VBS makes challenging.
- Since this is how measurements of SOA formation are taken *in a lab* in any case, comparing model input with new experimental data does not require a complex process of fitting volatility bin yields to experimental observations in order to back out the desired OA yield.
These are not minor advantages, and I encourage the authors to highlight them!
Additionally, SOMA has one drawback when compared to a VBS framework: it assumes implicitly that the partitioning of low-volatility gases into the particle phase is unidirectional and irreversible. The authors do not discuss this assumption, and they may wish to do so.
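To make the contrast above concrete, the following is a minimal sketch of how a VBS-style calculation arrives at an overall yield, assuming the standard equilibrium partitioning expression in which the particle-phase fraction of a bin with saturation concentration C* is 1 / (1 + C*/C_OA). The bin yields and saturation concentrations are hypothetical placeholders, not values from the manuscript or from any published parameterization, and the function names are invented for illustration.

```python
def particle_fraction(c_star, c_oa):
    """Equilibrium particle-phase fraction of one volatility bin.

    c_star: saturation concentration of the bin (ug m-3)
    c_oa:   organic aerosol mass available for partitioning (ug m-3, > 0)
    """
    return 1.0 / (1.0 + c_star / c_oa)


def vbs_overall_yield(bin_yields, c_stars, c_oa):
    """Overall SOA mass yield: sum of bin yields weighted by particle fraction."""
    return sum(a * particle_fraction(cs, c_oa)
               for a, cs in zip(bin_yields, c_stars))


# Hypothetical four-bin product distribution (illustrative values only).
BIN_YIELDS = [0.05, 0.10, 0.20, 0.35]
C_STARS = [1.0, 10.0, 100.0, 1000.0]  # ug m-3

for c_oa in (1.0, 10.0, 100.0):
    y = vbs_overall_yield(BIN_YIELDS, C_STARS, c_oa)
    print(f"C_OA = {c_oa:6.1f} ug m-3 -> overall yield = {y:.3f}")
```

In this sketch the overall yield rises with C_OA because partitioning is reversible. A single yield term does not carry that dependence unless it is parameterized explicitly, which is one way to read the irreversibility assumption noted above.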
The Recirculation Approach
The authors repeatedly stress the ‘innovation’ and ‘novelty’ of their recirculation modeling approach (though the distinctions between it and the cyclic/periodic boundary condition options available in such models as OpenFOAM elude me somewhat), but they do not sufficiently expand on *why* the approach might be specifically valuable, nor what might make it applicable elsewhere. Section 4.2.2 dispenses with this explanation in a single phrase: “traffic-related emissions can be considered representative of the surrounding zones, given the study area’s central location within the city of Augsburg”. They do not expand further on why this approach might be advantageous. This atmospheric modeler recognizes that the recirculation approach that the authors have used here greatly simplifies the treatment of what would otherwise have to be independently-generated boundary condition inputs. In fact, this fine point of modeling concept is key to understanding the authors’ designation of F/BG/BGF-SOA in section 4.2.2 (see my confusion regarding the discussion from lines 392-403), but is not spelled out in the paper.
Additionally, given that the authors identify a 7-hour window over which the background toluene SOA formation might contribute to observed concentrations, there might be a useful exercise regarding how large a city airshed would have to be for this recirculation approach to be representative, or alternately a quantification of how much recirculation is present in the true atmosphere over a city in question, in order to determine whether the recirculation model might be appropriate for use.
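One rough way to frame the airshed-size question is to convert the 7-hour window into an advection distance, assuming a constant mean wind speed and straight-line transport. The wind speeds below are illustrative assumptions, not values taken from the manuscript, and the function name is hypothetical.

```python
def advection_distance_km(wind_speed_m_s, hours):
    """Distance travelled by an air parcel at constant wind speed, in km."""
    return wind_speed_m_s * hours * 3600.0 / 1000.0


WINDOW_HOURS = 7.0  # window over which background SOA remains relevant

for wind_speed in (1.0, 2.0, 5.0):  # m/s, assumed values
    distance = advection_distance_km(wind_speed, WINDOW_HOURS)
    print(f"{wind_speed:.1f} m/s over {WINDOW_HOURS:.0f} h -> {distance:.1f} km")
```

Under these assumptions an air parcel travels tens of kilometres within the window, a length that can be compared with the extent of the city and of the modelled domain when judging whether recirculation is a reasonable stand-in for the true upwind conditions.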
Specific Comments
Line 31-32: citing ‘Saiz-Lopez et al., 2017’ for the statement that ‘hydroxyl radicals have a great influence on SOA production as they are the most potent oxidant in the atmosphere’ seems like an odd choice. This is neither a new nor particularly controversial statement – why include this citation? If the authors *do* want to include a cite here, Calvert et al. (2002) might be an appropriate citation, at least for aromatic hydrocarbons such as toluene.
Calvert, J. G., Atkinson, R., Becker, K. H., Kamens, R. M., Seinfeld, J. H., Wallington, T. J., and Yarwood, G.: The Mechanisms of Atmospheric Oxidation of Aromatic Hydrocarbons, Oxford University Press, New York, 556 pp., 2002.
Lines 44-45: citing two papers from 2014 and 2017 for the source of ‘oxidation chambers’ seems like an odd choice, especially since the paper also cites earlier work on the same topic (e.g., both of the Ng et al. 2007 papers cited). The authors should re-examine these citations and consider more foundational choices, unless there is a specific reason why these two studies are being cited. Similarly, for the OFR reactions, the original citations should probably be those of Kang et al. (2007) and Lambe et al. (2011).
Kang, E.; Root, M. J.; Toohey, D. W.; Brune, W. H. Introducing the Concept of Potential Aerosol Mass (PAM). Atmos. Chem. Phys. 2007, 7, 5727–5744.
Lambe, A. T.; Ahern, A. T.; Williams, L. R.; Slowik, J. G.; Wong, J. P. S.; Abbatt, J. P. D.; Brune, W. H.; Ng, N. L.; Wright, J. P.; Croasdale, D. R. Characterization of Aerosol Photooxidation Flow Reactors: Heterogeneous Oxidation, Secondary Organic Aerosol Formation and Cloud Condensation Nuclei Activity Measurements. Atmos. Meas. Tech. 2011, 4, 445–461.
Line 63: ‘the VBS approach does not include detailed chemical reactions’ – I do not entirely agree with this statement, and I think the authors should refine the claim they are trying to make here. The VBS necessitates simplification of chemistry *on the basis of volatility*, and I think this is what the authors mean? But this does not precisely preclude the inclusion of detailed chemical reactions, it simply abstracts them behind the layer of partitioning theory and volatility bins.
Line 68: This is why the prior statement matters. This reviewer is familiar with VBS frameworks capable of considering NOx levels, SOA yields, oxidation duration, and OH concentrations. Doing all of these things in a single VBS comes with significant drawbacks to computational efficiency, as it necessitates a large number of reactions representing different yields of individual volatility bins and additional reactions passing mass from one volatility bin to another.
Line 80: “No study has yet modelled and quantified the contribution to local SOA mass over various time scales in urban environments, highlighting a gap in current research” is a bold claim to make. The authors should reconsider what precisely they mean to say here, as I do not think that the statement is accurate as written.
Figure 1 and discussion: how should readers interpret these GECKO-corrected yields in the context of their use in SOMA, given that they change over time while SOMA gives a single yield term, albeit corrected for temperature, ozone, and RH? Also, is there a logic behind the color choices here?
Line 123-124: Ng et al. 2007 citation here is getting treated oddly (Ng, Kroll et al. 2007) since it is one of two Ng et al. 2007 cites in this paper. As someone who has fallen victim to this exact problem with precisely these two papers, I have sympathy. I think the preferred style is (Ng et al. 2007a; Ng et al. 2007b), but the authors may want to double check.
Figure 4: I appreciate this figure! It’s a helpful mental map. One question: why is limonene shaded a completely different color from the other compounds?
Line 285/Section 3.4: Does this model contemplate the oxidation of species which are primarily oxidized by O3 rather than OH? It’s possible this question is outside of the scope of this paper, but it would become relevant if models like this one became more widely used.
Lines 300-320: it is not at all clear to this reader *how* the authors managed to account for temperature dependence based on their source experiments. As both the text and Figure 6 make clear, the ratio of Texp/Tgecko was essentially 1 for all cases. Despite this, the authors claim to be able to account for temperature dependence both in their introduction/abstract and explicitly in Equation (10). How this was achieved is unclear. Is the answer just: ‘in principle SOMA could do this assuming the relevant experiments had been performed, but in the case of toluene no such experiments exist’? If so, isn’t the inclusion of a temperature correction at all potentially misleading? What happens if you use this temperature relationship (derived across the range of temperatures from 25.8-26.9 C) at some atmospherically relevant temperature outside of the range of the parameterization?
Figure 7: This figure is compelling but could be considerably clearer. For one thing, the labels in the caption and on the figure do not correspond to one another. The authors should make clear that ‘original model’ (I believe this is uncorrected GECKO-A output, currently in orange) and ‘corrected’ (I believe this to be the SOMA output, currently in green) are model outputs, and that purple is the experimental output and *NOT* from any model. This could easily be achieved with 1) a better caption and 2) a clear connection between the two modeled outputs, perhaps as shades of the same color, or the same color with different textures applied. Additionally, the y-axis label is unnecessarily unclear.
Figure 8: I think this figure could be shifted to the supplemental materials – it is unclear to me that it is much more relevant than the GECKO-A output in Figure 5 as it only summarizes the effect of the inputs already in GECKO-A, rather than the effects of the new factors which SOMA can account for (RH, T, O3). If it is to be kept in the paper, the Y-Axes should be relabeled to be in percent, as the current y-axis is opaque. Also, the label on the figure ‘SOA toluene’ does not match with the caption ‘toluene-to-SOA’ formation. Neither label makes much sense, and both should be re-thought.
Line 349: delete the ‘and’ in this line to make the claim clearer: ‘toluene is …the only traffic-linked SOA precursor for which … measurements were available’.
Line 353: how good is this assumption of a uniform background concentration within the modelled domain given the atmospheric lifetime of toluene?
Lines 363-365: These sentences seem backwards to me, as I understand them. The authors identified the toluene attributable to traffic activity *BY* subtracting out the background toluene, which left them with a toluene concentration they could compare to model output. Is that correct? I’m left slightly uncertain, because as written this is confusing.
I think my prior comment relates to my confusion with Figure 9: both the design of this figure and its discussion should make clearer what, precisely, is being compared to what here. As I understand it, the black line is meant to show the total measurement, and the green line is the measurement at KP-LFU. So far so good. But now, what is being represented in the orange and green bars? These are model outputs with background toluene and the traffic-toluene represented. So, should *only* the green bars be compared with the green line? Should the Green+Orange be compared with the black line? The conclusion reached, “The agreement with measured toluene concentrations suggests that the model effectively replicates the dispersion of traffic-related toluene within the urban environment”, suggests that the comparison ought instead to be between the green line and green bars. But if this is the case, why even show the orange bars in the figure? The more I read this section, the more confused I get.
Line 383: The authors should include a very brief discussion of how their choice of OH concentration compares to other urban environmental studies or observations. I know how that [OH] lines up but the average reader may or may not.
Lines 392-400: the paragraph and the numbered list are repetitive and could be condensed down to just the list with a few additions. Additionally, couldn’t we also consider a fourth category of ‘Freshly formed SOA resulting from VOC emissions produced within the domain during an hour prior to the hour modeled’? From the discussion further down on page 20, it appears that this gets added into BGF-SOA – is that correct? On re-reading the paper, I think this choice is because of the recirculation assumptions, but the authors might consider making that explicit.
Line 405: something is odd with the formatting of equation (11). Specifically, the dash (e.g., in F-SOA) has been converted to an em-dash, which makes it look like a minus sign. This should be edited for clarity.
Line 419-421: Equations 12-14 are also oddly formatted and should be edited. If necessary, the authors should work with journal editors to get the effect they want. The authors currently appear to want to use two separate subscripts (e.g. VOC with subscripts [domain] and [t-1]). The result is an equation that is difficult to parse and confusing to look at. In equation (14), for example, there are two different lengths of ‘minus’ signs, which are very confusing. The same issues continue to plague the paper in equations 15-16.
Line 446: once again, the authors have chosen some odd citations (why cite Wang et al. 2021?) for the obvious point that ‘photochemistry occurs during periods of sunlight’. The entire sentence beginning with ‘SOA is mostly formed during periods with UV radiation’ is unnecessary given the prior sentence and the discussion of this idea on lines 32-33 of the paper.
Figure 11: I think this is an effective figure, and it is sufficient to make the point. However, because there are so many colors, it can be challenging to intercompare between pies to figure out ‘how long does it take for one of these times to stop mattering’. If the authors wanted to rethink this plot, they could use line-and-area plots instead, so that you can follow a single hour along the x-axis until it disappears. However, this might lose the ‘local’ vs ‘background’ distinction, so if that is what the authors prefer to highlight, the figure is ok as-is.
Additionally, some house-keeping on this figure: If the yellow pie slice in the ’14:00-15:00’ plot is striped, I cannot see it on my copy of the paper, so this color may need to be adjusted. Also, I think that in the caption, the authors meant to write ‘striped slices’ rather than ‘stripped slices’.
Figure 12: I think this figure would be more intuitive to read if the x-axis were reversed, with 0 on the right. Additionally, the y-axis label and figure caption should be harmonized rather than using different terminology between the two, and the x-axis label should be more descriptive. If the x-axis is reversed, it might be easy for the x-label to read ‘hours before modeled time period’.
Line 490: ‘making these conclusions broadly representative’ but you just said they weren’t representative? Perhaps the authors mean ‘relevant’ here rather than ‘representative’?
Lines 525-526: The authors close with a forceful statement “Since SOAs cannot be directly linked to specific sources and are not currently regulated, this study is crucial for uncovering their formation processes and advancing knowledge in this area”. This statement, while correct in its general gist, contains two strong claims, neither of which I agree with entirely. The authors may want to reconsider what exactly they mean here. SOA cannot *always* be linked to specific sources, but the use of Aerosol Mass Spectrometers allows for a certain amount of source fingerprinting (e.g., pyrogenic vs biogenic vs cooking vs hydrocarbon-like organic aerosols), so it’s an overstatement to say categorically that SOA ‘cannot be directly linked to specific sources’. Additionally, are SOA not currently regulated at all? Air quality regulations often target a reduction in PM (of which SOA is a component) and control of specific VOC emissions (many of which are SOA precursors). I think it is correct to say that regulating SOA is difficult because the system is so complex and poorly understood. For this precise reason, identifying the precise effects of a known SOA precursor such as toluene is valuable! Similarly, identifying the effects of ‘natural’ SOA formation gives policymakers valuable information in terms of understanding what ‘background’ SOA concentrations would look like. All of this is doubly true if different types of SOA have different health effects, which is certainly a possibility we as a field have not yet ruled out.
Citation: https://doi.org/10.5194/egusphere-2025-193-RC2
Ioannidis et al. develop parameterizations for SOA mass yields using GECKO and laboratory data and combine those parameterizations with a CFD model to examine the spatial and source patterns of SOA in a targeted urban area. SOA is a complex pollutant with a multitude of sources and pathways. It contributes to PM1 significantly and there is a need to develop approaches that can simulate SOA formation in atmospheric models. Hence, the paper’s need is justified. However, the paper’s motivation (described in the introduction) is adequate but not strong. I also found the methods inadequate and confusing. The use of CFD seems novel but I contend that the use case to study SOA is not a good fit. Overall, the paper is weak, and I do not recommend publication of this paper in ACP. The comments below outline my most important objections, and I would strongly recommend the authors consider those as they think of novel applications for their CFD modeling.
Major comments:
1. Even for highly reactive VOCs, the timescales for SOA production at ambient concentrations of OH and O3 are on the order of hours (even longer for species like toluene, which has an e-folding lifetime of over 1.5 days). The motivation for using CFD in ‘street canyons’ to estimate SOA production, which is a more regional process, is not strong. I remain highly skeptical of the primary finding that the SOA in a domain this small (1.8 km) is dominated by SOA from local sources within the domain. I suspect that CFD at this spatial scale would be much more useful in tracking the spatiotemporal evolution of primary particles and gases, where transport and dilution are much more relevant than chemistry. In my opinion, to get at airshed-level burdens of SOA, 0D box models (e.g., Hayes et al., ACP, 2015) and high-resolution chemical transport models (e.g., Pennington et al., ACP, 2021) are likely better tools to model the formation, evolution, and properties of SOA.
2. Figure 1: Details about the initial VOC, OH/O3 concentration, OA mass loading are all missing. Why are results shared in the methods section?
3. Sections 3.2-3.4: I don’t understand the rationale for these methods. I see several glaring problems. First, toluene SOA has been widely studied and there are tens (if not hundreds) of studies that have documented SOA mass yields. While there isn’t an expectation to include every published study on toluene SOA, what is expected is a rationale for why these (i.e., Deng and Chen) were picked and how they are representative of the broader consensus about toluene SOA mass yields. Second, it is unclear why GECKO wasn’t directly run for the same experimental conditions for initial VOC, NOx, RH, and T. Depending on the model-measurement performance, a case could have been made for parameterizing SOA_exp/SOA_org to differences in O3. Regardless, I would still be skeptical of this parametrization as no attempt was made to mechanistically explain why the model underestimates the measurements so substantially. Third, SOA is expected to be a strong function of the OA/SOA mass loading (depending on whether an organic seed was used to aid SOA condensation) and hence any model-measurement difference in yield is likely to also be a function of the SOA mass concentration (which in itself is the primary output that is being used to compute the yield). This non-linearity is probably the most difficult to resolve.
4. Section 4.1: I do not agree with how the model was set up for background/boundary values of toluene. I don’t see how a uniform background toluene assumption is justifiable given that the concentrations inside the modeled domain vary spatially. The modeled domain is identical to the regions surrounding it so if toluene varies inside the domain, that fact should also hold for regions immediately surrounding the domain.
5. The strong suit of this work is the CFD modeling and what can be learned from it. The development of methods to estimate SOA mass yields is clunky at best. This work’s Achilles heel is GECKO, which is great at gaining a fundamental understanding of multi-generational SOA chemistry and not a good fit for predicting SOA mass yields. Why not use published VBS parameterizations with NOx dependence to get at SOA mass yields directly? They may not be perfect but would work better than GECKO and allow the authors to focus on the CFD insights.
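The timescale argument in major comment 1 can be checked with a back-of-the-envelope calculation. The sketch below assumes a toluene + OH rate coefficient of about 5.6e-12 cm3 molecule-1 s-1 near room temperature, a daytime OH concentration of 1e6 molecules cm-3, and a mean wind speed of 2 m/s across the 1.8 km domain. All three are illustrative assumptions rather than values from the manuscript, and the function names are hypothetical.

```python
import math

K_OH_TOLUENE = 5.6e-12    # cm3 molecule-1 s-1, assumed approximate value
OH_CONC = 1.0e6           # molecules cm-3, assumed daytime level
DOMAIN_LENGTH_M = 1800.0  # domain size quoted in the comment above
WIND_SPEED_M_S = 2.0      # assumed mean wind speed


def efolding_lifetime_s(rate_coefficient, oxidant_conc):
    """e-folding lifetime against a pseudo-first-order loss: tau = 1 / (k [OH])."""
    return 1.0 / (rate_coefficient * oxidant_conc)


def fraction_reacted(transit_time_s, lifetime_s):
    """Fraction of precursor consumed during one transit: 1 - exp(-t / tau)."""
    return 1.0 - math.exp(-transit_time_s / lifetime_s)


lifetime_s = efolding_lifetime_s(K_OH_TOLUENE, OH_CONC)
transit_s = DOMAIN_LENGTH_M / WIND_SPEED_M_S
reacted = fraction_reacted(transit_s, lifetime_s)

print(f"lifetime: {lifetime_s / 86400.0:.2f} days")
print(f"transit time: {transit_s / 60.0:.0f} min")
print(f"fraction reacted in one transit: {100.0 * reacted:.2f} %")
```

With these inputs the lifetime is about two days, while a single pass through the domain takes minutes, so well under 1 % of the toluene reacts in one transit. A different OH level or wind speed changes the numbers, but not by the orders of magnitude needed to close the gap.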
Minor comments:
Line 32: I haven’t seen ‘OHs’ written in plural.
Line 111: Provide references for the NOx levels used to generate GECKO output. Same goes for the choice of RH and T.