Causal Analysis of Aerosol Impacts on Isolated Deep Convection: Findings from TRACER

Wang, Dié; Kobrosly, Roni; Zhang, Tao; Subba, Tamanna; van den Heever, Susan; Gupta, Siddhant; Jensen, Michael

doi:https://doi.org/10.5194/egusphere-2024-2436

Preprints

14 Aug 2024

Abstract. This study employs a novel application of causal machine learning, specifically g-computation, to quantify aerosol effects on deep convective clouds (DCCs). Focusing on isolated DCCs in the Houston-Galveston region, we leverage comprehensive ground-based observations from the TRacking Aerosol Convection interactions ExpeRiment (TRACER) to estimate aerosol influences on convective core depth, intensity, and area. Our results reveal that greater aerosol number concentrations generally have a limited impact on convective core echo top height (ETH), with an increase of about 1 km (13 % of average ETH). This effect is observed under specific conditions, particularly when ultrafine particles are activated in updraft regions. Additionally, greater aerosol levels correspond to increased convective core intensity and area, though these changes remain within radar measurement uncertainties. In DCCs associated with sea breezes, aerosol effects are more pronounced, resulting in a 1.4 km deepening of ETH. However, this heightened effect could be attributed to the exclusion of key confounders such as boundary layer updrafts in the causal model. This study pioneers the application of causal machine learning to explore aerosol-convection interactions, shedding light on unraveling complex interplay between aerosols and meteorological variables.

Received: 01 Aug 2024 – Discussion started: 14 Aug 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

26 Aug 2025

Aerosol impacts on isolated deep convection: findings from TRACER

Dié Wang, Roni Kobrosly, Tao Zhang, Tamanna Subba, Susan van den Heever, Siddhant Gupta, and Michael Jensen

Atmos. Chem. Phys., 25, 9295–9314, https://doi.org/10.5194/acp-25-9295-2025, 2025

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2436', Toshi Matsui, 30 Aug 2024

Summary.
This study used novel causal machine learning techniques to statistically quantify the impact of aerosols on tracked isolated deep convection observed by NEXRAD. The authors argue that the new machine-learning technique can separate the various meteorological parameters to isolate the relationship between aerosols and deep convection. The results indicate that increases in background aerosols are associated with a 1.4 km deepening of the echo-top height of deep convection. Overall, this paper is well-written, and the approach seems appropriate. However, I suggest some modifications in writing and explanation, which can improve the readability. These are described in the major comments, but are not critical. Hence, my recommendation is a "minor revision".

Major Comments.
1) Readability: The new statistical approach is meticulously written, but it is often hard to read due to various statistical jargon that is unfamiliar to atmospheric scientists like me. Can you clearly define these terms at the beginning? For example, define them in a table like this:
Confounder: a variable that affects both the dependent and independent variables in a study, causing an association that may not be accurate. (parameters include ….)
Exposures: Any factor that may be associated with an outcome of interest. (parameters include ….)
Probably these terms are common in epidemiology, but not in atmospheric science.
2) New and traditional approach: At the end of the manuscript, the authors make a quite significant statement: “Nevertheless, this study pioneers the use……….. scientific questions”. To be honest, I still wonder why this new method is so novel compared to the previous approach, because there is no comparison between the new and traditional statistical approaches. For example, here is one of the earliest aerosol-deep convection manuscripts.
Lin, J. C., Matsui, T., Pielke, R. A., & Kummerow, C. (2006). Effects of biomass-burning-derived aerosols on precipitation and clouds in the Amazon Basin: A satellite-based empirical study. Journal of Geophysical Research: Atmospheres, 111(D19). https://doi.org/10.1029/2005JD006884
In this paper, DCC properties (precipitation, cloud top height, and cloud fraction) are related to aerosol optical depth for a given meteorological parameter (cloud work function in that study). Can you compare your novel approach with this traditional approach (simple statistics stratified by meteorological parameters)? Do you think the old approach leads to significant biases in understanding the aerosol-DCC relationship? Can you prove or briefly explain?
3) Potential biases in radar-based approach: The authors use thresholds on NEXRAD radar parameters to define DCCs. However, if a DCC has a much smaller amount of raindrops due to a large number of background aerosols, the cell may not be counted as a DCC, because larger concentrations of small droplets won’t increase S-band reflectivity. Alternatively, if you use cloud optical depth and top height, the DCC sampling can include such cells. This is a NEXRAD-based cell tracking approach, so you cannot change your approach. However, it is important to discuss potential sampling biases from using the NEXRAD radar.

Minor Comments.
Line 87: Please remove parenthesis “(either invi….. )”.
Lines 120-121: “exclude the presence of shallow convection” sounds like removing the sampling during shallow stages. So I suggest rewriting it as “exclude the shallow convection cells”.
Line 179: Please define the threshold of diameters of “ultrafine aerosols”.
Line 274: “buoyancy-driven DCCs”. Well, all DCCs are driven by the buoyancy over the flat terrain. So you may re-write this as “locally driven DCCs”.
Line 294: “30-dBZ ETH/15-dBZ ETH” should be “30-dBZ ETH and 15-dBZ ETH”.
Lines 303-306: We won’t be able to measure supersaturation directly within the convective storms. However, you can infer the supersaturation required to activate all aerosols (including ultrafine). For this case, can you describe roughly how much supersaturation is required to support your argument?
Fig. 4: Why is there no correlation between thermodynamics and Nccn? It seems to be more important?
Line 547: “30-dBZ ETH/15-dBZ ETH is 1.1 km/1.0 km,” should be “30-dBZ ETH and 15-dBZ ETH are 1.1 km and 1.0 km, respectively.”

Citation: https://doi.org/10.5194/egusphere-2024-2436-RC1
- AC1: 'Reply on RC1', Dié Wang, 07 Dec 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2436/egusphere-2024-2436-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2436-AC1
RC2:
'Comment on egusphere-2024-2436', Anonymous Referee #2, 28 Sep 2024
Overview
This study tracks isolated convective cells during conditions of large-scale subsidence using NEXRAD radar reflectivity within different distances of 20 to 50 km of the TRACER primary observing site. Per 4 to 6-hourly sounding, the average of cell maximum radar echo top heights (ETHs) is used to derive a multiple linear regression where predictors consist of an aerosol concentration variable and 2 sounding-derived meteorological variables, with different variables tested and chosen based on their correlation with ETHs. This regression is then used to keep confounding meteorological predictors constant and define aerosol concentration predictors as separately polluted and clean to compute a change in ETH, which is then called the causal effect from aerosol concentration. The increase in ETH between high and low ultrafine aerosol concentrations is approximately 1 km. In sea breeze conditions, this increases to 1.4 km. CCN variables with supersaturation up to 1% do not have any robust relationships with ETHs, but CN concentrations do, which is attributed to ultrafine aerosols activating in updraft regions and creating condensational invigoration. Convective core maximum reflectivity also increases by about 2 dBZ moving from low to high aerosol concentrations, but this result may not be robust given radar reflectivity uncertainty.
There are some nice analyses and discussion in this study including sensitivity tests and some caveats that provide important context. However, I have several major concerns with how the analyses are interpreted and some of the conclusions that are drawn.
Major Comments
Non-invigoration aerosol-DCC interactions that could affect aerosol-ETH relationships are ignored. Aerosol-DCC interactions include direct effects on microphysics in addition to indirect effects on updraft strength. The paragraph starting on line 43 starts by referencing aerosol-DCC interactions in general but then the discussion that follows in the introduction focuses purely on updraft invigoration. This is problematic because aerosols can also directly affect microphysical properties (e.g., collision-coalescence, riming), which affects radar reflectivity and thus reflectivity echo top height. These direct effects may or may not be further associated with a change in updraft strength. To assume that updraft strength alone is the cause for changes in ETH assumes that changes in aerosols do not alter the reflectivity profile for a given cloud top. Furthermore, there is an assumption that the relationship between ETH and the true cloud top (the vertical gradient of reflectivity between the ETH and cloud top) does not change with changes in aerosols. It is not clear how valid those assumptions are. What evidence is there to suggest that ETH changes are primarily corresponding to changes in updraft strength?

The g-computation model does not provide the causal direction, which still needs to be assumed, even if it is called a causal inference model. This assumption is made in the multiple linear regression model where the predicted convective property is assumed to follow from the predictors. The reasoning for this is that the meteorological and aerosol properties are defined prior to the convective cell properties, which makes sense, but this is similar to what has been done in some prior studies. Furthermore, this time offset still doesn’t ensure the assumed causal direction because there is a lot of atmospheric complexity that isn’t being quantified that can affect the properties of the cells and atmosphere offset in space and time. Thus, describing this research as the first to show cause-effect is misleading. The methods do have unique aspects relative to past studies that can be highlighted but there is no reason to believe that the causal direction has been more discerned than in past studies.

It is not clear what value the g-computation model provides over the multiple linear regression. If the underlying model were a more complex nonlinear model, there would be some justification for it, but multiple linear regression is used. The multiple linear regression coefficients can be used to describe convective sensitivity to aerosols, giving the same results. Even with using the g-computation model, describing an aerosol effect as just the change in ETH without the corresponding change in aerosols, as is done throughout the paper, doesn’t make much sense. It is the sensitivity, i.e., the change in ETH per change in aerosol concentration, that is most relevant with the underlying assumption that this is approximately linear, and this is simply the slope for the aerosol concentration predictor from the multiple linear regression model. What does the g-computation model provide that the regression cannot other than calling the model “causal machine learning”?
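
The equivalence raised in this comment can be illustrated with a minimal sketch. It uses synthetic data and a hand-written least-squares fit, not the authors' pipeline; the variable names (`aerosol`, `cape`, `eth`), the coefficients, and the "polluted"/"clean" exposure levels of +1 and -1 are all assumptions made for illustration. With a purely linear outcome model, the g-computation contrast (average predicted outcome with the exposure set high minus set low, confounders left as observed) reduces to the regression slope times the exposure difference.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def g_computation_effect(beta, rows, exposure_index, high, low):
    """Mean predicted outcome with exposure set to `high` minus set to `low`."""
    diffs = []
    for r in rows:
        r_high = list(r)
        r_low = list(r)
        r_high[exposure_index] = high
        r_low[exposure_index] = low
        diffs.append(predict(beta, r_high) - predict(beta, r_low))
    return sum(diffs) / len(diffs)

# Synthetic, standardized data (illustrative only).
random.seed(0)
rows, eth = [], []
for _ in range(200):
    cape = random.gauss(0, 1)                   # confounder
    aerosol = 0.5 * cape + random.gauss(0, 1)   # exposure, correlated with confounder
    rows.append([aerosol, cape])
    eth.append(8.0 + 0.4 * aerosol + 0.9 * cape + random.gauss(0, 0.5))

beta = fit_ols(rows, eth)
effect = g_computation_effect(beta, rows, exposure_index=0, high=1.0, low=-1.0)
slope_times_delta = beta[1] * (1.0 - (-1.0))
print(effect, slope_times_delta)  # the two agree to rounding error
```

The two quantities would diverge only if the outcome model contained nonlinear terms or exposure-confounder interactions, which is the case in which the comment grants that g-computation would add something.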

Tests for multiple linear regression model accuracy and robustness are missing. For example, the predictor coefficients should have 95% confidence intervals computed. In addition, how well does the MLR predict the observed ETHs? What is its r² value? The r² is important as it shows how much of the ETH variance remains unexplained by the model, which is relevant for missing information that could still confound the relationships of ETH with the current predictors.
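
A minimal sketch of the diagnostics requested here, on synthetic data, is given below. The predictor names, the true coefficients, and the sample size are assumptions for illustration, and the interval uses a normal quantile as a large-sample stand-in for the Student t quantile, so it would be slightly too narrow for small samples.

```python
import math
import random
from statistics import NormalDist

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (a is not modified)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols_summary(rows, y, level=0.95):
    """Return coefficients, approximate confidence intervals, and r-squared."""
    X = [[1.0] + list(r) for r in rows]
    n, p = len(X), len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * v for b, v in zip(beta, x)) for x, yi in zip(X, y)]
    sse = sum(e * e for e in residuals)
    y_mean = sum(y) / n
    sst = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - sse / sst
    sigma2 = sse / (n - p)                        # residual variance estimate
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # normal approximation to t quantile
    intervals = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        inv_col = solve(xtx, unit)                # column j of inverse(X'X)
        se = math.sqrt(sigma2 * inv_col[j])
        intervals.append((beta[j] - z * se, beta[j] + z * se))
    return beta, intervals, r_squared

# Synthetic standardized predictors: aerosol, CAPE, shear (illustrative only).
random.seed(2)
rows, eth = [], []
for _ in range(300):
    aerosol, cape, shear = (random.gauss(0, 1) for _ in range(3))
    rows.append([aerosol, cape, shear])
    eth.append(8.0 + 0.4 * aerosol + 0.9 * cape - 0.3 * shear + random.gauss(0, 0.5))

beta, intervals, r_squared = ols_summary(rows, eth)
print("aerosol slope:", beta[1], "approx. 95% CI:", intervals[1])
print("r-squared:", r_squared)
```

An interval for the aerosol coefficient that excludes zero, reported alongside the unexplained variance fraction (one minus r-squared), would address both parts of the request.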

The argument for activation of ultrafine aerosols in updrafts leading to increases in ETHs lacks evidence. Activation of ultrafine particles seems highly unlikely given the high concentrations of larger aerosols for most of the samples assessed (Figure 7). Activation of the ultrafine particles would result in cloud droplet concentrations of a few thousand per cm³. Are there aircraft measurements (e.g., during ESCAPE) to support such high drop concentrations? Assuming a favorable composition for nucleation, what would the supersaturation need to be to activate particles at a certain size (e.g., 10 nm) given observed aerosol size distributions? This could be assessed in a parcel model to show if the argument being made is even physically possible.
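
As a rough first check that stops well short of the parcel model suggested above, the equilibrium critical supersaturation for a given dry diameter can be estimated from the widely used single-parameter (kappa) Köhler approximation, ln S_c = sqrt(4A^3 / (27 κ D_d^3)) with A = 4 σ M_w / (R T ρ_w). The sketch below is not from the manuscript: the hygroscopicity value κ = 0.6 (a fairly hygroscopic, sulfate-like particle), the temperature, and the pure-water surface tension are assumptions, and the approximation is intended for κ above roughly 0.2.

```python
import math

SURFACE_TENSION = 0.072       # N/m, pure water (assumed)
MOLAR_MASS_WATER = 0.018015   # kg/mol
DENSITY_WATER = 997.0         # kg/m^3
GAS_CONSTANT = 8.314          # J/(mol K)

def critical_supersaturation(dry_diameter_m, kappa, temperature_k=298.15):
    """Critical supersaturation (as a fraction) from the kappa-Koehler approximation."""
    a = (4.0 * SURFACE_TENSION * MOLAR_MASS_WATER
         / (GAS_CONSTANT * temperature_k * DENSITY_WATER))
    ln_sc = math.sqrt(4.0 * a ** 3 / (27.0 * kappa * dry_diameter_m ** 3))
    return math.exp(ln_sc) - 1.0

# Hypothetical dry diameters spanning ultrafine to accumulation mode.
for diameter_nm in (10, 20, 50, 100):
    ss = critical_supersaturation(diameter_nm * 1e-9, kappa=0.6)
    print(f"{diameter_nm:4d} nm: critical supersaturation ~ {100 * ss:.2f} %")
```

Under these assumptions a 10 nm particle needs a supersaturation of several percent, far above the 1 % upper limit of the CCN measurements mentioned in the overview. Whether an updraft can sustain such values in the presence of the larger aerosols is what the parcel-model test would show.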

The diurnal cycle needs to be ruled out as a cause of the CN-ETH and UFP-ETH relationships. Over land, ultrafine aerosols often have a strong diurnal cycle just as deep convection does, which can affect relationships between the two. Accumulation mode aerosols often have a much weaker diurnal cycle, which is potentially a hypothesis for why one wouldn’t get robust CCN relationships but robust CN relationships with ETHs. For example, Fast et al. (2024) shows this for the CACTI campaign. This occurs because new particle formation processes over land operate during the daytime. What are the typical changes in ETH and predictor variables including CN and CCN over the diurnal cycle? Do CN and ETH variables both peak in later afternoon? If hour of day is controlled for, does that affect the aerosol-ETH relationships?
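
One simple way to control for hour of day is to subtract the mean for each hour from both variables and correlate the anomalies. The sketch below uses synthetic data in which CN and ETH both follow a diurnal cycle but have no direct link. The amplitudes, noise levels, and the 15:00 peak are assumptions for illustration, not values from TRACER.

```python
import math
import random
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def remove_hourly_mean(values, hours):
    """Subtract the mean of each hour-of-day bin from the values in that bin."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, h in zip(values, hours):
        sums[h] += v
        counts[h] += 1
    return [v - sums[h] / counts[h] for v, h in zip(values, hours)]

# Synthetic samples: CN and ETH share a diurnal cycle but are otherwise independent.
random.seed(1)
hours, cn, eth = [], [], []
for _ in range(600):
    h = random.randrange(24)
    cycle = math.cos(2.0 * math.pi * (h - 15) / 24.0)    # peaks at 15:00 local time
    hours.append(h)
    cn.append(5000.0 + 3000.0 * cycle + random.gauss(0, 800))   # cm^-3
    eth.append(8.0 + 1.5 * cycle + random.gauss(0, 0.8))        # km

raw_r = pearson(cn, eth)
adjusted_r = pearson(remove_hourly_mean(cn, hours), remove_hourly_mean(eth, hours))
print("raw correlation:", raw_r)
print("correlation after removing hourly means:", adjusted_r)
```

In this constructed case the raw correlation is strong and the hour-adjusted correlation is near zero. A CN-ETH relationship that survives the same adjustment in the observations would be harder to attribute to the shared diurnal cycle.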

Relevance of sounding convective parameters at M1 for some situations needs further inquiry. Convective parameters like CAPE are not stable for 4-6 hours over land, and the study (Prein et al., 2022) used to support this claim on line 209 does not state that so far as I can tell. That study uses a limit of 4 hours difference between observed and simulated MCSs to match them, and MCSs are not the same as isolated convective clouds in atmospheric sensitivities. Other studies such as Nelson et al. (2021) show large changes in low level moisture on distances < 50 km and times of ~1 hour over some land convective regions. The statement after this on lines 209-211 that the M1 site is not heavily affected by maritime conditions is also confusing because the M1 site is close to Galveston Bay, and as noted in the study, a bay breeze often forms. Perhaps the bay air mass is similar to the continental air mass in terms of aerosol and thermodynamic properties, but I’m not sure that can be assumed. It may not be possible to easily assess these caveats, but they should at least be highlighted. Something that could be looked into though is whether the M1 surface measurements are relevant to air feeding cells at nighttime and/or after the bay/sea breezes have passed inland of the M1 site by examining stability at and through the boundary layer up to approximate cloud base to assess the likelihood of coupling to M1 site surface conditions.

More information on the spatiotemporal distribution of cells and cell properties is needed. Because of potentially substantial gradients in aerosol and thermodynamic properties given the coastal and large urban area, it would be ideal to plot the initiation locations and/or locations where the cell ETHs are maximized on maps for different ranges from the M1 site rather than the tracks in Figure 1 that don’t provide much information. In addition, it would be helpful to map out cell properties like those in Figure 6 to see if there are spatial gradients in the properties with respect to the M1 site location.

Are ETH retrievals from level 2 NEXRAD data unbiased with range from the radar? Related to the previous comment, ETHs should be mapped with range from the radar to see if there are biases related to beam filling and gaps between elevation angles with range.
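
The range dependence behind this question can be sketched with the standard 4/3 effective Earth radius beam-height relation, h = sqrt(r^2 + a_e^2 + 2 r a_e sin θ) - a_e. The elevation angles below are hypothetical adjacent tilts chosen for illustration, not the scan strategy used during TRACER, and beam broadening is ignored.

```python
import math

EARTH_RADIUS_KM = 6371.0
EFFECTIVE_RADIUS_FACTOR = 4.0 / 3.0   # standard-refraction assumption

def beam_height_km(range_km, elevation_deg, antenna_height_km=0.0):
    """Beam centre height above the radar under the 4/3 Earth radius model."""
    a_e = EFFECTIVE_RADIUS_FACTOR * EARTH_RADIUS_KM
    theta = math.radians(elevation_deg)
    height = math.sqrt(range_km ** 2 + a_e ** 2
                       + 2.0 * range_km * a_e * math.sin(theta)) - a_e
    return height + antenna_height_km

def gap_between_tilts_km(range_km, lower_deg, upper_deg):
    """Vertical spacing between the centres of two adjacent elevation scans."""
    return beam_height_km(range_km, upper_deg) - beam_height_km(range_km, lower_deg)

# Hypothetical adjacent tilts; the spacing bounds how finely ETH can be resolved.
LOWER_TILT_DEG, UPPER_TILT_DEG = 3.1, 4.0
for r in (25, 50, 100, 150):
    gap = gap_between_tilts_km(r, LOWER_TILT_DEG, UPPER_TILT_DEG)
    print(f"range {r:3d} km: tilt spacing ~ {gap:.2f} km")
```

Because the spacing grows roughly linearly with range, an ETH difference of about 1 km can be comparable to the vertical sampling interval at longer ranges, which is why mapping ETH against range is a useful check.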

ACP recommends making processed data and code openly available in a FAIR-aligned reliable public repository to support study reproducibility. It is likely not possible to reproduce the methodology with only links to TINT and raw datasets given the information provided in the study.

Minor Comments
Line 7: Only a single model predicts a significant relationship between an aerosol concentration and convective core area, which is 0.8% CCN within 30 km of the M1 site (Figure 10). The other 31 models are not significant. That seems pretty random, particularly since some models switch sign with changes in range from M1, and not enough to support this statement in the abstract that greater aerosol levels correspond to increased convective core area.

Lines 31-33: This is an odd motivation since ERFaci uncertainty is currently mostly attributed to non-deep convective clouds that are not the focus of this study.

Discussion of leading invigoration mechanisms in introduction: Semi-direct effects by aerosols that alter atmospheric thermodynamic stability should also be included.

Lines 60-63: Some of the studies cited here are not simply questioning the importance of invigoration mechanisms relative to other forcings but showing that there is a spectrum of enervation to invigoration possible, thus suggesting that referring to the mechanisms only in terms of invigoration is misleading.

Lines 75-76: Though individual modeling studies have quantified aerosol effects, it is important to note that there is still disagreement between these studies, even in the sign of effects, because of differences in models and the methods for analyzing them (e.g., discussion in Varble et al., 2023).

It isn’t clear how updraft strength is being defined. Is this referring to updraft mass flux, average vertical wind speed, or maximum vertical wind speed?

Lines 124-128: Not tracking cells when max 2-km Z < 40 dBZ leaves out more than non-precipitating stages as suggested here. It also leaves out lightly precipitating periods.

For the meteorological variables, there is almost an unlimited number that could potentially be relevant and tested. Were different shear layers other than 0-5 km tested? Was mid-level RH tested (separate from the boundary layer)?

What assumptions are made for the lifted parcel calculations (LCL, LNB, CAPE)? Is liquid pseudoadiabatic or reversible ascent assumed?

Line 187: CCN at various supersaturations does not have a temporal resolution of 1 minute or less as stated here. The supersaturation is varied over the course of about an hour usually so there is 1 value at each supersaturation every ~hour or so.

Lines 194-195: A t-test may not be valid here if the aerosol distributions are skewed.

How are DCC tracking results averaged? Does each DCC have a single value for a variable like ETH and then all of the ETHs are averaged together?

Lines 234-235: I don’t follow the argument for why large-scale ascent needs to be avoided, though I can see why MCSs would want to be avoided. Is that the primary reason for avoiding certain large-scale meteorological conditions?

Lines 273-275: Mesoscale deep convective systems are still buoyancy driven, so I don’t understand what this sentence is trying to get across.

Figure 4: Why are values not filled in for the significant correlations less than 0.4? Also, I may have missed it, but are the aerosols in Figure 4 sampled around the same time as the soundings or are they sampled after the soundings?

In some places, LWS is used and in others, shear is used. It would be best to choose one or the other and be consistent throughout.

Line 342: Should “accuracy” be “robustness” here?

Lines 364-365: Including some critical meteorological quantities supports this assumption, but I wouldn’t say that it is necessarily sufficient. That is hard to know without an in-depth study of possible confounders.

Lines 388-389: I don’t follow the argument of multi-collinearity supporting standardization. Isn’t the reason for standardization stated on lines 390-392?

Lines 463-464: There is not enough evidence to make this statement that Ncn and Nupf are causing higher ETH via their activation.

Line 494: I disagree that a causal link was demonstrated. The only thing supporting cause is that the aerosols are sampled prior to cells in time, but there is no evidence to show the causal mechanisms, and there are potentially other confounders not accounted for (see major comments).

Lines 536-540: It’s true that uncertainty renders the max reflectivity results less robust, but the same argument can be made for how well 4-6 hourly soundings and aerosols at a single point represent conditions where cells are growing.

Lines 603-605: I think this sentence can be clarified. Aerosol is not robustly associated with DCC max ETH (not its evolution) given the sampling in this study. That does not mean that it couldn’t be if more samples were added.

References
Fast, J. D., Varble, A. C., Mei, F., Pekour, M., Tomlinson, J., Zelenyuk, A., Sedlacek III, A. J., Zawadowicz, M., and Emmons, L. K., 2024: Large Spatiotemporal Variability in Aerosol Properties over Central Argentina during the CACTI Field Campaign, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2024-1349.
Nelson, T. C., J. Marquis, A. Varble, and K. Friedrich, 2021: Radiosonde Observations of Environments Supporting Deep Moist Convection Initiation during RELAMPAGO-CACTI. Mon. Wea. Rev., 149, 289–309, https://doi.org/10.1175/MWR-D-20-0148.1.
Citation: https://doi.org/10.5194/egusphere-2024-2436-RC2
- AC2: 'Reply on RC2', Dié Wang, 08 Dec 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2436/egusphere-2024-2436-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2436-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2436', Toshi Matsui, 30 Aug 2024

Summary.
This study used novel causal machine learning techniques to statistically quantify the impact of aerosols on tracked isolated deep convection observed by the NEXRAD. The authors argue that the new machine-learning technique can separate the various meteorological parameters to isolate the relationship between aerosols and deep convection. The results indicate that increases in background aerosols are associated with a 1.4km deepening of the echo-top height of deep convection. Overall, this paper is well-written, and the approach seems appropriate. However, I suggest some modifications in writing and explanation, which can improve the readability. These are described in the major comments, but not so critical. Hence, my recommendation is a "minor revision".

Major Comments.
1) Readability: The new statistical approach is meticulously written, but it is often hard to read due to various statistical jargon that is unfamiliar to atmospheric scientists like me. Can you clearly define these terms at the beginning? For example, define like this in the table.
Confounder: a variable that affects both the dependent and independent variables in a study, causing an association that may not be accurate. (parameters include ….)
Exposures: Any factor that may be associated with an outcome of interest. (parameters include ….)
Probably these terms are common in epidemiology, but not in atmospheric science.
2) New and traditional approach: At the end of the manuscript, authors mentioned quite significant statements “Nevertheless, this study pioneers the use……….. scientistifc questions”. To be honest, I still wonder why this new method is so novel compared to the previous old approach because there’s no comparison between the new and traditional statistical approaches. For example, here is one of the earliest aerosol-deep convection manuscript.
Lin, J. C., Matsui, T., Pielke, R. A., & Kummerow, C. (2006). Effects of biomass-burning-derived aerosols on precipitation and clouds in the Amazon Basin: A satellite-based empirical study. Journal of Geophysical Research: Atmospheres, 111(D19). https://doi.org/10.1029/2005JD006884
In this paper, DCC properties (precipitation, cloud top height, and cloud fraction) are related to aerosol optical depth for a given meteorological parameter (cloud work function in that study). Can you compare your novel approach with this traditional approach (simple statistics stratified by meteorological parameters)? Do you think the old approach leads to significant biases in understanding the aerosol-DCC relationship? Can you prove or briefly explain?
3) Potential biases in radar-based approach: Authors use threshold NEXRAD radar parameters to define DCC. However, if DCC has a much smaller amount of raindrops due to a large number of background aerosols, this cell may not be counted as DCC due to larger concentrations of small-size droplets, which won’t increase S-band reflectivity. Alternatively, if you use cloud optical depths and top height, the DCC sampling can include such cells. This is a NEXRAD-based cell tracking approach, so you cannot change your approach. However, it is important to discuss potential sampling biases using the NEXRAD radar.

Minor Comments.
Line 87: Please remove parenthesis “(either invi….. )”.
Line 120-121: “exclude the presence of shallow convection” sounds like removing the sampling during shallow stages. So I suggest just re-write as “exclude the shallow convection cells”.
Line 179: Please define the threshold of diameters of “ultrafine aerosols”.
Line 274: “buoyancy-driven DCCs”. Well, all DCCs are driven by the buoyancy over the flat terrain. So you may re-write this as “locally driven DCCs”.
Line 294: “30-dBZ ETH/15-dBZ ETH” should be “30-dBZ ETH and15-dBZ ETH”.
Line 303-306: We won’t be able to measure supersaturation directly within the convective storms. however, you can infer the required supersaturation in order to active all aerosols (including ultrafine). For this case, can you describe roughly how much supersaturation is required to support your argument?
Fig. 4: Why is there no correlation between thermodynamics and Nccn? It seems to be more important?
Line 547: “30-dBZ ETH/15-dBZ ETH is 1.1 km/1.0 km,” should be “30-dBZ ETH and15-dBZ ETH is 1.1 km and 1.0 km, respectively.”

Citation: https://doi.org/10.5194/egusphere-2024-2436-RC1
- AC1: 'Reply on RC1', Dié Wang, 07 Dec 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2436/egusphere-2024-2436-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2436-AC1
RC2:
'Comment on egusphere-2024-2436', Anonymous Referee #2, 28 Sep 2024
Overview
This study tracks isolated convective cells during conditions of large-scale subsidence using NEXRAD radar reflectivity within different distances of 20 to 50 km of the TRACER primary observing site. Per 4 to 6-hourly sounding, the average of cell maximum radar echo top heights (ETHs) are used to derive a multiple linear regression where predictors consist of an aerosol concentration variable and 2 sounding-derived meteorological variables, with different variables tested and chosen based on their correlation with ETHs. This regression is then used to keep confounding meteorological predictors constant and define aerosol concentration predictors as separately polluted and clean to compute a change in ETH, which is then called the causal effect from aerosol concentration. The increase in ETH between high and low ultrafine aerosol concentrations is approximately 1 km. In sea breeze conditions, this increases to 1.4 km. CCN variables with supersaturation up to 1% do not have any robust relationships with ETHs, but CN concentrations do, which is attributed to ultrafine aerosols activating in updraft regions and creating condensational invigoration. Convective core maximum reflectivity also increases by about 2 dBZ moving from low to high aerosol concentrations but this result may not be robust given radar reflectivity uncertainty.
There are some nice analyses and discussion in this study including sensitivity tests and some caveats that provide important context. However, I have several major concerns with how the analyses are interpreted and some of the conclusions that are drawn.
Major Comments
Non-invigoration aerosol-DCC interactions that could affect aerosol-ETH relationships are ignored. Aerosol-DCC interactions include direct effects on microphysics in addition to indirect effects on updraft strength. The paragraph starting on line 43 starts by referencing aerosol-DCC interactions in general but then the discussion that follows in the introduction focuses purely on updraft invigoration. This is problematic because aerosols can also directly affect microphysical properties (e.g., collision-coalescence, riming), which affects radar reflectivity and thus reflectivity echo top height. These direct effects may or may not be further associated with a change in updraft strength. To assume that updraft strength alone is the cause for changed in ETH assumes that changes in aerosols do not alter the reflectivity profile for a given cloud top. Furthermore, there is an assumption that the relationship between ETH and the true cloud top (the vertical gradient of reflectivity between the ETH and cloud top) does not change with changes in aerosols. It is not clear how valid those assumptions are. What evidence is there to suggest that ETH changes are primarily corresponding to changes in updraft strength?

The g-computation model does not provide the causal direction, which still needs to be assumed, even if it is called a causal inference model. This assumption is made in the multiple linear regression model where the predicted convective property is assumed to follow from the predictors. The reasoning for this is that the meteorological and aerosol properties are defined prior to the convective cell properties, which makes sense, but this is similar to what has been done in some prior studies. Furthermore, this time offset still doesn’t ensure the assumed causal direction because there is a lot of atmospheric complexity that isn’t being quantified that can affect the properties of the cells and atmosphere offset in space and time. Thus, describing this research as the first to show cause-effect is misleading. The methods do have unique aspects relative to past studies that can be highlighted but there is no reason to believe that the causal direction has been more discerned than in past studies.

It is not clear what value the g-computation model provides over the multiple linear regression. If the underlying model where a more complex nonlinear model, there would be some justification for it, but multiple linear regression is used. The multiple linear regression coefficients can be used to describe convective sensitivity to aerosols, giving the same results. Even with using the g-computation model, describing an aerosol effect as just the change in ETH without the corresponding change in aerosols, as is done throughout the paper, doesn’t make much sense. It is the sensitivity, i.e., the change in ETH per change in aerosol concentration, that is most relevant with the underlying assumption that this is approximately linear, and this is simply the slope for the aerosol concentration predictor from the multiple linear regression model. What does the g-computation model provide that the regression cannot other than calling the model “causal machine learning”?

Tests for multiple linear regression model accuracy and robustness are missing. For example, the predictor coefficients should have 95% confidence intervals computed. In addition, how well does the MLR predict the observed ETHs? What is its r² value? The r² is important as it shows how much of the ETH variance remains unexplained by the model, which is relevant for missing information that could still confound the relationships of ETH with the current predictors.
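
The diagnostics requested above could be computed along the following lines. This is a minimal sketch on synthetic data, not the authors' code: the predictor layout, the synthetic coefficients, and the normal-approximation critical value of 1.96 (in place of the exact t quantile) are all assumptions made for illustration.

```python
import numpy as np

def ols_diagnostics(X, y, z_crit=1.96):
    """Fit y = b0 + X @ b by ordinary least squares.

    Returns the coefficients, approximate 95% confidence intervals, and r^2.
    z_crit=1.96 is a normal approximation (assumption); for small samples the
    exact t quantile with n - p degrees of freedom should be used instead.
    """
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])        # prepend intercept column
    p = design.shape[1]
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)                  # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # coefficient covariance
    se = np.sqrt(np.diag(cov))
    ci = np.column_stack([beta - z_crit * se, beta + z_crit * se])
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, ci, r2

# Synthetic standardized predictors: aerosol (column 0) and two confounders.
rng = np.random.default_rng(0)
n = 500
X = rng.standard_normal((n, 3))
true_beta = np.array([0.5, 0.3, -0.2])   # hypothetical sensitivities
y = 8.0 + X @ true_beta + rng.standard_normal(n)

beta, ci, r2 = ols_diagnostics(X, y)
print("aerosol slope:", beta[1], "95% CI:", ci[1], "r^2:", r2)
```

A low r^2 in such a check would indicate how much ETH variance is left unexplained and therefore available to unmeasured confounders.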

The argument for activation of ultrafine aerosols in updrafts leading to increases in ETHs lacks evidence. Activation of ultrafine particles seems highly unlikely given the high concentrations of larger aerosols for most of the samples assessed (Figure 7). Activation of the ultrafine particles would result in cloud droplet concentrations of a few thousand per cm³. Are there aircraft measurements (e.g., during ESCAPE) to support such high drop concentrations? Assuming a favorable composition for nucleation, what would the supersaturation need to be to activate particles at a certain size (e.g., 10 nm) given observed aerosol size distributions? This could be assessed in a parcel model to show if the argument being made is even physically possible.
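
Short of a full parcel model, a first-order check on the supersaturation question can be made with the kappa-Köhler approximation for critical supersaturation. The sketch below is illustrative only: the hygroscopicity value (kappa = 0.6), the temperature, and the pure-water surface tension are assumptions, and the approximation is less accurate for weakly hygroscopic particles.

```python
import math

def critical_supersaturation(dry_diameter_m, kappa, temperature_k=298.15):
    """Critical supersaturation (%) from the kappa-Koehler approximation:
    ln(Sc) = sqrt(4 A^3 / (27 kappa Dd^3)), with A = 4 sigma Mw / (R T rho_w).
    """
    sigma = 0.072      # surface tension of water, J m^-2 (pure water assumed)
    m_w = 0.018015     # molar mass of water, kg mol^-1
    rho_w = 997.0      # density of liquid water, kg m^-3
    r_gas = 8.314      # universal gas constant, J mol^-1 K^-1
    a = 4.0 * sigma * m_w / (r_gas * temperature_k * rho_w)
    ln_sc = math.sqrt(4.0 * a ** 3 / (27.0 * kappa * dry_diameter_m ** 3))
    return (math.exp(ln_sc) - 1.0) * 100.0

# Compare an ultrafine particle with accumulation-mode sizes (kappa assumed).
for d_nm in (10, 30, 100):
    s_crit = critical_supersaturation(d_nm * 1e-9, kappa=0.6)
    print(f"Dd = {d_nm:3d} nm -> critical supersaturation ~ {s_crit:.2f} %")
```

Under these assumptions a 10 nm particle requires a supersaturation of several percent, more than an order of magnitude above that needed at 100 nm, which is the scale of supersaturation a parcel model would need to sustain in the presence of the larger aerosols.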

The diurnal cycle needs to be ruled out as a cause of the CN-ETH and UFP-ETH relationships. Over land, ultrafine aerosols often have a strong diurnal cycle just as deep convection does, which can affect relationships between the two. Accumulation mode aerosols often have a much weaker diurnal cycle, which is potentially a hypothesis for why one wouldn’t get robust CCN relationships but robust CN relationships with ETHs. For example, Fast et al. (2024) shows this for the CACTI campaign. This occurs because new particle formation processes over land operate during the daytime. What are the typical changes in ETH and predictor variables including CN and CCN over the diurnal cycle? Do CN and ETH variables both peak in later afternoon? If hour of day is controlled for, does that affect the aerosol-ETH relationships?
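
One way to test the last question is to add hour-of-day harmonics as predictors and see whether the aerosol slope survives. The sketch below uses synthetic data in which CN and ETH share a diurnal driver but CN has no direct effect on ETH; the peak hour, amplitudes, and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
hour = rng.uniform(0.0, 24.0, n)
# Shared diurnal driver peaking at 15 local time (hypothetical).
diurnal = np.cos(2.0 * np.pi * (hour - 15.0) / 24.0)
cn = diurnal + 0.5 * rng.standard_normal(n)           # CN follows the diurnal cycle
eth = 8.0 + 1.5 * diurnal + rng.standard_normal(n)    # ETH does too; no CN effect

def first_slope(X, y):
    """Return the OLS coefficient of the first predictor column in X."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]

# Regression without any control for time of day.
naive = first_slope(cn[:, None], eth)

# Regression with first-harmonic hour-of-day terms as additional predictors.
harmonics = np.column_stack([np.sin(2.0 * np.pi * hour / 24.0),
                             np.cos(2.0 * np.pi * hour / 24.0)])
adjusted = first_slope(np.column_stack([cn, harmonics]), eth)

print("CN slope, no hour control:  ", naive)
print("CN slope, hour controlled:  ", adjusted)
```

In this synthetic case the apparent CN-ETH slope collapses toward zero once hour of day is controlled for; whether the TRACER relationships behave the same way is what the requested analysis would show.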

Relevance of sounding convective parameters at M1 for some situations needs further inquiry. Convective parameters like CAPE are not stable for 4-6 hours over land, and the study (Prein et al., 2022) used to support this claim on line 209 does not state that so far as I can tell. That study uses a limit of 4 hours difference between observed and simulated MCSs to match them, and MCSs are not the same as isolated convective clouds in atmospheric sensitivities. Other studies such as Nelson et al. (2021) show large changes in low level moisture on distances < 50 km and times of ~1 hour over some land convective regions. The statement after this on lines 209-211 that the M1 site is not heavily affected by maritime conditions is also confusing because the M1 site is close to Galveston Bay, and as noted in the study, a bay breeze often forms. Perhaps the bay air mass is similar to the continental air mass in terms of aerosol and thermodynamic properties, but I’m not sure that can be assumed. It may not be possible to easily assess these caveats, but they should at least be highlighted. Something that could be looked into though is whether the M1 surface measurements are relevant to air feeding cells at nighttime and/or after the bay/sea breezes have passed inland of the M1 site by examining stability at and through the boundary layer up to approximate cloud base to assess the likelihood of coupling to M1 site surface conditions.

More information on the spatiotemporal distribution of cells and cell properties is needed. Because of potentially substantial gradients in aerosol and thermodynamic properties given the coastal and large urban area, it would be ideal to plot the initiation locations and/or locations where the cell ETHs are maximized on maps for different ranges from the M1 site rather than the tracks in Figure 1 that don’t provide much information. In addition, it would be helpful to map out cell properties like those in Figure 6 to see if there are spatial gradients in the properties with respect to the M1 site location.

Are ETH retrievals from level 2 NEXRAD data unbiased with range from the radar? Related to the previous comment, ETHs should be mapped with range from the radar to see if there are biases related to beam filling and gaps between elevation angles with range.
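
The geometry behind this concern can be sketched with the standard 4/3 effective Earth radius beam-height relation. The beamwidth (0.95 degrees) and the two elevation angles used here are illustrative assumptions, not the scan strategy used in the study.

```python
import math

EARTH_RADIUS_KM = 6371.0
K_EFF = 4.0 / 3.0   # effective Earth radius factor for standard refraction

def beam_height_km(range_km, elev_deg, antenna_height_km=0.0):
    """Beam-center height above the radar level under standard refraction."""
    ke_re = K_EFF * EARTH_RADIUS_KM
    theta = math.radians(elev_deg)
    return (math.sqrt(range_km ** 2 + ke_re ** 2
                      + 2.0 * range_km * ke_re * math.sin(theta))
            - ke_re + antenna_height_km)

def beam_width_km(range_km, beamwidth_deg=0.95):
    """Approximate vertical extent of the beam (beamwidth is an assumption)."""
    return range_km * math.radians(beamwidth_deg)

# Height separation between two adjacent tilts (angles chosen for illustration).
for r in (25.0, 50.0, 100.0, 150.0):
    gap = beam_height_km(r, 1.5) - beam_height_km(r, 0.5)
    print(f"range {r:5.0f} km: beam width {beam_width_km(r):.2f} km, "
          f"height gap between tilts {gap:.2f} km")
```

Both the beam width and the spacing between tilts grow roughly linearly with range, so ETH resolution degrades for cells far from the radar.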

ACP recommends making processed data and code openly available in a FAIR-aligned reliable public repository to support study reproducibility. It is likely not possible to reproduce the methodology with only links to TINT and raw datasets given the information provided in the study.

Minor Comments
Line 7: Only a single model predicts a significant relationship between an aerosol concentration and convective core area, which is 0.8% CCN within 30 km of the M1 site (Figure 10). The other 31 models are not significant. That seems pretty random, particularly since some models switch sign with changes in range within M1, and not enough to support this statement in the abstract that greater aerosol levels correspond to increased convective core area.

Lines 31-33: This is an odd motivation since ERFaci uncertainty is currently mostly attributed to non-deep convective clouds that are not the focus of this study.

Discussion of leading invigoration mechanisms in introduction: Semi-direct effects by aerosols that alter atmospheric thermodynamic stability should also be included.

Lines 60-63: Some of the studies cited here are not simply questioning the importance of invigoration mechanisms relative to other forcings but showing that there is a spectrum of enervation to invigoration possible, thus suggesting that referring to the mechanisms only in terms of invigoration is misleading.

Lines 75-76: Though individual modeling studies have quantified aerosol effects, it is important to note that there is still disagreement between these studies, even in the sign of effects, because models and the methods for analyzing them differ (e.g., discussion in Varble et al., 2023).

It isn’t clear how updraft strength is being defined. Is this referring to updraft mass flux, average vertical wind speed, or maximum vertical wind speed?

Lines 124-128: Not tracking cells when max 2-km Z < 40 dBZ leaves out more than non-precipitating stages as suggested here. It also leaves out lightly precipitating periods.

For the meteorological variables, there is almost an unlimited number that could potentially be relevant and tested. Were different shear layers other than 0-5 km tested? Was mid-level RH tested (separate from the boundary layer)?

What assumptions are made for the lifted parcel calculations (LCL, LNB, CAPE)? Is liquid pseudoadiabatic or reversible ascent assumed?

Line 187: CCN at various supersaturations does not have a temporal resolution of 1 minute or less as stated here. The supersaturation is varied over the course of about an hour usually so there is 1 value at each supersaturation every ~hour or so.

Lines 194-195: A t-test may not be valid here if the aerosol distributions are skewed.
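
A distribution-free alternative is a permutation test on the difference in medians. The following is a minimal sketch on synthetic lognormal samples; the sample sizes and distribution parameters are hypothetical and stand in for skewed aerosol concentrations.

```python
import numpy as np

def permutation_test(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in medians.

    Returns the observed median difference and an approximate p-value.
    """
    rng = np.random.default_rng(seed)
    observed = np.median(x) - np.median(y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)            # relabel samples at random
        diff = np.median(shuffled[:len(x)]) - np.median(shuffled[len(x):])
        if abs(diff) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Skewed synthetic "clean" and "polluted" aerosol samples (hypothetical values).
rng = np.random.default_rng(3)
clean = rng.lognormal(mean=7.0, sigma=0.8, size=150)
polluted = rng.lognormal(mean=7.6, sigma=0.8, size=150)

obs_diff, p_value = permutation_test(polluted, clean)
print("median difference:", obs_diff, "p-value:", p_value)
```

Because the null distribution is built from the data themselves, no normality assumption is needed.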

How are DCC tracking results averaged? Does each DCC have a single value for a variable like ETH and then all of the ETHs are averaged together?

Lines 234-235: I don’t follow the argument for why large-scale ascent needs to be avoided, though I can see why MCSs would want to be avoided. Is that the primary reason for avoiding certain large-scale meteorological conditions?

Lines 273-275: Mesoscale deep convective systems are still buoyancy driven, so I don’t understand what this sentence is trying to get across.

Figure 4: Why are values not filled in for the significant correlations less than 0.4? Also, I may have missed it, but are the aerosols in Figure 4 sampled around the same time as the soundings or are they sampled after the soundings?

In some places, LWS is used and in others, shear is used. It would be best to choose one or the other and be consistent throughout.

Line 342: Should “accuracy” be “robustness” here?

Lines 364-365: Including some critical meteorological quantities supports this assumption, but I wouldn’t say that it is necessarily sufficient. That is hard to know without an in-depth study of possible confounders.

Lines 388-389: I don’t follow the argument of multi-collinearity supporting standardization. Isn’t the reason for standardization stated on lines 390-392?

Lines 463-464: There is not enough evidence to make this statement that Ncn and Nupf are causing higher ETH via their activation.

Line 494: I disagree that a causal link was demonstrated. The only thing supporting cause is that the aerosols are sampled prior to cells in time, but there is no evidence to show the causal mechanisms, and there are potentially other confounders not accounted for (see major comments).

Lines 536-540: It’s true that uncertainty renders the max reflectivity results less robust, but the same argument can be made for how well 4-6 hourly soundings and aerosols at a single point represent conditions where cells are growing.

Lines 603-605: I think this sentence can be clarified. Aerosol is not robustly associated with DCC max ETH (not its evolution) given the sampling in this study. That does not mean that it couldn’t be if more samples were added.

References
Fast, J. D., Varble, A. C., Mei, F., Pekour, M., Tomlinson, J., Zelenyuk, A., Sedlacek III, A. J., Zawadowicz, M., and Emmons, L. K., 2024: Large Spatiotemporal Variability in Aerosol Properties over Central Argentina during the CACTI Field Campaign, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2024-1349.
Nelson, T. C., J. Marquis, A. Varble, and K. Friedrich, 2021: Radiosonde Observations of Environments Supporting Deep Moist Convection Initiation during RELAMPAGO-CACTI. Mon. Wea. Rev., 149, 289–309, https://doi.org/10.1175/MWR-D-20-0148.1
Citation: https://doi.org/10.5194/egusphere-2024-2436-RC2
- AC2: 'Reply on RC2', Dié Wang, 08 Dec 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2436/egusphere-2024-2436-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2436-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Dié Wang on behalf of the Authors (15 Dec 2024) Author's response Manuscript

EF by Anna Glados (18 Dec 2024) Author's tracked changes

ED: Referee Nomination & Report Request started (19 Dec 2024) by Shaocheng Xie

RR by Toshi Matsui (07 Jan 2025)

RR by Anonymous Referee #2 (24 Jan 2025)

Suggestions for revision or reasons for rejection

I thank the authors for their detailed responses and revisions that I believe have improved the manuscript. I have 1 remaining major comment clarifying my previous major comment 3 that I think is critical and don’t think was adequately addressed.

Major Comment

I do not understand the reply to my major comment 3 regarding whether g-computation and multiple linear regression results differ. From my read of section 3.4 describing the g-computation model and the Zenodo-archived code, the g-computation model produces the same output as the underlying multiple linear regression (the “Q-model”). If the multiple linear regression model is being used to estimate the aerosol effect on ETH, that is perfectly fine. However, if that is the case, the language in the study needs to be changed throughout it because it is stating that a new causal model is being used that is superior to past studies using regressions or other predictive models in isolating causal effects. From what I can tell, this isn’t true, but it is possible I am missing something. If I am, then I suspect others will too because it is not clear from the paper or the response to my comment. Thus, I think it is important that the authors clearly demonstrate how the result from the g-computation model differs from that obtained from the multiple linear regression model alone with A=1 and A=0 inputs, which just produces the b1 coefficient being multiplied by A, as shown below.

As stated in section 3.4, a multiple linear regression (the Q-model) is performed with standardized variables (Y = b0 + b1*A + b2*V1 + b3*V2). Y is ETH, V1 and V2 are meteorological confounders, and A is aerosol concentration, where observations are inputted to derive the b coefficients. Then, 0 is used as an input for A to represent clean conditions and 1 is used as an input to represent polluted conditions, with V1 and V2 either held constant or entered as observed values for each observation (this isn’t entirely clear, but shouldn’t matter so long as V1 and V2 inputs are the same for both the clean (A=0) and polluted (A=1) calculations). The 2 multiple linear regression calculations are differenced to give a change in Y (ETH). Since A is standardized, A = 0 is the same as the mean aerosol concentration and A = 1 is 1 standard deviation above the mean and the change in Y (ETH) is associated with a 1 standard deviation change in A (aerosol concentration). Is this what is being done? If so, then this would just give b1 as the answer, which is the sensitivity of Y (ETH) to A (aerosol concentration). Mathematically, holding V1 and V2 to constant values or using the values from any given event to control for confounding variables, this would be:

Y(A=1) – Y(A=0) = (b0 + b1 + b2*V1 + b3*V2) – (b0 + b2*V1 + b3*V2) = b1

Weighting this result by polluted vs. clean conditions still gives b1 since it is a constant, and the ETH change becomes the change per 1 standard deviation change in aerosols (A=1 minus A=0) holding confounding factors the same for each scenario. Note that b1 is easier to interpret than the ETH change that is provided now because it is an ETH change per aerosol concentration change. This is purely obtained from the multiple linear regression model and similar to past studies using multiple linear regression to estimate causal effects. Perhaps something else is being done, but this is how the paper and code currently read.
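
The algebra above can also be checked numerically. The sketch below is not the authors' archived code; it uses synthetic standardized data with hypothetical coefficients, fits the linear Q-model by least squares, and then applies the plug-in step (A=1 versus A=0 with confounders kept at their observed values).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 300
V1 = rng.standard_normal(n)                     # confounder 1 (synthetic)
V2 = rng.standard_normal(n)                     # confounder 2 (synthetic)
A = 0.4 * V1 + rng.standard_normal(n)           # exposure correlated with V1
A = (A - A.mean()) / A.std()                    # standardize the exposure
Y = 0.3 * A + 0.6 * V1 - 0.2 * V2 + rng.standard_normal(n)

# Q-model: multiple linear regression Y = b0 + b1*A + b2*V1 + b3*V2.
design = np.column_stack([np.ones(n), A, V1, V2])
b, *_ = np.linalg.lstsq(design, Y, rcond=None)

def q_model(a, v1, v2):
    """Predict the outcome from the fitted linear Q-model."""
    return b[0] + b[1] * a + b[2] * v1 + b[3] * v2

# Plug-in (g-computation) step: set A to 1 and to 0 for every sample,
# leave the confounders at their observed values, and average the difference.
y_a1 = q_model(np.ones(n), V1, V2)
y_a0 = q_model(np.zeros(n), V1, V2)
ace = np.mean(y_a1 - y_a0)

print("plug-in effect estimate:", ace)
print("regression coefficient b1:", b[1])
```

With a purely linear Q-model the two printed numbers agree to floating-point precision, because the confounder terms cancel in every per-sample difference.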

To clarify how calculations are being done, I suggest writing out the math so that it is clear how the ETH change is obtained to conclusively show one way or the other whether anything other than the multiple linear regression is being used. If only the regression is being used, then the paper should remove reference to g-computation and a new causal inference framework that is superior to past regression or random forest approaches. If instead something is being done that produces a different result than the regression alone, then that needs to be clearly demonstrated.

Minor Comment

Related to my previous major comment 1, I still feel that the introduction focuses primarily on aerosol effects on cloud dynamics. This is fine if the authors prefer to keep it this way, but why I had recommended to broaden the discussion to direct aerosol effects on cloud microphysics is that throughout the introduction, there are references to aerosol-DCC interactions in general. For example, lines 31 and 41 begin paragraphs by using this phrase but then just focus on indirect effects on dynamics. Aerosol-DCC interactions is not synonymous with aerosol indirect effects on cloud dynamics, so I think it needs to be clearer when mentioning aerosol-DCC interactions that the manuscript specifically focuses on aerosol effects on dynamics rather than direct effects of aerosols on hydrometeor properties or convective cloud effects on aerosols.

ED: Reconsider after major revisions (25 Jan 2025) by Shaocheng Xie

AR by Dié Wang on behalf of the Authors (25 Feb 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (03 Mar 2025) by Shaocheng Xie

RR by Anonymous Referee #2 (24 Mar 2025)

Suggestions for revision or reasons for rejection

The authors have now admitted that the g-computation results in the previous version of the manuscript equaled the multiple linear regression (MLR) predicted sensitivity of echo top heights (outcomes) to aerosol concentrations (exposures). I appreciate this admission, but the response that focuses primarily on adding in new models does not address my previous major concern.

There are still claims being made that g-computation is controlling for confounding factors and isolating causality in a superior way to simply using Q-models without g-computation like prior studies. However, the evidence supports my contention that g-computation is simply inputting values into the underlying Q-model to compute an outcome, which is the same as just using the Q-model and thus all assumptions in the Q-model are inherited (as are any problems in terms of the Q-model fit, generality, etc.), whether it is MLR or one of the new models used. It’s not clear that the new models were needed since the primary conclusions are unchanged from those supported by the original MLR, and if all the models support the same general conclusions, why use the more complex models that are more difficult to interpret? Thus, the authors have sidestepped the essence of my primary concern, which is that g-computation is not adding any value over the underlying Q-models. Because of this, the claim in the response and throughout the manuscript regarding usage of a new g-computation method that better handles confounding variables and isolates causality relative to the Q-models used in previous studies is untrue.

Rather than point out all the places in the manuscript where this is an issue, I’ve isolated some quotes from the response to my previous review that are untrue to highlight the essence of the problem.

Author’s response quote: “we generate counterfactual outcomes by setting the exposure variable X to fixed values representing different scenarios. As we demonstrate below, this step is critical for estimating the ACE and clearly sets this methodology apart from using MLR alone.”

My response: Setting the exposure variable to fixed values to generate counterfactual outcomes does not set this methodology apart from using multiple linear regression (MLR). The values are literally being inputted to the MLR model to generate outcomes and thus the assumptions are the same as those in the MLR model. Nothing has magically isolated causality (ACE).

Author’s response quote: “Although the differences between this ACE and the one presented in our original manuscript are within ±0.1 km (which may seem negligible for this study), g-computation uncovers causal effects rather than merely representing associations between variables, as MLR does.”

My response: This is untrue. G-computation inherits the assumptions of the underlying Q-model. The only thing that supports causality here is sampling aerosols and other conditions prior to the full growth of cells, but there are no physical inputs ensuring that statistical relationships between the inputs are related to causality, which can’t be proven given the input datasets with statistics alone. The previous iteration of this manuscript used MLR as the Q-model and made the same claim of uncovering causal effects before backtracking in the latest response to admit that the g-computation results were in fact the same as the MLR predicted sensitivity of the outcome (ETH) to the exposure (aerosol concentration). Saying that g-computation is uncovering causality and superior to previous studies using regressions is untrue.

Author’s response quote: “MLR does NOT inherently account for confounding or causal structure, which is a key capability of g-computation.”

My response: There is no support provided for this statement. All the evidence is to the contrary. G-computation is simply a choice of inputs to the MLR model (or any other Q-model chosen).

Author’s response quote: “We recognize that when using a standard MLR, as in our original manuscript, the regression coefficient of the exposure variable quantitatively aligns with the results from g-computation. However, this does not mean that MLR and g-computation are equivalent; in fact, their purposes and interpretations are fundamentally different. Moreover, it also does not imply that MLR can be used for estimating causal effects without specific constraints or conditions. Additionally, g-computation is just one of many causal inference methods. If other causal inference methods/frameworks (e.g., propensity score matching [Rosenbaum et al., 1983]) were employed (not relying on fitting a Q-model), the results would probably differ between the two approaches (Chatton et al., 2020).

Basically, MLR is a statistical tool for modeling the relationship between a dependent variable and multiple predictors. Its primary purpose is to predict outcomes and estimate associations, not causal effects. It does not inherently account for confounding, and its regression coefficients only represent causal effects under strict assumptions (e.g., no unmeasured confounding, correct model specification, random exposure assignment).

G-computation, on the other hand, is a powerful causal inference method that explicitly estimates the causal effect of an intervention by modeling the relationship between variables and simulating counterfactual outcomes. G-computation controls for confounding by keeping confounding variables constant across hypothetical interventions (i.e., setting all exposure to 1 or 0). This ensures that the estimated causal effects are not biased by confounders, leading to a more accurate assessment of the true causal effect of the exposure on the outcome.”

My response: There are false statements here. To be clear, g-computation is simply inputting 2 different exposure values (0 and 1 for standardized inputs where 0 = mean (not "clean" as claimed in the manuscript) and 1 = 1 standard deviation above the mean) and constant confounding variable values to the underlying Q-model. As such, the results inherit all the assumptions of the underlying Q-model, whether it is MLR or any other model. Controlling for confounding variables is being done by holding those variables constant in the Q-model. Exposure variable values are inputted to the Q-model. Counterfactual outcomes can be generated with MLR or any other Q-model by plugging in constant values for all variables while setting the exposure variables to 0 or 1. As such, g-computation provides no additional value over the underlying Q-models that are being used for all the calculations. The only support provided for g-computation isolating causality is that it is called a causal inference model. The only support for assumed causality throughout the paper is sampling predictors in time prior to cells being sampled, which makes sense but has also been done in previous studies and does not prove that relationships are causal.

To summarize, claims by the authors that g-computation isolates causal effects and controls for confounding factors in ways that the underlying model it uses (the “Q-model”) cannot are not supported by evidence, and all such statements should be stripped from the manuscript. If that is done, the manuscript will likely be suitable for publication.

ED: Reconsider after major revisions (31 Mar 2025) by Shaocheng Xie

AR by Dié Wang on behalf of the Authors (02 May 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (09 May 2025) by Shaocheng Xie

RR by Anonymous Referee #2 (17 May 2025)

ED: Publish subject to minor revisions (review by editor) (27 May 2025) by Shaocheng Xie

AR by Dié Wang on behalf of the Authors (30 May 2025) Author's response Author's tracked changes Manuscript

ED: Publish as is (04 Jun 2025) by Shaocheng Xie

AR by Dié Wang on behalf of the Authors (04 Jun 2025)

Journal article(s) based on this preprint

26 Aug 2025

Aerosol impacts on isolated deep convection: findings from TRACER

Dié Wang, Roni Kobrosly, Tao Zhang, Tamanna Subba, Susan van den Heever, Siddhant Gupta, and Michael Jensen

Atmos. Chem. Phys., 25, 9295–9314, https://doi.org/10.5194/acp-25-9295-2025, 2025

Supplement

https://doi.org/10.5194/egusphere-2024-2436-supplement

Viewed

Total article views: 829 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
548	174	107	829	44	28	41

Views and downloads (calculated since 14 Aug 2024)

Month	HTML	PDF	XML	Total
Aug 2024	172	27	11	210
Sep 2024	83	22	4	109
Oct 2024	21	3	31	55
Nov 2024	25	10	49	84
Dec 2024	48	17	6	71
Jan 2025	30	14	3	47
Feb 2025	11	4	0	15
Mar 2025	12	11	0	23
Apr 2025	16	12	2	30
May 2025	17	5	1	23
Jun 2025	37	29	0	66
Jul 2025	30	10	0	40
Aug 2025	46	10	0	56

Viewed (geographical distribution)

Total article views: 823 (including HTML, PDF, and XML) Thereof 823 with geography defined and 0 with unknown origin.

Latest update: 26 Aug 2025

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

We use a new method to understand how tiny particles in the air, called aerosols, affect rain clouds in the Houston-Galveston area. Aerosols generally do not make these clouds grow much taller, with an average height increase of about 1 km under certain conditions. However, their effects on rainfall strength and cloud expansion are less certain. Clouds influenced by sea breezes show a stronger aerosol impact, possibly due to unaccounted factors like vertical winds in near-surface layers.

