Assessment of Aerosol Iron Solubility using Global Dataset, Part II: Machine Learning and Deep Neural Network Coupled with SHapley Additive exPlanation Combined with Independent Component Analysis (SHAP-ICA)
Abstract. The supply of dissolved iron (d-Fe) can enhance marine CO2 fixation. Aerosols are one source of d-Fe to the ocean surface, but aerosol iron solubility (Fesol%) depends on emission sources and atmospheric alteration processes that remain poorly reproduced by global climate and chemical transport models. Although recent advances in machine and deep learning models can capture nonlinear relationships in observational datasets, applications to environmental samples are still limited and approaches for improving interpretability require further development. This study trained XGBoost and a deep neural network (DNN) using East Asian aerosol data and tested whether Fesol% and d-Fe concentrations in marine aerosols can be reproduced. The effects of individual features on Fesol% and d-Fe were quantified using SHapley Additive exPlanations (SHAP), and independent component analysis (ICA) was applied to SHAP values to extract independent components representing dominant controlling processes of Fesol%. East Asian Fesol% was reproduced well by both XGBoost and DNN. For marine aerosols, higher reproducibility was achieved by the DNN than by XGBoost, likely because deeper relationships among features can be learned. SHAP indicated that variability in Fesol% and d-Fe is primarily driven by chemical alteration of Fe in mineral dust and anthropogenic aerosols. ICA further suggested that additional processes, including heavy oil combustion, influence a subset of samples. Spatial variations in process contributions were visualized by mapping the influence of each independent component. This DNN-based framework can improve interpretation of both current results and future observational datasets.
General Comments
This study applied two machine learning (ML) approaches, eXtreme Gradient Boosting (XGBoost) and a deep neural network (DNN), to test whether the ML-models could reproduce observed dissolved Fe (dFe) concentrations and Fe solubility across marine deposition regions. The data features (input variables) included total Fe, aluminum (Al) solubility, dissolved Al, the total Fe to total Al ratio, and the dissolved Fe to dissolved Al ratio. The ML-models were trained on observational datasets described primarily in the related Part I manuscript (which I accessed but did not read in full to complete this review).
Model outputs were analyzed using two methods, SHapley Additive exPlanations (SHAP) and independent component analysis (ICA), to improve interpretability of the ML-model outputs and to isolate the drivers of Fe solubility and dFe concentrations. ML-model skill for marine deposition regions was evaluated against observational datasets compiled from the literature (for marine regions including the North Pacific, North Atlantic, and South Atlantic) and presented in Part I (for Eastern Asia). The authors primarily conclude that (1) atmospheric processing is the dominant driver of Fe solubility and dFe concentration, and (2) anthropogenic aerosols are not a major source of dFe in Eastern Asia, but only when aerosol size fraction is not distinguished in the model framework.
The manuscript does not clearly articulate the broader goal of applying this ML-framework, which limits the perceived impact of and motivation for the ML-model development. Is the goal to advance empirical parameterizations applicable to climate projections or air quality forecasting? If so, how will this ML-model be leveraged in the future? And why were the input features useful explanatory variables? In the absence of process-based modeling advancement, the work would ideally demonstrate some new insight into the processes governing dFe and Fe solubility in Eastern Asia. However, the findings provide limited knowledge advancement beyond pre-established atmospheric biogeochemical principles, which I believe limits the study's broader impact. The intended impacts and applications must be stated. Otherwise, this work would be more appropriately presented as a condensed component of Part I rather than as a standalone manuscript. For these reasons, I recommend that the article be reconsidered after major revisions. I identify several Specific Comments that can hopefully provide a source of targeted improvements, in addition to Technical Corrections to be addressed at the authors’ and editors’ respective discretions.
Specific Comments
Technical Corrections
L37: Provide a definition for dissolved in this context.
L48: The snapshot problem is important for interpreting your ML-model/observation comparisons, but it is not really discussed outside of the background section. Consider expanding this discussion.
L50: What is meant by reanalysis here? As in meteorological nudging? Or retrospective analyses of historical datasets?
L52-54: When speaking about marine regions, specify from the beginning that samples are airborne (not waterborne).
L65-100: Better suited for methods, in my opinion.
L102: I did not have explicit knowledge of Part I when reading this. It is advisable to provide a brief summary here that includes the temporal and spatial distribution of samples in addition to the solubility quantification approaches. Perhaps include a succinct table or map that is conserved across Parts I & II.
L108: This line specifies that the models cannot handle missing values, but L143 states that XGBoost can handle missing values. Please correct this inconsistency.
L118-119: Please provide a citation for this reported range of dFe/dAl.
Figure 1: Please provide scales/labels on all axes for all figures, even if unitless rank scales.
L169: The use of model accuracy here is better replaced with “model performance” or “model skill”, since the observations are not always perfectly representative of the truth or ambient average, as indicated by the snapshot problem discussed earlier.
L400: Is there a direct measurement of Fe-oxides, or is this speculation to explain your claim? See the specific comment on source apportionment concerns.
L472: What is marine chemical alteration in this context?
Figure 9: Please provide units for dFe in labels.
L800: Suggest rephrasing “marine aerosol” as “shipboard-based measurements”; “marine aerosol” implies sea spray.
L815: Fe solubilization is already parameterized directly by aerosol acidity in Earth System Models. Please discuss this.
References Cited in this Review: