Assessment of Aerosol Iron Solubility using Global Dataset, Part II: Machine Learning and Deep Neural Network Coupled with SHapley Additive exPlanation Combined with Independent Component Analysis (SHAP-ICA)

Sakata, Kohei; Kurisu, Minako; Takahashi, Yoshio

doi:10.5194/egusphere-2026-1615

Preprints

08 Apr 2026

Abstract. The supply of dissolved iron (d-Fe) can enhance marine CO₂ fixation. Aerosols are one source of d-Fe to the ocean surface, but aerosol iron solubility (Fe_sol%) depends on emission sources and atmospheric alteration processes that remain poorly reproduced by global climate and chemical transport models. Although recent advances in machine and deep learning models can capture nonlinear relationships in observational datasets, applications to environmental samples are still limited and approaches for improving interpretability require further development. This study trained XGBoost and a deep neural network (DNN) using East Asian aerosol data and tested whether Fe_sol% and d-Fe concentrations in marine aerosols can be reproduced. The effects of individual features on Fe_sol% and d-Fe were quantified using SHapley Additive exPlanations (SHAP), and independent component analysis (ICA) was applied to SHAP values to extract independent components representing dominant controlling processes of Fe_sol%. East Asian Fe_sol% was reproduced well by both XGBoost and DNN. For marine aerosols, higher reproducibility was achieved by the DNN than by XGBoost, likely because deeper relationships among features can be learned. SHAP indicated that variability in Fe_sol% and d-Fe is primarily driven by chemical alteration of Fe in mineral dust and anthropogenic aerosols. ICA further suggested that additional processes, including heavy oil combustion, influence a subset of samples. Spatial variations in process contributions were visualized by mapping the influence of each independent component. This DNN-based framework can improve interpretation of both current results and future observational datasets.
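The SHAP-then-ICA step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. All data are synthetic, and a linear stand-in model replaces XGBoost and the DNN so that the per-feature attributions have a closed form (w_j times the feature's deviation from its mean, the linear-model attribution when features are treated as independent). A small hand-written symmetric FastICA replaces any library ICA, and every name below (`sources`, `mixing`, `fast_ica`, and so on) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two hypothetical independent, non-Gaussian "processes" (synthetic, not real data).
sources = np.column_stack([rng.laplace(size=n), rng.uniform(-1.0, 1.0, size=n)])
mixing = np.array([[1.0, 0.2], [0.5, 1.0], [0.3, -0.8], [-0.6, 0.4]])  # 4 features x 2 processes
X = sources @ mixing.T + 0.01 * rng.normal(size=(n, 4))

# Stand-in linear model f(x) = w.x + b (an assumption; the paper uses XGBoost and a DNN).
w = np.array([0.8, -0.5, 0.3, 1.2])
b = 2.0
pred = X @ w + b

# Closed-form attribution for a linear model: phi_ij = w_j * (x_ij - mean_j).
# Each row of phi sums to that sample's prediction minus the mean prediction.
phi = (X - X.mean(axis=0)) * w


def fast_ica(data, n_components, n_iter=500, tol=1e-8, seed=0):
    """Symmetric FastICA with a tanh nonlinearity; returns (components, unmixing)."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / len(centered)
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:n_components]
    whitener = eigvec[:, top] / np.sqrt(eigval[top])  # projects onto top PCs, unit variance
    z = centered @ whitener

    def decorrelate(m):
        # Symmetric orthogonalisation: (M M^T)^(-1/2) M = U V^T from the SVD of M.
        u, _, vt = np.linalg.svd(m)
        return u @ vt

    init = np.random.default_rng(seed).normal(size=(n_components, n_components))
    w_mat = decorrelate(init)
    for _ in range(n_iter):
        g = np.tanh(z @ w_mat.T)
        w_new = g.T @ z / len(z) - np.diag((1.0 - g**2).mean(axis=0)) @ w_mat
        w_new = decorrelate(w_new)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w_mat, axis=1)) - 1.0)) < tol
        w_mat = w_new
        if converged:
            break
    return z @ w_mat.T, whitener @ w_mat.T


# ICA applied to the attribution matrix rather than to the raw features.
ics, unmixing = fast_ica(phi, n_components=2)

# Absolute correlation between each recovered component and each synthetic process.
recovery = np.abs(np.corrcoef(ics.T, sources.T)[:2, 2:])
print(recovery.round(2))
```

In this toy setting each independent component is a candidate "process" score per sample; mapping those scores by sampling location would correspond to the spatial visualisation the abstract mentions. With a nonlinear model the attributions would come from a SHAP explainer instead of the closed form used here.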

Received: 23 Mar 2026 – Discussion started: 08 Apr 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2026-1615', Anonymous Referee #1, 20 May 2026
General Comments
This study applied two machine learning (ML) approaches, eXtreme Gradient Boosting (XGBoost) and a deep neural network (DNN), to test whether the ML-models themselves could effectively reproduce observed dissolved Fe (dFe) concentrations and Fe solubility across marine deposition regions. The data features (input variables) included total Fe, aluminum (Al) solubility, dissolved Al, the total Fe to total Al ratio, and the dissolved Fe to dissolved Al ratio. The ML-models were trained on observational datasets described primarily in the related Part I manuscript (which I accessed but did not read in full to complete this review).
Model outputs were analyzed using two component analyses (SHapley Additive exPlanations, SHAP, and independent component analysis, ICA) to improve interpretability of the ML-model outputs and to isolate the drivers of Fe solubility and dFe concentrations. ML-model skill for marine deposition regions was evaluated against observational datasets compiled from the literature (for marine regions including the North Pacific, North Atlantic, and South Atlantic) and presented in Part I (for Eastern Asia). The authors primarily conclude that (1) atmospheric processing is the dominant driver of Fe solubility and dFe concentration, and (2) anthropogenic aerosols are not a major source of dFe in Eastern Asia, but only when aerosol size fraction is not distinguished in the model framework.
The manuscript does not clearly articulate the broader goal of applying this ML-framework, which limits the perceived impact of, and motivation for, the ML-model development. Is the goal to advance empirical parameterizations applicable to climate projections or air quality forecasting? If so, how will this ML-model be leveraged in the future? And why were the input features useful explanatory variables? In the absence of process-based modeling advancement, the work would ideally demonstrate some new insight into the processes governing dFe and Fe solubility in Eastern Asia. However, the findings provide limited knowledge advancement beyond pre-established atmospheric biogeochemical principles, which I believe limits the study's broader impact. Perceived impacts and applications must be provided. Otherwise, it seems to me that this work would be more appropriately presented as a condensed component of Part I rather than as a standalone manuscript. For these reasons, I recommend that the article be reconsidered after major revisions. I identify several Specific Comments that can hopefully provide a source of targeted improvements, in addition to Technical Corrections to be addressed at the authors' and editors' respective discretions.

Specific Comments
There are several concerns with the authors' approach to, and discussion of, source apportionment. I am not convinced that the ML-framework alone is able to distinguish anthropogenic contributions to dFe with the confidence implied. First, there is no consideration of wildfire aerosol as a distinct and regionally impactful source of dFe. Even though smaller than dust and anthropogenic sources by mass, wildfire emissions introduce Fe-containing aerosols across fine and coarse size fractions with distinct solubility profiles (Bergas-Masso et al., 2023; Ito et al., 2023), which adds source apportionment uncertainty within their framework. Volcanic rock is discussed as an additional potential source of aerosol Fe introducing uncertainty near Australia, but fire is not. Second, the conclusions presented in Sections 3.2 and 3.3 with respect to relative contributions by anthropogenic Fe are directly contradicted by the size-resolved results in Section 3.4. This organizational structure initially misled me and would likely mislead a broader audience. The original interpretation also conflicts with an expanding body of literature documenting anthropogenic combustion contributions to dFe across the region, particularly within fine aerosol modes (e.g., Ito et al., 2023; Rathod et al., 2024; Zhang et al., 2021, and references therein). Lastly, throughout the manuscript, the authors speculate on specific mineralogical compositions (e.g., heavy-oil combustion fly ash) to explain elevated dFe or Fesol% (Lines 299, 457, 549, 555, 617, 715, etc.). Oil emissions represent a far-reaching anthropogenic Fe source across terrestrial and ocean regions (Rathod et al., 2020); therefore, elevated dFe in any particular region is not indicative of oil emissions alone. Further, oil is not the only anthropogenic source that can exceed 38% Fe solubility (Li et al., preprint).
The authors acknowledge that definitive source attribution requires isotopic/tracer labeling or process-based modeling, yet throughout the text they overstate speculative source assignments.

How do the authors intend to leverage the ML-models in the future? The practical utility of reproducing dFe and Fesol% using the EA-DNN model, and the rationale for using trace metal input features, remain unclear. It appears to me that the ML-models merely reproduce known associations that exist within the input data. Are these input features intended as mechanistic proxies for solubility processes, or as operational shortcuts for dFe estimation? In addition, the manuscript does not clearly articulate whether the model is intended for future projections of dFe under changing atmospheric conditions or to serve primarily as a diagnostic tool. Lines 776–777 mention potential expansion to include air pollutants influencing solubility. This application could have yielded mechanistic insights into uncertainties in current Fe solubility parameterizations. However, the authors cite limited data availability as a barrier to this approach, even though sulfate aerosol and SO₂/NOₓ measurements coinciding with Fe aerosol observations are widely available in global atmospheric datasets derived from models, satellites, and ground-based instruments (AeroCom: Gliß et al., 2021; GHOST: Bowdalo et al., 2024).

The component analyses appear to lack consideration of seasonality. The manuscript does not clarify how seasonality/any temporal component is incorporated into the ML framework. What are the sensitivities of the model predictions to seasonal variations in dust emissions and factors influencing acidic chemical processing such as boundary layer dynamics and anthropogenic pollutants?

The study does not clearly articulate a novel contribution to the field. Atmospheric chemical processing is a well-established mechanism controlling trace metal solubility in aerosols, with its mechanistic pathways clearly characterized through physical and chemical investigations. It is also well established that, although dust Fe exhibits lower solubility than alternative sources of atmospheric Fe, dust-derived dFe dominates global and regional inventories because its total mass exceeds that of other sources by orders of magnitude (Hamilton et al., 2019; Myriokefalitakis et al., 2018). Therefore, it is unsurprising that the ML-models reproduce these relationships, particularly given that the input features only included variables with direct correlations to dFe and Fe solubility.

The results section is excessively repetitive and could use less jargon. The conclusion that atmospheric chemical processing controls Fesol% and dFe is restated multiple times across sections. It could be stated once in the discussion as the conclusion drawn from multiple modeling approaches and lines of evidence. The iterative presentation of model configurations and SHAP/integrated gradients analyses for each region creates unnecessary redundancy and increases reader fatigue. Additionally, ML terminology (e.g., "features", "loadings"), while necessary within the methods section, should be used with restraint in the results and discussion sections to make the paper accessible to a larger audience. I recommend a condensed presentation (potentially even as its own section in Part I) with simplified language and more consolidated regional interpretation; this would markedly improve readability and allow the reader to focus on substantive findings.

Technical Corrections
L37: Provide a definition for dissolved in this context.
L48: The snapshot problem is important for interpretation of your ML-model/observation comparisons, but it is not really discussed outside of the background section. Enhance?
L50: What is meant by reanalysis here? As in meteorological nudging? Or retrospective analyses of historical datasets?
L52-54: When speaking about marine regions, specify from the beginning that samples are airborne (not waterborne).
L65-100: Better suited for methods, in my opinion.
L102: I did not have explicit knowledge of Part I reading this. It is advisable to provide a brief summary here that includes the temporal and spatial distribution of samples in addition to the solubility quantification approaches. Perhaps a succinct table or map that is conserved across Parts I & II.
L108: specifies that models cannot handle missing values, but L143 states that XGBoost can handle missing values. Please correct.
L118-119: citation for this reported range of dFe/dAl?
Figure 1: please provide scales/labels on all axes for all figures, even if unitless rank scales.
L169: The use of model accuracy here is better replaced with “model performance” or “model skill”, since the observations are not always perfectly representative of the truth or ambient average, as indicated by the snapshot problem discussed earlier.
L400: Is there direct measurement of Fe-oxides or is this a speculation to explain your claim? See specific comment on source apportionment concerns.
L472: What is marine chemical alteration in this context?
Figure 9: please provide units for dFe in labels
L800: suggest rephrasing "marine aerosol" as "shipboard-based measurements"; "marine aerosol" implies sea spray.
L815: Fe solubilization is already parameterized directly by aerosol acidity in Earth System Models, discuss this?

References Cited in this Review:
Rathod et al., 2024: https://doi.org/10.1029/2023JD040332

Rathod et al., 2020: https://doi.org/10.1029/2019JD032114

Zhang et al., 2021: https://doi.org/10.1029/2021JD036070

Bergas-Masso et al., 2023: https://doi.org/10.1029/2022EF003353

Gliß et al., 2021: https://doi.org/10.5194/acp-21-87-2021

Bowdalo et al., 2024: https://doi.org/10.5194/essd-16-4417-2024

Hamilton et al., 2019: https://doi.org/10.5194/gmd-12-3835-2019

Myriokefalitakis et al., 2018: https://doi.org/10.5194/bg-15-6659-2018

Ito et al., 2023: https://doi.org/10.5194/acp-21-87-2021

Li et al., (preprint): https://doi.org/10.5194/egusphere-2025-4058
Citation: https://doi.org/10.5194/egusphere-2026-1615-RC1
RC2: 'Comment on egusphere-2026-1615', Anonymous Referee #2, 09 Jun 2026

The manuscript applies machine learning models to aerosol iron solubility and achieves good statistical performance. However, the broader implication for the aerosol‑iron field is not clearly drawn. ACP usually expects work to advance conceptual understanding. At present, it is not clear how the results change our conceptual picture of aerosol iron or can be used by the wider community (e.g. global modelers, marine biogeochemists).
Another major issue is that the ML features directly involving dissolved iron (such as [d‑Fe]/[d‑Al]) are used to predict Fesol%, which itself is defined from d‑Fe. This amounts to using the target variable to predict itself and leads to information leakage. It would be very helpful to include some sensitivity tests: train and evaluate the models with and without d‑Fe‑related features, and compare performance and SHAP/ICA results.
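The leakage concern and the requested with/without sensitivity test can be made concrete with a small sketch. It assumes the five input features as summarised by Referee #1 (total Fe, Al solubility, dissolved Al, total Fe/total Al, and d-Fe/d-Al) and uses purely synthetic concentrations. A log-linear least-squares fit stands in for XGBoost and the DNN, and all variable names are hypothetical. Under these assumptions, Fesol% = 100 × (d-Fe/d-Al) × d-Al / total Fe, so the target is an exact algebraic function of three inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic concentrations in arbitrary units (illustrative only, not real data).
total_fe = rng.lognormal(3.0, 1.0, n)
total_al = rng.lognormal(3.5, 1.0, n)
d_fe = total_fe * rng.uniform(0.005, 0.4, n)  # dissolved Fe
d_al = total_al * rng.uniform(0.01, 0.3, n)   # dissolved Al
fe_sol_pct = 100.0 * d_fe / total_fe          # prediction target, Fesol%

# The target can be rebuilt exactly from three of the input features.
dfe_dal = d_fe / d_al
reconstructed = 100.0 * dfe_dal * d_al / total_fe
leak_is_exact = np.allclose(reconstructed, fe_sol_pct)

# Feature set as summarised in the review, taken in log space.
log_features = {
    "total_fe": np.log(total_fe),
    "al_sol_pct": np.log(100.0 * d_al / total_al),
    "d_al": np.log(d_al),
    "fe_al_total": np.log(total_fe / total_al),
    "dfe_dal": np.log(dfe_dal),
}


def r2_log_linear(names):
    """In-sample R^2 of a least-squares fit of log(Fesol%) on the named features."""
    design = np.column_stack([log_features[k] for k in names] + [np.ones(n)])
    y = np.log(fe_sol_pct)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - resid.var() / y.var()


all_names = list(log_features)
r2_with = r2_log_linear(all_names)
r2_without = r2_log_linear([k for k in all_names if k != "dfe_dal"])
print(f"exact reconstruction: {leak_is_exact}")
print(f"R^2 with d-Fe/d-Al: {r2_with:.4f}, without: {r2_without:.4f}")
```

In this synthetic setup the solubility is drawn independently of the other inputs, so removing d-Fe/d-Al removes essentially all skill. In real data the drop would be smaller if the remaining features carry genuine information, and measuring that gap is what the requested sensitivity test would do.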
Overall, I think the manuscript would benefit from a substantial re‑organization. At present it reads mainly as an application of machine‑learning techniques to an existing aerosol‑iron dataset, which does not meet the expectations of ACP.

Citation: https://doi.org/10.5194/egusphere-2026-1615-RC2
RC3: 'Comment on egusphere-2026-1615', Anonymous Referee #3, 12 Jun 2026

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-1615/egusphere-2026-1615-RC3-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2026-1615-RC3

Supplement

https://doi.org/10.5194/egusphere-2026-1615-supplement

Short summary

Aerosols supply dissolved iron (d-Fe) to the ocean surface, where it can enhance marine CO₂ fixation. Machine and deep learning can capture nonlinear relationships in observational datasets, but applications to atmospheric chemistry remain limited. Using East Asian aerosol data, this study trained XGBoost and a deep neural network to predict Fesol% and d-Fe in marine aerosols. SHAP and ICA showed that variability was governed mainly by chemical processing of mineral dust and anthropogenic Fe.

