the Creative Commons Attribution 4.0 License.
Correcting Aerosol Extinction Coefficient Vertical Structure Biases in GEOS-Chem via a Physics-Informed Transformer with Physical Mechanism Diagnosis
Abstract. We propose a physics-informed Transformer framework to correct biases in the Aerosol Extinction Coefficient (AEC, km⁻¹) profiles simulated by GEOS-Chem. Unlike a standard Transformer, our framework features a dual-stream architecture with explicit physical constraints. It employs Gated Feature Fusion to integrate vertical structures (combining GEOS-Chem priors with MERRA-2 profiles) by dynamically identifying height-dependent drivers, and leverages Cross-Attention to incorporate MERRA-2 surface environmental constraints for modulating AEC vertical reconstruction with synoptic contexts. This approach effectively predicts systematic biases relative to Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) satellite observations and resolves AEC profiles, surpassing methods that retrieve only aerosol layer heights. "Leave-One-Year-Out" validation over East Asia during 2017–2019 demonstrates significant AEC fidelity improvements, increasing R from 0.49–0.53 in the GEOS-Chem simulations to 0.66–0.73 and reducing RMSE by approximately 25 %. The model effectively mitigates over-diffusion, significantly reducing AEC simulation biases in the critical near-surface layer while restoring smoothed biomass burning and dust plumes. Additionally, it exhibits robust cross-continental transferability, reproducing bias patterns over the North American domain (R=0.70) without retraining, confirming the internalization of universal physicochemical relationships linking atmospheric states to simulation biases. Furthermore, interpretability analysis establishes a feedback loop from data-driven correction to physical model improvement. The model identifies temperature and sensible heat flux as primary drivers to constrain boundary layer mixing, and uses environmental proxies (e.g., vegetation indices) to diagnose deficiencies in dust uplift and secondary aerosol formation. These insights provide a physical basis for refining parameterization schemes in chemical transport models.
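The "Leave-One-Year-Out" validation named in the abstract can be illustrated with a minimal sketch. It assumes each collocated profile carries a year label; the function name `leave_one_year_out` and the sample array are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def leave_one_year_out(years):
    """Yield (held_out_year, train_idx, test_idx) for each distinct year."""
    years = np.asarray(years)
    for held_out in np.unique(years):
        test_mask = years == held_out
        # Train on every other year, test on the held-out year only.
        yield int(held_out), np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical year labels, one entry per collocated profile.
sample_years = np.array([2017, 2017, 2018, 2018, 2019, 2019, 2019])
folds = list(leave_one_year_out(sample_years))
```

Each fold keeps one full year out of training, so the scores reflect generalization across years rather than interpolation within a year.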
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-397', Anonymous Referee #1, 09 Mar 2026
- RC2: 'Comment on egusphere-2026-397', Anonymous Referee #2, 11 Apr 2026
This study attempts to reduce biases in the vertical structure of GEOS-Chem aerosol simulations using CALIPSO data. It addresses the need to reconstruct spatially continuous aerosol distributions with high-accuracy vertical profiles. Methodologically, the paper proposes a Physics-Informed Transformer framework that explicitly incorporates physical priors through dual-stream inputs, gated feature fusion, and cross-attention mechanisms, thereby overcoming the limitations of traditional CNNs in capturing the vertical dependencies of aerosols. Several issues need to be carefully addressed in the manuscript.
(1) The manuscript is too long. Please reduce redundant text. In particular, the methodology section contains many technical terms that make it extremely difficult to follow. Figure 2 shows the technical framework, but I could not follow anything beyond the input layer. Please add more detail to the figure to show the physical meaning of the feature embedding layer, the Transformer Encoder, and the Cross-Attention layer. The methodology section needs to be rewritten so that atmospheric chemists and physicists can understand it.
(2) In Section 3.1 (Eq. 1), the learning target is defined as the bias of GEOS-Chem relative to CALIOP. However, Section 2.2 states that CALIOP AOD shows a mean relative bias of −5.1% ± 8.5% against AERONET, and CALIOP backscatter agrees with HSRL within 1.0% ± 3.5%. These results indicate that CALIOP itself contains systematic uncertainties. Consequently, the learned “bias” effectively represents a combination of GEOS-Chem error and CALIOP error. If CALIOP has a negative bias, the model may incorrectly learn a tendency to increase AEC, even in cases where GEOS-Chem is accurate. This issue directly affects the interpretation and reliability of the bias-correction results. Suggestions: (a) Explicitly acknowledge this limitation in Section 2.2 or 3.1, and discuss the potential impact of CALIOP uncertainty on the training target. (b) Add a sensitivity analysis in the Results section (Section 4): quantify how the bias-correction results change if perturbations are applied to the CALIOP inputs.
(3) The manuscript states that the interpretability analysis can provide a solid physical basis for improving GEOS-Chem parameterizations and emission inventories, thereby establishing a feedback loop from “data-driven correction” to “physical mechanism improvement.” However, in Section 3.5, the interpretability framework is limited to feature-sensitivity approaches such as gradient attribution, permutation importance, and SHAP, without explaining how these results can be translated into concrete parameterization adjustments. For example, if SHAP identifies “sensible heat flux” as a dominant driver of the bias, it is unclear which specific GEOS-Chem parameters should be modified (e.g., diffusion coefficients in the PBL scheme, surface flux parameterizations, or others), and how such modifications would be implemented. This missing link weakens the claimed feedback loop and makes the statement appear largely conceptual rather than actionable.
(4) The Introduction appears to be over-cited, which makes it difficult for readers to clearly distinguish foundational studies from more recent developments. It would improve readability and focus to streamline the citations, limiting each statement to approximately three to five representative and/or recent review or key references.
(5) Section 3.5 is divided into 3.5.1 (Dual-Mechanism Attribution), 3.5.2 (Gated Fusion Analysis), and 3.5.3 (Feature Sensitivity and Regional Drivers). However, the Permutation Feature Importance and SHAP analysis in 3.5.3 overlap functionally with the Gradient-based Attribution in 3.5.1—both are essentially feature importance assessments. It is recommended to clearly articulate the complementarity of these three attribution methods: Gradient-based Attribution captures local sensitivity, Permutation Feature Importance provides global ranking, and SHAP analysis handles feature interactions and regional heterogeneity.
(6) Line 198: Explain the specific collocation strategy. How are the two datasets matched in space and time? What level of representativeness error might be introduced by this collocation approach?
(7) Lines 746–748: The lower transfer performance over North America (R = 0.70) compared to East Asia (R = 0.93) is attributed to a shift in aerosol composition regimes (higher SOA fraction in North America versus sulfate–nitrate–dust dominance in East Asia). While this explanation is reasonable, it remains qualitative and lacks supporting evidence. It would be helpful to further evaluate the performance over North America stratified by CALIOP aerosol types.
(8) Abstract: too technical. I suggest adding several sentences at the beginning to introduce the scientific context and research gap before moving into technical details.
(9) Figure 3. Against which data is R between the model prediction and the reference computed? What are the units of RMSE and Bias?
(10) Figure 5. Why does India show such strongly negative results?
(11) Figure 6. The correlation is not that good even after correction. Can you explain where the points far from the 1:1 line are located?
(12) Figure 10. Font size too small.
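The sensitivity analysis requested in comment (2b) could start from a sketch like the one below. It only re-scores already corrected profiles against perturbed copies of the CALIOP reference, which is a cheaper first step and does not replace retraining on perturbed targets. The function names, the relative-offset perturbation model, and the sample values are assumptions for illustration and do not come from the manuscript.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation over all flattened elements."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])

def rmse(x, y):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(y)) ** 2)))

def reference_sensitivity(corrected, caliop, rel_biases, noise_frac=0.0, seed=0):
    """Re-score corrected AEC against perturbed copies of the reference.

    rel_biases: relative offsets applied to CALIOP, e.g. -0.05 scales it by 0.95.
    noise_frac: standard deviation of additional random relative noise.
    """
    corrected = np.asarray(corrected, dtype=float)
    caliop = np.asarray(caliop, dtype=float)
    rng = np.random.default_rng(seed)
    rows = []
    for b in rel_biases:
        noise = rng.normal(0.0, noise_frac, size=caliop.shape)
        perturbed = caliop * (1.0 + b + noise)
        rows.append({
            "rel_bias": b,
            "R": pearson_r(corrected, perturbed),
            "RMSE": rmse(corrected, perturbed),
            "mean_bias": float(np.mean(corrected - perturbed)),
        })
    return rows

# Hypothetical AEC values (km^-1), for illustration only.
caliop_ref = np.array([0.10, 0.20, 0.30, 0.40])
rows = reference_sensitivity(caliop_ref.copy(), caliop_ref, [0.0, -0.05])
```

Comparing the rows shows how far the reported metrics move when the reference is shifted by an amount comparable to its stated uncertainty.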
Citation: https://doi.org/10.5194/egusphere-2026-397-RC2
This manuscript presents a sophisticated physics-informed Transformer framework to correct GEOS-Chem aerosol extinction coefficient profiles using CALIOP observations. The study is ambitious, methodologically advanced, and addresses an important problem in bridging chemical transport models (CTMs) and vertically resolved lidar observations. The reported improvements in correlation and RMSE, along with cross-continental transferability tests, are promising. However, several issues require clarification before the scientific contribution and methodological advantage can be properly evaluated.
First, the scientific objective requires clearer framing. CALIOP observations are used to define simulation bias during training, but they are not included as inputs during inference. Therefore, the framework is not performing data assimilation, but rather learning a state-dependent mapping between atmospheric variables and historical GEOS-Chem biases. If the goal is to generate corrected AEC fields when CALIOP is unavailable, the method should be clearly described as a supervised bias-correction model conditioned on CTM state and meteorology, and its limitations should be acknowledged. For example, if key emissions (e.g., wildfire events) are missing in GEOS-Chem and not represented in the input features, the model cannot reconstruct those missing signals. The correction is inherently constrained by the information content of the CTM and meteorological predictors. The manuscript should therefore distinguish more carefully between correcting systematic state-dependent biases and compensating for missing physical processes. Clarifying this distinction would strengthen the scientific positioning of the study.
Second, the model architecture appears to rely on instantaneous vertical profiles and meteorological context, without explicit time-series modeling. It is unclear whether any temporal continuity, lagged predictors, or time-window averaging is incorporated into the inputs. A precise description of the temporal collocation strategy between GEOS-Chem and CALIOP is necessary to assess the robustness of the results. In addition, the manuscript does not discuss how diurnal variability in aerosol vertical structure is handled. Given the strong diurnal cycle of boundary layer evolution, turbulent mixing, hygroscopic growth, and photochemistry, aerosol extinction can vary substantially on hourly timescales. It should be clarified whether simple hour-by-hour matching is sufficient, or whether a temporal window, similar to those used in traditional data assimilation frameworks, was considered to reduce representativeness errors. Without such analysis, it remains uncertain whether the reported improvements reflect stable bias correction or sensitivity to sampling timing and diurnal variability.
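The two temporal matching options contrasted above can be sketched as follows. This is a hedged illustration: it assumes model output is available at discrete times expressed in decimal hours, and the function names and the one-hour half-window are hypothetical choices, not the manuscript's collocation procedure.

```python
import numpy as np

def nearest_hour_profile(model_hours, model_profiles, overpass_hour):
    """Pick the single model profile closest in time to the overpass."""
    model_hours = np.asarray(model_hours, dtype=float)
    idx = int(np.argmin(np.abs(model_hours - overpass_hour)))
    return np.asarray(model_profiles, dtype=float)[idx]

def window_mean_profile(model_hours, model_profiles, overpass_hour, half_window_h=1.0):
    """Average all model profiles within +/- half_window_h of the overpass."""
    model_hours = np.asarray(model_hours, dtype=float)
    model_profiles = np.asarray(model_profiles, dtype=float)
    mask = np.abs(model_hours - overpass_hour) <= half_window_h
    if not mask.any():
        raise ValueError("no model output inside the temporal window")
    return model_profiles[mask].mean(axis=0)

# Hypothetical hourly output with two vertical levels per profile.
hours = [4.0, 5.0, 6.0, 7.0]
profiles = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
nearest = nearest_hour_profile(hours, profiles, overpass_hour=5.5)
windowed = window_mean_profile(hours, profiles, overpass_hour=5.5)
```

The difference between the two results gives a rough measure of how sensitive a matched profile is to sampling time.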
Third, the proposed architecture includes multiple advanced components. While the performance improvements are reported relative to the original GEOS-Chem simulation, there is no comparison with simpler machine learning baselines. It is therefore unclear whether the reported gains arise from the Transformer architecture itself, from the inclusion of additional meteorological predictors, or simply from the supervised bias-learning framework. To justify the methodological novelty, the study should include comparisons with at least one conventional model, such as a multilayer perceptron, a CNN-based model, or a tree-based regression approach. Ideally, ablation experiments isolating the contributions of the cross-attention module and gated fusion mechanism would further demonstrate the necessity of the proposed architecture. Without such benchmarks, it is difficult to assess whether the architectural complexity is warranted.
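A simple baseline of the kind requested here could be as small as a linear ridge regression that predicts the bias from the same predictors. The sketch below uses synthetic stand-in data; the feature matrix, coefficients, and split are invented for illustration and say nothing about the manuscript's actual results.

```python
import numpy as np

def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append intercept column
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # do not shrink the intercept
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict_ridge(weights, X):
    """Apply fitted ridge weights (last entry is the intercept)."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ weights

def rmse(a, b):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic stand-in: three predictors and a noisy linear "bias" target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
bias = X @ np.array([0.5, -0.2, 0.1]) + 0.05 + rng.normal(scale=0.01, size=200)

train, test = slice(0, 150), slice(150, 200)
weights = fit_ridge(X[train], bias[train])
rmse_uncorrected = rmse(bias[test], 0.0)  # leaving the bias untouched
rmse_baseline = rmse(bias[test], predict_ridge(weights, X[test]))
```

Reporting the proposed model next to such a baseline, on identical splits, would show how much of the gain comes from the architecture rather than from supervised bias learning itself.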