the Creative Commons Attribution 4.0 License.
Improving coastal ocean pH estimates through assimilation of glider observations and hybrid statistical methods
Abstract. Ocean acidification monitoring and carbon accounting require accurate estimates of marine carbonate system variables, particularly in dynamic coastal regions where observations remain sparse. This study presents an approach to improving carbonate system state estimates in the California Current System through the assimilation of underwater glider observations with both dynamical and statistical models. We implement a 4D-Var data assimilation system that jointly assimilates physical variables, chlorophyll, and glider-based pH and alkalinity data into a regional coupled physical-biogeochemical model. Our results demonstrate that joint assimilation of carbonate system variables successfully improves pH and alkalinity estimates while maintaining the quality of physical and chlorophyll estimates. Cross-validation experiments reveal that pH data assimilation typically improves estimates near the observation network, although downstream advection of increments can occasionally degrade results. We also show that hybrid estimates that combine the output of the dynamical, physical ocean model with a statistical model produce accurate carbonate system estimates without requiring a biogeochemical model. This finding suggests that physical ocean models and data assimilation systems can obtain reasonable carbonate system estimates by combining statistical methods with model estimates of temperature and salinity.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3560', Anonymous Referee #1, 25 Nov 2025
- RC2: 'Comment on egusphere-2025-3560', Anonymous Referee #2, 01 Feb 2026
The manuscript is an excellent and very well-written paper. It provides a clear methodological advance for improving pH estimates and, more broadly, carbonate-system observation in oceanography by combining data assimilation with existing observing networks and hybrid approaches. The results are convincing but a few points would benefit from clarification.
There is a strong dependence on estimated variables. The pH assimilation relies on joint assimilation of pH and TA, but TA is not directly observed here: it is taken from statistical estimates (ESPER) because autonomous platforms do not routinely measure alkalinity. If TA is biased (e.g., by mCDR influence, river plumes, or DOM), the whole approach can quickly become fragile, or even unusable. This is an important limitation, because it reduces the method’s operational value in complex coastal settings and outside regimes where TA is well captured by broad, global relationships. This point should be mentioned in the manuscript.
There is a risk of statistical “circularity”. ESPER is used both to generate the pH/TA pseudo-observations and to reconstruct carbonate variables from model/DA outputs (Section 2.4, 3.3-3.6 and Figure 4). The same statistical machinery appears on both sides of the evaluation. The authors acknowledge that the good performance of the hybrid product may partly reflect this methodological proximity, rather than a truly independent skill. It would help to include a more “decoupled” validation, either with alternative algorithms (not ESPER) or, ideally, with independent carbonate-system observations.
The sensitivity to observation/background error choices should be tested. In Section 2.5, the error settings are tuned via FPI for T/S/SSH/Chla, but not in the same way for carbonate variables, where the system is more non-linear and where several choices remain assumption-driven (e.g., ±3σ-type reasoning, fixed parameters). What is missing is a real sensitivity analysis: how stable are the results if σ(pH), σ(TA), correlation length scales, or relative weights are changed within reasonable bounds? Do the main conclusions survive if carbonate uncertainties are slightly under- or over-estimated?
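To make the request concrete, the dependence of an analysis on the assumed error standard deviations can be illustrated with a scalar optimal-interpolation toy problem. This is a minimal sketch and not the manuscript's ROMS 4D-Var configuration: the function name `scalar_analysis` and all numerical values (background pH, observed pH, σ values) are hypothetical and chosen only for illustration.

```python
def scalar_analysis(xb, y, sigma_b, sigma_o):
    """Scalar optimal analysis: xa = xb + K * (y - xb).

    K = sigma_b^2 / (sigma_b^2 + sigma_o^2) is the weight given to the
    observation relative to the background.
    """
    gain = sigma_b**2 / (sigma_b**2 + sigma_o**2)
    return xb + gain * (y - xb), gain


# Hypothetical values, not taken from the manuscript.
xb, y = 7.90, 8.00   # background and observed pH
sigma_b = 0.05       # assumed background error standard deviation

# Sweep the assumed observation error to see how the analysis shifts.
for sigma_o in (0.01, 0.02, 0.05, 0.10):
    xa, gain = scalar_analysis(xb, y, sigma_b, sigma_o)
    print(f"sigma_o={sigma_o:.2f}  gain={gain:.3f}  analysis pH={xa:.3f}")
```

In this toy setting the analysis moves from near the observation to near the background as the assumed observation error grows, which is the kind of dependence a sensitivity experiment on σ(pH) and σ(TA) would quantify for the full system.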
In Section 3.6 and the Discussion, the ROMS 4D-Var assumes zero cross-variable covariances. As a result, assimilating pH mainly adjusts DIC/TA, but it does not propagate much information to other biogeochemical controls (e.g., nutrients), and therefore cannot strongly improve oxygen indirectly. The manuscript flags this as a key limitation and points toward ensemble-based approaches or multivariate formulations where cross-covariances are allowed. In its current form, the benefit remains largely “diagnostic” for the carbonate system, rather than truly improving coupled biogeochemical dynamics.
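The effect of the zero cross-covariance assumption can be sketched with a two-variable, single-time analysis update, dx = B Hᵀ (H B Hᵀ + R)⁻¹ d. This is a simplified illustration under stated assumptions, not the ROMS 4D-Var formulation, in which the tangent linear dynamics can still transfer some information between variables within the window. The state ordering (pH, O2), the standard deviations, the correlation value, and the helper names `make_B` and `analysis_increment` are all hypothetical.

```python
import numpy as np


def analysis_increment(B, H, R, innovation):
    """Linear analysis increment dx = B H^T (H B H^T + R)^-1 d."""
    S = H @ B @ H.T + R
    return B @ H.T @ np.linalg.solve(S, innovation)


# Hypothetical background error standard deviations: pH, O2 (umol/kg).
sigma = np.array([0.05, 10.0])


def make_B(rho):
    """Background covariance with cross-variable correlation rho."""
    C = np.array([[1.0, rho], [rho, 1.0]])
    return np.outer(sigma, sigma) * C


H = np.array([[1.0, 0.0]])   # only pH is observed
R = np.array([[0.02**2]])    # assumed pH observation error variance
d = np.array([0.1])          # innovation: observed minus background pH

dx_uncoupled = analysis_increment(make_B(0.0), H, R, d)
dx_coupled = analysis_increment(make_B(0.6), H, R, d)

print("rho=0.0 increments (pH, O2):", dx_uncoupled)
print("rho=0.6 increments (pH, O2):", dx_coupled)
```

With the cross-correlation set to zero, the oxygen increment is exactly zero while the pH increment is unchanged; a nonzero correlation lets the same pH innovation adjust oxygen as well.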
There is a limited spatial/temporal generalization. The analysis is restricted to 2019 and a region that is heavily constrained by the observing network. This is fine for a first demonstration, but it limits how far the conclusions can be generalized. Extending the assessment to other years (including extreme or anomalous conditions) and bringing in additional datasets (e.g., BGC-Argo, high-frequency pCO₂ products) would be important to demonstrate interannual robustness and transferability to other coastal margins.
In highly dynamic waters (Line 67), pH from DA does not always beat the reference unless pH is directly assimilated. For the “measured” pH dataset (glider line), DA experiment 1 (CUGN pH/TA estimates) is sometimes slightly worse than the free/reference run. The clear improvements in the upper 0-150 m come mainly from the hybrid approach or from direct assimilation of the pH sensor (DA experiment 2). The manuscript’s conclusion that denser spatio-temporal observations are needed therefore seems justified. It also means that “pH via ESPER pseudo-observations” is not a universal solution in regions dominated by strong meso/submesoscale variability. This should be mentioned in the discussion and conclusion.
To summarize, these points should be considered:
- Add a sensitivity analysis (σ(pH), σ(TA), correlation scales, localization choices, 4D-Var window length) to show that the qualitative conclusions are robust.
- Strengthen independent validation: use a different algorithm than ESPER and/or independent carbonate data (BGC-Argo, discrete samples, pCO₂ products), especially outside the CUGN footprint.
- Discuss concrete options for allowing cross-covariances (EnKF/EnVar, multivariate B), since this is identified as a structural limitation.
- Better frame situations where TA is likely perturbed (mCDR, terrigenous inputs): detection criteria, potential switch to a more regional/statistically adapted framework, or assimilation of alternative variables.
Citation: https://doi.org/10.5194/egusphere-2025-3560-RC2
This manuscript presents a rigorous and timely assessment of how glider-based carbonate-system observations can improve coastal pH estimates through 4D-Var assimilation in ROMS–NEMUCSC. The integration of pH and alkalinity with physical and chlorophyll data, combined with a thorough evaluation of ESPER-based hybrid estimates, makes this contribution relevant for coastal carbon monitoring and DA system design.
Overall, the study is technically strong, clearly motivated, and generally well executed. The comparison between full biogeochemical DA and hybrid statistical–dynamical methods is valuable and will interest both modeling and observational communities. The manuscript is publishable after major revisions aimed at sharpening key messages and clarifying methodological choices.
Major comments
1. The manuscript is rich in experiments, but the core scientific conclusions could be distilled more explicitly. The three main findings (limited impact of physical DA on pH, strong improvement from pH+alkalinity DA, and competitive performance of hybrid ESPER approaches) should be highlighted earlier and revisited more succinctly in the Discussion.
2. The necessity to assimilate estimated, not measured, alkalinity (Section 2.6) is a central limitation. The discussion acknowledges this but remains somewhat cautious. The authors should explicitly quantify the sensitivity of the pH increments to TA uncertainty and clarify in which coastal regimes the ESPER TA is reliable, and where it may fail (river plumes, OM-rich waters, denitrification).
3. Some cross-validation experiments show deterioration of pH downstream of the lines, attributed to advection of increments. This is important for future glider network design. A brief dynamical explanation (e.g., density structure, mesoscale features along Line 67) would strengthen the argument.
4. The result that hybrid ESPER estimates outperform the full BGC model (when carbonate variables are not assimilated) is striking. The implications deserve more emphasis: under which conditions does a hybrid approach suffice operationally? Is the benefit solely from improved T–S via physical DA, or also from limitations in the NEMUCSC carbon module?
5. The study shows an expected improvement when O2 is assimilated, but the weak coupling between pH and O2 increments reflects structural constraints of the DA system. It would be beneficial to comment on whether variable-covariance specification (currently set to zero) is a limiting assumption for future biogeochemical DA.
6. The manuscript relies exclusively on ESPER for alkalinity and DIC estimation, but does not justify this choice. This is important because CANYON-B/CONTENT is widely used in the community, specifically trained for glider-type variables, and often performs better in coastal and upwelling systems due to its inclusion of oxygen and sometimes nitrate as predictors. The authors should briefly explain why ESPER was selected, and whether alternative empirical regressions (e.g., CANYON-B, LIAR, multi-sensor neural networks) were evaluated. A short comparison or rationale would strengthen confidence in the robustness of the hybrid approach. At minimum, please clarify what variables ESPER requires in this implementation, whether CANYON-B was unsuitable due to predictor availability or training domain, and whether differences between algorithms could alter the conclusions on hybrid performance.
Minor comments
- Figures 4 and 5 are informative but visually dense; consider simplifying color scales or moving supplementary diagnostics to the Appendix.
- State the glider pH sensor accuracy explicitly when first introduced (currently only in Table 3).
- Clarify whether ESPER was re-trained or used as published.
- The manuscript is long; some methodological descriptions (e.g., NEMUCSC structure) could be tightened.