Substantial inter-model variation in OAE efficiency between the CESM2/MARBL and ECCO-Darwin ocean biogeochemistry models

Tyka, Michael Dominik; Zhou, Mengyang; Yankovsky, Elizabeth; Carroll, Dustin

doi:10.5194/egusphere-2025-3713

Preprints

https://doi.org/10.5194/egusphere-2025-3713

Preprints

27 Aug 2025

| 27 Aug 2025

Substantial inter-model variation in OAE efficiency between the CESM2/MARBL and ECCO-Darwin ocean biogeochemistry models

Michael Dominik Tyka, Mengyang Zhou, Elizabeth Yankovsky, and Dustin Carroll

Abstract. Induction of a surface-ocean DIC (dissolved organic carbon) deficit through alkalinity-based or direct CO₂ removal methods has been recognized as a promising approach to meet the projected need for negative emissions. The difficulty of directly measuring the counter-factual CO₂ flux due to rapid spreading of the DIC-deficient plume has put ocean circulation models in the center of the Measurement, Reporting and Verification (MRV) challenge. Confidence in the results of such models is essential for the emerging industry to access carbon credit markets and grow at the required pace, to reach substantial negative emissions by 2050, as envisioned by the Intergovernmental Panel on Climate Change (IPCC).

The kinetics and equilibration time of such a DIC deficit have been shown to vary substantially depending on the location and season of the initial induction point. A major component of this variance is the vertical transport and mixing of the DIC-deficient plume; however, air-sea CO₂ gas exchange and carbonate chemistry are also important.

Currently, it is poorly understood how much the results of DIC-deficit pulse simulations depend on the models chosen. To help close this knowledge gap, we investigate two global circulation models, the CESM2/MARBL model (1°) and the data-assimilative ECCO-Darwin model (1/3°). We perform pulse injection simulations at twelve locations with both models, matched precisely in terms of injection patch geometry, release year and season. We analyze the differences in CO₂ uptake curves, vertical mixing, gas exchange and carbonate chemistry.

We show that in some locations, such as subtropical regions, substantial differences exist between these two models — well beyond the expected intrinsic variation of each model. Furthermore, we demonstrate that the majority of the differences are attributable to the representation of vertical transport, followed by the effect of wind parameterizations. A small amount of difference is attributable to carbonate chemistry parameterization. In some locations, there exists good agreement between the models. In most injection locations, the largest differences between models are found in the first 7 years post alkalinity injection, followed by slow convergence towards the expected theoretical maximums.

Received: 30 Jul 2025 – Discussion started: 27 Aug 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 15914 KB)

Supplement (2291 KB)

Download & links

Preprint (15914 KB)
Metadata XML
Supplement (2291 KB)
BibTeX
EndNote

Michael Dominik Tyka, Mengyang Zhou, Elizabeth Yankovsky, and Dustin Carroll

Status: final response (author comments only)

RC1:
'Comment on egusphere-2025-3713', Anonymous Referee #1, 16 Oct 2025
This work is innovative and highly relevant. It addresses the complex challenge of comparing models when plume trajectories and CO₂ equilibration both differ and are interlinked. The proposed methodology is elegant and offers a valuable approach for disentangling model parameterizations. This study could inspire future multi-model comparisons of OAE. Overall, it represents a strong and meaningful contribution to understanding OAE model uncertainties. It requires minor clarifications, figure adjustments, and additional context for full clarity.
Specific comments:
Line 110: Sea ice should be mentioned here, given that it is studied in your work. You could also consider mentioning thermal effects (e.g., changes in dDIC/dCO₂ affecting equilibration rate constants) and salinity.

Table 1: How the optimization method of ECCO-Darwin may influence the results?

Line 151: Please include the corresponding figure in the supplementary materials.

Line 188: Could you rephrase this sentence. It is unclear why this equilibration rate differs from Zeebe and Wolf-Gladrow (2001). Also, clarify that μ represents vertical dilution, not horizontal (if I understood correctly), and clarify z0.

Lines 200–230: Equations (3) and (4) are referenced but not explicitly defined. Please add their definitions.

Line 220: The weighting method appears to overestimate the plume extent since surface decay is not included. Will this plume attribution method remain valid in longer simulations with multiple OAE deployments? It seems that plumes extent could be overestimated, with accumulation in the ocean and mixing occurring without any corresponding effect on the carbon flux.

It is not entirely clear to me weighting with ΔDIC is problematic, shouldn’t the weights reflect both the trajectory and gas exchange history?

Please provide an explicit definition (equation) for the weights wᵢⱼ.

Lines 295–305: This section suggests that all locations converge after seven years. Please rephrase for clarity. The phrase “not observed in the North Atlantic” is confusing since Figure 1 indicates similar behavior is absent in the Western Sahara, Oman, and the U.S. East Coast. Even Although you discuss these differences later in the text, consider rewording this paragraph to introduce the subsequent sections.

Lines 310–320: A short discussion on how interannual variability influences nₘₐₓ, and why this differs between injection sites and models, would be valuable.

Line 325: Can convergence can really be concluded from Figure 2? It is only evident in panel 2a.

Figure 3: Please move the legend away from the curves for readability. Consider adding a zoom on the first few months.

Line 340: The reported correlation is not immediately evident. It may help to show this relationship explicitly in an additional panel or inset. There appears to be sufficient blank space in Figure 3 to include both plots clearly.

Line 340: The phrasing “the effect is much stronger” suggests subduction causes the n(t) difference in Norway, but this interpretation seems premature. The correlation is not clear, Alaska shows similar n(t) curves between models despite larger surface fraction differences.

Line 370: Did Xie et al. use coastal or offshore injection? Can this change their interpreatation?

Line 402: The sentence “This is consistent with a general understanding that ocean carbonate chemistry is well understood and therefore not a major contributor to the final efficacy of OAE” seems to be missing a word—perhaps “uncertainty” or “variability”?

Lines 410–420: Could you comment on the effect of wind on MLD?

Line 433: The discussion on biological influence is insightful. However, the cause–effect relationship in this sentence is unclear: “However, the shape of the OAE efficiency curves is, in principle, independent of the quantity added, so in that sense the efficiency curves appear to be indeed relatively independent of biological activity’. Consider rephrasing for clarity.

Line 451: It would help readers if you reiterated which parameters are being referred to here. Also, it is not clear how parameters vary in Figures 9–12 or the magnitude of these variations. This information could be added to the Methods section.

Line 462: Please specify the magnitude of the change in β.

Line 464: Please add a reference to a figure supporting this result.

Figures 9–11: Add x-axis labels and include the black line in the text legend.

Figure 9: Were ECCO-Darwin parameters also changed to CESM2 ones? Does the 3-hourly versus 6-hourly wind forcing matters?

Line 490: One might argue that the remaining challenge lies in accurately determining the location of the plume.

Line 502: Providing specific locations would make the statement clearer to readers.

Figure 13: You might consider moving this figure earlier in the manuscript.

Line 576: It may be worth reiterating that vertical DIC and alkalinity gradients need to be appropriately resolved, as this is highly dependent on the biological representation.

Limitations paragraph: Could you discuss limitations specific to ocean-only models? For instance, are internal variability and feedback processes a concern?

General :
Ensure consistent labeling of subplots : use i), ii), iii) in figure, referred to a) b) c) in the text legend.

Fig 11 & 12 : Should it be “ECCO minus CESM2” in figures inset legend?
Citation: https://doi.org/10.5194/egusphere-2025-3713-RC1
- AC2:
  'Reply on RC1', Michael Tyka, 12 Nov 2025
  We thank the reviewer for their constructive comments and suggestions !
  
  Line 110: Sea ice should be mentioned here, given that it is studied in your work. You could also consider mentioning thermal effects (e.g., changes in dDIC/dCO₂ affecting equilibration rate constants) and salinity.
  Yes, other factors can also play a role. We’ve added “the gas transfer velocities (which are a function of wind speed and ice coverage)” and “Similarly temperature and salinity can affect the carbonate chemistry state.” to line 110, to clarify this.
  
  Table 1: How the optimization method of ECCO-Darwin may influence the results?
  The optimization method of ECCO-Darwin uses the adjoint method for the physics, which adjusts initial conditions, surface-ocean boundary conditions, and 3-D time-invariant mixing coefficients; model biogeochemistry is optimized using a low-dimensional Green's Functions approach to adjust initial conditions and Darwin parameters. This results in a physically-consistent counterfactual solution with fully-closed property budgets (i.e., no nudging is used in the assimilation process). We note that ECCO-Darwin does not have a long spin-up period from pre-industrial conditions, as done in many forward-only ocean and Earth System Models, but uses the ECCO data assimilation methodology to both reduce spin-up and drift in a data-constrained simulation that starts in 1992.
  
  We’ve added this description to the manuscript. It is hard to anticipate what exactly the effect of these architectural differences will be. In general, the assumption is that constraining the model to data should yield a more realistic model. However, in this paper we’re not trying to determine which model is the “best”, compared to a suite of ocean observations but simply ask the question how large the variance is between two commonly- used models.
  
  Line 151: Please include the corresponding figure in the supplementary materials.
  We added a Figure showing this is indeed the case.
  
  Line 188: Could you rephrase this sentence. It is unclear why this equilibration rate differs from Zeebe and Wolf-Gladrow (2001). Also, clarify that μ represents vertical dilution, not horizontal (if I understood correctly), and clarify z0.
  The representation of the equilibration rate used here was developed in Zhou et al 2024. It was adapted to account for the fact that in the simulation gas exchange only occurs from the top simulation cell, whereas the equation in Zeebe and Wolf-Gladrow (2001) expresses the rate in terms of the mixed layer depth (using a simpler box model) which does not have a strict correspondence in the simulation, as the mixed layer typically spans many simulation layers and varies from place to place.
  
  We’ve expanded this section in the manuscript to give more context on where this formulation came from and define more explicitly what the terms μ and z0 mean.
  
  Lines 200–230: Equations (3) and (4) are referenced but not explicitly defined. Please add their definitions.
  We’re not sure what the reviewer is referring to - both equations (3) and (4) are explicitly defined on lines 171 and 175, respectively and show how [D] can be calculated. Perhaps this comment was referring to something else or to a different line?
  
  To clarify further we’ve added explicit equations for how wij is calculated using alkalinity (which is the method we use in the paper) in a section further below.
  
  Line 220: The weighting method appears to overestimate the plume extent since surface decay is not included. Will this plume attribution method remain valid in longer simulations with multiple OAE deployments?
  This analysis is not intended to work with multiple OAE deployments - it is only designed to interrogate a single pulsed release.
  
  It seems that plumes extent could be overestimated, with accumulation in the ocean and mixing occurring without any corresponding effect on the carbon flux.
  Ideally the weighting function should average differences in equilibration parameters over the areas where equilibration is occurring. In that sense, yes, using the excess Alkalinity as the weighting function could potentially include areas where no more equilibration is occurring. However in practice we find that the plume is well tracked by ∆Alk and does not significantly exceed the area of ∆D. By the time the equilibration is close to complete, however, the ∆D plume is very noisy and poorly defined, due to the vanishing denominator, whereas the ∆Alk-based plume definition remains numerically stable and well defined. We can thus continue to compare the effective rate constant over that area (even though the rate itself has dropped to zero).
  
  It is not entirely clear to me weighting with ΔDIC is problematic, shouldn’t the weights reflect both the trajectory and gas exchange history?
  Assuming what is meant is weighting with ∆D (not ∆DIC, which does not track the plume correctly), yes, in theory weighting with ∆D is closer to the real situation, however there are two issues we ran into which we did not manage to solve despite quite some effort.
  
  One is conceptual, in that because ∆D depends on the gas exchange history, it convolutes model differences resulting from different mass transport modelling from differences in gas-exchange parameterization. However, we wanted to disentangle these two sources of variance between models. This is described in lines 207-212 in the original preprint. It is true that using ∆Alk instead introduces some differences in the plume extent. We have added a sentence clarifying this downside.
  
  The second problem is that towards the end of the simulation in many places the deficit becomes extremely small or even negative, because the calculated deficit also depends on changes in Alk which are not due to gas exchange. For example a part of the plume can first fully equilibrate, but then drift into an area where ∂DIC/∂Alk is different and actually become positive and the parcel will offgas a little. Ultimately the concept of a deficit is approximate and conceptual and doesn’t reflect a real tracer. The fundamental issue here is that this causes ratios based on D to become numerically unstable, because SumOf(Dij) tends towards zero. This means when we calculate weights, which always takes the form of Dij/SumOf(Dij), the weights become more and more numerically unstable and noisy, to the extent that weighting with such noisy weights ends up being not very useful.
  
  In contrast, weighting the plume extent based on a conservative tracer like Alk avoids this issue because sumof(∆Alkij) remains stable. This gives stable weights and thus a stable estimate of the plume extent. We discussed this on lines 212-216 in the original preprint.
  
  Please provide an explicit definition (equation) for the weights wᵢⱼ.
  An explicit equation was added.
  
  Lines 295–305: This section suggests that all locations converge after seven years. Please rephrase for clarity. The phrase “not observed in the North Atlantic” is confusing since Figure 1 indicates similar behavior is absent in the Western Sahara, Oman, and the U.S. East Coast. Even Although you discuss these differences later in the text, consider rewording this paragraph to introduce the subsequent sections.
  The locations where convergence is clearly absent are Norway, Iceland and U.S. East Coast. We’ve rephrased lines 295-305 for more clarity and included U.S. East coast in the exceptions to the general convergence trend.
  
  We would argue that the differences are continuing to converge in Western Sahara and Oman, even though they are not converged by year 15. However the gradient at the end of the simulation of the dashed line (CESM) is steeper than that of the ECCO lines, suggesting the CESM prediction is slowly catching up to the ECCO one.
  
  Lines 310–320: A short discussion on how interannual variability influences nₘₐₓ, and why this differs between injection sites and models, would be valuable.
  A paragraph discussing the possible sources of differences was added.
  
  Line 325: Can convergence can really be concluded from Figure 2? It is only evident in panel 2a.
  Of course we don’t know what happens after the extent of our simulations, in this case 5 years. But we can observe that the interannual and intermodel variance appears to decrease from year 1 to year 5 in cases a,c and d. Whether it will eventually converge entirely we cannot say for sure. We’ve updated this paragraph to express this in a more precise way.
  
  Figure 3: Please move the legend away from the curves for readability. Consider adding a zoom on the first few months.
  We updated the figure to zoom in on the first year for better clarity and moved the legend such as not to obstruct the plot lines.
  
  Line 340: The reported correlation is not immediately evident. It may help to show this relationship explicitly in an additional panel or inset. There appears to be sufficient blank space in Figure 3 to include both plots clearly.
  
  Line 340: The phrasing “the effect is much stronger” suggests subduction causes the n(t) difference in Norway, but this interpretation seems premature. The correlation is not clear, Alaska shows similar n(t) curves between models despite larger surface fraction differences.
  We shouldn’t have used the word correlation here, that’s perhaps too strong and of course subduction is not the only contributor to the difference as shown later. What can be gleaned however by just qualitatively comparing the curves in Fig 1 and Fig 2, is that in general faster surface dilution corresponds to slower equilibration. We’ve rephrased this section to make it clearer that we’re discussing a clear qualitative relationship between equilibration speed and surface dilution which holds in many locations but does not explain all the differences observed. In particular we’ve corrected the discussion of the Norway location in which subduction appears similar between the models but equilibration is nevertheless much slower in CESM2, owing to other model differences, which motivates the remainder of the analysis.
  
  Line 370: Did Xie et al. use coastal or offshore injection? Can this change their interpreatation?
  Xie et al. also used coastal injections for the most part (most of which did not show much of a resolution dependence), except for their South China location (which, likewise, showed almost no resolution dependence) and the two southern ocean deep water formation areas (which did show some resolution dependence). There’s not a clearly discernable correlation between coastal distance and resolution dependence in their data.
  
  Line 402: The sentence “This is consistent with a general understanding that ocean carbonate chemistry is well understood and therefore not a major contributor to the final efficacy of OAE” seems to be missing a word—perhaps “uncertainty” or “variability”?
  Changed to “This is consistent with a general understanding that the ocean carbonate chemistry is well understood and therefore not a major contributor to the inter-model-variance of OAE efficiency”
  
  Lines 410–420: Could you comment on the effect of wind on MLD?
  Wind stress can increase vertical mixing in the upper ocean, which is parameterized using the GGL90 scheme. We added a sentence to this section.
  
  Line 433: The discussion on biological influence is insightful. However, the cause–effect relationship in this sentence is unclear: “However, the shape of the OAE efficiency curves is, in principle, independent of the quantity added, so in that sense the efficiency curves appear to be indeed relatively independent of biological activity’. Consider rephrasing for clarity.
  We’ve removed this unnecessary and confusing sentence. We added more discussion on how the biological model may interact with alkalinity and DIC in general and how this may be reflected in the analysis of equilibration rates.
  
  Line 451: It would help readers if you reiterated which parameters are being referred to here.
  We replaced “(k and β)” with “(gas-exchange velocity $k$ and carbonate sensitivity $\beta$)” to make this more clear.
  
  Also, it is not clear how parameters vary in Figures 9–12 or the magnitude of these variations. This information could be added to the Methods section.
  The parameters are not varied in these experiments. Instead the parameter sets are changed for those from another model, one at a time. For example the gas exchange velocities in CESM2 are swapped for those from OceanSODA while everything else remains the same, and the effect on the calculated overall equilibration constant is observed.
  
  Line 462: Please specify the magnitude of the change in β.
  β is not directly changed in these experiments, rather the entire parameter set is swapped out, against a fixed trajectory. The effective differences in β are detailed in 6e
  
  Line 464: Please add a reference to a figure supporting this result.
  We referenced the sub figure in the text.
  
  Figures 9–11: Add x-axis labels and include the black line in the text legend.
  Thank you for noticing that oversight. The black line is a 12-month moving average. Added to legend.
  
  Figure 9: Were ECCO-Darwin parameters also changed to CESM2 ones?
  No, we only did the comparison in the CESM2→ ECCO-Darwin direction, i.e. all the parameters were from CESM2, with the exception of one parameter at a time (beta, ice and k) benign swapped for the corresponding ECCO-Darwin one.
  
  This was done for each of three possible plume trajectories.
  
  The black line is a 12-month moving average. We added this missing detail to the legend.
  
  Does the 3-hourly versus 6-hourly wind forcing matters?
  We don’t have any good way to determine this one way or another, though our suspicion is that this is not important relative to the differences in the wind speeds themselves.
  
  Line 490: One might argue that the remaining challenge lies in accurately determining the location of the plume.
  Indeed. We discuss this in the following section and in the overall discussion. Accurately predicting the plume trajectory given a particular deployment will require models to be improved.
  
  Line 502: Providing specific locations would make the statement clearer to readers.
  Added a specific reference to Oman instead of the vague “some places”.
  
  Figure 13: You might consider moving this figure earlier in the manuscript.
  Good idea! We moved this into the relevant methods section and added a paragraph to motivate the rate constant framework based on distinguishing plume and parameter-based differences between models:
  
  “This is illustrated in Figure \ref{plume_trajectories} for an alkalinity release near Alaska. The extent of the alkalinity plume from three different runs is overlaid on the local gas-exchange parameter $k$. One can see how the intersection of the plume with $k$ (as well as the other gas-exchange parameters) will determine the overall equilibration rate. Different models will not only have different gas-exchange parameters, they will also predict different plume trajectories - both contributing to the overall observed variance between models. The goal of the following section is to develop a framework to be able to attribute the differences to the various contributing aspects.”
  
  Line 576: It may be worth reiterating that vertical DIC and alkalinity gradients need to be appropriately resolved, as this is highly dependent on the biological representation.
  Yes, it is only during the pulse trajectory that modelling biology makes little difference to the uptake curve. We added a sentence here clarifying this fact. “However, the biological model is critical to setting up the correct starting state of the ocean, in particular the vertical Alk and DIC gradients.”
  
  Limitations paragraph: Could you discuss limitations specific to ocean-only models? For instance, are internal variability and feedback processes a concern?
  We’ve added a short paragraph discussing these limitations.
  
  “The inherent limitations of using ocean-only models, vs full earth-system models have also not been explored sufficiently yet, where reservoir feedbacks or long term changes to calcification rates at large deployment scales could potentially play a role. However many of these effects occur on longer timescales and may not directly influence the equilibration speed of individual OAE deployments.”
  
  Ensure consistent labeling of subplots : use i), ii), iii) in figure, referred to a) b) c) in the text legend.
  Thank you for catching this discrepancy in Figure 11 and 12. It has been corrected.
  
  Fig 11 & 12 : Should it be “ECCO minus CESM2” in figures inset legend?
  Actually, given Reviewer RC2’s questions about what the three different lines represent, and especially because the three trajectories are similar enough, we simplified that figure to just show a single trajectory.
  
  Citation: https://doi.org/10.5194/egusphere-2025-3713-AC2
RC2:
'Comment on egusphere-2025-3713', Anonymous Referee #2, 21 Oct 2025

Review of “substantial inter-model variation in OAE efficiency between the CESM2/MARBL and ECCO-Darwin ocean biogeochemistry model”

This study presents a model intercomparison examining how ocean alkalinity enhancement efficiencies in virtual alkalinity deployments differ due to differences between ocean biogeochemistry models. The upshot is that uncertainties (or biases) in vertical transport in the upper ocean are associated with pretty substantial uncertainties (or biases) in OAE efficiency in popular global ocean biogeochemical models. This is a valuable contribution to the literature, building on recent work to map OAE efficiency and understand its sensitivity to deployment location and ocean circulation regime. It should be published after revision.
Comments/suggested revisions line-by-line:
Table 1. If you’re going to include this Biological Ablation Experiment, I think you should include some reference to and discussion of the actual equations in Darwin and specifically the coupling between Alk, Dic and biology, and whether it is realistic. In general, I think you should have more explicit caveats about potential missing processes, e.g. small-scale plume processes, microscale processes on particles, etc.
L121: State the definition of the brackets [..]
Eq1-2 / Supplemental Material. I didn’t check all the calculations completely. That said, they feel out of scope for this paper. Can you justify why this is needed or appropriate in this context? Why can’t you just use available tools? Is there some reason why you can’t make these calculations using existing infrastructure in pyCO2SYS, e.g. finite differences or automatic differentiation? Are sensitivities like d DIC/ dCO2 or d ln DIC/d ln CO2 not already available from existing outputs of pyCO2SYS? Is this critical to making your computations efficient, i.e. speeding up pyCO2SYS for this application?
The map in Fig S1 would be better in the main text.
Fig S2. If you’re going to keep this material, it would help to discuss how frequently/where DIC/Alk > 1 occurs in the real ocean and thus whether this parameterization improvement beyond Tyka et al. 2022 has practical relevance. The approximation from Tyka et al. 2022 is nice to have here, but how important are these adjustments. Also, I’m wondering if OAE could push the system into low DIC/Alk< 0.5, where the approximations start to fail? (I don’t see how it could push it to high DIC/Alk)
Eq (2): I’m a little wary about this. Be careful to discuss units on these. The Revelle factor d ln CO2/d ln DIC has the advantage of making this type of sensitivity unitless.
Section 2.6: What does “surface dilution” versus “surface distribution” mean? This is not very intuitive. Does it mean vertical vs lateral distribution of the plume? Be more precise and maybe show and discuss examples.
L169/Eq 3: To me, “fully re-equilibrate with the atmosphere (not accounting for reservoir feedbacks (Tyka 2025))” is not quite the right wording. It is not clear what you mean by “equilibrate” or “re-equilibrate”. The equation is also a bit imprecise here, because d DIC/dAlk varies spatially and temporally, as the plume moves. So it is not really a partial derivative. I think of this more precisely as representing a theoretical scenario in which co2 flux adjustment to surface alkalinity perturbations are instantaneous and leave pCO2sw unchanged, in which the partial derivatives make more sense. However, I appreciate the desire to frame it as the limit of a long-term adjustment when the ocean is fully mixed, which is related because the partial derivatives do not vary that much in space and time… Try to frame better.
Related to the above/Eq 7 I think the relationship between pCO2sw and DIC is non-linear, so expressing beta as a partial derivative can introduce errors for large D. By contrast, the partial derivative in eta_max is reasonably linear and thus not so sensitive to large perturbations. I’d appreciate seeing some initial plots of 1/D dD/dt and r, which should not be the same due to the various approximations, maybe in the appendix. Showing errors and giving some intuition. It feels like you jump to the model comparisons without illustrating this initial methodology.
Section 2.7.1/Figures 9-10: I’m confused. I don’t follow what is interchanged and what the 4 lines on the plot represent. Shouldn’t there be a matrix of 9=3 x 3 scenarios for each parameter and virtual deployment? 3 different physical plumes (reflecting same changes in both numerator and denominator in each scenario) and 3 different ratios reflecting different values in numerator and denominator (ECCO 99/CESM99; ECCO92/CESM99; ECCO99/ECCO92) or more if oceanSODA is included as an option for k and beta? Please clarify what is happening here and in the comparisons in Fig 9-10.
Figs 1/4: Move the map in Supplemental to the Main Manuscript.
Fig 3: It would help if you could show zoomed in the first 1-2 years by making the x axis non-uniform. Legend in (a) could be moved to the top right of the panel.
Fig 3-5: We need a more precise definition and maybe plot showing example of how “surface fraction” and “surface-ocean excess alkalinity fraction” are defined. See my comment above about this.
Fig 6f: perhaps use eta_max on the x axis rather than the partial derivative? also what are the units in panels e-f?
Fig 8: I don’t think this Biological Ablation experiment is worth publishing. See my comments above regarding the model formulation. The lack of sensitivity is unsurprising to me, and I’m not sure the Darwin model adequately captures the sensitivities of biology to DIC and Alk (to the extent these sensitivities are significant at all). (I'm more surprised by the difference near Oman)
Figs 9-10: What is the black line? (See my related comment in Section 2). I get a bit overwhelmed by the examples here. I suggest: a) try to reduce the content in Figs 9-12 to a table if possible and/or b) synthesize the narrative and reduce the number of plots. Questions to think about: What is the message you are trying to convey here? Could most of this be in supplemental? Why 6 examples or 12 examples or 2 examples? Why just time series? Why no plots of depth and time if vertical distribution is so important?
Fig 10/12: Use of the term “tropical” rather than “near-equatorial” is more appropriate for these regions.

Citation: https://doi.org/10.5194/egusphere-2025-3713-RC2
- AC1:
  'Reply on RC2', Michael Tyka, 12 Nov 2025
  We thank the reviewer for the thorough review and the great suggestions !
  
  Review of “substantial inter-model variation in OAE efficiency between the CESM2/MARBL and ECCO-Darwin ocean biogeochemistry model”This study presents a model intercomparison examining how ocean alkalinity enhancement efficiencies in virtual alkalinity deployments differ due to differences between ocean biogeochemistry models. The upshot is that uncertainties (or biases) in vertical transport in the upper ocean are associated with pretty substantial uncertainties (or biases) in OAE efficiency in popular global ocean biogeochemical models. This is a valuable contribution to the literature, building on recent work to map OAE efficiency and understand its sensitivity to deployment location and ocean circulation regime. It should be published after revision.
  Table 1. If you’re going to include this Biological Ablation Experiment, I think you should include some reference to and discussion of the actual equations in Darwin and specifically the coupling between Alk, Dic and biology, and whether it is realistic. In general, I think you should have more explicit caveats about potential missing processes, e.g. small-scale plume processes, microscale processes on particles, etc.
  We will add a more extensive discussion of the features of the Darwin biology model, its dependence on the carbonate state and processes not modeled here.
  
  In the Darwin equations, the productivity rates are not explicitly dependent on DIC or Alk. Therefore our null hypothesis was indeed that the effect of removing biology should be very small if any. However it was useful to numerically confirm that it is. However, our results also show that the sensitivity is not completely zero. We’re still investigating the underlying reasons for this, however for practical purposes our hypothesis was confirmed.
  
  Fig 8: I don’t think this Biological Ablation experiment is worth publishing. See my comments above regarding the model formulation. The lack of sensitivity is unsurprising to me, and I’m not sure the Darwin model adequately captures the sensitivities of biology to DIC and Alk (to the extent these sensitivities are significant at all). (I'm more surprised by the difference near Oman)
  The importance of modelling biological processes when simulating OAE has so far not been explored sufficiently in the literature and we have had many conversations with colleagues who have argued either way. Therefore we believe this negative result is worth publishing and of interest to the community.
  
  Indeed when we showed some early ablation results in presentations, many listeners were very surprised by these results. It is reasonable to think that because the biological pump sets up the vertical DIC and ALK gradients in the first few 100 meters, simulating this feature during the pulse trajectory is critical. However we show that as long as these resulting gradients are present in the starting conditions, it is not necessary to simulate the processes that maintain them for short-medium length simulations during which the OAE pulses equilibrate - not necessarily an intuitive result for everybody. We've further clarified this in the manuscript.
  
  Furthermore, as we discussed in lines 433–337 the fact that for this sort of efficiency calculation you can ignore the biological tracers, and thus a very substantial part of the calculation, is a useful fact for practitioners to know, as it can save a lot of compute time (advection of the 31 Darwin tracers consumes almost half the compute time).
  
  The reason for the small difference near Oman continues to be unclear. We investigated this a little more since submitting the preprint, suspecting perhaps an effect of the Oman location being the only location at the mouth of a marginal sea (Arabian gulf). However, we tried several marginal sea locations since, and found virtually no difference between the regular and the ablated model. Conversely we tried the offshore Oman location as well as the red sea and here too there appears to be a small dependence on the biological model being present. We’re continuing to investigate the source of this.
  
  L121: State the definition of the brackets [..]
  Added (where the square brackets mean "concentration of")
  
  Eq1-2 / Supplemental Material. I didn’t check all the calculations completely. That said, they feel out of scope for this paper. Can you justify why this is needed or appropriate in this context? Why can’t you just use available tools? Is there some reason why you can’t make these calculations using existing infrastructure in pyCO2SYS, e.g. finite differences or automatic differentiation?
  We don’t disagree with the reviewer that some of the exploration of interesting approximations are auxiliary and supplemental. However, we believe that these approximations are valuable to the community and should exist in the publication record. They have, to our knowledge, not been published before and the various approximations from the literature have never been compared systematically. We’ve heard from colleagues that although they would like to use them, they can’t unless they can give a citation. Inclusion here in the supplement solves this issue for the community. It was included in this manuscript because they were developed and used as part of the work presented here.
  
  Finally we see little downside in including these derivations here - they do not obstruct the main paper in any way but may be of value to those who are interested.
  
  Is this critical to making your computations efficient, i.e. speeding up pyCO2SYS for this application?
  Yes we used the approximations when calculating the deficit for every voxel and timepoint in the simulations simply due to the speed advantage of obtaining ∂DIC/∂Alk. With today's availability of computing power we would not call this critical, but certainly convenient.
  
  Approximations that don’t require solving the carbonate system and are based on Alk and DIC alone (fundamental tracers that are readily available from most biogeochemical simulators) are very convenient and useful because there’s no need to use pyCO2Sys.
  
  We have seen the previously published approximation for (dALK/dDIC ~~ 3-2DIC/Alk) used by other practitioners for this reason. The fact that eta_max is almost independent of anything but DIC/Alk makes this possible (and is an interesting fact by itself). However the previous approximation retains some inaccuracy especially when borate is taken into account with a deviation of up to 3-4% (Fig S2, b)) for typical ocean DIC/Alk values. Given that our new softplus-based equation is just as easy to calculate we felt that we should share this insight with the community.
  
  Are sensitivities like d DIC/ dCO2 or d ln DIC/d ln CO2 not already available from existing outputs of pyCO2SYS?
  d DIC/ dCO2 is not directly available from pCO2sys, but it can be calculated from gammaDIC and [CO2] from pyCO2 like so: dDIC_dCO2 = csys["gamma_dic"]/(csys["CO2"]). We didn’t use this in our approach, but used the full explicit formula given here - the two approaches give the same result. If it is preferable we can remove the entire section 3 from the supplement and present the calculation based on γDIC instead.
  
  Fig S2. If you’re going to keep this material, it would help to discuss how frequently/where DIC/Alk > 1 occurs in the real ocean and thus whether this parameterization improvement beyond Tyka et al. 2022 has practical relevance. The approximation from Tyka et al. 2022 is nice to have here, but how important are these adjustments.
  We’ve changed the plot of the approximations (preprint FigS3) to span only realistic values of Alk/DIC, taken from OceanSODA, rather than taking a full grid of Alk and DIC values. This gives a more practically relevant comparison. The new approximation is shown to be quite a bit better than the old one for high values of DIC/Alk, which do occur (0.95-1.01). Older approximations do more poorly across bigger ranges of eta_max
  
  Also, I’m wondering if OAE could push the system into low DIC/Alk< 0.5, where the approximations start to fail? (I don’t see how it could push it to high DIC/Alk)
  Large scale applications of OAE could push the system into very low DIC/Alk but only very transiently and local to the injection site. Since eta_max is primarily concerned with the ultimate, long-term equilibration point, inaccuracies in transient low DIC/Alk are not particularly important. We’ve added some discussion of this to the supplement.
  
  The map in Fig S1 would be better in the main text.
  
  Figs 1/4: Move the map in Supplemental to the Main Manuscript.
  Moved figure to main text as suggested.
  
  Eq (2): I’m a little wary about this. Be careful to discuss units on these. The Revelle factor d ln CO2/d ln DIC has the advantage of making this type of sensitivity unitless.
  The partial derivative ∂[DIC]/∂[CO2] in equation 2 is also unitless, since it’s concentration divided by concentration. As this is the term explicitly appearing in the gas exchange equation, rather than the Revelle factor (∂ln pCO2/∂ln DIC), we chose this to be the sensitivity that’s being investigated and compared.
  
  We’ve added a sentence in the methods when these equations are introduced to remind the reader that they are unitless.
  
  Section 2.6: What does “surface dilution” versus “surface distribution” mean? This is not very intuitive. Does it mean vertical vs lateral distribution of the plume? Be more precise and maybe show and discuss examples.
  Yes, they are referring to vertical and lateral distribution. These concepts are made more rigorous later in the methods section. However in the interest of giving the reader a more intuitive idea from the start we’ve rephrased the end of section 2.6:Instead, we take a different approach, whereby the effective equilibration rate constant is "reconstructed" from its component terms offline. These include terms dependent solely on the bulk flow of the plume, namely the vertical and horizontal distributions of excess alkalinity over time (in particular the fraction of the excess alkalinity resident in the surface layer). They also include terms dependent only on the parameterization of the gas exchange, i.e the wind, ice and carbonate parameters. In this framework one can then directly examine the sensitivity with respect to any given parameter.
  
  L169/Eq 3: To me, “fully re-equilibrate with the atmosphere (not accounting for reservoir feedbacks (Tyka 2025))” is not quite the right wording. It is not clear what you mean by “equilibrate” or “re-equilibrate”. The equation is also a bit imprecise here, because d DIC/dAlk varies spatially and temporally, as the plume moves. So it is not really a partial derivative. I think of this more precisely as representing a theoretical scenario in which co2 flux adjustment to surface alkalinity perturbations are instantaneous and leave pCO2sw unchanged, in which the partial derivatives make more sense. However, I appreciate the desire to frame it as the limit of a long-term adjustment when the ocean is fully mixed, which is related because the partial derivatives do not vary that much in space and time… Try to frame better.
  We would argue that this framing isn’t really new and goes back to Renforth and Henderson 2017, and was used in He et al 2023 and further developed in Zhou et al 2024 and several other papers. The equation is a precise description of the changes in the carbonate system in small parcel of water for small changes in ∆Alk, over which ∂DIC/∂Alk is a constant. Re-equilibration in this context means that if we start with a water parcel that was previously at some pCO2, and we add a small amount of alkalinity then equation 3 gives the amount of DIC that needs to be added that offsets that amount of alkalinity such that the pCO2 returns to its previous value. While it is true that ∂DIC/∂Alk can change over time for a given parcel, it still gives an estimate of the final equilibration point relative to the alkalinity perturbation at any given moment in time, for a given water parcel. The consequence is simply that the final equilibration point can change somewhat over time, we don’t see this as a problem - it remains a useful tool: Transforming the alkalinity perturbation into a DIC deficit is a useful framework that conceptually unites Ocean Alkalinity Enhancement and mCDR methods that remove DIC from the ocean.
  
  Related to the above/Eq 7 I think the relationship between pCO2sw and DIC is non-linear, so expressing beta as a partial derivative can introduce errors for large D. By contrast, the partial derivative in eta_max is reasonably linear and thus not so sensitive to large perturbations.
  In equation 7 we state the differential equation relating the rate of change of the deficit with the remaining deficit itself, at any given moment in time, which can be valid, even if the terms are dependent on time or D. If the terms k, beta, z0 and mu are independent of D itself, we have a simple first order differential equation. This assumption is reasonably true for k and mu for any value of D (and z0 is a constant). But as the reviewer points out, for beta this assumption is only true when D is small, i.e. where the additional alkalinity is small. While this is true in principle, here even the largest deviations in alkalinity caused by the pulsed alkalinity injections are in the low micromolar range over which beta is pretty linear, and quickly drop to below µM as the plume spreads.
  
  However even if beta is a function of D, the differential equation can still be valid for any given point in time, the only difference is that its integration (i.e. the time evolution of D from its initial value back to zero) won’t be a simple exponential as would be expected for a first order ODE but a slightly deviating curve.
  
  In this case the rate constant isn’t really a constant in time. Indeed, all of the variables here (k, mu and beta) are dependent on time (and location). For simplicity of notation we did not make every variable here an explicit function of time ( k(t), mu(t), etc..) however they all are.
  
  We’re using this context to construct a metric for the instantaneous rate constant at which the deficit is decaying. Because the variables that determine it are not actually constant in time, we can observe how the equilibration slows or speeds up and use this to compare different models. In this context, a slight dependence of beta on D is not fundamentally different than the dependence of beta, k or mu on time and location.
  
  I’d appreciate seeing some initial plots of 1/D dD/dt and r, which should not be the same due to the various approximations, maybe in the appendix. Showing errors and giving some intuition. It feels like you jump to the model comparisons without illustrating this initial methodology.
  We added a figure in the supplement plotting two ways of estimating the instantaneous rate. a) The direct rate of dDIC_total/dt in mol/s (which is the rate of excess CO2 uptake due to the excess alkalinity added) directly from the simulation. b) A “reconstructed” rate based on Equation 8 and 9. I.e. For each surface cell we calculated its deficit D times the gas exchange parameters, which gives the rate of gas exchange in that cell. Summing over all surface cells yields an estimate of the total gas exchange rate in mol/s.
  
  The two curves are in close agreement. This suggests that our conceptual model of using the virtual quantity D is a reasonable model of the gas exchange during the simulation and can therefore be used to compare models.
  
  We have added more text int the methods to make this clearer.
  
  Fig 3: It would help if you could show zoomed in the first 1-2 years by making the x axis non-uniform. Legend in (a) could be moved to the top right of the panel.
  We’ve implemented these great suggestions.
  
  Fig 3-5: We need a more precise definition and maybe plot showing example of how “surface fraction” and “surface-ocean excess alkalinity fraction” are defined. See my comment above about this.
  We’ve added an equation defining the “surface-ocean excess alkalinity fraction” in the methods section. It is simply the fraction of the total excess alkalinity (the alkalinity added in the pulse) that is currently residing in the top simulation layer.
  
  Fig 6f: perhaps use eta_max on the x axis rather than the partial derivative? also what are the units in panels e-f?
  The partial derivatives ∂DIC/∂CO2 and ∂DIC/∂Alk= eta_max are both unitless quantities. We’ve indicated them as such in the figure.
  
  We changed the y-axis label to eta_max in 6f as suggested.
  
  Section 2.7.1/Figures 9-10: I’m confused. I don’t follow what is interchanged and what the 4 lines on the plot represent. Shouldn’t there be a matrix of 9=3 x 3 scenarios for each parameter and virtual deployment? 3 different physical plumes (reflecting same changes in both numerator and denominator in each scenario) and 3 different ratios reflecting different values in numerator and denominator (ECCO 99/CESM99; ECCO92/CESM99; ECCO99/ECCO92) or more if oceanSODA is included as an option for k and beta? Please clarify what is happening here and in the comparisons in Fig 9-10.
  To avoid making things even more confusing we did not look at every possible parameter replacement. Here we just changed a single gas-exchange parameter at a time, from its CESM2 value to its ECCO-Darwin equivalent. Each parameter has its own column, beta, ice and k. For each such replacement, we have to choose a physical plume trajectory to apply this substitution to, we did this once for each plume trajectory we have, namely from each of the three simulations (ECCO1992, ECCO1999 and CESM2). It turns out that the choice of plume doesn’t make a lot of difference to this analysis and the three curves are similar enough. However, there’s a lot of seasonal variation, which is why we added the black 12-month moving average (see below) to smooth out seasonality and look at the long term trends..
  
  Figs 9-10: What is the black line? (See my related comment in Section 2).
  The black line is simply a running 12-month average over all 3 plumes. We failed to include this in the legend and we have corrected this oversight.
  
  I get a bit overwhelmed by the examples here. I suggest: a) try to reduce the content in Figs 9-12 to a table if possible and/or b) synthesize the narrative and reduce the number of plots. Questions to think about: What is the message you are trying to convey here? Could most of this be in supplemental? Why 6 examples or 12 examples or 2 examples? Why just time series?
  As the reviewer suggests, we picked 6 locations (3 polar and 3 equatorial) to be featured in the main paper and moved the remaining three locations to the supplement, for figure 9/10 and figure 11/12, reducing each pair to a single figure in the main text. For the old figures 9/10 we also simplified the analysis and used only a single plume (taken from CESM2), rather than three plumes, since the three resultant curves were extremely similar.
  
  To guide the reader better through this section we added more explicit explanation of which properties got swapped and which were retained.
  
  Why no plots of depth and time if vertical distribution is so important?
  We’ve added plots of depth vs time in the supplement as Figure S2.
  
  Fig 10/12: Use of the term “tropical” rather than “near-equatorial” is more appropriate for these regions.
  We changed the wording accordingly
  
  Citation: https://doi.org/10.5194/egusphere-2025-3713-AC1

Michael Dominik Tyka, Mengyang Zhou, Elizabeth Yankovsky, and Dustin Carroll

Supplement

https://doi.org/10.5194/egusphere-2025-3713-supplement

Michael Dominik Tyka, Mengyang Zhou, Elizabeth Yankovsky, and Dustin Carroll

Viewed

Total article views: 2,319 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
2,120	161	38	2,319	54	34	38

HTML: 2,120
PDF: 161
XML: 38
Total: 2,319
Supplement: 54
BibTeX: 34
EndNote: 38

Views and downloads (calculated since 27 Aug 2025)

Month	HTML	PDF	XML	Total
Aug 2025	448	8	3	459
Sep 2025	1,448	28	7	1,483
Oct 2025	97	27	9	133
Nov 2025	60	47	10	117
Dec 2025	61	39	6	106
Jan 2026	6	12	3	21

Cumulative views and downloads (calculated since 27 Aug 2025)

Month	HTML	PDF	XML	Total
Aug 2025	448	8	3	459
Sep 2025	1,448	28	7	1,483
Oct 2025	97	27	9	133
Nov 2025	60	47	10	117
Dec 2025	61	39	6	106
Jan 2026	6	12	3	21

Viewed (geographical distribution)

Total article views: 2,309 (including HTML, PDF, and XML) Thereof 2,309 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 09 Jan 2026

Short summary

Quantification of the kinetics of the induced ocean CO₂ uptake following application of marine carbon dioxide removal technologies (mCDR) is crucial for such technologies to gain scientific and social acceptance. Here, we compare two circulation models commonly used for this purpose and find substantial differences in their predictions. We analyze which physical aspects of the models contribute the most to the inter-model discrepancies, and thus require future research.


Total:	0
HTML:	0
PDF:	0
XML:	0