the Creative Commons Attribution 4.0 License.
Assessing Earth system responses in deep mitigation scenarios with activity-driven simulation of carbon dioxide removal
Abstract. Assessing Earth system responses arising from carbon dioxide removal (CDR) requires developing and simulating pairs of scenarios – a mitigation scenario with deployment of CDR and a corresponding no-CDR baseline. The latter describes a world where no CDR is deployed, such that net carbon emissions are higher and a given temperature target may be missed. While over the past years a rich literature on deep mitigation scenarios with CDR has been emerging, no-CDR baselines have mostly been explored in stylized Earth system model (ESM) experiments. In such simulations, a no-CDR baseline simply assumes that CDR is “switched off”, while socio-economic constraints are not considered. However, the deployment of CDR in deep mitigation scenarios, created by integrated assessment models (IAMs), is embedded in a consistent socio-economic description of plausible futures, and disallowing CDR may affect climate drivers due to changes in the energy system and in land-use dynamics. Particularly, when moving towards an activity-driven representation of CDR in emission-driven ESMs, where the activity that draws down CO2 from the atmosphere is explicitly modelled, the creation of no-CDR baselines comes with challenges and trade-offs. Here, we conceptualize a framework for emission-driven ESM simulations of IAM scenarios that allows us to determine carbon-cycle and biogeophysical feedbacks of CDR deployment using no-CDR baselines. We show that different options exist for the creation of no-CDR baselines, which offer different insights and have their specific advantages and limitations. We also demonstrate that internal variability of the climate system inherently limits our ability to detect the small signals related to CDR deployment and its feedbacks. 
Hence, unless a sufficiently large initial conditions ensemble is employed, stylized modelling approaches may remain preferable for some applications, e.g., the quantification of regional biogeophysical effects of CDR deployment.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-833', Irina Melnikova, 06 Apr 2026
- RC2: 'Comment on Schwinger et al', Benjamin Sanderson, 18 Apr 2026
Schwinger et al. present a conceptual framework for assessing Earth system responses to carbon dioxide removal (CDR). The central contribution is the operational distinction between net atmospheric removal (NAR) and process carbon removal (PCR), through construction of a no-CDR baseline in the IAM-ESM modelling chain.
This is a timely paper: a complete emissions-driven framework necessitates activity-driven CDR representations, but the community lacks a shared vocabulary for what coupled CDR assessment is actually measuring. The no-CDR baseline formalism is well thought through, and the ocean alkalinity enhancement (OAE) illustration gives a concrete number for the carbon-cycle offset that can be used for benchmarking and comparison.
I recommend minor revisions. The framework is sound but four points warrant development. The treatment of internal variability in Section 5.1 identifies a real problem but doesn't follow through on the implications for simulation design, and in particular stops short of recommending offline experiments that would sidestep the noise problem for PCR estimation. The multi-model intercomparison gap flagged in the conclusions deserves a more concrete protocol suggestion than it currently receives. The freezing approach to A/R in the ex-post baseline construction conflates policy-driven afforestation with passive forest regrowth from land abandonment, which likely biases the attributable A/R signal. And the counterfactual baseline framing in Section 3.2 understates how different ex-post and counterfactual approaches actually are: they answer different questions and the paper should distinguish them more carefully rather than presenting counterfactual as a cleaner-in-principle alternative.
Major Comments
1. The internal variability discussion suggests a complementary offline experiment tier
Section 5.1 is one of the most valuable parts of the paper. The authors show that the CDR scenario and Baseline diverge in ENSO phase within a few years of OAE deployment beginning, and that noise in the annual PCR estimate remains comparable to the signal even for decadal averages with a three-member ensemble. Larger initial-condition ensembles are clearly needed for emission-driven NAR estimation. For the PCR specifically, though, the authors could consider a complementary approach worth recommending alongside the large-ensemble strategy.
The PCR is the CO2 drawdown attributable to the CDR activity before carbon-cycle feedbacks. For OAE this depends on ocean carbonate chemistry, circulation, and mixing, all explicitly represented in ocean biogeochemistry models that do not require an interactively coupled atmosphere. Running the ocean in offline mode, forced by prescribed circulation from a scenario climatology, controls the modes of variability driving the noise in Figure 4 rather than having them emerge interactively. A single offline simulation pair in this mode could be statistically informative for PCR estimation with minimal additional computational cost. The same logic holds for A/R and BECCS land carbon; offline land surface models (TRENDY-like) give cleaner process-level estimates and can span multiple scenario backgrounds cheaply.
The paper hints at this in the conclusions but doesn't develop it as a design recommendation. I'd suggest a short paragraph distinguishing what offline runs can and cannot do. For NAR estimation, coupling is essential, since the outgassing response of the ocean and the terrestrial sink weakening under declining CO2 require the full coupled system. For PCR, coupling adds noise without adding information. This is a useful distinction to make explicit for groups planning CDRMIP contributions; as currently framed, readers could reasonably conclude that very expensive coupled ensembles are the only way to do activity-driven CDR assessment - and if there are alternatives, it would be good to spell them out.
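The ensemble-size argument in this comment can be illustrated with a back-of-the-envelope sketch. It assumes independent, equal-variance noise in scenario and baseline with no autocorrelation, which is a simplification, and the function names and all numbers are hypothetical rather than taken from the manuscript.

```python
import math


def snr_of_difference(signal, sigma, n_members, n_years=1):
    """Signal-to-noise ratio of a scenario-minus-baseline difference.

    Assumes independent noise with standard deviation `sigma` per member and
    year in both runs, so the noise of the difference of ensemble means,
    averaged over `n_years`, is sigma * sqrt(2 / (n_members * n_years)).
    """
    noise = sigma * math.sqrt(2.0 / (n_members * n_years))
    return signal / noise


def members_needed(signal, sigma, target_snr, n_years=1):
    """Smallest ensemble size per scenario that reaches `target_snr`."""
    return math.ceil(2.0 * (target_snr * sigma / signal) ** 2 / n_years)


# Hypothetical values: a small signal against larger interannual noise.
signal, sigma = 0.1, 0.5
for n in (1, 3, 10, 30):
    print(n, "members:", round(snr_of_difference(signal, sigma, n, n_years=10), 2))
```

Under these assumptions the noise shrinks only with the square root of the ensemble size, which is why an experiment design that suppresses the variability at its source may be cheaper than adding members.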
2. The multi-model dimension deserves a more concrete recommendation
The authors note that NorESM2-LM results are model-dependent and that the land feedback in particular may vary greatly across ESMs. Given the authors' central involvement in CDRMIP planning, this caveat should be translated into a concrete protocol suggestion rather than left as a general observation.
A coordinated offline PCR intercomparison, with models forced by a common prescribed atmospheric state and alkalinity input, is tractable where a multi-model emission-driven ensemble is not. It would also give the community something that doesn't currently exist: a separation of CDR process uncertainty from feedback uncertainty, without which the inter-model spread in NAR is fundamentally ambiguous (analogous to the TRENDY protocol's assessment of land-use impacts on carbon budgets). An analogous ocean protocol under prescribed alkalinity forcing, run across a small set of background CO2 trajectories, could run alongside the coupled simulation. I'm not suggesting this replaces the coupled runs, but a development of supporting offline experiments would be valuable.
3. The scenario dimension could be acknowledged more explicitly
The 35% feedback offset is derived from a single REMIND-MAgPIE scenario. The authors have demonstrated in past papers that OAE efficiency depends on the background CO2 trajectory, so this number is scenario-specific. A high-emission world and a deep mitigation world produce different carbon-cycle states and therefore different feedback fractions; the scenario dependence of the PCR is itself a key quantity for carbon budget applications. The paper doesn't need to fully resolve this dependency, but scenario dependence should be highlighted more in the conclusions and abstract.
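To make the scenario dependence concrete, a minimal sketch of how a feedback offset could be tabulated per background scenario follows. The definition used here, 1 - NAR/PCR, is my assumption about how the offset is computed, and the scenario names and values are purely illustrative.

```python
def feedback_offset(pcr, nar):
    """Fraction of the process carbon removal (PCR) offset by feedbacks.

    Assumed definition (not confirmed by the manuscript): 1 - NAR / PCR,
    with both quantities in the same units (e.g. GtC) and PCR > 0.
    """
    if pcr <= 0:
        raise ValueError("PCR must be positive")
    return 1.0 - nar / pcr


# Hypothetical (PCR, NAR) pairs; the values are arbitrary placeholders.
scenarios = {
    "deep_mitigation": (100.0, 65.0),
    "high_emission": (100.0, 75.0),
}
for name, (pcr, nar) in scenarios.items():
    print(f"{name}: offset = {feedback_offset(pcr, nar):.0%}")
```

Reporting the offset as a table over several background trajectories, rather than as a single number, would make its scenario specificity visible at a glance.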
The connection to remaining carbon budget calculations, for example via the zero emissions commitment (ZEC), is also not mentioned. Extending the emission-driven coupled runs past the net-zero point would let the framework diagnose how active CDR deployment modifies the ZEC. ZEC under ongoing CDR is potentially different to that in idealised experiments; e.g., it's not immediately obvious whether the ocean carbon ZEC dynamics would differ under OAE, and a short discussion of this would be useful.
4. The A/R freezing approach conflates policy-driven afforestation with passive regrowth
Section 3.1.2 proposes to freeze all land-use transitions into forest in grid cells where forest area expands, treating these as attributable to A/R. The authors acknowledge on lines 343-346 that A/R can also happen on abandoned land and choose not to distinguish between unintentional and policy-driven A/R, citing inventory reporting conventions. This works for inventory purposes but leaves ambiguity in the NAR/PCR framework, which is specifically trying to decompose CDR effects on the Earth system rather than align with national reporting.
Deep mitigation scenarios, including the REMIND-MAgPIE scenario used here, typically combine demographic transition (population peak and decline), continued agricultural yield improvement, and shifts in dietary patterns. All of these reduce cropland demand and produce secondary forest expansion through passive regrowth, independently of any climate mitigation policy. In scenarios where abandonment-driven forest expansion is a meaningful fraction of total secondary forest gain, freezing all forest transitions in the no-CDR baseline attributes passive regrowth to A/R as a CDR method. This overstates the PCR, and by extension the feedback offset. The scenario and baseline thus differ by more than just CDR deployment.
For OAE this is not an issue, since the land-use differences between S and B are small. The framework is meant to apply to A/R as well, though, and for A/R-heavy scenarios the attribution bias could be meaningful. A refined freezing approach using IAM-internal flags for policy-driven versus economically-driven land-use transitions could give a cleaner A/R signal. An alternative worth discussing is a partial counterfactual in which the no-CDR baseline uses the land-use pattern from a no-A/R-policy IAM variant if one is available, limiting the counterfactual adjustment to land use rather than re-running the full scenario. The current treatment likely sets an upper bound on attributable A/R PCR rather than a best estimate, and the paper should be explicit about this.
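The upper-bound point can be sketched numerically. The snippet below assumes, purely for illustration, that carbon uptake scales linearly with forest area gain, so that a given passive-regrowth share of the area removes the same share of the PCR; the function name and the numbers are hypothetical.

```python
def attributable_ar_pcr(pcr_all_forest_gain, passive_fraction):
    """Scale the frozen-baseline A/R PCR down by the passive-regrowth share.

    Assumption for illustration: carbon uptake is proportional to forest
    area gain. Real uptake depends on location, stand age and forest type.
    """
    if not 0.0 <= passive_fraction <= 1.0:
        raise ValueError("passive_fraction must lie in [0, 1]")
    return pcr_all_forest_gain * (1.0 - passive_fraction)


# Hypothetical PCR (GtC) obtained when all forest gain is frozen.
pcr_upper_bound = 10.0
for share in (0.0, 0.25, 0.5):
    value = attributable_ar_pcr(pcr_upper_bound, share)
    print(f"passive share {share:.0%}: attributable PCR = {value:.1f} GtC")
```

Even a rough range of this kind, using passive-regrowth shares diagnosed from the IAM, would indicate how far the frozen-baseline estimate may sit above a best estimate.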
5. The counterfactual baseline framing needs a more careful treatment of IAM behaviour
Section 3.2 presents the counterfactual IAM baseline as a cleaner alternative to ex-post adjustment, limited mainly by practical tractability. However, this conceptual framing understates how different the two approaches actually are, and the paper would benefit from engaging more directly with what a counterfactual baseline would and wouldn't tell you.
The first issue is that CDR options in IAMs are not independent. They substitute for each other through the cost-optimization framework and interact through shared resource constraints. Disabling BECCS in REMIND-MAgPIE doesn't just remove BECCS; it frees CCS capacity for fossil applications, shifts DACCS onto the efficient frontier at lower carbon prices, releases land that was growing bioenergy crops for other uses (including potentially more A/R), and alters the marginal abatement cost curve across the whole mitigation portfolio. The counterfactual baseline therefore isn't a "no-BECCS-everything-else-equal" world but a world with substantially more DACCS, different land use, different fossil CCS deployment, and different energy mix. For attribution this raises significant complexity. The atmospheric CO2 difference between scenario and counterfactual baseline reflects not just the absence of BECCS but also all the compensating adjustments the IAM makes in response.
A/R is the hardest case, and the paper doesn’t properly acknowledge this. A/R is not a discrete technology with a well-defined deployment rate that can be switched off. In MAgPIE it emerges from the interaction of land-use optimization, carbon pricing applied to land-use emissions, agricultural productivity projections, and forest management assumptions. Disabling A/R requires deciding what that operation actually means, and the options have quite different implications. Removing the land-use carbon price eliminates the A/R incentive but also removes the incentive to avoid deforestation, producing more forest loss in the baseline than in the scenario. Capping forest area at present-day levels prevents A/R but creates a physically unrealistic land system where cropland is abandoned but not allowed to revert to forest, which connects directly to the passive regrowth problem raised in Comment 4. Requiring the IAM to solve for net-zero without A/R lets the model re-optimize but effectively reframes the whole land-use problem rather than isolating A/R. None of these produces a clean "no A/R" baseline.
As such, ex-post and counterfactual baselines answer different questions. Ex-post holds the socio-economic world fixed except for CDR deployment, which is closer to an attribution question. Counterfactual lets the IAM re-optimize, which is closer to a systems-response question. Both are useful, but they give different information and shouldn't be treated as better or worse versions of the same approach. The paper's current framing suggests counterfactual is preferable in principle and limited only by cost; in practice the choice depends on what you want to measure. For quantifying carbon-cycle feedbacks attributable to a specific CDR method, ex-post is probably the right tool despite its socio-economic inconsistency. The counterfactual approach is better suited to asking how the mitigation portfolio as a whole responds when one option is removed, which is a different and separately interesting question. Making this distinction explicit would be useful.
Minor Comments
- The spatially homogeneous BECCS storage ratio (kgCO2/tDM uniform across all bioenergy cropland) is a reasonable pragmatic choice but the assumption potentially introduces a regional bias in PCR attribution which is worth a caveat.
- The IAM output recommendations in Section 6 read as a wish list. Tables 2 and 3 do some of the needed work but don't cover the gross positive emissions disaggregation needed specifically for no-CDR baseline construction. A clear table specifying the required outputs and their temporal and spatial resolution for each CDR method would be a useful reference for IAM groups.
Summary
The NAR/PCR framework is a strong contribution to understanding and attributing the effectiveness of CDR in ESM simulations. I suggest minor revisions to address the issues raised above.
Citation: https://doi.org/10.5194/egusphere-2026-833-RC2
In this manuscript, Schwinger et al. discuss the challenges of representing carbon dioxide removal (CDR) in Earth system models (ESMs) and the resulting inconsistencies with integrated assessment model (IAM) simulations. The authors identify key issues, propose a framework to address them, and provide an example using NorESM2-LM (ESM) and REMIND-MAgPIE (IAM).
This is a timely contribution in the context of the upcoming CMIP7 phase, particularly with respect to CDRMIP. The manuscript is clearly written and has the potential to make a significant contribution to both the IAM and ESM communities.
Below, I provide several suggestions for improving the manuscript along with questions that arose during my reading.
In addition, depending on the research question, there may be a need to estimate (e.g., regional) Earth system feedbacks independently of associated socioeconomic or land-use changes, i.e., isolating “pure” Earth system responses to CDR.
While the manuscript does a very good job of explaining the framework itself, I suggest adding a short section or paragraph that explicitly discusses its limitations or intended scope of application and clarifies which sources of uncertainty are included or excluded (especially considering the broad ESM-IAM community to whom the paper is addressed).
E.g., I speculate that Figure 4a would look about the same if, instead of “CDR” and “no-CDR”, just two perturbed ensemble members of the same “CDR” scenario were compared. This raises the question of whether the illustrated variability is uniquely tied to CDR effects or simply reflects general internal variability.
In addition, the framework involves comparisons between emission-driven and concentration-driven simulations (e.g., for estimating biogeophysical effects), which are also subject to internal variability. I suggest clarifying why interannual variability is particularly critical in the context of CDR. A slight refinement of the text may be sufficient to make this point clearer.
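The check I have in mind can be sketched with synthetic data: compare how much two members of the same scenario differ with how much a CDR member differs from a no-CDR member. The series below are white noise plus a constant offset, with hypothetical magnitudes; they are not model output and only illustrate the comparison.

```python
import math
import random


def synthetic_member(n_years, signal, noise_sd, rng):
    """One synthetic annual series: constant signal plus white noise."""
    return [signal + rng.gauss(0.0, noise_sd) for _ in range(n_years)]


def rms_difference(a, b):
    """Root-mean-square difference between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_years, noise_sd, cdr_signal = 50, 1.0, 0.2  # hypothetical values

cdr_1 = synthetic_member(n_years, cdr_signal, noise_sd, rng)
cdr_2 = synthetic_member(n_years, cdr_signal, noise_sd, rng)
no_cdr = synthetic_member(n_years, 0.0, noise_sd, rng)

print("CDR member 1 vs CDR member 2:", round(rms_difference(cdr_1, cdr_2), 2))
print("CDR member 1 vs no-CDR:", round(rms_difference(cdr_1, no_cdr), 2))
```

If the two numbers are of similar size in the actual simulations, the divergence shown in the figure would mostly reflect internal variability rather than a CDR-specific effect.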
Some other comments/questions:
L 93 “which is always activity-driven in ESMs”: maybe add something like “see Section …” (a reference to a section that explains why A/R is always activity-driven).
L126 Would the reduction of CO2 always be less than the gross amount removed? Even during early mitigation phases under increasing emissions?
L134 NAR abbreviation. CDR and PCR have “carbon” in abbreviations. Is there any particular reason for not defining Net Atmospheric Carbon Removal or Net Carbon Removal?
Table 1. The absence of a symbol for biogeophysical effects made me wonder whether they should be expressed in radiative forcing (or temperature) space. Would including such a symbol improve clarity or overly complicate the framework?
Section 3.1.2 A/R was a bit difficult to follow (together with Figure 2, which I still do not understand completely). Maybe first explaining the difference between states and transitions could help?
Equations 9-12. Is the “no-more-active” carbon from BECCS (stored in geological reservoirs) included in delta_Cland?
Figure 3b: what does the difference between yellow and grey lines (B and BS) indicate?
L 514 “reasonable agreement”: is it given for comparing green and purple lines (with about 5 GtC difference)? Does the difference arise because “it takes up to 10-15 years until the full efficiency of an alkalinity addition has been reached” (L516)?