the Creative Commons Attribution 4.0 License.
Incorporating Soil Organic Carbon Dynamics into Global Hydrogen Uptake Models: A Focus on Microbial Activity
Abstract. Molecular hydrogen is a secondary greenhouse gas that indirectly contributes to climate forcing by extending the atmospheric lifetime of methane through competition for hydroxyl radicals. Soil serves as a major sink for atmospheric hydrogen, making accurate estimation of soil hydrogen uptake essential for understanding its role in atmospheric chemistry. Most existing process-based models of hydrogen uptake focus primarily on abiotic controls, such as soil temperature and moisture, while either neglecting or oversimplifying the role of biotic factors, particularly microbial activity. In this study, we refine four widely used hydrogen uptake models by integrating microbial activity rate modifiers and machine learning derived soil porosity. The microbial activity rate modifiers are derived from the decomposability of soil organic carbon, which is assumed to be a proxy for potential microbial activity. This leverages simulations of soil organic matter turnover provided by well-established and tested models of soil organic matter decomposition. This simple approach enables application of hydrogen uptake models from field to global scales. We have integrated our simulations of microbial activity into four widely used hydrogen uptake models. Model performance is evaluated against empirical datasets from four detailed studies of soil hydrogen uptake. Results show that replacing traditional texture-based porosity with machine learning derived estimates significantly improved physical transport modelling, particularly for the Bertagni and Ehhalt frameworks. Furthermore, incorporating the coupled climate-carbon microbial activity rate modifier consistently strengthened model performance, producing larger reductions in prediction error and more pronounced increases in correlation than using microbial activity alone, thereby providing a more realistic representation of soil microbial processes. 
These findings highlight the importance of including biologically relevant factors in atmospheric hydrogen modelling and offer a more mechanistic framework for predicting soil–atmosphere hydrogen exchange under diverse environmental conditions.
Status: open (until 09 Jul 2026)
RC1: 'Comment on egusphere-2026-1312', Anonymous Referee #1, 28 Apr 2026
AC1: 'Reply on RC1', Saeed Karbin, 18 May 2026
Karbin et al. present a revision of the four major H₂ uptake models to improve model performance. To improve accuracy, they modify the models to include (1) a microbial activity rate modifier and (2) ML-derived soil porosity. While the results are presented clearly, some of the interpretations could be discussed more thoroughly.
First, there is no discussion of the broader implications of the revised model formulations. Since the total sink strength is held constant by recalibrating k'max, what the model refinements actually change is the regional (and seasonal?) distribution of uptake. The key question then is whether the revised models produce a more realistic spatial pattern of H₂ uptake.
We thank the reviewer for this insightful comment. We agree that, because the total global H₂ sink strength is conserved through recalibration of k'max, the main effect of the revised formulations is a redistribution of H₂ uptake in space (and seasonally), rather than a change in the global total. The resulting changes in the spatial and temporal distribution of H₂ uptake will be presented in a separate paper, which is our next step.
Specifically, the revised models change the spatial and temporal patterns of H₂ uptake by linking H₂ consumption more directly to microbial activity and soil physical conditions, such as temperature and moisture. The introduction of microbial activity modifiers leads to a redistribution of uptake toward regions and seasons where environmental conditions are more favourable for microbial processes, while reducing uptake in soils with low microbial activity.
In addition to carbon quantity, carbon quality is also represented through microbial activity. Plant residues with a high lignin content (for example, from forest systems) do not stimulate microbial activity to the same extent as more labile and easily decomposable plant inputs, such as those from grasslands. As a result, two land‑use types with the same total carbon input can show different levels of microbial activity; the new model formulation aims to capture these differences.
In addition, legacy effects of climate and soil physical properties on soil carbon are included. Soil physical properties, such as clay content, are explicitly represented in the model. However, using only total carbon indicators, such as NPP or total soil carbon, does not fully capture differences in microbial activity due to long term differences in soil management. The new formulation aims to provide greater sensitivity to changes in both carbon quality and carbon quantity.
Overall, the redistribution of uptake shifts more H₂ uptake to areas with higher SOC availability and more favourable conditions for microbial activity. For example, although peatlands have high SOC stocks, the lower carbon quality and higher proportion of humified material are taken into account, leading to a more realistic representation of H₂ uptake patterns (lines 769-783 and 852-857 in the revised version).
specific comments:
L239-240: The revised value of k’max,SD essentially serves to redistribute the total uptake spatially, so that the global mean remains unchanged. But is the new pattern of uptake more realistic than the previous one? This is difficult to test, since the datasets available for validation come from a geographically limited set of sites. Could the authors comment on the validity of the revised k’max,SD estimates?
We thank the reviewer for raising this important point. We agree that recalibrating k'max,SD keeps the global mean soil H₂ uptake unchanged, and that the main effect of the revised formulation is a redistribution of uptake in space (and seasonally), rather than a change in the total sink. In this approach, k'max,SD functions as a normalisation factor, ensuring consistency with global sink estimates while allowing spatial variability to be controlled by the microbial activity modifier mCMAC.
The updated uptake patterns therefore depend on independently derived SOC pool sizes and turnover rates, rather than on regional tuning of kmax. This links H₂ uptake more directly to both carbon quantity and carbon quality, with faster‑cycling carbon pools contributing more strongly to microbial activity. Although direct validation is limited by the geographically restricted observational data, the resulting redistribution is more consistent with established controls on microbial activity and soil biogeochemistry. To represent these effects, we used the RothC model, which is a well‑established framework for simulating SOC pool dynamics. Evaluation against the limited data available showed increased correlation and improved model fit, suggesting that the revised k'max,SD estimates provide more realistic patterns of H₂ uptake than the previous models. Evaluation against datasets from a wider range of environments would be required to support this finding further. Unfortunately, such datasets do not yet exist (lines 834-840 in the revised version).
L268-269: If modifying kmax by mCMAC leads to similar sensitivities as using NPP as a proxy for SOC-modulation, what’s the value gained by running RothC in non-desert soils? Also, since NPP is largely a proxy for labile C input from vegetation, I wonder if using DPM and BIO alone in estimating mCMAC would give identical results to the full mCMAC. Not suggesting that the authors do this, but just wondering whether the HUM and RPM fractions contribute anything significant at all.
We thank the reviewer for this thoughtful question. Although modifying kmax by mCMAC leads to sensitivities that may appear similar to those obtained using NPP, the added value of running RothC lies in representing soil microbial activity rather than vegetation productivity alone. NPP is primarily a proxy for carbon inputs from vegetation and does not directly represent the amount, turnover, or activity of carbon already present in the soil microbial system.
In the calculation of mCMAC, both carbon pool size and pool‑specific decomposition rate constants influence microbial activity. Pools with relatively small sizes, such as DPM and BIO, can still make a strong contribution because they are associated with high rate constants and therefore fast microbial turnover. At the same time, slower pools such as RPM and HUM contribute through their larger carbon stocks, which provide sustained substrate availability. As a result, mCMAC reflects a balance between carbon quantity and carbon quality, rather than being driven by labile inputs alone.
Using only DPM and BIO, or using NPP as a proxy, would therefore miss important information about soil carbon turnover structure and legacy effects embedded in the full SOC pool distribution. By running RothC, we incorporate these effects explicitly, allowing soils with similar NPP or total carbon inputs to exhibit different microbial activity levels depending on their SOC composition. This added process representation explains why the RothC‑based approach provides value beyond NPP in non‑desert soils, even if the global sensitivity of kmax appears similar when averaged.
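The balance between pool size and rate constant described above can be illustrated with a minimal sketch. It assumes that each pool's potential turnover is simply its rate constant times its carbon stock, which is a simplification and not the mCMAC formula from the manuscript (that formula is not given in this reply). The rate constants are the RothC defaults; the pool sizes are hypothetical values chosen only for illustration.

```python
# RothC default decomposition rate constants (per year).
RATE_CONSTANTS = {"DPM": 10.0, "RPM": 0.3, "BIO": 0.66, "HUM": 0.02}

# Hypothetical pool sizes (t C/ha); NOT taken from the manuscript.
pools = {"DPM": 0.2, "RPM": 5.0, "BIO": 0.6, "HUM": 30.0}


def pool_turnover(pool_sizes, rates=RATE_CONSTANTS):
    """Potential turnover per pool (t C/ha/yr): rate constant times pool size.

    Assumption: first-order turnover without climate or clay modifiers.
    """
    return {name: rates[name] * size for name, size in pool_sizes.items()}


def turnover_shares(pool_sizes, rates=RATE_CONSTANTS):
    """Fraction of the summed potential turnover contributed by each pool."""
    flux = pool_turnover(pool_sizes, rates)
    total = sum(flux.values())
    return {name: value / total for name, value in flux.items()}


shares = turnover_shares(pools)
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

With these illustrative stocks, the small DPM pool has the largest single share, while RPM and HUM together still account for nearly half of the summed turnover, which is the kind of contribution a DPM and BIO only calculation would leave out.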
Lines 344-345 and L300-301: the authors state that in the Ehhalt and Bertagni models, the revised k’max values, reflecting increases of 25% and 46%, respectively, are indicative of the “added response of the model(s) to microbial activity” – but is this categorically true? Both of these models include explicit parameterizations of soil porosity, and it’s not clear whether the revised value of k’max was estimated using just the microbial modifier or with a simultaneous revision of the porosity inputs as well. If the latter, attributing the change in k’max estimates completely to the microbial rate modifier is potentially inaccurate.
In the revised Ehhalt and Bertagni model formulations, the updated values of k'max were estimated using the microbial activity modifier only. Soil porosity was not included in the estimation of the new k'max values. Therefore, the reported increases in k'max of 25% and 46% reflect solely the additional sensitivity introduced through the microbial activity modifier and are not the result of a simultaneous adjustment of soil porosity inputs. We agree that attributing changes in k'max to microbial activity would be inaccurate if other structural parameters were modified at the same time; however, this is not the case here. We have clarified this point in the revised text to avoid ambiguity (lines 233-237 in the revised version).
In Table 3, and elsewhere throughout the manuscript: Can the authors discuss whether the ML-derived porosity estimates are in fact more reliable than texture-based estimates, outside of the model prediction comparisons? In all 5 cases, ML-based estimates seem to be higher than the texture-based estimates, but these results are never really validated using actual measurements of porosity. How realistic is it for loamy sand soils from Harvard Forest to have a porosity of 0.55? Naively assuming a particle density of ~2.65 g/cm³ for these soils, porosity can be calculated from the measured bulk density as 0.36. This is lower than both the texture-based and ML-derived estimates, and I’m finding it hard to decide which estimate is more reliable.
We thank the reviewer for this important comment and agree that soil porosity is a key but uncertain model input. In this study, neither the texture‑based nor the ML‑derived porosity estimates are independently validated against site‑specific porosity measurements, and therefore neither should be interpreted as definitive. In the Harvard Forest example, both the texture‑based estimate (~0.43) and the ML‑derived estimate (~0.55) deviate from the porosity inferred from bulk‑density measurements, illustrating uncertainties in both approaches.
We also note that the bulk‑density‑based estimate assumes a particle density of 2.65 g cm⁻³, whereas the actual particle density at the site may differ due to variations in soil organic matter content and mineral composition. This introduces additional uncertainty when using bulk density as a reference for “true” porosity. Without direct porosity measurements, it is therefore difficult to robustly determine which estimate is more accurate.
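The sensitivity of the bulk‑density‑based estimate to the assumed particle density can be checked with a short sketch using the standard relation porosity = 1 − bulk density / particle density. The measured bulk density is not quoted in this discussion, so the value below is back-calculated from the reviewer's figures (porosity 0.36 at 2.65 g cm⁻³); the alternative particle densities are hypothetical.

```python
def porosity_from_bulk_density(bulk_density, particle_density=2.65):
    """Total porosity = 1 - bulk density / particle density (both in g/cm^3)."""
    if not 0 < bulk_density < particle_density:
        raise ValueError("bulk density must be positive and below particle density")
    return 1.0 - bulk_density / particle_density


# Bulk density implied by the reviewer's numbers (about 1.70 g/cm^3).
# This is an inferred value, not a measurement reported in the text.
implied_bulk_density = 2.65 * (1.0 - 0.36)

# Hypothetical particle densities; organic-rich soils tend to be below 2.65.
for particle_density in (2.45, 2.55, 2.65):
    phi = porosity_from_bulk_density(implied_bulk_density, particle_density)
    print(f"particle density {particle_density:.2f} g/cm^3 -> porosity {phi:.2f}")
```

For a fixed bulk density, a lower assumed particle density gives a lower inferred porosity, so the reference value itself shifts with this assumption.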
The primary motivation for including ML‑derived porosity was not to claim higher absolute accuracy, but to suggest an improvement over the commonly used approach of estimating soil porosity based only on soil texture (e.g. clay content). Texture‑based methods are highly simplified and cannot fully represent realistic porosity values, particularly across diverse soil types. This limitation is important because soil H₂ uptake is a diffusion‑dominated process, and porosity has a strong influence on model results. Estimating porosity using soil texture alone, as done in the current standard approach, neglects other relevant soil properties and may lead to unrealistic values.
We therefore emphasise that the porosity values used in this study should be interpreted as model inputs rather than ground truth, and that our conclusions focus on relative model behaviour and sensitivity rather than exact porosity values at individual sites. We have clarified this limitation in the revised manuscript and acknowledge that future work should prioritise evaluation against measured porosity and soil gas diffusivity data where available (lines 679-689 and 858-862 in the revised version).
Regarding ML-based porosity estimation: it looks like the authors essentially used the SoilGrids database to estimate porosity, but the way this is presented in the manuscript is somewhat confusing/misleading: as far as I can tell, this paper did not derive porosity using ML methods, but rather calculated it from ML-derived soil properties available in SoilGrids. To improve clarity, I suggest reframing “ML-derived porosity” to something like “porosity estimated using ML-predicted saturation water content”.
We agree that referring to this as “ML‑derived porosity” can be misleading, and we have revised the manuscript to improve clarity. Following the reviewer’s suggestion, we now use terminology such as “porosity estimated using ML‑predicted soil properties” or “porosity estimated from ML‑predicted saturation water content”. This more accurately reflects the methodology and avoids confusion about the source of the porosity estimates.
We thank the reviewer for highlighting this point, which has helped us to improve the clarity and transparency of the manuscript (lines 479-480 in the revised version).
Related to the above, it’s also stated that the authors used water content at saturation as a proxy for porosity. Could the authors discuss potential problems with this method?
We thank the reviewer for this important comment and agree that using water content at saturation as a proxy for soil porosity has limitations that need to be clearly acknowledged. This approach assumes that the soil can reach full saturation and that all pore space is filled with water, which may not always be the case due to soil structure, entrapped air and drainage characteristics.
We also recognise uncertainties associated with estimating saturated water content from pedotransfer functions or ML‑predicted soil properties, particularly because these estimates may not fully capture site‑specific soil structure or macroporosity. As a result, porosity values derived in this way can deviate from porosity inferred from bulk density measurements or direct field observations.
Despite these limitations, we consider this ML‑based approach an improvement over the current standard method, which estimates soil porosity solely based on soil texture (e.g. clay content or texture class). Texture‑only approaches are highly simplified and neglect other important soil properties, and therefore often fail to represent realistic spatial variability in porosity. By incorporating ML‑predicted soil properties, the proposed approach integrates additional information beyond texture alone and provides a more flexible representation of soil porosity.
This distinction is particularly important because soil H₂ uptake is a diffusion‑dominated process, and porosity strongly influences model results. When compared with available field data, we found that the ML‑based estimates do not necessarily provide exact porosity values but offer a more responsive framework than texture‑based estimates. We therefore present this approach as a methodological improvement rather than a fully validated porosity product.
We emphasise that porosity values used here should be interpreted as model inputs rather than ground truth, and that our conclusions focus on relative model behaviour and sensitivity rather than exact porosity values at individual sites. We have clarified this limitation in the revised manuscript and acknowledge that future work should prioritise evaluation against measured porosity and soil gas diffusivity data where available (lines 573-576).
L666-667: Another place where the apparent claim is that the “ML-derived” values are inherently better than texture-based approximations, but crucially, this claim is not validated.
Please see our response to the previous comment.
800-801: This conclusion is never really examined in the Discussion as far as I can tell. Why exactly would the Ehhalt model remain unaffected with the addition of a coupled-microbial rate modifier?
Thank you for raising this point. We agree that this conclusion requires clearer explanation in the Discussion, and we appreciate the opportunity to clarify why the Ehhalt model shows a limited response to the coupled microbial rate modifier.
In the original Ehhalt formulation, microbial activity is implicitly represented by assuming A=1, and the model does not include an explicit parameter for maximum potential H₂ uptake. As a result, the model lacks a mechanism to regulate the global mean H₂ oxidation activity through biological capacity.
When mCCMAC or mCMAC is introduced, a new adjustment factor, k'max,E, is defined in order to preserve the original global mean uptake:
k'max,E = 1 / ave(mCCMAC)
This step is crucial. By construction, it normalises microbial activity around a fixed global mean and allows only a spatial redistribution of activity. This behaviour is specific to the Ehhalt model, because it originally lacks an explicit kmax term and assumes A=1. In contrast, the other models already include a maximum uptake parameter (kmax,B = 0.038, kmax,SD = 0.01226), such that the effects of adding mCCMAC or mCMAC are scaled relative to an existing biological capacity rather than being normalised against unity.
As a consequence, in the Ehhalt model, most of the microbial adjustment that the formulation can express is absorbed into the single global scaling factor k'max,E, which takes the place of the implicit assumption A=1, leaving limited scope for additional sensitivity when the coupled formulation is introduced (lines 784-791 in the revised version).
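The effect of this normalisation can be seen in a minimal numerical sketch. It assumes that ave() is an unweighted mean over grid cells (the manuscript may use an area-weighted average) and uses hypothetical modifier values; the function name is also illustrative.

```python
from statistics import mean


def normalise_modifier(modifier):
    """Return k'max,E = 1 / mean(modifier) and the rescaled modifier values.

    Assumption: an unweighted mean stands in for ave() in the reply.
    """
    k_max_e = 1.0 / mean(modifier)
    return k_max_e, [k_max_e * m for m in modifier]


# Hypothetical microbial modifier values for four grid cells.
m_ccmac = [0.4, 0.8, 1.0, 1.6]
k_max_e, scaled = normalise_modifier(m_ccmac)

# Doubling every modifier value changes k'max,E but not the scaled pattern.
k_max_e_doubled, scaled_doubled = normalise_modifier([2.0 * m for m in m_ccmac])

print(f"k'max,E = {k_max_e:.4f}, mean of scaled modifier = {mean(scaled):.4f}")
print(f"k'max,E (doubled input) = {k_max_e_doubled:.4f}")
```

The scaled modifier always averages to one, and any change in the overall magnitude of the modifier is taken up by k'max,E, so only the relative spatial pattern reaches the uptake calculation.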
Minor: Note that Coleman and Jenkinson (1996) is not cited in the References.
Thank you for pointing this out. We appreciate the careful reading. The reference to Coleman and Jenkinson (1996) has now been added to the reference list in the revised manuscript.
Citation: https://doi.org/10.5194/egusphere-2026-1312-AC1