Soil science-informed neural networks for soil organic carbon density modelling under scarce bulk density data
Abstract. Soil organic carbon (SOC) density is a key variable for quantifying soil carbon stocks, yet its modelling is challenged by sparse and inconsistent measurements of bulk density and coarse fragments relative to SOC content. Conventional digital soil mapping approaches typically model SOC density as a single target variable, thereby underutilising abundant SOC content data and overlooking physical relationships among soil properties. This study evaluates a soil science-informed neural network for SOC density prediction that explicitly constrains the SOC–BD relationship, and compares it with univariate and multivariate neural network architectures. Across sparsely sampled target variables, including SOC density, bulk density, and coarse fragments, the soil science-informed model achieves comparable or slightly improved prediction accuracy relative to multivariate and univariate models. Although it yields lower accuracy for SOC content, the soil science-informed model better preserves physically plausible SOC–BD joint distributions and generates smoother, more temporally stable SOC density trajectories. Overall, the results demonstrate that incorporating soil physical constraints into machine learning models adds value beyond univariate accuracy, improving robustness, plausibility, and temporal coherence of SOC density predictions under sparse data conditions. Moreover, the latent parameters inferred by the soil science-informed model improve model interpretability and offer additional soil science relevant insights beyond predictive accuracy.
I have read the manuscript, which describes soil science-informed neural networks that predict SOC density under sparse auxiliary data by leveraging multivariate learning and a soil-relation-informed ML architecture.
The topic is timely for the DSM/ML community, and the paper's central idea of using physical constraints to improve plausibility and robustness under missing BD/CF data is potentially valuable.
However, there are several issues in the mathematical formulation, unit consistency, evaluation design, and soil-science framing that need attention.
I outline some comments below.
- L.21: The Introduction presents SOC density as the DSM-driven stakeholder target. In practice, carbon accounting uses SOC stocks per unit area (depth-integrated), not volumetric SOC density alone.
- Depth handling (0–20 cm) and LUCAS 2018 exclusion: the study excluded the 0–10 and 20–30 cm layers to focus on 0–20 cm. Please explain whether separate 0–10 and 10–20 cm measurements exist and, if so, why they were not used.
- Unit consistency and dimensional correctness of SOC density (Eq. 1), with SOC content in g kg⁻¹, BD in g cm⁻³, and output in kg m⁻³: while the formula can be numerically correct if the implicit conversions cancel, it is not dimensionally transparent and is easy to misapply. I suggest either (a) rewriting Eq. (1) with explicit conversion constants, or (b) defining the variables explicitly as a mass fraction and kg m⁻³.
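To make the point concrete, Eq. (1) could be written with the conversions spelled out. The sketch below is only illustrative (the function name and the coarse-fragment term are my assumptions, not the authors' code):

```python
def soc_density_kg_m3(soc_g_per_kg, bd_g_per_cm3, cf_vol_frac=0.0):
    """SOC density with explicit unit conversions.

    soc_g_per_kg : SOC content [g kg^-1]
    bd_g_per_cm3 : bulk density of fine earth [g cm^-3]
    cf_vol_frac  : coarse fragments as a volume fraction [0-1]
    Returns SOC density in kg m^-3.
    """
    soc_frac = soc_g_per_kg / 1000.0      # g kg^-1 -> kg kg^-1 (mass fraction)
    bd_kg_m3 = bd_g_per_cm3 * 1000.0      # g cm^-3 -> kg m^-3
    return soc_frac * bd_kg_m3 * (1.0 - cf_vol_frac)
```

With SOC = 20 g kg⁻¹ and BD = 1.3 g cm⁻³ the two factors of 1000 cancel (0.02 × 1300 = 26 kg m⁻³), which is exactly why the missing constants are easy to overlook in the current formulation.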
Also, the motivation states that stakeholders care about SOC density more than content. In practice, stakeholders commonly want SOC content as well as SOC stock per unit area (e.g., Mg ha⁻¹ over a depth interval). Please temper this claim and clarify the end-use context (accounting, monitoring, agronomy, reporting).
- SOC–BD mechanistic constraint (Eq. 2) with SOM = 1.724 · SOC content: as written this is incorrect, because SOC content is in g kg⁻¹ (not a fraction), so SOM becomes of order 10–100+ rather than a dimensionless fraction, which makes (1 − SOM) negative.
The Federer reference is not the origin of this equation. The mixing equation is due to Adams, W. A. (1973): The effect of organic matter on the bulk and true densities of some uncultivated podzolic soils. Journal of Soil Science 24(1), 10–17.
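For reference, the Adams-type mixing model with SOM entered as a mass fraction can be sketched as follows. The packing densities below are illustrative placeholder values only; the manuscript should report the values it actually uses:

```python
def adams_bulk_density(soc_g_per_kg, bd_min=1.64, bd_om=0.224):
    """Two-component mixing model for bulk density (Adams 1973 form).

    SOM must enter as a mass fraction [0-1]; converting SOC from
    g kg^-1 first avoids the (1 - SOM) < 0 error flagged above.
    bd_min, bd_om: packing densities [g cm^-3] of the mineral and
    organic components -- placeholder values, not fitted parameters.
    """
    som_frac = 1.724 * soc_g_per_kg / 1000.0  # g kg^-1 -> mass fraction
    return 1.0 / (som_frac / bd_om + (1.0 - som_frac) / bd_min)
```

With the conversion in place the model behaves sensibly: BD equals the mineral packing density at SOC = 0 and decreases monotonically as SOC rises.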
- Coarse fragments: in LUCAS, CF is measured on a mass basis. How was it converted to the volume basis required by Eq. 2?
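One standard conversion is shown below; the rock particle density of 2.65 g cm⁻³ is my assumption, and the manuscript should state the value and formula actually used:

```python
def cf_mass_to_volume(cf_mass_frac, bd_fine_g_cm3, rho_cf_g_cm3=2.65):
    """Convert a coarse-fragment mass fraction (LUCAS reports mass basis)
    to the volume fraction required by Eq. 2.

    rho_cf_g_cm3: assumed rock particle density [g cm^-3].
    bd_fine_g_cm3: bulk density of the fine-earth fraction [g cm^-3].
    """
    v_cf = cf_mass_frac / rho_cf_g_cm3              # cm^3 of fragments per g of soil
    v_fine = (1.0 - cf_mass_frac) / bd_fine_g_cm3   # cm^3 of fine earth per g of soil
    return v_cf / (v_cf + v_fine)
```

Note that the volume fraction depends on the fine-earth BD itself, so if BD is predicted rather than measured this conversion couples the two targets and deserves explicit discussion.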
- L.85: The study promotes soil science-informed ML, yet at the same time uses 362 covariates from 15 groups, many of which are highly correlated, multi-scale, and partially redundant. This creates a tension between the stated philosophy and the modelling design.
- The cross-validation design likely suffers from leakage (repeated sites plus spatial autocorrelation). The paper states five-fold CV with random partitioning, but the dataset contains repeated measurements at the same sites across years. Random folds will almost certainly place the same site in both training and test folds (even if in different years), inflating performance and plausibility diagnostics.
Additionally, DSM with dense covariates typically demands spatially blocked CV (or at least spatial buffering) to avoid optimistic error estimates.
I recommend grouped CV by site ID so that all time points for a site stay in one fold, ideally combined with spatial blocking (e.g., spatial k-fold) to reflect mapping/generalisation performance.
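A minimal sketch of site-grouped folds is below (stdlib only; in practice scikit-learn's `GroupKFold` with site IDs as `groups` achieves the same):

```python
import random
from collections import defaultdict

def grouped_kfold(site_ids, n_splits=5, seed=0):
    """Grouped k-fold: every record from a site lands in exactly one fold,
    so repeated site-year measurements never straddle train and test."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    # Assign whole sites (not records) to folds round-robin after shuffling.
    fold_of_site = {s: i % n_splits for i, s in enumerate(sites)}
    folds = defaultdict(list)
    for idx, s in enumerate(site_ids):
        folds[fold_of_site[s]].append(idx)
    return [folds[k] for k in range(n_splits)]
```

Spatial blocking can then be layered on top by assigning sites to folds by spatial block rather than at random.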
- The targets are described as transformed to reduce skewness and constrained to [0, 1] "through log transformation and scaling using a standard scaler." A standard scaler does not constrain values to [0, 1]; it standardises to mean 0 and variance 1. Please correct the description. Was min–max scaling used instead?
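The difference is easy to demonstrate with illustrative helpers (not the authors' pipeline):

```python
import statistics

def standard_scale(xs):
    """Standardise to mean 0, sd 1 -- the output is NOT bounded to [0, 1]."""
    m, s = statistics.fmean(xs), statistics.stdev(xs)
    return [(x - m) / s for x in xs]

def minmax_scale(xs):
    """Min-max scaling -- this is what actually maps values into [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]
```

Any value below the mean comes out negative under a standard scaler, so the [0, 1] claim in the text must refer to a different transformation.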
- SOC density can only be truly validated where BD and CF measurements exist, and here they exist only for 2018. The study claims robustness under sparse BD availability, but the "sparse BD reconstruction" claim has not been rigorously validated: the evidence amounts to better internal consistency and smoother time series, which does not necessarily imply correct reconstruction when BD is truly absent.
- The temporal consistency filter appears logically inconsistent (likely a typo or a mis-specified threshold). The text assumes SOC changes of < 0.5 g kg⁻¹ yr⁻¹ but then applies a "conservative threshold of 50 g kg⁻¹ yr⁻¹ for the maximum absolute difference across measurements." This is confusing. Please justify the threshold with citations and show a sensitivity analysis (how many series are removed under alternative thresholds).
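A one-line sensitivity check would suffice; the helper below is hypothetical (series values in g kg⁻¹, thresholds illustrative):

```python
def removed_series(series, thr_g_per_kg):
    """Count SOC time series whose maximum absolute difference between
    any two measurements exceeds the threshold -- for tabulating how many
    series each candidate threshold (e.g. 0.5 vs 50) would discard."""
    return sum(1 for vals in series if max(vals) - min(vals) > thr_g_per_kg)
```

Reporting this count for a small grid of thresholds would make the filter's effect on the training set transparent.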
- Some reported units and plausibility statements are incorrect. Example: "extreme changes exceeding 60 g cm⁻³" for SOC density trajectories; SOC density is in kg m⁻³ (or equivalently g L⁻¹), not g cm⁻³.
- Small sample sizes make the stratified metrics unreliable (Table 3): Wetland has N = 2 yet reports R² = 0.90, which is not meaningful.
- MSE and R² are fine as headline metrics, but heavy tails and log transforms can distort their interpretation. Please add a residual analysis (bias by SOC quantiles and by BD quantiles).
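Such a diagnostic is cheap to compute; a stdlib sketch (function name and binning scheme are my assumptions):

```python
from statistics import quantiles

def bias_by_quantile(y_true, y_pred, n_bins=4):
    """Mean residual (pred - true) within quantile bins of the observed
    target, exposing systematic bias that a single global R^2 or MSE hides."""
    cuts = quantiles(y_true, n=n_bins)  # n_bins - 1 cut points

    def bin_of(v):
        return sum(v > c for c in cuts)

    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, p in zip(y_true, y_pred):
        b = bin_of(t)
        sums[b] += p - t
        counts[b] += 1
    return [s / c if c else float('nan') for s, c in zip(sums, counts)]
```

Applied to back-transformed predictions, this would show whether the log transform induces under-prediction in the high-SOC tail.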
For the joint SOC–BD space, consider distance or coverage metrics, i.e., how much predicted probability mass lies outside the observed support.
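A crude support-coverage diagnostic could look like this (the grid resolution and cell construction are my assumptions; kernel-density or convex-hull variants would also work):

```python
def outside_support_fraction(obs, pred, n_cells=10):
    """Fraction of predicted (SOC, BD) pairs falling in grid cells of the
    joint space that contain no observation -- a simple 'predicted mass
    outside observed support' check for physical plausibility."""
    xs = [p[0] for p in obs]
    ys = [p[1] for p in obs]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)

    def cell(p):
        cx = min(int((p[0] - x0) / (x1 - x0) * n_cells), n_cells - 1)
        cy = min(int((p[1] - y0) / (y1 - y0) * n_cells), n_cells - 1)
        return (cx, cy)

    occupied = {cell(p) for p in obs}
    out = sum(1 for p in pred
              if not (x0 <= p[0] <= x1 and y0 <= p[1] <= y1)
              or cell(p) not in occupied)
    return out / len(pred)
```

Comparing this fraction across the univariate, multivariate, and soil science-informed models would quantify the plausibility claim directly.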
- Uncertainty quantification is missing. Since the accuracy gains are modest, uncertainty reduction may be the key value proposition. Please provide at least one uncertainty estimate (e.g., MC dropout or deep ensembles) for the SOC density maps.
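As a minimal sketch, a deep ensemble only requires aggregating member predictions; the training loop is omitted and this is not the authors' architecture:

```python
from statistics import fmean, stdev

def ensemble_mean_std(member_preds):
    """Per-sample mean and spread across independently trained ensemble
    members; the spread is a first-order uncertainty estimate that could
    accompany the SOC density maps.

    member_preds: list of per-model prediction lists, all the same length.
    """
    n = len(member_preds[0])
    means, stds = [], []
    for j in range(n):
        col = [m[j] for m in member_preds]
        means.append(fmean(col))
        stds.append(stdev(col))
    return means, stds
```

Even this simple spread would let the authors test whether the soil science-informed constraints shrink predictive uncertainty where BD is unobserved.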
- L.109: Eq. 2 is not mechanistic; it is still an empirical (pedotransfer-type) relationship, so the "mechanistic" framing should be rephrased.