https://doi.org/10.5194/egusphere-2026-229
28 Jan 2026
Status: this preprint is open for discussion and under review for the journal SOIL.

Soil science-informed neural networks for soil organic carbon density modelling under scarce bulk density data

Xuemeng Tian, Bernhard Ahrens, Leo Rossdeutscher, Lazaro Alonso, and Leandro Parente

Abstract. Soil organic carbon (SOC) density is a key variable for quantifying soil carbon stocks, yet its modelling is challenged by measurements of bulk density (BD) and coarse fragments that are sparse and inconsistent relative to those of SOC content. Conventional digital soil mapping approaches typically model SOC density as a single target variable, thereby underutilising abundant SOC content data and overlooking physical relationships among soil properties. This study evaluates a soil science-informed neural network for SOC density prediction that explicitly constrains the SOC–BD relationship, and compares it with univariate and multivariate neural network architectures. For the sparsely sampled target variables (SOC density, bulk density, and coarse fragments), the soil science-informed model achieves comparable or slightly improved prediction accuracy relative to the multivariate and univariate models. Although it yields lower accuracy for SOC content, the soil science-informed model better preserves physically plausible SOC–BD joint distributions and generates smoother, more temporally stable SOC density trajectories. Overall, the results demonstrate that incorporating soil physical constraints into machine learning models adds value beyond univariate accuracy, improving the robustness, plausibility, and temporal coherence of SOC density predictions under sparse data conditions. Moreover, the latent parameters inferred by the soil science-informed model improve interpretability and offer soil-science-relevant insights beyond predictive accuracy.
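
As a rough illustration of the quantities named in the abstract, the sketch below computes SOC density from SOC content, bulk density, and coarse fragments using the conventional definition, and pairs it with a simple two-component mixing model as one possible physical SOC–BD link. The function names, the mixing-model form, and all default parameter values are assumptions made for illustration; they are not taken from the preprint or from EasyDensity.jl, whose actual constraint may differ.

```python
def soc_density(soc_g_per_kg: float, bd_g_per_cm3: float, cf_fraction: float) -> float:
    """SOC density in kg/m^3 from the conventional definition.

    soc_g_per_kg : SOC content of the fine earth (g/kg)
    bd_g_per_cm3 : bulk density of the fine earth (g/cm^3)
    cf_fraction  : coarse fragments, assumed here to be a volumetric fraction in [0, 1]
    """
    if not 0.0 <= cf_fraction <= 1.0:
        raise ValueError("cf_fraction must lie in [0, 1]")
    # (g/kg) * (g/cm^3) equals kg/m^3: the factors of 1000 cancel.
    return soc_g_per_kg * bd_g_per_cm3 * (1.0 - cf_fraction)


def mixing_model_bd(
    soc_g_per_kg: float,
    bd_organic: float = 0.25,    # assumed bulk density of pure organic matter (g/cm^3)
    bd_mineral: float = 1.6,     # assumed bulk density of pure mineral soil (g/cm^3)
    som_per_soc: float = 1.724,  # conventional SOC-to-organic-matter factor (assumption)
) -> float:
    """Illustrative two-component mixing model for bulk density (g/cm^3).

    Organic and mineral fractions are assumed to occupy additive volumes,
    so bulk density falls as SOC content rises.
    """
    som = min(soc_g_per_kg * som_per_soc / 1000.0, 1.0)  # mass fraction of organic matter
    return 1.0 / (som / bd_organic + (1.0 - som) / bd_mineral)


if __name__ == "__main__":
    soc = 20.0  # g/kg, hypothetical sample
    bd = mixing_model_bd(soc)
    print(f"BD = {bd:.3f} g/cm^3, SOC density = {soc_density(soc, bd, 0.1):.2f} kg/m^3")
```

In a relationship of this kind, bulk density cannot vary independently of SOC content, which is the sort of constraint that keeps predicted SOC–BD pairs physically plausible.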


Status: open (until 11 Mar 2026)


Model code and software

EasyDensity. Xuemeng Tian et al. https://github.com/AI4SoilHealth/EasyDensity.jl


Short summary
We studied how to better estimate and map how much carbon is stored in soils when key measurements are scarce. We built a machine learning model that follows known physical links between soil carbon and soil density, and compared it with pure machine learning methods. Although accuracy was similar, the model informed by soil science gave more realistic and stable results over time and clearer insights into soil behavior, improving interpretability for decision making.