https://doi.org/10.5194/egusphere-2025-4828
14 Oct 2025
Status: this preprint is open for discussion and under review for the journal SOIL.

Estimating soil carbon sequestration potential with mid-IR spectroscopy and explainable machine learning

Yang Hu and Raphael A. Viscarra Rossel

Abstract. Soil carbon sequestration is the process of capturing atmospheric carbon through plant photosynthesis and storing it in soil as organic carbon. The primary mechanism is the adsorption of organic carbon molecules onto mineral surfaces of the soil's fine fraction (clay + silt < 20 μm), forming mineral-associated organic carbon (MAOC). Soil has a finite capacity to stabilise and sequester organic carbon, known as the carbon saturation capacity, which depends on the proportion of reactive minerals in the soil. The difference between the carbon saturation capacity and the current MAOC content is referred to as the organic carbon saturation deficit (Cdef), or sequestration potential. Fourier-transform infrared (FTIR) spectroscopy in the mid-infrared (mid-IR) range can simultaneously measure soil properties relevant to carbon stabilisation, including organic carbon functional groups, clay and iron-oxide mineralogy, and particle size. We therefore hypothesise that mid-IR spectroscopy can accurately estimate Cdef. We aim to (i) develop spectroscopic models to estimate the MAOC and Cdef of 482 Australian topsoil samples, (ii) model MAOC and Cdef using mid-IR spectra and interpretable machine learning, and (iii) interpret the MAOC and Cdef models using the explainable artificial intelligence (AI) algorithm SHapley Additive exPlanations (SHAP). Using frontier line analysis, we fitted a function to the upper envelope of the MAOC vs clay + silt relationship to derive Cdef. We recorded mid-IR spectra of the samples and used the regression-tree method CUBIST to model MAOC content and Cdef. We interpreted these models by examining the regression trees and using SHAP. The models were unbiased and estimated MAOC content with an R² of 0.86 and an RMSE of 2.77 g/kg soil, and Cdef with an R² of 0.89 and an RMSE of 3.72 g/kg soil. Model interpretation revealed that Cdef estimates relied on negative interactions with absorptions from organic matter functional groups and positive interactions with absorptions from clay minerals. Our results show that mid-IR spectra can effectively estimate MAOC and soil Cdef, offering a rapid and cost-effective method for assessing and monitoring this critical soil function.
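
To make the definition of Cdef concrete, the following is a minimal sketch of how a saturation deficit could be derived from an upper-envelope fit. It is not the authors' frontier line analysis, whose functional form the abstract does not specify. The zero-intercept line, the binning by fine fraction, the function names `fit_upper_envelope` and `saturation_deficit`, the `bin_width` parameter, the percent units for clay + silt, and all data values are assumptions made purely for illustration.

```python
def fit_upper_envelope(fine_fraction, maoc, bin_width=10.0):
    """Fit MAOC_max = slope * fine_fraction through the per-bin maxima.

    Assumed simplification: a zero-intercept line fitted by least squares
    to the largest MAOC value found in each fine-fraction bin.
    """
    bin_max = {}
    for x, y in zip(fine_fraction, maoc):
        key = int(x // bin_width)
        # Keep only the sample with the highest MAOC in each bin.
        if key not in bin_max or y > bin_max[key][1]:
            bin_max[key] = (x, y)
    sxy = sum(x * y for x, y in bin_max.values())
    sxx = sum(x * x for x, _ in bin_max.values())
    return sxy / sxx


def saturation_deficit(fine_fraction, maoc, slope):
    """Cdef = estimated saturation capacity - current MAOC, floored at zero."""
    return [max(0.0, slope * x - y) for x, y in zip(fine_fraction, maoc)]


# Made-up illustrative values (not data from the study).
fine = [10.0, 15.0, 30.0, 35.0, 50.0, 55.0]  # clay + silt, assumed in %
maoc = [4.0, 3.0, 12.0, 7.0, 20.0, 11.0]     # MAOC, g/kg soil

slope = fit_upper_envelope(fine, maoc)
cdef = saturation_deficit(fine, maoc, slope)
for x, y, d in zip(fine, maoc, cdef):
    print(f"fine={x:5.1f}  MAOC={y:5.1f}  Cdef={d:5.2f}")
```

Samples that lie on the fitted envelope get a deficit of zero, while samples below it get a positive deficit. A quantile-based or nonlinear envelope could be substituted without changing the final subtraction step.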

Competing interests: At least one of the (co-)authors is a member of the editorial board of SOIL.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 25 Nov 2025)


Viewed

Total article views: 34 (including HTML, PDF, and XML)
  • HTML: 26
  • PDF: 6
  • XML: 2
  • Total: 34
  • BibTeX: 0
  • EndNote: 0
Views and downloads (calculated since 14 Oct 2025)

Viewed (geographical distribution)

Total article views: 34 (including HTML, PDF, and XML) Thereof 34 with geography defined and 0 with unknown origin.
Latest update: 16 Oct 2025
Short summary
We analysed 482 Australian topsoils to estimate mineral-associated organic carbon (MAOC) and the carbon storage deficit (Cdef). Using mid-infrared spectra with explainable machine learning, we predicted MAOC (R² = 0.86) and Cdef (R² = 0.89). Model interpretation revealed signals from organic matter and clay minerals were most significant in predicting MAOC and Cdef. Our work provides an accurate, cost-effective means to assess and better understand the drivers of soil carbon sequestration potential.
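
For readers unfamiliar with the reported statistics, the sketch below shows one common way to compute R², RMSE and mean error (bias) from observed and predicted values. The exact definitions used in the preprint are not given on this page, so the formulas here (R² as one minus the ratio of residual to total sum of squares, bias as the mean of predicted minus observed) are assumptions, as are the function names and the made-up data.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_error(observed, predicted):
    """Mean of (predicted - observed); a value near zero suggests no bias."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Made-up illustrative values in g/kg soil (not data from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 3.0, 7.0, 7.0]

print("R2   =", r_squared(observed, predicted))
print("RMSE =", rmse(observed, predicted))
print("ME   =", mean_error(observed, predicted))
```

In practice these statistics would be computed on samples held out from model fitting, so that they reflect predictive rather than calibration performance.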