18 Oct 2022

Shapley values reveal the drivers of soil organic carbon stocks prediction

Alexandre M. J.-C. Wadoux¹, Nicolas P. A. Saby², and Manuel P. Martin²
  • ¹Sydney Institute of Agriculture & School of Life and Environmental Sciences, The University of Sydney, Sydney, Australia
  • ²INRAE, Unité de Services Infosol, Orléans, France

Abstract. Insights into the controlling factors of soil organic carbon (SOC) stock variation are necessary both for our scientific understanding of the terrestrial carbon balance and to support policies that intend to promote carbon storage in soils to mitigate climate change. In recent years, complex statistical and algorithmic tools from the field of machine learning have become popular for modelling and mapping SOC stocks over large areas. In this paper, we report on the development of a statistical method for interpreting complex models, which we implemented for the study of SOC stock variation. We fitted a random forest machine learning model with 2206 measurements of SOC stocks for the 0–50 cm depth interval from mainland France, using a set of environmental covariates as explanatory variables. We introduce Shapley values, a method from coalitional game theory, and use them to understand how environmental factors influence SOC stock prediction: what the functional form of the association in the model between SOC stocks and environmental covariates is, and how the covariate importance varies locally from one location to another and between carbon-landscape zones. Results were validated both in light of the existing and well-described soil processes mediating soil carbon storage and with regard to previous studies in the same area. We found that vegetation and topography were overall the most important drivers of SOC stock variation in mainland France but that the set of most important covariates varied greatly among locations and carbon-landscape zones. In two spatial locations with equivalent SOC stocks, there was a nearly opposite pattern in the individual covariate contributions that yielded the prediction: in one case climate variables contributed positively, whereas in the second case climate variables contributed negatively, and this effect was mitigated by land use.
This shows that SOC stock variation is complex and should be interpreted at multiple levels. We demonstrate that Shapley values are a methodological development that yields useful insights into the importance of factors controlling SOC stock variation in space. This may provide valuable information to understand whether complex empirical models are predicting a property of interest for the right reasons and to formulate hypotheses on the mechanisms driving the carbon sequestration potential of a soil.
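
To make the idea of a Shapley value concrete, the following is a minimal sketch of the game-theoretic definition, not the authors' implementation. It enumerates every coalition of covariates for a single prediction and averages each covariate's marginal contribution. The function `shapley_values`, the model `toy_model`, and the choice of replacing absent covariates by a fixed `baseline` are all assumptions made for illustration; for a random forest with many covariates, exact enumeration is not practical and an approximation would normally be used.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction, by enumerating all coalitions.

    Assumption: covariates outside a coalition are set to their baseline
    value. Other definitions of the coalition value are possible.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Prediction when only the covariates in `coalition` take their real values
        z = [x[j] if j in coalition else baseline[j] for j in features]
        return predict(z)

    phi = [0.0] * n
    for j in features:
        others = [k for k in features if k != j]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of covariate j to coalition s
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


# Hypothetical toy model: one additive covariate and one interaction term
def toy_model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

In this toy case the additive covariate receives its full effect, the interaction is shared equally between the two covariates involved, and the contributions sum to the difference between the prediction at `x` and the prediction at `baseline`. This additivity is what allows a single prediction to be split into per-covariate contributions.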

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2022-1034', Anonymous Referee #1, 11 Nov 2022
    • AC1: 'Reply on RC1', Alexandre Wadoux, 14 Nov 2022
  • RC2: 'Comment on egusphere-2022-1034', Anonymous Referee #2, 13 Nov 2022
    • AC2: 'Reply on RC2', Alexandre Wadoux, 14 Nov 2022

Total article views: 409 (including HTML, PDF, and XML)
  • HTML: 295
  • PDF: 100
  • XML: 14
  • Total: 409
  • Supplement: 28
  • BibTeX: 3
  • EndNote: 3
Views and downloads calculated since 18 Oct 2022.

Viewed (geographical distribution)

Total article views: 402 (including HTML, PDF, and XML), of which 402 have a defined geography and 0 are of unknown origin.
Latest update: 06 Dec 2022
Short summary
Introduce Shapley values for machine learning model interpretation. Reveal the local and global controlling factors of SOC stocks. Enable spatial analysis of the important variables. Vegetation and topography determine much of the SOC stocks variation in mainland France. SOC stock variation is complex and should be interpreted at multiple levels.