Preprints
https://doi.org/10.5194/egusphere-2022-1034
18 Oct 2022

Shapley values reveal the drivers of soil organic carbon stocks prediction

Alexandre M. J.-C. Wadoux, Nicolas P. A. Saby, and Manuel P. Martin

Abstract. Insights into the controlling factors of soil organic carbon (SOC) stock variation are necessary both for our scientific understanding of the terrestrial carbon balance and to support policies that intend to promote carbon storage in soils to mitigate climate change. In recent years, complex statistical and algorithmic tools from the field of machine learning have become popular for modelling and mapping SOC stocks over large areas. In this paper, we report on the development of a statistical method for interpreting complex models, which we implemented for the study of SOC stock variation. We fitted a random forest machine learning model with 2206 measurements of SOC stocks for the 0–50 cm depth interval from mainland France, using a set of environmental covariates as explanatory variables. We introduce Shapley values, a method from coalitional game theory, and use them to understand how environmental factors influence SOC stock prediction: what the functional form of the association in the model between SOC stocks and environmental covariates is, and how covariate importance varies locally from one location to another and between carbon-landscape zones. Results were validated both in light of the existing and well-described soil processes mediating soil carbon storage and with regard to previous studies in the same area. We found that vegetation and topography were overall the most important drivers of SOC stock variation in mainland France but that the set of most important covariates varied greatly among locations and carbon-landscape zones. In two spatial locations with equivalent SOC stocks, there was a nearly opposite pattern in the individual covariate contributions that yielded the prediction: in one case climate variables contributed positively, whereas in the other climate variables contributed negatively, and this effect was mitigated by land use. This shows that SOC stock variation is complex and should be interpreted at multiple levels. We demonstrate that Shapley values are a methodological development that yields useful insights into the importance of factors controlling SOC stock variation in space. This may provide valuable information to understand whether complex empirical models are predicting a property of interest for the right reasons and to formulate hypotheses on the mechanisms driving the carbon sequestration potential of a soil.
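
To make the idea of a per-covariate contribution concrete, the following minimal Python sketch computes exact Shapley values for a single prediction. It is an illustration only and not the authors' implementation: the function `toy_model`, the covariate names (`ndvi`, `slope`, `rain`), the input values and the baseline are all hypothetical, and covariates outside a coalition are simply set to a fixed baseline value, which is a simplifying assumption. Exact enumeration is only feasible for a handful of covariates; applications with many covariates typically rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each covariate for the prediction at x.

    Assumption: a covariate that is absent from a coalition takes its
    baseline value (for example, an average over the calibration data).
    """
    n = len(x)

    def value(coalition):
        # Prediction when only the covariates in `coalition` take their
        # actual values and all others are held at the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for j in range(n):
        others = [i for i in range(n) if i != j]
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes j.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Weighted marginal contribution of covariate j.
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical stand-in for a fitted model; not the paper's random forest.
    ndvi, slope, rain = z
    return 50.0 + 30.0 * ndvi + 0.5 * slope + 0.01 * rain * ndvi


x = [0.8, 10.0, 900.0]         # hypothetical covariates at one location
baseline = [0.5, 5.0, 700.0]   # hypothetical reference values

phi = shapley_values(toy_model, x, baseline)
for name, contribution in zip(["ndvi", "slope", "rain"], phi):
    print(f"{name}: {contribution:+.3f}")

# Efficiency property: the contributions sum to the difference between
# the prediction at x and the prediction at the baseline.
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The sign of each value indicates whether a covariate pushes the prediction above or below the baseline prediction at that location, which is the sense in which a covariate can contribute positively at one site and negatively at another.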

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Journal article(s) based on this preprint

11 Jan 2023
Shapley values reveal the drivers of soil organic carbon stock prediction
Alexandre M. J.-C. Wadoux, Nicolas P. A. Saby, and Manuel P. Martin
SOIL, 9, 21–38, https://doi.org/10.5194/soil-9-21-2023, 2023

Interactive discussion

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2022-1034', Anonymous Referee #1, 11 Nov 2022
    • AC1: 'Reply on RC1', Alexandre Wadoux, 14 Nov 2022
  • RC2: 'Comment on egusphere-2022-1034', Anonymous Referee #2, 13 Nov 2022
    • AC2: 'Reply on RC2', Alexandre Wadoux, 14 Nov 2022

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
ED: Revision (02 Dec 2022) by Peter Finke
AR by Alexandre Wadoux on behalf of the Authors (05 Dec 2022)
ED: Publish as is (06 Dec 2022) by Peter Finke
ED: Publish as is (13 Dec 2022) by Engracia Madejón Rodríguez (Executive editor)
AR by Alexandre Wadoux on behalf of the Authors (14 Dec 2022)

Viewed

Total article views: 453 (including HTML, PDF, and XML)
  • HTML: 327
  • PDF: 112
  • XML: 14
  • Total: 453
  • Supplement: 31
  • BibTeX: 3
  • EndNote: 3
Views and downloads calculated since 18 Oct 2022.

Viewed (geographical distribution)

Total article views: 449 (including HTML, PDF, and XML), of which 449 have geography defined and 0 are of unknown origin.
Latest update: 31 Aug 2024

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary
Introduce Shapley values for machine learning model interpretation. Reveal the local and global controlling factors of SOC stocks. Enable spatial analysis of the important variables. Vegetation and topography determine much of the SOC stocks variation in mainland France. SOC stock variation is complex and should be interpreted at multiple levels.