https://doi.org/10.5194/egusphere-2024-3703
07 Jan 2025
Status: this preprint is open for discussion and under review for the journal SOIL.

Using Monte Carlo conformal prediction to evaluate the uncertainty of deep learning soil spectral models

Yin-Chung Huang, José Padarian, Budiman Minasny, and Alex B. McBratney

Abstract. Uncertainty quantification is a crucial step for the practical application of soil spectral models, particularly in supporting real-world decision making and risk assessment. While machine learning has made remarkable strides in predicting various physicochemical properties of soils using spectroscopy, predictions without quantified uncertainty offer limited utility in guiding critical decisions. However, uncertainty quantification remains underutilised in the reporting of soil spectral models, and existing methods face significant limitations: they are computationally demanding, fail to achieve the desired coverage of observed data, or struggle to handle out-of-domain uncertainty effectively. This study introduces Monte Carlo conformal prediction (MC-CP) as a novel approach to quantifying uncertainty in the prediction of clay content from mid-infrared spectroscopy. We compared MC-CP with two established methods: (1) Monte Carlo dropout and (2) conformal prediction. Monte Carlo dropout generates prediction intervals for each sample and is effective at addressing the larger uncertainties associated with out-of-domain data. However, it falls short of the desired coverage: its 90 % prediction intervals covered the observed values in only 74 % of cases, well below the expected 90 %. Conformal prediction, on the other hand, guarantees ideal coverage of true values but generates unnecessarily wide prediction intervals, making it overly conservative for many practical applications. In contrast, MC-CP combines the strengths of both methods. It achieved a prediction interval coverage probability of 91 %, closely matching the expected 90 % coverage and far surpassing the performance of Monte Carlo dropout. Additionally, the mean prediction interval width for MC-CP was 9.05 %, narrower than conformal prediction’s 11.11 %, while still addressing the higher uncertainty in out-of-domain samples. By generating accurate prediction intervals alongside point predictions, MC-CP delivered practical and reliable uncertainty quantification. This enhances the real-world applicability of soil spectral models and represents a significant advancement in the field of soil science. The success of MC-CP paves the way for its integration into large-scale machine-learning models, such as soil inference systems, to further support decision-making and risk assessment in soil science.
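The coverage and width figures quoted in the abstract correspond to two standard interval metrics: prediction interval coverage probability (PICP) and mean prediction interval width (MPIW). The following is a minimal sketch of how such metrics are typically computed, together with a generic split conformal half-width based on absolute residuals. It assumes NumPy; the function names are hypothetical, and the conformal variant shown is a common textbook form, not necessarily the procedure used by the authors.

```python
import numpy as np


def picp(y_true, lower, upper):
    """Fraction of observed values that fall inside [lower, upper]."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


def mpiw(lower, upper):
    """Mean width of the prediction intervals."""
    return float(np.mean(np.asarray(upper) - np.asarray(lower)))


def split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width q so that pred +/- q targets (1 - alpha) coverage.

    Uses absolute residuals on a held-out calibration set (an assumed,
    generic nonconformity score) with the finite-sample corrected rank.
    """
    scores = np.sort(np.abs(np.asarray(cal_true) - np.asarray(cal_pred)))
    n = scores.size
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the conformal quantile
    if k > n:
        # Too few calibration points to certify this coverage level.
        return float("inf")
    return float(scores[k - 1])


if __name__ == "__main__":
    # Synthetic stand-in data; not the clay content data from the study.
    rng = np.random.default_rng(0)
    cal_pred = rng.uniform(0, 60, size=200)
    cal_true = cal_pred + rng.normal(0, 3, size=200)
    test_pred = rng.uniform(0, 60, size=200)
    test_true = test_pred + rng.normal(0, 3, size=200)

    q = split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1)
    lower, upper = test_pred - q, test_pred + q
    print("PICP:", picp(test_true, lower, upper))
    print("MPIW:", mpiw(lower, upper))
```

With a constant half-width, every sample receives the same interval width, which illustrates why plain conformal prediction can look conservative for in-domain samples; per-sample methods such as Monte Carlo dropout vary the width instead.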

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 18 Feb 2025)


Viewed

Total article views: 77 (including HTML, PDF, and XML)
HTML PDF XML Total BibTeX EndNote
50 26 1 77 0 0
Views and downloads (calculated since 07 Jan 2025)

Viewed (geographical distribution)

Total article views: 55 (including HTML, PDF, and XML), of which 55 have a defined geography and 0 are of unknown origin.
Latest update: 09 Jan 2025
Short summary
Uncertainty quantification plays a crucial role in reporting machine learning models in soil spectroscopy. This study introduces Monte Carlo conformal prediction (MC-CP), a novel method for uncertainty quantification in deep learning soil spectral models. MC-CP outperformed two established methods, providing the most reliable results. Its efficiency and robustness make it a practical choice for implementing soil spectral models in decision-making.