Preprints
https://doi.org/10.5194/egusphere-2025-166
17 Mar 2025

Quantifying spatial uncertainty to improve soil predictions in data-sparse regions

Kerstin Rau, Katharina Eggensperger, Frank Schneider, Michael Blaschek, Philipp Hennig, and Thomas Scholten

Abstract. Artificial Neural Networks (ANNs) are valuable tools for predicting soil properties from large datasets. However, a common challenge in soil science is the uneven distribution of soil samples, which often results from past sampling projects that sampled certain areas heavily while leaving similar yet geographically distant regions under-sampled. One potential solution to this problem is to transfer an already trained model to other, similar regions. Robust spatial uncertainty quantification is crucial for this purpose, yet it is often overlooked in current research. We address this issue by using a Bayesian deep learning technique, Laplace Approximations, to quantify spatial uncertainty. This produces a probability measure encoding where the model’s prediction is deemed reliable and where a lack of data should lead to high uncertainty. We train such an ANN on a soil landscape dataset from a specific region in southern Germany and then transfer the trained model to another, unseen but to some extent similar, region without any further model training. The model effectively generalized alluvial patterns, demonstrating its ability to recognize repetitive features of river systems. However, the model showed a tendency to favor overrepresented soil units, underscoring the importance of balancing training datasets to reduce overconfidence in dominant classes. Quantifying uncertainty in this way allows stakeholders to better identify regions and settings in need of further data collection, which supports decision-making and helps prioritize data collection efforts. Our approach is computationally lightweight and can be added post hoc to existing deep learning solutions for soil prediction. It thus offers a practical tool for improving soil property predictions in under-sampled areas and for optimizing future sampling strategies, so that resources are allocated efficiently for maximum data coverage and accuracy.
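
To illustrate the general idea of a post-hoc Laplace approximation, the following is a minimal toy sketch and not the authors' implementation. It assumes a binary classifier whose final layer is a logistic regression on fixed features; the synthetic data, the function names, the prior precision, and the probit approximation to the predictive distribution are all assumptions made for illustration. The preprint itself addresses multi-class soil units with an ANN.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hessian(X, w, prior_precision):
    # Hessian of the negative log posterior at w (Gaussian prior on w).
    p = sigmoid(X @ w)
    return X.T @ (X * (p * (1 - p))[:, None]) + prior_precision * np.eye(X.shape[1])

def fit_map(X, y, prior_precision=1.0, n_iter=50):
    # Newton iterations towards the MAP weights of the (hypothetical) last layer.
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (sigmoid(X @ w) - y) + prior_precision * w
        w = w - np.linalg.solve(hessian(X, w, prior_precision), grad)
    return w

def predict(X_new, w, cov):
    # Mean and variance of the logit under the Gaussian (Laplace) posterior.
    mean = X_new @ w
    var = np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    p_map = sigmoid(mean)  # plain point-estimate prediction
    # Probit approximation: large logit variance pulls the prediction towards 0.5.
    p_laplace = sigmoid(mean / np.sqrt(1.0 + np.pi * var / 8.0))
    return p_map, p_laplace, var

# Synthetic stand-in for features of a densely sampled training region.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

w_map = fit_map(X, y)
# Laplace approximation: posterior covariance is the inverse Hessian at the MAP.
cov = np.linalg.inv(hessian(X, w_map, prior_precision=1.0))

# One query point close to the training data, one far away from it.
X_new = np.array([[1.0, 1.0], [10.0, 10.0]])
p_map, p_laplace, var = predict(X_new, w_map, cov)
print("logit variance:", var)
print("MAP prediction:", p_map)
print("Laplace prediction:", p_laplace)
```

In this sketch the logit variance grows for the query point far from the training data, and the Laplace prediction is never more confident than the point-estimate prediction. This mirrors the behavior the abstract describes, where a lack of data should lead to higher uncertainty, and the step is applied after training without refitting the model.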

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2025-166', Anonymous Referee #1, 02 Apr 2025
    • AC1: 'Reply on RC1', Kerstin Rau, 30 May 2025
  • RC2: 'Comment on egusphere-2025-166', Anonymous Referee #2, 12 May 2025
    • AC2: 'Reply on RC2', Kerstin Rau, 30 May 2025

Viewed

Total article views: 375 (including HTML, PDF, and XML)
  • HTML: 251
  • PDF: 106
  • XML: 18
  • Total: 375
  • BibTeX: 15
  • EndNote: 20
Views and downloads are calculated since 17 Mar 2025.

Viewed (geographical distribution)

Total article views: 384 (including HTML, PDF, and XML), of which 384 have a defined geography and 0 are of unknown origin.
Latest update: 13 Jul 2025
Short summary
Uneven data collection can make it hard to predict soil properties accurately in new areas. We developed a method to show where predictions are reliable and where more data is needed. By training a model in one region and applying it to another, we found that our approach effectively recognized river patterns but was biased toward overrepresented soil types. This tool can guide smarter data collection, helping improve predictions and make better use of resources for soil management.