Uncertainty quantification of deep learning model for mineral prospectivity mapping

Wang, Ziye; Zuo, Renguang

doi:10.5194/egusphere-2026-1293

Preprints

https://doi.org/10.5194/egusphere-2026-1293

08 May 2026

Status: this preprint is open for discussion and under review for Geoscientific Model Development (GMD).

Abstract. Deep learning techniques have significantly advanced mineral prospectivity mapping (MPM) by facilitating automated feature extraction and capturing nonlinear relationships among multi-source geological datasets. However, most deep learning models in MPM neglect the intrinsic uncertainties arising from incomplete geological knowledge, limited sampling, and model variability, leading to overconfident and potentially unreliable predictions. To address this limitation, this study proposes a comprehensive uncertainty quantification framework that jointly evaluates data, model, and prediction uncertainties in deep learning-based MPM. Data uncertainty, originating from sparse geochemical/geophysical sampling and subjective interpretations of geological features, is characterized through stochastic simulation of evidential layers. Model uncertainty, arising from variability in network architecture and parameter estimation, is captured through a Bayesian convolutional neural network (CNN) employing Monte Carlo Dropout. The proposed framework is demonstrated through a real-world case study of gold prospectivity mapping in western Henan Province, China. These uncertainties are quantified using statistical measures including mean, variance, and entropy. The results indicate that areas exhibiting high prospectivity and low uncertainty represent robust and reliable exploration targets, whereas those with high uncertainty highlight regions requiring improved metallogenic interpretation or model refinement. Furthermore, uncertainty contribution analysis reveals that data uncertainty contributes more to total prediction uncertainty than model uncertainty, suggesting that enhancing the quality and representativeness of evidence layers is more effective for reducing uncertainty than merely optimizing network architecture or parameters.
Overall, by modeling and visualizing both data and model uncertainties, the proposed framework transforms deep learning-based MPM from deterministic prediction to probabilistic decision-making, thereby enabling more reliable and trustworthy mineral exploration.
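As an illustration of the Monte Carlo Dropout idea named in the abstract, the following is a minimal sketch and not the authors' Bayesian CNN. It assumes a tiny fully connected network with fixed random weights as a stand-in for a trained model; the names `predict_with_dropout` and `mc_dropout_summary`, the layer sizes, and the dropout rate are all hypothetical. It shows how repeated stochastic forward passes yield the per-cell mean, variance, and entropy that the abstract lists as summary measures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained network: one hidden layer, fixed random weights.
N_FEATURES, N_HIDDEN = 6, 16
W1 = rng.normal(size=(N_FEATURES, N_HIDDEN))
W2 = rng.normal(size=(N_HIDDEN, 1))


def predict_with_dropout(x, drop_rate=0.5):
    """One stochastic forward pass with dropout kept active at inference time."""
    hidden = np.maximum(x @ W1, 0.0)                  # ReLU activation
    keep = rng.random(hidden.shape) >= drop_rate      # Bernoulli keep mask
    hidden = hidden * keep / (1.0 - drop_rate)        # inverted-dropout rescaling
    logits = hidden @ W2
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))        # sigmoid probability per cell


def mc_dropout_summary(x, n_passes=100):
    """Mean, variance, and binary predictive entropy over repeated passes."""
    probs = np.stack([predict_with_dropout(x) for _ in range(n_passes)])
    mean = probs.mean(axis=0)
    variance = probs.var(axis=0)
    p = np.clip(mean, 1e-12, 1.0 - 1e-12)             # avoid log(0)
    entropy = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return mean, variance, entropy


cells = rng.normal(size=(5, N_FEATURES))              # five hypothetical map cells
mean, variance, entropy = mc_dropout_summary(cells)
```

The entropy here is computed from the mean probability and is expressed in nats, so it is bounded by ln 2; whether the manuscript uses this definition is not stated on this page.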

Received: 09 Mar 2026 – Discussion started: 08 May 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (extended)

RC1: 'Referee comment on egusphere-2026-1293', Nathan Bowman, 04 Jun 2026
General comments:
Uncertainty quantification is a commonly studied aspect of mineral prospectivity mapping, and this manuscript presents a potentially valuable framework for individually quantifying different types of uncertainty. This is an approach that could be adopted by geologists in both academia and industry to guide and improve their own MPM studies. However, several key methodological steps regarding the treatment of geochemical and geophysical data are either insufficiently described or ambiguously presented, making it difficult to assess the validity of the uncertainty analysis and the robustness of the conclusions. Clarifying these aspects is essential, as they directly underpin the central claims of the study. Furthermore, the discussion section does not adequately consider relevant prior work, nor does it clearly position this study in relation to existing methods. Although the code used to produce the results is included as supplementary material, the data provided are insufficient to resolve the issues described above or to fully reproduce the authors’ workflow. Many sections of the manuscript are repetitive and contain redundant information. The figures are of variable quality, and the figure captions lack sufficient description. With major revisions to improve the structure of the manuscript, clarify the methodology, and strengthen the discussion of data analysis and interpretation, this manuscript has the potential to make a meaningful contribution to the field.
Specific comments:
While individual sections of the manuscript are well written, several sections could be combined and simplified due to repetitive content. In particular, Section 2.1 (lines 129-139) repeats information from earlier sections (lines 75-90), and much of the methodology section provides general background rather than describing the specific methods the authors used. The actual workflow is described later (Sections 4 and 5, lines 301-404). Sections 4 and 5 are very well written, very descriptive, and should be the information provided in the methodology. The authors are encouraged to streamline the methodology by reducing background material and consolidating relevant content into a single, coherent section.
Many figure captions are vague and insufficiently descriptive. This often leads to confusion about what is being shown in the figures and how they relate to the workflow and results. Figure captions should be expanded to clearly explain what is being shown and their relevance to what is being discussed in the text. Figure 3 should be redrafted: It is not clear in either the text, the figure captions, or the illustration itself what is being shown or how it relates to the implementation of a Bayesian CNN. It is also unclear what is shown in Figure 16: do the mean vs uncertainty scatter plots represent pixel-wise values from the mean and variance maps? These points should be clarified in both the text and figure captions.
The inclusion of data and code in the appendices is appreciated and supports reproducibility. However, the provided datasets (two .tif files) are insufficient. In particular, the file evidence.tif is not georeferenced, and it is unclear what it represents.
Terminology relating to the types of uncertainty is inconsistent throughout the manuscript. Four types of uncertainty are initially defined: conceptual uncertainty, data uncertainty, variable uncertainty, and integration uncertainty (after Zuo et al., 2021; lines 55-60). The stated goal of the study is to quantify data uncertainty and “model” uncertainty (line 71). The authors subsequently define data uncertainty as “limited sampling, observational errors, and the selection of negative samples, which is primarily expressed during the construction of evidential features, including ore-controlling geological structures, geochemical anomalies, and geophysical anomalies” (lines 72-74), which combines aspects of both data uncertainty and variable uncertainty. The authors then define model uncertainty as “the variability in deep learning architecture and parameters used to integrate these evidential features” (lines 74-75), which corresponds more closely to the earlier definition of integration uncertainty (line 59). This inconsistency persists throughout the manuscript and creates confusion, as it is unclear which types of uncertainty the different methods are intended to address. For example, on line 332, “parameter uncertainty” is used to describe what was previously defined as variable uncertainty. The authors are encouraged to adopt a clear terminology framework and apply it consistently throughout the manuscript.
It is not clearly justified why MPS-DS is applied to geochemical and geophysical data. The authors state that MPS-DS is used to interpolate these datasets (explicitly stated for geochemical data in line 345 and implied for geophysical data in line 354), and that generating multiple realizations allows them to quantify both data and model uncertainty (lines 364-365). However, other more commonly used interpolation methods, such as kriging, also allow uncertainty to be quantified. Therefore, the authors should clarify why their approach is advantageous relative to established interpolation methods, particularly in relation to uncertainty quantification.
There is no reference in the text to the origin or source of the aeromagnetic data, and the treatment of this dataset raises several concerns. The authors suggest that uncertainty arises from interpolation due to low sampling density, which can lead to over-smoothed results (lines 83-86, 194-196, 583-585). However, aeromagnetic data typically consist of dense measurements collected along flight lines, with regular spacing of flight lines across the study area. Yet Figure 8b shows equidistant data points with extensive coverage over the entire region, which does not appear consistent with how aeromagnetic data are usually collected, nor does it reflect the “limited sampling density” issue the authors wish to address. Additionally, it is unclear how the vertical first-order derivative (1VD) was applied to the aeromagnetic data (lines 295-296). It should be clarified whether:
the 1VD transformation was applied to raw survey data prior to simulation, or

the data were first gridded, then 1VD transformed, and subsequently used in simulation.

Given the regular spacing shown in Figure 8b, the latter seems more likely. In either case, the processing workflow should be clearly described in both the main text and the figure captions.
Similar concerns apply to the geochemical data. The authors state that “geochemical data provide direct evidence of elemental anomalies linked to gold mineralization” (lines 281-282), yet the data are stream sediment samples (line 288). These samples are not in situ and therefore do not provide direct evidence of the magmatic-hydrothermal style of gold mineralisation in the study area (line 273). Rather, the data reflect a combination of processes, including erosion, transport, and mixing of multiple source lithologies. As such, these samples likely reflect the aggregate geochemical composition of multiple outcrops sourced from multiple locations, not just mineralised outcrop. This distinction is not addressed in the manuscript.
Furthermore, the stream sediment samples appear to be located on a regular grid (Figure 8a), which is atypical for stream sediment sampling. This suggests that the data shown in figure 8a is either a mix of outcrop, regolith, and stream sediment samples collected on a grid, or that they are stream sediment samples that have been interpolated prior to simulation. Again, the processing workflow should be clearly explained in both the text and figure captions.
Considering the issues discussed above (vague and undescriptive figure captions regarding geophysical and geochemical data, ambiguous descriptions of how the data were collected and treated, and the presentation of geochemical samples and geophysical data on apparently regular grids, which is atypical of how such data are collected), the current presentation gives the reader the impression that these datasets were interpolated prior to MPS-DS. If both geochemical and geophysical datasets were interpolated prior to MPS-DS, this undermines the claim that MPS-DS “alleviates smoothing effects at unsampled locations” (lines 195-196), as it would instead be sampling from already smoothed data. If this is not the case, the authors should explicitly clarify what is represented in Figure 8 and provide a clear description in the text of how the geochemical and geophysical datasets were collected and processed prior to simulation to avoid confusion.
While the quantification of uncertainty in MPM (and geological models in general) is an active area of research, the discussion section contains only one reference, which is to support the use of ROC-AUC (line 532). There is no mention or discussion of similar studies, or of how this study either complements or improves upon the methods and findings of previous research. Furthermore, there are two claims made in the discussion and conclusion that are not supported by the data and results presented:
The authors claim that uncertainty quantification “reveals the key geological, geochemical, and geophysical factors that most strongly control mineralization, offering a deep understanding of the underlying ore-forming processes” (lines 551-552). However, only geological factors are briefly discussed (that the faults and Xiong’er rock group are more prospective than the Yanshanian intrusions, lines 439-443, 463). Geochemical and geophysical factors are not mentioned. The current discussion does not highlight how the methods reveal a "deep understanding" of ore-forming processes.

The authors state that “data uncertainty in MPM arises from limited sampling density of geochemical and geophysical data, as well as subjective interpretation in geological modelling…” (lines 583-584). As discussed above, the geochemical and geophysical datasets as presented in the text and figures appear extensive and do not show clear evidence of limited sampling density. As written, the manuscript does not adequately support or address this conclusion.

Line 78: The statement that weights-of-evidence uses “binary weights (0 or 1)” is inaccurate. The weights used in WoE are typically logarithms of likelihood ratios. The authors are referred to Bonham-Carter (1994) for a discussion of the method.
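To make the point about log likelihood ratios concrete, the sketch below computes the positive weight, the negative weight, and the contrast for a single binary evidence pattern using the standard weights-of-evidence definitions. The function name `weights_of_evidence` and all cell counts are hypothetical and are not taken from the manuscript.

```python
import math


def weights_of_evidence(n_cells, n_deposits, n_pattern, n_pattern_deposits):
    """Return (W+, W-, contrast) for one binary evidence pattern B and deposits D."""
    # Probability that the pattern is present, given a deposit / given no deposit.
    p_b_given_d = n_pattern_deposits / n_deposits
    p_b_given_not_d = (n_pattern - n_pattern_deposits) / (n_cells - n_deposits)

    # Weights are natural logs of likelihood ratios, not 0/1 flags.
    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log((1.0 - p_b_given_d) / (1.0 - p_b_given_not_d))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts: 10,000 unit cells, 50 deposits, pattern covers 2,000 cells
# and contains 30 of the deposits.
w_plus, w_minus, contrast = weights_of_evidence(10_000, 50, 2_000, 30)
```

With these counts the pattern is positively associated with deposits, so the positive weight is above zero and the negative weight is below zero; the input map is binary, but the weights themselves are continuous.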
Lines 129-130: The statement that mineralization is controlled by “geological, geochemical, and geophysical features” is conceptually misleading. Mineralization is controlled by geological processes; geochemical and geophysical data provide evidence for these processes. The authors are referred to Hronsky & Kreuzer (2019) and McCuaig et al. (2010) for a discussion on the distinction of mineral system processes and the use of evidential maps in mineral targeting.
Lines 204-205: The use of randomly selected background samples as negative data is an appropriate and widely used technique in multiple disciplines, not just geoscience. Additional references could be included to strengthen this section, for example Montsion et al. (2024), as well as literature from spatial ecology on pseudo-absence data (e.g. Wisz & Guisan, 2009; Barbet-Massin et al., 2012). Other methods of generating negative training data for MPM could also be mentioned, such as using drill hole assay data depleted in a commodity of interest, or the locations of mineral systems that are geologically different from the system of interest (see Lindsay et al. 2022).
Line 358: The claim that Figure 9 “clearly demonstrates” that DS reproduces the spatial structure of geochemical and geophysical data is not adequately supported. A cumulative frequency distribution alone does not capture spatial structure, and the same cumulative frequency distribution could be reproduced by random data. More appropriate methods would include capture efficiency curves, Ripley’s K function, or nearest-neighbour G function.
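The observation that a cumulative frequency distribution cannot demonstrate spatial structure can be checked numerically. The sketch below is illustrative only: it builds a hypothetical smoothed random field, shuffles its values, and compares the two. The field size, the moving-average kernel, and the helper `lag1_semivariance` are assumptions, and the lag-1 semivariance is used simply as one convenient spatial statistic rather than as the method recommended in the comment.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical spatially correlated field: white noise smoothed by a moving average.
noise = rng.normal(size=(64, 64))
kernel = np.ones(9) / 9.0
smooth = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, noise)
smooth = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, smooth)

# Same values, random positions: the histogram and CDF are unchanged.
shuffled = rng.permutation(smooth.ravel()).reshape(smooth.shape)


def lag1_semivariance(field):
    """Half the mean squared difference between horizontally adjacent cells."""
    diff = field[:, 1:] - field[:, :-1]
    return 0.5 * np.mean(diff ** 2)


same_cdf = np.array_equal(np.sort(smooth.ravel()), np.sort(shuffled.ravel()))
gamma_smooth = lag1_semivariance(smooth)
gamma_shuffled = lag1_semivariance(shuffled)
```

The two fields have identical sorted values, so their cumulative frequency curves coincide, yet the short-lag semivariance of the shuffled field is much larger because neighbouring cells are no longer similar.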
Lines 432-435 state that, for data uncertainty quantification, “a deterministic CNN model is trained on simulated evidence layers to produce 100 prospectivity predictions.” I assume that these correspond to the 100 geological evidential layers produced through Monte Carlo sampling of maximum controlling distances, and the 100 geochemical and geophysical maps produced by MPS-DS. If this is the case, the authors state on lines 323-324, and again on lines 364-365, that multiple realizations of the evidential layers allow both the influence of data uncertainty and model uncertainty to be quantified. If the methods used to create these evidential maps incorporate both data uncertainty and model uncertainty, it is unclear in the text how they are subsequently used to quantify only data uncertainty.
Technical corrections:
Lines 12-14: the authors state that the study proposes a framework to jointly evaluate data, model, and prediction uncertainty. However, on line 71, the authors state that the goal is to quantify only model and data uncertainty. Either the introduction or the abstract needs to be updated to reflect the goals and aims of the study.

Line 131: The term “formations” is ambiguous and should be clarified.

Line 144: Replace “perspective” with “framework.”

Line 178: Replace “prominence” with “prominent”

Line 180: Remove “Theoretically”

Line 192: The type of uncertainty referenced in this sentence is unclear and should be explicitly stated.

Line 306: Replace “quantitatively characterizing” with “quantifying”.

Line 378: Replace “are list in Table 2” with “are listed in Table 2”

Table 2 is a repeat of Table 1. Likely an error in copying Table 2 to the manuscript.

Lines 408-409: “Case studies” is not an appropriate term; consider using the terms “baseline model” and “experimental models.”

Line 490: Replace “jointly consideration” with “joint consideration”.

Line 564: Remove “ultimately.”

Citation: https://doi.org/10.5194/egusphere-2026-1293-RC1
CEC1: 'Comment on egusphere-2026-1293 - No compliance with the policy of the journal', Juan Antonio Añel, 21 Jun 2026

Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
The repository that you provide listed in the Code and Data Availability section of your manuscript contains neither the code of the model used nor the geochemical exploration dataset. As these seem to be integral parts of your work, and necessary to replicate it, you must publish them in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
Later, if the Topical Editor decides to continue with the review or publication process of your manuscript and you are requested to upload a new version of it, then the 'Code and Data Availability' section of your manuscript must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2026-1293-CEC1
- AC1: 'Reply on CEC1', Ziye Wang, 22 Jun 2026
  
  Dear Executive Editor,
  Thank you for your comment. In the previous version, we provided the code and evidence layers used for implementing the uncertainty quantification framework for mineral prospectivity mapping, including data uncertainty, model uncertainty, and prediction uncertainty. However, we acknowledge that the repository did not contain all materials necessary for full reproducibility.
  In the revised version, we have uploaded all data and code related to the simulation and quantification of uncertainty, including:
  
  (1) Geological, geochemical, and geophysical evidence layers used in mineral prospectivity mapping
  
  (2) Original evidence layers and simulated evidence layers
  
  (3) Code and data for continuous and discrete data uncertainty modeling

  (4) Code and data for data uncertainty, model uncertainty, and prediction uncertainty quantification

  (5) Detailed documentation describing the complete workflow, required inputs, outputs, software environment, and execution procedures.
  The repository link and permanent identifier (DOI) are provided below:
  
  https://doi.org/10.5281/zenodo.20793560
  We believe that the revised repository now contains all materials required to reproduce the uncertainty quantification framework and the results presented in the manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2026-1293-AC1
  - CEC2: 'Reply on AC1', Juan Antonio Añel, 22 Jun 2026
    
    Dear authors,
    Many thanks for your reply. We can now consider the current version of your manuscript to be in compliance with the policy of the journal.
    Juan A. Añel
    
    Geosci. Model Dev. Executive Editor
    
    Citation: https://doi.org/10.5194/egusphere-2026-1293-CEC2
RC2: 'Comment on egusphere-2026-1293', Anonymous Referee #2, 10 Jul 2026

In this paper, the authors present a method to propagate data and model uncertainty in a Mineral Prospectivity Mapping workflow. The manuscript is rather well structured and pleasant to read. However, the introduction needs substantial reformulation, and additional justifications and discussion points are missing. Addressing these would make the manuscript stronger in terms of scope and impact. More details are given below.
Introduction
Lines 41-42: could the citations be spread across the specific techniques mentioned?
Lines 62-63: is this in the context of MPM, or in a more general context (e.g. modelling)? These two categories are more general than the four introduced earlier on line 55. I would thus reverse the order of presenting these ways of categorizing uncertainties, from the more general approach (2 categories) to the more specific MPM context (4 categories).
Lines 71-115: this part of the introduction is hard to follow, especially if the reader is not familiar with deep learning models for MPM. It would be helpful if the authors could recall the main components considered in MPM (e.g. structural features, degree of mineralization, etc.), the common architecture of such workflows (step 1: inferring layers; step 2: extracting features; step 3: combining information to produce a map), and which methods are usually involved at each step. Then explain how uncertainty is currently ignored or considered at each step and what methods are available for it. Then clearly summarize the knowledge gap or current limitations and define what you specifically address in this paper. Given the relatively large number of self-citations, it is important to clearly put into perspective the contribution and limitations of previous work by the authors to underline the novelty of this approach.
2.1.1 Line 143: can you explain why multifractal singularity theory offers a rigorous mathematical foundation to simulate geological evidential layer uncertainty?
2.1.2 What training images are available for geochemical and geophysical evidential layers? In the example given in Figure 2, the TI seems to be a realization from a multi-Gaussian Random Field. What is the advantage of using the Direct Sampling algorithm in such a case? Would Sequential Gaussian Simulations not be sufficient?
2.2 Using only a distance criterion to randomly sample negative samples seems a bit limited. Isn’t there a risk of creating a bias? Would it be possible to consider other factors? Which ones? A map showing the locations of negative samples (or a sampling density map) would be meaningful to this study to compare later with the results.
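The request for a map of negative-sample locations or sampling density can be illustrated with a small sketch. It assumes, purely for illustration, that negatives are drawn uniformly at random and rejected when they fall within a minimum distance of any known deposit; the function `sample_negatives`, the 100 by 100 extent, the 5-unit buffer, and the deposit coordinates are all hypothetical and do not describe the manuscript's actual procedure.

```python
import numpy as np


def sample_negatives(deposits, extent, n_samples, min_dist, rng):
    """Rejection-sample random points at least `min_dist` from every deposit."""
    xmin, xmax, ymin, ymax = extent
    accepted = []
    while len(accepted) < n_samples:
        pt = rng.uniform([xmin, ymin], [xmax, ymax])      # one random (x, y)
        dists = np.hypot(*(deposits - pt).T)               # distance to each deposit
        if dists.min() >= min_dist:
            accepted.append(pt)
    return np.array(accepted)


rng = np.random.default_rng(1)
deposits = rng.uniform(0.0, 100.0, size=(20, 2))           # hypothetical deposits
negatives = sample_negatives(deposits, (0.0, 100.0, 0.0, 100.0), 20, 5.0, rng)

# Sampling-density map: number of negatives in each block of a 10 x 10 grid.
density, _, _ = np.histogram2d(
    negatives[:, 0], negatives[:, 1], bins=10, range=[[0.0, 100.0], [0.0, 100.0]]
)
```

A density grid of this kind could be displayed next to the prospectivity and uncertainty maps to check whether low-prospectivity areas simply coincide with where negatives were drawn.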

Figure 7: it is not trivial to see the differences between the realizations from one row to the other. A fifth row displaying the standard deviation over the generated ensemble or weighted evidence layer might highlight the variability. A colour bar with the value range is missing for the evidence layers.

4.2 The Training Images need to be presented and clearly justified (what supports the interpretation of such property field structures).
Figure 8: though the differences are more visible than in Figure 7, a fifth row displaying the standard deviation of the simulated evidence layers would be great.

5.1: how were the key hyper-parameters determined? How do you justify the values employed?

Section 6
Line 405: ‘Results’ instead of ‘Result’
Lines 452-453 “Overall, areas exhibiting high mean and low entropy values represent the most reliable exploration targets”: Could you provide such an exploration map e.g. by multiplying the mean map with (1-entropy map)?
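The suggested combined map can be sketched as follows. This is one possible reading of "mean × (1 − entropy)", assuming the entropy is a binary predictive entropy in nats that is rescaled by ln 2 so that it lies in [0, 1]; the function `target_score` and the 2 by 2 input values are hypothetical.

```python
import numpy as np


def target_score(mean_map, entropy_map):
    """Combine prospectivity and uncertainty: high mean, low entropy scores highest."""
    # Binary entropy in nats peaks at ln 2; rescale to [0, 1] before combining.
    norm_entropy = np.clip(entropy_map / np.log(2.0), 0.0, 1.0)
    return mean_map * (1.0 - norm_entropy)


# Hypothetical per-cell mean prospectivity and entropy (nats).
mean_map = np.array([[0.95, 0.50],
                     [0.90, 0.10]])
entropy_map = np.array([[0.10, 0.69],
                        [0.40, 0.05]])

score = target_score(mean_map, entropy_map)
best_cell = np.unravel_index(np.argmax(score), score.shape)
```

In this toy grid the cell with high mean and low entropy ranks first, while the cell with a similar mean but higher entropy is strongly down-weighted. If the entropy is defined differently in the manuscript, the normalising constant would need to change accordingly.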
A single figure grouping figures 12, 13 and 14 might facilitate the comparison of the results.
6.3 While 100 evidence layers realizations are used to quantify data uncertainty, how many models are generated using the Bayesian CNN for model uncertainty quantification? 100? from a single (original) set of evidence layers?
6.4 Where do the 10,000 realizations come from? 100 stochastic sets of evidence layers x 100 Bayesian CNN models?
7.1, the uncertainty decomposition should also be compared and discussed with figures 12 and 13. What are the similarities or differences? Why is it valuable to perform this decomposition rather than relying on data uncertainty and model uncertainty as generated in Figures 12 and 13?
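One common way to separate the two contributions in a nested ensemble is the law of total variance. The sketch below is an illustration under stated assumptions, not the authors' method: it takes the 100 × 100 structure guessed at in the comments above (100 sets of evidence layers, each passed through 100 stochastic model runs), fills it with synthetic predictions, and splits the total variance into a between-realization part and a within-realization part. All array names and noise levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

N_DATA, N_MODEL, H, W = 100, 100, 4, 4      # hypothetical ensemble sizes and grid

# Synthetic predictions: a data-realization effect shared by all model passes,
# plus an independent model-pass effect.
data_effect = rng.normal(0.0, 0.10, size=(N_DATA, 1, H, W))
model_effect = rng.normal(0.0, 0.05, size=(N_DATA, N_MODEL, H, W))
preds = np.clip(0.5 + data_effect + model_effect, 0.0, 1.0)

# Law of total variance, per pixel:
#   total = Var_data(E_model[p]) + E_data(Var_model[p])
per_realization_mean = preds.mean(axis=1)               # shape (N_DATA, H, W)
per_realization_var = preds.var(axis=1)                 # shape (N_DATA, H, W)
data_var = per_realization_mean.var(axis=0)             # between evidence-layer sets
model_var = per_realization_var.mean(axis=0)            # within each set, across passes
total_var = preds.reshape(N_DATA * N_MODEL, H, W).var(axis=0)

data_share = data_var / total_var                       # fraction due to data
```

With equal numbers of model passes per realization, the two terms add up to the total variance of all 10,000 predictions, which gives a per-pixel contribution map that can be compared against the separately generated data and model uncertainty maps.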
7.2 For this specific case study, prospective areas seem to be in the vicinity of existing mines. Is it related to the location of the negative samples? What is learned specifically for this area from the uncertainty and prospectivity maps relative to previous MPM studies of the same area?

Citation: https://doi.org/10.5194/egusphere-2026-1293-RC2

Viewed

Total article views: 599 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
443	139	17	599	26	26

Views and downloads (calculated since 08 May 2026)

Month	HTML	PDF	XML	Total
May 2026	232	66	8	306
Jun 2026	29	23	3	55
Jul 2026	182	50	6	238

Viewed (geographical distribution)

Total article views: 538 (including HTML, PDF, and XML) Thereof 538 with geography defined and 0 with unknown origin.

Latest update: 28 Jul 2026

Short summary

This study proposes a comprehensive uncertainty quantification framework that jointly evaluates data, model, and prediction uncertainties in deep learning-based mineral prospectivity mapping. By modelling and visualizing both data and model uncertainties, the framework transforms deep learning-based mineral prospectivity mapping from deterministic prediction to probabilistic decision-making, thereby enabling more reliable and trustworthy mineral exploration.

