the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
REHEATFUNQ 1.4.0: A model for regional aggregate heat flow distributions and anomaly quantification
Abstract. Surface heat flow is a geophysical variable that is affected by a complex combination of various heat generation and transport processes. The processes act on different lengths scales, from tens of meters to hundreds of kilometers. In general, it is not possible to resolve all processes for a lack of data or modeling resources, and hence the heat flow data within a region is subject to residual fluctuations.
We introduce the REgional HEATFlow Uncertainty and aNomaly Quantification (REHEATFUNQ) model, version 1.4.0. At its core, REHEATFUNQ uses a stochastic model for heat flow within a region, considering the aggregate heat flow to be generated by a gamma distributed random variable. Based on this assumption, REHEATFUNQ uses Bayesian inference to (i) quantify the regional aggregate heat flow distribution (RAHFD), and (ii) estimate the strength of given heat flow anomaly, for instance as generated by a tectonically active fault. The inference uses a prior conjugate to the gamma distribution for the RAHFDs, and we compute parameters for a uninformed prior from the global heat flow data base by Lucazeau (2019). Through the Bayesian inference, our model is the first of its kind to consistently account for the variability of regional heat flow in the inference of spatial signals in heat flow data. Interpretation of these spatial signals and in particular their interpretation in terms of fault characteristics (particularly fault strength) is a longstanding debate within the geophysical community.
We describe the components of REHEATFUNQ and perform a series of goodnessoffit tests and synthetic resilience analyses of the model. While our analysis reveals to some degree a misfit of our idealized empirical model with realworld heat flow, it simultaneously confirms the robustness of REHEATFUNQ to these model simplifications.
We conclude with an application of REHEATFUNQ to the San Andreas fault in California. Our analysis finds heat flow data in the Mojave section to be sufficient for an analysis, and concludes that stochastic variability can allow for a surprisingly large faultgenerated heat flow anomaly to be compatible with the data. This indicates that heat flow alone may not be a suitable quantity to address fault strength of the San Andreas fault.

Notice on discussion status
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint
(2078 KB)

Supplement
(136 KB)

The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(2078 KB)  Metadata XML

Supplement
(136 KB)  BibTeX
 EndNote
 Final revised paper
Journal article(s) based on this preprint
Interactive discussion
Status: closed

CEC1: 'Comment on egusphere2023222', Astrid Kerkweg, 02 Jun 2023
Dear authors,
unfortunately I am not able to find or access the code of REHEATFUNQ (Ziebarth 2023).
Neither the provided DOI ( https://doi.org/10.5880/GFZ.2.6.2023.002 ) nor searching for the Authors name in the GFZ server works.
Please correct this as soon as possible.
Best regards,
Astrid Kerkweg (GMD executive Editor)
Citation: https://doi.org/10.5194/egusphere2023222CEC1 
AC1: 'Comment on egusphere2023222', Malte Ziebarth, 03 Jun 2023
Dear Dr. Kerkweg,
thank you kindly for your comment. The DOI has now been activated and it now
leads to the software landing page on GFZ Data Services (previously only accessible
through the review link in the Code and data availability section).Best regards
Malte ZiebarthCitation: https://doi.org/10.5194/egusphere2023222AC1 
RC1: 'Comment on egusphere2023222', Anonymous Referee #1, 12 Jun 2023
The paper nicely illustrates the possible sources of variability for the heat flow and the importance to consider these variabilities in order to get a better estimate of the heat flow. The manuscript could be improved by better illustrating the workflow for characterizing the variability and by better highlighting their consequences.
General Remarks:
 The paper addresses many interesting aspects that influence the results of the heat flow determination and highlight the importance to consider the processes not in a deterministic manner but in a probabilistic to get an estimate of the variability of the heat flow. However, at times it is difficult to extract the key messages and consequences. It would be beneficial to move some of the more detailed technical explanations to the appendix and better highlight the key findings and consequences of the individual sections.
 For several equations not all variables are introduced (e.g., Eq. 5,7,11). Please check all equations and introduce all used variables.
 Throughout the manuscript, the distributions are often illustrated in terms of their mean and standard deviation values (e.g. p. 15 Figure 6). Note that a standard deviation is only a valid representation of a normal distribution. So, for nonnormal distributions such as the gamma distribution other representations need to be chosen.
 The figures are not displayed in the order they are mentioned and are partly placed far away from the corresponding text. So, it would be important to revise this.
 A main contribution of this paper is the characterization of the variability of the regional heat flow. For this Bayesian inference has been used for the heat flow anomaly strength. However, it is in the current form difficult to understand the individual steps performed for the entire analysis (so not only the anomaly) and to find out which parameters have been used since they are distributed over the different text sections and are partly missing. Therefore, it would help to have a designated section describing the workflow and presenting the workflow also in a schematic figure. Furthermore, it would be helpful to present the used parameters in form of a table.
 Additionally, it might be useful to point out possible sources for the variability observed for the heat flow data points close to each other. So whether these are uncertainties caused by the measurements themselves or if they are physical processes yielding these differences.
 In the Supplement, a couple of additional Figures are presented. It would be good to reference these figures (so far only S1 is referenced) also in the main manuscript and provide additional explanations in the Supplement to better understand how these figures fit into the context of the manuscript.
Further Remarks:
 p. 5 l. 104105: What is the influence of d_min on the variability? This is mentioned later on but a brief explanation here and a reference to the detailed explanation would be helpful.
 p. 7 l. 136: It is mentioned that REHEATFUNQ uses a blackbox approach and prior to that also the importance of the physics is highlighted. So, it would be good to provide a small justification for using a blackbox approach.
 p. 7 l. 139140: "If a region is characterized by uniform heat flow and sufficient data has been collected, a statistical analysis will yield a precise result". I would avoid the word "precise" this can be easily mistaken as a metric of how well the uncertainty quantification worked. A better formulation might be "yield a result with a small variability". Furthermore, it would be good to clarify this statement. The uniform heat flow is only visible when in addition no significant measurement errors are observable.
 p. 8 l. 154: Please clarify in numbers what "sufficiently deep" means.
 p. 8 l. 159160: Please clarify shortly how the separation is performed.
 p. 8 eq. 3: It would be good to provide a reference to the later section, where it is described how q_a is determined.
 p. 8 eq. 4, line 175: Throughout the paper, the assumption is made that the magnitude of q_a is small. It would be good to clarify if the software would work also for large magnitudes of q_a and how this nonlinear function could be incorporated.
 p. 8 l. 183: The number for the Figure is missing.
 p. 8 l. 185186: It is explained that clustered points have high discrepancies and will yield less accurate deterministic distributions. This statement should be further clarified. If neighboring points show very different flow values would that not also be an indication of potential measurement errors?
 p. 9 Figure 4: Panel a: Please explain in detail how the heat flow was computed. According to the text a probability distribution has been used. So, is this one possible realization?
 p. 11 eq. 9: It would be good to provide an explanation of why these equations are used.
 p. 13 l. 278: Which global optimization method has been used?
 p. 13 l. 286: Please provide further details about the SHGO method.
 p. 13 l. 289: Please provide further details about how this manual investigation has been performed.
 p. 13 l. 292: Add a remark on how does this final cost relates to the desired cost?
 p. 14 l. 299303: Sensitivity is derived from a sensitivity analysis and has a specific mathematical meaning. Therefore, it would be important to not use the world in context of the uncertainty quantification and talk instead of uncertainties and variability.
 p. 14 l. 304306: So the uncertainty quantification has been performed with both influential and noninfluential parameters? Has it been considered to perform a sensitivity analysis prior to the uncertainty quantification to determine the noninfluential parameters and keep them fixed during the uncertainty quantification? This aspect is important because noninfluential parameters can significantly degrade the robustness of the uncertainty quantification. Was the robustness of the analysis investigated?
 p. 23 l. 475: Why was a 5 % rejection rate chosen?
 p. 23 l. 480: What is the consequence of the regional aggregate heat flow not following a gamma distribution? This is later explained but it would be important to provide here already a brief explanation and a reference to the later section 4.2.3 for the detailed explanation.
 p. 26 l. 534: The process has been repeated 100 times. Has a convergence test been performed for this?
 Section 4.3.1: Are general rules available for choosing the prior?
 p. 30 l. 609: How many samples have been used for the MonteCarlo simulations? Where convergence test performed?
 p. 32 l. 640: See the previous comment about the convergence test.
 p. 33 l. 669: “To conclude, the test yield posterior support for our gamma model choice.” Has this analysis also been done for the other distributions that showed in Figure 13 similar performance as the gamma distribution? That would be important to know in order to judge whether also other prior distributions yield similar posterior results.
 p. 34 l. 702: Can the geophysical modeling of the heat generation and transport that can help in detrending the data, be done in REHEATFUNQ or does this need to be done externally?
 p. 35 l. 724725: What impacts would an uncertain heat transport have on the determination of the heat flow over REHEATFUNQ?
 p. 39 Figure 20: In panels d,f, and g the posterior pdf does not match with the data. Further explanations of what can be done in this case are required.
 Supplementary Material: Figures S2 to S4 are missing the axis labels.
Citation: https://doi.org/10.5194/egusphere2023222RC1 
AC2: 'Reply on RC1', Malte Ziebarth, 20 Dec 2023
Dear Reviewer #1,
thank you kindly for your extensive and valuable feedback. We have now addressed all points you mentioned, some of which required extensive work on the code base. Hence we would like to offer our apologies for the late reply.
We would like to respond to your comments in two steps. First, we would like to answer to a small subset of the comments in this interactive discussion; this is the first section of this document. The remaining majority of the comments led to straightforward improvements of our manuscript; we list them in the second part of this manuscript.
Discussion

The paper addresses many interesting aspects that influence the results of the heat flow determination and highlight the importance to consider the processes not in a deterministic manner but in a probabilistic to get an estimate of the variability of the heat flow. However, at times it is difficult to extract the key messages and consequences. It would be beneficial to move some of the more detailed technical explanations to the appendix and better highlight the key findings and consequences of the individual sections.

Are there any specific parts which “more detailed technical explanations” refer to? We have improved the highlighting of the key findings in several places as the result of multiple others of your comments (and those of RC2).


Additionally, it might be useful to point out possible sources for the variability observed for the heat flow data points close to each other. So whether these are uncertainties caused by the measurements themselves or if they are physical processes yielding these differences.

Four possible sources (variability of radiogenic heat production, displacement in faulting, topographic effects, and measurement errors) are mentioned in lines 149 to 159.


p. 7 l. 139140: "If a region is characterized by uniform heat flow and sufficient data has been collected, a statistical analysis will yield a precise result". I would avoid the word "precise" this can be easily mistaken as a metric of how well the uncertainty quantification worked. A better formulation might be "yield a result with a small variability". Furthermore, it would be good to clarify this statement. The uniform heat flow is only visible when in addition no significant measurement errors are observable.

We would like to kindly thank you for your suggestion but we have decided to remain with “precise”. We find that in the context of observational error, “accuracy” and “precision” are quite wellknown terms, and “precision” quantifies exactly what we want to express (and which you have also noted): small variance of the results with repeated measurements.
Nevertheless, we have clarified the meaning of “uniform”. We use uniform in the sense of “identically distribution”, that is, we imply that “uniform heat flow” is drawn from the same distribution throughout a region.
The issue of “significant measurement errors” of your last sentence is hence resolved. Within our underlying physical model (Equation 3), this distribution includes any potential measurement error. Hence, we do not need to measure constant heat flow throughout the region for the region to be “characterized by uniform heat flow” in the sense we have now clarified.


p. 8 l. 185186: It is explained that clustered points have high discrepancies and will yield less accurate deterministic distributions. This statement should be further clarified. If neighboring points show very different flow values would that not also be an indication of potential measurement errors?

The answer to the question is yes (strong differences between close heat flow measurements could be both due to strong material contrasts in terms of heat production, or due to measurement errors q_f), and REHEATFUNQ aims to capture this variability by means of the latent parameter described in sections “3.2.4 Handling spatial data clusters” and “3.4 Handling spatial data clusters”. The paragraph l. 183187 was written deliberately focused on the effect that spatial measurement clustering has on the regional aggregate heat flow distribution if at least a part of it is “random” due to the sampling of a spatially variable heat flow field (Fig. 4).

We have now added an early explanation that REHEATFUNQ aims to mitigate the clustering effect while still capturing the variability that is contained in a cluster.


p. 13 l. 292: Add a remark on how does this final cost relates to the desired cost?

We neither assume nor define a desired cost. Our only criterion is that the cost should be minimal, hence implying that the “fit” of the prior to the distributions observed in the global data is maximal. At least locally, our final cost is minimal. To the degree that our sampling employed in the SHGO method is dense enough, the cost is also globally minimal.


p. 14 l. 299303: Sensitivity is derived from a sensitivity analysis and has a specific mathematical meaning. Therefore, it would be important to not use the world in context of the uncertainty quantification and talk instead of uncertainties and variability.

We would like to politely state our disagreement. First, uncertainty and sensitivity are two interwoven aspects of the same phenomenon: uncertain estimates. Second, in cited lines, we use sensitivity in the sense of “strength of the dependence onto input variation” which corresponds to its semantics in both sensitivity analysis and common speech. More specifically

l.299300, “and the parameters’ sensitivities for different α and β.” refers to the gamma distribution parameters’ sensitivity to the (shape of the) distribution itself, and how this sensitivity changes as the distribution parameters change.

l. 302303 specifically explains the interwoven nature of uncertainty and sensitivity: “However, the sensitivity of the overall distribution relative to the distribution parameters—and consequently the uncertainty of the distribution estimates—changes with the distribution parameters.”



p. 14 l. 304306: So the uncertainty quantification has been performed with both influential and noninfluential parameters? Has it been considered to perform a sensitivity analysis prior to the uncertainty quantification to determine the noninfluential parameters and keep them fixed during the uncertainty quantification? This aspect is important because noninfluential parameters can significantly degrade the robustness of the uncertainty quantification. Was the robustness of the analysis investigated?

We have not performed uncertainty quantification. Lines 299306 refer to general properties of the gamma distribution’s parameters α and β if one were to compute point estimates and then quantify their uncertainty. Since REHEATFUNQ is a Bayesian approach, no point estimates of α and β are employed—instead, α and β are always integrated over all possible values—and uncertainty quantification therefore does not apply.


p. 23 l. 475: Why was a 5 % rejection rate chosen?

The 5 % rejection rate was initially chosen adhoc due to the availability of critical statistics for the gamma and other distributions (e.g. 1 %, 2.5 %, 5 %, 10 % given by Stephens, 1986). If a lower rejection rate would be chosen, this rejection rate could not be resolved from the small number of disks that cover the global data set at small R (due to the minimum sample size criterion). For instance, one would need to have at least around 100 valid disks to be able to quantify a 1 % rejection rate, whereas around 20 disks suffice to quantify a 5 % rejection rate. In case of the NGHF data set we used, the graphs labeled “Inverse average number of disks in RGRDC” in Fig. 11 indicate by radius the number of valid disks that can be fit onto Earth surface (say, approximately R >= 120 km would be sufficient for 1 % rejection rate). Hence, reducing the rejection rate prevents the analysis of smaller disks—or, in other words, decreases the spatial resolution. The 5 % rejection rate we chose is close to the minimum that can be quantified at 80 km. If a higher rejection rate would be chosen, the goodnessoffit tests are generally more powerful at the expense of a higher TypeI error. This is something we would like to prevent. All in all, the choice of the rejection rate has considerable leeway—yet 5 % seems to work well for the purpose of and the data set used in this study.


Section 4.3.1: Are general rules available for choosing the prior?

There are the following general rules: 1) If the sample size is small, and the sample distribution is “typical” (i.e. similar to what we have found in the NGHF data set, see Fig. 14), the informed prior with optimized parameters is beneficial 2) if the sample’s variance is low (e.g. after correcting for numerous known effects that cause spatial variation), the uninformed prior may be beneficial 3) If the sample size is large, both choices should be relatively similar.

Added a paragraph describing these general rules.


p. 33 l. 669: “To conclude, the test yield posterior support for our gamma model choice.” Has this analysis also been done for the other distributions that showed in Figure 13 similar performance as the gamma distribution? That would be important to know in order to judge whether also other prior distributions yield similar posterior results.

We have not performed this analysis for other distributions. Considering the amount of algebraic and numerical developments that went into stabilizing the REHEATFUNQ code to its current stage (that is, eliminating numerical difficulties in various parts of the parameter space), repeating this process to a comparable for other distributions would likely add another manuscript worth of workload onto the present manuscript. We hence consider this work out of scope for the present manuscript. That said, we have no reason to believe that, given a careful method development, a REHEATFUNQ model based on another similarfitting PDF, such as the loglogistic or normal distribution, would perform significantly worse or better than our development based on the gamma distribution. The differences between a good fit of these three distributions among each other are decidedly less than the difference between the gamma distribution and the mixture distributions used in section 4.3.2. The phrase “To conclude, the tests yield posterior support for our gamma model choice” is meant to support (or validate) our specific model without indicating preference among the models tested in Fig. 13.


p. 34 l. 702: Can the geophysical modeling of the heat generation and transport that can help in detrending the data, be done in REHEATFUNQ or does this need to be done externally?

No, this needs to be done externally (as mentioned also section “3.5 Heat Conduction”).


p. 39 Figure 20: In panels d,f, and g the posterior pdf does not match with the data. Further explanations of what can be done in this case are required.

These differences are due to the distance selection criterion. In the Carrizo, Creeping, and North Coast segment regions, there are visible effects of clustering (e.g. at geothermal areas or clustered around the San Andreas fault in the Carrizo segment area. Here, the minimum distance criterion works as intended: it enforces a more uniform sampling of the area, which does not overly weight small regions with clustered measurements, while keeping all that information within the latent dimension (exlusive sampling of the clusters). These effects are already described in detail in section 5.1 with references to the maps Fig. 19. The remedy in this case, to bring the histograms closer to the posterior predictive, is to provide more uniform heat flow measurements.

We have added to the caption of Figure 20 a reference to the explanation in section 5.1, and we have mentioned that this section discusses the mismatch between data and posterior.

Changes
General Remarks:

For several equations not all variables are introduced (e.g., Eq. 5,7,11). Please check all equations and introduce all used variables.

We have checked the equations and added introduction of variables (some of them redundantly to improve local readability)


The figures are not displayed in the order they are mentioned and are partly placed far away from the corresponding text. So, it would be important to revise this.

We have referenced Figure 1 earlier and moved Figure 2. The cross reference to later Fig. 18 of the Southern California example is difficult to resolve; we have left it as is.


A main contribution of this paper is the characterization of the variability of the regional heat flow. For this Bayesian inference has been used for the heat flow anomaly strength. However, it is in the current form difficult to understand the individual steps performed for the entire analysis (so not only the anomaly) and to find out which parameters have been used since they are distributed over the different text sections and are partly missing. Therefore, it would help to have a designated section describing the workflow and presenting the workflow also in a schematic figure. Furthermore, it would be helpful to present the used parameters in form of a table.

Added a section “Workflow Cheat Sheet” that lists the steps of a typical use of the REHEATFUNQ model directly after the introduction.


In the Supplement, a couple of additional Figures are presented. It would be good to reference these figures (so far only S1 is referenced) also in the main manuscript and provide additional explanations in the Supplement to better understand how these figures fit into the context of the manuscript.

Added explicit Fig S2 to S4 references in section 4.3.2 where they were previously referenced implicitly.

Further Remarks:

p. 5 l. 104105: What is the influence of d_min on the variability? This is mentioned later on but a brief explanation here and a reference to the detailed explanation would be helpful.

Added a note that d_min aims to prevent biases due to spatial clustering of measurements, and a reference to the later following explanation


p. 7 l. 136: It is mentioned that REHEATFUNQ uses a blackbox approach and prior to that also the importance of the physics is highlighted. So, it would be good to provide a small justification for using a blackbox approach.

We have detailed the connection between the paragraph of line 136 and the one before. We now emphasize that REHEATFUNQ aims to capture those principally known physical processes which yet cannot be modeled due to insufficient data. That is, REHEATFUNQ captures the known physics which is acting from unknown sources.


p. 8 l. 154: Please clarify in numbers what "sufficiently deep" means.

Clarified as a range depending on the thermal gradient and the topographic variability


p. 8 l. 159160: Please clarify shortly how the separation is performed.

The separation of q_a from the remaining heat flow signal is based on the superposition of q_a and q_u in equations (3) and (4), and on treating q_u as a (gammadistributed) random variable. “Separation” in this case means inferring the additive component q_a based on the model for q_u and on the knowledge about the pattern of q_a. Added a better explanation of this approach in l. 175 and rewrote the paragraph starting in l. 159.


p. 8 eq. 3: It would be good to provide a reference to the later section, where it is described how q_a is determined.

The anomalous or modeled heat flow q_a is an input to the REHETAFUNQ model. Therefore, section 3.5, the later section which simplifies the computation of q_a for one particular heat source, is not general enough to fulfill the role of “describ[ing] how q_a is determined”. We have rewritten the paragraphs preceding eq. (3) to better convey the message that a user of REHEATFUNQ’s anomaly quantification would already have a q_a in mind—which is determined by the problem that the researcher investigates. Nevertheless, we have referred to section 3.5 as an example of how q_a is expressed in terms of P_H, and how the required coefficients c_i can be computed (added more detail for that as well).


p. 8 eq. 4, line 175: Throughout the paper, the assumption is made that the magnitude of q_a is small. It would be good to clarify if the software would work also for large magnitudes of q_a and how this nonlinear function could be incorporated.

We clarified that the software does not work if the heat source that generates the anomaly q_a itself drives nonlinear convection. However, we note that this is not the main use case of REHEATFUNQ: If the anomaly heat source is that strong, the resulting anomaly will likely have a good signaltonoise ratio against the background heat flow, and one will likely be able to discern the signal without REHEATFUNQ. (We also noted one technical way to apply REHEATFUNQ if the nonlinearity does not lead to a nonlinear signal in space, and can be handled by a global transformation of variables. We cannot assess how likely such a case might be, though, so our preferred answer is that REHEATFUNQ does not work in these cases).


p. 8 l. 183: The number for the Figure is missing.

Added


p. 9 Figure 4: Panel a: Please explain in detail how the heat flow was computed. According to the text a probability distribution has been used. So, is this one possible realization?

The heat flow was computed by optimizing underground heat sources (not shown) such that the resulting surface heat flow field, sampled uniformly in space, has an aggregate heat flow distribution that is close to a gamma distribution (measured by the AndersonDarling statistic). In a sense, this is one realization of a random process, but not as simple as one realization of a probability distribution. The wording in the caption has been clarified to concisely explain the generation of the heat flow field. A detailed explanation of the twostep process is given in a new section of the Appendix.


p. 11 eq. 9: It would be good to provide an explanation of why these equations are used.

Gave a short explanation of Bayesian updating and its benefits by reducing the number of dimensions of numerical quadrature.


p. 13 l. 278: Which global optimization method has been used?

This paragraph was clarified to mention that these are general considerations independent of the specific method. The paragraph was split at the point where algorithm, SHGO, is mentioned (l. 285) to clarify.


p. 13 l. 286: Please provide further details about the SHGO method.

Added a short explanation of how the SHGO method works. Furthermore, gave more detail to the advantage of the SHGO method (yielding exactly one local optimization starting point per local minimum) if the parameter space has been sufficiently sampled.


p. 23 l. 480: What is the consequence of the regional aggregate heat flow not following a gamma distribution? This is later explained but it would be important to provide here already a brief explanation and a reference to the later section 4.2.3 for the detailed explanation.

Added a brief explanation of the potential problem of using an imperfect model, and refer to sections 4.2.2, 4.2.3, 4.3.2


p. 26 l. 534: The process has been repeated 100 times. Has a convergence test been performed for this?

Added Figures S9, S10, and S11 in the supplement, showing the convergence of the plots in Fig. 13 of the manuscript.


p. 30 l. 609: How many samples have been used for the MonteCarlo simulations? Where convergence test performed?

Added Figures S7 and S8 in the supplement that shows the convergence analysis for Fig. 15 in the manuscript.


p. 32 l. 640: See the previous comment about the convergence test.

Same simulations as previous point.


p. 35 l. 724725: What impacts would an uncertain heat transport have on the determination of the heat flow over REHEATFUNQ?

Uncertain heat transport leads to further uncertainty and we expect a further diffusion of the prior if the heat flow anomaly is variable within a certain range. Since the heat transport enters the REHEATFUNQ analysis only through the coefficients c_i, uncertain heat transport can be introduced to our model by varying these coefficients according to the uncertainty of the heat flow model, and provide weights for each point in the model range. In this way, one would condense all uncertainty of the heat transport into marginal distributions of the c_i.

We have implemented an option to provide uncertain heat transport via a weighted list of sets of functions f_i to evaluate the coefficients at the data locations (the structure is {(w_i, f_i(x,y)) : i=1,…,n}). This is effectively a sample of the marginal “distribution” of the heat flow distribution. Internally, this set of weighted functions is combined, randomly sampled, with the set of data selections from the minimum distance criterion to yield a combined latent parameter dimension. This dimension samples the combined posterior of the distance criterion and the effect of uncertain heat transport.


Supplementary Material: Figures S2 to S4 are missing the axis labels.

Added missing labels

Citation: https://doi.org/10.5194/egusphere2023222AC2 

RC2: 'Comment on egusphere2023222', Anonymous Referee #2, 24 Oct 2023
This contribution presents a model tackling the problem that surface heat flow measurements do not dissociate the background crustal heat flow from the heat flow induced by anomalies such as fault friction, taken as the feature of focus in this model. A method is devised to separate from heat flow measurements, the background heat flow and the strength of a fault frictional heat anomaly. This Bayesian inference method seems adequate and is interesting to pursue. I like the random global Rdisk coverings to reduce local variability but it could be specified in simpler words why it is used. The heat flow is assumed to follow a gamma distribution and the message is hard to distillate because it is emphasised the gamma distribution does not fit the data, but that it is the best that can be used. The model was tested on 4 different regions around the San Andreas fault to revisit previous results about fault strength.
The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.
I was confused by the introduction of the random global Rdisk coverings this early as part the Data section 2.2 when more of its details is explained in the model section 3.2.4 and is also a mini section 3.4. I would like to see only one section about it for readability and I would rather see it in the model section 3.2.4.
I would like the authors to find a more positive and logical spin in the validation section about the choice of gamma distribution. Furthermore it should be clearly highlighted in 4.2.3 why the gamma distribution is the best distribution and if it is not, why it was chosen regardless. I do not understand why Weibull is not a good distribution for example. It should be also highlighted if no other distribution is better, what could be an alternate solution that would produce a lower rejection rate and why is the solution not considered in the model (maybe it cannot be modelled?).
Text correction:
Section 2 Data: can the term “data” be specified?
line 117: leads us to chose a minimum distance > choose
Line 183: illustrated in Figure ? highlights
Figure 4 refers to CDF which has not been introduced yet
Line 241: a prior > a priori
Line 327: we can compute evaluate posteriors > ?
Can section 3.5 about heat conduction be part of 3.3 on the anomaly detection? I feel it should not be an independent section
Rejection rate needs to be explained more in details
Section 4.3 synthetic data: can the term “data” be specified?
Line 793: the results are less clear > I have trouble seeing what makes a result clear or not…Citation: https://doi.org/10.5194/egusphere2023222RC2 
AC3: 'Reply on RC2', Malte Ziebarth, 22 Dec 2023
Dear Reviewer #2,
thank you kindly for your helpful and valuable feedback that led us to improve our manuscript.
We would like to respond to your comments in two steps. First, we would like to answer to your highlevel comments in this interactive discussion; this is the first section of this document. The remaining text corrections led to straightforward improvements of our manuscript; we list them in the second part of this manuscript.Discussion
This contribution presents a model tackling the problem that surface heat flow measurements do not dissociate the background crustal heat flow from the heat flow induced by anomalies such as fault friction, taken as the feature of focus in this model. A method is devised to separate from heat flow measurements, the background heat flow and the strength of a fault frictional heat anomaly. This Bayesian inference method seems adequate and is interesting to pursue. I like the random global Rdisk coverings to reduce local variability but it could be specified in simpler words why it is used. The heat flow is assumed to follow a gamma distribution and the message is hard to distillate because it is emphasised the gamma distribution does not fit the data, but that it is the best that can be used. The model was tested on 4 different regions around the San Andreas fault to revisit previous results about fault strength.
 The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.
 Thank you kindly for this feedback. Indeed, our paper tries to manage a difficult balance. On one hand, it is a reference of all technical details of the development of the REHEATFUNQ model, as well as a synthesis of all supporting tests and analyses that we have performed. On the other hand, it aims to make REHEATFUNQ accessible for a wide number of geoscientists—where as you have pointed out correctly our draft has a high potential to obscure the practically relevant details through the large amount of technical description. We have made a number of changes to the organization of the manuscript to make it more accessible to a wide geoscientific audience:
 We have added a new onepage section “2. Workflow cheat sheet” directly after the introduction. This page gives a short summary of the steps involved in a typical usage scenario of REHEATFUNQ, lists Python classes involved in those steps, and links to the documentation where code examples are listed.
 Added the “PointsofInterest Measurement” toy model to the appendix to better illustrate the potential impact of spatial data clustering, and to illustrate the mitigation using the dminsampling. The model is built on simple assumptions on the nature of human heat flow data acquisition.
 Thank you kindly for this feedback. Indeed, our paper tries to manage a difficult balance. On one hand, it is a reference of all technical details of the development of the REHEATFUNQ model, as well as a synthesis of all supporting tests and analyses that we have performed. On the other hand, it aims to make REHEATFUNQ accessible for a wide number of geoscientists—where as you have pointed out correctly our draft has a high potential to obscure the practically relevant details through the large amount of technical description. We have made a number of changes to the organization of the manuscript to make it more accessible to a wide geoscientific audience:
 I was confused by the introduction of the random global Rdisk coverings this early as part the Data section 2.2 when more of its details is explained in the model section 3.2.4 and is also a mini section 3.4. I would like to see only one section about it for readability and I would rather see it in the model section 3.2.4.
 The original reason why the RGRDCs are mentioned at multiple places in the manuscript is that different variants of the same idea are used at different places (when comparing different distributions and when fitting the conjugate prior, we use coverings of the real world NGHF data. For the synthetic validation tests, we use random computergenerated data that mimic the RGRDCs of the NGHF). We have now consolidated the different sections into the appendix since the algorithms are mostly a technical detail—the underlying idea can be conveyed using just Figure 1.
 I would like the authors to find a more positive and logical spin in the validation section about the choice of gamma distribution. Furthermore it should be clearly highlighted in 4.2.3 why the gamma distribution is the best distribution and if it is not, why it was chosen regardless. I do not understand why Weibull is not a good distribution for example. It should be also highlighted if no other distribution is better, what could be an alternate solution that would produce a lower rejection rate and why is the solution not considered in the model (maybe it cannot be modelled?).
 As we have noted in the section, the results of the comparison is not conclusive (arguably with the exception of rejecting the Fréchet distribution). No distribution is unanimously selected in the majority of regional aggregate heat flow distributions, and furthermore the positive evidence (Fig. 13a) is not significant in the vast majority of cases that a distribution is selected by the BIC. In particular, this concerns also the Weibull distribution: as we have noted, it has the highest BIC selection rate but the vast majority of selections have a ∆BIC<2, which is not significant (e.g. Kass & Raftery, 1995, p. 777) and may well be due to random fluctuations. Concluding, with the data we have at hand there is no significant preference between many of the distributions investigated, including the gamma distribution. Therefore, the choice of distribution is a modeling decision.
 A further note on the Weibull distribution: the possible leftskewness of the Weibull distribution may help the Weibull distribution to acquire a significant amount of BIC selections on small sample sizes by pure chance. We have tested the BIC selection rates for some random data of sample size N=10 drawn from a gamma distribution and found that the Weibull distribution was selected by the BIC criterion over the gamma distribution in about one third of the cases.
 From a modeling point of view, the gamma distribution is the only one that combines all of the following criteria: (1) it is defined on a positive support, (2) it has a conjugate prior (important for enabling the costly computations), (3) it is rightskewed, like the global heat flow distribution, for all parameter combinations
 We have highlighted a possible alternate solution, leading to a lower selection rate, in section 4.2.2 (now 5.1.2) and Fig. 12 (now Fig. 11). The high rejection rate can be explained by the bimodal mixture distribution of two gamma distributions. Vice versa, one may be able to achieve a lower rejection rate by (1) deriving a REHEATFUNQ model for a gamma mixture distribution or (2) by considering spatial dependence of distribution parameters. These two avenues are possible improvements of our model in future works. For now, we have shown in section 4.3.2 (now 5.2.2) that the REHEATFUNQ method can handle bimodal regional aggregate heat flow distributions, and it does so by yielding results of higher uncertainty.
 We have improved the conclusion of section 4.2.3 based on these arguments.
 As we have noted in the section, the results of the comparison is not conclusive (arguably with the exception of rejecting the Fréchet distribution). No distribution is unanimously selected in the majority of regional aggregate heat flow distributions, and furthermore the positive evidence (Fig. 13a) is not significant in the vast majority of cases that a distribution is selected by the BIC. In particular, this concerns also the Weibull distribution: as we have noted, it has the highest BIC selection rate but the vast majority of selections have a ∆BIC<2, which is not significant (e.g. Kass & Raftery, 1995, p. 777) and may well be due to random fluctuations. Concluding, with the data we have at hand there is no significant preference between many of the distributions investigated, including the gamma distribution. Therefore, the choice of distribution is a modeling decision.
Changes
Text correction:
 Section 2 Data: can the term “data” be specified?
 We have specified that this section is about heat flow data, and we note furthermore that it describes the database we have used (Lucazeau, 2019) and the kind of filtering we applied to this database.
 line 117: leads us to chose a minimum distance > choose
 Corrected.
 Line 183: illustrated in Figure ? Highlights
 Added the Figure number.
 Figure 4 refers to CDF which has not been introduced yet
 Added the abbreviation to the caption.
 Line 241: a prior > a priori
 Corrected.
 Line 327: we can compute evaluate posteriors > ?
 Corrected to "evaluate".
 Can section 3.5 about heat conduction be part of 3.3 on the anomaly detection? I feel it should not be an independent section
 Done.
 Rejection rate needs to be explained more in details
 Added more explanation in lines 475+ (of the old manuscript)
 Section 4.3 synthetic data: can the term “data” be specified?
 Added a short description to the first sentence of that section: “computegenerated samples {(x,y,q)i} of surface heat flow.”
 Line 793: the results are less clear > I have trouble seeing what makes a result clear or not…
 Specified the sentence: the “results” refer to the existence or nonexistence of a finite heat flow anomaly.
Citation: https://doi.org/10.5194/egusphere2023222AC3  The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.

AC3: 'Reply on RC2', Malte Ziebarth, 22 Dec 2023
Interactive discussion
Status: closed

CEC1: 'Comment on egusphere2023222', Astrid Kerkweg, 02 Jun 2023
Dear authors,
unfortunately I am not able to find or access the code of REHEATFUNQ (Ziebarth 2023).
Neither the provided DOI ( https://doi.org/10.5880/GFZ.2.6.2023.002 ) nor searching for the Authors name in the GFZ server works.
Please correct this as soon as possible.
Best regards,
Astrid Kerkweg (GMD executive Editor)
Citation: https://doi.org/10.5194/egusphere2023222CEC1 
AC1: 'Comment on egusphere2023222', Malte Ziebarth, 03 Jun 2023
Dear Dr. Kerkweg,
thank you kindly for your comment. The DOI has now been activated and it now
leads to the software landing page on GFZ Data Services (previously only accessible
through the review link in the Code and data availability section).Best regards
Malte ZiebarthCitation: https://doi.org/10.5194/egusphere2023222AC1 
RC1: 'Comment on egusphere2023222', Anonymous Referee #1, 12 Jun 2023
The paper nicely illustrates the possible sources of variability for the heat flow and the importance to consider these variabilities in order to get a better estimate of the heat flow. The manuscript could be improved by better illustrating the workflow for characterizing the variability and by better highlighting their consequences.
General Remarks:
 The paper addresses many interesting aspects that influence the results of the heat flow determination and highlight the importance to consider the processes not in a deterministic manner but in a probabilistic to get an estimate of the variability of the heat flow. However, at times it is difficult to extract the key messages and consequences. It would be beneficial to move some of the more detailed technical explanations to the appendix and better highlight the key findings and consequences of the individual sections.
 For several equations not all variables are introduced (e.g., Eq. 5,7,11). Please check all equations and introduce all used variables.
 Throughout the manuscript, the distributions are often illustrated in terms of their mean and standard deviation values (e.g. p. 15 Figure 6). Note that a standard deviation is only a valid representation of a normal distribution. So, for nonnormal distributions such as the gamma distribution other representations need to be chosen.
 The figures are not displayed in the order they are mentioned and are partly placed far away from the corresponding text. So, it would be important to revise this.
 A main contribution of this paper is the characterization of the variability of the regional heat flow. For this Bayesian inference has been used for the heat flow anomaly strength. However, it is in the current form difficult to understand the individual steps performed for the entire analysis (so not only the anomaly) and to find out which parameters have been used since they are distributed over the different text sections and are partly missing. Therefore, it would help to have a designated section describing the workflow and presenting the workflow also in a schematic figure. Furthermore, it would be helpful to present the used parameters in form of a table.
 Additionally, it might be useful to point out possible sources for the variability observed for the heat flow data points close to each other. So whether these are uncertainties caused by the measurements themselves or if they are physical processes yielding these differences.
 In the Supplement, a couple of additional Figures are presented. It would be good to reference these figures (so far only S1 is referenced) also in the main manuscript and provide additional explanations in the Supplement to better understand how these figures fit into the context of the manuscript.
Further Remarks:
 p. 5 l. 104105: What is the influence of d_min on the variability? This is mentioned later on but a brief explanation here and a reference to the detailed explanation would be helpful.
 p. 7 l. 136: It is mentioned that REHEATFUNQ uses a blackbox approach and prior to that also the importance of the physics is highlighted. So, it would be good to provide a small justification for using a blackbox approach.
 p. 7 l. 139140: "If a region is characterized by uniform heat flow and sufficient data has been collected, a statistical analysis will yield a precise result". I would avoid the word "precise" this can be easily mistaken as a metric of how well the uncertainty quantification worked. A better formulation might be "yield a result with a small variability". Furthermore, it would be good to clarify this statement. The uniform heat flow is only visible when in addition no significant measurement errors are observable.
 p. 8 l. 154: Please clarify in numbers what "sufficiently deep" means.
 p. 8 l. 159160: Please clarify shortly how the separation is performed.
 p. 8 eq. 3: It would be good to provide a reference to the later section, where it is described how q_a is determined.
 p. 8 eq. 4, line 175: Throughout the paper, the assumption is made that the magnitude of q_a is small. It would be good to clarify if the software would work also for large magnitudes of q_a and how this nonlinear function could be incorporated.
 p. 8 l. 183: The number for the Figure is missing.
 p. 8 l. 185186: It is explained that clustered points have high discrepancies and will yield less accurate deterministic distributions. This statement should be further clarified. If neighboring points show very different flow values would that not also be an indication of potential measurement errors?
 p. 9 Figure 4: Panel a: Please explain in detail how the heat flow was computed. According to the text a probability distribution has been used. So, is this one possible realization?
 p. 11 eq. 9: It would be good to provide an explanation of why these equations are used.
 p. 13 l. 278: Which global optimization method has been used?
 p. 13 l. 286: Please provide further details about the SHGO method.
 p. 13 l. 289: Please provide further details about how this manual investigation has been performed.
 p. 13 l. 292: Add a remark on how does this final cost relates to the desired cost?
 p. 14 l. 299303: Sensitivity is derived from a sensitivity analysis and has a specific mathematical meaning. Therefore, it would be important to not use the world in context of the uncertainty quantification and talk instead of uncertainties and variability.
 p. 14 l. 304306: So the uncertainty quantification has been performed with both influential and noninfluential parameters? Has it been considered to perform a sensitivity analysis prior to the uncertainty quantification to determine the noninfluential parameters and keep them fixed during the uncertainty quantification? This aspect is important because noninfluential parameters can significantly degrade the robustness of the uncertainty quantification. Was the robustness of the analysis investigated?
 p. 23 l. 475: Why was a 5 % rejection rate chosen?
 p. 23 l. 480: What is the consequence of the regional aggregate heat flow not following a gamma distribution? This is later explained but it would be important to provide here already a brief explanation and a reference to the later section 4.2.3 for the detailed explanation.
 p. 26 l. 534: The process has been repeated 100 times. Has a convergence test been performed for this?
 Section 4.3.1: Are general rules available for choosing the prior?
 p. 30 l. 609: How many samples have been used for the MonteCarlo simulations? Where convergence test performed?
 p. 32 l. 640: See the previous comment about the convergence test.
 p. 33 l. 669: “To conclude, the test yield posterior support for our gamma model choice.” Has this analysis also been done for the other distributions that showed in Figure 13 similar performance as the gamma distribution? That would be important to know in order to judge whether also other prior distributions yield similar posterior results.
 p. 34 l. 702: Can the geophysical modeling of the heat generation and transport that can help in detrending the data, be done in REHEATFUNQ or does this need to be done externally?
 p. 35 l. 724725: What impacts would an uncertain heat transport have on the determination of the heat flow over REHEATFUNQ?
 p. 39 Figure 20: In panels d,f, and g the posterior pdf does not match with the data. Further explanations of what can be done in this case are required.
 Supplementary Material: Figures S2 to S4 are missing the axis labels.
Citation: https://doi.org/10.5194/egusphere2023222RC1 
AC2: 'Reply on RC1', Malte Ziebarth, 20 Dec 2023
Dear Reviewer #1,
thank you kindly for your extensive and valuable feedback. We have now addressed all points you mentioned, some of which required extensive work on the code base. Hence we would like to offer our apologies for the late reply.
We would like to respond to your comments in two steps. First, we would like to answer to a small subset of the comments in this interactive discussion; this is the first section of this document. The remaining majority of the comments led to straightforward improvements of our manuscript; we list them in the second part of this manuscript.
Discussion

The paper addresses many interesting aspects that influence the results of the heat flow determination and highlight the importance to consider the processes not in a deterministic manner but in a probabilistic to get an estimate of the variability of the heat flow. However, at times it is difficult to extract the key messages and consequences. It would be beneficial to move some of the more detailed technical explanations to the appendix and better highlight the key findings and consequences of the individual sections.

Are there any specific parts which “more detailed technical explanations” refer to? We have improved the highlighting of the key findings in several places as the result of multiple others of your comments (and those of RC2).


Additionally, it might be useful to point out possible sources for the variability observed for the heat flow data points close to each other. So whether these are uncertainties caused by the measurements themselves or if they are physical processes yielding these differences.

Four possible sources (variability of radiogenic heat production, displacement in faulting, topographic effects, and measurement errors) are mentioned in lines 149 to 159.


p. 7 l. 139140: "If a region is characterized by uniform heat flow and sufficient data has been collected, a statistical analysis will yield a precise result". I would avoid the word "precise" this can be easily mistaken as a metric of how well the uncertainty quantification worked. A better formulation might be "yield a result with a small variability". Furthermore, it would be good to clarify this statement. The uniform heat flow is only visible when in addition no significant measurement errors are observable.

We would like to kindly thank you for your suggestion but we have decided to remain with “precise”. We find that in the context of observational error, “accuracy” and “precision” are quite wellknown terms, and “precision” quantifies exactly what we want to express (and which you have also noted): small variance of the results with repeated measurements.
Nevertheless, we have clarified the meaning of “uniform”. We use uniform in the sense of “identically distribution”, that is, we imply that “uniform heat flow” is drawn from the same distribution throughout a region.
The issue of “significant measurement errors” of your last sentence is hence resolved. Within our underlying physical model (Equation 3), this distribution includes any potential measurement error. Hence, we do not need to measure constant heat flow throughout the region for the region to be “characterized by uniform heat flow” in the sense we have now clarified.


p. 8 l. 185186: It is explained that clustered points have high discrepancies and will yield less accurate deterministic distributions. This statement should be further clarified. If neighboring points show very different flow values would that not also be an indication of potential measurement errors?

The answer to the question is yes (strong differences between close heat flow measurements could be both due to strong material contrasts in terms of heat production, or due to measurement errors q_f), and REHEATFUNQ aims to capture this variability by means of the latent parameter described in sections “3.2.4 Handling spatial data clusters” and “3.4 Handling spatial data clusters”. The paragraph l. 183187 was written deliberately focused on the effect that spatial measurement clustering has on the regional aggregate heat flow distribution if at least a part of it is “random” due to the sampling of a spatially variable heat flow field (Fig. 4).

We have now added an early explanation that REHEATFUNQ aims to mitigate the clustering effect while still capturing the variability that is contained in a cluster.


p. 13 l. 292: Add a remark on how does this final cost relates to the desired cost?

We neither assume nor define a desired cost. Our only criterion is that the cost should be minimal, hence implying that the “fit” of the prior to the distributions observed in the global data is maximal. At least locally, our final cost is minimal. To the degree that our sampling employed in the SHGO method is dense enough, the cost is also globally minimal.


p. 14 l. 299303: Sensitivity is derived from a sensitivity analysis and has a specific mathematical meaning. Therefore, it would be important to not use the world in context of the uncertainty quantification and talk instead of uncertainties and variability.

We would like to politely state our disagreement. First, uncertainty and sensitivity are two interwoven aspects of the same phenomenon: uncertain estimates. Second, in cited lines, we use sensitivity in the sense of “strength of the dependence onto input variation” which corresponds to its semantics in both sensitivity analysis and common speech. More specifically

l.299300, “and the parameters’ sensitivities for different α and β.” refers to the gamma distribution parameters’ sensitivity to the (shape of the) distribution itself, and how this sensitivity changes as the distribution parameters change.

l. 302303 specifically explains the interwoven nature of uncertainty and sensitivity: “However, the sensitivity of the overall distribution relative to the distribution parameters—and consequently the uncertainty of the distribution estimates—changes with the distribution parameters.”



p. 14 l. 304306: So the uncertainty quantification has been performed with both influential and noninfluential parameters? Has it been considered to perform a sensitivity analysis prior to the uncertainty quantification to determine the noninfluential parameters and keep them fixed during the uncertainty quantification? This aspect is important because noninfluential parameters can significantly degrade the robustness of the uncertainty quantification. Was the robustness of the analysis investigated?

We have not performed uncertainty quantification. Lines 299306 refer to general properties of the gamma distribution’s parameters α and β if one were to compute point estimates and then quantify their uncertainty. Since REHEATFUNQ is a Bayesian approach, no point estimates of α and β are employed—instead, α and β are always integrated over all possible values—and uncertainty quantification therefore does not apply.


p. 23 l. 475: Why was a 5 % rejection rate chosen?

The 5 % rejection rate was initially chosen adhoc due to the availability of critical statistics for the gamma and other distributions (e.g. 1 %, 2.5 %, 5 %, 10 % given by Stephens, 1986). If a lower rejection rate would be chosen, this rejection rate could not be resolved from the small number of disks that cover the global data set at small R (due to the minimum sample size criterion). For instance, one would need to have at least around 100 valid disks to be able to quantify a 1 % rejection rate, whereas around 20 disks suffice to quantify a 5 % rejection rate. In case of the NGHF data set we used, the graphs labeled “Inverse average number of disks in RGRDC” in Fig. 11 indicate by radius the number of valid disks that can be fit onto Earth surface (say, approximately R >= 120 km would be sufficient for 1 % rejection rate). Hence, reducing the rejection rate prevents the analysis of smaller disks—or, in other words, decreases the spatial resolution. The 5 % rejection rate we chose is close to the minimum that can be quantified at 80 km. If a higher rejection rate would be chosen, the goodnessoffit tests are generally more powerful at the expense of a higher TypeI error. This is something we would like to prevent. All in all, the choice of the rejection rate has considerable leeway—yet 5 % seems to work well for the purpose of and the data set used in this study.


Section 4.3.1: Are general rules available for choosing the prior?

There are the following general rules: 1) If the sample size is small, and the sample distribution is “typical” (i.e. similar to what we have found in the NGHF data set, see Fig. 14), the informed prior with optimized parameters is beneficial 2) if the sample’s variance is low (e.g. after correcting for numerous known effects that cause spatial variation), the uninformed prior may be beneficial 3) If the sample size is large, both choices should be relatively similar.

Added a paragraph describing these general rules.


p. 33 l. 669: “To conclude, the test yield posterior support for our gamma model choice.” Has this analysis also been done for the other distributions that showed in Figure 13 similar performance as the gamma distribution? That would be important to know in order to judge whether also other prior distributions yield similar posterior results.

We have not performed this analysis for other distributions. Considering the amount of algebraic and numerical developments that went into stabilizing the REHEATFUNQ code to its current stage (that is, eliminating numerical difficulties in various parts of the parameter space), repeating this process to a comparable for other distributions would likely add another manuscript worth of workload onto the present manuscript. We hence consider this work out of scope for the present manuscript. That said, we have no reason to believe that, given a careful method development, a REHEATFUNQ model based on another similarfitting PDF, such as the loglogistic or normal distribution, would perform significantly worse or better than our development based on the gamma distribution. The differences between a good fit of these three distributions among each other are decidedly less than the difference between the gamma distribution and the mixture distributions used in section 4.3.2. The phrase “To conclude, the tests yield posterior support for our gamma model choice” is meant to support (or validate) our specific model without indicating preference among the models tested in Fig. 13.


p. 34 l. 702: Can the geophysical modeling of the heat generation and transport that can help in detrending the data, be done in REHEATFUNQ or does this need to be done externally?

No, this needs to be done externally (as mentioned also section “3.5 Heat Conduction”).


p. 39 Figure 20: In panels d,f, and g the posterior pdf does not match with the data. Further explanations of what can be done in this case are required.

These differences are due to the distance selection criterion. In the Carrizo, Creeping, and North Coast segment regions, there are visible effects of clustering (e.g. at geothermal areas or clustered around the San Andreas fault in the Carrizo segment area. Here, the minimum distance criterion works as intended: it enforces a more uniform sampling of the area, which does not overly weight small regions with clustered measurements, while keeping all that information within the latent dimension (exlusive sampling of the clusters). These effects are already described in detail in section 5.1 with references to the maps Fig. 19. The remedy in this case, to bring the histograms closer to the posterior predictive, is to provide more uniform heat flow measurements.

We have added to the caption of Figure 20 a reference to the explanation in section 5.1, and we have mentioned that this section discusses the mismatch between data and posterior.

Changes
General Remarks:

For several equations not all variables are introduced (e.g., Eq. 5,7,11). Please check all equations and introduce all used variables.

We have checked the equations and added introduction of variables (some of them redundantly to improve local readability)


The figures are not displayed in the order they are mentioned and are partly placed far away from the corresponding text. So, it would be important to revise this.

We have referenced Figure 1 earlier and moved Figure 2. The cross reference to later Fig. 18 of the Southern California example is difficult to resolve; we have left it as is.


A main contribution of this paper is the characterization of the variability of the regional heat flow. For this Bayesian inference has been used for the heat flow anomaly strength. However, it is in the current form difficult to understand the individual steps performed for the entire analysis (so not only the anomaly) and to find out which parameters have been used since they are distributed over the different text sections and are partly missing. Therefore, it would help to have a designated section describing the workflow and presenting the workflow also in a schematic figure. Furthermore, it would be helpful to present the used parameters in form of a table.

Added a section “Workflow Cheat Sheet” that lists the steps of a typical use of the REHEATFUNQ model directly after the introduction.


In the Supplement, a couple of additional Figures are presented. It would be good to reference these figures (so far only S1 is referenced) also in the main manuscript and provide additional explanations in the Supplement to better understand how these figures fit into the context of the manuscript.

Added explicit Fig S2 to S4 references in section 4.3.2 where they were previously referenced implicitly.

Further Remarks:

p. 5 l. 104105: What is the influence of d_min on the variability? This is mentioned later on but a brief explanation here and a reference to the detailed explanation would be helpful.

Added a note that d_min aims to prevent biases due to spatial clustering of measurements, and a reference to the later following explanation


p. 7 l. 136: It is mentioned that REHEATFUNQ uses a blackbox approach and prior to that also the importance of the physics is highlighted. So, it would be good to provide a small justification for using a blackbox approach.

We have detailed the connection between the paragraph of line 136 and the one before. We now emphasize that REHEATFUNQ aims to capture those principally known physical processes which yet cannot be modeled due to insufficient data. That is, REHEATFUNQ captures the known physics which is acting from unknown sources.


p. 8 l. 154: Please clarify in numbers what "sufficiently deep" means.

Clarified as a range depending on the thermal gradient and the topographic variability


p. 8 l. 159160: Please clarify shortly how the separation is performed.

The separation of q_a from the remaining heat flow signal is based on the superposition of q_a and q_u in equations (3) and (4), and on treating q_u as a (gammadistributed) random variable. “Separation” in this case means inferring the additive component q_a based on the model for q_u and on the knowledge about the pattern of q_a. Added a better explanation of this approach in l. 175 and rewrote the paragraph starting in l. 159.


p. 8 eq. 3: It would be good to provide a reference to the later section, where it is described how q_a is determined.

The anomalous or modeled heat flow q_a is an input to the REHETAFUNQ model. Therefore, section 3.5, the later section which simplifies the computation of q_a for one particular heat source, is not general enough to fulfill the role of “describ[ing] how q_a is determined”. We have rewritten the paragraphs preceding eq. (3) to better convey the message that a user of REHEATFUNQ’s anomaly quantification would already have a q_a in mind—which is determined by the problem that the researcher investigates. Nevertheless, we have referred to section 3.5 as an example of how q_a is expressed in terms of P_H, and how the required coefficients c_i can be computed (added more detail for that as well).


p. 8 eq. 4, line 175: Throughout the paper, the assumption is made that the magnitude of q_a is small. It would be good to clarify if the software would work also for large magnitudes of q_a and how this nonlinear function could be incorporated.

We clarified that the software does not work if the heat source that generates the anomaly q_a itself drives nonlinear convection. However, we note that this is not the main use case of REHEATFUNQ: If the anomaly heat source is that strong, the resulting anomaly will likely have a good signaltonoise ratio against the background heat flow, and one will likely be able to discern the signal without REHEATFUNQ. (We also noted one technical way to apply REHEATFUNQ if the nonlinearity does not lead to a nonlinear signal in space, and can be handled by a global transformation of variables. We cannot assess how likely such a case might be, though, so our preferred answer is that REHEATFUNQ does not work in these cases).


p. 8 l. 183: The number for the Figure is missing.

Added


p. 9 Figure 4: Panel a: Please explain in detail how the heat flow was computed. According to the text a probability distribution has been used. So, is this one possible realization?

The heat flow was computed by optimizing underground heat sources (not shown) such that the resulting surface heat flow field, sampled uniformly in space, has an aggregate heat flow distribution that is close to a gamma distribution (measured by the AndersonDarling statistic). In a sense, this is one realization of a random process, but not as simple as one realization of a probability distribution. The wording in the caption has been clarified to concisely explain the generation of the heat flow field. A detailed explanation of the twostep process is given in a new section of the Appendix.


p. 11 eq. 9: It would be good to provide an explanation of why these equations are used.

Gave a short explanation of Bayesian updating and its benefits by reducing the number of dimensions of numerical quadrature.


p. 13 l. 278: Which global optimization method has been used?

This paragraph was clarified to mention that these are general considerations independent of the specific method. The paragraph was split at the point where algorithm, SHGO, is mentioned (l. 285) to clarify.


p. 13 l. 286: Please provide further details about the SHGO method.

Added a short explanation of how the SHGO method works. Furthermore, gave more detail to the advantage of the SHGO method (yielding exactly one local optimization starting point per local minimum) if the parameter space has been sufficiently sampled.


p. 23 l. 480: What is the consequence of the regional aggregate heat flow not following a gamma distribution? This is later explained but it would be important to provide here already a brief explanation and a reference to the later section 4.2.3 for the detailed explanation.

Added a brief explanation of the potential problem of using an imperfect model, and refer to sections 4.2.2, 4.2.3, 4.3.2


p. 26 l. 534: The process has been repeated 100 times. Has a convergence test been performed for this?

Added Figures S9, S10, and S11 in the supplement, showing the convergence of the plots in Fig. 13 of the manuscript.


p. 30 l. 609: How many samples have been used for the MonteCarlo simulations? Where convergence test performed?

Added Figures S7 and S8 in the supplement that shows the convergence analysis for Fig. 15 in the manuscript.


p. 32 l. 640: See the previous comment about the convergence test.

Same simulations as previous point.


p. 35 l. 724725: What impacts would an uncertain heat transport have on the determination of the heat flow over REHEATFUNQ?

Uncertain heat transport leads to further uncertainty and we expect a further diffusion of the prior if the heat flow anomaly is variable within a certain range. Since the heat transport enters the REHEATFUNQ analysis only through the coefficients c_i, uncertain heat transport can be introduced to our model by varying these coefficients according to the uncertainty of the heat flow model, and provide weights for each point in the model range. In this way, one would condense all uncertainty of the heat transport into marginal distributions of the c_i.

We have implemented an option to provide uncertain heat transport via a weighted list of sets of functions f_i to evaluate the coefficients at the data locations (the structure is {(w_i, f_i(x,y)) : i=1,…,n}). This is effectively a sample of the marginal “distribution” of the heat flow distribution. Internally, this set of weighted functions is combined, randomly sampled, with the set of data selections from the minimum distance criterion to yield a combined latent parameter dimension. This dimension samples the combined posterior of the distance criterion and the effect of uncertain heat transport.


Supplementary Material: Figures S2 to S4 are missing the axis labels.

Added missing labels

Citation: https://doi.org/10.5194/egusphere2023222AC2 

RC2: 'Comment on egusphere2023222', Anonymous Referee #2, 24 Oct 2023
This contribution presents a model tackling the problem that surface heat flow measurements do not dissociate the background crustal heat flow from the heat flow induced by anomalies such as fault friction, taken as the feature of focus in this model. A method is devised to separate from heat flow measurements, the background heat flow and the strength of a fault frictional heat anomaly. This Bayesian inference method seems adequate and is interesting to pursue. I like the random global Rdisk coverings to reduce local variability but it could be specified in simpler words why it is used. The heat flow is assumed to follow a gamma distribution and the message is hard to distillate because it is emphasised the gamma distribution does not fit the data, but that it is the best that can be used. The model was tested on 4 different regions around the San Andreas fault to revisit previous results about fault strength.
The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.
I was confused by the introduction of the random global Rdisk coverings this early as part the Data section 2.2 when more of its details is explained in the model section 3.2.4 and is also a mini section 3.4. I would like to see only one section about it for readability and I would rather see it in the model section 3.2.4.
I would like the authors to find a more positive and logical spin in the validation section about the choice of gamma distribution. Furthermore it should be clearly highlighted in 4.2.3 why the gamma distribution is the best distribution and if it is not, why it was chosen regardless. I do not understand why Weibull is not a good distribution for example. It should be also highlighted if no other distribution is better, what could be an alternate solution that would produce a lower rejection rate and why is the solution not considered in the model (maybe it cannot be modelled?).
Text correction:
Section 2 Data: can the term “data” be specified?
line 117: leads us to chose a minimum distance > choose
Line 183: illustrated in Figure ? highlights
Figure 4 refers to CDF which has not been introduced yet
Line 241: a prior > a priori
Line 327: we can compute evaluate posteriors > ?
Can section 3.5 about heat conduction be part of 3.3 on the anomaly detection? I feel it should not be an independent section
Rejection rate needs to be explained more in details
Section 4.3 synthetic data: can the term “data” be specified?
Line 793: the results are less clear > I have trouble seeing what makes a result clear or not…Citation: https://doi.org/10.5194/egusphere2023222RC2 
AC3: 'Reply on RC2', Malte Ziebarth, 22 Dec 2023
Dear Reviewer #2,
thank you kindly for your helpful and valuable feedback that led us to improve our manuscript.
We would like to respond to your comments in two steps. First, we would like to answer to your highlevel comments in this interactive discussion; this is the first section of this document. The remaining text corrections led to straightforward improvements of our manuscript; we list them in the second part of this manuscript.Discussion
This contribution presents a model tackling the problem that surface heat flow measurements do not dissociate the background crustal heat flow from the heat flow induced by anomalies such as fault friction, taken as the feature of focus in this model. A method is devised to separate from heat flow measurements, the background heat flow and the strength of a fault frictional heat anomaly. This Bayesian inference method seems adequate and is interesting to pursue. I like the random global Rdisk coverings to reduce local variability but it could be specified in simpler words why it is used. The heat flow is assumed to follow a gamma distribution and the message is hard to distillate because it is emphasised the gamma distribution does not fit the data, but that it is the best that can be used. The model was tested on 4 different regions around the San Andreas fault to revisit previous results about fault strength.
 The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.
 Thank you kindly for this feedback. Indeed, our paper tries to manage a difficult balance. On one hand, it is a reference of all technical details of the development of the REHEATFUNQ model, as well as a synthesis of all supporting tests and analyses that we have performed. On the other hand, it aims to make REHEATFUNQ accessible for a wide number of geoscientists—where as you have pointed out correctly our draft has a high potential to obscure the practically relevant details through the large amount of technical description. We have made a number of changes to the organization of the manuscript to make it more accessible to a wide geoscientific audience:
 We have added a new onepage section “2. Workflow cheat sheet” directly after the introduction. This page gives a short summary of the steps involved in a typical usage scenario of REHEATFUNQ, lists Python classes involved in those steps, and links to the documentation where code examples are listed.
 Added the “PointsofInterest Measurement” toy model to the appendix to better illustrate the potential impact of spatial data clustering, and to illustrate the mitigation using the dminsampling. The model is built on simple assumptions on the nature of human heat flow data acquisition.
 Thank you kindly for this feedback. Indeed, our paper tries to manage a difficult balance. On one hand, it is a reference of all technical details of the development of the REHEATFUNQ model, as well as a synthesis of all supporting tests and analyses that we have performed. On the other hand, it aims to make REHEATFUNQ accessible for a wide number of geoscientists—where as you have pointed out correctly our draft has a high potential to obscure the practically relevant details through the large amount of technical description. We have made a number of changes to the organization of the manuscript to make it more accessible to a wide geoscientific audience:
 I was confused by the introduction of the random global Rdisk coverings this early as part the Data section 2.2 when more of its details is explained in the model section 3.2.4 and is also a mini section 3.4. I would like to see only one section about it for readability and I would rather see it in the model section 3.2.4.
 The original reason why the RGRDCs are mentioned at multiple places in the manuscript is that different variants of the same idea are used at different places (when comparing different distributions and when fitting the conjugate prior, we use coverings of the real world NGHF data. For the synthetic validation tests, we use random computergenerated data that mimic the RGRDCs of the NGHF). We have now consolidated the different sections into the appendix since the algorithms are mostly a technical detail—the underlying idea can be conveyed using just Figure 1.
 I would like the authors to find a more positive and logical spin in the validation section about the choice of gamma distribution. Furthermore it should be clearly highlighted in 4.2.3 why the gamma distribution is the best distribution and if it is not, why it was chosen regardless. I do not understand why Weibull is not a good distribution for example. It should be also highlighted if no other distribution is better, what could be an alternate solution that would produce a lower rejection rate and why is the solution not considered in the model (maybe it cannot be modelled?).
 As we have noted in the section, the results of the comparison is not conclusive (arguably with the exception of rejecting the Fréchet distribution). No distribution is unanimously selected in the majority of regional aggregate heat flow distributions, and furthermore the positive evidence (Fig. 13a) is not significant in the vast majority of cases that a distribution is selected by the BIC. In particular, this concerns also the Weibull distribution: as we have noted, it has the highest BIC selection rate but the vast majority of selections have a ∆BIC<2, which is not significant (e.g. Kass & Raftery, 1995, p. 777) and may well be due to random fluctuations. Concluding, with the data we have at hand there is no significant preference between many of the distributions investigated, including the gamma distribution. Therefore, the choice of distribution is a modeling decision.
 A further note on the Weibull distribution: the possible leftskewness of the Weibull distribution may help the Weibull distribution to acquire a significant amount of BIC selections on small sample sizes by pure chance. We have tested the BIC selection rates for some random data of sample size N=10 drawn from a gamma distribution and found that the Weibull distribution was selected by the BIC criterion over the gamma distribution in about one third of the cases.
 From a modeling point of view, the gamma distribution is the only one that combines all of the following criteria: (1) it is defined on a positive support, (2) it has a conjugate prior (important for enabling the costly computations), (3) it is rightskewed, like the global heat flow distribution, for all parameter combinations
 We have highlighted a possible alternate solution, leading to a lower selection rate, in section 4.2.2 (now 5.1.2) and Fig. 12 (now Fig. 11). The high rejection rate can be explained by the bimodal mixture distribution of two gamma distributions. Vice versa, one may be able to achieve a lower rejection rate by (1) deriving a REHEATFUNQ model for a gamma mixture distribution or (2) by considering spatial dependence of distribution parameters. These two avenues are possible improvements of our model in future works. For now, we have shown in section 4.3.2 (now 5.2.2) that the REHEATFUNQ method can handle bimodal regional aggregate heat flow distributions, and it does so by yielding results of higher uncertainty.
 We have improved the conclusion of section 4.2.3 based on these arguments.
 As we have noted in the section, the results of the comparison is not conclusive (arguably with the exception of rejecting the Fréchet distribution). No distribution is unanimously selected in the majority of regional aggregate heat flow distributions, and furthermore the positive evidence (Fig. 13a) is not significant in the vast majority of cases that a distribution is selected by the BIC. In particular, this concerns also the Weibull distribution: as we have noted, it has the highest BIC selection rate but the vast majority of selections have a ∆BIC<2, which is not significant (e.g. Kass & Raftery, 1995, p. 777) and may well be due to random fluctuations. Concluding, with the data we have at hand there is no significant preference between many of the distributions investigated, including the gamma distribution. Therefore, the choice of distribution is a modeling decision.
Changes
Text correction:
 Section 2 Data: can the term “data” be specified?
 We have specified that this section is about heat flow data, and we note furthermore that it describes the database we have used (Lucazeau, 2019) and the kind of filtering we applied to this database.
 line 117: leads us to chose a minimum distance > choose
 Corrected.
 Line 183: illustrated in Figure ? Highlights
 Added the Figure number.
 Figure 4 refers to CDF which has not been introduced yet
 Added the abbreviation to the caption.
 Line 241: a prior > a priori
 Corrected.
 Line 327: we can compute evaluate posteriors > ?
 Corrected to "evaluate".
 Can section 3.5 about heat conduction be part of 3.3 on the anomaly detection? I feel it should not be an independent section
 Done.
 Rejection rate needs to be explained more in details
 Added more explanation in lines 475+ (of the old manuscript)
 Section 4.3 synthetic data: can the term “data” be specified?
 Added a short description to the first sentence of that section: “computegenerated samples {(x,y,q)i} of surface heat flow.”
 Line 793: the results are less clear > I have trouble seeing what makes a result clear or not…
 Specified the sentence: the “results” refer to the existence or nonexistence of a finite heat flow anomaly.
Citation: https://doi.org/10.5194/egusphere2023222AC3  The paper is dense and very technical. With my limited understanding of the technical content, I had a hard time following the authors and I believe the paper is not easily accessible to every geoscientist reader. In that sense, interpretations should be supplemented by a more physical interpretation so that all geoscientists are able to understand and use the model and/or cite the results. Those interpretations could be used as summary of sections that are quite dense to help with readability.

AC3: 'Reply on RC2', Malte Ziebarth, 22 Dec 2023
Peer review completion
Journal article(s) based on this preprint
Viewed
HTML  XML  Total  Supplement  BibTeX  EndNote  

302  155  29  486  54  21  18 
 HTML: 302
 PDF: 155
 XML: 29
 Total: 486
 Supplement: 54
 BibTeX: 21
 EndNote: 18
Viewed (geographical distribution)
Country  #  Views  % 

Total:  0 
HTML:  0 
PDF:  0 
XML:  0 
 1
Cited
Malte Jörn Ziebarth
Sebastian von Specht
The requested preprint has a corresponding peerreviewed final revised paper. You are encouraged to refer to the final revised version.
 Preprint
(2078 KB)  Metadata XML

Supplement
(136 KB)  BibTeX
 EndNote
 Final revised paper