Calculation of Uncertainty in the (U-Th)/He System

Martin, Peter E.; Metcalf, James R.; Flowers, Rebecca M.

doi:https://doi.org/10.5194/egusphere-2022-299

Preprints

https://doi.org/10.5194/egusphere-2022-299

Preprints

12 May 2022

| 12 May 2022

Calculation of Uncertainty in the (U-Th)/He System

Peter E. Martin, James R. Metcalf, and Rebecca M. Flowers

Abstract. Currently there is no standardized approach for reporting date uncertainty in the (U-Th)/He system, partly due to the fact that the methods and formulae for calculating single-grain uncertainty have never been published. This creates challenges for interpreting the expected distribution of dates within individual samples and of dates generated by differing labs. Here we publish two procedures to derive (U-Th)/He single-grain date uncertainty (linear and Monte Carlo uncertainty propagation), based on input ⁴He, radionuclide, and isotope-specific F_T values and uncertainties. We also describe a newly released software package, HeCalc, that performs date calculation and uncertainty propagation for (U-Th)/He data. Using this software, we find that date uncertainty decreases with increasing age for constant relative input uncertainties. Skew in date probability distributions (i.e., asymmetrical uncertainty) yielded by the Monte Carlo method varies with age and increases with increasing relative input uncertainty. Propagating uncertainties in ⁴He and radionuclides using a compilation of real (U-Th)/He data (N = 1978 apatites, and 1753 zircons) reveals that the uncertainty budget in this dataset is dominated by uncertainty stemming from the radionuclides, yielding median relative uncertainty values of 2.9 % for apatites and 1.7 % for zircons. When uncertainties in F_T of 2 % or 5 % are assumed and additionally propagated, these values increase to 3.3 % and 5.0 % for apatite, and 2.4 % and 4.7 % for zircon. The potentially strong influence of F_T on the uncertainty budget indicates the need to better quantify and routinely propagate F_T uncertainty into (U-Th)/He dates. Skew is generally positive and can be significant, with ~14 % of apatite dates and ~5 % of zircon dates characterized by skew of 10 % or greater. This outcome indicates the value of applying Monte Carlo uncertainty propagation to identify samples with substantially skewed uncertainties that should be considered during data interpretation. The formulae published here and the associated HeCalc software can aid in more consistent (U-Th)/He uncertainty reporting and enable a more rigorous understanding of when and why multiple aliquots from a sample differ beyond what is expected from analytical and F_T uncertainties.

Received: 05 May 2022 – Discussion started: 12 May 2022

Download & links

Preprint (PDF, 2064 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Preprint (2064 KB)

Download & links

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

07 Feb 2023

Calculation of uncertainty in the (U–Th) ∕ He system

Peter E. Martin, James R. Metcalf, and Rebecca M. Flowers

Geochronology, 5, 91–107, https://doi.org/10.5194/gchron-5-91-2023,https://doi.org/10.5194/gchron-5-91-2023, 2023

Short summary

Peter E. Martin et al.

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2022-299', Pieter Vermeesch, 16 May 2022
This paper discusses the error propagation of (U-Th)/He data. Its introduction claims that “the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed”. I have three comments about this statement:

I am sure that the error propagation of the (U-Th)/He method has been worked out before, and probably several times. In fact, I have done so myself, and even implemented it in a publicly accessible computer program: https://ucl.ac.uk/~ucfbpve/heliocalc/.

One probable reason why nobody published the error propagation formulas for the (U-Th)/He method is the overdispersion that characterises most (U-Th)/He datasets: the scatter of several aliquots from the same samples usually exceeds the precision of the data, by a lot. So, in a sense, the analytical uncertainties are irrelevant. The interplay between analytical uncertainty and overdispersion is discussed by Vermeesch (2010, doi:10.1016/j.chemgeo.2010.01.002), who also covers some aspects of the error propagation problem.

A second reason why error propagation hasn’t been discussed much is that it is next to impossible to quantify the analytical uncertainty of the alpha ejection correction, which is one of the main sources of uncertainty in (U-Th)/He dating. Geochronologists slap a nominal uncertainty on this correction, which largely defeats the purpose of rigorous error propagation for the other variables. Unfortunately, the paper under consideration does not address this issue.

Despite these three caveats, I do not object to publishing the error propagation formulas in GChron. However, before this can happen the manuscript needs serious revision. The paper is far too long and can be shortened by at least 50%. I will make some specific suggestions for this later in this review.

The paper uses both standard error propagation and Monte Carlo (MC) simulation. I have two comments about this:

According to the authors, the main advantage of the MC method is its ability to handle skewed error distributions. However, it would be easy to adjust the conventional error propagation to handle the observed skewness. This can be achieved by reformulating the error propagation formula in terms of the log of the variables (e.g., Section 5 of https://doi.org/10.5194/gchron-2-119-2020). Thus, the log of the age could be calculated as a function of the U, Th and He concentrations. An even better solution would be to use log-ratios. See Vermeesch (2010) for details. I am not sure how easy it would be to reformulate the paper and HeCalc code in terms of log(ratio) variables. If the authors find it too difficult, then I guess that the MC approach would be fine as an alternative.

The actual main advantage of MC error propagation is not mentioned, namely its ability to handle non-Gaussian error distributions. This is particularly pertinent with regards to the alpha-ejection correction (i.e. the uncertainty of the alpha-retention factor Ft). Meesters and Dunai (2002, https://doi.org/10.1016/S0009-2541(01)00423-5) and Hourigan (2005, https://doi.org/10.1016/j.gca.2005.01.024) have shown that compositional zoning can strongly affect the fraction of ejected alpha particles. Matters are further complicated in the presence of broken grains, when the alpha ejection correction may result in overcorrection (Brown et al., 2013, https://doi.org/10.1016/j.gca.2013.05.041). Things are even more difficult for slowly cooled samples, in which alpha-ejection occurs synchronous with diffusive loss of helium. The dispersion caused by all these complexities is difficult to ascertain, but is likely non-Gaussian. The MC approach could be used to explore these effects. I’m not saying that the authors should do this in their paper (because I don’t want to make it even longer), but they should at least mention the possibility. Perhaps HeCalc could offer an interface to explore these effects?

Detailed comments:

Equations 4-10 are unnecessary. They are simply repeating Meesters and Dunai (2005, https://doi.org/10.1029/2004GC000834), and Raphson and Newton (~1711). Incidentally, I do not really see the point of using the Meesters and Dunai (2005) solution as a starting point for a Raphson-Newton algorithm anyway. Their direct solution is accurate to better than 0.1% for ages up to 500Ma, which covers all terrestrial applications of the (U-Th)/He method.

Equations 11 and 12 could be written more succinctly in matrix form.

Equations 14-18 all share the same denominator (which equals df/dt), which could be stored in a variable. All these equations could be put together into a single Jacobian matrix, or moved into the appendix.

Section 3.3.2 describes a method to choose the optimal number of MC iterations to derive a desired level of precision on the mean value. It just presents the well known “the square root of n” phenomenon, which I think is too trivial a result to occupy so much space (Figure 2 is certainly not necessary). It is also important to note that the square root of n rule only applies to the standard error of the mean. The standard error of the standard deviation (s) is given by s/sqrt(2n-2). I am mentioning this here because the uncertainty of the standard deviation is more relevant than that of the mean, which is never used in the remainder of the paper.

I installed HeCalc on my computer and am happy to confirm that it works. I have not extensively tested it though. I think that the presentation of HeCalc should take greater prominence in the paper. Of course, this will automatically happen if some of the remaining bulk is removed.

HeCalc requires that the user provide the uncertainties of the alpha-retention factors 238Ft, 235Ft, 232Ft and 147Ft. However, the paper does not explain how these uncertainties should be obtained. A nominal 5% uncertainty is used in later examples, without proper justification.

HeCalc also requires that the user specify the error correlations between the different parameters. However, it does not discuss how to estimate those correlations. Does the CU TRaiL database specify them?

Minor comment: the paper (and HeCalc) use the awkward convention to report MC uncertainties as “68% confidence intervals”. I understand where this comes from: a 1-sigma interval around the mean of a normal distribution covers 68% of that distribution. However, uncertainties are usually reported either as standard errors or as 95% confidence intervals. If the authors want to compare their analytical results with the MC simulations, then a 95% confidence interval would be more elegant.

Section 5 can be nearly completely removed. The most interesting part of this section is the finding that parent concentrations are a greater contributor to the uncertainty budget than the helium concentrations. This finding could be reported much more succinctly.

According to lines 421-423: “when combining uncertainties with equal magnitude, the resulting uncertainty will be only ~1.4 times larger than the input, rather than twice as large as might be expected.” Here the authors underestimate the reader. I am certain that the vast majority of geochronologists are familiar with the quadratic addition of uncertainties. Consequently this sentence, as well as the preceding paragraph and Figure 4, can be safely removed.

The paper attributes the reduction of analytical uncertainty with increasing date to the “roll over” of the exponential decay function. This may be correct but is largely irrelevant to real world applications. The observed reduction only expresses itself at >1 Ga, while the vast majority of published (U-Th)/He dates are <200 Ma. At young ages, the helium age equation is linear to a good approximation (https://doi.org/10.1016/j.chemgeo.2008.01.027). Note, however, that the fixed uncertainty of the helium measurements shown in Figure 3 is not realistic: older samples will tend to contain more helium, which can be measured more precisely. This will also cause a reduction of analytical uncertainty, even for Cenozoic samples.

In section 5.2, the authors introduce a new definition for skewness. This is a very bad idea. There already exists enough confusion in the geological community about basic statistical concepts. It would be unwise to add to the confusion by redefining widely accepted terms such as skewness. At this point I would like to reiterate the fact that the approximately lognormal uncertainty distribution of the dates could easily be captured analytically by recasting the equations as a function of the log of the age. Simply referring to the percent uncertainty of the age would capture the uncertainty and the skewness with a single number.

Section 5.4 applies the algorithms to a database of ~3,600 (U-Th)/He dates. It is a shame that this database is not released along with the paper. It must be a treasure trove of useful information! Unfortunately, I don’t think that Section 5.4 is particularly interesting. It definitely doesn’t deserve seven manuscript pages, four pages and three figures (not counting sub-panels). However, Figure 11 does illustrate my comment at the start of this review effectively: the nominal uncertainty of the alpha-ejection correction dwarfs the other uncertainties, thereby defeating the purpose of the careful error propagation.

Lines 615-616: “a challenge to interpreting data with asymmetrical uncertainties is that no widely used inverse thermal history modeling software for (U-Th)/He data permits the input of asymmetrical uncertainty” I’m not sure how HeFTy handles the analytical uncertainty of (U-Th)/He data, but if I seem to recall that QTQt essentially inflates the uncertainties until they account for the overdispersion of the data. This means that the uncertainties are, effectively, ignored. HeFTy probably does something similar, because otherwise its formalised hypothesis tests would fail. Ideally, thermal history inversions should aim to predict the uncorrected (U-Th)/He dates, ignoring the alpha ejection correction. As mentioned before, this is because alpha ejection occurs concurrently with thermal diffusion. So it is not a constant but a variable that depends on the thermal history (Meesters and Dunai, 2002).

Equations a1-a10 all have the same denominator. Storing this denominator in a variable would avoid a lot of duplicate text. You could then even put all these equations into a single concise Jacobian.

I apologise if this review comes across as overly critical. I think that this paper (and the HeCalc program) could serve a useful purpose. My opinions is that it would be greatly improved by trimming it down to the important parts. Perhaps the paper could be recast as one of GChron’s popular “Technical notes”? This would provide a nice way to present HeCalc to the world, whilst reviewing the error propagation problem.
Citation: https://doi.org/10.5194/egusphere-2022-299-RC1
- AC1:
  'Reply on RC1', Peter Martin, 27 May 2022
              We thank Prof. Vermeesch for taking the time to write a prompt, thoughtful, and helpful review. While it will become clear that we largely disagree with the conclusions he draws about the content of the paper, he has helped us identify several areas where the manuscript can be improved and clarified. Below, we provide a high-level response that addresses many of the reviewer’s critiques. We then respond to the individual points raised, copying the text of the original review in bold.
              Much of this review centers on shortening the text and streamlining the equations provided to yield the most compact possible description of HeCalc and uncertainty in the (U-Th)/He system. In short, we wish to write this paper for the benefit of all geoscientists, not just those with a strong grounding in math and statistics. While the reviewer likely believes that our approach to this goal largely underestimates readers, we disagree. Our experience in the CU TRaIL (Thermochronology Research and Instrumentation Lab) with hundreds of researchers, students, and other clients has demonstrated that the levels of comfort with statistics and math within the (U-Th)/He user community are highly variable. Our objective is to make the manuscript as comprehensible to as wide a readership as possible in an attempt to subvert the well-known “Curse of Expertise” wherein knowledge experts underestimate the difficulty of a subject for novice and intermediate practitioners (e.g., Hinds, 1999; https://doi.org/10.1037/1076-898X.5.2.205).
              The fundamental purpose of this paper is not to simply publish a method for (U-Th)/He uncertainty propagation, but rather to fully explicate this process for the full range of practitioners of this method, from lab managers and PIs to external users of (U-Th)/He data and early graduate students. Shortening the manuscript in the manner suggested by the reviewer would remove a significant amount of text dedicated to not just writing out the formulae for (U-Th)/He uncertainty propagation, but meant also to assist readers in developing an intuition for this system. We now recognize that this primary goal of accessibility is not well communicated in the text. We will correct this and thank Prof. Vermeesch for highlighting this shortcoming.
  This paper discusses the error propagation of (U-Th)/He data. Its introduction claims that “the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed”. I have three comments about this statement:
  The full statement in the paper is “methods for propagating uncertainty components into single-grain (U-Th)/He dates have never been described in the literature, and the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed.” We largely stand by this statement, emphasizing that we are not claiming that it has never been done, but rather that it has not been fully explained in the literature, as we intend to do in this manuscript. Again, we will revise the manuscript to more clearly communicate the latter goal.
  I am sure that the error propagation of the (U-Th)/He method has been worked out before, and probably several times. In fact, I have done so myself, and even implemented it in a publicly accessible computer program: https://ucl.ac.uk/~ucfbpve/heliocalc/.
  
  It is true that we failed to mention that a tool for performing this uncertainty propagation is presently available. However, the methods by which this program functions are not (to our knowledge) available outside of inspection of the source code. As this source code is largely uncommented and is written in javascript (in our experience, an uncommon language for geoscience computing compared with Fortran, Matlab, Python, R, or Julia), we believe it would be difficult for a non-expert to understand how or why the program produces the uncertainties that it does. Again, it is the inaccessibility of the knowledge behind uncertainty propagation that we also wish to address, so that it is no longer a “black box” to users.
  One probable reason why nobody published the error propagation formulas for the (U-Th)/He method is the overdispersion that characterises most (U-Th)/He datasets: the scatter of several aliquots from the same samples usually exceeds the precision of the data, by a lot. So, in a sense, the analytical uncertainties are irrelevant. The interplay between analytical uncertainty and overdispersion is discussed by Vermeesch (2010, doi:10.1016/j.chemgeo.2010.01.002), who also covers some aspects of the error propagation problem.
  
  We strongly disagree with this line of reasoning. This argument may have been appropriate to make 15 years ago, but the field of (U-Th)/He dating has advanced radically since then. Overdispersion is a central motivation for this contribution. How can we understand overdispersion if we don’t first carefully propagate the uncertainty in (U-Th)/He dates that can be well-quantified? By first constraining and then subtracting the scatter in data attributable purely to analytical error, one may begin to approach the problem of the underlying physical causes of overdispersion with greater confidence. This phenomenon is one of the core motivations for the work, as stated in the introduction, background, and conclusion sections. We will edit to make this point even more clearly in the text.
  A second reason why error propagation hasn’t been discussed much is that it is next to impossible to quantify the analytical uncertainty of the alpha ejection correction, which is one of the main sources of uncertainty in (U-Th)/He dating. Geochronologists slap a nominal uncertainty on this correction, which largely defeats the purpose of rigorous error propagation for the other variables. Unfortunately, the paper under consideration does not address this issue.
  
  This claim that it is “next to impossible to quantify the analytical uncertainty of the alpha ejection correction” is simply untrue. Recent (Cooperdock et al., 2019; https://doi.org/10.5194/gchron-1-17-2019) and ongoing (Zeigler et al., 2021; DOI: 10.1002/essoar.10507962.1) work is quantifying the geometric uncertainties associated with alpha-ejection corrections, as is summarized briefly in our paper. It also is possible to quantify the uncertainty introduced by zonation (e.g., Hourigan et al., 2005 DOI: 10.1016/j.gca.2005.01.024; Johnstone et al., 2013 DOI: 10.1016/j.gca.2013.01.004). These considerations are fully discussed in the recent review paper by Flowers et al. (2022; https://doi.org/10.1130/B36266.1), in which numerous labs agree for the need to more rigorously quantify uncertainties in alpha-ejection corrections and propagate them into the reported uncertainty in (U-Th)/He dates, rather than “slapping a nominal uncertainty on the correction”. In this manuscript we both provide a method by which to do Ft uncertainty propagation and also fully explain it.
  While it is true that in this paper we apply some nominal uncertainties to the F_T values in Section 5, these uncertainties are meant to illustrate exactly the point the reviewer is making here: that Ft uncertainty is likely a major contributor to intra-sample dispersion, and that quantification of these uncertainties should be a high priority for the (U-Th)/He community.
  Despite these three caveats, I do not object to publishing the error propagation formulas in GChron. However, before this can happen the manuscript needs serious revision. The paper is far too long and can be shortened by at least 50%. I will make some specific suggestions for this later in this review.
  The paper uses both standard error propagation and Monte Carlo (MC) simulation. I have two comments about this:
  According to the authors, the main advantage of the MC method is its ability to handle skewed error distributions. However, it would be easy to adjust the conventional error propagation to handle the observed skewness. This can be achieved by reformulating the error propagation formula in terms of the log of the variables (e.g., Section 5 of https://doi.org/10.5194/gchron-2-119-2020). Thus, the log of the age could be calculated as a function of the U, Th and He concentrations. An even better solution would be to use log-ratios. See Vermeesch (2010) for details. I am not sure how easy it would be to reformulate the paper and HeCalc code in terms of log(ratio) variables. If the authors find it too difficult, then I guess that the MC approach would be fine as an alternative.
  
  We will look into this possibility; an explicit calculation of skewed uncertainties would certainly be a useful addition. On the other hand, we do not believe that the MC approach has any major downsides given HeCalc’s computational efficiency. An added benefit of MC is in its mathematical simplicity, furthering the overall purpose of this paper, which is to provide an easily comprehensible description of the means of deriving uncertainty in (U-Th)/He dates.
  The actual main advantage of MC error propagation is not mentioned, namely its ability to handle non-Gaussian error distributions. This is particularly pertinent with regards to the alpha-ejection correction (i.e. the uncertainty of the alpha-retention factor Ft). Meesters and Dunai (2002, https://doi.org/10.1016/S0009-2541(01)00423-5) and Hourigan (2005, https://doi.org/10.1016/j.gca.2005.01.024) have shown that compositional zoning can strongly affect the fraction of ejected alpha particles. Matters are further complicated in the presence of broken grains, when the alpha ejection correction may result in overcorrection (Brown et al., 2013, https://doi.org/10.1016/j.gca.2013.05.041). Things are even more difficult for slowly cooled samples, in which alpha-ejection occurs synchronous with diffusive loss of helium. The dispersion caused by all these complexities is difficult to ascertain, but is likely non-Gaussian. The MC approach could be used to explore these effects. I’m not saying that the authors should do this in their paper (because I don’t want to make it even longer), but they should at least mention the possibility. Perhaps HeCalc could offer an interface to explore these effects?
  
  This is a good point, and one we had not fully considered. It would certainly be possible to offer a means of inputting additional non-gaussian uncertainty components of arbitrary shape into HeCalc. We will evaluate the feasibility of including it in HeCalc for the final version of this paper, and, if it is not feasible, try to include it in future HeCalc versions.
  Detailed comments:
  Equations 4-10 are unnecessary. They are simply repeating Meesters and Dunai (2005, https://doi.org/10.1029/2004GC000834), and Raphson and Newton (~1711). Incidentally, I do not really see the point of using the Meesters and Dunai (2005) solution as a starting point for a Raphson-Newton algorithm anyway. Their direct solution is accurate to better than 0.1% for ages up to 500Ma, which covers all terrestrial applications of the (U-Th)/He method.
  Equation 4 is an alteration to the Meeters and Dunai (2005) method that is not included in the original paper, so we think it should probably be kept. The other equations are indeed a reproduction of other works. We included these for complete traceability of the math in the paper, but will defer to the editor on decisions about whether to replicate equations that are available elsewhere.
  Many (U-Th)/He dates exceed 500 Ma, including dates produced regularly by CU TRaIL. "Deep time thermochronology” is becoming increasingly popular. Moreover, is there any reason to knowingly accept errors in date computation? The Newton-Raphson method is efficient enough that for ages <500 Ma virtually no loss of computational efficiency is observed. Computational efficiency only suffers for older dates where the Meesters and Dunai method provides poorer estimates, but in these cases the Meesters and Dunai-generated age has greater error anyhow. We see no reason why labs would not report dates using iteration; this approach is favored by labs in the recent GSAB paper on reporting (U-Th)/He data (Flowers et al., 2022; https://doi.org/10.1130/B36266.1).
  Equations 11 and 12 could be written more succinctly in matrix form.
  Equations 14-18 all share the same denominator (which equals df/dt), which could be stored in a variable. All these equations could be put together into a single Jacobian matrix, or moved into the appendix.
  Here again, we favor clarity over brevity and feel that the way these equations are written are more widely accessible than their admittedly more compact matrix counterparts. The main reason for including this series of equations is to provide the means by which a lab that does not wish to incorporate HeCalc into their workflow may perform rigorous uncertainty propagation in a spreadsheet program. While it is true that Excel can handle matrix math, this process requires multiple rows which will generally not be compatible with most labs’ workflows.
  Section 3.3.2 describes a method to choose the optimal number of MC iterations to derive a desired level of precision on the mean value. It just presents the well known “the square root of n” phenomenon, which I think is too trivial a result to occupy so much space (Figure 2 is certainly not necessary). It is also important to note that the square root of n rule only applies to the standard error of the mean. The standard error of the standard deviation (s) is given by s/sqrt(2n-2). I am mentioning this here because the uncertainty of the standard deviation is more relevant than that of the mean, which is never used in the remainder of the paper.
  This is an entirely valid criticism. We will plan to convert the algorithm to select the number of Monte Carlo iterations to a calculation of standard error of the standard deviation and trim this section significantly.
  I installed HeCalc on my computer and am happy to confirm that it works. I have not extensively tested it though. I think that the presentation of HeCalc should take greater prominence in the paper. Of course, this will automatically happen if some of the remaining bulk is removed.
  It is good to hear that an outside user has successfully downloaded and run HeCalc. The usability of new software is always challenging to assess as we are sure the reviewer is aware. As far as “taking further prominence”—is there a component of the description of HeCalc that is missing?
  HeCalc requires that the user provide the uncertainties of the alpha-retention factors 238Ft, 235Ft, 232Ft and 147Ft. However, the paper does not explain how these uncertainties should be obtained. A nominal 5% uncertainty is used in later examples, without proper justification.
  As mentioned above, the means to derive Ft uncertainties is an active area of research and would be beyond the scope of this paper; the 5% Ft uncertainty (and 2%, which was also included) was meant to be illustrative. These values are, however, not arbitrary but are based on recent work (cited in the manuscript) and are a likely reasonable magnitude for the purposes of this study. On the other hand, it is entirely possible to input zero uncertainty for each and obtain the same result as if the uncertainties were ignored by the program.
  HeCalc also requires that the user specify the error correlations between the different parameters. However, it does not discuss how to estimate those correlations. Does the CU TRaiL database specify them?
  Unlike Ft uncertainties, error correlations are not required inputs; the program defaults to assuming uncorrelated uncertainty if these columns are not present. CU TRaIL does not currently include correlations in uncertainty for any regularly calculated uncertainty (i.e., He, U, Th, or Sm)—our methods should in theory result in fully independent error. As discussed in the background section, uncertainties in Ft are likely highly correlated, though we will leave the quantification of this to future researchers. We apply both fully correlated and uncorrelated uncertainties in Section 5 to circumvent this currently underconstrained problem.
  Minor comment: the paper (and HeCalc) use the awkward convention to report MC uncertainties as “68% confidence intervals”. I understand where this comes from: a 1-sigma interval around the mean of a normal distribution covers 68% of that distribution. However, uncertainties are usually reported either as standard errors or as 95% confidence intervals. If the authors want to compare their analytical results with the MC simulations, then a 95% confidence interval would be more elegant.
  This is another good point. Ideally, the program would provide both 1- and 2-sigma uncertainties and their equivalents for non-gaussian distributions calculated by MC to allow apples-to-apples comparison of whatever uncertainty statistic an end user prefers. We will implement both a 68% and 95% confidence interval output for the code.
  Section 5 can be nearly completely removed. The most interesting part of this section is the finding that parent concentrations are a greater contributor to the uncertainty budget than the helium concentrations. This finding could be reported much more succinctly.
  This is where our most important disagreement with this review becomes apparent. Section 5 is almost entirely devoted to a thorough exploration of the mathematical systems described in earlier sections in largely non-mathematical terms, with the intent of elucidating the system for all users. Although this section likely feels boring and repetitive to those with significant experience with statistics, for many, this portion of the paper may be crucial to developing a real and deep understanding of the uncertainty components in a (U-Th)/He date and how they combine to provide a date uncertainty. We feel that the subsection 5.4 in particular is essential. For users with more hands-on experience with real data than statistical/mathematical theory, seeing how the concepts described early in the paper apply to real data may be critical.
  According to lines 421-423: “when combining uncertainties with equal magnitude, the resulting uncertainty will be only ~1.4 times larger than the input, rather than twice as large as might be expected.” Here the authors underestimate the reader. I am certain that the vast majority of geochronologists are familiar with the quadratic addition of uncertainties. Consequently this sentence, as well as the preceding paragraph and Figure 4, can be safely removed.
  Though we agree all researchers would be capable of calculating uncertainty by adding components in quadrature and observing the non-linear effects of this approach, in several conversations we have found that users of (U-Th)/He data at a variety of levels did not intuitively and instinctively expect this relationship. Again, our primary goal is increased accessibility to a broad group of (U-Th)/He data producers, users, and students, not the handful of statistical experts who can easily skip over a couple dozen words.
  The paper attributes the reduction of analytical uncertainty with increasing date to the “roll over” of the exponential decay function. This may be correct but is largely irrelevant to real world applications. The observed reduction only expresses itself at >1 Ga, while the vast majority of published (U-Th)/He dates are <200 Ma. At young ages, the helium age equation is linear to a good approximation (https://doi.org/10.1016/j.chemgeo.2008.01.027). Note, however, that the fixed uncertainty of the helium measurements shown in Figure 3 is not realistic: older samples will tend to contain more helium, which can be measured more precisely. This will also cause a reduction of analytical uncertainty, even for Cenozoic samples.
  As discussed earlier, ancient (U-Th)/He dates are becoming common as deep time thermochronology gains prominence; dates where this uncertainty reduction becomes important do exist, and are increasingly routine.
  Figure 3 demonstrates the relationship between input and date uncertainty, but is not meant to reflect a series of real data (i.e., we are not suggesting that uncertainty in helium measurement will be the same across all dates). Because the shape of this reduction depends only on Th/U ratio, the main use for this figure would be to read off a % reduction in uncertainty at a given date for a given Th/U ratio.
  Additionally, the decrease in He uncertainties with sample age is neither continuous nor universal. Many analysts will specifically target grains with a range of U and Th contents to explore the effects of radiation damage, meaning grains with nominally the same date can have very different He contents. Even with increasing He contents, uncertainties do not continuously decrease, but instead are a product of uncertainty resulting from blank and standard measurements, creating a minimum uncertainty bound. It is not the case that a 1000 Ma grain, for example, necessarily has a lower He uncertainty than a 100 Ma grain.
  In section 5.2, the authors introduce a new definition for skewness. This is a very bad idea. There already exists enough confusion in the geological community about basic statistical concepts. It would be unwise to add to the confusion by redefining widely accepted terms such as skewness. At this point I would like to reiterate the fact that the approximately lognormal uncertainty distribution of the dates could easily be captured analytically by recasting the equations as a function of the log of the age. Simply referring to the percent uncertainty of the age would capture the uncertainty and the skewness with a single number.
  This seems like another instance where style preference comes to the fore. Skewness is a widely accepted term within the statistics community, and it is likely that many geochemists are obliquely familiar. We believe it unlikely, however, that most geochemists intuitively grasp the inherent meaning behind the numerical calculation of skew (e.g., what does a skew of +0.8 mean for a given distribution?). The redefinition we include is meant only to assist in reader and user comprehension of the skew of the distributions. If the reviewer thinks it would be appropriate to provide the formal skewness in addition, this would be straightforward to implement, but may unnecessarily clutter the output, especially with the inclusion of 95% confidence intervals.
  Section 5.4 applies the algorithms to a database of ~3,600 (U-Th)/He dates. It is a shame that this database is not released along with the paper. It must be a treasure trove of useful information! Unfortunately, I don’t think that Section 5.4 is particularly interesting. It definitely doesn’t deserve seven manuscript pages, four pages and three figures (not counting sub-panels). However, Figure 11 does illustrate my comment at the start of this review effectively: the nominal uncertainty of the alpha-ejection correction dwarfs the other uncertainties, thereby defeating the purpose of the careful error propagation.
  Much of the data in this compilation comes from samples CU TRaIL was contracted to run. We do not “own” it and it therefore is not ours to release outside of these anonymized and derived figures. Much of the data produced by CU TRaIL for internal research projects is published or is in the process of being published, and is easily discoverable in the literature. Our lab has played a lead role in developing community agreement on (U-Th)/He data reporting protocols and in making the information needed to understand and interpret (U-Th)/He data accessible to the community (Flowers et al., 2022a,b). However, we cannot release unpublished data without their scientific context and with no means for proper attribution, especially when those data are not ours to release in the first place.
  As stated earlier, we feel that this section is in fact highly important as a means of making the results of this study more broadly comprehensible. This final section provides examples and (hopefully) builds intuition for non-experts as to how this system will actually function in practice. Presented with the bare facts in Sections 2-4, we believe many (U-Th)/He practitioners would have the immediate thought “Ok, so how does this apply to my data?” Our goal with Section 5 is to provide an answer to that question. We will, however, revisit with an eye to tightening.
  Figure 11 is important precisely to make the point that the current efforts to more rigorously quantify Ft uncertainties are well-worth the time. And at the point that such uncertainties are well-quantified, then they should be propagated into the uncertainties reported on (U-Th)/He dates.
  Lines 615-616: “a challenge to interpreting data with asymmetrical uncertainties is that no widely used inverse thermal history modeling software for (U-Th)/He data permits the input of asymmetrical uncertainty” I’m not sure how HeFTy handles the analytical uncertainty of (U-Th)/He data, but if I seem to recall that QTQt essentially inflates the uncertainties until they account for the overdispersion of the data. This means that the uncertainties are, effectively, ignored. HeFTy probably does something similar, because otherwise its formalised hypothesis tests would fail. Ideally, thermal history inversions should aim to predict the uncorrected (U-Th)/He dates, ignoring the alpha ejection correction. As mentioned before, this is because alpha ejection occurs concurrently with thermal diffusion. So it is not a constant but a variable that depends on the thermal history (Meesters and Dunai, 2002).
  For HeFTy, we are certain that the program takes only the uncertainty provided by the user, which is assumed to be gaussian in the formulation of HeFTy’s goodness-of-fit parameter. This issue is explored in Vermeesch and Tian (2014; 10.1016/j.earscirev.2014.09.010) and the subsequent comments and replies, where the authors note that improved uncertainty results in fewer “good” and “acceptable” paths, which is really just an artifact of the random Monte Carlo path generation method combined with tighter goodness-of-fit requirements of improved uncertainty. We are less familiar with QTQt, but it must apply some version of uncertainty to the data. Unless this uncertainty is permitted to be asymmetrical, the overall point stands: date probability distributions have the potential to be non-gaussian, and that potential should be included in thermal modeling.
  Equations a1-a10 all have the same denominator. Storing this denominator in a variable would avoid a lot of duplicate text. You could then even put all these equations into a single concise Jacobian.
  We have the same response to this as with earlier suggestions that would streamline the equations in this paper. Adding variables would just increase the complexity of interpreting the equations. In contrast, the repeated text does not harm the paper in any way (it doesn’t even add lines to the manuscript). It is true that these could be recast in matrix math, but for input to an excel workbook, readers would need to revert that change. On the other hand, any practitioner who wishes to convert these equations to matrix form before using them is likely to be more than capable of doing so.
  I apologise if this review comes across as overly critical. I think that this paper (and the HeCalc program) could serve a useful purpose. My opinions is that it would be greatly improved by trimming it down to the important parts. Perhaps the paper could be recast as one of GChron’s popular “Technical notes”? This would provide a nice way to present HeCalc to the world, whilst reviewing the error propagation problem.
  
  Citation: https://doi.org/10.5194/egusphere-2022-299-AC1
RC2:
'Comment on egusphere-2022-299', Ryan Ickert, 11 Aug 2022

Review of “Calculation of Uncertainty in the (U-Th)/He System” for Geochronology, by Martin et al.

This review is by Ryan Ickert (Purdue University)

This manuscript presents an algorithm for determining U-Th-Sm/He dates in accessory minerals and then derives uncertainty propagation equations for them, combining uncertainties in measurements of nuclide amounts and geometric correction factors. They also introduce a Monte Carlo method and compare the two. These algorithms are implemented in a python package hosted on zenodo.

A consistent set of uncertainty propagation equations for the community is a welcome addition to the literature, although it is unlikely that it will strongly affect science derived from the measurements due to pervasive overdispersion.

I’ve had the opportunity to read the review by Pieter Vermeesch prior to writing this, and that saves me a lot of time writing my review because I came to basically the same conclusions as him. I don’t think it’s necessary for me to state specifically on which points I agree, but one that affects the manuscript substantially is the suggestion to cut down the length. I agree that section five can be completely removed. I don’t think it’s necessary for the authors to include log functions to the analytical solutions, but I agree that it would be a superior technique.

What I think would be very useful is a comparison of before/after of typical analyses from the authors lab. A few representative datasets would be just fine. I think this would be of great value in showing readers whether this approach changes assigned uncertainties in any substantial way. Maybe it doesn’t, but that’s not necessarily a bad thing.

I think it’s a missed opportunity somewhat to not assess the accuracy of assigned uncertainties in the U-Th-Sm, and the He analyses. I appreciate that the authors have carved the manuscript such that this is “out of scope” (which is their prerogative), but the brief comment that tracers are often added by pipetting (without discussing the implications of adding a spike isotope using a technique with such a high variability) suggests to me that the radionuclide measurements may be underestimated. Not to mention that the treatment of under/over spiking and blank subtraction (to name a couple), in my experience, are dealt with in a highly variable way by different people in the U-Th-Sm/He community. I guess that leaves space for the next paper!

I have a few shorter comments below:

L9: What quantities are ⁴He and “radionuclide” here. Are they amounts, concentrations….? Please briefly define F_T in the abstract for a non-specialist, particularly because it’s referred to as particularly important later in the abstract.

L11: Is this relative or absolute uncertainty?

L15: Again, is this concentration?

L15: What is the confidence level for these estimates? 95%? 2sigma?

L34: I assume that by “kinetic” the authors mean diffusion kinetics.

L76: “ppm” is, in general, ambiguous and best practice is that it should be avoided. I appreciate that there is an implicit convention in some geochemical subfields that it refers only to µg/g, but it is not always the case and there is no disadvantage to being explicit and using the SI-consistent µg/g.

L80: “sector” should say “magnetic sector”

L81: The technique is called “isotope dilution” not “isotope spike”.

L82: I’m not sure what “ratioed mass spectrometric measurements” are?

L103-104: It’s not clear to me that this is true. For example, if a mixed U-Th-Sm tracer is used and the mass of spike solution is relatively small (which is usually the case), the uncertainty in the amount of spike added (which propagates directly onto the amount of U, Th, and Sm calculated) will be relatively large and since the tracer is added as a mixture, that component will be highly correlated. It takes special care to get weighing errors to less than 1%, and with small amounts of spike they can easily be in the 5-10% range. When pipetting without weighing (which is what is implied here) the problem can be much worse. Pipetting consistency can vary, for sure, but for 25 µl the relative standard deviation on masses dispensed can be 3-5%, which translates directly to a 3-5% uncertainty on the measured quantities. This seems like it should be large enough to matter?

L117: 2-9% in what? The Ft correction or the final date? And is this a bias (e.g., the technical definition of “error”) or additional uncertainty on the date?

L131: Here and elsewhere the word “variance” is used. It’s unclear as to whether this is referring to a moment of the gaussian distribution or as a casual synonym for uncertainty or data scatter.

L366: A Th/U (by mass) ratio of 1.25 is *not* typical of zircon, it is unusually high. Looking at the georoc database of zircon compositions, after doing some data culling, the median value is about 0.6 (mean = 0.8). That sounds much more realistic to me than 1.25.

L420: This statement should be justified or referenced, or else removed. It sounds a little bit like it is underestimating the math and stats skills of typical geochronologists.

Citation: https://doi.org/10.5194/egusphere-2022-299-RC2
- AC2: 'Reply on RC2', Peter Martin, 29 Aug 2022
  
  We would like to thank Dr. Ickert for his review; we respond to his feedback in-line below, using bold to easily differentiate our response.
  This manuscript presents an algorithm for determining U-Th-Sm/He dates in accessory minerals and then derives uncertainty propagation equations for them, combining uncertainties in measurements of nuclide amounts and geometric correction factors. They also introduce a Monte Carlo method and compare the two. These algorithms are implemented in a python package hosted on zenodo.
  A consistent set of uncertainty propagation equations for the community is a welcome addition to the literature, although it is unlikely that it will strongly affect science derived from the measurements due to pervasive overdispersion.
  We generally agree with this assessment regarding current and archival data. However, by accurately constraining uncertainty (or, more specifically, the probability distribution) of an analysis, we can also accurately constrain the amount and distribution of overdispersion present in a given dataset to better understand the causes. This work also has the potential to future-proof the field as the technique is applied to increasingly challenging phases and time periods (both deep time and unusually young samples) where analytical uncertainties may become more significant.
  I’ve had the opportunity to read the review by Pieter Vermeesch prior to writing this, and that saves me a lot of time writing my review because I came to basically the same conclusions as him. I don’t think it’s necessary for me to state specifically on which points I agree, but one that affects the manuscript substantially is the suggestion to cut down the length. I agree that section five can be completely removed. I don’t think it’s necessary for the authors to include log functions to the analytical solutions, but I agree that it would be a superior technique.
  Section 5 – the entire discussion of the manuscript – is a fundamental part of this paper, just as the discussion is critical for any manuscript. This section shows the influence of input uncertainties on output uncertainties, is potentially critical for readers who do not have a strong foundation in the mathematical concepts presented earlier in the paper (based on experience with hundreds of (U-Th)/He data users at all levels), and shows the implications for real data. A clear example applying mathematical concepts is often the bridge to understanding for anyone struggling to follow a purely conceptual discussion. It was our understanding that Geochronology is a journal intended for exactly this type of contribution. However, we do appreciate the suggestion that Section 5 can be tightened up, and we will revise the manuscript to keep Section 5 appropriately terse and present some of the less impactful discussion points (e.g. MC vs linear uncertainty propagation) to the appendices.
  What I think would be very useful is a comparison of before/after of typical analyses from the authors lab. A few representative datasets would be just fine. I think this would be of great value in showing readers whether this approach changes assigned uncertainties in any substantial way. Maybe it doesn’t, but that’s not necessarily a bad thing.
  We are not sure exactly what this suggestion entails? Since there is currently no generally accepted method for uncertainty propagation in (U-Th)/He dating, there is not really anything to compare against. While we could provide a comparison between the uncertainties assigned to the data in CU TRaIL prior to the development of this method, this comparison would be highly specific to our lab and would likely not be predictive of what other labs would observe if they made such a comparison.
  I think it’s a missed opportunity somewhat to not assess the accuracy of assigned uncertainties in the U-Th-Sm, and the He analyses. I appreciate that the authors have carved the manuscript such that this is “out of scope” (which is their prerogative), but the brief comment that tracers are often added by pipetting (without discussing the implications of adding a spike isotope using a technique with such a high variability) suggests to me that the radionuclide measurements may be underestimated. Not to mention that the treatment of under/over spiking and blank subtraction (to name a couple), in my experience, are dealt with in a highly variable way by different people in the U-Th-Sm/He community. I guess that leaves space for the next paper!
  We generally agree that including uncertainty analyses for the component data (He, U-Th-Sm, and Ft) would be nice to have in the literature; the challenge, as mentioned in this comment, is that the techniques for these measurements are not consistent between labs. Such a discussion would therefore necessarily not be globally applicable and we do consider it beyond the scope of this paper. And including this discussion here would likely double or triple the length of the paper, which, as acknowledged by both reviewers, is already long.
  I have a few shorter comments below:
  L9: What quantities are ⁴He and “radionuclide” here. Are they amounts, concentrations….? Please briefly define F_T in the abstract for a non-specialist, particularly because it’s referred to as particularly important later in the abstract.
  
  For the abstract, this is more of a general statement since a date can be calculated using either concentrations or amounts.
  
  We agree that we should define Ft in the abstract, and will make this change during revisions.
  
  L11: Is this relative or absolute uncertainty?
  These are relative input uncertainties; we will clarify this in revision.
  L15: Again, is this concentration?
  
  These calculations were done with amount instead of concentration, but this isn’t a critical distinction given that the conversion to concentration has a common denominator for He and U-Th-Sm, and is therefore factored out in date calculation.
  L15: What is the confidence level for these estimates? 95%? 2sigma?
  The uncertainties in the abstract are quoted at 1-sigma; we will clarify this.
  L34: I assume that by “kinetic” the authors mean diffusion kinetics.
  Yes, we are referring to diffusion kinetics, and will clarify.
  L76: “ppm” is, in general, ambiguous and best practice is that it should be avoided. I appreciate that there is an implicit convention in some geochemical subfields that it refers only to µg/g, but it is not always the case and there is no disadvantage to being explicit and using the SI-consistent µg/g.
  This is a good point. We are referring to ppm by mass, so can make this change throughout the manuscript.
  L80: “sector” should say “magnetic sector”
  Yes, this is a good clarification.
  L81: The technique is called “isotope dilution” not “isotope spike”.
  We will correct this.
  L82: I’m not sure what “ratioed mass spectrometric measurements” are?
  
  We are referring to ratio measurements (as opposed to absolute counts or voltages), and will clarify this.
  
  L103-104: It’s not clear to me that this is true. For example, if a mixed U-Th-Sm tracer is used and the mass of spike solution is relatively small (which is usually the case), the uncertainty in the amount of spike added (which propagates directly onto the amount of U, Th, and Sm calculated) will be relatively large and since the tracer is added as a mixture, that component will be highly correlated. It takes special care to get weighing errors to less than 1%, and with small amounts of spike they can easily be in the 5-10% range. When pipetting without weighing (which is what is implied here) the problem can be much worse. Pipetting consistency can vary, for sure, but for 25 µl the relative standard deviation on masses dispensed can be 3-5%, which translates directly to a 3-5% uncertainty on the measured quantities. This seems like it should be large enough to matter?
  
  We think this is likely true for CU TRaIL based on our procedures and observed reproducibility, but correlated uncertainty associated with spiking procedures could be significant depending on the exact procedure used. We will add language to this effect and note that this may need to be assessed on a per lab basis.
  
  L117: 2-9% in what? The Ft correction or the final date? And is this a bias (e.g., the technical definition of “error”) or additional uncertainty on the date?
  We will clarify this point—we are referring to uncertainty in the Ft correction, which will result in increased date uncertainty.
  L131: Here and elsewhere the word “variance” is used. It’s unclear as to whether this is referring to a moment of the gaussian distribution or as a casual synonym for uncertainty or data scatter.
  Variance is explicitly not a parameter related to the gaussian distribution, but rather the formal term for “data scatter” (specifically, the expected squared deviation of a random variable from the mean) within any distribution. We used it here and elsewhere with the intention of being agnostic about the probability distribution reflected by an observed dataset while also using the formal language for scatter in a dataset. We will clarify this in the manuscript.
  L366: A Th/U (by mass) ratio of 1.25 is *not* typical of zircon, it is unusually high. Looking at the georoc database of zircon compositions, after doing some data culling, the median value is about 0.6 (mean = 0.8). That sounds much more realistic to me than 1.25.
  Thanks for this. This is a typo we discovered after submission; 1.25 is meant to be in reference to a typical apatite and 0.6 is our FCT number, which we applied to the zircon in this work. We will correct this during revisions.
  L420: This statement should be justified or referenced, or else removed. It sounds a little bit like it is underestimating the math and stats skills of typical geochronologists.
  Thanks for this reaction. We will modify this statement to “Figure 4 illustrates examples of combining uncertainties with differing magnitudes in quadrature.” Hopefully this clarifies that, rather than an implicit low appraisal of most geochronologists’ math/stats skills, this paragraph is meant as a simple exercise to build understanding for less advanced readers (e.g. undergraduates or early graduate students), and to prompt the memories for more advanced readers.
  
  Citation: https://doi.org/10.5194/egusphere-2022-299-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2022-299', Pieter Vermeesch, 16 May 2022
This paper discusses the error propagation of (U-Th)/He data. Its introduction claims that “the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed”. I have three comments about this statement:

I am sure that the error propagation of the (U-Th)/He method has been worked out before, and probably several times. In fact, I have done so myself, and even implemented it in a publicly accessible computer program: https://ucl.ac.uk/~ucfbpve/heliocalc/.

One probable reason why nobody published the error propagation formulas for the (U-Th)/He method is the overdispersion that characterises most (U-Th)/He datasets: the scatter of several aliquots from the same samples usually exceeds the precision of the data, by a lot. So, in a sense, the analytical uncertainties are irrelevant. The interplay between analytical uncertainty and overdispersion is discussed by Vermeesch (2010, doi:10.1016/j.chemgeo.2010.01.002), who also covers some aspects of the error propagation problem.

A second reason why error propagation hasn’t been discussed much is that it is next to impossible to quantify the analytical uncertainty of the alpha ejection correction, which is one of the main sources of uncertainty in (U-Th)/He dating. Geochronologists slap a nominal uncertainty on this correction, which largely defeats the purpose of rigorous error propagation for the other variables. Unfortunately, the paper under consideration does not address this issue.

Despite these three caveats, I do not object to publishing the error propagation formulas in GChron. However, before this can happen the manuscript needs serious revision. The paper is far too long and can be shortened by at least 50%. I will make some specific suggestions for this later in this review.

The paper uses both standard error propagation and Monte Carlo (MC) simulation. I have two comments about this:

According to the authors, the main advantage of the MC method is its ability to handle skewed error distributions. However, it would be easy to adjust the conventional error propagation to handle the observed skewness. This can be achieved by reformulating the error propagation formula in terms of the log of the variables (e.g., Section 5 of https://doi.org/10.5194/gchron-2-119-2020). Thus, the log of the age could be calculated as a function of the U, Th and He concentrations. An even better solution would be to use log-ratios. See Vermeesch (2010) for details. I am not sure how easy it would be to reformulate the paper and HeCalc code in terms of log(ratio) variables. If the authors find it too difficult, then I guess that the MC approach would be fine as an alternative.

The actual main advantage of MC error propagation is not mentioned, namely its ability to handle non-Gaussian error distributions. This is particularly pertinent with regards to the alpha-ejection correction (i.e. the uncertainty of the alpha-retention factor Ft). Meesters and Dunai (2002, https://doi.org/10.1016/S0009-2541(01)00423-5) and Hourigan (2005, https://doi.org/10.1016/j.gca.2005.01.024) have shown that compositional zoning can strongly affect the fraction of ejected alpha particles. Matters are further complicated in the presence of broken grains, when the alpha ejection correction may result in overcorrection (Brown et al., 2013, https://doi.org/10.1016/j.gca.2013.05.041). Things are even more difficult for slowly cooled samples, in which alpha-ejection occurs synchronous with diffusive loss of helium. The dispersion caused by all these complexities is difficult to ascertain, but is likely non-Gaussian. The MC approach could be used to explore these effects. I’m not saying that the authors should do this in their paper (because I don’t want to make it even longer), but they should at least mention the possibility. Perhaps HeCalc could offer an interface to explore these effects?

Detailed comments:

Equations 4-10 are unnecessary. They are simply repeating Meesters and Dunai (2005, https://doi.org/10.1029/2004GC000834), and Raphson and Newton (~1711). Incidentally, I do not really see the point of using the Meesters and Dunai (2005) solution as a starting point for a Raphson-Newton algorithm anyway. Their direct solution is accurate to better than 0.1% for ages up to 500Ma, which covers all terrestrial applications of the (U-Th)/He method.

Equations 11 and 12 could be written more succinctly in matrix form.

Equations 14-18 all share the same denominator (which equals df/dt), which could be stored in a variable. All these equations could be put together into a single Jacobian matrix, or moved into the appendix.

Section 3.3.2 describes a method to choose the optimal number of MC iterations to derive a desired level of precision on the mean value. It just presents the well known “the square root of n” phenomenon, which I think is too trivial a result to occupy so much space (Figure 2 is certainly not necessary). It is also important to note that the square root of n rule only applies to the standard error of the mean. The standard error of the standard deviation (s) is given by s/sqrt(2n-2). I am mentioning this here because the uncertainty of the standard deviation is more relevant than that of the mean, which is never used in the remainder of the paper.

I installed HeCalc on my computer and am happy to confirm that it works. I have not extensively tested it though. I think that the presentation of HeCalc should take greater prominence in the paper. Of course, this will automatically happen if some of the remaining bulk is removed.

HeCalc requires that the user provide the uncertainties of the alpha-retention factors 238Ft, 235Ft, 232Ft and 147Ft. However, the paper does not explain how these uncertainties should be obtained. A nominal 5% uncertainty is used in later examples, without proper justification.

HeCalc also requires that the user specify the error correlations between the different parameters. However, it does not discuss how to estimate those correlations. Does the CU TRaiL database specify them?

Minor comment: the paper (and HeCalc) use the awkward convention to report MC uncertainties as “68% confidence intervals”. I understand where this comes from: a 1-sigma interval around the mean of a normal distribution covers 68% of that distribution. However, uncertainties are usually reported either as standard errors or as 95% confidence intervals. If the authors want to compare their analytical results with the MC simulations, then a 95% confidence interval would be more elegant.

Section 5 can be nearly completely removed. The most interesting part of this section is the finding that parent concentrations are a greater contributor to the uncertainty budget than the helium concentrations. This finding could be reported much more succinctly.

According to lines 421-423: “when combining uncertainties with equal magnitude, the resulting uncertainty will be only ~1.4 times larger than the input, rather than twice as large as might be expected.” Here the authors underestimate the reader. I am certain that the vast majority of geochronologists are familiar with the quadratic addition of uncertainties. Consequently this sentence, as well as the preceding paragraph and Figure 4, can be safely removed.

The paper attributes the reduction of analytical uncertainty with increasing date to the “roll over” of the exponential decay function. This may be correct but is largely irrelevant to real world applications. The observed reduction only expresses itself at >1 Ga, while the vast majority of published (U-Th)/He dates are <200 Ma. At young ages, the helium age equation is linear to a good approximation (https://doi.org/10.1016/j.chemgeo.2008.01.027). Note, however, that the fixed uncertainty of the helium measurements shown in Figure 3 is not realistic: older samples will tend to contain more helium, which can be measured more precisely. This will also cause a reduction of analytical uncertainty, even for Cenozoic samples.

In section 5.2, the authors introduce a new definition for skewness. This is a very bad idea. There already exists enough confusion in the geological community about basic statistical concepts. It would be unwise to add to the confusion by redefining widely accepted terms such as skewness. At this point I would like to reiterate the fact that the approximately lognormal uncertainty distribution of the dates could easily be captured analytically by recasting the equations as a function of the log of the age. Simply referring to the percent uncertainty of the age would capture the uncertainty and the skewness with a single number.

Section 5.4 applies the algorithms to a database of ~3,600 (U-Th)/He dates. It is a shame that this database is not released along with the paper. It must be a treasure trove of useful information! Unfortunately, I don’t think that Section 5.4 is particularly interesting. It definitely doesn’t deserve seven manuscript pages, four pages and three figures (not counting sub-panels). However, Figure 11 does illustrate my comment at the start of this review effectively: the nominal uncertainty of the alpha-ejection correction dwarfs the other uncertainties, thereby defeating the purpose of the careful error propagation.

Lines 615-616: “a challenge to interpreting data with asymmetrical uncertainties is that no widely used inverse thermal history modeling software for (U-Th)/He data permits the input of asymmetrical uncertainty” I’m not sure how HeFTy handles the analytical uncertainty of (U-Th)/He data, but if I seem to recall that QTQt essentially inflates the uncertainties until they account for the overdispersion of the data. This means that the uncertainties are, effectively, ignored. HeFTy probably does something similar, because otherwise its formalised hypothesis tests would fail. Ideally, thermal history inversions should aim to predict the uncorrected (U-Th)/He dates, ignoring the alpha ejection correction. As mentioned before, this is because alpha ejection occurs concurrently with thermal diffusion. So it is not a constant but a variable that depends on the thermal history (Meesters and Dunai, 2002).

Equations a1-a10 all have the same denominator. Storing this denominator in a variable would avoid a lot of duplicate text. You could then even put all these equations into a single concise Jacobian.

I apologise if this review comes across as overly critical. I think that this paper (and the HeCalc program) could serve a useful purpose. My opinions is that it would be greatly improved by trimming it down to the important parts. Perhaps the paper could be recast as one of GChron’s popular “Technical notes”? This would provide a nice way to present HeCalc to the world, whilst reviewing the error propagation problem.
Citation: https://doi.org/10.5194/egusphere-2022-299-RC1
- AC1:
  'Reply on RC1', Peter Martin, 27 May 2022
              We thank Prof. Vermeesch for taking the time to write a prompt, thoughtful, and helpful review. While it will become clear that we largely disagree with the conclusions he draws about the content of the paper, he has helped us identify several areas where the manuscript can be improved and clarified. Below, we provide a high-level response that addresses many of the reviewer’s critiques. We then respond to the individual points raised, copying the text of the original review in bold.
              Much of this review centers on shortening the text and streamlining the equations provided to yield the most compact possible description of HeCalc and uncertainty in the (U-Th)/He system. In short, we wish to write this paper for the benefit of all geoscientists, not just those with a strong grounding in math and statistics. While the reviewer likely believes that our approach to this goal largely underestimates readers, we disagree. Our experience in the CU TRaIL (Thermochronology Research and Instrumentation Lab) with hundreds of researchers, students, and other clients has demonstrated that the levels of comfort with statistics and math within the (U-Th)/He user community are highly variable. Our objective is to make the manuscript as comprehensible to as wide a readership as possible in an attempt to subvert the well-known “Curse of Expertise” wherein knowledge experts underestimate the difficulty of a subject for novice and intermediate practitioners (e.g., Hinds, 1999; https://doi.org/10.1037/1076-898X.5.2.205).
              The fundamental purpose of this paper is not to simply publish a method for (U-Th)/He uncertainty propagation, but rather to fully explicate this process for the full range of practitioners of this method, from lab managers and PIs to external users of (U-Th)/He data and early graduate students. Shortening the manuscript in the manner suggested by the reviewer would remove a significant amount of text dedicated to not just writing out the formulae for (U-Th)/He uncertainty propagation, but meant also to assist readers in developing an intuition for this system. We now recognize that this primary goal of accessibility is not well communicated in the text. We will correct this and thank Prof. Vermeesch for highlighting this shortcoming.
  This paper discusses the error propagation of (U-Th)/He data. Its introduction claims that “the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed”. I have three comments about this statement:
  The full statement in the paper is “methods for propagating uncertainty components into single-grain (U-Th)/He dates have never been described in the literature, and the formal analytical uncertainty in (U-Th)/He dates has never been thoroughly assessed.” We largely stand by this statement, emphasizing that we are not claiming that it has never been done, but rather that it has not been fully explained in the literature, as we intend to do in this manuscript. Again, we will revise the manuscript to more clearly communicate the latter goal.
  I am sure that the error propagation of the (U-Th)/He method has been worked out before, and probably several times. In fact, I have done so myself, and even implemented it in a publicly accessible computer program: https://ucl.ac.uk/~ucfbpve/heliocalc/.
  
  It is true that we failed to mention that a tool for performing this uncertainty propagation is presently available. However, the methods by which this program functions are not (to our knowledge) available outside of inspection of the source code. As this source code is largely uncommented and is written in javascript (in our experience, an uncommon language for geoscience computing compared with Fortran, Matlab, Python, R, or Julia), we believe it would be difficult for a non-expert to understand how or why the program produces the uncertainties that it does. Again, it is the inaccessibility of the knowledge behind uncertainty propagation that we also wish to address, so that it is no longer a “black box” to users.
  One probable reason why nobody published the error propagation formulas for the (U-Th)/He method is the overdispersion that characterises most (U-Th)/He datasets: the scatter of several aliquots from the same samples usually exceeds the precision of the data, by a lot. So, in a sense, the analytical uncertainties are irrelevant. The interplay between analytical uncertainty and overdispersion is discussed by Vermeesch (2010, doi:10.1016/j.chemgeo.2010.01.002), who also covers some aspects of the error propagation problem.
  
  We strongly disagree with this line of reasoning. This argument may have been appropriate to make 15 years ago, but the field of (U-Th)/He dating has advanced radically since then. Overdispersion is a central motivation for this contribution. How can we understand overdispersion if we don’t first carefully propagate the uncertainty in (U-Th)/He dates that can be well-quantified? By first constraining and then subtracting the scatter in data attributable purely to analytical error, one may begin to approach the problem of the underlying physical causes of overdispersion with greater confidence. This phenomenon is one of the core motivations for the work, as stated in the introduction, background, and conclusion sections. We will edit to make this point even more clearly in the text.
  A second reason why error propagation hasn’t been discussed much is that it is next to impossible to quantify the analytical uncertainty of the alpha ejection correction, which is one of the main sources of uncertainty in (U-Th)/He dating. Geochronologists slap a nominal uncertainty on this correction, which largely defeats the purpose of rigorous error propagation for the other variables. Unfortunately, the paper under consideration does not address this issue.
  
  This claim that it is “next to impossible to quantify the analytical uncertainty of the alpha ejection correction” is simply untrue. Recent (Cooperdock et al., 2019; https://doi.org/10.5194/gchron-1-17-2019) and ongoing (Zeigler et al., 2021; DOI: 10.1002/essoar.10507962.1) work is quantifying the geometric uncertainties associated with alpha-ejection corrections, as is summarized briefly in our paper. It also is possible to quantify the uncertainty introduced by zonation (e.g., Hourigan et al., 2005 DOI: 10.1016/j.gca.2005.01.024; Johnstone et al., 2013 DOI: 10.1016/j.gca.2013.01.004). These considerations are fully discussed in the recent review paper by Flowers et al. (2022; https://doi.org/10.1130/B36266.1), in which numerous labs agree for the need to more rigorously quantify uncertainties in alpha-ejection corrections and propagate them into the reported uncertainty in (U-Th)/He dates, rather than “slapping a nominal uncertainty on the correction”. In this manuscript we both provide a method by which to do Ft uncertainty propagation and also fully explain it.
  While it is true that in this paper we apply some nominal uncertainties to the F_T values in Section 5, these uncertainties are meant to illustrate exactly the point the reviewer is making here: that Ft uncertainty is likely a major contributor to intra-sample dispersion, and that quantification of these uncertainties should be a high priority for the (U-Th)/He community.
  Despite these three caveats, I do not object to publishing the error propagation formulas in GChron. However, before this can happen the manuscript needs serious revision. The paper is far too long and can be shortened by at least 50%. I will make some specific suggestions for this later in this review.
  The paper uses both standard error propagation and Monte Carlo (MC) simulation. I have two comments about this:
  According to the authors, the main advantage of the MC method is its ability to handle skewed error distributions. However, it would be easy to adjust the conventional error propagation to handle the observed skewness. This can be achieved by reformulating the error propagation formula in terms of the log of the variables (e.g., Section 5 of https://doi.org/10.5194/gchron-2-119-2020). Thus, the log of the age could be calculated as a function of the U, Th and He concentrations. An even better solution would be to use log-ratios. See Vermeesch (2010) for details. I am not sure how easy it would be to reformulate the paper and HeCalc code in terms of log(ratio) variables. If the authors find it too difficult, then I guess that the MC approach would be fine as an alternative.
  
  We will look into this possibility; an explicit calculation of skewed uncertainties would certainly be a useful addition. On the other hand, we do not believe that the MC approach has any major downsides given HeCalc’s computational efficiency. An added benefit of MC is in its mathematical simplicity, furthering the overall purpose of this paper, which is to provide an easily comprehensible description of the means of deriving uncertainty in (U-Th)/He dates.
  The actual main advantage of MC error propagation is not mentioned, namely its ability to handle non-Gaussian error distributions. This is particularly pertinent with regards to the alpha-ejection correction (i.e. the uncertainty of the alpha-retention factor Ft). Meesters and Dunai (2002, https://doi.org/10.1016/S0009-2541(01)00423-5) and Hourigan (2005, https://doi.org/10.1016/j.gca.2005.01.024) have shown that compositional zoning can strongly affect the fraction of ejected alpha particles. Matters are further complicated in the presence of broken grains, when the alpha ejection correction may result in overcorrection (Brown et al., 2013, https://doi.org/10.1016/j.gca.2013.05.041). Things are even more difficult for slowly cooled samples, in which alpha-ejection occurs synchronous with diffusive loss of helium. The dispersion caused by all these complexities is difficult to ascertain, but is likely non-Gaussian. The MC approach could be used to explore these effects. I’m not saying that the authors should do this in their paper (because I don’t want to make it even longer), but they should at least mention the possibility. Perhaps HeCalc could offer an interface to explore these effects?
  
  This is a good point, and one we had not fully considered. It would certainly be possible to offer a means of inputting additional non-gaussian uncertainty components of arbitrary shape into HeCalc. We will evaluate the feasibility of including it in HeCalc for the final version of this paper, and, if it is not feasible, try to include it in future HeCalc versions.
  Detailed comments:
  Equations 4-10 are unnecessary. They are simply repeating Meesters and Dunai (2005, https://doi.org/10.1029/2004GC000834), and Raphson and Newton (~1711). Incidentally, I do not really see the point of using the Meesters and Dunai (2005) solution as a starting point for a Raphson-Newton algorithm anyway. Their direct solution is accurate to better than 0.1% for ages up to 500Ma, which covers all terrestrial applications of the (U-Th)/He method.
  Equation 4 is an alteration to the Meeters and Dunai (2005) method that is not included in the original paper, so we think it should probably be kept. The other equations are indeed a reproduction of other works. We included these for complete traceability of the math in the paper, but will defer to the editor on decisions about whether to replicate equations that are available elsewhere.
  Many (U-Th)/He dates exceed 500 Ma, including dates produced regularly by CU TRaIL. "Deep time thermochronology” is becoming increasingly popular. Moreover, is there any reason to knowingly accept errors in date computation? The Newton-Raphson method is efficient enough that for ages <500 Ma virtually no loss of computational efficiency is observed. Computational efficiency only suffers for older dates where the Meesters and Dunai method provides poorer estimates, but in these cases the Meesters and Dunai-generated age has greater error anyhow. We see no reason why labs would not report dates using iteration; this approach is favored by labs in the recent GSAB paper on reporting (U-Th)/He data (Flowers et al., 2022; https://doi.org/10.1130/B36266.1).
  Equations 11 and 12 could be written more succinctly in matrix form.
  Equations 14-18 all share the same denominator (which equals df/dt), which could be stored in a variable. All these equations could be put together into a single Jacobian matrix, or moved into the appendix.
  Here again, we favor clarity over brevity and feel that the way these equations are written are more widely accessible than their admittedly more compact matrix counterparts. The main reason for including this series of equations is to provide the means by which a lab that does not wish to incorporate HeCalc into their workflow may perform rigorous uncertainty propagation in a spreadsheet program. While it is true that Excel can handle matrix math, this process requires multiple rows which will generally not be compatible with most labs’ workflows.
  Section 3.3.2 describes a method to choose the optimal number of MC iterations to derive a desired level of precision on the mean value. It just presents the well known “the square root of n” phenomenon, which I think is too trivial a result to occupy so much space (Figure 2 is certainly not necessary). It is also important to note that the square root of n rule only applies to the standard error of the mean. The standard error of the standard deviation (s) is given by s/sqrt(2n-2). I am mentioning this here because the uncertainty of the standard deviation is more relevant than that of the mean, which is never used in the remainder of the paper.
  This is an entirely valid criticism. We will plan to convert the algorithm to select the number of Monte Carlo iterations to a calculation of standard error of the standard deviation and trim this section significantly.
  I installed HeCalc on my computer and am happy to confirm that it works. I have not extensively tested it though. I think that the presentation of HeCalc should take greater prominence in the paper. Of course, this will automatically happen if some of the remaining bulk is removed.
  It is good to hear that an outside user has successfully downloaded and run HeCalc. The usability of new software is always challenging to assess as we are sure the reviewer is aware. As far as “taking further prominence”—is there a component of the description of HeCalc that is missing?
  HeCalc requires that the user provide the uncertainties of the alpha-retention factors 238Ft, 235Ft, 232Ft and 147Ft. However, the paper does not explain how these uncertainties should be obtained. A nominal 5% uncertainty is used in later examples, without proper justification.
  As mentioned above, the means to derive Ft uncertainties is an active area of research and would be beyond the scope of this paper; the 5% Ft uncertainty (and 2%, which was also included) was meant to be illustrative. These values are, however, not arbitrary but are based on recent work (cited in the manuscript) and are a likely reasonable magnitude for the purposes of this study. On the other hand, it is entirely possible to input zero uncertainty for each and obtain the same result as if the uncertainties were ignored by the program.
  HeCalc also requires that the user specify the error correlations between the different parameters. However, it does not discuss how to estimate those correlations. Does the CU TRaiL database specify them?
  Unlike Ft uncertainties, error correlations are not required inputs; the program defaults to assuming uncorrelated uncertainty if these columns are not present. CU TRaIL does not currently include correlations in uncertainty for any regularly calculated uncertainty (i.e., He, U, Th, or Sm)—our methods should in theory result in fully independent error. As discussed in the background section, uncertainties in Ft are likely highly correlated, though we will leave the quantification of this to future researchers. We apply both fully correlated and uncorrelated uncertainties in Section 5 to circumvent this currently underconstrained problem.
  Minor comment: the paper (and HeCalc) use the awkward convention to report MC uncertainties as “68% confidence intervals”. I understand where this comes from: a 1-sigma interval around the mean of a normal distribution covers 68% of that distribution. However, uncertainties are usually reported either as standard errors or as 95% confidence intervals. If the authors want to compare their analytical results with the MC simulations, then a 95% confidence interval would be more elegant.
  This is another good point. Ideally, the program would provide both 1- and 2-sigma uncertainties and their equivalents for non-gaussian distributions calculated by MC to allow apples-to-apples comparison of whatever uncertainty statistic an end user prefers. We will implement both a 68% and 95% confidence interval output for the code.
  Section 5 can be nearly completely removed. The most interesting part of this section is the finding that parent concentrations are a greater contributor to the uncertainty budget than the helium concentrations. This finding could be reported much more succinctly.
  This is where our most important disagreement with this review becomes apparent. Section 5 is almost entirely devoted to a thorough exploration of the mathematical systems described in earlier sections in largely non-mathematical terms, with the intent of elucidating the system for all users. Although this section likely feels boring and repetitive to those with significant experience with statistics, for many, this portion of the paper may be crucial to developing a real and deep understanding of the uncertainty components in a (U-Th)/He date and how they combine to provide a date uncertainty. We feel that the subsection 5.4 in particular is essential. For users with more hands-on experience with real data than statistical/mathematical theory, seeing how the concepts described early in the paper apply to real data may be critical.
  According to lines 421-423: “when combining uncertainties with equal magnitude, the resulting uncertainty will be only ~1.4 times larger than the input, rather than twice as large as might be expected.” Here the authors underestimate the reader. I am certain that the vast majority of geochronologists are familiar with the quadratic addition of uncertainties. Consequently this sentence, as well as the preceding paragraph and Figure 4, can be safely removed.
  Though we agree all researchers would be capable of calculating uncertainty by adding components in quadrature and observing the non-linear effects of this approach, in several conversations we have found that users of (U-Th)/He data at a variety of levels did not intuitively and instinctively expect this relationship. Again, our primary goal is increased accessibility to a broad group of (U-Th)/He data producers, users, and students, not the handful of statistical experts who can easily skip over a couple dozen words.
  The paper attributes the reduction of analytical uncertainty with increasing date to the “roll over” of the exponential decay function. This may be correct but is largely irrelevant to real world applications. The observed reduction only expresses itself at >1 Ga, while the vast majority of published (U-Th)/He dates are <200 Ma. At young ages, the helium age equation is linear to a good approximation (https://doi.org/10.1016/j.chemgeo.2008.01.027). Note, however, that the fixed uncertainty of the helium measurements shown in Figure 3 is not realistic: older samples will tend to contain more helium, which can be measured more precisely. This will also cause a reduction of analytical uncertainty, even for Cenozoic samples.
  As discussed earlier, ancient (U-Th)/He dates are becoming common as deep time thermochronology gains prominence; dates where this uncertainty reduction becomes important do exist, and are increasingly routine.
  Figure 3 demonstrates the relationship between input and date uncertainty, but is not meant to reflect a series of real data (i.e., we are not suggesting that uncertainty in helium measurement will be the same across all dates). Because the shape of this reduction depends only on Th/U ratio, the main use for this figure would be to read off a % reduction in uncertainty at a given date for a given Th/U ratio.
  Additionally, the decrease in He uncertainties with sample age is neither continuous nor universal. Many analysts will specifically target grains with a range of U and Th contents to explore the effects of radiation damage, meaning grains with nominally the same date can have very different He contents. Even with increasing He contents, uncertainties do not continuously decrease, but instead are a product of uncertainty resulting from blank and standard measurements, creating a minimum uncertainty bound. It is not the case that a 1000 Ma grain, for example, necessarily has a lower He uncertainty than a 100 Ma grain.
  In section 5.2, the authors introduce a new definition for skewness. This is a very bad idea. There already exists enough confusion in the geological community about basic statistical concepts. It would be unwise to add to the confusion by redefining widely accepted terms such as skewness. At this point I would like to reiterate the fact that the approximately lognormal uncertainty distribution of the dates could easily be captured analytically by recasting the equations as a function of the log of the age. Simply referring to the percent uncertainty of the age would capture the uncertainty and the skewness with a single number.
  This seems like another instance where style preference comes to the fore. Skewness is a widely accepted term within the statistics community, and it is likely that many geochemists are obliquely familiar. We believe it unlikely, however, that most geochemists intuitively grasp the inherent meaning behind the numerical calculation of skew (e.g., what does a skew of +0.8 mean for a given distribution?). The redefinition we include is meant only to assist in reader and user comprehension of the skew of the distributions. If the reviewer thinks it would be appropriate to provide the formal skewness in addition, this would be straightforward to implement, but may unnecessarily clutter the output, especially with the inclusion of 95% confidence intervals.
  Section 5.4 applies the algorithms to a database of ~3,600 (U-Th)/He dates. It is a shame that this database is not released along with the paper. It must be a treasure trove of useful information! Unfortunately, I don’t think that Section 5.4 is particularly interesting. It definitely doesn’t deserve seven manuscript pages, four pages and three figures (not counting sub-panels). However, Figure 11 does illustrate my comment at the start of this review effectively: the nominal uncertainty of the alpha-ejection correction dwarfs the other uncertainties, thereby defeating the purpose of the careful error propagation.
  Much of the data in this compilation comes from samples CU TRaIL was contracted to run. We do not “own” it and it therefore is not ours to release outside of these anonymized and derived figures. Much of the data produced by CU TRaIL for internal research projects is published or is in the process of being published, and is easily discoverable in the literature. Our lab has played a lead role in developing community agreement on (U-Th)/He data reporting protocols and in making the information needed to understand and interpret (U-Th)/He data accessible to the community (Flowers et al., 2022a,b). However, we cannot release unpublished data without their scientific context and with no means for proper attribution, especially when those data are not ours to release in the first place.
  As stated earlier, we feel that this section is in fact highly important as a means of making the results of this study more broadly comprehensible. This final section provides examples and (hopefully) builds intuition for non-experts as to how this system will actually function in practice. Presented with the bare facts in Sections 2-4, we believe many (U-Th)/He practitioners would have the immediate thought “Ok, so how does this apply to my data?” Our goal with Section 5 is to provide an answer to that question. We will, however, revisit with an eye to tightening.
  Figure 11 is important precisely to make the point that the current efforts to more rigorously quantify Ft uncertainties are well-worth the time. And at the point that such uncertainties are well-quantified, then they should be propagated into the uncertainties reported on (U-Th)/He dates.
  Lines 615-616: “a challenge to interpreting data with asymmetrical uncertainties is that no widely used inverse thermal history modeling software for (U-Th)/He data permits the input of asymmetrical uncertainty” I’m not sure how HeFTy handles the analytical uncertainty of (U-Th)/He data, but if I seem to recall that QTQt essentially inflates the uncertainties until they account for the overdispersion of the data. This means that the uncertainties are, effectively, ignored. HeFTy probably does something similar, because otherwise its formalised hypothesis tests would fail. Ideally, thermal history inversions should aim to predict the uncorrected (U-Th)/He dates, ignoring the alpha ejection correction. As mentioned before, this is because alpha ejection occurs concurrently with thermal diffusion. So it is not a constant but a variable that depends on the thermal history (Meesters and Dunai, 2002).
  For HeFTy, we are certain that the program takes only the uncertainty provided by the user, which is assumed to be gaussian in the formulation of HeFTy’s goodness-of-fit parameter. This issue is explored in Vermeesch and Tian (2014; 10.1016/j.earscirev.2014.09.010) and the subsequent comments and replies, where the authors note that improved uncertainty results in fewer “good” and “acceptable” paths, which is really just an artifact of the random Monte Carlo path generation method combined with tighter goodness-of-fit requirements of improved uncertainty. We are less familiar with QTQt, but it must apply some version of uncertainty to the data. Unless this uncertainty is permitted to be asymmetrical, the overall point stands: date probability distributions have the potential to be non-gaussian, and that potential should be included in thermal modeling.
  Equations a1-a10 all have the same denominator. Storing this denominator in a variable would avoid a lot of duplicate text. You could then even put all these equations into a single concise Jacobian.
  We have the same response to this as with earlier suggestions that would streamline the equations in this paper. Adding variables would just increase the complexity of interpreting the equations. In contrast, the repeated text does not harm the paper in any way (it doesn’t even add lines to the manuscript). It is true that these could be recast in matrix math, but for input to an excel workbook, readers would need to revert that change. On the other hand, any practitioner who wishes to convert these equations to matrix form before using them is likely to be more than capable of doing so.
  I apologise if this review comes across as overly critical. I think that this paper (and the HeCalc program) could serve a useful purpose. My opinions is that it would be greatly improved by trimming it down to the important parts. Perhaps the paper could be recast as one of GChron’s popular “Technical notes”? This would provide a nice way to present HeCalc to the world, whilst reviewing the error propagation problem.
  
  Citation: https://doi.org/10.5194/egusphere-2022-299-AC1
RC2:
'Comment on egusphere-2022-299', Ryan Ickert, 11 Aug 2022

Review of “Calculation of Uncertainty in the (U-Th)/He System” for Geochronology, by Martin et al.

This review is by Ryan Ickert (Purdue University)

This manuscript presents an algorithm for determining U-Th-Sm/He dates in accessory minerals and then derives uncertainty propagation equations for them, combining uncertainties in measurements of nuclide amounts and geometric correction factors. They also introduce a Monte Carlo method and compare the two. These algorithms are implemented in a python package hosted on zenodo.

A consistent set of uncertainty propagation equations for the community is a welcome addition to the literature, although it is unlikely that it will strongly affect science derived from the measurements due to pervasive overdispersion.

I’ve had the opportunity to read the review by Pieter Vermeesch prior to writing this, and that saves me a lot of time writing my review because I came to basically the same conclusions as him. I don’t think it’s necessary for me to state specifically on which points I agree, but one that affects the manuscript substantially is the suggestion to cut down the length. I agree that section five can be completely removed. I don’t think it’s necessary for the authors to include log functions to the analytical solutions, but I agree that it would be a superior technique.

What I think would be very useful is a comparison of before/after of typical analyses from the authors lab. A few representative datasets would be just fine. I think this would be of great value in showing readers whether this approach changes assigned uncertainties in any substantial way. Maybe it doesn’t, but that’s not necessarily a bad thing.

I think it’s a missed opportunity somewhat to not assess the accuracy of assigned uncertainties in the U-Th-Sm, and the He analyses. I appreciate that the authors have carved the manuscript such that this is “out of scope” (which is their prerogative), but the brief comment that tracers are often added by pipetting (without discussing the implications of adding a spike isotope using a technique with such a high variability) suggests to me that the radionuclide measurements may be underestimated. Not to mention that the treatment of under/over spiking and blank subtraction (to name a couple), in my experience, are dealt with in a highly variable way by different people in the U-Th-Sm/He community. I guess that leaves space for the next paper!

I have a few shorter comments below:

L9: What quantities are ⁴He and “radionuclide” here. Are they amounts, concentrations….? Please briefly define F_T in the abstract for a non-specialist, particularly because it’s referred to as particularly important later in the abstract.

L11: Is this relative or absolute uncertainty?

L15: Again, is this concentration?

L15: What is the confidence level for these estimates? 95%? 2sigma?

L34: I assume that by “kinetic” the authors mean diffusion kinetics.

L76: “ppm” is, in general, ambiguous and best practice is that it should be avoided. I appreciate that there is an implicit convention in some geochemical subfields that it refers only to µg/g, but it is not always the case and there is no disadvantage to being explicit and using the SI-consistent µg/g.

L80: “sector” should say “magnetic sector”

L81: The technique is called “isotope dilution” not “isotope spike”.

L82: I’m not sure what “ratioed mass spectrometric measurements” are?

L103-104: It’s not clear to me that this is true. For example, if a mixed U-Th-Sm tracer is used and the mass of spike solution is relatively small (which is usually the case), the uncertainty in the amount of spike added (which propagates directly onto the amount of U, Th, and Sm calculated) will be relatively large and since the tracer is added as a mixture, that component will be highly correlated. It takes special care to get weighing errors to less than 1%, and with small amounts of spike they can easily be in the 5-10% range. When pipetting without weighing (which is what is implied here) the problem can be much worse. Pipetting consistency can vary, for sure, but for 25 µl the relative standard deviation on masses dispensed can be 3-5%, which translates directly to a 3-5% uncertainty on the measured quantities. This seems like it should be large enough to matter?

L117: 2-9% in what? The Ft correction or the final date? And is this a bias (e.g., the technical definition of “error”) or additional uncertainty on the date?

L131: Here and elsewhere the word “variance” is used. It’s unclear as to whether this is referring to a moment of the gaussian distribution or as a casual synonym for uncertainty or data scatter.

L366: A Th/U (by mass) ratio of 1.25 is *not* typical of zircon, it is unusually high. Looking at the georoc database of zircon compositions, after doing some data culling, the median value is about 0.6 (mean = 0.8). That sounds much more realistic to me than 1.25.

L420: This statement should be justified or referenced, or else removed. It sounds a little bit like it is underestimating the math and stats skills of typical geochronologists.

Citation: https://doi.org/10.5194/egusphere-2022-299-RC2
- AC2: 'Reply on RC2', Peter Martin, 29 Aug 2022
  
  We would like to thank Dr. Ickert for his review; we respond to his feedback in-line below, using bold to easily differentiate our response.
  This manuscript presents an algorithm for determining U-Th-Sm/He dates in accessory minerals and then derives uncertainty propagation equations for them, combining uncertainties in measurements of nuclide amounts and geometric correction factors. They also introduce a Monte Carlo method and compare the two. These algorithms are implemented in a python package hosted on zenodo.
  A consistent set of uncertainty propagation equations for the community is a welcome addition to the literature, although it is unlikely that it will strongly affect science derived from the measurements due to pervasive overdispersion.
  We generally agree with this assessment regarding current and archival data. However, by accurately constraining uncertainty (or, more specifically, the probability distribution) of an analysis, we can also accurately constrain the amount and distribution of overdispersion present in a given dataset to better understand the causes. This work also has the potential to future-proof the field as the technique is applied to increasingly challenging phases and time periods (both deep time and unusually young samples) where analytical uncertainties may become more significant.
  I’ve had the opportunity to read the review by Pieter Vermeesch prior to writing this, and that saves me a lot of time writing my review because I came to basically the same conclusions as him. I don’t think it’s necessary for me to state specifically on which points I agree, but one that affects the manuscript substantially is the suggestion to cut down the length. I agree that section five can be completely removed. I don’t think it’s necessary for the authors to include log functions to the analytical solutions, but I agree that it would be a superior technique.
  Section 5 – the entire discussion of the manuscript – is a fundamental part of this paper, just as the discussion is critical for any manuscript. This section shows the influence of input uncertainties on output uncertainties, is potentially critical for readers who do not have a strong foundation in the mathematical concepts presented earlier in the paper (based on experience with hundreds of (U-Th)/He data users at all levels), and shows the implications for real data. A clear example applying mathematical concepts is often the bridge to understanding for anyone struggling to follow a purely conceptual discussion. It was our understanding that Geochronology is a journal intended for exactly this type of contribution. However, we do appreciate the suggestion that Section 5 can be tightened up, and we will revise the manuscript to keep Section 5 appropriately terse and present some of the less impactful discussion points (e.g. MC vs linear uncertainty propagation) to the appendices.
  What I think would be very useful is a comparison of before/after of typical analyses from the authors lab. A few representative datasets would be just fine. I think this would be of great value in showing readers whether this approach changes assigned uncertainties in any substantial way. Maybe it doesn’t, but that’s not necessarily a bad thing.
  We are not sure exactly what this suggestion entails? Since there is currently no generally accepted method for uncertainty propagation in (U-Th)/He dating, there is not really anything to compare against. While we could provide a comparison between the uncertainties assigned to the data in CU TRaIL prior to the development of this method, this comparison would be highly specific to our lab and would likely not be predictive of what other labs would observe if they made such a comparison.
  I think it’s a missed opportunity somewhat to not assess the accuracy of assigned uncertainties in the U-Th-Sm, and the He analyses. I appreciate that the authors have carved the manuscript such that this is “out of scope” (which is their prerogative), but the brief comment that tracers are often added by pipetting (without discussing the implications of adding a spike isotope using a technique with such a high variability) suggests to me that the radionuclide measurements may be underestimated. Not to mention that the treatment of under/over spiking and blank subtraction (to name a couple), in my experience, are dealt with in a highly variable way by different people in the U-Th-Sm/He community. I guess that leaves space for the next paper!
  We generally agree that including uncertainty analyses for the component data (He, U-Th-Sm, and Ft) would be nice to have in the literature; the challenge, as mentioned in this comment, is that the techniques for these measurements are not consistent between labs. Such a discussion would therefore necessarily not be globally applicable and we do consider it beyond the scope of this paper. And including this discussion here would likely double or triple the length of the paper, which, as acknowledged by both reviewers, is already long.
  I have a few shorter comments below:
  L9: What quantities are ⁴He and “radionuclide” here. Are they amounts, concentrations….? Please briefly define F_T in the abstract for a non-specialist, particularly because it’s referred to as particularly important later in the abstract.
  
  For the abstract, this is more of a general statement since a date can be calculated using either concentrations or amounts.
  
  We agree that we should define Ft in the abstract, and will make this change during revisions.
  
  L11: Is this relative or absolute uncertainty?
  These are relative input uncertainties; we will clarify this in revision.
  L15: Again, is this concentration?
  
  These calculations were done with amount instead of concentration, but this isn’t a critical distinction given that the conversion to concentration has a common denominator for He and U-Th-Sm, and is therefore factored out in date calculation.
  L15: What is the confidence level for these estimates? 95%? 2sigma?
  The uncertainties in the abstract are quoted at 1-sigma; we will clarify this.
  L34: I assume that by “kinetic” the authors mean diffusion kinetics.
  Yes, we are referring to diffusion kinetics, and will clarify.
  L76: “ppm” is, in general, ambiguous and best practice is that it should be avoided. I appreciate that there is an implicit convention in some geochemical subfields that it refers only to µg/g, but it is not always the case and there is no disadvantage to being explicit and using the SI-consistent µg/g.
  This is a good point. We are referring to ppm by mass, so can make this change throughout the manuscript.
  L80: “sector” should say “magnetic sector”
  Yes, this is a good clarification.
  L81: The technique is called “isotope dilution” not “isotope spike”.
  We will correct this.
  L82: I’m not sure what “ratioed mass spectrometric measurements” are?
  
  We are referring to ratio measurements (as opposed to absolute counts or voltages), and will clarify this.
  
  L103-104: It’s not clear to me that this is true. For example, if a mixed U-Th-Sm tracer is used and the mass of spike solution is relatively small (which is usually the case), the uncertainty in the amount of spike added (which propagates directly onto the amount of U, Th, and Sm calculated) will be relatively large and since the tracer is added as a mixture, that component will be highly correlated. It takes special care to get weighing errors to less than 1%, and with small amounts of spike they can easily be in the 5-10% range. When pipetting without weighing (which is what is implied here) the problem can be much worse. Pipetting consistency can vary, for sure, but for 25 µl the relative standard deviation on masses dispensed can be 3-5%, which translates directly to a 3-5% uncertainty on the measured quantities. This seems like it should be large enough to matter?
  
  We think this is likely true for CU TRaIL based on our procedures and observed reproducibility, but correlated uncertainty associated with spiking procedures could be significant depending on the exact procedure used. We will add language to this effect and note that this may need to be assessed on a per lab basis.
  
  L117: 2-9% in what? The Ft correction or the final date? And is this a bias (e.g., the technical definition of “error”) or additional uncertainty on the date?
  We will clarify this point—we are referring to uncertainty in the Ft correction, which will result in increased date uncertainty.
  L131: Here and elsewhere the word “variance” is used. It’s unclear as to whether this is referring to a moment of the gaussian distribution or as a casual synonym for uncertainty or data scatter.
  Variance is explicitly not a parameter related to the gaussian distribution, but rather the formal term for “data scatter” (specifically, the expected squared deviation of a random variable from the mean) within any distribution. We used it here and elsewhere with the intention of being agnostic about the probability distribution reflected by an observed dataset while also using the formal language for scatter in a dataset. We will clarify this in the manuscript.
  L366: A Th/U (by mass) ratio of 1.25 is *not* typical of zircon, it is unusually high. Looking at the georoc database of zircon compositions, after doing some data culling, the median value is about 0.6 (mean = 0.8). That sounds much more realistic to me than 1.25.
  Thanks for this. This is a typo we discovered after submission; 1.25 is meant to be in reference to a typical apatite and 0.6 is our FCT number, which we applied to the zircon in this work. We will correct this during revisions.
  L420: This statement should be justified or referenced, or else removed. It sounds a little bit like it is underestimating the math and stats skills of typical geochronologists.
  Thanks for this reaction. We will modify this statement to “Figure 4 illustrates examples of combining uncertainties with differing magnitudes in quadrature.” Hopefully this clarifies that, rather than an implicit low appraisal of most geochronologists’ math/stats skills, this paragraph is meant as a simple exercise to build understanding for less advanced readers (e.g. undergraduates or early graduate students), and to prompt the memories for more advanced readers.
  
  Citation: https://doi.org/10.5194/egusphere-2022-299-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (04 Sep 2022) by Brenhin Keller

AR by Peter Martin on behalf of the Authors (28 Nov 2022) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (28 Nov 2022) by Brenhin Keller

RR by Pieter Vermeesch (09 Dec 2022)

Suggestions for revision or reasons for rejection

In response to my earlier review, the authors of “Calculation of Uncertainty in the (U-Th)/He System” have shortened their manuscript by about 25%, and have moved significant chunks of text into the appendices. These changes definitely constitute an improvement. However, in my opinion further changes are needed, but I am happy to leave the extent of these changes to the discretion of the Associate Editor.

Major comments:

1. I really don’t think it is necessary to explain well-known concepts such as the Newton-Raphson method (Section 3.1) and Monte Carlo error propagation (Section 3.3) in such minute detail. Thus, I think that Equations 4-6 and Figure 1 can be safely removed from the paper.
2. Equations 7, 8, 19 and a11 are wrong. The covariances should not be squared.
3. I was disappointed that the revision retains the ad-hoc definition of skewness (or “skew”). Not only is this definition confusing, but it is also unnecessary. As I mentioned in my first review, the skewed error distributions likely fit a simple lognormal distribution. If this is the case, then the skewness would disappear after taking logs, and the analytical uncertainty could be adequately captured with a single number, namely the standard error of the log. The absolute standard error of a logarithmic quantity corresponds to the relative standard error of its linear equivalent. Had the authors carried out their linear error propagation in log space, then this would have avoided the need for Monte Carlo simulations. Since two thirds of the paper is dedicated to explaining the difference between the linear and Monte Carlo uncertainty estimates, this means that two thirds of the paper is essentially unnecessary. This includes Figure 3, 4, 5, C2 and the entire appendices D and E. Taking logs would also avoid the issue with negative dates that is mentioned in Appendix B.

Other comments:

1. Section 1 claims that the uncertainty of the alpha-retention factor Ft is “increasingly well-constrained”. This is only the case for the measurement of the grain dimensions. Unfortunately, there has been little or no progress in quantifying the larger effect of compositional zoning (Hourigan et al., 2005). This effect is, essentially, unknowable.
2. Section 3.3. typo “6odelling”
3. Section 3 cites Efron and Tibshirani (1986) as a reference for Monte Carlo error propagation. However, this paper discusses bootstrapping, which is a different procedure than the one described in the manuscript.
4. Table 1. Typo: “convariance”
5. Section 4.2 defines a 95% Monte Carlo confidence interval as the difference between the 2.275 and 97.725 percentiles. However, 97.725 - 2.275 = 95.45 ≠ 95!
6. The denominators of Equations a1-a8 are all the same. It would save lots of ink if they were replaced by a separate parameter. I suggest ‘d’ for ‘denominator’.
7. Figure C1: the three columns of this plot are, essentially, the same (with the first and third being exactly the same). This reflects the fact that U, Th, He and Ft are simple multiplicative factors in the age equation. So the sensitivity of the age is, essentially, the same for them. I don’t think that this figure is very informative. I suggest that it be removed.
8. Figure C3 is misleading. As explained in its caption, it exaggerates the non-linearity of the (U-Th)/He age equation by decreasing the uranium decay constant. In other words: the non-linearity is a straw man problem. It only affects >1Ga samples, which are never found on Earth anyway. Consequently, I think that Appendix C can be removed as well.

Hide

ED: Publish subject to minor revisions (further review by editor) (10 Dec 2022) by Brenhin Keller

AR by Peter Martin on behalf of the Authors (19 Dec 2022) Author's response Author's tracked changes Manuscript

ED: Publish as is (22 Dec 2022) by Brenhin Keller

ED: Publish as is (03 Jan 2023) by Greg Balco (Editor)

I agree with the associate editor that this is suitable for publication. I have a few final comments:

-- This could be published either as a 'research article' or a 'technical note.' There is no size limit on a 'technical note', although this is confusing on the Gchron website because 'technical notes' and 'short communications' are largely described together. Personally, I would lean toward 'technical note' because for the most part this paper reviews data and mathematical methods that are already known, but the authors should use their best judgement. Please note that 'technical notes' at GChron have exactly the same review and bibliographic treatment as 'research articles' -- they are not considered a less rigorous category. If the authors wish to publish this as a 'technical note,' please contact the Copernicus editorial staff with a copy to myself and the AE.

-- Please carefully check all the equations in the paper during typesetting and production.

-- One theme of the reviews and responses for this paper is the tension between providing a mathematically concise description of the error propagation scheme and providing a description that can easily be understood by students or others with a less extensive mathematical background. In this paper the authors have argued in favor of the latter, which is totally reasonable in light of the intended purpose of the paper. However, a potential problem with this approach, that appears to be recognized but not explicitly discussed in the correspondence, is that the authors of this paper are, at least implicitly, proposing that the approach described in the paper should be adopted as a standard in the field. This leads to the risk that a more mathematically concise error propagation scheme (e.g., calculation of uncertainties in log rather than by Monte Carlo), even though it is also correct, might not be accepted in future publications because it would be viewed as noncompliant with the more simplistic standard proposed here. However, (i) the authors have been very good about keeping prescriptive remarks and recommendations out of the paper (this is good), and (ii) the purpose of this paper is, as articulated by the authors, to explain the method to students. Thus, I don't think any action is needed to address this.

Thanks very much for submitting your work to Geochronology,

-- greg

Hide

AR by Peter Martin on behalf of the Authors (10 Jan 2023) Author's response Manuscript

Journal article(s) based on this preprint

07 Feb 2023

Calculation of uncertainty in the (U–Th) ∕ He system

Peter E. Martin, James R. Metcalf, and Rebecca M. Flowers

Geochronology, 5, 91–107, https://doi.org/10.5194/gchron-5-91-2023,https://doi.org/10.5194/gchron-5-91-2023, 2023

Short summary

Peter E. Martin et al.

Model code and software

HeCalc P. E. Martin https://doi.org/10.5281/zenodo.5672830

Peter E. Martin et al.

Viewed

Total article views: 534 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
362	159	13	534	3	5

HTML: 362
PDF: 159
XML: 13
Total: 534
BibTeX: 3
EndNote: 5

Views and downloads (calculated since 12 May 2022)

Month	HTML	PDF	XML	Total
May 2022	159	43	6	208
Jun 2022	59	22	0	81
Jul 2022	22	16	0	38
Aug 2022	63	32	6	101
Sep 2022	12	15	0	27
Oct 2022	20	7	1	28
Nov 2022	14	7	0	21
Dec 2022	6	7	0	13
Jan 2023	5	7	0	12
Feb 2023	2	3	0	5
Mar 2023	0
Apr 2023	0
May 2023	0
Jun 2023	0
Jul 2023	0
Aug 2023	0
Sep 2023	0
Oct 2023	0

Cumulative views and downloads (calculated since 12 May 2022)

Month	HTML	PDF	XML	Total
May 2022	159	43	6	208
Jun 2022	59	22	0	81
Jul 2022	22	16	0	38
Aug 2022	63	32	6	101
Sep 2022	12	15	0	27
Oct 2022	20	7	1	28
Nov 2022	14	7	0	21
Dec 2022	6	7	0	13
Jan 2023	5	7	0	12
Feb 2023	2	3	0	5
Mar 2023	0
Apr 2023	0
May 2023	0
Jun 2023	0
Jul 2023	0
Aug 2023	0
Sep 2023	0
Oct 2023	0

Viewed (geographical distribution)

Total article views: 494 (including HTML, PDF, and XML) Thereof 494 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 07 Oct 2023

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint (2064 KB)
Metadata XML

Short summary

There is currently no standardized method of performing uncertainty propagation in the (U-Th)/He system, causing difficulties in data interpretation. We present two methods of uncertainty propagation and describe a free, open-source software (HeCalc) to apply these methods. A compilation of real data using only analytical uncertainty, and 2 % and 5 % uncertainties in F_T yields respective median relative date uncertainties of 2.9 %, 3.3 % and 5.0 % for apatites and 1.7 %, 3.3 % and 5.0 % for zircons.


Total:	0
HTML:	0
PDF:	0
XML:	0