The quantification of downhole fractionation for laser ablation mass spectrometry

Lloyd, Jarred Cain; Spandler, Carl; Gilbert, Sarah E.; Hasterok, Derrick

doi:https://doi.org/10.5194/egusphere-2024-2908

Preprints

https://doi.org/10.5194/egusphere-2024-2908

Preprints

10 Oct 2024

| 10 Oct 2024

The quantification of downhole fractionation for laser ablation mass spectrometry

Jarred Cain Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Abstract. Downhole fractionation (DHF), a known phenomenon in static spot laser ablation, remains one of the most significant sources of uncertainty for laser-based geochronology. A given DHF pattern is unique to a set of conditions, including material, inter-element analyte pair, laser conditions, and spot volume/diameter. Current modelling methods (simple or complex linear models, spline-based modelling) for DHF do not readily lend themselves to uncertainty propagation, nor do they allow for quantitative inter-session comparison, let alone inter-laboratory or inter-material comparison.

In this study, we investigate the application of orthogonal polynomial decomposition for quantitative modelling of LA-ICP-MS DHF patterns with application to an exemplar U–Pb dataset across a range of materials and analytical sessions. We outline the algorithm used to compute the models and provide a brief interpretation of the resulting data. We demonstrate that it is possible to quantitatively compare the DHF patterns of multiple materials across multiple sessions accurately, and use uniform manifold approximation and projection (UMAP) to help visualise this large dataset.

We demonstrate that the algorithm presented advances our capability to accurately model LA-ICP-MS DHF and may enable reliable decoupling of the DHF correction for non-matrix matched materials, improved uncertainty propagation, and inter-laboratory comparison. The generalised nature of the algorithm means it is applicable not only to geochronology but also more broadly within the geosciences where predictable linear relationships exist.

Received: 28 Sep 2024 – Discussion started: 10 Oct 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 6718 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Preprint (6718 KB)

Download & links

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

05 Aug 2025

The quantification of down-hole fractionation for laser ablation mass spectrometry

Jarred C. Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Geochronology, 7, 265–287, https://doi.org/10.5194/gchron-7-265-2025,https://doi.org/10.5194/gchron-7-265-2025, 2025

Short summary

Jarred Cain Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2908', Anonymous Referee #1, 15 Nov 2024

While I am not in a position to critically evaluate the mathematical models used to quantify Differential Heating Factor (DHF) in this extensive dataset, the results presented here are compelling. Having previously analyzed a similar dataset and examined DHF patterns in U-Pb dating across various minerals, I am often struck by the surprising variability between them. The physical mechanisms underlying DHF remain somewhat ambiguous, which makes this study’s methodology and findings—particularly in the context of “Big Data”—valuable to the community engaged in LA-ICP-MS U-Pb dating methods.
Drawing from my experience, I believe certain aspects warrant additional discussion. Ideally, expanding this impressive dataset with data from alternative instrumentation (both laser ablation and ICP-MS systems) would mitigate some current limitations of the study, though it would likely raise additional questions as well.
Several potential limitations and influential factors in DHF quantification, as presented in this work, could benefit from a more detailed discussion. These include aspects such as signal duration, differences in instrumentation and operator handling, sample heterogeneity, inclusions, focal position accuracy, and variation in detector cross-calibration and dead time.
Lines 82-85: “The independence and physical meaning of the lambda coefficients allows them to be used to quantitatively compare independent fits (e.g., single analyses, materials, analytical sessions, differing laboratories) so long as other parameters (e.g., fluence, spot diameter/volume, laser wavelength) are considered.”

How are these other parameters accounted for within the analysis framework? What impact would neglecting them have on the method’s accuracy and reliability?
Is the centering of time-dependent ratios influenced by signal duration, or is it only lambda 0 that is sensitive to such changes, potentially due to ICP tuning dependencies? Furthermore, how does total analysis time (with data presented at 30 and 40 seconds) impact the DHF pattern once data are centered for further calculation of lambda components (lambda 1, 2, 3, and 4) and subsequent UMAP visualization? If lambda components exhibit different characteristics during different parts of the signal—such as a linear trend dominating the initial 10 seconds and a higher-order trend thereafter—then the lambda coefficients would differ for signal durations of 20, 30, or 40 seconds. Consequently, the same analysis might plot differently in UMAP depending on signal duration. This warrants further discussion.
The statement, “and for Wilberforce, the steeper linear DHF component and larger uncertainty are due to inclusion of several points from some analyses that are highly leveraging the fit, even with automated outlier removal being applied,” suggests that inclusions and heterogeneity (particularly variable initial Pb content, as in Apatite) within reference materials can influence the DHF pattern beyond what outlier removal can address. How is this handled in your analysis? A more in-depth discussion would be beneficial.
This study is based on data from a single laboratory using a single laser ablation system and two similar ICP-MS instruments likely operated or trained by a single Lab Manager. The robustness of DHF quantification could be enhanced by incorporating data from different laser ablation systems, which vary in wavelength, pulse width, energy density, and ablation cell design, as well as different ICP-MS instruments. Please discuss how this limitation might affect the generalizability of the quantification.
Additionally, ICP-MS instruments employ various detection modes that require cross-calibration. How is it ensured that ablation signals—where intensity generally decreases during single-hole ablation—are unaffected by potential cross-calibration errors that could impact ratio measurements?
Minor comments:
Line 33: “As DHF is a volume- dependant spot-ablation phenomena…” I would rather state “ As DHF is a crater geometry dependant ….”
Line 216 there is an E missing in “(𝜆3) range from -1.08E-5 to +2.15-5,”

Citation: https://doi.org/10.5194/egusphere-2024-2908-RC1
- AC1: 'Reply on RC1', Jarred Lloyd, 24 Mar 2025
  
  We thank the reviewer for their time and comments on the manuscript.
  The reviewer has raised they would like to see more discussion regarding the potential limitations and influential factors in down-hole fractionation (DHF). While we agree the with the overall concept they are presenting in their commentary, we feel that this is largely beyond the current the scope of this contribution, which is to provide an improved quantitative method for the assessment of DHF fraction during LA-ICP-MS analysis, rather than conduct the assessment of these factors. The assessment of these factors using the proposed method is a worthwhile endeavour, but would be best addressed in a one or more separate studies. The factors influencing DHF are well documented in literature, although their underlying mechanisms are somewhat ambiguous still as the reviewer states.
  
  We agree it would be worthwhile adding a brief section in the discussion or introduction that outlines the potential factors more clearly and how they are controlled for during this study and what impact they would have if neglected, but the mathematical foundation of the modelling is independent of these factors. As with all tools, user beware.
  Regarding commentary about single laboratory/single ablation system, again, given the method being proposed here is the mathematical foundation for an improved model fitting algorithm, we believe addition of extra datasets from other laboratories and equipment is not necessary for this particular manuscript. It is exactly what would be needed for a large-scale assessment of DHF by the community and would be best addressed in a separate study. From the data provided in this study it is already apparent that the usual factors influencing DHF (spot geometry, material, etc) will change the resulting model fit. Laser wavelength is known to impact DHF as shown in numerous studies and so would be expected to change the resulting fit, but again not the mathematical foundation of the fitting. We propose to add older published data from the lead author that used a 213 nm laser paired with an Agilent 7500 as a supplementary figure to demonstrate this, and would add the brief section on DHF influences as outlined above.
  With regard to the commentary asking how these factors were controlled for during this study. We decided to demonstrate the fitting algorithm using common reference materials for well-established methods, i.e. U-Pb geochronology of zircon, monazite, apatite etc using standard conditions for each material as accepted by the community for a given analytical setup. The references for the standard analytical methods are provided in the methods section of the paper, but for example all zircon data presented in this study was ablated using a 193 nm wavelength laser using a fluence of nominally 2.0 J cm^-2 and a repetition rate of 5 Hz. When computing orthogonal decompositions of aggregated data, the data is first grouped by fluence, spot diameter, repetition rate, and material. This ensures that only like-for-like data are being fit, and not artificially skewing results due to external factors that are known the impact DHF (e.g. fluence) not being accounted for. Again, our paper presents a methodology for quantitatively fitting the signal of LA-ICP-MS analyses to model DHF: it is independent of the specific instrument factors used during analysis.
  In regard to cross-calibration due to detection modes, I am interpreting this as the change from analogue to pulse detection model. Our ICP-MS setup is set to use a pulse cut-off of ~2.5 million counts per second, and during initial tuning before an analytical session the count rates are tuned to optimise sensitivity while ensuring that counts remain in one detection mode. If an analysis did have a portion in analogue mode that switched to pulse mode due to the signal intensity decrease over time, and that calibration factor was in error, it would make the fit erroneous and require portions of the signal to be subdivided for fitting, however no such change is observed in this data set (as seen in the supplementary signal intensity plots). It would be seen as a step change in the intensity and ratio of a signal if present.
  Regarding the centring of time-dependant data. The reviewer has understood correctly that only lambda 0 is sensitive to ICP tuning dependencies. Centring the data does not change the shape of the data, just the central tendency and this allows for comparison of data that have not yet been corrected against a known ratio (e.g. NIST glass, GJ). Total signal time is likely to have negligible impact on lambda 1, but may have a greater impact on lambdas 2 and higher due to the more complicated profiles these represent (sinuosity for example). This potentially would have some impact on where they plot on UMAP, but they should still plot close to analyses from the same material. Some of the variation seen for a single material (e.g. GJ1) may be explained by this effect. This can be elaborated on in the manuscript and looked at in more detail. The time component of a modelled signal is corrected to laser-fire and signal-stabilisation (after initial intensity ramping) to mitigate some of the differences that would occur from using a delayed subset of a signal (e.g. 10 seconds after the signal stabilised).
  For the minor comments:
  
  We agree that geometry is a better term as it accounts for shape, diameter, and depth.
  We will correct the missing E in the scientific notation, and will change to "value*10^X" notation if that is the Geochronology formatting recommendation.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2908-AC1
RC2:
'Comment on egusphere-2024-2908', Noah M McLean, 18 Feb 2025

This manuscript, which details a method to quantify the shape of laser ablation downhole fractionation profiles and compares them across minerals and instrument parameters from a single lab, is a solid contribution. Its quantitative, technical scope, and its methodological slant are well suited to Geochronology. In my opinion, the article needs somewhere between minor and major revisions, mostly to do with the clarity of the writing and quantitative explanations and figures. The orthogonal polynomial regression approach is sound, and the dataset is interesting, although applications for the technique have not been thoroughly explored.
The orthogonal polynomial regression approach used here is appropriate, and its use to quantify the shape of the DHF is an interesting contribution. I’ve never heard of UMAP, but it seems to be used appropriately. A quick survey of the GitHub repository shows that the code is relatively easy to follow, and major functions have informative docstrings. However, there are some issues with the regression framing (see comments around equations 7-9).
What’s missing from this publication is some explanation and clarity – on occasions throughout the text, I’m not sure what is being explained or plotted. I’ve enumerated those below. Another missing point, brought up by the previous review, is that the specific results discussed in this paper (most of the figures and discussion) are applicable to the authors’ analytical setup and choices. For instance, the shape parameters for DHF will depend not just on the mineral and the spot size but also on the choice of fluence. How were the fluences in Appendix A chosen? From lab to lab, these will vary based on sample cell setup, gas flows. It’s clear from reading the manuscript that the authors know this, but it’s not clear upfront to the reader.
It is also unclear how the authors propose to use the results of their orthogonal polynomial regression to correct for down-hole fractionation in the Applications section. On line 293, the authors suggest that “We envisage that this algorithm could be implemented in data reduction software to self-correct the DHF pattern of well-behaved materials…” I don’t know what this means, but it sounds like a very different approach than the two common DHF correction schemes, the ‘intercept’ method (used, e.g., by AZ LaserChron), or the ‘downhole’ method from Paton et al. implemented in Iolite. Can the authors describe such a self-correction and how it might be different from what’s used currently? This point is obtuse, and if the authors wish to write a forthcoming paper on a new algorithm, I’d suggest leaving it out here rather than the current vague mention.
Finally, I don’t see any documentation of how this algorithm was tested. Did you create synthetic data with a known data covariance matrix and known regression parameters, fit the data, and recover the input parameters? How does the scatter between regression parameters fit to many synthetic datasets compare with the uncertainties in the fit parameters estimated by the model?
Line-by-line comments follow.
Figure 2b: The trace of lambda_4 here is extraneous or in error. Based on the description in this portion of the manuscript, I would expect to see a fourth-order polynomial. If the coefficient is so small that the trace won’t plot on the y-axis of this figure, then it is not a good illustration. If it has been set to zero by an AICc cutoff explained much later in the text, then it’s quite confusing here. Is there some mathematical significance to the dot between the lambda_j and the variable x in the legend?
Figure 2a and caption: The ‘gmean-centered ratio’ y-axis label needs a detailed explanation for those who don’t regularly center their data with a geometric mean and are likely confused by this axis label. The first sentence of the caption doesn’t help much – please sacrifice some brevity for the sake of clarity. The x-axis label and color scale for 2a are confusing – do you need all those dates on the color scale? It’s the same length as the x-axis, which is confusing – I thought for a long time that the color scale corresponded to the ‘time since laser start’ and couldn’t figure out what was being plotted. Maybe make the color scale vertical and place it to the right of the axes with fewer dates?
Line 81: there is some confusion here about whether lambda_0 represents an arithmetic or geometric mean of the data. I like that this manuscript uses geometric means extensively. Why do the regression on ratios instead of log-ratios, though? The authors seem aware (e.g. in the appendix) about the challenges of dealing with compositional data. Those challenges extend not just to taking means, but to regression problems (same idea, more parameters). See countless Aitchison publications for details.
Line 115: I don't understand this paragraph as written. Specifically, what numbers are discrepant? Are lambda_1 and higher different for centered and non-centered fits of the same (e.g., 30-second) laser ablation analyses? This seems like a rounding or numerical precision problem. If the labda_1 and higher coefficients are different among separate laser ablation analyses (e.g., the first reference material analysis and the second), then this implies that the shape of the DHF is changing during the session, or that the ref mat is heterogeneous in the parameters that impart the DHF behavior.
Figure 4: Maybe thin this data out by randomly selecting 10% or 25% of the data to plot? That would prevent data overlap and occlusion problems and avoid aliasing effects.
Line 132: The example here is counts per second, but the figure shows a ratio on the y-axis. Is there a time when you'd fit an intensity rather than a ratio, and when would that fit be useful?
Line 136: What is N?
Equations 7-9: There is some lack of clarity here, along with some errors or typos.
In equation 7, if you want to leave that Sigma term in, you’ll want to take the hat off of the y. Usually, hat means the predicted values, which are given by multiplying the design matrix by the best-fit parameters (here, Lambda), usually also given a hat, which means there’s no error term in the equation. The error term here (Sigma in equation 7) would need to be a vector, not a matrix, and it is not given by equation 9, which describes the uncertainties in the best-fit parameters in Lambda, not the measurements in y.
In the generalized least squares framing, that Omega matrix in equation 8 is the covariance matrix for the measurements in y (or the residuals in a differently formulated equation 7). This matrix goes unmentioned in the main text, until line 425 in the Appendix. Where do the analytical uncertainties in equation 8 come from? I’ve looked at the code and the appendix and I’m still unsure. What terms go into the uncertainties? For ratios, are the numerator and denominator intensity measurements accounted for? Are detector effects like dead time included? What sort of assumptions are made here and how might they affect the results?
As far as I can tell from the appendix, these uncertainties are estimated from the variability of the measured ratios about the… geometric mean? But this wouldn’t make much sense for a time series where you expect a trend (and you want to measure its shape) – some of the variability would come from the analytical uncertainty and some from the trend itself. I can’t piece this together.
Looking at the code and the appendix, Omega is a diagonal matrix. My understanding is that this technically makes your algorithm a weighted least squares, rather than a generalized least squares, regression problem. Also, I think most folks would put a hat on Lambda here.
In equation 9, the Sigma contains the variances and covariances for the fit parameters (here, the lambdas). This step only works for generating uncertainties, confidence bands, etc, if the Omega is an inverse covariance matrix for the data in y. Note that in your GitHub code, lines 286-300 look like they're weighting by integer ones and maybe a few other approaches? I can't follow the code exactly and it's not commented. If you use ones as the diagonal of Omega, then equation 9 only gives you a Sigma that contains variances and covariances if the uncertainties are all independent (e.g., no covariance terms from detector effects like dead time) and identical to one in whatever the units of y are (unclear here and elsewhere).
Are the lambda uncertainties plotted throughout this paper calculated using input uncertainties to fit_orthogonal() or are some or all calculated with the default unit weights?
Figure 5, line 212: It is unclear to me what is going on here. I think I get (c), which describes the results of 5478 different orthogonal polynomial regression analyses (of what exactly?) versus time? Maybe 206/238? Or maybe an unspecified intensity in cps, per line 132?
What then are the 188(?) points in b? Maybe the (cps? ratio?) data from a single reference material (e.g. GJ1) have been lined up to a common time datum (the laser turning on?), and all the (centered?) data have been fit by the same orthogonal polynomials? If multiple datasets are combined, do they agree? One of those chi-squared values you mention in the model selection portion of the paper should tell you, but those are not reported here. The way that you’ve set up the regression in equations 7-9, the uncertainties in Sigma will get smaller with more data, no matter how scattered the data are about any one trend. Try it and see! This is the same effect as the weighted mean (a special case of weighted least squares) getting more precise when adding more analyses, even when those analyses don’t agree and the reduced chi squared grows large. Just like a dataset can have a weighted mean with a large reduced chi squared and a tiny uncertainty, your regression could have a large scatter about the trend you’ve described with orthogonal polynomials but tiny uncertainties.
Continuing on the comments from Figure 5b, what does this analysis tell us? Spell out where a user would one use these lambdas, rather than the lambdas from (c)? What about (a)??
Line 215: Spell out scientific notation here and in the figure axis labels, but I'd say just leave them out of the text altogether unless you mean to explain the physical significance (starting with the units) for each number.
Line 227: Describe very clearly here what you mean by “without needing reference material calibration,” or rephrase this. Maybe something along the lines of "the lambdas quantify the shape of the DHF for each of the unknown analytes and reference materials in a way that is independent of a shape derived from an average of reference material analysis (ref Paton).

Citation: https://doi.org/10.5194/egusphere-2024-2908-RC2
- AC2: 'Reply on RC2', Jarred Lloyd, 24 Mar 2025
  
  We thank Noah for his time and constructive comments on the manuscript.
  Starting with the final overview comment about how was the algorithm tested. We did not create a synthetic dataset and recover the input parameters, in hindsight this would have made testing simple, and we are happy to implement this to further validate the method as it would also be useful for debugging purposes. This method was developed upon the established methodology of O'Neill (2016) and Anenburg & Williams (2022) who have used it for modelling REE patterns quantitatively. Our algorithm was tested against theirs using identical datasets to ensure consistency in results and uncertainties. Our implementation of this algorithm is much more generalised allowing it to be used for the purposes in this manuscript, but can readily be applied to any data where a (up to five order) polynomial fit is sensible.
  While the reviewer has raised a similar concern to reviewer 1 about the results being applicable to only our analytical setup, we again believe this is not a limitation of the current scope of the study but would be happy to add older published data from the lead author that used a 213 nm laser paired with an Agilent 7500 as a supplementary figure. The scope of the study is to provide a tool that can reliably and quantitatively model DHF patterns to enable studies that assess the differences between the factors that influence DHF and differences between analytical setups, laboratories, etc., rather than make a full assessment of these. The assessment of these factors, and results from different laboratories would best be done in a separate (or several) community-led studies involving many institutes.
  Fluences in appendix A were chosen based on the analytical methods cited in the methods section. The chosen fluences for a given material have been used consistently for several years in the host laboratory with numerous studies published using those conditions. The initial studies aimed to optimise signal sensitivity, count rate, signal complexity, and pit depth while minimising DHF for a given material. As the reviewer has rightly stated, this will vary between laboratories but this is where the proposed method can be of significant use in enabling quantitative comparison between setups so long as the appropriate metadata (laser and tuning conditions) are known for a given fit. We can add a sentence or two to the manuscript highlighting this point.
  
  For the reviewer's overall commentary on the clarity, we will use their input to refine the manuscript to improve this aspect. One of the co-authors who is more removed from the mathematical and technical knowledge on this manuscript will look over the manuscript again with the reviewer's feedback in mind to assist further with this.
  Regarding the discussion of applications of the method. We will aim to clarify this further in a revision. Our intent is to provide the method so it could be used as a drop-in replacement for standard polynomial fitting used in current data reduction software (Iolite, LADR) for DHF correction whilst then providing the quantitative parameters and enable DHF correction uncertainty propagation. For our purposes, it is used to demonstrate the methods validity for down hole modelling, and we will use it to quantitatively assess the appropriateness of using one material to perform down-hole correction on another.
  
  The proposed method, could however be used to quantitatively model any geological process which has a (mathematically) linear relationship between x and y (for example, REE abundance ionic radii, as has been done previously).
  
  With specific regard to line 293, while we have not investigated how such an implementation would work in practice, we believe it would be possible to remove the dependence of the "down-hole method" (Paton el al.) on a homogeneous material for DHF correction by fitting each unknown directly. There are numerous parameters of fit that could numerically assess the sensibility of a fit to an individual signal, and if the signal were particularly noisy (due to things like inclusions) and the parameter of fit suggested such it could fall back to the standard "down-hole model" of a reference material. We may remove this sentence at the reviewer's recommendation, given the unexplored nature of this implementation. Regardless, we think it worthwhile elaborating on how the proposed method differs from the existing down-hole modelling methods as they are only briefly discussed in the manuscript's introduction.
  Figure 2: We would happily make modifications to the caption and figure to address the reviewer's comments here. For figure 2b we could remove λ₄ as it is effectively zero, and add a supplementary figure that displays the shape of each component (λ 0 - 4) more clearly. The dot between λ_j and variable x is intended to signify that is it λ_j multiplied by the orthogonal function φ(j) of x. More concretely, λ₁* φ₁(x) is the linear orthogonal polynomial component of x. This needs to be revised on the figure as they are in the wrong spot, it should be λ_j⋅φ(x).
  Line 81: It currently represents an arithmetic mean due to fitting on centred (i.e. geometric mean subtracted) ratios. These by definition end up centred around 0 with both positive and negative ratios causing log transformations to not be possible. This is more a limitation of how the data is initially ingested and the order in which operations are done. This could be overcome with a code rework. We don't see this as an issue currently, as we are only attempting to show this method is possible, and can make improvements to the code in future (contributions are welcome) to further improve the accuracy of the method from a compositional data perspective.
  Line 115: We'll work to revise this paragraph for clarity, as it is rather complicated as currently written.
  
  An example of what this paragraph was attempting to highlight: for session aggregated fits lambda 1-4 values for a non-centred and centred fit are slightly off from each other.
  
  For example, lambdas 1-4 for 91500 on 2020-02-24 are 0.00065959, -3.5747e-5, 4.6642e-7, 4.1489e-8 when not centred, and 0.00066039, -3.5474e-5, 5.2599e-7, 5.198e-8 when centred. After investigating this further, it appears to be an artefact of the data not being calibrated against a known reference material and their being some drift in the session (which has not been corrected for). This would cause underlying data to have slightly greater spread along the y-axis and would then impact λ₁ (the slope), which then impacts λ₂, and so on with each higher term being partially dependant on the term before it. Using a synthetic dataset and introducing "drift" across it will help confirm this.
  
  Importantly, individual analysis fits have identical lambda values for centred and non-centred datasets, implying that the algorithm operating correctly and this is an input data issue unique to session aggregated data.
  
  This further highlights the need to centre data when comparing different analyses or aggregating data to remove the influences of machine drift and tuning.
  Figure 4: We'll take this feedback on-board. The figure is there to demonstrate that we can compare data from different sessions without needing to first calibrate them to a known-ratio.
  Line 132: We'll change this to "ratio" for clarity. Fitting signal intensity rather than ratio might provide us with greater insight on how specific elements are behaving during ablation (as a result of ionisation efficiencies, volatilities etc) and could be a potential application of the method. We have not explored this though.
  Line 136: N is the total number of points. I can change this to "for all i from 1 to N" to improve clarity.
  Eqs 7-9: The hat is removed from y in an already revised version as this is a typographic error.
  
  The reviewer is correct in Eq. 9 being the variance-covariance of the lambda parameters, not the measurements in y, and the sigma term being a vector rather than a matrix. These are typographic errors. Eq. 9 is used to calculate the lambda values.
  
  These equations and corresponding text will be changed to reflect this.
  
  For Eq 8, notably the Omega term. The input uncertainties come from prior data ingestion steps, but they are not fully quantified with parameters like dead-time uncertainty, dwell time uncertainty etc. They are currently calculated from the underlying gas-blank subtracted count per second data for the ratio numerator and denominator intensities via the function load_agilent. These uncertainties are used as a weighting for the fit.
  
  The reviewer is also correct that the exact current implementation is a weighted least squares due to the diagonal Omega matrix, but this is a specialised case of generalised least squares. The intention of having the algorithm set up like this currently is to allow for true covariance-variance input (accounting for dwell time and dead time) in future (Omega would then not be a diagonal matrix).
  
  We welcome future, external discussion with the reviewer on this topic to refine the code accordingly but this does not impact the methodological foundations of the proposed fitting method as it gives results consistent to the REE fitting algorithm of Anenburg & Williams (2022) on their REE data.
  
  The plotted lambda uncertainties in this paper are all calculated by the fitting algorithm using input uncertainties as weights for the fit, and not the default unit weights that are provided in the algorithm as these are only a fall back if the user does not provide uncertainties for weighting.
  Figure 5, line 212: We would happily work through these comments to improve clarity here.
  
  For reference, (a) are the fits for an aggregated dataset for a specified sample from all sessions (e.g. GJ, centred, all sessions), (b) are the fits for an aggregated dataset for a specified sample within a session (e.g. GJ, centred, single session). x is λ₁, y is λ₂, of an orthogonal polynomial fit to U/Pb (y) against time (x). The data have been lined up to a common time datum, laser on using the automatic times algorithm implemented in the code.
  
  The reduced chi squared values are quite small, indicating either over-fitting of the data or the error variance has been overestimated. This is likely related to the set up of input and output analytical uncertainties, and as the reviewer has stated will lead to smaller uncertainties with increasing number of data points.
  
  I've implemented a normalised root-mean-square-error (NRMSE, 0 being a "perfect fit" and 1 being a "nonsensical fit" of the data). For the worst case sample using the least appropriate order (lowest AIC weight), this value is around 0.25 and for an appropriate fit, the value is usually between 0.08-0.16 suggesting good-quality fits.
  
  NRMSE values from the sample aggregated data (i.e. all of 91500), session fits (91500 per session) and analysis fits (91500 per analysis) very consistent across these models. I would expect to see greater variability in the NRMSE across the different aggregations if the input data were increasing the overall scatter due to them not following the same linear (x to y) relationship as each other.
  
  These measures of fit aren't directly reported in the manuscript as they are directly available from the provided code and data, and it would create an excessively large table that is quite difficult to understand.
  
  There is room for improvement in the uncertainty quantification and use of these measures of fit, with particular consideration of some of the pitfalls in reduced chi squared for linear regression (Andrae et al. 2010, preprint, http://arxiv.org/abs/1012.3754). However, this is will not change the overall conclusions drawn from this work, nor does it prevent the use of the proposed method, as the underlying fitting algorithm is well established in literature and as the reviewer has confirmed, is mathematically sound. The code is open source, and we welcome further discussion and contributions from the reviewer (and any other interested party) to improve the uncertainty quantification of the lambda parameters in the code.
  We'll take on the feedback to highlight where a user should utilise analysis or aggregated data lambda values. For individual down-hole fitting, analysis fitting is what should be used (i.e. assessing how well modelled a specific analysis is), session based aggregation would be used to check within lab variability, and sample based aggregation would be used for between lab variability/long term assessment of reference materials or to choose a suitable reference material to use as a calibration for an unknown.
  Line 215: We will use this feedback to make improvements here. From my current understanding, the lambda values are unit less. I.e. λ₀ is derived from the mean cps/cps (units cancel) and all other terms are multipliers of functions, that themselves have units, e.g. λ_j*φ_j(x) where, for example, φ₂ = (x_i - γ₁)(x_i - γ₂) and has units of x² (seconds squared).
  Line 227: We will take this feedback on board as it also assists with the prior two comments. In essence λ₀ is the only value sensitive to ICP-tuning and drift, thus if data are first centred to account for these factors the values of λ₁ and higher, should only represent the shape parameters of an analytical signal and not change with respect to a centred and non-centred dataset of single analysis fits. For aggregated datasets, if the values of λ₁ and higher are not identical for centred and non-centred datasets this suggests an external influence is not yet accounted for (i.e. machine drift within a session or intersession tuning).
  
  Citation: https://doi.org/10.5194/egusphere-2024-2908-AC2

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2908', Anonymous Referee #1, 15 Nov 2024

While I am not in a position to critically evaluate the mathematical models used to quantify Differential Heating Factor (DHF) in this extensive dataset, the results presented here are compelling. Having previously analyzed a similar dataset and examined DHF patterns in U-Pb dating across various minerals, I am often struck by the surprising variability between them. The physical mechanisms underlying DHF remain somewhat ambiguous, which makes this study’s methodology and findings—particularly in the context of “Big Data”—valuable to the community engaged in LA-ICP-MS U-Pb dating methods.
Drawing from my experience, I believe certain aspects warrant additional discussion. Ideally, expanding this impressive dataset with data from alternative instrumentation (both laser ablation and ICP-MS systems) would mitigate some current limitations of the study, though it would likely raise additional questions as well.
Several potential limitations and influential factors in DHF quantification, as presented in this work, could benefit from a more detailed discussion. These include aspects such as signal duration, differences in instrumentation and operator handling, sample heterogeneity, inclusions, focal position accuracy, and variation in detector cross-calibration and dead time.
Lines 82-85: “The independence and physical meaning of the lambda coefficients allows them to be used to quantitatively compare independent fits (e.g., single analyses, materials, analytical sessions, differing laboratories) so long as other parameters (e.g., fluence, spot diameter/volume, laser wavelength) are considered.”

How are these other parameters accounted for within the analysis framework? What impact would neglecting them have on the method’s accuracy and reliability?
Is the centering of time-dependent ratios influenced by signal duration, or is it only lambda 0 that is sensitive to such changes, potentially due to ICP tuning dependencies? Furthermore, how does total analysis time (with data presented at 30 and 40 seconds) impact the DHF pattern once data are centered for further calculation of lambda components (lambda 1, 2, 3, and 4) and subsequent UMAP visualization? If lambda components exhibit different characteristics during different parts of the signal—such as a linear trend dominating the initial 10 seconds and a higher-order trend thereafter—then the lambda coefficients would differ for signal durations of 20, 30, or 40 seconds. Consequently, the same analysis might plot differently in UMAP depending on signal duration. This warrants further discussion.
The statement, “and for Wilberforce, the steeper linear DHF component and larger uncertainty are due to inclusion of several points from some analyses that are highly leveraging the fit, even with automated outlier removal being applied,” suggests that inclusions and heterogeneity (particularly variable initial Pb content, as in Apatite) within reference materials can influence the DHF pattern beyond what outlier removal can address. How is this handled in your analysis? A more in-depth discussion would be beneficial.
This study is based on data from a single laboratory using a single laser ablation system and two similar ICP-MS instruments likely operated or trained by a single Lab Manager. The robustness of DHF quantification could be enhanced by incorporating data from different laser ablation systems, which vary in wavelength, pulse width, energy density, and ablation cell design, as well as different ICP-MS instruments. Please discuss how this limitation might affect the generalizability of the quantification.
Additionally, ICP-MS instruments employ various detection modes that require cross-calibration. How is it ensured that ablation signals—where intensity generally decreases during single-hole ablation—are unaffected by potential cross-calibration errors that could impact ratio measurements?
Minor comments:
Line 33: “As DHF is a volume- dependant spot-ablation phenomena…” I would rather state “ As DHF is a crater geometry dependant ….”
Line 216 there is an E missing in “(𝜆3) range from -1.08E-5 to +2.15-5,”

Citation: https://doi.org/10.5194/egusphere-2024-2908-RC1
- AC1: 'Reply on RC1', Jarred Lloyd, 24 Mar 2025
  
  We thank the reviewer for their time and comments on the manuscript.
  The reviewer has raised they would like to see more discussion regarding the potential limitations and influential factors in down-hole fractionation (DHF). While we agree the with the overall concept they are presenting in their commentary, we feel that this is largely beyond the current the scope of this contribution, which is to provide an improved quantitative method for the assessment of DHF fraction during LA-ICP-MS analysis, rather than conduct the assessment of these factors. The assessment of these factors using the proposed method is a worthwhile endeavour, but would be best addressed in a one or more separate studies. The factors influencing DHF are well documented in literature, although their underlying mechanisms are somewhat ambiguous still as the reviewer states.
  
  We agree it would be worthwhile adding a brief section in the discussion or introduction that outlines the potential factors more clearly and how they are controlled for during this study and what impact they would have if neglected, but the mathematical foundation of the modelling is independent of these factors. As with all tools, user beware.
  Regarding commentary about single laboratory/single ablation system, again, given the method being proposed here is the mathematical foundation for an improved model fitting algorithm, we believe addition of extra datasets from other laboratories and equipment is not necessary for this particular manuscript. It is exactly what would be needed for a large-scale assessment of DHF by the community and would be best addressed in a separate study. From the data provided in this study it is already apparent that the usual factors influencing DHF (spot geometry, material, etc) will change the resulting model fit. Laser wavelength is known to impact DHF as shown in numerous studies and so would be expected to change the resulting fit, but again not the mathematical foundation of the fitting. We propose to add older published data from the lead author that used a 213 nm laser paired with an Agilent 7500 as a supplementary figure to demonstrate this, and would add the brief section on DHF influences as outlined above.
  With regard to the commentary asking how these factors were controlled for during this study. We decided to demonstrate the fitting algorithm using common reference materials for well-established methods, i.e. U-Pb geochronology of zircon, monazite, apatite etc using standard conditions for each material as accepted by the community for a given analytical setup. The references for the standard analytical methods are provided in the methods section of the paper, but for example all zircon data presented in this study was ablated using a 193 nm wavelength laser using a fluence of nominally 2.0 J cm^-2 and a repetition rate of 5 Hz. When computing orthogonal decompositions of aggregated data, the data is first grouped by fluence, spot diameter, repetition rate, and material. This ensures that only like-for-like data are being fit, and not artificially skewing results due to external factors that are known the impact DHF (e.g. fluence) not being accounted for. Again, our paper presents a methodology for quantitatively fitting the signal of LA-ICP-MS analyses to model DHF: it is independent of the specific instrument factors used during analysis.
  In regard to cross-calibration due to detection modes, I am interpreting this as the change from analogue to pulse detection model. Our ICP-MS setup is set to use a pulse cut-off of ~2.5 million counts per second, and during initial tuning before an analytical session the count rates are tuned to optimise sensitivity while ensuring that counts remain in one detection mode. If an analysis did have a portion in analogue mode that switched to pulse mode due to the signal intensity decrease over time, and that calibration factor was in error, it would make the fit erroneous and require portions of the signal to be subdivided for fitting, however no such change is observed in this data set (as seen in the supplementary signal intensity plots). It would be seen as a step change in the intensity and ratio of a signal if present.
  Regarding the centring of time-dependant data. The reviewer has understood correctly that only lambda 0 is sensitive to ICP tuning dependencies. Centring the data does not change the shape of the data, just the central tendency and this allows for comparison of data that have not yet been corrected against a known ratio (e.g. NIST glass, GJ). Total signal time is likely to have negligible impact on lambda 1, but may have a greater impact on lambdas 2 and higher due to the more complicated profiles these represent (sinuosity for example). This potentially would have some impact on where they plot on UMAP, but they should still plot close to analyses from the same material. Some of the variation seen for a single material (e.g. GJ1) may be explained by this effect. This can be elaborated on in the manuscript and looked at in more detail. The time component of a modelled signal is corrected to laser-fire and signal-stabilisation (after initial intensity ramping) to mitigate some of the differences that would occur from using a delayed subset of a signal (e.g. 10 seconds after the signal stabilised).
  For the minor comments:
  
  We agree that geometry is a better term as it accounts for shape, diameter, and depth.
  We will correct the missing E in the scientific notation, and will change to "value*10^X" notation if that is the Geochronology formatting recommendation.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2908-AC1
RC2:
'Comment on egusphere-2024-2908', Noah M McLean, 18 Feb 2025

This manuscript, which details a method to quantify the shape of laser ablation downhole fractionation profiles and compares them across minerals and instrument parameters from a single lab, is a solid contribution. Its quantitative, technical scope, and its methodological slant are well suited to Geochronology. In my opinion, the article needs somewhere between minor and major revisions, mostly to do with the clarity of the writing and quantitative explanations and figures. The orthogonal polynomial regression approach is sound, and the dataset is interesting, although applications for the technique have not been thoroughly explored.
The orthogonal polynomial regression approach used here is appropriate, and its use to quantify the shape of the DHF is an interesting contribution. I’ve never heard of UMAP, but it seems to be used appropriately. A quick survey of the GitHub repository shows that the code is relatively easy to follow, and major functions have informative docstrings. However, there are some issues with the regression framing (see comments around equations 7-9).
What’s missing from this publication is some explanation and clarity – on occasions throughout the text, I’m not sure what is being explained or plotted. I’ve enumerated those below. Another missing point, brought up by the previous review, is that the specific results discussed in this paper (most of the figures and discussion) are applicable to the authors’ analytical setup and choices. For instance, the shape parameters for DHF will depend not just on the mineral and the spot size but also on the choice of fluence. How were the fluences in Appendix A chosen? From lab to lab, these will vary based on sample cell setup, gas flows. It’s clear from reading the manuscript that the authors know this, but it’s not clear upfront to the reader.
It is also unclear how the authors propose to use the results of their orthogonal polynomial regression to correct for down-hole fractionation in the Applications section. On line 293, the authors suggest that “We envisage that this algorithm could be implemented in data reduction software to self-correct the DHF pattern of well-behaved materials…” I don’t know what this means, but it sounds like a very different approach than the two common DHF correction schemes, the ‘intercept’ method (used, e.g., by AZ LaserChron), or the ‘downhole’ method from Paton et al. implemented in Iolite. Can the authors describe such a self-correction and how it might be different from what’s used currently? This point is obtuse, and if the authors wish to write a forthcoming paper on a new algorithm, I’d suggest leaving it out here rather than the current vague mention.
Finally, I don’t see any documentation of how this algorithm was tested. Did you create synthetic data with a known data covariance matrix and known regression parameters, fit the data, and recover the input parameters? How does the scatter between regression parameters fit to many synthetic datasets compare with the uncertainties in the fit parameters estimated by the model?
Line-by-line comments follow.
Figure 2b: The trace of lambda_4 here is extraneous or in error. Based on the description in this portion of the manuscript, I would expect to see a fourth-order polynomial. If the coefficient is so small that the trace won’t plot on the y-axis of this figure, then it is not a good illustration. If it has been set to zero by an AICc cutoff explained much later in the text, then it’s quite confusing here. Is there some mathematical significance to the dot between the lambda_j and the variable x in the legend?
Figure 2a and caption: The ‘gmean-centered ratio’ y-axis label needs a detailed explanation for those who don’t regularly center their data with a geometric mean and are likely confused by this axis label. The first sentence of the caption doesn’t help much – please sacrifice some brevity for the sake of clarity. The x-axis label and color scale for 2a are confusing – do you need all those dates on the color scale? It’s the same length as the x-axis, which is confusing – I thought for a long time that the color scale corresponded to the ‘time since laser start’ and couldn’t figure out what was being plotted. Maybe make the color scale vertical and place it to the right of the axes with fewer dates?
Line 81: there is some confusion here about whether lambda_0 represents an arithmetic or geometric mean of the data. I like that this manuscript uses geometric means extensively. Why do the regression on ratios instead of log-ratios, though? The authors seem aware (e.g. in the appendix) about the challenges of dealing with compositional data. Those challenges extend not just to taking means, but to regression problems (same idea, more parameters). See countless Aitchison publications for details.
Line 115: I don't understand this paragraph as written. Specifically, what numbers are discrepant? Are lambda_1 and higher different for centered and non-centered fits of the same (e.g., 30-second) laser ablation analyses? This seems like a rounding or numerical precision problem. If the labda_1 and higher coefficients are different among separate laser ablation analyses (e.g., the first reference material analysis and the second), then this implies that the shape of the DHF is changing during the session, or that the ref mat is heterogeneous in the parameters that impart the DHF behavior.
Figure 4: Maybe thin this data out by randomly selecting 10% or 25% of the data to plot? That would prevent data overlap and occlusion problems and avoid aliasing effects.
Line 132: The example here is counts per second, but the figure shows a ratio on the y-axis. Is there a time when you'd fit an intensity rather than a ratio, and when would that fit be useful?
Line 136: What is N?
Equations 7-9: There is some lack of clarity here, along with some errors or typos.
In equation 7, if you want to leave that Sigma term in, you’ll want to take the hat off of the y. Usually, hat means the predicted values, which are given by multiplying the design matrix by the best-fit parameters (here, Lambda), usually also given a hat, which means there’s no error term in the equation. The error term here (Sigma in equation 7) would need to be a vector, not a matrix, and it is not given by equation 9, which describes the uncertainties in the best-fit parameters in Lambda, not the measurements in y.
In the generalized least squares framing, that Omega matrix in equation 8 is the covariance matrix for the measurements in y (or the residuals in a differently formulated equation 7). This matrix goes unmentioned in the main text, until line 425 in the Appendix. Where do the analytical uncertainties in equation 8 come from? I’ve looked at the code and the appendix and I’m still unsure. What terms go into the uncertainties? For ratios, are the numerator and denominator intensity measurements accounted for? Are detector effects like dead time included? What sort of assumptions are made here and how might they affect the results?
As far as I can tell from the appendix, these uncertainties are estimated from the variability of the measured ratios about the… geometric mean? But this wouldn’t make much sense for a time series where you expect a trend (and you want to measure its shape) – some of the variability would come from the analytical uncertainty and some from the trend itself. I can’t piece this together.
Looking at the code and the appendix, Omega is a diagonal matrix. My understanding is that this technically makes your algorithm a weighted least squares, rather than a generalized least squares, regression problem. Also, I think most folks would put a hat on Lambda here.
In equation 9, the Sigma contains the variances and covariances for the fit parameters (here, the lambdas). This step only works for generating uncertainties, confidence bands, etc, if the Omega is an inverse covariance matrix for the data in y. Note that in your GitHub code, lines 286-300 look like they're weighting by integer ones and maybe a few other approaches? I can't follow the code exactly and it's not commented. If you use ones as the diagonal of Omega, then equation 9 only gives you a Sigma that contains variances and covariances if the uncertainties are all independent (e.g., no covariance terms from detector effects like dead time) and identical to one in whatever the units of y are (unclear here and elsewhere).
Are the lambda uncertainties plotted throughout this paper calculated using input uncertainties to fit_orthogonal() or are some or all calculated with the default unit weights?
Figure 5, line 212: It is unclear to me what is going on here. I think I get (c), which describes the results of 5478 different orthogonal polynomial regression analyses (of what exactly?) versus time? Maybe 206/238? Or maybe an unspecified intensity in cps, per line 132?
What then are the 188(?) points in b? Maybe the (cps? ratio?) data from a single reference material (e.g. GJ1) have been lined up to a common time datum (the laser turning on?), and all the (centered?) data have been fit by the same orthogonal polynomials? If multiple datasets are combined, do they agree? One of those chi-squared values you mention in the model selection portion of the paper should tell you, but those are not reported here. The way that you’ve set up the regression in equations 7-9, the uncertainties in Sigma will get smaller with more data, no matter how scattered the data are about any one trend. Try it and see! This is the same effect as the weighted mean (a special case of weighted least squares) getting more precise when adding more analyses, even when those analyses don’t agree and the reduced chi squared grows large. Just like a dataset can have a weighted mean with a large reduced chi squared and a tiny uncertainty, your regression could have a large scatter about the trend you’ve described with orthogonal polynomials but tiny uncertainties.
Continuing on the comments from Figure 5b, what does this analysis tell us? Spell out where a user would one use these lambdas, rather than the lambdas from (c)? What about (a)??
Line 215: Spell out scientific notation here and in the figure axis labels, but I'd say just leave them out of the text altogether unless you mean to explain the physical significance (starting with the units) for each number.
Line 227: Describe very clearly here what you mean by “without needing reference material calibration,” or rephrase this. Maybe something along the lines of "the lambdas quantify the shape of the DHF for each of the unknown analytes and reference materials in a way that is independent of a shape derived from an average of reference material analysis (ref Paton).

Citation: https://doi.org/10.5194/egusphere-2024-2908-RC2
- AC2: 'Reply on RC2', Jarred Lloyd, 24 Mar 2025
  
  We thank Noah for his time and constructive comments on the manuscript.
  Starting with the final overview comment about how was the algorithm tested. We did not create a synthetic dataset and recover the input parameters, in hindsight this would have made testing simple, and we are happy to implement this to further validate the method as it would also be useful for debugging purposes. This method was developed upon the established methodology of O'Neill (2016) and Anenburg & Williams (2022) who have used it for modelling REE patterns quantitatively. Our algorithm was tested against theirs using identical datasets to ensure consistency in results and uncertainties. Our implementation of this algorithm is much more generalised allowing it to be used for the purposes in this manuscript, but can readily be applied to any data where a (up to five order) polynomial fit is sensible.
  While the reviewer has raised a similar concern to reviewer 1 about the results being applicable to only our analytical setup, we again believe this is not a limitation of the current scope of the study but would be happy to add older published data from the lead author that used a 213 nm laser paired with an Agilent 7500 as a supplementary figure. The scope of the study is to provide a tool that can reliably and quantitatively model DHF patterns to enable studies that assess the differences between the factors that influence DHF and differences between analytical setups, laboratories, etc., rather than make a full assessment of these. The assessment of these factors, and results from different laboratories would best be done in a separate (or several) community-led studies involving many institutes.
  Fluences in appendix A were chosen based on the analytical methods cited in the methods section. The chosen fluences for a given material have been used consistently for several years in the host laboratory with numerous studies published using those conditions. The initial studies aimed to optimise signal sensitivity, count rate, signal complexity, and pit depth while minimising DHF for a given material. As the reviewer has rightly stated, this will vary between laboratories but this is where the proposed method can be of significant use in enabling quantitative comparison between setups so long as the appropriate metadata (laser and tuning conditions) are known for a given fit. We can add a sentence or two to the manuscript highlighting this point.
  
  For the reviewer's overall commentary on the clarity, we will use their input to refine the manuscript to improve this aspect. One of the co-authors who is more removed from the mathematical and technical knowledge on this manuscript will look over the manuscript again with the reviewer's feedback in mind to assist further with this.
  Regarding the discussion of applications of the method. We will aim to clarify this further in a revision. Our intent is to provide the method so it could be used as a drop-in replacement for standard polynomial fitting used in current data reduction software (Iolite, LADR) for DHF correction whilst then providing the quantitative parameters and enable DHF correction uncertainty propagation. For our purposes, it is used to demonstrate the methods validity for down hole modelling, and we will use it to quantitatively assess the appropriateness of using one material to perform down-hole correction on another.
  
  The proposed method, could however be used to quantitatively model any geological process which has a (mathematically) linear relationship between x and y (for example, REE abundance ionic radii, as has been done previously).
  
  With specific regard to line 293, while we have not investigated how such an implementation would work in practice, we believe it would be possible to remove the dependence of the "down-hole method" (Paton el al.) on a homogeneous material for DHF correction by fitting each unknown directly. There are numerous parameters of fit that could numerically assess the sensibility of a fit to an individual signal, and if the signal were particularly noisy (due to things like inclusions) and the parameter of fit suggested such it could fall back to the standard "down-hole model" of a reference material. We may remove this sentence at the reviewer's recommendation, given the unexplored nature of this implementation. Regardless, we think it worthwhile elaborating on how the proposed method differs from the existing down-hole modelling methods as they are only briefly discussed in the manuscript's introduction.
  Figure 2: We would happily make modifications to the caption and figure to address the reviewer's comments here. For figure 2b we could remove λ₄ as it is effectively zero, and add a supplementary figure that displays the shape of each component (λ 0 - 4) more clearly. The dot between λ_j and variable x is intended to signify that is it λ_j multiplied by the orthogonal function φ(j) of x. More concretely, λ₁* φ₁(x) is the linear orthogonal polynomial component of x. This needs to be revised on the figure as they are in the wrong spot, it should be λ_j⋅φ(x).
  Line 81: It currently represents an arithmetic mean due to fitting on centred (i.e. geometric mean subtracted) ratios. These by definition end up centred around 0 with both positive and negative ratios causing log transformations to not be possible. This is more a limitation of how the data is initially ingested and the order in which operations are done. This could be overcome with a code rework. We don't see this as an issue currently, as we are only attempting to show this method is possible, and can make improvements to the code in future (contributions are welcome) to further improve the accuracy of the method from a compositional data perspective.
  Line 115: We'll work to revise this paragraph for clarity, as it is rather complicated as currently written.
  
  An example of what this paragraph was attempting to highlight: for session aggregated fits lambda 1-4 values for a non-centred and centred fit are slightly off from each other.
  
  For example, lambdas 1-4 for 91500 on 2020-02-24 are 0.00065959, -3.5747e-5, 4.6642e-7, 4.1489e-8 when not centred, and 0.00066039, -3.5474e-5, 5.2599e-7, 5.198e-8 when centred. After investigating this further, it appears to be an artefact of the data not being calibrated against a known reference material and their being some drift in the session (which has not been corrected for). This would cause underlying data to have slightly greater spread along the y-axis and would then impact λ₁ (the slope), which then impacts λ₂, and so on with each higher term being partially dependant on the term before it. Using a synthetic dataset and introducing "drift" across it will help confirm this.
  
  Importantly, individual analysis fits have identical lambda values for centred and non-centred datasets, implying that the algorithm operating correctly and this is an input data issue unique to session aggregated data.
  
  This further highlights the need to centre data when comparing different analyses or aggregating data to remove the influences of machine drift and tuning.
  Figure 4: We'll take this feedback on-board. The figure is there to demonstrate that we can compare data from different sessions without needing to first calibrate them to a known-ratio.
  Line 132: We'll change this to "ratio" for clarity. Fitting signal intensity rather than ratio might provide us with greater insight on how specific elements are behaving during ablation (as a result of ionisation efficiencies, volatilities etc) and could be a potential application of the method. We have not explored this though.
  Line 136: N is the total number of points. I can change this to "for all i from 1 to N" to improve clarity.
  Eqs 7-9: The hat is removed from y in an already revised version as this is a typographic error.
  
  The reviewer is correct in Eq. 9 being the variance-covariance of the lambda parameters, not the measurements in y, and the sigma term being a vector rather than a matrix. These are typographic errors. Eq. 9 is used to calculate the lambda values.
  
  These equations and corresponding text will be changed to reflect this.
  
  For Eq 8, notably the Omega term. The input uncertainties come from prior data ingestion steps, but they are not fully quantified with parameters like dead-time uncertainty, dwell time uncertainty etc. They are currently calculated from the underlying gas-blank subtracted count per second data for the ratio numerator and denominator intensities via the function load_agilent. These uncertainties are used as a weighting for the fit.
  
  The reviewer is also correct that the exact current implementation is a weighted least squares due to the diagonal Omega matrix, but this is a specialised case of generalised least squares. The intention of having the algorithm set up like this currently is to allow for true covariance-variance input (accounting for dwell time and dead time) in future (Omega would then not be a diagonal matrix).
  
  We welcome future, external discussion with the reviewer on this topic to refine the code accordingly but this does not impact the methodological foundations of the proposed fitting method as it gives results consistent to the REE fitting algorithm of Anenburg & Williams (2022) on their REE data.
  
  The plotted lambda uncertainties in this paper are all calculated by the fitting algorithm using input uncertainties as weights for the fit, and not the default unit weights that are provided in the algorithm as these are only a fall back if the user does not provide uncertainties for weighting.
  Figure 5, line 212: We would happily work through these comments to improve clarity here.
  
  For reference, (a) are the fits for an aggregated dataset for a specified sample from all sessions (e.g. GJ, centred, all sessions), (b) are the fits for an aggregated dataset for a specified sample within a session (e.g. GJ, centred, single session). x is λ₁, y is λ₂, of an orthogonal polynomial fit to U/Pb (y) against time (x). The data have been lined up to a common time datum, laser on using the automatic times algorithm implemented in the code.
  
  The reduced chi squared values are quite small, indicating either over-fitting of the data or the error variance has been overestimated. This is likely related to the set up of input and output analytical uncertainties, and as the reviewer has stated will lead to smaller uncertainties with increasing number of data points.
  
  I've implemented a normalised root-mean-square-error (NRMSE, 0 being a "perfect fit" and 1 being a "nonsensical fit" of the data). For the worst case sample using the least appropriate order (lowest AIC weight), this value is around 0.25 and for an appropriate fit, the value is usually between 0.08-0.16 suggesting good-quality fits.
  
  NRMSE values from the sample aggregated data (i.e. all of 91500), session fits (91500 per session) and analysis fits (91500 per analysis) very consistent across these models. I would expect to see greater variability in the NRMSE across the different aggregations if the input data were increasing the overall scatter due to them not following the same linear (x to y) relationship as each other.
  
  These measures of fit aren't directly reported in the manuscript as they are directly available from the provided code and data, and it would create an excessively large table that is quite difficult to understand.
  
  There is room for improvement in the uncertainty quantification and use of these measures of fit, with particular consideration of some of the pitfalls in reduced chi squared for linear regression (Andrae et al. 2010, preprint, http://arxiv.org/abs/1012.3754). However, this is will not change the overall conclusions drawn from this work, nor does it prevent the use of the proposed method, as the underlying fitting algorithm is well established in literature and as the reviewer has confirmed, is mathematically sound. The code is open source, and we welcome further discussion and contributions from the reviewer (and any other interested party) to improve the uncertainty quantification of the lambda parameters in the code.
  We'll take on the feedback to highlight where a user should utilise analysis or aggregated data lambda values. For individual down-hole fitting, analysis fitting is what should be used (i.e. assessing how well modelled a specific analysis is), session based aggregation would be used to check within lab variability, and sample based aggregation would be used for between lab variability/long term assessment of reference materials or to choose a suitable reference material to use as a calibration for an unknown.
  Line 215: We will use this feedback to make improvements here. From my current understanding, the lambda values are unit less. I.e. λ₀ is derived from the mean cps/cps (units cancel) and all other terms are multipliers of functions, that themselves have units, e.g. λ_j*φ_j(x) where, for example, φ₂ = (x_i - γ₁)(x_i - γ₂) and has units of x² (seconds squared).
  Line 227: We will take this feedback on board as it also assists with the prior two comments. In essence λ₀ is the only value sensitive to ICP-tuning and drift, thus if data are first centred to account for these factors the values of λ₁ and higher, should only represent the shape parameters of an analytical signal and not change with respect to a centred and non-centred dataset of single analysis fits. For aggregated datasets, if the values of λ₁ and higher are not identical for centred and non-centred datasets this suggests an external influence is not yet accounted for (i.e. machine drift within a session or intersession tuning).
  
  Citation: https://doi.org/10.5194/egusphere-2024-2908-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish subject to minor revisions (further review by editor) (01 Apr 2025) by Axel Schmitt

AR by Jarred Lloyd on behalf of the Authors (01 May 2025) Author's response Author's tracked changes Manuscript

ED: Publish subject to technical corrections (21 May 2025) by Axel Schmitt

ED: Publish subject to technical corrections (21 May 2025) by Klaus Mezger (Editor)

AR by Jarred Lloyd on behalf of the Authors (22 May 2025) Author's response Manuscript

Journal article(s) based on this preprint

05 Aug 2025

The quantification of down-hole fractionation for laser ablation mass spectrometry

Jarred C. Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Geochronology, 7, 265–287, https://doi.org/10.5194/gchron-7-265-2025,https://doi.org/10.5194/gchron-7-265-2025, 2025

Short summary

Jarred Cain Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Data sets

Raw and derived data: The quantification of downhole fractionation for laser ablation mass spectrometry Jarred Lloyd and Sarah Gilbert https://doi.org/10.25909/26778298

Supplementary analyte signal figures - The quantification of downhole fractionation for laser ablation mass spectrometry Jarred Lloyd https://doi.org/10.25909/26778592

Supplementary Figures - The quantification of downhole fractionation for laser ablation mass spectrometry Jarred Lloyd https://doi.org/10.25909/27041821

Model code and software

Julia scripts - The quantification of downhole fractionation for laser ablation mass spectrometry Jarred Lloyd https://doi.org/10.25909/26779255

Jarred Cain Lloyd, Carl Spandler, Sarah E. Gilbert, and Derrick Hasterok

Viewed

Total article views: 743 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
316	112	315	743	23	36

HTML: 316
PDF: 112
XML: 315
Total: 743
BibTeX: 23
EndNote: 36

Views and downloads (calculated since 10 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	98	26	7	131
Nov 2024	48	19	3	70
Dec 2024	17	10	45	72
Jan 2025	16	10	47	73
Feb 2025	25	6	41	72
Mar 2025	27	13	56	96
Apr 2025	22	9	45	76
May 2025	20	5	47	72
Jun 2025	21	6	24	51
Jul 2025	21	8	0	29
Aug 2025	1	0	1

Cumulative views and downloads (calculated since 10 Oct 2024)

Month	HTML	PDF	XML	Total
Oct 2024	98	26	7	131
Nov 2024	48	19	3	70
Dec 2024	17	10	45	72
Jan 2025	16	10	47	73
Feb 2025	25	6	41	72
Mar 2025	27	13	56	96
Apr 2025	22	9	45	76
May 2025	20	5	47	72
Jun 2025	21	6	24	51
Jul 2025	21	8	0	29
Aug 2025	1	0	1

Viewed (geographical distribution)

Total article views: 751 (including HTML, PDF, and XML) Thereof 751 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 05 Aug 2025

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint (6718 KB)
Metadata XML

Short summary

Laser-based dating of rocks and minerals is invaluable in geoscience. This study presents a significant advancement in our ability to model and correct for a process called downhole fractionation (DHF) that can impact the accuracy and uncertainty of dates. We develop an algorithm that quantitatively models DHF patterns. The implications are far-reaching: improved accuracy, reduced uncertainty, and easier comparisons between different samples and laboratories.


Total:	0
HTML:	0
PDF:	0
XML:	0