the Creative Commons Attribution 4.0 License.
Technical Note: Cluster Analysis of Inverse Thermochronology Models
Abstract. Thermochronological inverse modeling may produce none-unique solution, that can group different thermal histories. Objective identification of such groups, also referred to as "path families", is challenging and greatly benefits from dimension-reducing exploratory data analysis tools. This article proposes a statistical algorithm to overcome these challenges. We show that Hausdorff and Fréchet distances are viable dissimilarity measures for ordered point sets, such as time-temperature paths. Clustering the pairwise dissimilarities between modeled thermal histories reveals distinct groups of thermal histories for a given sample or set of samples. As demonstrated by clustering a natural example, automated path-clustering allows for an objective and reproducible interpretation and may be particularly useful for samples with poor prior knowledge of the time-temperature history. To allow adoption of the method by the thermochronology community, the methods introduced in this article are freely available through the software package thermoclustr, written in the programming language R.
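For readers unfamiliar with the two dissimilarity measures named in the abstract, the following is a minimal Python sketch of the symmetric Hausdorff distance and the discrete Fréchet distance between two time-temperature paths, plus a pairwise dissimilarity matrix. It is an illustration only and not the thermoclustr implementation: the function names, the example paths, and the choice to rescale time and temperature to the unit interval (`t_max`, `temp_max`) are assumptions made here.

```python
import math


def normalize(path, t_max, temp_max):
    """Rescale (time, temperature) points to [0, 1] so both axes weigh equally.

    The scaling constants are an assumption of this sketch, not a
    recommendation from the manuscript.
    """
    return [(t / t_max, temp / temp_max) for t, temp in path]


def hausdorff(path_a, path_b):
    """Symmetric Hausdorff distance between two point sets (order is ignored)."""
    def directed(p, q):
        # Largest distance from any point in p to its nearest point in q.
        return max(min(math.dist(a, b) for b in q) for a in p)
    return max(directed(path_a, path_b), directed(path_b, path_a))


def discrete_frechet(path_a, path_b):
    """Discrete Fréchet distance by dynamic programming (order matters)."""
    n, m = len(path_a), len(path_b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(path_a[i], path_b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[n - 1][m - 1]


def pairwise(paths, dist):
    """Full dissimilarity matrix, the input a clustering step would consume."""
    return [[dist(p, q) for q in paths] for p in paths]


# Hypothetical paths as (time in Ma, temperature in deg C); not real model output.
path_1 = normalize([(400, 20), (300, 150), (100, 100), (0, 10)], 400, 200)
path_2 = normalize([(400, 20), (200, 60), (100, 100), (0, 10)], 400, 200)
D_hausdorff = pairwise([path_1, path_2], hausdorff)
D_frechet = pairwise([path_1, path_2], discrete_frechet)

# A path and its reverse share the same points but not the same ordering.
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
line_reversed = list(reversed(line))
print(hausdorff(line, line_reversed))         # 0.0
print(discrete_frechet(line, line_reversed))  # 2.0
```

The reversed-line case shows why the distinction matters for ordered point sets: the Hausdorff distance treats the two paths as identical, whereas the Fréchet distance does not, because it respects the order in which the points are traversed.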
Status: final response (author comments only)
RC1: 'Comment on egusphere-2026-1279', Kalin McDannell, 21 Apr 2026
AC1: 'Reply on RC1', Tobias Stephan, 02 Jun 2026
Comment 1: Stephan et al. present an automated statistical framework for clustering time-temperature paths (so-called "path families") from thermochronological inverse modeling, implemented in the R package thermoclustr. While the technical execution is competent and the software may have practical utility, the paper rests on a premise that warrants closer scrutiny.
- Response: We thank the reviewer for recognizing the competent technical execution and practical utility of the software as this aligns with the goal of this technical note.
Broader questions regarding the underlying premises and the evaluation of cluster significance are beyond the scope of the present contribution. These aspects have already been addressed in earlier studies (see ll. 19-23 in our initial manuscript) and are discussed in greater detail in an upcoming article (as stated in ll. 275-275). Following the suggestions of Reviewer #2, the revised manuscript will cite more of the relevant publications that discuss the key concepts and methodological foundations in detail.
Comment 2: The authors treat distinct path clusters as geologically meaningful thermal history scenarios, but the 'dispersed' ensemble of acceptable t-T paths (produced by the HeFTy software) exists precisely because the thermochronological data cannot distinguish between them—they are all valid solutions within data uncertainty. Clustering a t-T ensemble does not necessarily reveal meaningful geological signal; it simply partitions solutions that the data are fundamentally unable to distinguish due to poor resolution. The non-uniqueness of inverse thermal history models is a statement about the resolving power of the data, and subdividing the acceptable solution space into groups does not change that fundamental limitation. A secondary but related concern is that path geometry is often already heavily structured by user-defined t-T constraints or "exploration boxes", meaning that recovered clusters for many inversions (especially where data sensitivity is low or absent) largely reflect the topology of modeling choices rather than nuanced thermal history signal.
- Response: We agree with all the concerns raised by the reviewer. As stated above, this is beyond the scope of this technical note manuscript that provides the mathematical and statistical framework for the cluster analysis. We believe a full repetition of the issues is not warranted. We do highlight the key premises in the Introduction, and address the issues raised by our statistical analysis in the Discussion (“5.1 Limitations and outlook”).
Comment 3: The natural example presented in Section 4 inadvertently highlights some of these limitations. Three clusters are recovered from the HeFTy inversion of data for sample 112-73, a Paleoproterozoic granodiorite from the southwestern Northwest Territories overlain by Devonian strata of the Western Canada Sedimentary Basin (truncated model shown in Figure 6). All three clusters represent acceptable fits to the same thermally reset apatite (U-Th)/He and fission-track data, yet they show different burial and exhumation histories distinguished by heating rates, peak temperatures, and cooling onset. Rather than the clustering algorithm identifying the geologically meaningful solution, the authors sequentially reject Clusters 1 and 2 by appealing to regional stratigraphic constraints from Morrow et al. (1993) and t-T constraints from nearby Mississippi Valley-type mineralization, ultimately endorsing Cluster 3 as the most plausible thermal history. This is the interpretive workflow that geoscientists already apply when evaluating inverse t-T modeling results. The clustering step alone added little discriminating power and the plausible thermal history was identified by geological reasoning, not by the algorithm. Unfortunately, without a clearer discussion of when clustering yields genuinely interpretable results versus when it simply reflects data or modeling limitations, the method risks being applied uncritically in contexts where the data do not support the level of thermal history discrimination that clustering implies.
- Response: As stated in the manuscript (l. 6 and ll. 246-259), clustering is only useful when time-temperature paths are poorly constrained or even unconstrained. The statistical tests employed here (e.g. Hopkins test) demonstrate that, in some cases, meaningful structure may still exist with otherwise unconstrained time-temperature solutions, and that such structure can be extracted using cluster analysis.
We agree that, where sufficient geological constraints are available, neither cluster analysis nor thermochronological modelling may be necessary. The example presented here was intentionally selected to demonstrate the capability of the clustering approach. In this case, certain clusters can be favored because independent geological information exists that allows alternative clusters to be excluded; this was done purely for demonstrative purposes. In practical applications, however, cluster analysis is most valuable in situations where such prior information is absent, as it may help identify otherwise unrecognized structures within the data and thereby support geological interpretation. More examples, covering various geological scenarios in which cluster analysis is most useful, are provided in our paper by Pinto et al. (submitted to GChron on May 10, 2026).
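The Hopkins test mentioned in this response can be illustrated with a short Python sketch. This is a simplified variant of the Hopkins statistic written for illustration, not the test as implemented in thermoclustr. It assumes that each thermal history has already been reduced to a point in some feature space (a hypothetical preprocessing step not described here), samples random locations uniformly within the bounding box of the data, and omits the dimension exponent that some formulations apply to the distances. The function name and parameters are likewise assumptions.

```python
import math
import random


def hopkins(points, m=None, seed=0):
    """Simplified Hopkins statistic H = sum(u) / (sum(u) + sum(w)).

    u: nearest-data-point distances from m uniformly random locations
       inside the bounding box of the data.
    w: nearest-neighbour distances for m randomly sampled data points.
    Under this convention H is near 0.5 for spatially random data and
    approaches 1 for strongly clustered data; some software reports 1 - H.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])
    m = m or max(1, n // 10)
    lo = [min(p[k] for p in points) for k in range(dim)]
    hi = [max(p[k] for p in points) for k in range(dim)]

    u = []
    for _ in range(m):
        q = [rng.uniform(lo[k], hi[k]) for k in range(dim)]
        u.append(min(math.dist(q, p) for p in points))

    sampled = rng.sample(range(n), m)
    w = [
        min(math.dist(points[i], points[j]) for j in range(n) if j != i)
        for i in sampled
    ]
    return sum(u) / (sum(u) + sum(w))


# Hypothetical feature points forming two tight, well-separated groups.
group_a = [(0.01 * i, 0.01 * i) for i in range(20)]
group_b = [(10 + 0.01 * i, 10 + 0.01 * i) for i in range(20)]
h_clustered = hopkins(group_a + group_b)
print(h_clustered)
```

A value well above 0.5 only indicates that the points are not spatially random. It does not show that the groups are geologically meaningful, which is the concern raised in the comment.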
Citation: https://doi.org/10.5194/egusphere-2026-1279-AC1
RC2: 'Comment on egusphere-2026-1279', Anonymous Referee #2, 28 Apr 2026
“Technical Note: Cluster Analysis of Inverse Thermochronology Models” by Tobias Stephan and co-authors is an important contribution that could be published with some minor revisions. It is also a clear and logical step to apply clustering algorithms in place of manual selection of paths. In particular, there are lots of grammatical errors that should be easy to fix. Additional references are required and a longer discussion about the potential issues would be helpful.
There are lots of typos in the document. For example, the first sentence of the abstract has errors. “Abstract. Thermochronological inverse modeling may produce none-unique solution, that can group different thermal histories.” First Thermochronological inverse modelling is not limited to thermal history modelling. There are now lots of tools used to estimate different parameters from thermochron data. These include tools (e.g., Pecube, GLIDE) designed to extract exhumation rates, landscape change parameters or other things from thermochron data. It should be non-unique not “none-unique”. It should be solutions not solution. The second clause doesn’t make sense. There are lots of errors like these throughout the manuscript that need to be fixed.
My second concern, is that the example in Figure 6 shows that there is essentially no difference between the paths where there is resolution. In the time interval from 0 to 98 Ma, which is the time between the mean AFT age and today, and in the temperature interval from 100 degrees to 0, which is the temperature sensitivity of the data, there is no difference between the clusters. In other words, the differences between the clusters is all where there is no resolution any way, and the clusters all overlap showing that no distinct clusters have been identified. The data constrain cooling from 100 Ma and 100C to the present and the only differences between the clusters is due to the requirement of the clustering algorithms to cluster the data. If the parameter space was explored more fully, potentially there would be no distinct clusters unless the number of clusters was defined. In this way, I think the algorithm gives the impression that there are distinct clusters when really there should not be. Please add a box showing where the models are well resolved.
The aim to pull apart a continuous and smoothly varying distribution of paths, into a predefined number of clusters is worrying because an apparent increase in model resolution is achieved. These issues should be carefully discussed. For this reason, the manual selection of path families should probably be preferred over automatic detection of clusters that may not be very useful. This is reinforced by the discussion in Section 4. If you know that the sample had to be at 230C at 350 MA, why not find a path family that goes through this constraint. Or just add a constraint box?
It is also important to cite other work that looks at clustering of paths. A very relevant paper to cite here would be Willett (Willett, S.D., 1997. Inverse modeling of annealing of fission tracks in apatite; 1, A controlled random search method. American Journal of Science, 297(10)) who use a model similar to HEFTY to find thermal histories consistent with AFT data. He then used clustering analysis to determine how well resolved the timing of reheating is. In Fox and Shuster (Fox, M. and Shuster, D.L., 2014. The influence of burial heating on the (U–Th)/He system in apatite: Grand Canyon case study. Earth and Planetary Science Letters, 397, pp.174-183.), the authors pulled apart a distribution of likely thermal histories to reveal how different parts of a thermal history are correlated. This was further developed in Fox and Carter (Fox, M. and Carter, A., 2020. Heated topics in thermochronology and paths towards resolution. Geosciences, 10(9), p.375.) where they did this for a result from QTQT to determine marginal conditional posterior probabilities.
Citation: https://doi.org/10.5194/egusphere-2026-1279-RC2 -
AC2: 'Reply on RC2', Tobias Stephan, 02 Jun 2026
Comment 1: There are lots of typos in the document. For example, the first sentence of the abstract has errors. “Abstract. Thermochronological inverse modeling may produce none-unique solution, that can group different thermal histories.” First Thermochronological inverse modelling is not limited to thermal history modelling. There are now lots of tools used to estimate different parameters from thermochron data. These include tools (e.g., Pecube, GLIDE) designed to extract exhumation rates, landscape change parameters or other things from thermochron data. It should be non-unique not “none-unique”. It should be solutions not solution. The second clause doesn’t make sense. There are lots of errors like these throughout the manuscript that need to be fixed.
- Response: We will try our best to fix these typos and grammatical errors.
Comment 2: My second concern, is that the example in Figure 6 shows that there is essentially no difference between the paths where there is resolution. In the time interval from 0 to 98 Ma, which is the time between the mean AFT age and today, and in the temperature interval from 100 degrees to 0, which is the temperature sensitivity of the data, there is no difference between the clusters. In other words, the differences between the clusters is all where there is no resolution any way, and the clusters all overlap showing that no distinct clusters have been identified. The data constrain cooling from 100 Ma and 100C to the present and the only differences between the clusters is due to the requirement of the clustering algorithms to cluster the data. If the parameter space was explored more fully, potentially there would be no distinct clusters unless the number of clusters was defined. In this way, I think the algorithm gives the impression that there are distinct clusters when really there should not be. Please add a box showing where the models are well resolved.
- Response: The reviewer is right in their observation that there is no difference between clusters for the <98 Ma time window because this portion of the thermal history is well constrained by thermochronometric data. The goal of clustering is to explore the structure within the ‘unconstrained’ T-t areas, as we state in ll. 267-271: “It is particularly beneficial for datasets characterized by large analytical or statistical dispersion, where thermal histories are difficult to interpret using traditional visual inspection alone. For example, the approach may be useful when the apatite or zircon thermochronometer were fully or partially reset. The method is well suited for poorly constrained inverse models with broad constraint boxes, as well as detrital samples that inherently represent mixtures of multiple thermal histories and therefore require unmixing approaches.” We could strengthen this section by stating specifically that clustering is not needed for well-constrained portions of thermal models. Examples of when clustering is most useful and when it is not are provided in Pinto et al. (submitted).
We disagree with the reviewer’s comment that “...the differences between the clusters is all where there is no resolution any way,…”. The pre-98 Ma time is constrained by the AFT data (Arne 1991). The single-grain ages spread widely from 65 to 155 Ma, and the confined track length measurements (N=100) show a bimodal distribution suggesting partial resetting. Hence, the T-t space prior to 100 Ma is constrained by the AFT data, and the cluster analysis makes it possible to identify the structure in this space. In a revised version, we will add the single-grain ages to Fig 1 to better show the data that were used in the model.
Comment 3: The aim to pull apart a continuous and smoothly varying distribution of paths, into a predefined number of clusters is worrying because an apparent increase in model resolution is achieved. These issues should be carefully discussed. For this reason, the manual selection of path families should probably be preferred over automatic detection of clusters that may not be very useful. This is reinforced by the discussion in Section 4. If you know that the sample had to be at 230C at 350 MA, why not find a path family that goes through this constraint. Or just add a constraint box?
- Response: We agree, and as stated in our original manuscript l. 202:
“It should be noted that the number of clusters must be larger than one and should be as small as possible to avoid overfitting and artificial grouping of the data.”
Before making any assumptions about the underlying structure of the data, we strongly recommend using the statistically estimated number of clusters as an initial starting point. Ultimately, however, it is the user’s responsibility to balance between information gain and loss by employing a larger or smaller number of clusters.
The example presented in this technical note was intentionally chosen to demonstrate the utility and potential of the clustering approach. For this reason, we refrain from adding further constraint boxes, as doing so would undermine the purpose of the example. As explained in the manuscript and discussed above, clustering is of limited value when the data are already well constrained or over-constrained.
Comment 4: It is also important to cite other work that looks at clustering of paths. A very relevant paper to cite here would be Willett (Willett, S.D., 1997. Inverse modeling of annealing of fission tracks in apatite; 1, A controlled random search method. American Journal of Science, 297(10)) who use a model similar to HEFTY to find thermal histories consistent with AFT data. He then used clustering analysis to determine how well resolved the timing of reheating is. In Fox and Shuster (Fox, M. and Shuster, D.L., 2014. The influence of burial heating on the (U–Th)/He system in apatite: Grand Canyon case study. Earth and Planetary Science Letters, 397, pp.174-183.), the authors pulled apart a distribution of likely thermal histories to reveal how different parts of a thermal history are correlated. This was further developed in Fox and Carter (Fox, M. and Carter, A., 2020. Heated topics in thermochronology and paths towards resolution. Geosciences, 10(9), p.375.) where they did this for a result from QTQT to determine marginal conditional posterior probabilities.
- Response: We are very thankful for this comment and will make sure that credit is given to earlier research by adding the suggested references.
Citation: https://doi.org/10.5194/egusphere-2026-1279-AC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 529 | 260 | 40 | 829 | 38 | 44 |
Stephan et al. present an automated statistical framework for clustering time-temperature paths (so-called "path families") from thermochronological inverse modeling, implemented in the R package thermoclustr. While the technical execution is competent and the software may have practical utility, the paper rests on a premise that warrants closer scrutiny. The authors treat distinct path clusters as geologically meaningful thermal history scenarios, but the 'dispersed' ensemble of acceptable t-T paths (produced by the HeFTy software) exists precisely because the thermochronological data cannot distinguish between them—they are all valid solutions within data uncertainty. Clustering a t-T ensemble does not necessarily reveal meaningful geological signal; it simply partitions solutions that the data are fundamentally unable to distinguish due to poor resolution. The non-uniqueness of inverse thermal history models is a statement about the resolving power of the data, and subdividing the acceptable solution space into groups does not change that fundamental limitation. A secondary but related concern is that path geometry is often already heavily structured by user-defined t-T constraints or "exploration boxes", meaning that recovered clusters for many inversions (especially where data sensitivity is low or absent) largely reflect the topology of modeling choices rather than nuanced thermal history signal.
The natural example presented in Section 4 inadvertently highlights some of these limitations. Three clusters are recovered from the HeFTy inversion of data for sample 112-73, a Paleoproterozoic granodiorite from the southwestern Northwest Territories overlain by Devonian strata of the Western Canada Sedimentary Basin (truncated model shown in Figure 6). All three clusters represent acceptable fits to the same thermally reset apatite (U-Th)/He and fission-track data, yet they show different burial and exhumation histories distinguished by heating rates, peak temperatures, and cooling onset. Rather than the clustering algorithm identifying the geologically meaningful solution, the authors sequentially reject Clusters 1 and 2 by appealing to regional stratigraphic constraints from Morrow et al. (1993) and t-T constraints from nearby Mississippi Valley-type mineralization, ultimately endorsing Cluster 3 as the most plausible thermal history. This is the interpretive workflow that geoscientists already apply when evaluating inverse t-T modeling results. The clustering step alone added little discriminating power and the plausible thermal history was identified by geological reasoning, not by the algorithm. Unfortunately, without a clearer discussion of when clustering yields genuinely interpretable results versus when it simply reflects data or modeling limitations, the method risks being applied uncritically in contexts where the data do not support the level of thermal history discrimination that clustering implies.