the Creative Commons Attribution 4.0 License.
The conflict between sampling resolution and stratigraphic constraints from a Bayesian perspective: OSL and radiocarbon case studies
Abstract. Bayesian modelling is often implemented in geochronology and its applications to geomorphology, archaeology, etc. The rationale behind such practices is the aim of improving robustness, precision and accuracy through the use of prior knowledge regarding the studied sites, in particular the order of samples constrained by stratigraphy. All chronological models tested in this study (OxCal, Chronomodel and BayLum) use the same mathematical model to handle stratigraphic constraints. However, this model has been shown to lead to estimation biases. First, this bias is illustrated with BayLum modelling on a high-resolution OSL dataset. Then, this paper compares statistical inferences obtained with the three above-mentioned software packages on the Neolithic East mound of Çatalhöyük (Turkey). For this site, 49 radiocarbon ages were obtained with the aim of determining the start of occupations at this locality. Notably, age uncertainties are rather large because of calibration curve plateaus, so the conditions for estimation biases are met. We discuss the behaviour of the different models and show that caution is required when modelling results are at odds with measurements. While OxCal, Chronomodel and BayLum are all affected by a spread in ages resulting from their common model of stratigraphic errors, Chronomodel suffers from a great loss of precision and OxCal, through the phase model, concentrates ages undesirably. We also conclude that the onset of occupations at Çatalhöyük was probably earlier than previously thought based on the OxCal model.
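The spread in ages mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the implementation used by OxCal, Chronomodel or BayLum; it assumes a deliberately simplified model in which five samples share the same measured age with the same Gaussian uncertainty, and the only prior information is a flat prior restricted to the stratigraphic order. The function name `ordered_posterior_means`, the age of 9000 and the uncertainty of 100 are hypothetical values chosen for illustration. Sampling is done by simple rejection: draws from the unconstrained posteriors are kept only when they respect the order.

```python
import numpy as np

def ordered_posterior_means(measured, sigma, n_draws=200_000, seed=0):
    """Posterior means under a flat prior restricted to age_1 < ... < age_n.

    Toy model (assumption): independent Gaussian likelihoods with a common
    standard deviation `sigma`, sampled by rejection.
    """
    rng = np.random.default_rng(seed)
    measured = np.asarray(measured, dtype=float)
    # Draw from the unconstrained (independent Gaussian) posteriors.
    draws = rng.normal(loc=measured, scale=sigma, size=(n_draws, measured.size))
    # Keep only the draws that satisfy the stratigraphic order.
    keep = np.all(np.diff(draws, axis=1) > 0, axis=1)
    return draws[keep].mean(axis=0), int(keep.sum())

# Five samples with identical measured ages (hypothetical values).
measured = np.full(5, 9000.0)
sigma = 100.0

means, n_kept = ordered_posterior_means(measured, sigma)
spread = means[-1] - means[0]

print("accepted draws:", n_kept)
print("posterior means:", np.round(means, 1))
print("spread between first and last mean:", round(spread, 1))
```

Although all five measurements are identical, the constrained posterior means are pushed apart by an amount comparable to the measurement uncertainty itself. This is the situation the abstract describes: when uncertainties are large relative to the true age differences, the order constraint alone can generate an apparent duration.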
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-890', Anonymous Referee #1, 20 May 2025
Dear Editor(s) and Authors,
I have read the manuscript ‘The conflict between sampling resolution and stratigraphic constraints from a Bayesian perspective: OSL and radiocarbon case studies’ in detail. It is a well-structured and well-written manuscript, and the scope fits Geochronology. You address a long-standing issue in age-depth modeling, and your contribution is relevant. While your conclusions are not entirely new, they are based on real and well-selected datasets, and your manuscript is a welcome addition to the scientific literature. I clearly support publication in Geochronology after revisions.
The authors use two case studies to point out challenges with Bayesian age-depth modeling and rightly demonstrate that the tested models are not without bias and artifacts. I particularly like, and agree with, the repeated call for applying common sense and looking at data with the experienced eye of a geochronologist even when applying models, e.g. (quotes from your submission follow) ‘As frustrating as it may be, in our view none of the tested models can tell us anything better than the actual data themselves’, and ‘when testing any chronological model, it is of utmost importance to compare the model outcome with the input data.’ I fully agree and find this an important lesson: look at the data, know the possible issues, and then consider whether a model is of any help.
The final statement reads: ‘Our study shows that this goal [make use of prior observations to refine the precision, accuracy and robustness] is difficult to reach and that using models to correct measurements appears to be dangerous.’ In my opinion, that really depends on the case and the individual data structure, and such a general statement should at least be softened when based on only two datasets (why is no reference made to the frequently used models BChron and Bacon?), and on few datasets that are indeed challenging.
With this I come to my main criticism of this manuscript: the arbitrary selection of models, seemingly influenced by previous work of the authors. When speaking of luminescence modeling, I ask you to refer to ADMin (https://www.sciencedirect.com/science/article/pii/S187110141730047X), probably the model least affected by the spread effect (?), but at the same time slow or unsuitable for large (and these?) datasets. More generally, I disagree with the BChron and Bacon models not even being mentioned, as these are very widely used.
Further comments.
References to Ramsey should in my opinion be to Bronk Ramsey
Line 84: ‘event model of Lanos and Philippe (2018)’ – could you please introduce this one – it is less known than the one by Bronk Ramsey which you introduce in detail
Line 115: please explain ‘Theta matrix’
160ff: Can BayLum model 14C ages!? - that would be different than luminescence modeling, because here 'only' the 14C age is used?
In Fig. 3 (and others) please include original ages.
Generally, I find your figures would benefit from clearer explanation in the captions and from systematically placing units on the axes; ideally all would be on the same age scale (ka or BC, please don't mix here).
Abscissa of Fig. 3 : space before bracket missing
Figure 5 and its explanation: ordinate unclear. Why was this only done for BayLum?
Line 278: please explain the phase structure here
284: ‘between samples OxA-9893 and OxA-23251’ – please mark in Figure so that these can easily be found
286f: I disagree with your statement ‘These two bottom-most samples are PL-980252A, whose age lies outside the calibrated age of all samples above’ – the densities do overlap
Chapter 3.2.2. and Fig. 8 are limited to the lower 17 samples: was the model run for all samples or only for these?
Line ~322: please highlight where the spread effect is pronounced and why
Fig. 10: units on both axes missing – please also include original dating - either as distribution or mean ages.
352ff: given that Chronomodel and OxCal partly do not overlap, the praising of larger uncertainty alone seems unjustified.
In chapter 4.1. I find a prominent feature missing: the duration of the sequence when using OxCal is much shorter than when using BayLum or Chronomodel. This is worrying in my opinion, and the OxCal results seem much more similar to the original ages than the BayLum and Chronomodel results. Especially the outer model ends seem unrealistically long in BayLum and Chronomodel. The spread effect of the whole sequence therefore seems best captured by OxCal.
In line 415 I suggest reference to
https://www.sciencedirect.com/science/article/pii/S0277379103003160
https://journals.sagepub.com/doi/full/10.1177/0959683616675939
It is really good to see the computer code in the Supplements. Yet I am wondering why this is only the case for one of the two examples. Further, the R code would benefit from better documentation; please add it so that colleagues unfamiliar with R can also follow what is done and why.
Further, I would like you to provide results (data plotted in Figures) in Supplements.
I am aware of issues with suggesting literature in the review process, and I am asking the editors to have a critical look at these – yet I ask you to consider including the information contained within the suggested literature in your manuscript.
Kind Regards,
Citation: https://doi.org/10.5194/egusphere-2025-890-RC1
RC2: 'Comment on egusphere-2025-890', Anonymous Referee #2, 28 Jun 2025
Overview (please see attachment for more detailed comments)
This paper presents two worked case study examples that show three different modelling approaches can lead to strikingly different results when applied to the same data.
I think this is an interesting result, and important for readers who might be using these models to understand, but I do feel that the current exposition is really quite unclear – in terms of sufficiently describing what the different models do; their interpretation; and the potential reasons for any differences in the resulting inference.
They argue that the differences they see between models are due to the way that they handle stratigraphic ordering, but it is not entirely clear to me that it is solely this – as they also seem to argue that the models implement this stratigraphic information in very similar ways.
If that is the case, then I would expect similar results across the models. Instead it suggests to me that either they actually handle stratigraphy quite differently, or that there are more fundamental differences between the three models (which are not explained) that effectively lead to quite different modelling assumptions; or that (some of) the models have perhaps not converged correctly.
Specifically, they compare:
- BayLum when modelling OSL dates
- OxCal, BayLum and ChronoModel when used to analyse a selection of 14C dates from Çatalhöyük
As I said, I think the overall manuscript has highly useful and valuable content for the community and I would recommend publication, but IMO the overall narrative and level of clear explanation really does need to be improved if it is to be of substantial use to the relevant community.
I have provided some major comments in the attached giving my opinion on:
- Providing more detailed descriptions of the specific models (with explicit likelihoods, priors, ...) to better enable comparison and differences between them
- Relationship with Nicholls and Jones (2001), where I think there may be some misunderstanding
- Conclusion and Assessment of Model Appropriateness
I hope these comments are useful.
Viewed
- HTML: 196
- PDF: 68
- XML: 15
- Total: 279
- Supplement: 23
- BibTeX: 11
- EndNote: 27