The conflict between sampling resolution and stratigraphic constraints from a Bayesian perspective: OSL and radiocarbon case studies

Guérin, Guillaume; Guitton-Boussion, Pierre; Bouafia, Imène; Philippe, Anne

doi:10.5194/egusphere-2025-890

Preprints

https://doi.org/10.5194/egusphere-2025-890

12 Mar 2025

Abstract. Bayesian modelling is often implemented in geochronology and in its applications to geomorphology, archaeology, etc. The rationale behind such practices is to improve robustness, precision and accuracy by using prior knowledge about the studied sites, in particular the order of samples as constrained by stratigraphy. All chronological models tested in this study (OxCal, Chronomodel and BayLum) use the same mathematical model to handle stratigraphic constraints. However, this model has been shown to lead to estimation biases. First, this bias is illustrated with BayLum modelling of a high-resolution OSL dataset. Then, this paper compares the statistical inferences obtained with the three above-mentioned modelling tools for the Neolithic East mound of Çatalhöyük (Turkey). For this site, 49 radiocarbon ages were obtained with the aim of determining the start of occupation at this locality. Age uncertainties are rather large because of calibration curve plateaus; therefore, the conditions for estimation biases are met. We discuss the behaviour of the different models and show that caution must be exercised when modelling results are at odds with measurements. While OxCal, Chronomodel and BayLum are all affected by a spread in ages resulting from their common model of stratigraphic errors, Chronomodel suffers from a great loss of precision and OxCal, through the phase model, concentrates ages undesirably. We also conclude that the onset of occupation at Çatalhöyük was probably earlier than previously thought on the basis of the OxCal model.
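
The spread in ages mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, five samples with identical measured ages of 10 ka, Gaussian errors of 1 ka, and a flat prior restricted to the stratigraphic order. This is not the BayLum, Chronomodel or OxCal implementation; the function name `ordered_posterior_means` and all numerical values are hypothetical. Posterior draws are obtained by rejection sampling: ages are drawn independently around each measurement and kept only if they respect the ordering.

```python
import random
import statistics


def ordered_posterior_means(measured, sigma, n_draws=200_000, seed=1):
    """Posterior mean ages under a flat prior truncated to stratigraphic order.

    `measured` lists the measured ages from the top to the bottom of the
    sequence; `sigma` is a common Gaussian measurement error (illustrative).
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        # Draw from the unconstrained posterior: one Gaussian per sample.
        draw = [rng.gauss(y, sigma) for y in measured]
        # Keep the draw only if ages increase with depth (ordering constraint).
        if all(a <= b for a, b in zip(draw, draw[1:])):
            kept.append(draw)
    if not kept:
        raise ValueError("no draw satisfied the ordering; increase n_draws")
    # Average each sample's age over the accepted draws.
    return [statistics.fmean(column) for column in zip(*kept)]


# Hypothetical sequence: five samples that all measure 10 ka (+/- 1 ka).
measured = [10.0] * 5
means = ordered_posterior_means(measured, sigma=1.0)
spread = means[-1] - means[0]
print("posterior means (ka):", [round(m, 2) for m in means])
print("modelled duration (ka):", round(spread, 2))
```

Although every measurement is identical, the ordering constraint alone pushes the modelled ages apart, so the modelled sequence has a non-zero duration. Under these assumptions the effect grows as more samples with overlapping uncertainties are added, which is the situation of high sampling resolution described above.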

Received: 25 Feb 2025 – Discussion started: 12 Mar 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 2110 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Supplement (220 KB)

Journal article(s) based on this preprint

30 Mar 2026

The conflict between sampling resolution and stratigraphic constraints from a Bayesian perspective: OSL and radiocarbon case studies

Guillaume Guérin, Pierre Guitton-Boussion, Imène Bouafia, and Anne Philippe

Geochronology, 8, 191–207, https://doi.org/10.5194/gchron-8-191-2026, 2026

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2025-890', Anonymous Referee #1, 20 May 2025

Dear Editor(s) and Authors,
I have read the manuscript ‘The conflict between sampling resolution and stratigraphic constraints from a Bayesian perspective: OSL and radiocarbon case studies’ in detail. It is a well-structured and well-written manuscript, and the scope fits Geochronology. You address a long-standing topic and issue in age-depth modeling, and your contribution is relevant. While your conclusions are not entirely new, they are based on real and well-selected datasets, and your manuscript is a welcome addition to the scientific literature. I clearly support publication in Geochronology after revisions.
The authors use two case studies to point out challenges with Bayesian age-depth modeling, and rightfully demonstrate that the tested models are not without bias and artifacts. I particularly like, and agree with, the repeated call for applying common sense and looking at data with the experienced eye of a geochronologist even when applying models, e.g. (quotes from your submission follow) ‘As frustrating as it may be, in our view none of the tested models can tell us anything better than the actual data themselves’ and ‘when testing any chronological model, it is of utmost importance to compare the model outcome with the input data.’ I fully agree and find this an important lesson: look at the data, know the possible issues, and then consider whether a model is of any help.
The final statement reads: ‘Our study shows that this goal [make use of prior observations to refine the precision, accuracy and robustness] is difficult to reach and that using models to correct measurements appears to be dangerous.’ Well, that really depends on the case and individual data structure in my opinion, and such a general statement should at least be softened when based on only two datasets (why is no reference made to the often-used models BChron and Bacon?), particularly since these datasets are indeed challenging.
With this I come to my main criticism of this manuscript: the arbitrary selection of models, seemingly influenced by previous work of the authors. When speaking of luminescence modeling I ask you to refer to ADMin (https://www.sciencedirect.com/science/article/pii/S187110141730047X), probably the model least affected by the spread effect (?), but at the same time slow/unsuitable for large (and these?) datasets. Generally, I disagree with the BChron and Bacon models not even being mentioned, as they are very widely used.

Further comments.
References to Ramsey should in my opinion be to Bronk Ramsey
Line 84: ‘event model of Lanos and Philippe (2018)’ – could you please introduce this one – it is less known than the one by Bronk Ramsey which you introduce in detail
Line 115: please explain ‘Theta matrix’
160ff: Can BayLum model 14C ages!? - that would be different than luminescence modeling, because here 'only' the 14C age is used?
In Fig. 3 (and others) please include original ages.
Generally, I find your figures would benefit from clearer explanations in the captions and from systematically placing units on the axes; ideally all would use the same age scale (ka or BC; please don't mix them).
Abscissa of Fig. 3 : space before bracket missing
Figure 5 and its explanation: ordinate unclear. Why was this only done for BayLum?
Line 278: please explain the phase structure here
284: ‘between samples OxA-9893 and OxA-23251’ – please mark in Figure so that these can easily be found
286f: I disagree with your statement ‘These two bottom-most samples are PL-980252A, whose age lies outside the calibrated age of all samples above’ – the densities do overlap
Chapter 3.2.2., and Fig. 8 limited to the lower 17 samples – was the model run for all or these samples?
Line ~322: please highlight where the spread effect is pronounced and why
Fig. 10: units on both axes missing – please also include original dating - either as distribution or mean ages.
352ff: given that Chronomodel and OxCal partly do not overlap the praising of larger uncertainty alone seems unjustified.
In chapter 4.1, I find a prominent feature missing: the duration of the sequence when using OxCal is much shorter than when using BayLum or Chronomodel. This is worrying in my opinion, and the OxCal results seem much more similar to the original ages than the BayLum and Chronomodel results. In particular, the outer ends of the models seem unrealistically long in BayLum and Chronomodel. The spread effect of the whole sequence therefore seems best captured by OxCal.
In line 415 I suggest reference to
https://www.sciencedirect.com/science/article/pii/S0277379103003160
https://journals.sagepub.com/doi/full/10.1177/0959683616675939

It is really good to see the computer code in the Supplement. Yet I am wondering why this is only the case for one of the two examples. Further, the R code would benefit from better documentation; please improve it so that colleagues unfamiliar with R can also follow what is done and why.
Further, I would like you to provide results (data plotted in Figures) in Supplements.
I am aware of issues with suggesting literature in the review process, and I am asking the editors to have a critical look at these – yet I ask you to consider including the information contained within the suggested literature in your manuscript.
Kind Regards,

Citation: https://doi.org/10.5194/egusphere-2025-890-RC1
- AC1: 'Reply on RC1', Guillaume Guérin, 19 Sep 2025
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-890/egusphere-2025-890-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2025-890-AC1
RC2:
'Comment on egusphere-2025-890', Anonymous Referee #2, 28 Jun 2025
Overview (please see attachment for more detailed comments)
This paper presents two worked case study examples that show three different modelling approaches can lead to strikingly different results when applied to the same data.
I think this is an interesting result, and important for readers who might be using these models to understand, but I do feel that the current exposition is really quite unclear – in terms of sufficiently describing what the different models do; their interpretation; and the potential reasons for any differences in the resulting inference.
They argue that the differences they see between models are due to the way that they handle stratigraphic ordering, but it is not entirely clear to me that it is solely this – as they also seem to argue that the models implement this stratigraphic information in very similar ways.
If that is the case, then I would expect similar results across the models. Instead it suggests to me that either they actually handle stratigraphy quite differently, or that there are more fundamental differences between the three models (which are not explained) that effectively lead to quite different modelling assumptions; or that (some of) the models have perhaps not converged correctly.
Specifically, they compare:
- BayLum when modelling OSL dates
- OxCal, BayLum and ChronoModel when used to analyse a selection of 14C dates from Çatalhöyük

As I said, I think the overall manuscript has highly useful and valuable content for the community and I would recommend publication, but IMO the overall narrative and level of clear explanation really does need to be improved if it is to be of substantial use to the relevant community.
I have provided some major comments in the attachment, giving my opinion on:
- Providing more detailed descriptions of the specific models (with explicit likelihoods, priors, ...) to better enable comparison and differences between them
- Relationship with Nicholls and Jones (2001), where I think there may be some misunderstanding
- Conclusion and Assessment of Model Appropriateness

I hope these comments are useful.
Citation: https://doi.org/10.5194/egusphere-2025-890-RC2
- AC2: 'Reply on RC2', Guillaume Guérin, 19 Sep 2025
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-890/egusphere-2025-890-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2025-890-AC2

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (29 Sep 2025) by Michael Dietze

AR by Guillaume Guérin on behalf of the Authors (06 Nov 2025) Author's response Author's tracked changes Manuscript

ED: Publish subject to revisions (further review by editor and referees) (13 Nov 2025) by Michael Dietze

ED: Referee Nomination & Report Request started (08 Dec 2025) by Michael Dietze

RR by Anonymous Referee #1 (13 Jan 2026)

RR by Anonymous Referee #2 (21 Jan 2026)

ED: Publish subject to minor revisions (further review by editor) (23 Jan 2026) by Michael Dietze

AR by Guillaume Guérin on behalf of the Authors (30 Jan 2026) Author's response Author's tracked changes Manuscript

ED: Publish as is (09 Feb 2026) by Michael Dietze

ED: Publish as is (05 Mar 2026) by Georgina King (Editor)

AR by Guillaume Guérin on behalf of the Authors (05 Mar 2026) Manuscript

Supplement

https://doi.org/10.5194/egusphere-2025-890-supplement

Viewed

Total article views: 991 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
718	237	36	991	42	43	57

Views and downloads (calculated since 12 Mar 2025)

Month	HTML	PDF	XML	Total
Mar 2025	60	23	6	89
Apr 2025	31	14	1	46
May 2025	38	14	4	56
Jun 2025	47	11	3	61
Jul 2025	27	11	2	40
Aug 2025	56	11	0	67
Sep 2025	288	13	7	308
Oct 2025	22	12	0	34
Nov 2025	26	30	3	59
Dec 2025	27	23	1	51
Jan 2026	30	17	2	49
Feb 2026	24	26	5	55
Mar 2026	42	32	2	76
Apr 2026	0

Viewed (geographical distribution)

Total article views: 1,036 (including HTML, PDF, and XML) Thereof 1,036 with geography defined and 0 with unknown origin.

Latest update: 16 Apr 2026

Short summary

Bayesian modelling is often used to refine numerically dated chronological sequences, e.g., by making use of stratigraphic constraints. First, a high-resolution dataset based on luminescence dating is modelled with the dedicated R package BayLum. Then, three Bayesian modelling tools – namely BayLum, Chronomodel and OxCal – are compared using a high-resolution, radiocarbon dataset. Modelling artefacts are identified; the strengths and weaknesses of the models are discussed.
