the Creative Commons Attribution 4.0 License.
Bayesian data selection to quantify the value of data for landslide runout calibration
Abstract. The reliability of physics-based landslide runout models depends on the effective calibration of their parameters, which are often conceptual and cannot be physically measured. Bayesian methods offer a robust framework to incorporate uncertainties in both model and observations into the calibration process. Therefore, they are increasingly used to calibrate physics-based landslide runout models. However, the practical application of Bayesian methods to real-world landslide events depends on the availability and quality of observational data, which determine the reliability of the calibration outcomes. Despite this, systematic investigation of the influence of observational data on the Bayesian calibration of landslide runout models has been limited.
We propose quantifying the impact of observational data on calibration outcomes by measuring the information gained during the calibration process using a decision-theoretic measure called Kullback-Leibler (KL) divergence. Building on this, we present a unified Bayesian data selection workflow to identify the most informative dataset for calibrating a given parameter. The workflow runs parallel calibration routines across available observation datasets. It then computes the information gained relative to the observations by calculating the KL divergence between prior and posterior distributions and selects the dataset that yields the highest KL divergence.
We demonstrate our workflow using an elementary landslide runout model, calibrating friction parameters with a diverse set of synthetic observations to evaluate the impact of data selection on parameter calibration. Specifically, we compare and quantify the information gained from calibration routines using observations with varying information content, i.e., velocity vs. position, and observations with different granularity, i.e., aggregated data vs. time series data. The insights from this study will optimize the use of available observations for calibration and guide the design of effective data acquisition strategies.
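The data selection workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sliding-block model, the slope angle, the noise levels, the observation times, and all function names are assumptions chosen for illustration, and the posterior is evaluated on a parameter grid rather than sampled with MCMC. The sketch calibrates a single Coulomb friction coefficient against two synthetic datasets (a velocity time series and a single final position), computes the KL divergence between posterior and prior for each, and selects the dataset with the highest value.

```python
import numpy as np

G = 9.81                   # gravitational acceleration [m/s^2]
THETA = np.deg2rad(30.0)   # slope angle, assumed for this toy example

def acceleration(mu):
    """Downslope acceleration of a sliding block with Coulomb friction mu."""
    return G * (np.sin(THETA) - mu * np.cos(THETA))

def velocity(mu, t):
    """Velocity at times t for a block released from rest."""
    return acceleration(mu) * t

def position(mu, t):
    """Travel distance at times t for a block released from rest."""
    return 0.5 * acceleration(mu) * t**2

def grid_posterior(mu_grid, prior, simulate, observed, sigma):
    """Posterior density on a grid, assuming i.i.d. Gaussian observation noise."""
    dmu = mu_grid[1] - mu_grid[0]
    predicted = simulate(mu_grid[:, None])           # shape (n_grid, n_obs)
    log_like = -0.5 * np.sum(((predicted - observed) / sigma) ** 2, axis=1)
    log_post = np.log(prior) + log_like
    post = np.exp(log_post - log_post.max())         # subtract max for stability
    return post / (post.sum() * dmu)                 # normalise to integrate to 1

def kl_divergence(post, prior, dmu):
    """KL(posterior || prior) approximated by a Riemann sum on the grid."""
    mask = post > 0                                  # 0 * log(0) is taken as 0
    return float(np.sum(post[mask] * np.log(post[mask] / prior[mask])) * dmu)

# Synthetic "truth" and observation times (assumed values).
rng = np.random.default_rng(0)
mu_true = 0.3
t_obs = np.linspace(1.0, 10.0, 10)

# Uniform prior on [0, 0.5]; the block slides for every mu in this range.
mu_grid = np.linspace(0.0, 0.5, 2001)
dmu = mu_grid[1] - mu_grid[0]
prior = np.ones_like(mu_grid)
prior /= prior.sum() * dmu

# Candidate datasets: (forward model, assumed noise standard deviation).
datasets = {
    "velocity_series": (lambda mu: velocity(mu, t_obs), 0.5),
    "final_position": (lambda mu: position(mu, t_obs[-1:]), 5.0),
}

# One calibration per dataset, then rank the datasets by information gain.
results = {}
for name, (simulate, sigma) in datasets.items():
    truth = simulate(np.array([[mu_true]]))[0]
    observed = truth + rng.normal(0.0, sigma, size=truth.shape)
    post = grid_posterior(mu_grid, prior, simulate, observed, sigma)
    results[name] = kl_divergence(post, prior, dmu)

best = max(results, key=results.get)
print(results, "-> most informative:", best)
```

Which dataset ranks highest here depends entirely on the assumed noise levels and observation times, so the printed ranking says nothing about the study's findings. The grid evaluation is only practical for one or two parameters; with more parameters the KL divergence would have to be estimated from posterior samples instead.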
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-4531', Reyko Schachtschneider, 04 Nov 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4531/egusphere-2025-4531-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2025-4531-RC1
- AC1: 'Reply on RC1', V Mithlesh Kumar, 10 Nov 2025
Thank you for your comments. We are currently addressing all the points you raised. One of your main concerns was the missing labels in several figures, and we appreciate you bringing this to our attention.
The missing labels resulted from a rendering error in the preprint version rather than an issue with the submitted manuscript. The version we submitted on September 15, 2025, included all figure labels. However, on October 6, we noticed that the posted preprint had some labels missing. We contacted the editorial office, who confirmed that such rendering errors can occur and promptly restored the correct version the same day.
The online preprint has displayed all figures correctly with labels since October 6, and no changes were made to the manuscript content.
Thank you again for your careful review and attention to detail.
Citation: https://doi.org/10.5194/egusphere-2025-4531-AC1
- AC2: 'Reply on RC1', V Mithlesh Kumar, 18 Dec 2025
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-4531/egusphere-2025-4531-AC2-supplement.pdf
- AC4: 'Reply on RC1', V Mithlesh Kumar, 26 Jan 2026
- RC2: 'Comment on egusphere-2025-4531', Aki Vehtari, 12 Jan 2026
I have read the other review and the authors' rebuttal to it.
- Please define "calibration", as the term has several definitions. For example, see https://en.wikipedia.org/wiki/Calibration_(statistics) and https://doi.org/10.1214/23-BA1404
- I'm also concerned about the use of KL divergence. I don't think the authors' rebuttal on this is sufficient. As the authors have used only a uniform prior, the measure used is equivalent to entropy. Entropy is a sensible measure of the sharpness of a distribution. Instead of talking about KL divergence, I suggest the authors talk about entropy and also use entropy in the case of non-uniform priors. Using entropy would focus on maximizing the sharpness of the posterior, while using KL can also lead to a maximal shift of the posterior.
- I found the schematics in Figures 2 and 3 confusing: at first glance it looks as if information flows from priors to likelihoods, which doesn't make sense. I don't have a good suggestion for how to change them, but wanted to mention this in case the authors have other ideas.
- The authors mention which MCMC algorithm is used, so they could also mention which convergence diagnostics from the ArviZ package were used.
- Explicitly define the prior in the case study. Based on the plots it seems to be uniform, but the priors are not explicitly defined in the text. This makes a huge difference for the use of KL, as it is then equivalent to entropy, which measures just the sharpness. Without an explicit statement of the prior in the text, it takes the reader more time and effort to see what has actually been used.
- Otherwise the paper is mostly clear and easy to follow, and I have no further comments.
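The equivalence between KL divergence and entropy under a uniform prior, raised in the comments above, can be written out as a short derivation. The notation is an assumption introduced here, not taken from the manuscript: θ denotes the calibrated parameter, y the observations, and π(θ) = 1/V a uniform prior on a bounded set Θ of volume V that contains the posterior support.

```latex
\begin{aligned}
D_{\mathrm{KL}}\bigl(p(\theta \mid y)\,\|\,\pi(\theta)\bigr)
  &= \int_{\Theta} p(\theta \mid y)\,
     \log\frac{p(\theta \mid y)}{\pi(\theta)}\,\mathrm{d}\theta \\
  &= \int_{\Theta} p(\theta \mid y)\,\log p(\theta \mid y)\,\mathrm{d}\theta
     + \log V \\
  &= \log V - H\bigl[p(\theta \mid y)\bigr],
\end{aligned}
\qquad
H[p] = -\int_{\Theta} p(\theta)\,\log p(\theta)\,\mathrm{d}\theta .
```

Because log V is the same for every dataset, ranking datasets by KL divergence under a uniform prior is the same as ranking them by lowest posterior differential entropy, that is, by posterior sharpness. Under a non-uniform prior the two rankings can differ, since the KL divergence also grows when the posterior shifts towards regions of low prior density.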
Citation: https://doi.org/10.5194/egusphere-2025-4531-RC2
- AC3: 'Reply on RC2', V Mithlesh Kumar, 26 Jan 2026
Data sets
Bayesian data selection to quantify the value of data for landslide runout calibration V Mithlesh Kumar https://doi.org/10.5281/zenodo.17120721
Interactive computing environment
Bayesian data selection to quantify the value of data for landslide runout calibration V Mithlesh Kumar https://doi.org/10.5281/zenodo.17120721
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 225 | 118 | 32 | 375 | 21 | 19 |