Bayesian data selection to quantify the value of data for landslide runout calibration
Abstract. The reliability of physics-based landslide runout models depends on the effective calibration of their parameters, which are often conceptual and cannot be physically measured. Bayesian methods offer a robust framework for incorporating uncertainties in both the model and the observations into the calibration process, and they are therefore increasingly used to calibrate physics-based landslide runout models. However, the practical application of Bayesian methods to real-world landslide events depends on the availability and quality of observational data, which determine the reliability of the calibration outcomes. Despite this dependence, the influence of observational data on the Bayesian calibration of landslide runout models has rarely been investigated systematically.
We propose quantifying the impact of observational data on calibration outcomes by measuring the information gained during the calibration process using a decision-theoretic measure, the Kullback-Leibler (KL) divergence. Building on this, we present a unified Bayesian data selection workflow to identify the most informative dataset for calibrating a given parameter. The workflow runs parallel calibration routines, one for each available observation dataset. It then computes the information gained from each dataset as the KL divergence between the prior and posterior distributions and selects the dataset that yields the highest KL divergence.
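As a minimal sketch of this selection step, the following Python example calibrates a single friction coefficient on a parameter grid and ranks two datasets by the KL divergence from prior to posterior. Everything specific in it is an assumption made for illustration and is not taken from the study: the rigid block on a 30° incline with Coulomb friction, the uniform prior on [0.05, 0.45], the Gaussian likelihood with unit noise standard deviation, the noise-free synthetic observations, and all function names.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def block_acceleration(mu, slope_deg=30.0):
    """Acceleration of a rigid block on an incline with Coulomb friction mu (assumed toy model)."""
    theta = math.radians(slope_deg)
    return G * (math.sin(theta) - mu * math.cos(theta))


def velocity(mu, t):
    """Block velocity at time t, starting from rest."""
    return block_acceleration(mu) * t


def position(mu, t):
    """Block travel distance at time t, starting from rest."""
    return 0.5 * block_acceleration(mu) * t ** 2


def grid_posterior(grid, prior, forward, times, observations, sigma):
    """Normalized posterior weights on a parameter grid for a Gaussian likelihood."""
    log_post = []
    for mu, p in zip(grid, prior):
        misfit = sum((forward(mu, t) - y) ** 2 for t, y in zip(times, observations))
        log_post.append(math.log(p) - 0.5 * misfit / sigma ** 2)
    peak = max(log_post)  # subtract the maximum for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(posterior, prior):
    """Discrete KL(posterior || prior) in nats; zero-weight terms contribute nothing."""
    return sum(q * math.log(q / p) for q, p in zip(posterior, prior) if q > 0.0)


def select_dataset(grid, prior, datasets):
    """Calibrate once per dataset and return the name with the largest information gain."""
    gains = {}
    for name, (forward, times, obs, sigma) in datasets.items():
        post = grid_posterior(grid, prior, forward, times, obs, sigma)
        gains[name] = kl_divergence(post, prior)
    best = max(gains, key=gains.get)
    return best, gains


# Uniform prior over the friction coefficient (assumed range).
grid = [0.05 + 0.4 * i / 200 for i in range(201)]
prior = [1.0 / len(grid)] * len(grid)

# Noise-free synthetic observations generated with an assumed true value.
true_mu = 0.2
times = [1.0, 2.0, 3.0]
datasets = {
    "velocity": (velocity, times, [velocity(true_mu, t) for t in times], 1.0),
    "position": (position, times, [position(true_mu, t) for t in times], 1.0),
}

best, gains = select_dataset(grid, prior, datasets)
print(best, gains)
```

The grid approximation is chosen only because it keeps the KL divergence exact up to discretization for a single parameter. With several parameters or an expensive simulator, the posterior would typically be sampled or emulated instead, and the KL divergence estimated from those samples. The ranking also depends on the assumed noise level of each dataset, so the outcome of this toy setup should not be read as a general statement about velocity versus position data.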
We demonstrate our workflow using an elementary landslide runout model, calibrating friction parameters with a diverse set of synthetic observations to evaluate the impact of data selection on parameter calibration. Specifically, we compare and quantify the information gained from calibration routines using observations with different information content (velocity vs. position) and observations with different granularity (aggregated data vs. time series data). The insights from this study can help optimize the use of available observations for calibration and guide the design of effective data acquisition strategies.