How hidden variables limit the performance of shallow landslide susceptibility models

Halter, Tobias; Stähli, Manfred; Bast, Alexander; Lehmann, Peter; Aaron, Jordan

doi:10.5194/egusphere-2026-2858

Preprints

https://doi.org/10.5194/egusphere-2026-2858

10 Jun 2026

Status: this preprint is open for discussion and under review for Earth Surface Dynamics (ESurf).

Abstract. Susceptibility mapping is critical in assessing shallow landslide hazard and sediment transport potential. Advancements in modelling techniques and the availability of high-resolution spatial data have continuously improved the performance of landslide susceptibility maps. Nevertheless, discrepancies between predicted susceptibility and observed landslide occurrence remain. In addition to shortcomings in model design and the incompleteness of landslide inventories, the accuracy and transferability of susceptibility models are critically limited by hidden variables, such as site-specific variability in soil development, that control the triggering process but are rarely available in inventories. Here we developed an extensive case study framework and applied it to two uniquely detailed inventories in order to quantify the role of hidden variables, as well as the effects of incomplete landslide inventories. The first inventory is a comprehensive regional dataset containing over 24,000 mapped landslides across 5,939 km², and the second is a field-validated dataset of 734 landslides which includes detailed documentation of hidden variables. We trained two Random Forest machine learning models using a wide range of explanatory variables, including topography, land cover, soil properties, and climate. The first model was optimized for the first dataset and achieved high predictive performance within its training domain (mean cross-validated area under the curve, AUC = 0.89). However, its accuracy decreased significantly (AUC = 0.74) when applied to the second dataset, highlighting limitations in transferability. The second model was optimized for the second dataset (AUC = 0.79). A comparison of the two models revealed that regional climatic and geologic data hindered transferability to remote regions because the relationship between available and hidden variables is not properly captured by the susceptibility model.
We further analysed the predicted susceptibility values as a function of the site-specific information collected in the second database, to quantitatively explore the role of hidden variables. The analysis suggested that variables related to (i) subsurface heterogeneity and (ii) vegetation complexity govern landslide initiation, but are rarely accounted for in susceptibility models. Specifically, the models underestimated susceptibility in poorly developed soils and areas with uniform forest layering. This study underscores the necessity of a process-based understanding grounded in field observations to capture the full complexity of landslide failure mechanisms, relevant to landslide susceptibility modelling.
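
For orientation, the AUC values quoted in the abstract can be read as the probability that a randomly chosen landslide location receives a higher susceptibility score than a randomly chosen non-landslide location. The sketch below computes that quantity directly from ranked pairs. It is a minimal illustration only: the scores and labels are made up, the function name `auc` is hypothetical, and it does not reproduce the Random Forest models or the values reported by the authors.

```python
def auc(scores, labels):
    """Probability that a random landslide sample outranks a random
    non-landslide sample; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both landslide and non-landslide samples")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up susceptibility scores; label 1 = mapped landslide, 0 = non-landslide.
labels = [1, 1, 1, 0, 0, 0]
in_domain_scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]  # model scored inside its training region
transfer_scores = [0.9, 0.4, 0.2, 0.8, 0.3, 0.1]   # same model scored on a different region

auc_in_domain = auc(in_domain_scores, labels)
auc_transfer = auc(transfer_scores, labels)
print(round(auc_in_domain, 3), round(auc_transfer, 3))  # prints: 1.0 0.667
```

Because the statistic is built from landslide and non-landslide pairs, it cannot be evaluated on an inventory that contains landslide locations only; the function raises an error in that case.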

Received: 18 May 2026 – Discussion started: 10 Jun 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Download & links

Preprint (PDF, 1824 KB)

Supplement (1065 KB)

Status: open (until 22 Jul 2026)

RC1: 'Comment on egusphere-2026-2858', Anonymous Referee #1, 30 Jun 2026
This study presents an analysis of how certain variables may limit the performance of shallow landslide susceptibility modelling in Switzerland. I find the topic relevant and promising; however, improvements are necessary, particularly in the introduction and methodology sections, which require a more detailed description. Additional comments and suggestions are outlined below:
Lines 44-48: The argument is weak. I do not agree with this statement. Landslides do not always occur in unexpected locations. In most cases, susceptible areas are already known, but other factors and competing interests often take precedence, allowing these areas to be occupied.

Lines 62-64: The argument is weak. The authors mention the availability of “large landslide inventories”, but this is not supported or properly referenced. In which regions or studies is this the case? In my view, this is not representative of most countries and should be revised and properly justified with references.

Introduction: Poorly developed, with weak arguments and general phrases; it needs to be rewritten.

It would be helpful to provide more background information about the study area. The manuscript barely discusses Switzerland, yet it is important to explain why this country was selected, its historical context regarding landslides, and the current state of landslide inventory mapping and management. This additional context would help readers better understand the relevance and applicability of the study.
Lines 188-89: Please provide the version of the software used.

Section 3: Did the authors assess the correlation among the predictor variables? If so, please describe the methodology used and report the results. If not, I recommend including a correlation or multicollinearity analysis to ensure that highly correlated variables do not affect the interpretation of the results.
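
One simple form of the correlation screening requested here is a pairwise check that flags predictor pairs whose absolute Pearson correlation exceeds a cut-off. The sketch below is an illustration under stated assumptions: the predictor names and values are invented, the 0.7 cut-off is a common rule of thumb rather than anything taken from the manuscript, and `pearson` and `correlated_pairs` are hypothetical helper names.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long sequences.
    Assumes neither sequence is constant (otherwise the denominator is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for a, b in combinations(sorted(predictors), 2):
        r = pearson(predictors[a], predictors[b])
        if abs(r) >= threshold:
            flagged.append((a, b, r))
    return flagged


# Invented sample values at five locations (not data from the study).
predictors = {
    "slope": [10, 20, 30, 40, 50],
    "roughness": [1.1, 2.0, 3.2, 3.9, 5.1],
    "ndvi": [0.5, 0.1, 0.4, 0.2, 0.3],
}

flagged = correlated_pairs(predictors)
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.3f}")
```

In this toy example only the slope and roughness pair is flagged. A rank-based (Spearman) coefficient or variance inflation factors would be natural alternatives when relationships are monotonic but not linear.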

Table 1: Please provide the scale/resolution of all thematic variables used.

The manuscript uses predictor variables with spatial resolutions ranging from 2 m to 1000 m. This considerable difference in spatial resolution raises concerns about the consistency of the input data and its potential impact on the model's performance. The authors should clarify how these datasets were harmonized prior to the analysis (e.g., resampling method and target resolution) and discuss the possible implications of combining variables at such different spatial scales.
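
To make the harmonization question concrete, the sketch below shows the two basic operations usually involved: nearest-neighbour upsampling of a coarse grid onto a finer one, and block-mean aggregation of a fine grid onto a coarser one. It is a toy illustration on nested lists with an integer resolution ratio; the function names are hypothetical, and nothing here describes how the authors actually processed their rasters.

```python
def upsample_nearest(grid, factor):
    """Repeat every coarse cell factor x factor times (nearest-neighbour resampling)."""
    out = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        out.extend([list(fine_row) for _ in range(factor)])
    return out


def aggregate_mean(grid, factor):
    """Average non-overlapping factor x factor blocks of a fine grid."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Invented 2 x 2 coarse raster (e.g. a climate variable on a coarse grid).
coarse = [[1, 2],
          [3, 4]]

fine = upsample_nearest(coarse, 2)   # 4 x 4 grid on the finer resolution
back = aggregate_mean(fine, 2)       # aggregated back to 2 x 2
print(fine)
print(back)
```

Upsampling makes the coarse variable usable alongside fine-resolution predictors, but every fine cell inside a coarse cell carries the same value, so no spatial detail is gained. This is the kind of implication the comment above asks the authors to discuss.
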
Table 2: There appears to be an inconsistency between the title and the landslide classification used in the manuscript. While the title refers to “shallow landslides,” the inventory includes different movement types (flows, slides, rotational and translational landslides). This is somewhat confusing, as it is not clear whether the study focuses exclusively on shallow landslides or on a broader set of mass movement processes. The authors should clarify the definition of “shallow landslide” used in this context and justify the inclusion of the different movement types in the inventory.

Section 5: More discussion of other studies conducted in Switzerland and/or similar Alpine environments is required.

The validation strategy appears limited, as it relies solely on AUC to evaluate the different susceptibility scenarios. While AUC is a useful measure of overall model performance, it does not provide information about the spatial agreement between predicted susceptibility and observed landslides. I recommend including at least one spatially explicit evaluation metric to better assess model performance in a spatial context, especially given that different predictor sets and inventory configurations are compared.

Citation: https://doi.org/10.5194/egusphere-2026-2858-RC1
RC2: 'Comment on egusphere-2026-2858', Anonymous Referee #2, 09 Jul 2026
Review of “How hidden variables limit the performance of shallow landslide susceptibility models” by Halter et al.
Halter et al. explore three components of data-driven statistical models for assessing landslide susceptibility: 1. the effects of incomplete inventories on model development; 2. the transferability of these models; 3. the effects of hidden variables on model outputs. The first two components have already been extensively covered in previous studies, as the authors recognize. The unique aspect of this study is the third component. The authors try to assess the influence of hidden variables generally not included in data-driven statistical models, due to the data being highly heterogeneous and difficult to collect, on susceptibility model outputs. These hidden variables include soil and vegetation characteristics. It is well known that hidden variables can exert a strong control on susceptibility model results. This paper attempts to provide a more detailed understanding of these hidden variables using a database developed from field observations at known landslide locations. I found the manuscript to be well-written and the topic to be of interest to the landslide community. However, I have some serious concerns about the limits of the data used in the analysis. Below, I provide some general comments on the manuscript, followed by more detailed comments.

General Comments:
I have some major concerns about the hidden variable aspect of this study. Most of them are tied to the fact that the authors only have information about the hidden variables at landslide locations, not non-landslide locations. Because this is a data shortage issue, it is not clear to me if it can be overcome to allow this manuscript to be published.
The analysis only analyzes the influence of hidden variables at landslide locations and not at non-landslide locations. I understand that this is a limitation of the dataset, but I don’t think this gives you enough information to understand the influence of the hidden variables on the susceptibility models. All it can tell you is if your model is underpredicting when a given hidden variable is present. It cannot tell you anything about where the model overpredicted when the same hidden variable is present, which is just as important as assessing underprediction. It also doesn’t directly tell you anything about the relative abundance of these hidden variables where you do not have data, which is most of your study area. It could be that the hidden variables with low susceptibility values at landslide locations (e.g., the rendzina soil type) are prolific at non-landslide locations. In which case, we may not want landslides with these hidden variables to be modeled with high susceptibility.

Building on the previous concern, there is no indication of the effect of hidden variables on susceptibility model performance. This again relates to the hidden variables only being evaluated at known landslide locations. We still don’t know if including detailed soil observations, or any other of the hidden variables, would have a notable influence on model performance. Put another way, despite what the manuscript currently says, the analysis doesn’t tell us anything about the effects of hidden variables on modelled landslide susceptibility. It only provides information about the variance of susceptibility values at landslide locations. These are not the same thing! It is already well understood that data-driven statistical models omit important processes for assessing slope stability. I’m not convinced that this manuscript does anything more than provide more discussion of this fact without any real data to support it.

As the authors acknowledge in the introduction and background sections of the manuscript, the real novelty of this manuscript is the exploration of the hidden variables. However, about half of the results and discussion are devoted to well-investigated issues with data-driven statistical susceptibility models. I think the paper could cut back on these other sections to better focus on the more novel parts of this study. This only works if the authors can fix the issues I raised with the hidden variable component of the study.
There were several references to soil depth data, but I did not see this data anywhere in the manuscript or supplemental. Please include this data somewhere.

Line-by-lines:
66: “no study has systematically summarized and quantified the effect of these limitations on model performance”- I don’t think this is accurate. Lots of studies have quantified the limits of model performance. I think you need to be more specific in what exactly you are referring to.
73: I think you are limiting this discussion to data-driven statistical landslide susceptibility models rather than other types of models. Please specify.
250: Please define KOBO.
257: Please explain how you downscaled this data.
262: Please clarify what is meant by ‘paths’.
Table 1: I suggest being consistent with the model names. Choose Model 1 or Independent.
285: Please specify how you determined a ‘statistically sufficient sample size’.
363: You tuned the same hyperparameters; you did not use the same hyperparameter values. Right? Please clarify your meaning.
367: Please clarify your meaning here. Some of the SLD data is outside of the Canton Bern area. And you said that you removed some areas of the Canton that were in the SLD data.
383: What about the occurrence of false positives? These are important too.
384-398: It took me a while to understand how this statistical workflow worked. I'll try and outline what I found confusing with suggestions to help you rewrite this section:
I did not understand where the paragraph was going and why you were explaining these things. Provide a brief overview of what this paragraph explains before diving into it.

Explaining your workflow in general terms before diving into the details will also help orient the readers. Also, in this general description, explain the motivation for each comparison.

More details are needed on the Kruskal-Wallis test. You write it as if the readers should already know what this test is for and how it is used. Most will not.

I found myself getting confused on whether you were comparing the variables or the variable categories (i.e., subclasses). You use the eta-squared statistic for the variables and the boxplots of susceptibility values and the Dunn test for the categories.

What exactly is the eta-squared measuring because we only have susceptibility of known landslide locations? Not non-landslide locations.

391-392: It is unclear to me how to interpret these effect values. Also, why are these thresholds meaningful?
395: Did you mean ‘computed’ instead of ‘competed’?
410: Please show the AUC values of these perimeters or refer the reader to where they are. I could not find them.
448: The term ‘effect sizes’ is misleading because it is usually used as an indicator of how the model output changes based on the input parameters. The variables you are comparing were not in the model and we don't really know how they affect model performance. We only know their values and your model's susceptibility values at landslide locations. That does not provide enough information to get indications of the true effects of these variables on the susceptibility model.
451-452: Can you explain why the effect sizes were smaller?
464-465: Rendzina and the Brownearth have statistically comparable susceptibility values (group 'a').
468: Please explain why having at least 10 landslides matters.
470: I don’t see any data about soil depth anywhere. Please show it.
478: How do you know that that is the only reason?
484: Please reword as follows – predicted susceptibility at known landslide locations. This type of clarification needs to be implemented throughout the manuscript.
524: Because this paper relies on statistics, I wouldn't use this word unless you mean it in a statistical sense. You use it in a non-statistical sense throughout the manuscript.
554: I found this sentence to be difficult to follow. What is ‘they’ referring to?
565: Please clarify what is meant by ‘less well predicted’.
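
As background to the comments on lines 384-398 above, the Kruskal-Wallis test asks whether several groups (here, categories of one hidden variable) differ in the ranks of a response (here, predicted susceptibility at known landslide locations). The sketch below computes the H statistic with a tie correction and a rank-based eta-squared. It rests on assumptions: the eta-squared formula (H - k + 1) / (N - k) is one commonly used definition and may not be the one used in the manuscript, the group values are invented, and the function names are hypothetical. No p-value or Dunn post-hoc test is computed.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


def eta_squared(h, n_groups, n_total):
    """Rank-based effect size, assumed form: (H - k + 1) / (N - k)."""
    return (h - n_groups + 1) / (n_total - n_groups)


# Invented susceptibility values at landslide locations for three soil classes.
groups = [
    [0.20, 0.30, 0.25],  # hypothetical class A
    [0.60, 0.70, 0.65],  # hypothetical class B
    [0.90, 0.85, 0.95],  # hypothetical class C
]

h = kruskal_wallis_h(groups)
eta2 = eta_squared(h, len(groups), sum(len(g) for g in groups))
print(round(h, 3), round(eta2, 3))  # prints: 7.2 0.867
```

Both quantities describe how susceptibility values at landslide locations vary across categories. As the comments above stress, they say nothing about non-landslide locations.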

Citation: https://doi.org/10.5194/egusphere-2026-2858-RC2

Supplement

https://doi.org/10.5194/egusphere-2026-2858-supplement

Data sets

Swisstopo Data Swisstopo https://map.geo.admin.ch/

Soil Grids L. Poggio et al. https://soilgrids.org/

Normalized Difference Vegetation Index (NDVI) UniGeneva https://doi.org/10.26037/yareta:kpmscrogqbdhvjeuev2ydrzk7y

Vegetation height model national forest inventory (NFI) Christian Ginzler https://doi.org/10.16904/1000001.1

Short summary

Despite substantial progress in predicting landslide susceptibility, landslides still occur at unexpected locations. In this article, we investigated the root causes of these mispredictions by analysing surface and subsurface field observations. Our findings highlight the inherent limitations of state-of-the-art machine learning models. The article advocates for a shift toward process-based understanding and the integration and advancement of high-resolution in-situ and remote sensing data.

