The dominant role of latent spatial structure in landslide susceptibility
Abstract. Landslide susceptibility mapping is essential for risk management and mitigation. Traditional multivariate models have achieved high nominal accuracy in predicting landslide occurrence, but the underlying methods rarely account for spatial dependence in the input data. We test how spatial Hierarchical Generalized Linear Models (HGLMs) and spatial autoregressive models might enhance both accuracy and reliability of landslide susceptibility. We estimate the frequency of landslides in a catchment, drawing on a catalogue of 10,837 landslides and several predictors (e.g., rainfall, elevation, slope) in the northern Colombian Andes. Our HGLMs integrate the effects of spatial dependency and heterogeneity through Markov Random Field (MRF) models, i.e. Intrinsic Conditional Autoregressive (ICAR), Besag-York-Mollié (BYM), and Leroux models. Our results show that landslide frequency is significantly influenced by spatially-dependent unobserved factors, likely representing contiguous geological formations or shared soil properties, which we therefore incorporate as latent variables. Even after accounting for known covariates, the HGLMs capture spatial dependencies that non-spatial models fail to address. Incorporating spatial structure in the data improves model performance, judging from model selection metrics such as the Deviance Information Criterion (DIC) or the Watanabe–Akaike Information Criterion (WAIC). By accounting for latent spatial effects, spatial HGLMs produce smoother and more reliable susceptibility maps. This approach overcomes a key limitation of traditional models: the underestimation of landslide frequency in high-density areas where unobserved, spatially-structured factors are most influential. Our findings highlight the importance of integrating spatial dependence and heterogeneity in landslide susceptibility models to achieve enhanced predictive performance and reliability.
General assessment
This manuscript addresses an important methodological question, namely the role of spatial dependence in landslide susceptibility modeling and the use of hierarchical spatial models. The comparison of non-spatial and spatial models is potentially relevant for NHESS readers, and the manuscript is clearly structured and generally well written. However, I cannot recommend publication in its current form. My concerns are conceptual as well as technical: (1) The study operates at the level of terrain units that is not appropriate for landslide susceptibility modeling, which undermines model interpretation and relevance of findings. (2) The study conflates statistical concepts, potentially over-interpreting and over-stating its findings. I therefore recommend rejection.
Major concerns
1. Terrain mapping units are inadequate for susceptibility modeling
The manuscript models landslide susceptibility using hydrographic catchments of approximately one hundrd to a few hundred km² as terrain mapping units. This support is inadequate for a process known to be governed by slope-scale conditions. Yet the predictors are aggregated to catchment means (mean slope, mean elevation, rainfall summaries), suppressing the variability that controls slope failure. This severe change-of-support problem likely dilutes predictor-response relationships by construction, resulting in inflated residual variation. In my view, this issue fundamentally undermines the manuscript’s central conclusion.
2. Over-interpretation of latent spatial effects
The manuscript repeatedly interprets latent spatial effects as evidence for contiguous lithology, soil properties, tectonic domains, or shared physical controls. However, the CAR formulation only encodes adjacency among catchments. The model demonstrates that neighboring polygons have similar residual structure; it does not identify geology or tectonic controls. No attempt is made to relate latent spatial structure to known geological or other structures in this vast mountain area, and no attempt is made to use such (even coarse) information to construct additional process-related covariates. The physical interpretation is therefore speculative and substantially overstated.
The use of three large hydrographic basins as higher-level units is particularly weakly justified. No evidence is presented that these basins correspond to meaningful geophysical groupings for slope stability, and there is no general reason to expect “tectonic domains” to align with hydrographic catchments.
3. Model interpretation and diagnostics are insufficient
The manuscript repeatedly makes strong claims (“reliable”, “accurate”, “better estimates”, “overconfident predictions”) without clearly distinguishing precision and bias of coefficient estimtors, predictive performance, and goodness of fit. Residual diagnostics also raise concerns. Pearson residuals appear extremely large (up to 20-30), suggesting major lack of fit or overdispersion (perhaps related to points 1 and 2 above), yet this is not discussed.
4. Predictor selection and process representation are insufficiently justified
The predictor set is extremely limited (mean slope, mean elevation, rainfall summary). Relevant predictors commonly used in susceptibility modeling (e.g., land cover/land use, lithology, anthropogenic disturbance such as roads, topographic wetness index, and of course local slope angle) are absent without justification. Predictor selection (correlation analysis and PCA) is insufficiently described, and the manuscript does not justify linear predictor-response assumptions despite well-known nonlinearities, particularly for slope angle.
Additional comments
Additional minor comments
L11 "model performance" - or rather "goodness of fit" in the usual statistical terminology; "performance" suggests estimation on independent test set
L22-23 "cross-sectional" and "panel data" are terms that are not commonly used in the environmental statistics literature; prefer more descriptive alternatives? e.g. longitudinal or simply temporal or time series?
L24 TMU is not a concept from spatial statistics
L39 "model residuals" - definition not so obvious in the context of classification problems
L41 "interactions" - concept not obvious in this context; unobserved confounders are not interactions - not statistically and not otherwise
L44-51 "statistical oversight" is very strong wording, considering that most susceptibility analyses do not interpret model coefficients in detail and do not perform statistical inference.