the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
The dominant role of latent spatial structure in landslide susceptibility
Abstract. Landslide susceptibility mapping is essential for risk management and mitigation. Traditional multivariate models have achieved high nominal accuracy in predicting landslide occurrence, but the underlying methods rarely account for spatial dependence in the input data. We test how spatial Hierarchical Generalized Linear Models (HGLMs) and spatial autoregressive models might enhance both accuracy and reliability of landslide susceptibility. We estimate the frequency of landslides in a catchment, drawing on a catalogue of 10,837 landslides and several predictors (e.g., rainfall, elevation, slope) in the northern Colombian Andes. Our HGLMs integrate the effects of spatial dependency and heterogeneity through Markov Random Field (MRF) models, i.e. Intrinsic Conditional Autoregressive (ICAR), Besag-York-Mollié (BYM), and Leroux models. Our results show that landslide frequency is significantly influenced by spatially-dependent unobserved factors, likely representing contiguous geological formations or shared soil properties, which we therefore incorporate as latent variables. Even after accounting for known covariates, the HGLMs capture spatial dependencies that non-spatial models fail to address. Incorporating spatial structure in the data improves model performance, judging from model selection metrics such as the Deviance Information Criterion (DIC) or the Watanabe–Akaike Information Criterion (WAIC). By accounting for latent spatial effects, spatial HGLMs produce smoother and more reliable susceptibility maps. This approach overcomes a key limitation of traditional models: the underestimation of landslide frequency in high-density areas where unobserved, spatially-structured factors are most influential. Our findings highlight the importance of integrating spatial dependence and heterogeneity in landslide susceptibility models to achieve enhanced predictive performance and reliability.
- Preprint
(26893 KB) - Metadata XML
- BibTeX
- EndNote
Status: open (until 12 Jul 2026)
-
RC1: 'Comment on egusphere-2026-1749', Anonymous Referee #1, 27 May 2026
reply
-
AC1: 'Reply on RC1', Edier Vicente Aristizábal Giraldo, 06 Jun 2026
reply
Major Concern 1: Terrain mapping units are inadequate for susceptibility modeling
We respectfully disagree with this assessment. While landslides are inherently slope-scale processes, this does not preclude their analysis at coarser spatial units. The choice of analysis scale is governed by the study objectives, data availability, and the desired spatial resolution of the output map. Regional and national-scale susceptibility assessments, which necessarily operate at resolutions far coarser than the individual slope, are well established in the literature (Guzzetti et al., 2005, Geomorphology, 72:272–299; Stanley & Kirschbaum, 2017, Natural Hazards, 87:145–164; Nowicki Jessee et al., 2018, JGR: Earth Surface, 123:1351–1369). At those scales, predictors such as geology, mean slope, and climatic variables effectively serve as regional proxies for local susceptibility conditions.
Critically, our dependent variable is not the binary presence/absence of a landslide on a slope, which would indeed require slope-unit or pixel resolution, but rather the count (and implicitly density) of landslides per catchment. This Poisson count formulation is specifically designed for aggregated data and is theoretically consistent with catchment-level predictors (Lombardo et al., 2020, Earth-Science Reviews, 209:103318). Mean catchment slope and elevation are appropriate summaries of terrain conditions that drive bulk susceptibility at this scale, a design choice analogous to using basin-wide lithology fractions or mean annual rainfall in regional hazard assessments. We will add a paragraph in the Methods section explicitly justifying the catchment unit in the context of the Poisson count model.
Major Concern 2: Over-interpretation of latent spatial effectsWe partially agree and will revise the Discussion accordingly. The CAR/ICAR/BYM formulations encode spatial adjacency structure in the residuals; they do not identify specific geological or tectonic domains. Our Discussion text listing "geological characteristics, tectonic settings, and anthropogenic effects" was intended to enumerate plausible sources of unobserved spatial structure, not to assert that the model identifies them. We will rephrase this to make the distinction explicit, in line with standard practice in spatial hierarchical modelling (Besag et al., 1991; Lawson, 2018, Bayesian Disease Mapping; Morris et al., 2019, Spatial and Spatio-temporal Epidemiology, 31:100301).
Regarding the three basins: the Atrato, Cauca, and Magdalena drainage systems correspond to distinct structural and lithological domains of the northern Colombian Andes, the Pacific-facing Western Cordillera (Atrato), the inter-Andean Cauca depression flanked by volcanic and metamorphic lithologies, and the Magdalena valley between the Central and Eastern Cordilleras, making them physically motivated grouping units for spatial heterogeneity at the upper level. We will substantiate this choice with supporting references in the revised manuscript.
Major Concern 3: Model interpretation and diagnostics are insufficientWe agree and will revise accordingly on both points.
Major Concern 4: Predictor selection and process representation are insufficiently justified
We acknowledge this concern and will expand the predictor selection section.
We tested a broader initial set including topographic wetness index (TWI), curvature, drainage density, and NDVI, but these showed Pearson correlations |r| > 0.75 with the retained variables in the correlation and PCA screening, and were excluded to avoid multicollinearity (Reichenbach et al., 2018, Earth-Science Reviews, 180:60–91). We will report the full correlation matrix as supplementary material.
Response to Additional Comments and Minor Comments
We thank the reviewer for these constructive observations, which we consider will substantially improve the clarity and rigour of the manuscript. We commit to addressing all of them in the revised version.
Citation: https://doi.org/10.5194/egusphere-2026-1749-AC1 -
CC1: 'Reply on RC1. i have a doubt in M3 model . code suggests it has positive Morans value but in paper it is negative.', Basit Ahad Raina, 09 Jun 2026
reply
i have a doubt in M3 model . code suggests it has positive Morans value but in paper it is negative.
Citation: https://doi.org/10.5194/egusphere-2026-1749-CC1
-
AC1: 'Reply on RC1', Edier Vicente Aristizábal Giraldo, 06 Jun 2026
reply
Data sets
database Edier Aristizabal https://github.com/edieraristizabal/PAPER_BHGLM
Viewed
| HTML | XML | Total | BibTeX | EndNote | |
|---|---|---|---|---|---|
| 161 | 50 | 13 | 224 | 10 | 12 |
- HTML: 161
- PDF: 50
- XML: 13
- Total: 224
- BibTeX: 10
- EndNote: 12
Viewed (geographical distribution)
| Country | # | Views | % |
|---|
| Total: | 0 |
| HTML: | 0 |
| PDF: | 0 |
| XML: | 0 |
- 1
General assessment
This manuscript addresses an important methodological question, namely the role of spatial dependence in landslide susceptibility modeling and the use of hierarchical spatial models. The comparison of non-spatial and spatial models is potentially relevant for NHESS readers, and the manuscript is clearly structured and generally well written. However, I cannot recommend publication in its current form. My concerns are conceptual as well as technical: (1) The study operates at the level of terrain units that is not appropriate for landslide susceptibility modeling, which undermines model interpretation and relevance of findings. (2) The study conflates statistical concepts, potentially over-interpreting and over-stating its findings. I therefore recommend rejection.
Major concerns
1. Terrain mapping units are inadequate for susceptibility modeling
The manuscript models landslide susceptibility using hydrographic catchments of approximately one hundrd to a few hundred km² as terrain mapping units. This support is inadequate for a process known to be governed by slope-scale conditions. Yet the predictors are aggregated to catchment means (mean slope, mean elevation, rainfall summaries), suppressing the variability that controls slope failure. This severe change-of-support problem likely dilutes predictor-response relationships by construction, resulting in inflated residual variation. In my view, this issue fundamentally undermines the manuscript’s central conclusion.
2. Over-interpretation of latent spatial effects
The manuscript repeatedly interprets latent spatial effects as evidence for contiguous lithology, soil properties, tectonic domains, or shared physical controls. However, the CAR formulation only encodes adjacency among catchments. The model demonstrates that neighboring polygons have similar residual structure; it does not identify geology or tectonic controls. No attempt is made to relate latent spatial structure to known geological or other structures in this vast mountain area, and no attempt is made to use such (even coarse) information to construct additional process-related covariates. The physical interpretation is therefore speculative and substantially overstated.
The use of three large hydrographic basins as higher-level units is particularly weakly justified. No evidence is presented that these basins correspond to meaningful geophysical groupings for slope stability, and there is no general reason to expect “tectonic domains” to align with hydrographic catchments.
3. Model interpretation and diagnostics are insufficient
The manuscript repeatedly makes strong claims (“reliable”, “accurate”, “better estimates”, “overconfident predictions”) without clearly distinguishing precision and bias of coefficient estimtors, predictive performance, and goodness of fit. Residual diagnostics also raise concerns. Pearson residuals appear extremely large (up to 20-30), suggesting major lack of fit or overdispersion (perhaps related to points 1 and 2 above), yet this is not discussed.
4. Predictor selection and process representation are insufficiently justified
The predictor set is extremely limited (mean slope, mean elevation, rainfall summary). Relevant predictors commonly used in susceptibility modeling (e.g., land cover/land use, lithology, anthropogenic disturbance such as roads, topographic wetness index, and of course local slope angle) are absent without justification. Predictor selection (correlation analysis and PCA) is insufficiently described, and the manuscript does not justify linear predictor-response assumptions despite well-known nonlinearities, particularly for slope angle.
Additional comments
Additional minor comments
L11 "model performance" - or rather "goodness of fit" in the usual statistical terminology; "performance" suggests estimation on independent test set
L22-23 "cross-sectional" and "panel data" are terms that are not commonly used in the environmental statistics literature; prefer more descriptive alternatives? e.g. longitudinal or simply temporal or time series?
L24 TMU is not a concept from spatial statistics
L39 "model residuals" - definition not so obvious in the context of classification problems
L41 "interactions" - concept not obvious in this context; unobserved confounders are not interactions - not statistically and not otherwise
L44-51 "statistical oversight" is very strong wording, considering that most susceptibility analyses do not interpret model coefficients in detail and do not perform statistical inference.