Multi-scale spatial validation and probability calibration of pixel-based landslide susceptibility modeling in the northern Peruvian Andes

Quiroz, Wendy

doi:10.5194/egusphere-2026-1318

Preprints

https://doi.org/10.5194/egusphere-2026-1318

17 Mar 2026

Abstract. Landslides are recurrent geohazards in Andean regions, causing significant impacts on infrastructure and local communities. In spatially structured terrains, model reliability hinges on the definition of pseudo-absence samples and the treatment of spatial dependence during validation. This study evaluates pixel-based rotational landslide susceptibility in the province of Huancabamba (Piura, northern Peru) using a Random Forest classifier and seven conditioning factors derived from a photogrammetric digital elevation model and lithological data at 10 m resolution.

The landslide inventory consists of 25 field-mapped rotational landslides compiled from geomorphological surveys and high-resolution photogrammetric products. Pseudo-absence samples were selected outside mapped polygons using a buffered exclusion zone to reduce label uncertainty, and a balanced sampling scheme (1:1) was adopted. To obtain spatially realistic performance estimates, model evaluation was conducted using spatial block cross-validation with block sizes ranging from 600 to 1500 m. This provides a clear view of how spatial partitioning affects discrimination and calibration, alongside the model's stability throughout the validation folds.
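The block-based partitioning described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the grid origin at (0, 0), the round-robin assignment of blocks to folds, and the number of folds are all assumptions, since the abstract states only that square blocks of 600 to 1500 m were used.

```python
from collections import defaultdict


def block_id(x, y, block_size, x0=0.0, y0=0.0):
    """Return the (column, row) index of the square block containing (x, y)."""
    return (int((x - x0) // block_size), int((y - y0) // block_size))


def spatial_block_folds(points, block_size, n_folds=5):
    """Assign each point to a fold so that all points in a block share a fold.

    points: list of (x, y) coordinates in metres (projected CRS assumed).
    Returns a list of fold indices, one per point.
    """
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(points):
        blocks[block_id(x, y, block_size)].append(i)

    folds = [0] * len(points)
    # Deal blocks to folds round-robin in sorted order (deterministic;
    # a random or stratified assignment would be an equally valid choice).
    for k, key in enumerate(sorted(blocks)):
        for i in blocks[key]:
            folds[i] = k % n_folds
    return folds


# Hypothetical coordinates; 900 m is the block size the abstract favours.
points = [(10, 10), (20, 30), (950, 10), (10, 950), (1900, 1900)]
folds = spatial_block_folds(points, block_size=900, n_folds=2)
```

Holding out one fold at a time then keeps every test pixel at least one block boundary away from being in the same block as a training pixel, which is what makes the performance estimate sensitive to block size.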

Results show that discrimination performance decreases systematically as spatial block size increases, indicating that conventional random validation may overestimate predictive capacity due to spatial autocorrelation. A block size of 900 m provided a compromise between spatial independence and fold stability. Permutation importance computed under spatially independent folds identified lithology and elevation as the dominant predictors of rotational landslide occurrence, followed by aspect and topographic wetness index. Calibration metrics (Brier score and Expected Calibration Error) indicated moderate but stable reliability of susceptibility scores across spatial configurations.
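For reference, the two calibration metrics named above can be computed as in the sketch below. The Brier score follows its standard definition; for the Expected Calibration Error, equal-width probability bins and the default of 10 bins are assumptions, because the abstract does not state which binning scheme was used.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE with equal-width bins (binning scheme is an assumption).

    For each bin, compare the mean predicted probability with the observed
    fraction of positives, and weight the gap by the bin's share of samples.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        bins[idx].append((y, p))

    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for y, _ in members) / len(members)
        predicted = sum(p for _, p in members) / len(members)
        ece += len(members) / n * abs(observed - predicted)
    return ece
```

Both metrics are 0 for perfectly calibrated, perfectly sharp predictions; computing them per held-out spatial fold is one way to obtain the stability across spatial configurations that the abstract reports.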

The resulting susceptibility map shows spatial patterns consistent with the geomorphological setting and the mapped inventory, with high susceptibility concentrated in steep slopes developed over weak lithological units. These findings indicate that integrating spatial validation, calibration, and constrained sampling improves the reliability of pixel-based modelling in this Andean setting.

Received: 10 Mar 2026 – Discussion started: 17 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on egusphere-2026-1318', Anonymous Referee #1, 09 Apr 2026

The paper assesses landslide susceptibility based on 25 rotational landslides in a relatively small study area (~32 km²) in northern Peru using a Random Forest model, with a focus on pseudo-absence selection and spatial validation. By applying and interpreting spatial block cross-validation, the author infers that conventional validation likely overestimates model performance due to spatial autocorrelation. Based on permutation-based feature importance, lithology and elevation were identified as the main drivers of slope instability. The study concludes that model performance strongly depends on the chosen validation approach, with larger validation blocks leading to lower but more realistic performance estimates (Line 622f), “highlighting the influence of spatial autocorrelation on apparent predictive capacity”. The author finally concludes (Line 644f) that “The modelling approach developed in this study is particularly suited to rotational landslides under similar geomorphological conditions. Extension to other mass movement types would require process-tailored sampling strategies and predictor selection.”

While the manuscript addresses relevant aspects of data-driven landslide susceptibility modelling, I am not fully convinced that it is ready for publication. The scope is relatively narrow, focusing on a small number of landslides (n = 25) within a limited spatial extent. In addition, several methodological decisions remain insufficiently justified or unclear. Most importantly, the study offers limited conceptual or methodological advancement, as Random Forest-based susceptibility modelling and spatial validation strategies are already established in the literature. Overall, I consider the manuscript to be at the borderline between rejection and major revision. I formally recommend major revisions, although addressing the key concerns outlined below may require substantial changes that could alter the scope and structure of the study.

Details:

1. Sampling strategy and spatial autocorrelation

The effective sample size appears to be artificially inflated (approximately 30,000 observations within ~32 km²). From the text I understand that multiple pixels from the same landslide were treated as independent presence observations. If so, this may introduce another dimension of spatial dependence and potentially undermine the central aim of this study, which is to address spatial autocorrelation. Additionally, this approach implicitly (and perhaps undesirably) assigns greater weight to larger landslides, as they contribute more pixels to the model, even though magnitude/size is not part of conventional landslide susceptibility modelling.

Alternative strategies could be considered. For example, mixed-effects modelling frameworks allow explicit treatment of hierarchical structures (e.g., pixels nested within landslides via a landslide ID as a random effect). More broadly, if spatial dependence is a core topic in this study, the analysis could go beyond spatial block validation alone. There are even approaches such as spatially explicit hyperparameter tuning that tackle the topic not only during validation.

Schratz, P., Muenchow, J., Iturritxa, E., Richter, J., and Brenning, A.: Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data, Ecological Modelling, 406, 109–120, https://doi.org/10.1016/j.ecolmodel.2019.06.002, 2019.

Schlögl, M., Spiekermann, R., and Steger, S.: Towards a holistic assessment of landslide susceptibility models: insights from the Central Eastern Alps, Environ Earth Sci, 84, 113, https://doi.org/10.1007/s12665-024-12041-y, 2025.

Moreover, several studies in landslide susceptibility modelling advocate using a single representative point per landslide, preferably within the initiation zone, to better capture causal landslide conditions. Based on Figure 1, it appears that both scarp and runout areas were mapped and therefore sampled. Including runout zones may dilute the relationship between predictors and landslide initiation, potentially explaining why slope or other predictors appear unimportant while lithology and elevation dominate. These implications should be more thoroughly considered.
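One way to illustrate the single-point-per-landslide idea is the sketch below. It is only a crude stand-in: taking the highest-elevation pixel of each polygon as a proxy for the initiation zone is an assumption made here for illustration, and the field names (`landslide_id`, `elevation`) are hypothetical. A mapped scarp or source-area polygon, where available, would be the better basis.

```python
def one_point_per_landslide(pixels):
    """Keep one presence pixel per landslide.

    pixels: iterable of dicts with keys 'landslide_id', 'x', 'y', 'elevation'.
    Assumption: the highest pixel of each landslide approximates the scarp
    (initiation) area. Ties keep the first pixel encountered.
    """
    best = {}
    for px in pixels:
        lid = px["landslide_id"]
        if lid not in best or px["elevation"] > best[lid]["elevation"]:
            best[lid] = px
    return [best[lid] for lid in sorted(best)]


# Hypothetical pixels from two landslides.
pixels = [
    {"landslide_id": 1, "x": 0, "y": 0, "elevation": 2310.0},
    {"landslide_id": 1, "x": 10, "y": 0, "elevation": 2335.0},
    {"landslide_id": 2, "x": 500, "y": 40, "elevation": 2100.0},
]
presence = one_point_per_landslide(pixels)
```

With this reduction every landslide contributes exactly one presence observation, so polygon size no longer determines its weight in the model.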

2. Missing engagement with spatial cross-validation literature

The manuscript does not sufficiently engage with other literature on spatial cross-validation. Several contributions (e.g., Brenning; Schratz et al.; Schlögl et al.) as well as critical perspectives (e.g., Wadoux et al.) are not adequately discussed. Incorporating these studies would help contextualize the methodological choices and clarify the degree of novelty of this work. It would also allow for a more balanced discussion of when spatial cross-validation is appropriate and what its limitations are.

Wadoux, A. M. J.-C., Heuvelink, G. B. M., de Bruin, S., and Brus, D. J.: Spatial cross-validation is not the right way to evaluate map accuracy, Ecological Modelling, 457, 109692, https://doi.org/10.1016/j.ecolmodel.2021.109692, 2021.

Brenning, A.: Spatial prediction models for landslide hazards: review, comparison and evaluation, Natural Hazards and Earth System Science, 5, 853–862, 2005.

Schratz, P., Muenchow, J., Iturritxa, E., Richter, J., and Brenning, A.: Hyperparameter tuning and performance assessment of statistical and machine-learning algorithms using spatial data, Ecological Modelling, 406, 109–120, https://doi.org/10.1016/j.ecolmodel.2019.06.002, 2019.

Schlögl, M., Spiekermann, R., and Steger, S.: Towards a holistic assessment of landslide susceptibility models: insights from the Central Eastern Alps, Environ Earth Sci, 84, 113, https://doi.org/10.1007/s12665-024-12041-y, 2025.

3. Justification of spatial partitioning strategy

The rationale behind the selected spatial block sizes (600–1500 m) remains unclear and should be more explicitly justified. In particular, it is not evident how these scales relate to the spatial characteristics of the mapped landslides or the underlying processes. Additionally, it may be important to clarify whether individual landslides can span multiple spatial blocks. If so, there is a possibility that different parts of the same landslide are included in both the training and validation datasets. This would compromise the independence between training and test data. The manuscript should explicitly address this issue and, if relevant, describe how such cases were handled.
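The check requested here, whether any landslide straddles more than one block, could be run along the following lines. The function name, the input layout of `(landslide_id, x, y)` tuples, and the grid origin at (0, 0) are assumptions for illustration and are not taken from the manuscript.

```python
from collections import defaultdict


def landslides_spanning_blocks(pixels, block_size):
    """Return {landslide_id: sorted list of block indices} for every landslide
    whose pixels fall into more than one square block.

    pixels: iterable of (landslide_id, x, y) with coordinates in metres.
    """
    blocks_by_id = defaultdict(set)
    for lid, x, y in pixels:
        blocks_by_id[lid].add((int(x // block_size), int(y // block_size)))
    return {lid: sorted(b) for lid, b in blocks_by_id.items() if len(b) > 1}


# Hypothetical pixels: landslide "A" crosses the x = 600 m block boundary.
pixels = [("A", 100, 100), ("A", 650, 100), ("B", 1300, 1300), ("B", 1350, 1310)]
split_600 = landslides_spanning_blocks(pixels, block_size=600)
split_1500 = landslides_spanning_blocks(pixels, block_size=1500)
```

Any landslide reported by such a check would need to be assigned to a single fold as a whole (for example by its centroid block or by grouping on landslide ID) for training and test data to remain independent.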

Alternative spatial partitioning strategies are neither tested nor discussed, such as:

• clustering-based approaches (e.g., k-means),

• geomorphologically meaningful units (e.g., catchments),

• lithological or landscape-based stratification.

Acknowledging or comparing such alternatives would strengthen the methodological robustness.

4. “Spatial leakage”

The term “spatial leakage” (Line 79) is introduced without a clear definition. While it is often used to describe unintended information transfer between training and validation datasets due to spatial proximity (see also the comments above), the manuscript should explicitly define how the term is used in this study.

5. Process-based considerations

The study area is small but contains relatively large, predominantly rotational landslides (mean size ~16.7 ha), which likely reflect deep-seated processes. In such cases, subsurface conditions (e.g., material properties, hydrology, weak layers) are often key controls but are not well captured by the typical surface predictors used in this study. Furthermore, as also outlined above, the manuscript does not clearly distinguish between initiation and runout zones. Including runout areas in the sampling (especially for such large landslides) may introduce noise, as these areas can occur under very different topographic and lithological conditions than the source zones. This could reduce the model's ability to identify “true” causal factors.

6. Resolution and representation of pre-failure conditions

The use of a 10 m DEM raises concerns when modelling relatively large landslides in this way. Morphometric predictors derived from such data are likely influenced by post-failure topography rather than representing pre-failure conditions. This is particularly relevant in studies of large landslides, where the terrain has been substantially altered. As a result, the model may have limited predictive capability for identifying currently stable but susceptible areas, which is arguably the primary objective of susceptibility modelling. This limitation should be explicitly addressed and discussed in the manuscript.

Steger, S., Schmaltz, E., and Glade, T.: The (f)utility to account for pre-failure topography in data-driven landslide susceptibility modelling, Geomorphology, 354, 107041, https://doi.org/10.1016/j.geomorph.2020.107041, 2020.

Van Den Eeckhaut, M., Vanwalleghem, T., Poesen, J., Govers, G., Verstraeten, G., and Vandekerckhove, L.: Prediction of landslide susceptibility using rare events logistic regression: A case-study in the Flemish Ardennes (Belgium), Geomorphology, 76, 392–410, https://doi.org/10.1016/j.geomorph.2005.12.003, 2006.

Citation: https://doi.org/10.5194/egusphere-2026-1318-RC1

RC2: 'Comment on egusphere-2026-1318', Anonymous Referee #2, 19 Apr 2026

The manuscript should be rejected as its overall scientific quality is very low and does not meet the standard required for publication. Although the topic of spatial validation in landslide susceptibility modelling is potentially relevant, the study is fundamentally limited by extremely poor data quality and insufficient scientific depth. Most critically, the landslide inventory contains only 25 events within a very small study area (~32 km²), which is far from adequate to support a pixel-based machine learning model or any statistically meaningful validation. The modelling framework therefore lacks representativeness, robustness, and generalizability. The so-called “multi-scale spatial validation” is essentially a technical exercise with limited novelty, as similar block cross-validation approaches have already been widely discussed in the literature, and the manuscript fails to provide any substantive methodological advancement. In addition, the use of balanced pseudo-absence sampling further undermines the physical interpretability of the results, yet this issue is not rigorously addressed. The discussion section is largely descriptive and repetitive, lacking critical analysis, mechanism interpretation, or connection to broader hazard processes. No new insights into landslide processes, hazard mechanisms, or practical applications are provided. Figures and results mainly confirm well-known patterns (e.g., lithology and elevation dominance), offering little scientific contribution. Overall, the study is a routine application of an existing method on a very limited dataset, with weak innovation, insufficient data support, and poor scientific significance. Therefore, rejection is recommended.

Citation: https://doi.org/10.5194/egusphere-2026-1318-RC2

Latest update: 29 Apr 2026

Short summary

Landslides are a common hazard in mountainous regions and can seriously affect infrastructure and local communities. Identifying where they are more likely to occur is important for reducing risk and improving land-use planning. This study examines landslide susceptibility in the northern Peruvian Andes using terrain information from satellite data and geological mapping. The results highlight areas with higher landslide likelihood and support hazard assessment and territorial planning.

