Semi-Supervised Segmentation for Mapping Urban Expansion and Hazard Exposure in Lima, Peru

Jaimes, Javier; Moya, Luis; Vilela, Marta; Santa María, María; Santa Cruz, Sandra; Gonzales, Carlos

doi:10.5194/egusphere-2025-5987

Preprints

22 Jan 2026

Status: this preprint is open for discussion and under review for Natural Hazards and Earth System Sciences (NHESS).

Abstract. Urban expansion in rapidly growing cities increases exposure to natural hazards but remains difficult to monitor in regions with limited data. This challenge is amplified in places such as Metropolitan Lima, where global datasets of urban areas lack precision along complex and rapidly changing city boundaries. As a result, recent growth in informal and peripheral zones is not well defined. This study introduces a practical application of a semi-supervised mapping approach that combines satellite imagery with partially labeled information and targeted manual refinement to identify new built-up areas in Metropolitan Lima from 2016 to 2025. The method improves the detection of small and fragmented structures, including emerging informal settlements that global datasets frequently miss. Results show that Metropolitan Lima expanded by approximately 76 km² during the study period. A portion of this growth occurred in coastal zones exposed to tsunamis, in areas with medium to high landslide susceptibility, and on soil types where strong ground shaking is amplified during large earthquakes. These findings highlight the continued concentration of people and infrastructure in hazard-prone terrain.

Received: 02 Dec 2025 – Discussion started: 22 Jan 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 09 May 2026)

RC1: 'Comment on egusphere-2025-5987', Anonymous Referee #1, 01 Mar 2026

This study presents a semi-supervised segmentation framework based on satellite images and partially labeled data to improve the detection of small and informal settlements in Metropolitan Lima, which are often missed by global datasets. Results show that the city expanded by about 76 km² between 2016 and 2025, with a significant share of new development occurring in areas exposed to tsunami, landslide, and seismic hazards, highlighting growing risk in hazard-prone zones. Overall, this study is well designed and comprehensive, and the findings are meaningful for risk-informed and resilient urban management. However, I have several comments and suggestions for improving the current work.

1) Lines 17-18: It is suggested to make it clear that the correlation refers to Spearman’s correlation coefficient, and present the corresponding p-value that indicates its statistical significance.

2) Lines 41-42: What is the difference between the proposed semi-supervised segmentation framework and those in the literature?

3) Line 45: Does “SAR” stand for Synthetic Aperture Radar? It is better to spell out the full name at its first appearance in the manuscript.

4) For figures with maps, it is suggested to add “N” to the north arrow, and add labels and units like “longitude (°)” and “latitude (°)” to the axes.

5) Equation (1): Why does the exponent of e include a coefficient of “5”?

6) Equations (2) and (6): The right square bracket is missing for the second term (the expected loss, E_u) on the right-hand side of the equation.

7) Line 118: Possibly a typo: should pi_m be pi_n?

8) Section 3.1: What is the spatial resolution of the images used for deep learning modeling? Would this affect the model performance, since the resolution of the WSF dataset is 10 m?

9) Line 170 and Figure 6: Model performance in validation is typically expected to be poorer than in training, but this figure shows that the loss values of the two stages almost overlap. It is suggested to randomly split the dataset into training and validation sets to guarantee the model’s robustness. In addition, which set of model weights among the 200 epochs was chosen for further model comparison?

10) Line 193: It is stated that “This result is expected since WSF effectively represents consolidated urban zones worldwide”. If that is the case, both the precision and recall of WSF should be higher than those of the proposed framework.

11) Lines 197-199: What are the possible reasons why the performance difference is relatively large in these cases?

12) Figure 11(a): The uncertainty in the flood modeling process is not negligible. If possible, it is therefore suggested to use probabilistic flood inundation maps rather than deterministic maps for the exposure analysis. Please refer to the paper below.

Reference:

“Uncertainty analysis and quantification in flood insurance rate maps using Bayesian model averaging and hierarchical BMA” (https://doi.org/10.1061/JHYEFF.HEENG-58)

13) Figure 12: How are the clusters defined, and what is “Ha” on the horizontal axis? It is also suggested to change the label of the vertical axis in Figures 12(b)-12(d) to the accumulated area for the corresponding hazard.

14) Lines 339-340: According to Table 2, the statement about “the improved recall in peripheral and remote areas” may hold only for the urban area.
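
As an illustration of comment 1, the following is a minimal sketch of how a Spearman rank correlation could be reported together with a p-value. It is not the manuscript's procedure: the function names (`average_ranks`, `pearson`, `spearman_permutation`), the permutation-based p-value, and the sample data are all assumptions made for this example, and only the Python standard library is used.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_permutation(x, y, n_perm=2000, seed=0):
    """Spearman's rho plus a two-sided permutation p-value (an assumption:
    other p-value approximations exist and may be preferred)."""
    rx, ry = average_ranks(x), average_ranks(y)
    rho = pearson(rx, ry)  # Spearman's rho is Pearson's r on the ranks
    rng = random.Random(seed)
    shuffled = ry[:]
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(rho) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    return rho, p_value


# Hypothetical paired samples with a monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 4, 8, 16, 32, 64, 128, 256]
rho, p_value = spearman_permutation(x, y)
print(f"Spearman rho = {rho:.3f}, permutation p-value = {p_value:.4f}")
```

Reporting both numbers lets a reader judge the strength of the monotonic association and whether it could plausibly arise by chance for the given sample size.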

Citation: https://doi.org/10.5194/egusphere-2025-5987-RC1

Viewed

Total article views: 753 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
210	524	19	753	17	23

Views and downloads (calculated since 22 Jan 2026)

Month	HTML	PDF	XML	Total
Jan 2026	105	39	12	156
Feb 2026	38	35	3	76
Mar 2026	56	427	4	487
Apr 2026	11	23	0	34

Viewed (geographical distribution)

Total article views: 656 (including HTML, PDF, and XML) Thereof 656 with geography defined and 0 with unknown origin.

Latest update: 13 Apr 2026

Short summary

This study presents a semi-supervised method to map recent urban growth in Metropolitan Lima using satellite images and partial reference data. It reveals that the city expanded into several districts and into areas exposed to hazards such as landslides, earthquakes, and tsunamis. The approach helps detect early land occupations and supports safer land-use planning and better local decision-making.

