Review article: Hydrologically Enhanced Machine Learning Framework for Urban Flood Inundation Mapping Using Multi-Sensor Remote Sensing Data: A Case Study of Mumbai, India

Pawar, Ankush S.; Phade, Gayatri M.

doi:10.5194/egusphere-2026-1275

https://doi.org/10.5194/egusphere-2026-1275

22 Apr 2026

Status: this preprint is open for discussion and under review for Natural Hazards and Earth System Sciences (NHESS).

Abstract. Complex terrain, densely built-up surfaces, and a lack of reliable ground observations make urban flood mapping difficult in rapidly urbanizing coastal megacities. This study proposes a hydrologically enhanced machine learning framework for automated urban flood inundation mapping that combines multi-sensor satellite data with a scalable decision support system (DSS). Sentinel-1 SAR, Sentinel-2 optical imagery, SRTM digital elevation data, and CHIRPS precipitation data were processed in Google Earth Engine to create a comprehensive predictor stack.

To explicitly model flood propagation controls that most data-driven models omit, two new hydrologic-topographic predictors were created, the Relative Elevation Model (REM) and the River Network Index (RNI), to represent local terrain depressions and hydraulic connectivity. A consensus-based combination of SAR backscatter change, optical water indices, and topographic constraints produced flood labels covering approximately 2.6 × 10⁵ flooded pixels in the Mumbai Metropolitan Region during the 2019 monsoon season. A representative training set for supervised classification was formed using balanced stratified sampling. Random Forest, tuned XGBoost, and ensemble models were built and tested in Python using standard classification metrics. The tuned XGBoost model performed best, with an overall accuracy of 71.7 percent and an area under the receiver operating characteristic curve (AUC) of 0.803, outperforming the Random Forest and ensemble configurations. The improvement in model discrimination was statistically significant at the 95 percent confidence level. Ablation analysis showed that including REM and RNI increased model discrimination by approximately 5–6 percent in AUC, which supports their importance in urban flood detection. The predicted inundation pattern agrees closely with known flood-prone regions along the major drainage network.

The proposed approach is a reproducible, scalable, and hydrologically informed framework for urban flood inundation mapping, with high potential for operational flood monitoring and decision support in data-limited tropical cities.

Received: 07 Mar 2026 – Discussion started: 22 Apr 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 24 Jun 2026)

CC1:
'Comment on egusphere-2026-1275', Anupal Baruah, 26 Apr 2026

1) I am somewhat skeptical about the use of SAR data for urban flood mapping, given the well-known double-bounce scattering effect in built-up areas, which can lead to misclassification of flooded regions. This limitation is already acknowledged in Table 1. In light of this, it would be helpful if the authors could further justify their decision to proceed with SAR data, and clarify how they mitigate or account for these uncertainties in their analysis.
2) Additionally, the use of 30 m spatial resolution may be too coarse for accurately capturing urban flood dynamics, where fine-scale features such as roads, drainage networks, and building footprints play a critical role.

Citation: https://doi.org/10.5194/egusphere-2026-1275-CC1
- AC1:
  'Reply on CC1', Gayatri M Phade, 26 Apr 2026
  We sincerely thank the commenter for the insightful and constructive feedback on our manuscript. The points raised regarding the use of SAR data in urban environments and the implications of spatial resolution are highly relevant, and we appreciate the opportunity to clarify these aspects.
  
  Regarding the use of SAR data in urban flood mapping, we agree that double-bounce scattering in built-up areas can introduce uncertainties and may lead to misclassification of flooded regions. Despite this limitation, SAR data remain a widely adopted and valuable source for flood mapping due to their all-weather, day-and-night imaging capability, which is particularly critical during flood events characterized by cloud cover.
  
  In our study, we mitigate these limitations through a multi-source data integration framework. Specifically, SAR-derived features are not used in isolation but are combined with hydrologic–topographic indicators such as relative elevation and flow accumulation, as well as additional geospatial predictors. This integration allows the machine learning model to reduce reliance on any single data source and improves robustness against SAR-specific artefacts. Furthermore, the use of statistical descriptors and temporal variability features helps to distinguish true flood signals from urban backscatter effects. We will further clarify this mitigation strategy in the revised manuscript.
  
  With respect to the use of 30 m spatial resolution, we acknowledge that this resolution may not fully capture fine-scale urban features such as roads, drainage networks, and individual buildings. However, the choice of 30 m resolution was guided by the need to maintain consistency across multiple datasets (e.g., DEM, precipitation, and derived hydrologic variables) and to ensure computational feasibility for regional-scale analysis.
  
  Our objective is to provide a scalable and generalizable framework for urban flood susceptibility mapping rather than detailed street-level inundation modelling. The machine learning framework leverages terrain and hydrologic context, which remain meaningful at this spatial scale. Nevertheless, we agree that higher-resolution datasets could further enhance the accuracy of flood delineation in dense urban environments. This limitation and its implications will be more explicitly discussed in the revised manuscript, along with suggestions for future work using higher-resolution data.
  We thank the commenter again for these valuable suggestions, which will help improve the clarity and robustness of our study.
  
  Citation: https://doi.org/10.5194/egusphere-2026-1275-AC1
RC1:
'Comment on egusphere-2026-1275', Anonymous Referee #1, 31 May 2026

The manuscript addresses an important and timely topic in urban flood inundation mapping by integrating multi-sensor remote sensing data with machine learning and hydrologic-topographic predictors. The use of Sentinel-1 SAR, Sentinel-2 optical imagery, SRTM DEM, CHIRPS rainfall, and HydroRIVERS data provides a relevant basis for developing a scalable flood mapping framework. The inclusion of the Relative Elevation Model (REM) and River Network Index (RNI) is a promising attempt to improve the physical interpretability of machine learning-based flood detection. The reported performance of the tuned XGBoost model, with an accuracy of 71.7% and AUC of 0.803, suggests that the proposed framework has potential for operational urban flood monitoring. However, the manuscript requires substantial revision before it can be considered scientifically robust. First, the title describes the paper as a “Review article,” but the manuscript is clearly an original research article involving data processing, model development, model evaluation, and case study application. This should be corrected to avoid confusion.
The novelty of the study also needs to be clarified. The manuscript claims that REM and RNI are new hydrologic-topographic predictors, but it does not sufficiently explain how these indices differ from existing flood conditioning variables such as elevation, slope, distance to river, HAND, drainage proximity, topographic wetness index, or flow accumulation. The authors should clearly state whether REM and RNI are newly developed indices, modified versions of existing indices, or case-specific hydrologic features. A major concern is the formulation of the RNI. The text describes RNI as a measure of hydraulic proximity and connectivity to river networks, but the equation uses cumulative precipitation divided by the elevation difference from the minimum DEM. This formulation does not directly represent distance to river, drainage connectivity, or river network influence. The authors should revise the RNI equation so that it is mathematically consistent with its stated hydrological meaning.
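The mismatch described above can be illustrated with a minimal sketch. It assumes the RNI takes the form paraphrased in this comment, cumulative precipitation divided by height above the DEM minimum; the function name, the `eps` guard against division by zero, and the two pixels are hypothetical and are not taken from the manuscript.

```python
def rni_as_described(precip_mm, elevation_m, min_elevation_m, eps=1.0):
    """RNI as paraphrased in the comment: rainfall over height above the DEM minimum.

    eps (metres) is a hypothetical guard against division by zero at the lowest pixel.
    """
    return precip_mm / (elevation_m - min_elevation_m + eps)


# Two hypothetical pixels: same elevation and rainfall, very different distance to the river.
pixels = [
    {"name": "riverbank", "elevation_m": 12.0, "precip_mm": 300.0, "dist_to_river_m": 30.0},
    {"name": "inland", "elevation_m": 12.0, "precip_mm": 300.0, "dist_to_river_m": 4000.0},
]
z_min = 2.0  # assumed minimum elevation of the DEM, in metres

rni_values = [rni_as_described(p["precip_mm"], p["elevation_m"], z_min) for p in pixels]

for p, value in zip(pixels, rni_values):
    print(f'{p["name"]}: RNI = {value:.2f}, distance to river = {p["dist_to_river_m"]:.0f} m')
```

Because distance to the river never enters the ratio, both pixels receive the same value, which is the inconsistency with the stated meaning of the index.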
The flood label generation process also requires stronger justification. The manuscript uses a consensus rule based on SAR backscatter ratio, backscatter difference, NDWI, and REM thresholds, but the selected threshold values are not adequately justified. The authors should explain why SAR ratio > 1.25, backscatter difference ≥ 3 dB, NDWI > 0.05, and REM < 5 m were selected. A threshold sensitivity analysis would strengthen the reliability of the generated flood labels. The validation strategy is another important limitation. Since the training and testing labels are generated from remote sensing-based consensus rules, the model may be learning the labeling assumptions rather than being validated against independent flood observations. The authors are encouraged to include independent validation data, such as official flood records, observed flood locations, high-resolution imagery, or historical flood-prone areas in Mumbai. If such data are unavailable, the manuscript should clearly state that the reported accuracy reflects agreement with consensus-generated labels rather than confirmed ground truth.
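The threshold sensitivity analysis suggested above could be sketched as follows. The four default thresholds are those quoted in this comment; how the manuscript combines them is not stated here, so the `min_votes` rule, the function names, and the five pixels are assumptions for illustration only.

```python
def consensus_label(sar_ratio, sar_diff_db, ndwi, rem_m,
                    ratio_thr=1.25, diff_thr=3.0, ndwi_thr=0.05, rem_thr=5.0,
                    min_votes=4):
    """Label a pixel as flooded when at least min_votes of the four criteria hold.

    min_votes=4 means all criteria must agree; this combination rule is an assumption.
    """
    votes = [
        sar_ratio > ratio_thr,    # SAR backscatter ratio
        sar_diff_db >= diff_thr,  # SAR backscatter difference in dB
        ndwi > ndwi_thr,          # optical water index
        rem_m < rem_thr,          # relative elevation in metres
    ]
    return sum(votes) >= min_votes


# Hypothetical pixels: (sar_ratio, sar_diff_db, ndwi, rem_m)
PIXELS = [
    (1.40, 4.0, 0.20, 2.0),
    (1.30, 3.1, 0.06, 4.8),
    (1.20, 3.5, 0.10, 3.0),
    (1.50, 5.0, 0.30, 8.0),
    (1.10, 1.0, -0.10, 12.0),
]


def flood_count(**overrides):
    """Count flooded pixels after overriding any threshold or the voting rule."""
    return sum(consensus_label(*pixel, **overrides) for pixel in PIXELS)


# Vary one threshold at a time and report how the number of flood labels changes.
for name, values in [("ndwi_thr", [0.0, 0.05, 0.10]), ("rem_thr", [4.0, 5.0, 10.0])]:
    for value in values:
        print(f"{name} = {value}: {flood_count(**{name: value})} flooded pixels")
```

A large change in the flooded-pixel count for a small change in one threshold would indicate that the labels, and therefore the reported accuracy, depend strongly on that choice.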
The statistical significance claims should also be improved. The manuscript states that the XGBoost model significantly outperforms other models based on the DeLong test, but the p-values, confidence intervals, and test statistics are not reported. Since the AUC difference between Random Forest and XGBoost is small, these values are necessary to support the claim of statistical significance. There is also an inconsistency in the reported ensemble performance. Table 3 reports the RF-XGB ensemble AUC as 0.794, while the ROC figure appears to show a different ensemble AUC value. The authors should carefully check and correct all performance values in the tables, figures, and discussion.
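One simple way to report an interval for an AUC difference is a paired bootstrap, sketched below. This is a swapped-in technique, not the DeLong test used in the manuscript, and the labels and scores are synthetic; the function names and parameters are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def paired_bootstrap_auc_diff(labels, scores_a, scores_b, n_boot=300, seed=0):
    """Return AUC(a) - AUC(b) and a 95% percentile bootstrap interval.

    Both models are evaluated on the same resampled pixels, so the comparison is paired.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined when a resample contains only one class
        a = auc(y, [scores_a[i] for i in idx])
        b = auc(y, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    low = diffs[int(0.025 * n_boot)]
    high = diffs[int(0.975 * n_boot) - 1]
    return auc(labels, scores_a) - auc(labels, scores_b), (low, high)


# Synthetic example: model A separates the classes somewhat better than model B.
rng = random.Random(42)
labels = [i % 2 for i in range(60)]
scores_a = [y + rng.gauss(0, 1.0) for y in labels]
scores_b = [0.5 * y + rng.gauss(0, 1.0) for y in labels]

diff, (ci_low, ci_high) = paired_bootstrap_auc_diff(labels, scores_a, scores_b)
print(f"AUC difference = {diff:.3f}, 95% bootstrap interval = [{ci_low:.3f}, {ci_high:.3f}]")
```

An interval that includes zero would not support a claim that one model discriminates better than the other, which is why the interval and test statistic matter when the AUC gap is small.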
Finally, the manuscript requires substantial language editing. Several sentences are awkward or unclear, and the citation style is inconsistent between author-year and numbered formats. The figures, especially the workflow and spatial flood maps, should be improved for readability and publication quality. Figure 5 should include clearer legends, units, class definitions, and map elements. Overall, the study has potential, but the current version needs major revision. The authors should strengthen the novelty statement, correct the RNI formulation, justify the flood-label thresholds, improve validation, report full statistical testing results, resolve inconsistencies in model performance, and substantially revise the language and presentation.

Citation: https://doi.org/10.5194/egusphere-2026-1275-RC1
- AC2: 'Reply on RC1', Gayatri M Phade, 01 Jun 2026
  
  We thank Referee #1 for the detailed and constructive review of our manuscript. We appreciate the positive assessment of the overall framework and the valuable suggestions for improvement.
  We acknowledge the concerns regarding manuscript classification, the novelty and formulation of the hydrologic–topographic predictors, threshold selection, validation strategy, statistical significance testing, and presentation quality. We are currently revising the manuscript and will address each comment in detail in a point-by-point response and revised manuscript.
  We particularly appreciate the referee's suggestions regarding clarification of the Relative Elevation Model (REM) and River Network Index (RNI), justification of flood-label thresholds, reporting of DeLong test statistics, and improvement of figures and language. These comments will substantially strengthen the manuscript.
  We thank the referee again for the constructive feedback and will carefully incorporate all recommendations in the revised version.
  
  Citation: https://doi.org/10.5194/egusphere-2026-1275-AC2

Data sets

Data for Hydrologic–Topographic Enhanced Machine Learning for Urban Flood Inundation Mapping, Ankush S. Pawar and Gayatri M. Phade, https://doi.org/10.5281/zenodo.18486214

Viewed

Total article views: 348 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
266	63	19	348	19	19

Views and downloads (calculated since 22 Apr 2026)

Month	HTML	PDF	XML	Total
Apr 2026	155	45	18	218
May 2026	111	18	1	130
Jun 2026	0

Latest update: 02 Jun 2026

Short summary

Urban floods frequently affect coastal cities like Mumbai, causing severe damage. This study develops a satellite-based method that combines rainfall, terrain, and radar data with machine learning to map flood areas more accurately. By including terrain features that influence how water spreads, the model improves flood detection reliability. The approach provides a scalable and practical tool for supporting flood monitoring and decision-making in rapidly growing cities.

