Evaluating the effects of preprocessing, method selection, and hyperparameter tuning on SAR-based flood mapping and water depth estimation
Abstract. Flood mapping and water depth estimation from Synthetic Aperture Radar (SAR) imagery are crucial for calibrating and validating hydraulic models. This study uses SAR imagery to evaluate various preprocessing (especially speckle noise reduction), flood mapping, and water depth estimation methods. The impact of the method chosen at each step, and of its hyperparameters, is studied by considering an ensemble of preprocessed images, flood maps, and water depth fields.
The evaluation is conducted for two flood events on the Garonne River (France) in 2019 and 2021, using hydrodynamic simulations and in-situ observations as reference data. Results show that the choice of speckle filtering method can significantly alter flood extent estimates, with variations of several square kilometers. The selection and tuning of flood mapping methods likewise significantly affect performance: while supervised methods outperform unsupervised ones, well-tuned unsupervised approaches (such as local thresholding or change detection) can achieve comparable results. The uncertainty compounded across the preprocessing and flood mapping steps also introduces substantial variability into the water depth field estimates.
This study highlights the importance of considering the entire processing pipeline, encompassing the preprocessing, flood mapping, and water depth estimation methods and their associated hyperparameters. Rather than relying on a single configuration, an ensemble approach that accounts for methodological uncertainty should be preferred. For flood mapping, the choice of method has the greatest influence. For water depth estimation, the most influential factors are the input flood map (i.e., the output of the flood mapping step) and the hyperparameters of the estimation methods.
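(As a reading aid for the comments below: a minimal sketch of the kind of unsupervised, threshold-based flood classification referred to in the abstract. The array names, the use of Otsu's method, and the small-object removal are my own illustrative assumptions, not the authors' configuration.)

```python
# Illustrative sketch of an unsupervised threshold-based flood map from a
# SAR backscatter image in dB. The Otsu threshold and the post-filtering
# step are assumptions for illustration, not the paper's implementation.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import remove_small_objects

def threshold_flood_map(sigma0_db: np.ndarray, min_region_px: int = 50) -> np.ndarray:
    """Classify pixels darker than a global Otsu threshold as open water."""
    t = threshold_otsu(sigma0_db)          # threshold estimated from the dB histogram
    water = sigma0_db < t                  # low backscatter -> candidate water
    return remove_small_objects(water, min_size=min_region_px)

# Toy example: ~ -20 dB over water, ~ -10 dB over land
rng = np.random.default_rng(0)
img = rng.normal(-10, 1.5, (256, 256))
img[100:150, :] = rng.normal(-20, 1.5, (50, 256))
flood_mask = threshold_flood_map(img)
print(f"Flooded fraction: {flood_mask.mean():.2%}")
```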
This is a very detailed and interesting study that compares different SAR-based flood extent and water depth estimation techniques. The study is carried out for two flood events (December 2019 and February 2021) in the floodplains of the Garonne River in France. An impressive number of simulations were carried out using different SAR preprocessing approaches, models, and model parameterizations, and assessed against high-quality hydraulic model outputs and observed watermarks. In my view, this is a much-needed and highly valuable study investigating the strengths and weaknesses of different algorithms in SAR data processing + flood mapping + water depth estimation. However, I have several major and minor comments.
MAJOR COMMENTS
MINOR COMMENTS
Section 1: What is the definition of a “hyperparameter”? What makes it different from a “normal” model parameter?
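For what it is worth, my own reading of the distinction (which the authors may or may not share) is that a hyperparameter is fixed by the user before the method runs, whereas a parameter is estimated from the data by the method itself, e.g.:

```python
# Illustration of the distinction as I read it (not necessarily the authors'
# definition): the tile size is set a priori by the analyst (hyperparameter),
# while the threshold value is estimated from the data (parameter).
import numpy as np
from skimage.filters import threshold_otsu

TILE_SIZE = 64  # hyperparameter: chosen before the method is run

def fit_tile_threshold(tile_db: np.ndarray) -> float:
    """The threshold itself is a fitted parameter, estimated from the tile histogram."""
    return threshold_otsu(tile_db)

tile = np.random.default_rng(0).normal(-15, 4, (TILE_SIZE, TILE_SIZE))
print(fit_tile_threshold(tile))
```

A clear statement along these lines early in Section 1 would help.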
Section 1: There are some recent studies that investigated the effect of different model parameters on flood mapping accuracy (e.g. studies comparing different change detection algorithms). Please review the literature and relate this work to the existing publications (also come back to this point in the discussion section).
End of section 1 / beginning of section 2: Check for repetitions
Line 86: Why are so many configurations tested for the local threshold approach (36 versus 2 / 2 / 6 configurations for the other three methods)? Does this mean that the threshold approach has advantages?
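To illustrate my point about configuration counts: with a local-threshold method, a few tuning choices multiply quickly. The knob names and values below are hypothetical, not taken from the manuscript; they only show how a 36-member grid can arise.

```python
# Hypothetical hyperparameter grid for a local-threshold method; the knobs
# and values are made up for illustration, not the ones used in the paper.
from itertools import product

tile_sizes        = [64, 128, 256]              # px
threshold_methods = ["otsu", "kittler", "quantile"]
min_region_sizes  = [10, 50, 100, 500]          # px

grid = list(product(tile_sizes, threshold_methods, min_region_sizes))
print(len(grid))  # 3 * 3 * 4 = 36 configurations
```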
Figure 3: Show the locations of the in-situ sites.
Line 163: Only in a narrow sense would I agree with this statement: “The main source of error in SAR imagery is speckle noise …”. In practice, there are many physical reasons for uncertainties in SAR-derived flood maps.
Line 248: Visually, the SAR2SAR filter does indeed look nicer than the other filters. But are there any quantitative indicators that can substantiate that SAR2SAR “outperforms the traditional methods”? How much of this filtered image is invented, and how much of it is true?
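No-reference indicators that are commonly used for this purpose include the equivalent number of looks (ENL) over a visually homogeneous area and the statistics of the ratio image original/filtered. A rough sketch (the variable names and the choice of region are placeholders; computations assume intensity, i.e. linear, data):

```python
# Two common no-reference checks for speckle filters:
# (1) equivalent number of looks (ENL) over a homogeneous region,
# (2) statistics of the ratio image original/filtered, which should resemble
#     pure speckle (mean close to 1) if no structure was removed or invented.
import numpy as np

def enl(intensity_region: np.ndarray) -> float:
    """ENL = mean^2 / variance over a homogeneous area (intensity data)."""
    return intensity_region.mean() ** 2 / intensity_region.var()

def ratio_image_stats(original: np.ndarray, filtered: np.ndarray):
    ratio = original / np.maximum(filtered, 1e-12)
    return ratio.mean(), ratio.std()

# Toy check: single-look intensity speckle over a homogeneous area has ENL ~ 1
rng = np.random.default_rng(0)
homogeneous = rng.exponential(scale=0.05, size=(200, 200))
print(enl(homogeneous))
```

A higher ENL after filtering indicates stronger smoothing over homogeneous areas, while a ratio-image mean far from 1 (or visible structure in the ratio image) would suggest that the filter has removed or “invented” radiometric information.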
Figure 6: These are the VH images, right?
Figure 7: Use the same y-axis scale to allow a direct comparison.
Line 349: How many of the Sen1Floods11 flood cases show conditions similar to those of the Garonne River floods?
Figure 9: I find the spread of the results for different algorithm / flood case combinations surprisingly low. What is the reason for this? Does this also reflect different pre-processing options?
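To make the request concrete, something like the following summary of the score spread across algorithm / flood case / pre-processing combinations would help; the column names, the CSI metric, and all values below are placeholders, not the authors' results.

```python
# Hypothetical sketch of how the spread across algorithm / flood case /
# pre-processing combinations could be summarised; all names and scores
# are placeholders, not taken from the manuscript.
import itertools
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
rows = []
for algo, event, prefilter in itertools.product(
        ["local_threshold", "change_detection", "cnn"],
        ["dec2019", "feb2021"],
        ["lee", "refined_lee", "sar2sar"]):
    rows.append({"algorithm": algo, "event": event, "prefilter": prefilter,
                 "csi": rng.uniform(0.6, 0.85)})  # placeholder scores
scores = pd.DataFrame(rows)

# Spread of the score across pre-processing options, per algorithm and event
spread = scores.groupby(["algorithm", "event"])["csi"].agg(["mean", "std", "min", "max"])
print(spread)
```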
Line 535: Some grasslands may cause “water-look-alike conditions”, but normally vegetation causes a loss of sensitivity of backscatter to flooding.