the Creative Commons Attribution 4.0 License.
An Uncertainty Quantification Framework for Simulation-based Flood Frequency Analysis
Abstract. Flood frequency analysis (FFA) is essential for flood risk management and infrastructure design, yet the uncertainty associated with flood quantile estimates is often poorly characterized or disregarded – especially under data-scarce conditions. Existing uncertainty quantification methods are frequently subjective, overly complex, or impractical for routine engineering use. We introduce a simulation-based uncertainty quantification framework – UQ-flood – that integrates a process-based hydrologic model, a stochastic weather generator, and a residual error model (REM). Designed for annual maxima, the REM accounts for model bias and residual variability, enabling the generation of probabilistic streamflow ensembles tailored to extreme event analysis. We apply UQ-flood to three Canadian watersheds with long streamflow records and contrasting hydroclimatic conditions. We compare its performance against traditional statistical FFA using the Generalized Extreme Value distribution and Bayesian inference. UQ-flood yields flood quantile estimates consistent with long-record statistical methods but with substantially narrower uncertainty bounds. Under short-record conditions (e.g., 30 years), UQ-flood maintains statistically consistent estimates, while statistical FFA produces wide, often impractical uncertainty intervals. Additional experiments reveal that omitting the REM introduces systematic bias in flood magnitude estimates. UQ-flood avoids parametric assumptions about flow distributions, circumvents hydrologic model biases, and is adaptable to data-limited conditions. By explicitly propagating uncertainty from hydrologic simulation to flood quantiles, UQ-flood offers a practical alternative for robust flood risk management, including applications in infrastructure design and floodplain mapping. 
We recommend integrating residual error models into continuous simulation frameworks to improve bias correction and uncertainty quantification in flood risk estimation.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-5784', Ricardo Mantilla, 19 Feb 2026
- RC2: 'Comment on egusphere-2025-5784', Anonymous Referee #2, 17 Mar 2026
The authors present a simulation framework known as the ‘UQ-flood framework’. This framework is based on a stochastic weather generator, a hydrologic model, and a residual error model for annual maximum discharges (AMAX). It enables the simulation of long series of AMAX values, which are used to estimate extreme floods and the associated uncertainties. A case study involving three long time series in Canada (with 100, 90 and 108 years of record) shows that the flood quantiles estimated by the UQ-flood framework are less uncertain than those estimated by a GEV distribution.
I need detailed explanations regarding Section 3.6 and Figures 2 and 3.
Section 3.6.
I do not agree with the methodology presented by the authors. On page 10, line 296, they write: “To account for uncertainty in plotting positions used in flood frequency analysis, we applied a probabilistic approach based on fitting parametric distributions to annual maximum flow (AMF) data. This method enables the estimation of confidence intervals around plotting positions, which are critical for interpreting the reliability of estimated quantiles”.
Serinaldi (2009) compared several methods for computing confidence intervals for extreme quantiles (and NOT for order statistics), and recommended a method using fractional order statistics. He is thus able to compute a confidence interval for extreme values exceeding the highest value in the sample (see figures 2 and 4 of Serinaldi, 2009).
The standard method for analysing a flood distribution involves plotting the experimental distribution as a reference, using observed annual maximum values and assigning plotting positions based on order statistics, as explained by Cunnane (1978; https://doi.org/10.1016/0022-1694(78)90017-3). This experimental distribution is then compared with a fitted distribution (Log Normal, Gamma, Gumbel, GEV…) or a simulated distribution. Using the SCHADEX simulation method, Paquet et al. (2013) compare the distribution of simulated daily or peak discharges against the observed CDF (Figures 11 and 13). Using the SHYPRE simulation method, Arnaud and Lavabre (2002) compare observed and simulated AMAX peak discharge (Figure 4). Using the SHYREG simulation method, Arnaud et al. (2017) present the median value and confidence interval of flood quantiles, estimated using the SHYREG method or Gumbel or GEV distributions, as well as the experimental distribution (Figure 4).
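As a minimal sketch of the experimental distribution described above, the following Python snippet assigns Cunnane plotting positions, p_i = (i − 0.4)/(n + 0.2), to a ranked AMAX sample. The function name `cunnane_plotting_positions` and the sample values are hypothetical and serve only as an illustration; they are not taken from the manuscript.

```python
def cunnane_plotting_positions(amax):
    """Pair each annual maximum with its Cunnane annual exceedance probability (AEP)."""
    n = len(amax)
    ranked = sorted(amax, reverse=True)  # rank 1 = largest flood
    # Cunnane (1978): p_i = (i - 0.4) / (n + 0.2), with i the rank in descending order
    return [(q, (i - 0.4) / (n + 0.2)) for i, q in enumerate(ranked, start=1)]


# Hypothetical AMAX values in m3/s (illustrative only)
sample = [310.0, 452.0, 275.0, 610.0, 398.0]
positions = cunnane_plotting_positions(sample)

for q, aep in positions:
    print(f"Q = {q:6.1f} m3/s   AEP = {aep:.3f}   T = {1 / aep:.1f} yr")
```

Plotting these (AEP, Q) pairs for all observed years gives the reference cloud against which a fitted or simulated distribution can be compared.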
In this paper, Figures 2 and 3 are misleading. One would expect to see good agreement between AMFpp (in red) and AMFsim (in blue), which is not the case. The term “plotting-position confidence interval” is inappropriate, as one would expect a confidence interval for a plotting position to provide a probability interval. If the authors wish to highlight that a plotting position involves uncertainty, as expressed by order statistics, they should present a horizontal interval expressed as annual exceedance probability. This is linked to the fact that the maximum value of an AMAX sample over N years may have a return period greater than N years (it sometimes happens that a 100-year flood occurs during a 20-year observation period). Attached is an example of a confidence interval for a plotting position. It was computed from the simulation of 1000 samples of 20, 50 or 100 years of record. It can be seen that the confidence interval is wider for the largest values of the sample, and that it is wider when the sample size is smaller.
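The kind of experiment described above can be sketched as follows. This is not the referee's actual computation: it is a minimal Monte Carlo illustration that assumes only that F(X) is uniform for any continuous flood distribution F, so uniform variates suffice. The function name `plotting_position_interval`, the seed, and the 90% level are assumptions chosen for the example.

```python
import random


def plotting_position_interval(n_years, rank=1, n_samples=1000, level=0.90, seed=42):
    """Monte Carlo interval for the true AEP of the rank-th largest annual maximum.

    Uniform variates are used because F(X) is uniform for any continuous F,
    so no flood distribution has to be assumed.
    """
    rng = random.Random(seed)
    aeps = []
    for _ in range(n_samples):
        u = sorted((rng.random() for _ in range(n_years)), reverse=True)
        aeps.append(1.0 - u[rank - 1])  # true exceedance probability of that order statistic
    aeps.sort()
    lo = aeps[int((1 - level) / 2 * n_samples)]
    hi = aeps[int((1 + level) / 2 * n_samples) - 1]
    return lo, hi


for n in (20, 50, 100):
    lo, hi = plotting_position_interval(n)
    print(f"n = {n:3d}: AEP of the largest flood lies in [{lo:.4f}, {hi:.4f}], "
          f"i.e. a return period between {1 / hi:.0f} and {1 / lo:.0f} years")
```

The resulting interval is horizontal (expressed in annual exceedance probability), and its upper return-period bound can far exceed the record length, which is the point made about a 100-year flood occurring within a 20-year record.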
We require several explanations, additions or corrections regarding Figures 2 and 3:
* What is the difference between AMFsim (in blue) and UQ-flood (in blue)? I understand that AMFsim represents the output of the simulation chain, without an error model, whereas UQ-flood includes the error model. In general, the simulation chain allows results to be obtained that are consistent with the experimental distribution of AMAX values, provided that its components are properly calibrated (several examples in Paquet et al., 2013; Arnaud and Lavabre, 2002; Arnaud et al., 2017).
* AMFpp (in red) is supposed to represent the plotting positions of the AMAX values, but only 6 points are shown. The authors should plot the experimental AMAX distribution of the three watersheds (with series lengths of 100, 90 and 108 years). We wish to compare this experimental distribution with the GEV distribution and the UQ-flood distribution. In Figure 2 (bottom) it is difficult to understand why the GEV distribution (in yellow) lies above the ENTIRE experimental distribution (in red).
* The vertical confidence interval (in red) on plotting position is misleading, as it can be interpreted as an uncertainty interval for the flood peak (which is estimated from the maximum stage and converted to discharge using a rating curve). It could be removed. We are more interested in comparing the confidence intervals of the GEV distribution and the UQ-flood distribution.
Minor comments:
Page 3, line 70: add also a reference to Paquet et al. (2013), noting that the SCHADEX method has been applied in around 10 to 15 countries.
http://dx.doi.org/10.1016/j.jhydrol.2013.04.045
Page 4, section 2: add a map showing the location of the three watersheds.
Page 5, line 142: “upstream of the Water Survey of Canada gauge”
Page 13, line 387: the reference for Renard et al. (2013) is missing from the “References” section
10.1002/wrcr.20087
Page 14, Figure 2: please add to both panels all the plotting positions based on the observed annual maximum series: at the top with 108 values, at the bottom with 30 values. See, for example, Figures 2 and 8 in Lucas et al. (2024).
Model code and software
UQ-flood: An Uncertainty Quantification Framework for Simulation-based Flood Frequency Analysis Jonathan Romero-Cuellar et al. https://github.com/rarabzad/REM_UQ_EXTREME/tree/main
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 242 | 0 | 2 | 244 | 0 | 0 |
The authors present a novel framework for Uncertainty Quantification of flood-frequency estimates for Annual Maxima that combines model-generated data with a Residual Error Model (REM). Their formalism provides a realistic and easily implementable method for at-a-station Flood Frequency Analysis (FFA). This work is a significant improvement over previous techniques that rely on either statistical assumptions about the properties of the Annual Maxima random variable or purely model-based generation of deterministic values. The careful treatment of model error allows for a smooth combination of these two approaches, providing narrow and physically constrained estimates of uncertainty bounds.
Specific Comments:
1. Clarify whether the underlying HBV model was calibrated using only 30 years of data in the experiments comparing the 30-year and 108-year estimates.
2. Clarify whether the AR model for the residual error was deemed sufficient to characterize autocorrelations or whether this was a subjective decision.
3. Include the lack of "parameter uncertainty" estimation in Section 5.3. It seems to me that if a different set of parameters with similar performance had been chosen, the final uncertainty band could have changed. Therefore, bootstrapping over multiple similarly performing model parameterizations could widen the uncertainty band.
4. Further address the issue of uncertainty in streamflow estimates due to errors in the rating curves. I suggest citing and comparing the uncertainty bands reported here with those found by Velásquez, N., Krajewski, W.F. Effect of streamflow measurement error on flood frequency estimation. Stoch Environ Res Risk Assess 38, 2903–2910 (2024). https://doi.org/10.1007/s00477-024-02707-1
5. I suggest the authors comment on any findings in their analysis that suggest whether annual maxima produced by different flood mechanisms (e.g., snowmelt vs rainfall generated) can be detected, and if a conditional REM model may be needed to address those differences.
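One way the sufficiency question in comment 2 could be checked is sketched below. It assumes a first-order AR model (the order used in the manuscript is not stated here) fitted by the lag-1 autocorrelation, and tests whether the remaining innovations fall inside the approximate 95% white-noise band ±1.96/√n. The residual series is synthetic, and the names `lag1_autocorr` and `ar1_check` are hypothetical; none of this is the authors' implementation.

```python
import math
import random


def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def ar1_check(residuals):
    """Fit AR(1) by the lag-1 autocorrelation and test the leftover innovations."""
    mean = sum(residuals) / len(residuals)
    centred = [r - mean for r in residuals]
    phi = lag1_autocorr(centred)
    innovations = [centred[t] - phi * centred[t - 1] for t in range(1, len(centred))]
    r1 = lag1_autocorr(innovations)
    bound = 1.96 / math.sqrt(len(innovations))  # approximate 95% white-noise band
    return phi, r1, abs(r1) <= bound


# Synthetic residuals from a known AR(1) process (illustrative only, not the paper's data)
rng = random.Random(1)
true_phi, series = 0.6, [0.0]
for _ in range(2000):
    series.append(true_phi * series[-1] + rng.gauss(0.0, 1.0))

phi_hat, r1, looks_white = ar1_check(series)
print(f"phi_hat = {phi_hat:.2f}, lag-1 autocorrelation of innovations = {r1:.3f}, "
      f"within white-noise band: {looks_white}")
```

If the innovations of the real residual series still showed autocorrelation outside the band, a higher-order model would be worth considering; reporting such a diagnostic would make the choice of AR order less subjective.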
Line 123: superscript for square kilometers
Line 389: replace "providing" with "provides"