the Creative Commons Attribution 4.0 License.
An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections
Abstract. Estimating extreme streamflows is critical for delimiting flood zones and designing fluvial infrastructure, but for the vast majority of river sections, no measurements are available. Estimated streamflows at ungauged river sections using spatial interpolation and hydrological modeling are uncertain, and in the context of extreme value analysis, this uncertainty can be crucial when estimating return levels. In the present paper, an errors-in-variables extreme value model is proposed to account for the estimated streamflow uncertainty at ungauged river sections. The true unobserved streamflows correspond to the missing variables in a Bayesian hierarchical model. In this model, the uncertainty of the unobserved streamflows propagates to the uncertainty in the estimated return levels. The model was implemented to estimate the streamflow return levels of 211 ungauged sections of the Chaudière River watershed in Southern Quebec, Canada.
Status: open (until 30 Sep 2024)
RC1: 'Comment on egusphere-2024-2114 (An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections)', Anonymous Referee #1, 10 Sep 2024
The manuscript "An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections" by Duy Anh Alexandre and Jonathan Jalbert presents an interesting model with a slightly less traditional application in the realm of extreme value analysis. The authors focus on deriving information on extremes when these are not observed, using information from hydrological model outputs and an errors-in-variables (EIV) approach.
The modeling approach is interesting and novel (as far as I know), but I do have concerns about the writing of both the text and the mathematical details, which I think make the manuscript not as clear as it should be and make me doubtful of the validity of the results.
Overall I think the manuscript presents an interesting approach, but I would recommend the authors spend some time re-reading the manuscript and making sure the mathematical details follow logically and are written clearly and with precision.
Some more specific comments:
If I understand correctly there is no observed data used in any part of the model, but you use the simulated/estimated data (you use both terms, I think you mean the same thing) to derive the hypothetical/imputed annual maxima across the whole river network. Were the hydrological models not calibrated on something? Is there not any observed flow data that you can use to validate your modeling approach?
More importantly, you are modeling annual maxima: do you extract, for each of the 6 model configurations, a value for the year based on the maximum value of \eta_{ij}? Do you have any way to ensure that you are modeling the flow value of the same day (and does it matter)?
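The same-day question can be checked directly on the simulated series. The following is a minimal sketch on synthetic data, not the authors' procedure: the function names, the six `config_j` labels, and the shape of the synthetic hydrograph are all assumptions made for illustration. It extracts the day and value of the annual maximum for each configuration and reports how far apart the days of the maxima fall.

```python
import math
import random

def annual_max_days(flows):
    """Return {configuration: (day index, maximum flow)} for one year.

    flows maps a configuration name to a list of daily flows (hypothetical layout).
    """
    result = {}
    for name, series in flows.items():
        day = max(range(len(series)), key=series.__getitem__)
        result[name] = (day, series[day])
    return result

def max_day_spread(flows):
    """Number of days between the earliest and latest annual maximum."""
    days = [day for day, _ in annual_max_days(flows).values()]
    return max(days) - min(days)

# Synthetic year: one spring flood peak, perturbed independently for each
# of six hypothetical model configurations (multiplicative noise).
random.seed(0)
n_days = 365
base = [10.0 + 90.0 * math.exp(-(((d - 110) / 8.0) ** 2)) for d in range(n_days)]
flows = {
    f"config_{j}": [q * math.exp(random.gauss(0.0, 0.15)) for q in base]
    for j in range(1, 7)
}

for name, (day, value) in annual_max_days(flows).items():
    print(f"{name}: maximum {value:.1f} on day {day}")
print("spread of maximum days:", max_day_spread(flows))
```

A spread of zero would mean every configuration places its annual maximum on the same day; a large spread would indicate that the six values describe different flood events.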
In Section 3.2 you use Y_{i} to indicate the maxima in year i, but then in 3.3 you use Y_{j}, with j indexing the year, but you also have the eta and zeta parameters indexed by {i,j}, where now i indicates the hydrological model? This is quite confusing.
In equation 6: should the logNormal not be for X_{ij}? The notation is unclear here: should the subscript of f_ not be capital Y? We normally give distributions for random variables, not for realizations (indeed you do so in eq 7).
Line 178 "contributes to the distribution of Y_j" -> Y_i? The idea is that in any year each jth model contributes differently.
Is Section 3.3 not assuming in some way that (conditional) uncertainty of the maximum distribution is independent of the size of the maximum? It is often the case that hydrological models are more uncertain for extremes (where the data available for calibration is more scarce): is this something that could undermine your approach?
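To make this question concrete, here is a short worked calculation under an assumed form of the error model. It supposes, as the reference to a logNormal in equation 6 suggests, that an estimate X given the true maximum Y = y is log-normal with a log-scale standard deviation \zeta that does not depend on y. The symbols X, y and \zeta follow the usage in this comment; the exact parameterization in the manuscript may differ.

```latex
% Assumed error model (illustrative, not taken from the manuscript):
X \mid Y = y \;\sim\; \mathrm{LogNormal}\!\left(\log y,\ \zeta^{2}\right)

% Conditional mean and variance of a log-normal variable:
\mathbb{E}[X \mid Y = y] = y \, e^{\zeta^{2}/2},
\qquad
\operatorname{Var}[X \mid Y = y] = y^{2} \left(e^{\zeta^{2}} - 1\right) e^{\zeta^{2}}

% Coefficient of variation, which does not depend on y:
\mathrm{CV}[X \mid Y = y]
  = \frac{\sqrt{\operatorname{Var}[X \mid Y = y]}}{\mathbb{E}[X \mid Y = y]}
  = \sqrt{e^{\zeta^{2}} - 1}
```

Under this assumed form the absolute spread grows with y while the relative spread stays constant, so the concern reduces to whether the relative error of the hydrological models also increases for the largest events.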
What is the $\sigma$ which appears in equation 8 (and what value did you choose for this hyper-parameter)? From the results it looks like you greatly reduce the uncertainty of the estimated maximum: could this be linked to fairly informative priors?
Personally I find the derivation in Section 3.5 slightly easier to follow, probably because it is closer to the traditional EIV derivation. It's OK to leave this at a later stage of the manuscript, but it is not at all clear how the two derivations are equivalent, considering the final equation has a different form (the subscripts of f are different; how is this equivalent?). Also, as it stands, the equation at line 219 is hard to parse since \eta_i represents both the random variable and its realization (similarly, the equation at line 215 would normally be written using the $\sim$ formulation, making clear what is a random variable and what is a realization).
Section 4.3: is it surprising that MCMC samples from what you have assumed to be GEV-distributed are GEV-distributed? The real test here would be to have the measured flow values and see if the qq-plot of those values behaves like a GEV.
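The circularity in this check can be illustrated with a small sketch, assuming arbitrary GEV parameters that are not taken from the manuscript. Values are drawn from a GEV by inverse-transform sampling and compared with quantiles of the same GEV, so the QQ-plot correlation is close to 1 by construction and says nothing about whether real flows follow that distribution. The helper names and parameter values are hypothetical.

```python
import math
import random

def gev_quantile(p, mu, sigma, xi):
    """Quantile function of GEV(mu, sigma, xi), valid for xi != 0 and 0 < p < 1."""
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0)

def draw_gev(n, mu, sigma, xi, rng):
    """Draw n GEV values by inverse-transform sampling."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # skip the measure-zero endpoint where log(u) is undefined
            draws.append(gev_quantile(u, mu, sigma, xi))
    return draws

def qq_correlation(sample, mu, sigma, xi):
    """Pearson correlation between ordered sample and theoretical GEV quantiles."""
    n = len(sample)
    ordered = sorted(sample)
    theo = [gev_quantile((k - 0.5) / n, mu, sigma, xi) for k in range(1, n + 1)]
    mean_s = sum(ordered) / n
    mean_t = sum(theo) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(ordered, theo))
    var_s = sum((s - mean_s) ** 2 for s in ordered)
    var_t = sum((t - mean_t) ** 2 for t in theo)
    return cov / math.sqrt(var_s * var_t)

# Hypothetical parameter values, chosen only for illustration.
mu, sigma, xi = 100.0, 30.0, 0.1
rng = random.Random(1)
draws = draw_gev(1000, mu, sigma, xi, rng)

r_gev = qq_correlation(draws, mu, sigma, xi)
print("QQ correlation of GEV draws against the same GEV:", round(r_gev, 4))
```

The same `gev_quantile` function evaluated at p = 1 - 1/T gives the T-year return level in the stationary case, which is the quantity an independent check against measured flows would need to validate.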
Section 5.1: you provide only one value of DIC per model: did you try this across the whole river network?
The study would be even more convincing if it could show that the approach does indeed work as planned on a simulated dataset. Since you do not use any observed data, we cannot really know if the new maps which have been derived are more reliable than what was there before.
Some other minor points:
- Line 26: a positive value *suggests* -> it's a mathematical fact, so use "indicates", "implies", "results in", ...
- Line 29-30 *assuming climate stationarity*: the definition does not require stationarity, the stationarity is needed to be able to define the return period as the quantile. See on this Volpi et al (doi.org/10.1002/2015WR017820), Volpi (https://doi.org/10.1002/wat2.1340), Salas et al (https://doi.org/10.1061/(ASCE)HE.1943-5584.0000820)
- Line 114: at https://github.com/jojal5/Publications I don't see the code for this paper yet
- The title of Section 3.1 is not very informative
- In some equations (e.g. 6 and 9) the summation goes up to S: should this be a 6 (or should all the other summations go up to 6 and you should define what S is)?
- After eq 5: the function is evaluated at y_i, not y (even if I would suggest to change eq 5 so that the function is evaluated at a y, rather than y_i value for clarity)
- I don't see why the mixture model in equation 9 would be a sensible idea for this
Citation: https://doi.org/10.5194/egusphere-2024-2114-RC1
Viewed
- HTML: 110
- PDF: 34
- XML: 61
- Total: 205
- BibTeX: 2
- EndNote: 1