An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections

Alexandre, Duy Anh; Jalbert, Jonathan

doi:10.5194/egusphere-2024-2114

Preprints

https://doi.org/10.5194/egusphere-2024-2114

05 Aug 2024

Abstract. Estimating extreme streamflows is critical for delimiting flood zones and designing fluvial infrastructure, but for the vast majority of river sections, no measurements are available. Estimated streamflows at ungauged river sections using spatial interpolation and hydrological modeling are uncertain, and in the context of extreme value analysis, this uncertainty can be crucial when estimating return levels. In the present paper, an errors-in-variables extreme value model is proposed to account for the estimated streamflow uncertainty at ungauged river sections. The true unobserved streamflows correspond to the missing variables in a Bayesian hierarchical model. In this model, the uncertainty of the unobserved streamflows propagates to the uncertainty in the estimated return levels. The model was implemented to estimate the streamflow return levels of 211 ungauged sections of the Chaudière River watershed in Southern Quebec, Canada.

Received: 08 Jul 2024 – Discussion started: 05 Aug 2024

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2024-2114 (An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections)', Anonymous Referee #1, 10 Sep 2024

The manuscript "An errors-in-variables extreme-value model for estimating interpolated extreme streamflows at ungauged river sections" by Duy Anh Alexandre and Jonathan Jalbert presents an interesting model with a slightly less traditional application in the realm of extreme value analysis. The authors focus on deriving information on extremes when these are not observed, using information from hydrological model outputs within an errors-in-variables (EIV) approach.

The modeling approach is interesting and novel (as far as I know), but I do have concerns about the writing of both the text and the mathematical details, which I think make the manuscript not as clear as it should be and make me doubtful of the validity of the results.

Overall I think the manuscript presents an interesting approach, but I would recommend the authors spend some time re-reading the manuscript and making sure the mathematical details follow logically and are written clearly and with precision.
Some more specific comments:
If I understand correctly there is no observed data used in any part of the model, but you use the simulated/estimated data (you use both terms, I think you mean the same thing) to derive the hypothetical/imputed annual maxima across the whole river network. Were the hydrological models not calibrated on something? Is there not any observed flow data that you can use to validate your modeling approach?

More importantly, you are modeling annual maxima: do you extract, for each of the 6 model configurations, a value for the year based on the maximum value of \eta_{ij}? Do you have any way to ensure that you are modeling the flow value of the same day (and does it matter)?
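The same-day question above can be made concrete with a small sketch. It assumes daily flows stored as a (year, day, configuration) array; the array sizes, the synthetic data, and the variable names are all hypothetical and are not taken from the manuscript. It contrasts per-configuration annual maxima, which may fall on different days, with values read on a common peak day.

```python
import numpy as np

rng = np.random.default_rng(0)

n_years, n_days, n_configs = 5, 365, 6  # hypothetical sizes

# Synthetic daily flows, shape (year, day, configuration); NOT data from the paper.
# A shared daily signal multiplied by configuration-specific noise.
signal = rng.gamma(shape=2.0, scale=50.0, size=(n_years, n_days, 1))
noise = rng.lognormal(mean=0.0, sigma=0.2, size=(n_years, n_days, n_configs))
flows = signal * noise

# Annual maximum of each configuration, and the day on which it occurs.
annual_max = flows.max(axis=1)      # shape (year, configuration)
day_of_max = flows.argmax(axis=1)   # shape (year, configuration)

# Years in which all configurations peak on the same day.
same_day = (day_of_max == day_of_max[:, [0]]).all(axis=1)

# Alternative: read every configuration on the day the ensemble mean peaks,
# so that all six values refer to the same event.
peak_day = flows.mean(axis=2).argmax(axis=1)           # shape (year,)
same_event = flows[np.arange(n_years), peak_day, :]    # shape (year, configuration)
```

By construction `same_event` never exceeds `annual_max`, so the choice between the two extraction rules changes the values that enter the extreme-value model.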

In Section 3.2 you use Y_{i} to indicate the maxima in year i, but then in 3.3 you use Y_{j}, with j indexing the year, but you also have the eta and zeta parameters indexed by {i,j}, where now i indicates the hydrological model? This is quite confusing.

In equation 6: should the logNormal not be for X_{ij}? There is some lack of clarity in the notation here (should the subscript of f_ not be capital Y? We normally give distributions for random variables, not for realizations; indeed you do so in eq 7).

Line 178 "contributes to the distribution of Y_j" -> Y_i? The idea is that in any year each jth model contributes differently

Is Section 3.3 not assuming in some way that (conditional) uncertainty of the maximum distribution is independent of the size of the maximum? It is often the case that hydrological models are more uncertain for extremes (where the data available for calibration is more scarce): is this something that could undermine your approach?

What is the $\sigma$ which appears in equation 8 (and what value did you choose for this hyper-parameter)? From the results it looks like you greatly reduce the uncertainty of the estimated maximum: could this be linked to fairly informative priors?

Personally I find the derivation in Section 3.5 slightly easier to follow, probably because it is closer to the traditional EIV derivation. It's OK to leave this at a later stage of the manuscript, but it is not at all clear how the two derivations are equivalent, considering that the final equation has a different form (the subscripts of f are different, so how is this equivalent?). Also, as it stands, the equation at line 219 is hard to parse since \eta_i represents both the random variable and its realization (similarly, the equation at line 215 would normally be written using the $\sim$ formulation, making clear what is a random variable and what is a realization).

Section 4.3: is it surprising that MCMC samples from what you have assumed to be GEV-distributed are GEV-distributed? The real test here would be to have the measured flow values and see if the qq-plot of those values behaves like a GEV.
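The check described in this comment could be sketched as follows, assuming measured annual maxima were available. The `observed` array here is a synthetic stand-in (simulated from a GEV with hypothetical parameters), so this only illustrates the mechanics of a GEV quantile-quantile comparison with SciPy, not a result for the Chaudière River.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(1)

# Hypothetical "measured" annual maxima; simulated here purely for illustration.
# SciPy's shape parameter c equals -xi in the usual extreme-value convention.
observed = genextreme.rvs(c=-0.1, loc=100.0, scale=30.0, size=60, random_state=rng)

# Fit a GEV by maximum likelihood.
c_hat, loc_hat, scale_hat = genextreme.fit(observed)

# Empirical quantiles against fitted quantiles at plotting positions i / (n + 1).
n = observed.size
probs = np.arange(1, n + 1) / (n + 1)
empirical = np.sort(observed)
theoretical = genextreme.ppf(probs, c_hat, loc=loc_hat, scale=scale_hat)

# Points close to the 1:1 line indicate agreement with the fitted GEV;
# the correlation is only a crude numerical summary of the plot.
qq_corr = np.corrcoef(empirical, theoretical)[0, 1]
```

Plotting `theoretical` against `empirical` gives the qq-plot; systematic departures in the upper tail would be the feature of interest for return levels.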

Section 5.1: you provide only one value of DIC per model: did you try this across the whole river network?

The study would be even more convincing if it could show that the approach does indeed work as planned on a simulated dataset. Since you do not use any observed data, we cannot really know whether the new maps which have been derived are more reliable than what was there before.
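A minimal sketch of such a simulated dataset is given below. Everything in it is an assumption made for illustration: the GEV parameter values, the multiplicative log-normal error on the six configurations, and the variable names. It sets up a known true return level and a naive benchmark that ignores the estimation error; any candidate model, including an errors-in-variables one, could then be scored against `true_rl100`.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(2)

# Assumed "true" GEV parameters (hypothetical values); xi = 0.1 means c = -0.1 in SciPy.
mu, sigma, xi = 100.0, 30.0, 0.1
n_years, n_configs = 50, 6

# True annual maxima, which the model under test never sees.
y_true = genextreme.rvs(c=-xi, loc=mu, scale=sigma, size=n_years, random_state=rng)

# Noisy estimates from each configuration: multiplicative log-normal error (an assumption).
error_sd = 0.15
x_est = y_true[:, None] * rng.lognormal(mean=0.0, sigma=error_sd, size=(n_years, n_configs))

# Known 100-year return level: the quantile with exceedance probability 1/100.
true_rl100 = genextreme.isf(1 / 100, -xi, loc=mu, scale=sigma)

# Naive benchmark: fit a GEV to one configuration and ignore the estimation error.
c_hat, loc_hat, scale_hat = genextreme.fit(x_est[:, 0])
naive_rl100 = genextreme.isf(1 / 100, c_hat, loc=loc_hat, scale=scale_hat)
```

Repeating this over many random seeds would give the bias and coverage of each approach relative to the known truth.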
Some other minor points:

- Line 26: a positive value *suggests* -> it's a mathematical fact, so use *indicates*, *implies*, *results in*, ...

- Line 29-30 *assuming climate stationarity*: the definition does not require stationarity, the stationarity is needed to be able to define the return period as the quantile. See on this Volpi et al (doi.org/10.1002/2015WR017820), Volpi (https://doi.org/10.1002/wat2.1340), Salas et al (https://doi.org/10.1061/(ASCE)HE.1943-5584.0000820)

- Line 114: at https://github.com/jojal5/Publications I don't see the code for this paper yet

- The title of Section 3.1 is not very informative

- In some equations (e.g. 6 and 9) the summation goes up to S: should this be a 6 (or should all the other summations go up to S, with S defined)?

- After eq 5: the function is evaluated at y_i, not y (even if I would suggest to change eq 5 so that the function is evaluated at a y, rather than y_i value for clarity)

- I don't see why the mixture model in equation 9 would be a sensible idea for this
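For the minor point on Lines 29-30 above, the standard stationary relation between return period and quantile can be written out as below. The symbols \mu, \sigma, \xi are the usual GEV location, scale and shape parameters, and F is the distribution function of the annual maximum; this notation is assumed here and may differ from the manuscript's.

```latex
% Stationary case: annual maxima Y_1, Y_2, ... are i.i.d. with distribution function F.
% The T-year return level z_T is the level exceeded with probability 1/T in any year:
\[
  \Pr(Y > z_T) = 1 - F(z_T) = \frac{1}{T}.
\]
% For a GEV(\mu, \sigma, \xi) distribution with \xi \neq 0 this gives
\[
  z_T = \mu + \frac{\sigma}{\xi}
        \left[\left\{-\log\left(1 - \frac{1}{T}\right)\right\}^{-\xi} - 1\right].
\]
% The waiting time N until the first exceedance of z_T is then geometric
% with success probability 1/T, so its expectation is the return period:
\[
  \mathrm{E}[N] = \sum_{n=1}^{\infty} n \left(1 - \frac{1}{T}\right)^{n-1} \frac{1}{T} = T.
\]
```

Without stationarity the yearly exceedance probability changes over time, the waiting time is no longer geometric, and the quantile and expected-waiting-time readings of the return period no longer coincide.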

Citation: https://doi.org/10.5194/egusphere-2024-2114-RC1
- AC1: 'Reply on RC1', Jonathan Jalbert, 05 Nov 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2114/egusphere-2024-2114-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2114-AC1
RC2:
'Comment on EGUSPHERE-2024-2114', Anonymous Referee #2, 19 Sep 2024

See attached pdf with review and annotated manuscript.

Citation: https://doi.org/10.5194/egusphere-2024-2114-RC2
- AC2: 'Reply on RC2', Jonathan Jalbert, 05 Nov 2024
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2024/egusphere-2024-2114/egusphere-2024-2114-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2024-2114-AC2

Viewed

Total article views: 5,670 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
2,735	819	2,116	5,670	210	395

Views and downloads (calculated since 05 Aug 2024)

Month	HTML	PDF	XML	Total
Aug 2024	461	165	290	916
Sep 2024	327	15	194	536
Oct 2024	10	15	285	310
Nov 2024	135	55	250	440
Dec 2024	61	25	255	341
Jan 2025	15	15	194	224
Feb 2025	54	5	15	74
Mar 2025	11	35	215	261
Apr 2025	30	29	228	287
May 2025	40	5	103	148
Jun 2025	35	65	5	105
Jul 2025	65	25	0	90
Aug 2025	215	4	0	219
Sep 2025	673	47	4	724
Oct 2025	85	35	5	125
Nov 2025	75	125	15	215
Dec 2025	115	30	23	168
Jan 2026	145	40	5	190
Feb 2026	76	16	14	106
Mar 2026	69	50	14	133
Apr 2026	21	9	2	32
May 2026	15	8	0	23
Jun 2026	1	0	1
Jul 2026	1	1	0	2

Viewed (geographical distribution)

Total article views: 5,645 (including HTML, PDF, and XML), thereof 5,645 with geography defined and 0 with unknown origin.

Latest update: 17 Jul 2026

Short summary

Estimating extreme streamflows is essential for identifying flood-prone areas and designing safe river infrastructure. However, most river sections lack direct measurements of streamflows. Hydrologists use spatial interpolation and hydrological modeling to estimate streamflows in these unmeasured areas, but these estimates come with uncertainties. Our study introduces a new model to better account for these uncertainties, improving the accuracy of predicting extreme streamflows.

