18 Oct 2023
Status: this preprint is open for discussion.

Regional index flood estimation at multiple durations with generalized additive models

Danielle M. Barna, Kolbjørn Engeland, Thomas Kneib, Thordis L. Thorarinsdottir, and Chong-Yu Xu

Abstract. Estimation of flood quantiles at ungauged basins is often achieved through regression-based methods. In situations where flood retention is important, e.g. floodplain management and reservoir design, flood quantile estimates are often needed at multiple durations. This poses a problem for regression-based models, as the form of the functional relationship between catchment descriptors and the response may not be constant across durations. A type of regression model that is well suited to this situation is the generalized additive model (GAM), which allows for flexible, semi-parametric modeling and visualization of the relationship between predictors and the response. In practice, however, selecting predictors for such a flexible model can be challenging, particularly given the characteristics of available catchment descriptor datasets. We employ a machine learning-based variable pre-selection tool which, when combined with domain knowledge, enhances the practicality of constructing GAMs. In this study, we develop a GAM for index (median) flood estimation with the primary objective of investigating duration-specific differences in how catchment descriptors influence the median flood. As the accuracy of this explainable approach depends on the fitted GAM being adequate, the secondary objective of our study is prediction of the median flood at ungauged locations and multiple durations, where predictive performance and reliability at ungauged locations are used as proxies for adequacy of the GAM. Predictive performance of the GAM is compared to two benchmark models: the existing log-linear model for median flood estimation in Norway and a fully data-driven machine learning model (an extreme gradient boosting tree ensemble, XGBoost). We find that the predictive accuracy and reliability of the GAM matched or exceeded that of the benchmark models at both durations studied.
Within the predictor set selected for this study, we observe duration-specific differences in the relationship between the median flood and two catchment descriptors: effective lake percentage and catchment shape. Ignoring these differences results in a statistically significant decline in predictive performance. This suggests that models developed and estimated for prediction of the index flood at one duration may have reduced performance when applied directly to other durations.
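
To make the idea of an additive model concrete, the following is a minimal sketch only, not the authors' model or code. It assumes synthetic data, two hypothetical descriptors named after those mentioned in the abstract (`lake` and `shape`), a simple piecewise-linear spline basis with a ridge penalty in place of the smoothers a dedicated GAM package would use, and an in-sample comparison against a log-linear fit rather than prediction at ungauged sites.

```python
import numpy as np

def hinge_basis(x, knots):
    """Linear term plus one hinge max(x - k, 0) per knot (piecewise-linear spline)."""
    return np.column_stack([x] + [np.maximum(x - k, 0.0) for k in knots])

def design_matrix(X, knots):
    """Intercept followed by one spline block per predictor."""
    blocks = [hinge_basis(X[:, j], knots[j]) for j in range(X.shape[1])]
    return np.column_stack([np.ones(len(X))] + blocks)

def fit_additive(X, y, n_knots=8, lam=1e-2):
    """Penalised least squares for y = b0 + sum_j f_j(x_j)."""
    p = X.shape[1]
    knots = [np.quantile(X[:, j], np.linspace(0.1, 0.9, n_knots)) for j in range(p)]
    B = design_matrix(X, knots)
    # Ridge penalty on hinge coefficients only; intercept and linear terms are free.
    penalty = np.ones(B.shape[1])
    penalty[0] = 0.0
    for j in range(p):
        penalty[1 + j * (n_knots + 1)] = 0.0
    coef = np.linalg.solve(B.T @ B + lam * np.diag(penalty), B.T @ y)
    return knots, coef

def rmse(y, pred):
    return float(np.sqrt(np.mean((y - pred) ** 2)))

# Synthetic data (assumption): log median flood depends nonlinearly on `lake`
# and linearly on `shape`. These values are illustrative, not from the study.
rng = np.random.default_rng(0)
n = 300
lake = rng.uniform(0.0, 1.0, n)
shape = rng.uniform(0.0, 1.0, n)
X = np.column_stack([lake, shape])
log_q = 1.0 + np.sin(3.0 * lake) + 0.5 * shape + rng.normal(0.0, 0.1, n)

# Additive fit on the log scale.
knots, coef = fit_additive(X, log_q)
rmse_gam = rmse(log_q, design_matrix(X, knots) @ coef)

# Log-linear benchmark: ordinary least squares on the raw descriptors.
A = np.column_stack([np.ones(n), X])
beta = np.linalg.lstsq(A, log_q, rcond=None)[0]
rmse_lin = rmse(log_q, A @ beta)

print(f"in-sample RMSE, additive: {rmse_gam:.3f}, log-linear: {rmse_lin:.3f}")
```

Fitting the same sketch separately to responses at two durations and comparing the estimated per-descriptor curves would be one way to look for duration-specific differences of the kind described above; a real analysis would use cross-validation over held-out catchments instead of in-sample error.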

Status: open (until 13 Dec 2023)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2023-2335', Anonymous Referee #1, 15 Nov 2023

Total article views: 302 (including HTML, PDF, and XML), calculated since 18 Oct 2023
  • HTML: 259
  • PDF: 37
  • XML: 6
  • BibTeX: 5
  • EndNote: 4

Latest update: 10 Dec 2023
Short summary
Estimating flood quantiles at data-scarce sites often involves single-duration regression models. However, applications such as floodplain management and reservoir design need estimates at several durations, which poses a challenge for such models. Our flexible generalized additive model (GAM) improves both predictive accuracy and interpretability, and shows that models built for a single duration may underperform at other durations, underscoring the need for adaptable approaches.