the Creative Commons Attribution 4.0 License.
Regional index flood estimation at multiple durations with generalized additive models
Kolbjørn Engeland
Thomas Kneib
Thordis L. Thorarinsdottir
Chong-Yu Xu
Abstract. Estimation of flood quantiles at ungauged basins is often achieved through regression-based methods. In situations where flood retention is important, e.g. floodplain management and reservoir design, flood quantile estimates are often needed at multiple durations. This poses a problem for regression-based models as the form of the functional relationship between catchment descriptors and the response may not be constant across different durations. A particular type of regression model that is well-suited to this situation is a generalized additive model (GAM), which allows for flexible, semi-parametric modeling and visualization of the relationship between predictors and the response. However, in practice, selecting predictors for such a flexible model can be challenging, particularly given the characteristics of available catchment descriptor datasets. We employ a machine learning-based variable pre-selection tool which, when combined with domain knowledge, enhances the practicality of constructing GAMs. In this study, we develop a GAM for index (median) flood estimation with the primary objective of investigating duration-specific differences in how catchment descriptors influence the median flood. As the accuracy of this explainable approach is dependent on the fitted GAM being adequate, the secondary objective of our study is prediction of the median flood at ungauged locations and multiple durations, where predictive performance and reliability at ungauged locations are used as proxies for adequacy of the GAM. Predictive performance of the GAM is compared to two benchmark models: the existing log-linear model for median flood estimation in Norway and a fully data-driven machine learning model (an extreme gradient boosting tree ensemble, XGBoost). We find that the predictive accuracy and reliability of the GAM matched or exceeded that of the benchmark models at both durations studied.
Within the predictor set selected for this study, we observe duration-specific differences in the relationship between the median flood and the two catchment descriptors effective lake percentage and catchment shape. Ignoring these differences results in a statistically significant decline in predictive performance. This suggests that models developed and estimated for prediction of the index flood at one duration may have reduced performance when applied directly to situations outside of that specific duration.
Danielle M. Barna et al.
Status: open (until 13 Dec 2023)
RC1: 'Comment on egusphere-2023-2335', Anonymous Referee #1, 15 Nov 2023
The manuscript "Regional index flood estimation at multiple durations with generalized additive models" by Barna et al. is a comprehensive study on prediction of the median index flood at 234 stations in Norway. It aims to compare a GAM with two benchmark models (a log-linear model and XGBoost) at two different flood durations (1h and 24h). Additionally, a variable selection method is included and the models are tested with a cross-validation approach. The manuscript is generally well written and the results are worth publishing in HESS. However, I have two concerns about the manuscript, which I think can be addressed in a revised version.
First, one of the objectives of the study is "prediction of the median flood at ungauged locations" (Lines 125-126). I am not quite sure this is addressed adequately. One of the catchment descriptors is the mean annual runoff of the catchment from the SeNorge 2.0 dataset, which is based on observational data. In my opinion, prediction at ungauged locations means that there is no information about the runoff at these stations, so no information about the mean annual runoff can be included in a prediction model either. I understand that the information may be necessary for the comparison with the RFA_2018 model, but I think it would also be beneficial to show that the GAM performs as well without the mean annual runoff (while leaving the mean annual runoff as a predictor for the RFA_2018 model for simplicity). My second point concerning prediction at ungauged locations is the variable selection. If I understood it correctly, the variable selection is performed on the full dataset (with a cross-validation scheme), and the selected variables are then used as input for the validation study (again with a cross-validation scheme). If this is correct, the variables are selected on the full dataset, not on a subset, so the prediction error is somewhat biased, as the variable selection has already used information from the full dataset. If my understanding of the validation scheme is wrong, I would suggest making this clearer in the methods section.
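The concern about selecting variables on the full dataset can be illustrated with a minimal sketch. It is not the manuscript's pipeline: the correlation-based selector, the single-predictor linear fit, the synthetic noise data, and all function names (`kfold_indices`, `select_best`, `fit_line`, `cv_mae`) are assumptions chosen for brevity. The sketch contrasts selection done once on all rows (which sees the held-out stations) with selection repeated inside each training fold.

```python
import random
import statistics


def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def abs_corr(x, y):
    """Absolute Pearson correlation; 0.0 if either input is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_best(X, y, rows):
    """Index of the column most correlated with y, using only `rows`."""
    ys = [y[i] for i in rows]
    scores = [abs_corr([X[i][j] for i in rows], ys) for j in range(len(X[0]))]
    return max(range(len(scores)), key=scores.__getitem__)


def fit_line(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = 0.0 if sxx == 0 else sxy / sxx
    return my - slope * mx, slope


def cv_mae(X, y, k=5, select_inside=True, seed=0):
    """Cross-validated MAE with selection inside each fold or on all rows."""
    folds = kfold_indices(len(y), k, seed)
    all_rows = list(range(len(y)))
    global_choice = select_best(X, y, all_rows)  # leaky: uses held-out rows
    errors = []
    for test in folds:
        held = set(test)
        train = [i for i in all_rows if i not in held]
        j = select_best(X, y, train) if select_inside else global_choice
        a, b = fit_line([X[i][j] for i in train], [y[i] for i in train])
        errors += [abs(y[i] - (a + b * X[i][j])) for i in test]
    return statistics.fmean(errors)


# Hypothetical data: 40 "stations", 50 pure-noise descriptors, noise response.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(40)]
y = [rng.gauss(0, 1) for _ in range(40)]
print("selection on full data :", cv_mae(X, y, select_inside=False))
print("selection inside folds :", cv_mae(X, y, select_inside=True))
```

Because the descriptors here carry no signal, any gap between the two printed errors is attributable to where the selection step sits relative to the held-out rows. Whether a comparable gap exists for the manuscript's data would have to be checked with its own selection procedure.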
Second, the manuscript is a bit too long. Some parts of the methods are in the Appendix, which is fine, but this makes it hard to read at some points. Two examples: (i) the description in the Introduction between lines 29-46 may be shortened, and (ii) the very detailed description of the response variable (Sect. 2/2.1) could be written more concisely.
Additional minor comments:
The legend title of Figure 1c seems wrong.
Add panels in Figure 2.
In Table 1, the catchment descriptor $R_L$ is duplicated.
Line 351, what are the actual hyperparameters that were used in the 10 final models? If it is easily possible, this information could be added.
Line 384: XGBoost is assessed only on the MAE; optimal predictors for the other four error metrics are not accessible for XGBoost when the data are assumed log-normal. Why is XGBoost then used for the variable selection procedure, with the MAPE as error metric, if this will produce unreliable results? Additionally, was it considered to alter the loss function in XGBoost for better comparison in terms of the chosen error metrics? I think this should be clarified in the main manuscript.
I think the formulas in Figure 5(a) may be wrong ($\times 100$), and as they are both partly defined in Table 3, I would suggest using the same variables.
Figure 6, the legend is not consistent with the actual plot.
Line 452 - 469, I think the description of partial response curves can be shortened.
Line 487, Maybe the results on regional models can be skipped in the Appendix.
Line 526, If this sentence is referring to the paragraph above, maybe change the order of the two sentences.
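As background to the comment on Line 384, the following minimal sketch shows why the optimal point predictor depends on the error metric when the data are assumed log-normal. It uses standard results for a log-normal distribution with log-scale parameters `mu` and `sigma`; the function names and the metric labels are assumptions for illustration, and the three metrics shown are not necessarily those used in the manuscript.

```python
import math


def lognormal_point_predictors(mu, sigma):
    """Optimal point predictors for Y ~ LogNormal(mu, sigma^2)."""
    return {
        "MAE": math.exp(mu),                   # median minimizes absolute error
        "MSE": math.exp(mu + sigma ** 2 / 2),  # mean minimizes squared error
        "MAPE": math.exp(mu - sigma ** 2),     # median of the 1/y-weighted density
    }


def empirical_mape_minimizer(sample):
    """Weighted median with weights 1/y: minimizes mean(|y - p| / y) over p.

    Assumes every value in `sample` is strictly positive.
    """
    pairs = sorted((y, 1.0 / y) for y in sample)
    half = sum(w for _, w in pairs) / 2
    running = 0.0
    for y, w in pairs:
        running += w
        if running >= half:
            return y
```

For any `sigma > 0` the three predictors differ, with the MAPE-optimal value below the median and the MSE-optimal value above it. A model trained to minimize one loss therefore yields predictions that are not directly optimal under the others, which is consistent with the question of whether a different XGBoost loss function was considered.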
Citation: https://doi.org/10.5194/egusphere-2023-2335-RC1