the Creative Commons Attribution 4.0 License.
A systematic evaluation of 15 actual evapotranspiration formulations within conceptual hydrological models
Abstract. Actual evapotranspiration (AET) is a major component of the water balance, yet it is rarely assessed for accuracy in conceptual rainfall-runoff models that are often calibrated to match streamflow only. Inaccurate representation of underlying AET processes may cause models to incorrectly simulate long-term changes in partitioning between AET and streamflow, even if this partitioning was relatively accurate during calibration. To investigate AET representation within conceptual hydrological models, we systematically tested 15 evapotranspiration (ET) equations that convert potential evapotranspiration (PET) and soil moisture to AET. The 15 equations represent common practice, having been sourced from a published comprehensive review of conceptual hydrological models. Each of these 15 formulations was trialled within three conceptual hydrological models (GR4J, Simhyd and Vic). Following multi-objective calibration, we evaluated performance across both streamflow and flux tower AET measurements at seven catchments from a range of Australian climates. A small number of AET equations outperformed the rest, with one equation standing out, which uses a non-linear relationship with soil moisture storage and can scale down AET such that it cannot equal PET. This equation achieved a higher objective function value for both AET and streamflow and accurately captured evapotranspiration signatures. However, even this equation showed limitations in reproducing observed AET, suggesting persistent issues across commonly used formulations. These shortcomings may reflect missing vegetation-related dynamics and other simplifications. Our findings highlight the importance of ET equation selection in modelling AET and streamflow, and we recommend the identified equation as a promising option for future Australian studies. Further work is needed to test equations for consistency with known processes to improve the physical realism of conceptual hydrological models.
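To make the distinction in the abstract concrete, the following is a minimal sketch contrasting a linear soil-moisture scaling of PET with a non-linear, scaled-down form in which AET cannot reach PET. The functional form and the names `scale` and `exponent` are assumptions chosen for illustration; this is not the equation identified in the manuscript, and all numbers are made up.

```python
def clamp01(x):
    """Limit a relative storage value to the range [0, 1]."""
    return min(max(x, 0.0), 1.0)


def aet_linear(pet, storage, storage_max):
    """AET scales linearly with relative soil moisture and equals PET when the store is full."""
    return pet * clamp01(storage / storage_max)


def aet_nonlinear_scaled(pet, storage, storage_max, scale=0.8, exponent=2.0):
    """Hypothetical non-linear form: with scale < 1, AET stays below PET even when the store is full."""
    return scale * pet * clamp01(storage / storage_max) ** exponent


pet = 5.0  # mm/day, illustrative value only
for storage in (0.0, 50.0, 100.0):
    print(storage,
          aet_linear(pet, storage, 100.0),
          aet_nonlinear_scaled(pet, storage, 100.0))
```

With a full store, the linear form returns PET, whereas the scaled form returns only `scale * pet`, which is the behaviour the abstract describes as scaling AET down so that it cannot equal PET.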
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2025-3122', Anonymous Referee #1, 22 Sep 2025
- RC2: 'Comment on egusphere-2025-3122', Anonymous Referee #2, 24 Nov 2025
Review of “A systematic evaluation of 15 actual evapotranspiration formulations within conceptual hydrological models”
A very nice and well-structured manuscript. The study is indeed very systematic, and the results are thoroughly analyzed and of general interest. I have a list of comments and suggestions below.
Major comments:
- AET and catchment correction factors per catchment should be included in Table 1.
- More details on the objective function for discharge and on the weighting are needed
- More discussion, and potentially analysis, of spatial parameter transferability and regionalization is needed
- Tables are not well organized
Specific comments:
- Line 9: correct spelling ”partitioniapendng”
- Lines 37-49: This section talks about dynamic vegetation modelling, but this is not covered in the manuscript, which is based on simple conceptual hydrological models. It seems disproportionate and somewhat misleading to dedicate such attention to a topic that is not part of the paper.
- Lines 118-127: These selection criteria seem a bit subjective. Here I had written a long section about the need to adjust the site flux to catchment water balances. However, you actually do that in section 2.4, which is great. I strongly suggest that you mention that catchment water balance adjustment already here in lines 118-127. In that way you can mention both careful site selection and the need for adjustment.
- Table 1: I strongly suggest that Table 1 include catchment mean daily Q and site mean daily AET from the flux tower, together with the adjustment factors, to illustrate the adjustments made. If the table gets large, I think you can skip the catchment code, PET, the state and High P Low P columns. Here I believe the AET and correction are more interesting than the PET. Also, the number of decimals is excessive, and the flux tower site description could be simplified.
- Line 179: Why not substitute the AET formulation across all storages? Why only for the storage that a priori is assumed to contribute the most?
- Line 219: I think we need some more details on this bias_penalising_log OF. I have not heard of it before, please write the equation. Also, it cannot be found in Trotter et al. (2022) (Eqn. 4), as stated.
- Table 2 is hard to read, please try to reorganize.
- Table 3 is not important, consider moving to supplement.
- Table 4: Why not use a format similar to S2 in the supplement, which is much easier to read?
- Lines 225-227: The two OF’s are weighted equally, but how does that ensure that they contribute equally to the combined OF? What if the two OF’s have different magnitudes? For example, the KGE OF will be below 1, but what is the typical range of bias_penalising_log? From later plots, e.g. Fig 4 and 7, it seems as if the OF’s are scaled to similar sizes, but please elaborate a bit on that.
- Lines 243-244: Great with such a split sample test, which I believe is important for this kind of exercise. However, why only do it for one AET equation? Would there be value in adding the split sample test for all equations in the supplement? I would be curious to see if there is a systematic difference between models and equations regarding overfitting and tradeoffs between Q and AET.
- Line 251: “In this figure, Simhyd was run with each of the 15 evapotranspiration (AET)”. Please make it clear if this is the result after recalibration to each AET equation.
- Line 252: I assume it is the adjusted, not the observed, flux tower AET.
- Line 255: Also here, that depends on whether this is prior or post recalibration. I think that has to be very clear.
- Line 303: I think it is wise to place the other signature results in the supplement. However, I notice one thing for the Interannual variability signature Figure S4.2. It seems as if the pattern here is almost identical across all models indicating that this signature is largely a result of the precipitation data and not controlled by the models or equations.
- Line 335: I really appreciate the split-sample test and the split between Q and AET OF’s. However, I also feel that with this calibration to a single optimal parameter set, it can be difficult to interpret the possible overfitting. It seems as if Eq. 19 fits very well to AET, but performance drops for Q. Could that indicate overfitting to AET? Perhaps in the discussion or under a description of limitations also mention that all results are based on one unique optimal parameter set and that there will be a range of alternative parameter sets that would perform almost equally well, but with different transferability or tradeoffs. In principle the whole analysis could have been presented as results of an ensemble of parameter sets. I am not suggesting that you change that, but it is perhaps worth noting something on this in the discussion.
- Line 337: For the analysis under “3.4 Seasonal timing of AET” you recalibrate to AET alone to test if the model has the capacity to match better. But would a recalibration including this seasonal timing OF (e.g. in combination with the original AET OF) not be a better test of that?
- Figure 8: I would suggest having the same y-axis scale on all three plots.
- 4.3 Implications: Overall, a very nice analysis and great that you take the time in the end to analyze eq. 19 in more depth. One remaining question that I feel is not addressed is the possibility to regionalize and thereby parametrize the two extra parameters in eq 19 beyond the current catchments. There is a large spread in those parameters (Table S6), so there is not a unique parametrization across catchments. I think it would be great to address this, and in the discussion point in some direction regarding this, perhaps under future work. E.g. would the optimal values of p1 and p2 be spatially related to any mappable information? Also, I assume the PET used is a purely meteorologically driven PET. Would a spatial scaling, using something like a high-resolution crop coefficient approach, level out some of the variability between catchments, e.g. the need to reduce AET below PET also under high moisture availability? And what is the impact of the PET method used? Could some models or equations struggle with scaling a poor PET input?
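The question raised for lines 225-227 about equal weighting can be illustrated with a small sketch. It uses the standard KGE definition (correlation, variability ratio and bias ratio) and a hypothetical `combined_of` helper that averages two objective function values. The range of the second objective function is not known from the manuscript, so the scores below are made-up numbers chosen only to show how the term with the wider range can dominate an equally weighted sum.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency; assumes obs has non-zero mean and variance."""
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)      # linear correlation
    alpha = sd_s / sd_o          # variability ratio
    beta = mean_s / mean_o       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def combined_of(of_q, of_aet, w_q=0.5, w_aet=0.5):
    """Hypothetical weighted sum of a streamflow OF and an AET OF."""
    return w_q * of_q + w_aet * of_aet


# Made-up scores for two candidate parameter sets: (streamflow OF, AET OF).
# The first OF differs by 0.1 between candidates, the second by 1.5.
candidates = {"A": (0.90, -1.0), "B": (0.80, 0.5)}
scores = {name: combined_of(*ofs) for name, ofs in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this made-up case candidate B is selected even though A has the better first score, because the second term spans a much wider range. Rescaling both objective functions to a comparable range before weighting would be one way to address this, which is presumably what the reviewer is asking the authors to clarify.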
Citation: https://doi.org/10.5194/egusphere-2025-3122-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,677 | 110 | 21 | 1,808 | 64 | 40 | 28 |
Burns et al. present a study evaluating 15 different actual evapotranspiration methods within three conceptual hydrological models across seven diverse Australian basins. They establish a multi-objective calibration framework, in which observed streamflow and observed AET data from flux tower sites are used. The authors have established a comprehensive framework, and their current manuscript might be suitable for publication in HESS after addressing my comments listed below. The readability of some figures could be enhanced by increasing the font size of the labels. The tables are not well designed, and it is often hard to identify which row a given piece of information belongs to. Please revise. Overall, I value that the authors have considered more than one hydrological model structure and used a joint multi-objective calibration of the model parameters. Below you will find my three main remarks and, further below, more minor suggestions for revisions:
Minor: