the Creative Commons Attribution 4.0 License.
The impact of calibration strategies on future evapotranspiration projections: a SWAT-T comparison of three hydrological modeling approaches in West Africa
Abstract. Actual evapotranspiration (AET) is pivotal for the assessment of current and future water availability, particularly for sub-humid and AET-dominant regions such as West Africa. In this region, climate change is projected to be substantial, which will catalyze hydrological changes. In the climate-hydrological modeling chain for impact assessment, multiple sources of uncertainty are embedded. While the uncertainties inherent in general circulation models (GCMs) are difficult to reduce, minimizing uncertainties from hydrological modeling remains a critical focus for researchers and practitioners. Hence, the present study investigates the impact calibration strategies can have on future hydrological changes in West Africa. Given the key role of AET in West Africa, the study particularly evaluates how calibration shapes its future dynamics. In addition, we test whether plant growth modeling, represented by the leaf area index (LAI), can be used as a proxy to predict AET. The Bétérou Catchment in Benin is selected as a demonstration case, with hydrological modeling performed using the eco-hydrological SWAT-T model. To investigate calibration impacts, we apply three strategies, which range from simple (discharge (Q) only) to more comprehensive (Q and LAI; Q, LAI, and AET) approaches. We use the Robust Parameter Estimation algorithm in each calibration strategy to address parameter equifinality. We use the standardized future climate data from ISIMIP3b (CMIP6) with five GCMs and three emission scenarios and evaluate changes for the near (2031–2050) and far (2070–2099) future periods. The findings show that the amount of future annual AET depends on the calibration strategy, where the change signal for all strategies indicates AET increases. The approach including AET calibration (Q, LAI, AET) shows high future changes, e.g., with multi-model mean changes for SSP5–8.5 of ΔEnear = 5.8 % and ΔEfar = 8.4 %.
Moreover, the results demonstrate that the combined "Q + LAI" can be used as a proxy to predict AET rates. For discharge, the change signal mostly indicates future decreases across all calibration strategies, with multi-model mean changes for SSP5–8.5 ranging from ΔQfar = −7.0 % (Q, LAI, AET) to ΔQfar = −1.6 % (Q only). Yet, contrasting predictions of future changes are simulated depending on the individual GCM. The present study underscores the relevance of uncertainty integration in climate-hydrological modeling and contributes to an improved understanding of water availability assessment in West Africa.
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2025-3836', Anonymous Referee #1, 11 Oct 2025
-
AC2: 'Reply on RC1', Fabian Merk, 23 Dec 2025
We thank the reviewer for their time and effort in evaluating our manuscript, for acknowledging the quality of our work, and for providing constructive comments and suggestions. Please find our detailed responses to the comments in the attached document.
-
RC2: 'Comment on egusphere-2025-3836', Anonymous Referee #2, 19 Nov 2025
The manuscript, “The impact of calibration strategies on future evapotranspiration projections: a SWAT-T comparison of three hydrological modeling approaches in West Africa”, by Merk et al. is a promising and interesting paper that tackles an important topic. The approach of integrating LAI and AET into the calibration, and evaluating model performance and future projections across West Africa, highlights a valuable direction in hydrological modeling. However, there are a few areas where I think additional clarity, or possibly further modelling, would strengthen the manuscript. See my comments below.
Specific comments:
Reliance on MODIS-based products (GLASS-LAI and FLUXCOM AET)
• Although you note the use of GLASS-LAI and FLUXCOM, both high-quality datasets, there is no discussion of the uncertainties associated with remotely sensed products, which I think should be acknowledged.
You comment throughout that you have determined LAI is a good proxy for AET; however, I find this a challenging takeaway that is not entirely proven. For example:
• Currently, there is no calibration that uses Q + AET only (QA). It would be useful to either include this configuration or explain why it wasn’t done and whether it would meaningfully change the results. I imagine it would change the results, as at the moment, the QL and QLA calibrations behave similarly. Could this simply be because LAI is driving the optimisation? Showing the QA calibration (or explaining its omission) would help clarify this. If QA performs similarly to QL, that would support the idea that LAI could be a proxy for AET.
• Further, the KGE values improve for both AET and streamflow when AET is included (QLA compared to just QL). This raises the question of whether LAI alone adds enough information.
• Additionally, as noted in your study, none of the calibration strategies reach the benchmark KGE value suggested by Knoben et al. (2020). It may be worth discussing whether LAI limits performance, and whether AET alone could achieve closer to benchmark values.
Suitability of KGE for LAI
• KGE is an appropriate metric for streamflow and for AET given the daily FLUXCOM data. However, LAI changes slowly and has limited intra-seasonal variability, so I question whether KGE is the most informative metric for LAI. I suggest either using a LAI-specific metric (e.g., RMSE, bias, seasonal amplitude/timing) or including a justification for using KGE in the manuscript.
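As an illustration of the metrics named in this comment, the following is a minimal sketch that places KGE next to simpler LAI-oriented diagnostics. It assumes the original KGE formulation (correlation, variability ratio, bias ratio); the function names and the monthly LAI values are hypothetical and are not taken from the manuscript.

```python
import math
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    if len(sim) != len(obs) or len(sim) < 2:
        raise ValueError("sim and obs must have the same length (>= 2)")
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)  # fails for constant series (sd = 0)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mu_s / mu_o        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def rmse(sim, obs):
    """Root-mean-square error."""
    return math.sqrt(mean((s - o) ** 2 for s, o in zip(sim, obs)))


def bias(sim, obs):
    """Mean simulated value minus mean observed value."""
    return mean(sim) - mean(obs)


def amplitude_error(sim, obs):
    """Difference in seasonal amplitude (max - min) between sim and obs."""
    return (max(sim) - min(sim)) - (max(obs) - min(obs))


def peak_timing_error(sim, obs):
    """Index of the simulated maximum minus index of the observed maximum."""
    return sim.index(max(sim)) - obs.index(max(obs))


# Hypothetical mean monthly LAI (Jan..Dec); the simulation peaks one month late.
obs_lai = [0.5, 0.6, 0.9, 1.5, 2.2, 2.8, 3.1, 3.2, 2.9, 2.0, 1.1, 0.6]
sim_lai = [0.6, 0.5, 0.6, 0.9, 1.5, 2.2, 2.8, 3.1, 3.2, 2.9, 2.0, 1.1]

print("KGE:", round(kge(sim_lai, obs_lai), 3))
print("RMSE:", round(rmse(sim_lai, obs_lai), 3))
print("Bias:", round(bias(sim_lai, obs_lai), 3))
print("Peak timing error (months):", peak_timing_error(sim_lai, obs_lai))
```

In a case like this one, a one-month phase shift is reported directly by the timing diagnostic, whereas in KGE it is folded into the correlation term alongside any other mismatch in shape.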
Figures 9 and 10
• These are difficult to interpret without a baseline figure showing monthly AET in this format. Consider instead presenting percentage change in monthly AET, which may communicate the intended comparison more clearly, or showing a baseline figure (even in the appendix).
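The percentage-change presentation suggested here could be computed along the following lines. This is a sketch only: the helper names are hypothetical, and the baseline and future values are invented placeholders rather than results from the study.

```python
from statistics import mean


def monthly_means(values, months):
    """Average a series by calendar month (1..12).

    `values` and `months` are parallel sequences; every calendar month
    must occur at least once, otherwise `mean` raises an error.
    """
    buckets = {m: [] for m in range(1, 13)}
    for v, m in zip(values, months):
        buckets[m].append(v)
    return [mean(buckets[m]) for m in range(1, 13)]


def percent_change(baseline, future):
    """Relative change in percent, element by element (baseline must be non-zero)."""
    return [100.0 * (f - b) / b for b, f in zip(baseline, future)]


# Placeholder monthly AET climatologies in mm/month (Jan..Dec), not real data.
aet_baseline = [20, 22, 40, 70, 95, 105, 100, 98, 102, 85, 45, 25]
aet_future = [21, 23, 43, 74, 99, 110, 103, 101, 108, 90, 47, 26]

for month, change in enumerate(percent_change(aet_baseline, aet_future), start=1):
    print(f"month {month:2d}: {change:+.1f} %")
```

Plotting these relative changes per month, with the baseline climatology given once as a reference, makes the comparison independent of the absolute AET level in each season.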
Methods section
• The Methods read a little cluttered in parts, distracting from the overall story. You could consider moving some elements (e.g., the sensitivity analysis description) to the supplementary materials, to tighten this.
Technical corrections
• Line 72 – “if or if not” is awkward. Suggest “whether” or “whether or not.”
• Line 140 – “We use the Penman–Monteith method…” appears to be repeated.
• Line 189 – Extra space before the bracket in “(5 km resolution).”
• Figure 3 – Consider adding detail to panel lettering (e.g., “a) all variables”, “b) LAI…”). Increase font size of u* on the axes and briefly define it in the caption (e.g., “higher u* = more sensitive”).
• Figure 5 – Consider relabelling the y-axis to something clearer (e.g., “Cumulative probability”) or define F(x) in the caption.
• Line 398 – missing “to” after according.
• Figure 8 – Consider adding the projection period to each panel (e.g., “a) 2031–2050” or “a) near-future”).
• Figure 11 – Axis font sizes are too small; consider increasing.
• Lines 489–490 – The sentence structure is unclear, and “with particularly for AET” is incorrect grammar. A clearer option might be: “Similar to previous comparative studies, we investigate simple to comprehensive calibration strategies, with a particular focus on AET.”
Citation: https://doi.org/10.5194/egusphere-2025-3836-RC2
AC1: 'Reply on RC2', Fabian Merk, 23 Dec 2025
We thank the reviewer for their time and effort in evaluating our manuscript, for acknowledging the quality of our work, and for providing constructive comments and suggestions. Please find our detailed responses to the comments in the attached document.
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 1,603 | 195 | 29 | 1,827 | 48 | 28 | 23 |
Dear Authors,
I have carefully reviewed your manuscript entitled “The impact of calibration strategies on future evapotranspiration projections: a SWAT-T comparison of three hydrological modeling approaches in West Africa.” This study addresses a critical topic for hydrological modeling in the context of climate change, particularly in data-scarce and climate-sensitive regions like West Africa. The use of SWAT-T in combination with various calibration strategies and the consideration of actual evapotranspiration (AET) as a target variable is timely and significant. The emphasis on model equifinality, and the use of LAI as a potential proxy for AET estimation, reflects a deep engagement with current methodological challenges in hydrological modeling.
There are, however, substantial areas where the manuscript could be improved to ensure clarity, scientific rigor, and proper contextualization. I recommend that the manuscript be returned for major revision.
Line 15–20
You state that “the combined ‘Q + LAI’ can be used as a proxy to predict AET rates.” This is an important claim but needs further qualification. While your results suggest that LAI contributes to improved AET estimates, it is not fully demonstrated that this proxy relationship holds across climatic variability or land use types. You may consider referencing doi: [10.1016/j.scitotenv.2020.143792], which evaluates how dynamic LULC influences surface runoff and can affect AET estimates as well.
Lines 28–30
The statement that AET plays an “essential role in the regional hydrology” is well made but would benefit from a more precise articulation of the mechanisms, e.g., the seasonal feedbacks between soil moisture, vegetation phenology, and transpiration. Currently, the sentence reads too broadly.
Lines 61–70
This section criticizes prior studies for omitting AET in model calibration. However, the critique would be stronger if you quantified the extent of the bias introduced by using only discharge. For example, what is the typical error in AET projections under Q-only calibration across your GCM ensemble? Additionally, reference could be made to the limitations of using discharge-only calibration in semi-arid systems, as shown in doi: [10.3390/land12112017].
Lines 88–90
The assertion that this study “contributes to minimize uncertainties from model calibration approaches” should be reworded. Rather than claiming uncertainty minimization, a more accurate statement might be that the study demonstrates how calibration strategies affect model spread and projections.
Lines 104–108
The explanation of SWAT-T improvements over SWAT is helpful, but further clarification is needed on how the tropical phenology routines specifically alter AET dynamics. For instance, how does the model account for year-round biomass retention or multi-modal LAI cycles? Please consider adding a brief example or case insight.
Lines 144–154
The plant parameters were fixed based on prior literature, but it is unclear whether those parameter values are transferable across different land covers or years. Given that some of these parameters (e.g., LAIMX2, PHU) can be sensitive to local agronomic practices and interannual climate variability, this approach could bias calibration. Have you evaluated the robustness of this transfer? Could a partial re-optimization for these parameters improve model skill?
Lines 165–175
The land use representation for croplands through “AGRL” seems overly simplified. Since croplands cover a substantial portion of the catchment, this may limit the realism of AET dynamics, particularly in peak growing periods. Have you considered applying crop-specific growth curves or incorporating a dynamic planting calendar?
Lines 195–205
The aggregation of GLASS-LAI and FLUXCOM-AET data to the subbasin scale is described clearly. However, you should mention potential scale mismatches between remote sensing and model HRUs. This aggregation likely introduces smoothing effects that may obscure localized land-atmosphere feedbacks. Please discuss whether this mismatch may influence LAI-AET correlation strength.
Lines 225–235
The explanation of the Morris method is mathematically accurate, but its description interrupts the methodological flow. Consider moving equations to the Supplementary Material or shortening the derivation here. Focus instead on the reasoning for using Morris over Sobol or other variance-based methods.
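For orientation, the screening idea behind the Morris method can be sketched in a few lines. The version below is a simplified one-at-a-time trajectory design on the unit hypercube with upward steps only; it is an illustrative assumption, not the design used in the manuscript, and `morris_screening` and the toy model are hypothetical.

```python
import random
from statistics import mean, pstdev


def morris_screening(model, n_params, n_traj=20, levels=4, seed=0):
    """Return (mu_star, sigma) of elementary effects per parameter.

    Parameters are assumed to be scaled to [0, 1]. Each trajectory starts
    at a random grid point and perturbs every parameter once by +delta.
    """
    rng = random.Random(seed)
    delta = levels / (2.0 * (levels - 1))
    # Start only from grid levels for which x + delta stays inside [0, 1].
    grid = [k / (levels - 1) for k in range(levels)
            if k / (levels - 1) + delta <= 1.0 + 1e-12]
    effects = [[] for _ in range(n_params)]
    for _ in range(n_traj):
        x = [rng.choice(grid) for _ in range(n_params)]
        y = model(x)
        order = list(range(n_params))
        rng.shuffle(order)  # random order of one-at-a-time perturbations
        for i in order:
            x_new = list(x)
            x_new[i] = x[i] + delta
            y_new = model(x_new)
            effects[i].append((y_new - y) / delta)  # elementary effect
            x, y = x_new, y_new
    mu_star = [mean(abs(e) for e in ee) for ee in effects]  # overall influence
    sigma = [pstdev(ee) for ee in effects]  # non-linearity / interactions
    return mu_star, sigma


def toy_model(x):
    """Toy response: strong linear term, an interaction, and an inert parameter."""
    return 3.0 * x[0] + 2.0 * x[1] * x[2] + 0.0 * x[3]


mu_star, sigma = morris_screening(toy_model, n_params=4)
for i, (m, s) in enumerate(zip(mu_star, sigma)):
    print(f"param {i}: mu* = {m:.2f}, sigma = {s:.2f}")
```

Each trajectory costs one model run per parameter plus one, which is the usual argument for Morris screening over variance-based methods such as Sobol when model runs are expensive.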
Lines 254–266
The role of half-space depth in ROPE is well explained, but please clarify how convergence was assessed. Did you apply any stopping criteria based on performance plateauing? Were all final parameter sets within the convex hull? Details like these improve transparency and reproducibility.
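One way to make such a stopping criterion explicit is a plateau check on the best objective value per iteration. The sketch below is a generic illustration and not ROPE's own rule; the function names, the window, the tolerance, and the score sequence are all hypothetical, and a higher-is-better metric such as KGE is assumed.

```python
def has_plateaued(best_scores, window=3, tol=1e-3):
    """True if the best score improved by less than `tol` over the last `window` iterations."""
    if len(best_scores) <= window:
        return False
    return best_scores[-1] - best_scores[-1 - window] < tol


def first_plateau_iteration(best_scores, window=3, tol=1e-3):
    """1-based iteration at which the plateau criterion is first met, or None."""
    for n in range(1, len(best_scores) + 1):
        if has_plateaued(best_scores[:n], window, tol):
            return n
    return None


# Hypothetical best-KGE-per-iteration trace (placeholder values).
scores = [0.40, 0.55, 0.62, 0.66, 0.672, 0.6725, 0.6727, 0.6728]
print("plateau reached:", has_plateaued(scores))
print("first plateau at iteration:", first_plateau_iteration(scores))
```

Reporting the window and tolerance together with the iteration at which the criterion was met would address the reproducibility point raised above.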
Lines 300–310
You use the W5E5 simulation period (2001–2015) as a baseline. However, earlier you mentioned that observed data go back to 1981. Why was a shorter baseline chosen, especially given that longer baselines can reduce noise in climate signal detection?
Lines 361–375
The seasonal patterns of AET are well captured in the model, particularly the mid-season dip. However, your explanation that this dip is “due to lack of LAI representation” in Q-only is somewhat superficial. Could the effect also arise from misrepresented soil moisture dynamics or canopy interception?
Lines 395–405
The cross-validation results show that W5E5-derived parameters can be used with observed forcing. However, the validation for LAI shows noticeable degradation. This suggests that model structural limitations may constrain LAI robustness under different climates. Could you expand on the implications of this for future climate scenario modeling?
Lines 425–435
You evaluate GCM ensemble spread and precipitation changes, but the connection between these meteorological changes and hydrological sensitivity is underexplored. For example, how does GCM precipitation variability translate into spread in Q or AET for the different calibration approaches? A more integrated uncertainty decomposition would be valuable here.
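A simple form of the decomposition requested here splits the variance of projected changes into a calibration-strategy part, a GCM part, and a residual interaction part. The sketch below assumes a complete, balanced table of changes (rows as calibration strategies, columns as GCMs); the function name and the numbers are hypothetical placeholders, not values from the manuscript.

```python
from statistics import mean, pvariance


def decompose_variance(table):
    """Share of total variance explained by row and column main effects.

    `table[i][j]` is the projected change for strategy i and GCM j.
    For a balanced table the three shares sum to one. The total variance
    must be non-zero.
    """
    cells = [v for row in table for v in row]
    total = pvariance(cells)
    row_var = pvariance([mean(row) for row in table])        # strategy effect
    col_var = pvariance([mean(col) for col in zip(*table)])  # GCM effect
    return {
        "strategy": row_var / total,
        "gcm": col_var / total,
        "interaction": (total - row_var - col_var) / total,
    }


# Placeholder changes in annual AET (%), three strategies x five GCMs.
delta_aet = [
    [2.0, 3.5, 1.0, 4.0, 2.5],  # Q only
    [4.5, 6.0, 3.0, 7.5, 5.0],  # Q + LAI
    [5.0, 7.0, 3.5, 8.0, 5.5],  # Q + LAI + AET
]

for source, share in decompose_variance(delta_aet).items():
    print(f"{source}: {100 * share:.1f} % of variance")
```

Applied separately to Q and AET and to each scenario and period, such shares would show whether the spread is dominated by the climate forcing or by the calibration choice.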
Throughout, there is a lack of critical reflection on the limitations of remote sensing products used for LAI and AET. While GLASS-LAI and FLUXCOM are high-quality datasets, they carry uncertainties, especially in tropical canopies. A brief paragraph discussing these limitations would strengthen the credibility of your validation approach.
Finally, regarding language and terminology throughout the manuscript, I believe that phrases such as “to guarantee convergence,” “mimics well,” and “modeling potential” could be rephrased to enhance precision. For instance, “the model performance plateaus after 12 iterations” is preferable to “to guarantee convergence.” Similarly, instead of “mimics well,” you could say “closely reproduces.”