the Creative Commons Attribution 4.0 License.
Same Streamflow, Different Water Stories: The Hidden Impacts of Streamflow-Only Calibration in Distributed Hydrological Modeling
Abstract. Distributed hydrological models enable the characterization of spatial heterogeneities in states and fluxes, including streamflow at inner points of a basin. Despite the growing number of remotely sensed observations, calibrating the model parameters using only streamflow observed at the catchment outlet remains a popular practice. In this paper, we examine how streamflow-only calibration impacts the average seasonality and spatial patterns of simulated evapotranspiration (ET), soil moisture (SM), land surface temperature (LST), and fractional snow-covered area (fSCA). To this end, we conduct calibration experiments with the Variable Infiltration Capacity (VIC) model in six basins located in Chile, using (i) different streamflow-based objective functions, and (ii) regularizing parameters associated with different physical processes. For the latter step, we develop and test a novel spatial regularization strategy based on principal component analysis of physiographic attributes associated with the modeling units contained within each basin. Our results suggest that these decisions may have large effects on the spatial representation of ET, SM1 (i.e., SM from the first soil layer in VIC), LST, and fSCA, without degrading the performance of streamflow simulations. The average streamflow seasonality can be simulated reasonably well, with large biases in ET, fSCA, SM1, and LST (in that order). In particular, different calibration configurations can yield the same annual cycle of streamflow through very different ET seasonalities, affecting the catchment-scale seasonal water balance. Additional calibration experiments incorporating ET and SM1 besides streamflow reaffirm tradeoffs in the fidelity of different simulated variables. 
Overall, the results presented here reinforce the benefits of including spatial patterns of hydrological variables in the calibration of distributed hydrological models and highlight the need to verify the seasonality of other simulated variables besides streamflow.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-1363', Anonymous Referee #1, 11 May 2026
- RC2: 'Comment on egusphere-2026-1363', Anonymous Referee #2, 29 May 2026
This paper presents a valuable investigation of an important topic: how calibration strategies for distributed hydrological models affect not only streamflow performance, but also the physical realism of internal hydrological processes. The proposed regularization framework is interesting, and the results are generally well presented. Overall, I commend the authors for this useful contribution. I have a few comments that may help further improve the manuscript.
General comments
1. The authors use Principal Component Analysis (PCA) to regularize the spatial distribution of model parameters. This is an interesting approach, but its physical basis could be explained more clearly. In particular, it would be useful to clarify what PC1 represents in each catchment. Does it mainly reflect soil texture, elevation, slope, or a combined physiographic gradient?
The authors should also justify why only PC1 is used. Some hydrologically relevant spatial patterns may be represented by higher-order PCs, even if they explain less total variance. Reporting the PCA loadings and the variance explained by PC1 would make the method more transparent.
In addition, the authors may consider discussing the choice of input attributes used for PCA. Besides the selected soil and topographic variables, other descriptors such as land cover, vegetation type, rooting depth, drainage area, flow accumulation, or topographic wetness index may also be hydrologically relevant.
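As an illustration of the reporting requested in this comment, the following is a minimal sketch of how PCA loadings and explained variance could be tabulated. It is not the authors' implementation: the attribute names (`elevation`, `slope`, `sand`), the synthetic values, and the helper `pca_summary` are hypothetical, and the attributes are assumed to be standardized before decomposition.

```python
import numpy as np

def pca_summary(attributes):
    """Return loadings, explained-variance ratios, and PC scores for an
    (n_units, n_attributes) array of physiographic attributes."""
    X = np.asarray(attributes, dtype=float)
    # Standardize each attribute so that units do not dominate the result.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the principal axes.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = U * s  # projection of each modeling unit onto each PC
    return Vt, explained, scores

# Synthetic, purely illustrative attributes for 50 modeling units.
rng = np.random.default_rng(0)
elevation = rng.normal(2000.0, 600.0, size=50)
slope = 0.01 * elevation + rng.normal(0.0, 2.0, size=50)
sand = rng.uniform(20.0, 80.0, size=50)

loadings, explained, scores = pca_summary(
    np.column_stack([elevation, slope, sand])
)
print("PC1 loadings (elevation, slope, sand):", loadings[0])
print("Fraction of variance explained per PC:", explained)
```

A table of `loadings[0]` and `explained[0]` per catchment would show which attributes dominate PC1 and how much spatial variability is left to the higher-order components.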
2. The calibration results may depend on the random initial conditions, random seed, or stochastic search path of the calibration algorithm. This is particularly relevant because the paper focuses on equifinality and differences among calibration configurations.
I do not suggest repeating the full experimental design for all catchments, as this may be computationally expensive. However, a limited robustness test could be useful. For example, the authors could repeat a subset of calibration experiments for one representative catchment, or one catchment from each hydrological regime, using several random seeds. Reporting the spread in final objective-function values, calibrated parameters, and key diagnostic variables such as ET, SM, LST, and fSCA would help determine whether the main conclusions are robust to calibration stochasticity.
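The robustness test suggested above could be organized along the following lines. This is only a sketch under strong assumptions: `toy_objective` is a hypothetical stand-in for a streamflow error metric, the two-parameter search space is invented, and plain random search replaces whatever calibration algorithm the authors actually use.

```python
import numpy as np

def toy_objective(params):
    """Hypothetical stand-in for a streamflow error metric (lower is better).
    Only the product a*b is constrained, so many (a, b) pairs score equally."""
    a, b = params
    return (a * b - 1.0) ** 2

def random_search(seed, n_iter=2000):
    """Run one stochastic calibration with a fixed random seed."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(0.1, 5.0, size=(n_iter, 2))
    scores = np.array([toy_objective(p) for p in candidates])
    best = np.argmin(scores)
    return candidates[best], scores[best]

def seed_spread(seeds):
    """Collect the best parameters and objective values across seeds."""
    results = [random_search(s) for s in seeds]
    params = np.array([p for p, _ in results])
    scores = np.array([f for _, f in results])
    return params, scores

params, scores = seed_spread(range(5))
print("Objective range across seeds:", scores.min(), scores.max())
print("Parameter std across seeds:", params.std(axis=0))
```

In this toy setting the objective values are expected to be nearly identical across seeds while the parameters differ, which is the kind of spread the comment asks the authors to report alongside ET, SM, LST, and fSCA diagnostics.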
3. The manuscript evaluates model realism using gridded or remotely sensed products for ET, LST, fSCA, and SM. These products are valuable, but they also contain uncertainties related to retrieval algorithms, spatial resolution, temporal aggregation, and scale mismatch. Therefore, the interpretation of model performance against these products should be more cautious.
For example, when simulations differ from a gridded ET or SM product, this should not necessarily be interpreted as definitive model error. The authors may revise some statements to indicate that the simulations deviate from the selected reference product, rather than implying that the model is necessarily physically incorrect. A brief discussion of observational uncertainty and scale mismatch would improve the balance of the interpretation.
Minor specific comments
- Line 55: Please define “super-parameter” when it first appears.
- Line 72: Please provide the full names of LST, ET, fSCA, and SM at first use.
- Introduction: The motivation for parameter regularization could be stated more explicitly. The authors could briefly explain the difficulties of calibrating many spatially distributed parameters, including overparameterization, equifinality, limited parameter identifiability, and computational cost. This would better motivate the need for regularization.
Citation: https://doi.org/10.5194/egusphere-2026-1363-RC2
This manuscript addresses a highly relevant and persistent problem in distributed hydrological modeling: the extent to which streamflow-only calibration compromises the fidelity of simulated internal states and fluxes. By systematically exploring combinations of objective functions and spatial regularization strategies across six diverse Chilean catchments, the authors provide compelling evidence that similar streamflow performance can emerge from substantially different internal hydrological dynamics. The manuscript is generally well written, and the hypotheses are clearly stated. However, several technical and presentation issues need to be addressed before publication.
Specific comments:
1. The manuscript proposes a PCA-based spatial regularization strategy using physiographic variables. While this approach is interesting, its novelty relative to existing regionalization and transfer-function approaches (e.g., MPR) remains somewhat unclear. In addition, the approach appears somewhat empirical, and the motivation for using only PC1 is insufficiently explained. Could the use of additional PCs, or a weighted combination of PCs, improve the transferability of the approach? The physical meaning of the PCA-derived spatial fields is also insufficiently discussed.
2. The manuscript repeatedly emphasizes equifinality and compensation among fluxes and states. However, the study does not provide a formal parameter uncertainty or identifiability analysis. It would be useful to know how different objective functions influence parameter constraints, whether the inclusion of ET or SM observations reduces parameter equifinality, and to what extent the proposed PCA-based regularization framework improves parameter identifiability and mitigates equifinality in distributed hydrological modeling.
3. One of the key findings is that ET seasonalities differ substantially despite similar streamflow simulations. This is an interesting and important result. The analysis relating these ET shifts to moisture availability in deeper soil layers (Figure 9) is insightful. However, the discussion stops short of a full diagnostic. It would be beneficial to state explicitly which process parameterizations are most influential in the configurations that produce the largest errors in ET seasonality. For instance, are large ET biases consistently linked to unrealistic soil water storage dynamics in layer 3? Does the timing of snow accumulation and melt contribute to the ET discrepancies? A more mechanistic interpretation would elevate the paper's impact.
4. The additional two multi-objective calibration experiments are valuable but appear somewhat preliminary relative to the broader conclusions of the paper. Why did the authors not consider combining multivariate calibration with the proposed spatial regularization framework?
5. Although the manuscript explicitly states that one of the research questions is how to overcome the tradeoffs between accurately replicating streamflow annual cycles and effectively simulating the seasonal patterns of other hydrological variables, the study does not appear to provide a clear or systematic answer to this question beyond demonstrating the existence of such tradeoffs.
6. The study uses a single calibration period (2005–2018) without independent validation. Given the strong conclusions regarding model realism and process fidelity, an independent validation period is important to demonstrate robustness and avoid overfitting.
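To make the last comment concrete, a split-sample evaluation could look like the sketch below. Everything in it is an assumption for illustration: the split year (2012), the synthetic monthly series, and the use of Nash-Sutcliffe efficiency (NSE) as the metric are not taken from the manuscript.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency; 1.0 indicates a perfect match."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def split_sample_scores(years, obs, sim, split_year):
    """Score the calibration period (years before split_year) and the
    validation period (split_year onward) separately."""
    years = np.asarray(years)
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    cal = years < split_year
    val = ~cal
    return nse(obs[cal], sim[cal]), nse(obs[val], sim[val])

# Synthetic monthly series for 2005-2018 (hypothetical values).
years = np.repeat(np.arange(2005, 2019), 12)
months = np.tile(np.arange(12), 14)
obs = 10.0 + 5.0 * np.sin(2.0 * np.pi * months / 12.0)
rng = np.random.default_rng(1)
sim = obs + rng.normal(0.0, 1.0, size=obs.size)

cal_score, val_score = split_sample_scores(years, obs, sim, split_year=2012)
print("Calibration NSE:", cal_score)
print("Validation NSE:", val_score)
```

The same split could be applied to ET, SM, LST, and fSCA diagnostics, so that any loss of skill outside the calibration period is visible for variables other than streamflow.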