the Creative Commons Attribution 4.0 License.
HESS Opinions: A few camels or a whole caravan?
Abstract. Large-sample datasets containing hydrometeorological time series and catchment attributes for hundreds of catchments in a country, many of them known as “Camels” (catchment attributes and meteorology for large-sample studies), have revolutionized hydrological modelling and enabled comparative analyses. The Caravan dataset is a compilation of several (“Camels” and other) large-sample datasets with uniform attribute names and data structure. This simplifies large-sample hydrology across regions, continents, or the globe. However, the use of the Caravan dataset instead of the original Camels or other large-sample datasets may affect model results and the conclusions derived thereof. For the Caravan dataset, the meteorological forcing data are based on ERA5-Land reanalysis data. Here, we describe the differences between the original precipitation, temperature, and potential evapotranspiration (Epot) data for 1252 catchments in the CAMELS-US, CAMELS-BR, and CAMELS-GB datasets and the forcing data for these catchments in the Caravan dataset. The Epot in the Caravan dataset is unrealistically high for many catchments but there are, not surprisingly, also considerable differences in the precipitation data. We show that the use of the forcing data from the Caravan dataset impairs hydrological model calibration for the vast majority of catchments, i.e., there is a drop in the calibration performance when using the forcing data from the Caravan dataset compared to the original Camels datasets. This drop is mainly due to the differences in the precipitation data. Therefore, we suggest extending the Caravan dataset with the forcing data included in the original Camels datasets wherever possible, so that users can choose which forcing data they want to use, or at least indicating clearly that the forcing data in Caravan come with a data quality loss and using the original datasets is recommended. 
Moreover, we suggest not using the Epot data (and derived catchment attributes, such as the aridity index) from the Caravan dataset and replacing these with (or based on) alternative Epot estimates.
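To illustrate why an overestimated Epot also affects derived catchment attributes, the following minimal sketch computes an aridity index as the ratio of mean Epot to mean precipitation. This ratio convention, the function name `aridity_index`, and the numbers in the usage note are assumptions for illustration only; they are not taken from the Caravan or Camels datasets.

```python
def aridity_index(mean_epot, mean_precip):
    """Aridity index as mean Epot divided by mean precipitation.

    Assumption: both inputs are long-term means in the same units
    (e.g. mm per year). Other conventions (e.g. P / Epot) exist.
    """
    if mean_precip <= 0:
        raise ValueError("mean_precip must be positive")
    return mean_epot / mean_precip


# Hypothetical catchment: same precipitation, two different Epot estimates.
reference = aridity_index(mean_epot=600.0, mean_precip=1000.0)
inflated = aridity_index(mean_epot=1200.0, mean_precip=1000.0)
```

With these hypothetical values, doubling Epot moves the index from below 1 to above 1, so the same catchment would switch from an energy-limited to a water-limited classification purely because of the forcing product.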
Status: open (until 28 May 2024)
RC1: 'Comment on egusphere-2024-864', Thorsten Wagener, 27 Apr 2024
Comparative hydrology with large samples of catchment scale data is a rapidly growing topic in hydrology. Samples are growing to sizes of many thousands of catchments around the world. This offers tremendous opportunities for new learning, but it also creates potential problems. One problem is that errors or inconsistencies in the data get propagated into subsequent studies because there is an assumption that available datasets are ready for use.
Clerc-Schwarzenbach and co-authors address this issue with the example of the popular Caravan dataset in which multiple datasets have been combined. To harmonize the data, some meteorological variables of the original national datasets have been replaced by global products. However, Clerc-Schwarzenbach and co-authors found that this can cause significant problems given some large differences between national and global estimates. This is a very relevant and timely study. It is nice work with a well written manuscript. My comments are mainly suggestions for further improvement.
Main Comments
Are there evaluations of the ERA5-Land reanalysis dataset outside its use for hydrological modelling that might offer relevant insights into regional differences? The studies currently cited seem largely focused on hydrological applications, though I assume there must also be other uses of this dataset.
(Section 4.3) As the authors discuss in this section, hydrological models can generally cope well with poor PET values given that they scale this input variable anyway. What would be nice to add to the discussion is the potential problem of biased parameters. Depending on the model structure, one or more parameters will absorb the bias in the forcing data. This is problematic if the resulting values are used to characterize the system (e.g. Bouaziz et al., 2022, HESS, https://doi.org/10.5194/hess-26-1295-2022 and references therein). Are there parameters in HBV that would show this bias? I could not find a good example in the literature, but it would be interesting to see how stepwise increases in PET are reflected in stepwise bias in a parameter.
In addition to the specific comments regarding the Caravan dataset, are there more general lessons to be learned? E.g. regarding how to benchmark new datasets? This general problem might come up more often in the future in various datasets.
Minor Comments
(Section 4.2) HBV and HyMod have been calibrated to the MOPEX catchments (precursor of CAMELS-US) with NSE (no KGE then) to identify problematic catchments (Kollat et al., 2012, WRR, doi:10.1029/2011WR011534). This might offer a possible comparison of difficult-to-model catchments.
(Section 4.3) The low performance of models like HBV in chalk catchments in the south of the UK improves significantly when a more suitable model structure for groundwater processes is used. See the recent study by Kiraz et al. (2023, HSJ, https://doi.org/10.1080/02626667.2023.2251968); results for KGE are in the supplemental material of the study.
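For readers comparing results reported with NSE against those reported with KGE, the two efficiency measures mentioned above can be sketched as follows. This is a minimal illustration using the standard Nash-Sutcliffe definition and the original 2009 formulation of the Kling-Gupta efficiency; the function names are hypothetical, and it is an assumption that the studies discussed here use exactly these variants.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over variance of obs."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def kge(obs, sim):
    """Kling-Gupta efficiency (2009 formulation).

    Combines linear correlation (r), variability ratio (alpha),
    and bias ratio (beta) into a single score.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # ratio of standard deviations
    beta = sim.mean() / obs.mean()    # ratio of means
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both measures equal 1 for a perfect simulation, but they penalize errors differently: a simulation equal to the observed mean has an NSE of 0, whereas the KGE of the same simulation is not 0, so thresholds for "problematic" catchments are not directly transferable between the two.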
Citation: https://doi.org/10.5194/egusphere-2024-864-RC1
Model code and software
HESS Opinions: A few camels or a whole caravan? Franziska M. Clerc-Schwarzenbach https://doi.org/10.5281/zenodo.10784701