the Creative Commons Attribution 4.0 License.
HESS Opinions: A few camels or a whole caravan?
Abstract. Large-sample datasets containing hydrometeorological time series and catchment attributes for hundreds of catchments in a country, many of them known as “Camels” (catchment attributes and meteorology for large-sample studies), have revolutionized hydrological modelling and enabled comparative analyses. The Caravan dataset is a compilation of several (“Camels” and other) large-sample datasets with uniform attribute names and data structure. This simplifies large-sample hydrology across regions, continents, or the globe. However, the use of the Caravan dataset instead of the original Camels or other large-sample datasets may affect model results and the conclusions derived thereof. For the Caravan dataset, the meteorological forcing data are based on ERA5-Land reanalysis data. Here, we describe the differences between the original precipitation, temperature, and potential evapotranspiration (Epot) data for 1252 catchments in the CAMELS-US, CAMELS-BR, and CAMELS-GB datasets and the forcing data for these catchments in the Caravan dataset. The Epot in the Caravan dataset is unrealistically high for many catchments but there are, not surprisingly, also considerable differences in the precipitation data. We show that the use of the forcing data from the Caravan dataset impairs hydrological model calibration for the vast majority of catchments, i.e., there is a drop in the calibration performance when using the forcing data from the Caravan dataset compared to the original Camels datasets. This drop is mainly due to the differences in the precipitation data. Therefore, we suggest extending the Caravan dataset with the forcing data included in the original Camels datasets wherever possible, so that users can choose which forcing data they want to use, or at least indicating clearly that the forcing data in Caravan come with a data quality loss and using the original datasets is recommended. 
Moreover, we suggest not using the Epot data (and derived catchment attributes, such as the aridity index) from the Caravan dataset and replacing these with (or based on) alternative Epot estimates.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2024-864', Thorsten Wagener, 27 Apr 2024
Comparative hydrology with large samples of catchment scale data is a rapidly growing topic in hydrology. Samples are growing to sizes of many thousands of catchments around the world. This offers tremendous opportunities for new learning, but it also creates potential problems. One problem is that errors or inconsistencies in the data get propagated into subsequent studies because there is an assumption that available datasets are ready for use.
Clerc-Schwarzenbach and co-authors address this issue with the example of the popular Caravan dataset in which multiple datasets have been combined. To harmonize the data, some meteorological variables of the original national datasets have been replaced by global products. However, Clerc-Schwarzenbach and co-authors found that this can cause significant problems given some large differences between national and global estimates. This is a very relevant and timely study. It is nice work with a well written manuscript. My comments are mainly suggestions for further improvement.
Main Comments
Are there evaluations of the ERA5-Land reanalysis dataset outside its use for hydrological modelling that might offer relevant insights into regional differences? The studies currently cited seem largely focused on hydrological applications, though I assume there must also be other uses of this dataset.
(Section 4.3) As the authors discuss in this section, hydrological models can generally cope well with poor PET values given that they scale this input variable anyway. What would be nice to add to the discussion is the potential problem of biased parameters. Depending on the model structure, one or more parameters will absorb the bias in the forcing data. This is problematic if the resulting values are used to characterize the system (e.g. Bouaziz et al., 2022, HESS, https://doi.org/10.5194/hess-26-1295-2022 and references therein). Are there parameters in HBV that would show this bias? I could not find a good example in the literature, but it would be interesting to see how stepwise increases in PET are reflected in stepwise bias in a parameter.
In addition to the specific comments regarding the Caravan dataset, are there more general lessons to be learned, e.g. regarding how to benchmark new datasets? This general problem might come up more often in the future with other datasets.
Minor Comments
(Section 4.2) HBV and HyMod have been calibrated to the MOPEX catchments (precursor of CAMELS-US) with NSE (no KGE then) to identify problematic catchments (Kollat et al., 2012, WRR, doi:10.1029/2011WR011534). This might offer a possible comparison of difficult-to-model catchments.
(Section 4.3) The problem of low performance of models like HBV in chalk catchments in the south of the UK is significantly reduced when a more suitable model structure for groundwater processes is used. See the recent study by Kiraz et al. (2023, HSJ, https://doi.org/10.1080/02626667.2023.2251968); results for KGE are in the supplemental material of the study.
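For readers less familiar with the two efficiency metrics referred to in these comments, the following is a minimal sketch of their standard definitions. It assumes the original three-component form of KGE (correlation, variability ratio, bias ratio); the function names are illustrative and are not taken from the manuscript or its code.

```python
import math


def _mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = _mean(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance_sum


def kge(obs, sim):
    """Kling-Gupta efficiency in its original three-component form.

    Undefined if either series is constant or if the observed mean is zero.
    """
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    std_obs = math.sqrt(_mean([(o - mean_obs) ** 2 for o in obs]))
    std_sim = math.sqrt(_mean([(s - mean_sim) ** 2 for s in sim]))
    cov = _mean([(o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)])
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

Because KGE penalises bias and variability errors separately from correlation, a simulation driven by biased forcing can score differently under the two metrics, which is worth keeping in mind when comparing NSE-based and KGE-based lists of difficult catchments.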
Citation: https://doi.org/10.5194/egusphere-2024-864-RC1
AC1: 'Reply on RC1', Franziska Clerc-Schwarzenbach, 06 Jun 2024
RC2: 'Comment on egusphere-2024-864', François Brissette, 24 May 2024
HESS Opinions: A Few Camels or a Whole Caravan? By Franziska Maria Clerc-Schwarzenbach et al.
General Comments
As a strong believer in the importance of large-sample studies in hydrology, I read the paper with much interest. I found the results very interesting despite being unsurprised by them. The strengths and drawbacks of ERA5 and its little brother ERA5-Land are fairly well known, especially when it comes to temperature (very good) and precipitation (good with some issues such as regional biases). Potential evapotranspiration is more of an unknown, and I found its use in Caravan a bit perplexing since it is an unknown quantity. I believe that 'exotic' data from reanalysis should be thoroughly validated before their incorporation into any hydrological study. My understanding is that potential evapotranspiration is computed using the surface energy balance assuming a crop soil surface, as it was included for irrigation purposes. As such, it is not surprising that catchment-scale estimates would be severely overestimated in many cases. See Muñoz-Sabater et al. (2021) for example.
Despite calling the results 'unsurprising', I believe this paper makes some very good observations that are useful to the community and is therefore worthy of publication. In particular:
1-Observations on three continents (and many catchments) confirm results from previous regional studies, notably that ERA5 temperatures are excellent substitutes for observations, at least in the context of hydrological models, and that precipitation is more problematic. Precipitation from ERA5 can certainly be used, but a modeling performance decrease is to be expected.
2-It clearly shows that potential evapotranspiration (PET) from ERA5-Land is problematic. This could have been an educated guess prior to this study, but now we have a clear and well-documented issue.
3-The documentation that precipitation deficiencies are more important than PET deficiencies, despite the much larger biases of the latter, is also very interesting.
4-PET deficiencies can be largely removed by recalculating PET with formulas based on other variables (e.g., temperature, as done in this work). Other formulations using additional reanalysis variables would likely perform even better.
5-The seven modeling experiments provide very useful information on the strengths of the various datasets.
6-The paper provides a clear warning to hydrologists who are increasingly willing to use such datasets without a clear understanding of the outstanding issues related to reanalysis data.
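Point 4 above can be illustrated with a short sketch of a temperature-based PET recalculation. It assumes the Oudin et al. (2005) formula with extraterrestrial radiation computed from the FAO-56 equations; this is one common choice and not necessarily the formula used in the manuscript. All function and constant names are illustrative.

```python
import math

SOLAR_CONSTANT = 0.0820  # MJ m-2 min-1, as used in FAO-56
LATENT_HEAT = 2.45       # MJ kg-1, latent heat of vaporisation
WATER_DENSITY = 1000.0   # kg m-3


def extraterrestrial_radiation(lat_deg, day_of_year):
    """Daily extraterrestrial radiation in MJ m-2 d-1 (FAO-56 equations)."""
    lat = math.radians(lat_deg)
    # Inverse relative Earth-Sun distance and solar declination
    dr = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    decl = 0.409 * math.sin(2.0 * math.pi * day_of_year / 365.0 - 1.39)
    # Clamp the argument so polar day and polar night do not break acos
    x = max(-1.0, min(1.0, -math.tan(lat) * math.tan(decl)))
    sunset_angle = math.acos(x)
    return (24.0 * 60.0 / math.pi) * SOLAR_CONSTANT * dr * (
        sunset_angle * math.sin(lat) * math.sin(decl)
        + math.cos(lat) * math.cos(decl) * math.sin(sunset_angle)
    )


def pet_oudin(temp_c, lat_deg, day_of_year):
    """Daily PET in mm d-1 from mean air temperature (deg C) and latitude."""
    if temp_c + 5.0 <= 0.0:
        return 0.0  # the formula sets PET to zero at or below -5 deg C
    ra = extraterrestrial_radiation(lat_deg, day_of_year)
    # ra / (latent heat * density) gives metres of water per day; convert to mm
    return 1000.0 * ra / (LATENT_HEAT * WATER_DENSITY) * (temp_c + 5.0) / 100.0
```

Such a formula needs only temperature, latitude, and the day of the year, so it can be applied to the Caravan temperature series without any additional reanalysis variables.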
Specific Comments
I am not fully sure why this is considered an Opinion paper. To me, the breadth of the research work and analysis clearly qualifies it as a research paper. I did not see much 'opinion' in this paper, as most of the arguments/discussions are results-based. Consequently, I suggest this submission be reclassified as a research paper.
I think the title does a disservice to the paper. I like the catchy phrase, but in reality, I would think that a majority of hydrologists are not familiar with Camels, and an even smaller number are aware of Caravan. A more generic title referring to large-sample datasets and global datasets would be more appropriate for the varied readership of HESS.
I believe an additional discussion point should be added regarding the choice of a particular hydrological model. It is well known that some models may be more flexible than others at adapting to biases in input variables such as precipitation and easily scale PET with specific calibration parameters. This is mentioned in the paper, but I believe some other hydrological models may perform better than the one used in this study, and the performance drop mentioned in this study may not be as bad. I certainly would not expect the conclusions of this paper to be any different, but this should be mentioned.
There should be a mention of the upcoming ERA6 reanalysis. The ERA5 reanalysis used in Caravan will soon be a thing of the past. In addition to improved resolution, ERA6 will have a full overhaul of the model physics, including radiation, which is overestimated in ERA5 and likely part of the PET problem, in addition to the issue discussed above. Based on past history, we can expect a significant performance increase with ERA6. This should be mentioned in the paper. I believe that reanalysis is indeed the future of large-sample hydrology and that merging reanalysis with deep-learning approaches will produce very high-quality global datasets much sooner than most people think. Already, the merging of deep-learning methods with weather forecasting models promises to revolutionize weather forecasting. Exciting times.
I would also like to add one important advantage of global datasets based on reanalysis that was not mentioned in the paper: they are easily updated once a new version comes out. In addition, new data is produced in near-real time. By comparison, datasets relying on observations (e.g., Camels) are much more complex to update (missing data, stations being decommissioned, etc.) and, based on past history, are unlikely to be updated at all, or only very infrequently. A dataset such as Caravan will still need to be updated, but the process is much more straightforward.
It should be clarified from the get-go (line 70) whether 'significant/ly' is used in the statistical sense. In some cases it clearly is, but not so much in others.
I would suggest the use of PET instead of Epot, with the former being a lot more common, in my opinion.
Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., ... & Thépaut, J.-N. (2021). ERA5-Land: A state-of-the-art global reanalysis dataset for land applications. Earth System Science Data, 13(9), 4349-4383.
Citation: https://doi.org/10.5194/egusphere-2024-864-RC2
AC2: 'Reply on RC2', Franziska Clerc-Schwarzenbach, 06 Jun 2024
Model code and software
HESS Opinions: A few camels or a whole caravan? Franziska M. Clerc-Schwarzenbach https://doi.org/10.5281/zenodo.10784701
Franziska Maria Clerc-Schwarzenbach
Giovanni Selleri
Mattia Neri
Elena Toth
Ilja van Meerveld
Jan Seibert