the Creative Commons Attribution 4.0 License.
Comprehensive Global Assessment of 23 Gridded Precipitation Datasets Across 16,295 Catchments Using Hydrological Modeling
Abstract. Numerous gridded precipitation (P) datasets have been developed to address a variety of needs and challenges. However, selecting the most suitable and reliable dataset remains a challenge for users. We conducted the most comprehensive global evaluation to date of gridded (sub-)daily P datasets using hydrological modeling. A total of 23 datasets, derived from satellite, model, gauge sources, or combinations thereof, were assessed. To evaluate their performance, we calibrated the conceptual hydrological model HBV against observed daily streamflow for 16,295 catchments (each < 10,000 km²) worldwide, using each P dataset as input. The Kling-Gupta Efficiency (KGE) was used as the performance metric and the calibration score served as a proxy for P dataset performance. Overall, MSWEP V2.8 demonstrated the highest performance (median KGE of 0.75), highlighting the value of merging P estimates from diverse data sources and applying daily gauge corrections. Among the purely satellite-based P datasets, the soil moisture- and microwave-based GPM+SM2RAIN dataset performed best (median KGE of 0.60), while the JRA-3Q reanalysis ranked highest among the purely model-based datasets (median KGE of 0.67), outperforming the widely used ERA5 reanalysis (median KGE of 0.59). Performance varied across Köppen-Geiger climate zones, with the best results in polar (E) regions (median KGE of 0.74 across datasets) and the lowest in arid (B) regions (median KGE of 0.33 across datasets). We further examined the spatial relationships between catchment attributes and KGE scores, identifying potential evaporation, air temperature, solid P fraction, and latitude as the strongest predictors of performance.
Our analysis revealed significant regional differences in dataset performance and heterogeneity in P error characteristics, underscoring the critical importance of careful dataset selection for water resource management, hazard assessment, agricultural planning, and environmental monitoring.
Status: open (until 16 Apr 2025)
-
RC1: 'Comment on egusphere-2024-4194', Anonymous Referee #1, 20 Feb 2025
This manuscript presents an unprecedented evaluation of 23 (sub)daily (quasi)global precipitation (P) datasets across 16,295 catchments worldwide using hydrological modeling. The 23 P datasets belong to six major families of data sources: satellite only; reanalysis only; rain gauge only; satellite and rain gauge; satellite and reanalysis; and satellite, reanalysis and rain gauge. The conceptual hydrological model HBV was used to simulate the conversion of precipitation into streamflow at the daily temporal scale. Each P dataset, along with air temperature (from MSWX) and potential evapotranspiration (computed using the Hargreaves formula), is used to drive the hydrological simulations. The modified Kling-Gupta efficiency (KGE’) is used to evaluate the performance of the simulated streamflows against daily observations, and serves as a proxy for the performance of the P datasets.
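For readers unfamiliar with the metric, the following is a minimal sketch of the modified Kling-Gupta efficiency in the form commonly attributed to Kling et al. (2012), KGE' = 1 - sqrt((r - 1)^2 + (beta - 1)^2 + (gamma - 1)^2), where r is the linear correlation, beta the ratio of means and gamma the ratio of coefficients of variation. Whether the manuscript uses exactly this variant is an assumption; the function name and the toy series are illustrative only.

```python
import math
from statistics import mean, pstdev

def modified_kge(sim, obs):
    """Modified KGE (KGE') of simulated vs. observed streamflow.

    Assumes the Kling et al. (2012) form: correlation r, bias ratio beta
    (ratio of means) and variability ratio gamma (ratio of CVs).
    """
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    # Pearson correlation from the population covariance
    cov = mean([(s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)])
    r = cov / (sd_s * sd_o)
    beta = mu_s / mu_o
    gamma = (sd_s / mu_s) / (sd_o / mu_o)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)

# Toy daily streamflow values (mm/day), illustrative only
q_obs = [1.0, 3.0, 2.0, 5.0, 4.0]
q_sim = [1.2, 2.5, 2.2, 4.6, 4.1]
print(modified_kge(q_sim, q_obs))
```

A perfect simulation gives a score of 1, while a simulation that doubles every observed value keeps r and gamma at 1 but is penalised through beta alone.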
This manuscript addresses an important topic for the hydrometeorological community. The manuscript is well written, concise and clear, with updated references. Unfortunately, the manuscript lacks a clear scientific question or hypothesis to be tested, and the Methodology section does not provide enough scientific detail to fully understand what was done and how, which prevents adequate reproducibility of the results. In addition, some conclusions are speculative and are not supported by the results included in the manuscript. Finally, some references are not used in the text and others contain minor errors. To summarise, the manuscript in its current form does not represent a substantial contribution to the global hydrometeorological community, but all the problems mentioned could be addressed by the authors during the review process. The following lines describe the major and minor problems detected in the manuscript.
Major comments:
-
MC1. The motivation for the article is not well developed. The manuscript does a really good job of pointing out the limitations of previous evaluations of P datasets. However, what is the ultimate purpose of this comprehensive evaluation of P datasets on a global scale? Is it just to provide some numbers on a global scale, or is it to test a hypothesis or answer a scientific question, or to provide recommendations for the selection of P products for specific applications or specific geographic regions? If so, the hypothesis, the scientific question or the ultimate purpose of the manuscript should be explicitly stated.
-
MC2. Usage of the outdated CMORPH-RAW (Joyce et al., 2004) and the unknown CMORPH-RT (Xie et al., 2017) instead of the new bias-corrected CMORPH-CDR v.1 (Xie et al., 2017, 2018). The manuscript states that the old CMORPH-RAW and CMORPH-RT are available from 2019 onwards (which seriously limits the hydrological modeling runs), while the newest version of CMORPH, termed CMORPH-CDR, is available from 1998 onwards (not from 2019 onwards). Moreover, it is not clear which product the CMORPH-RT used in this study refers to, given that Xie et al. (2017) describe CMORPH-CDR version 1, which is available since 1998 and not from 2019. Therefore, I request the authors to remove the usage of the outdated CMORPH-RAW (version 0) and the unknown CMORPH-RT and use the relatively new bias-corrected CMORPH-CDR version 1, which is available since 1998, and is described by Xie et al. (2017) and Xie et al. (2018).
- Xie, P., Joyce, R., Wu, S., Yoo, S.-H., Yarosh, Y., Sun, F., and Lin, R.: Reprocessed, Bias-Corrected CMORPH Global High-Resolution Precipitation Estimates from 1998, Journal of Hydrometeorology, 18, 1617–1641, doi:10.1175/JHM-D-16-0168.1, 2017.
- Xie, P., Joyce, R., Wu, S., Yoo, S., Yarosh, Y., Sun, F., Lin, R., and NOAA CDR Program: NOAA Climate Data Record (CDR) of CPC Morphing Technique (CMORPH) High Resolution Global Precipitation Estimates, Version 1, doi:10.25921/W9VA-Q159, URL https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=gov.noaa.ncdc:C00948, 2018.
-
MC3. Use of the under-revision PERSIANN-CCS-CDR (Sadeghi et al., 2021). This paper uses PERSIANN-CCS-CDR (Sadeghi et al., 2021) as one of the 23 P datasets to be evaluated. However, the website https://chrsdata.eng.uci.edu/ clearly states that “PERSIANN-CCS-CDR is currently under revision and unavailable for download”. Therefore, I request the authors to remove the use of PERSIANN-CCS-CDR from this study or clarify the data version used in this study and indicate whether the chosen version is problematic or not.
-
MC4. Catchment selection. To ensure the suitability of the catchments used in the analyses, five selection criteria were applied in the manuscript to the 34,768 streamflow stations that passed the duplication check. However, the following two decisions are entirely subjective and require more detailed explanation (in the manuscript) by the authors: i) discarding streamflow stations where both the station location and the corresponding catchment centroid were within 5 km of those of another station (how does the spatial resolution of the individual P products influence this criterion?); ii) the number of events had to be greater than 10 non-consecutive (how does the duration of each selected event affect this criterion?; are 11 non-consecutive days with Q >= 5 mm d-1 sufficient to ensure a robust calibration of a hydrological model?)
-
MC5. Use of an unknown version of the HBV hydrological model. The manuscript does not contain a description about the version of the HBV hydrological model used in all analyses. L107 indicates that the HBV-light software described by Seibert and Vis (2012) was the software version used in this study. However, it seems unlikely that a Windows-based version of HBV was selected to simulate 16,295 catchments worldwide. I request the authors to provide details of the version of HBV used in this study. In the event that the authors use their own version of HBV, I request them to provide a link to the source code of the model in the “Code Availability” section requested by HESS (https://www.hydrology-and-earth-system-sciences.net/submission.html#templates).
-
MC6. Use of catchment-mean P time series to drive the hydrological model (L89-90, L114-115). The use of catchment mean P time series to drive the hydrological model HBV could lead to important problems in the representation of observed streamflows in catchments with mixed hydrological regimes (i.e. snow-dominated or snow-influenced hydrological regimes), which should be reflected in low KGE values. Therefore, I request the authors to provide -in the supplementary material- five to seven example catchments where the HBV is able to reproduce their mixed hydrological regime by using catchment-mean P time series to drive the hydrological model (I request not only the presentation of the KGE values and the daily time series of the observed Q compared to the simulated Q, but also a comparison of the mean monthly streamflows). If the model is not able to acceptably reproduce the daily and mean monthly observed streamflows of catchments with mixed hydrological regimes, I suggest the authors to implement different elevation bands in these catchments. A publicly available open-source version of an HBV-like hydrological model can be found at: https://cran.r-project.org/package=TUWmodel, which allows the use of up to 10 elevation bands in each catchment.
-
MC7. Using the Hargreaves (1994) equation to calculate potential evapotranspiration (PET) to drive the hydrological model. I request the authors to justify this choice after knowing that Oudin 2005a proposed a different temperature-based PET model after evaluating 27 potential evapotranspiration models in terms of streamflow simulation efficiency in a large sample of 308 catchments in France, Australia and the United States.
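To make the comparison in MC7 concrete, the sketch below places the temperature-based formula of Oudin et al. (2005) next to the widely used Hargreaves-Samani form. Both take extraterrestrial radiation as an input. The Hargreaves variant shown may differ from the exact equation used in the manuscript, and all function names and input values are hypothetical.

```python
LAMBDA = 2.45  # latent heat of vaporisation, MJ kg-1 (assumed constant)

def pet_oudin(re, t_mean):
    """Oudin-type PET in mm/day.

    re: extraterrestrial radiation in MJ m-2 day-1; t_mean: air temperature in deg C.
    PET is zero whenever t_mean + 5 <= 0.
    """
    return max(0.0, re / LAMBDA * (t_mean + 5.0) / 100.0)

def pet_hargreaves(re, t_mean, t_min, t_max):
    """Hargreaves-Samani PET in mm/day (assumes t_max >= t_min)."""
    pet = 0.0023 * (re / LAMBDA) * (t_mean + 17.8) * (t_max - t_min) ** 0.5
    return max(0.0, pet)  # clip negative values in very cold conditions

# Illustrative mid-latitude summer day (values are not from the manuscript)
re, t_mean, t_min, t_max = 30.0, 20.0, 14.0, 26.0
print(pet_oudin(re, t_mean), pet_hargreaves(re, t_mean, t_min, t_max))
```

The Oudin-type formula needs only mean temperature, whereas the Hargreaves form also depends on the diurnal temperature range, so the two can diverge where that range is unusually small or large.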
-
MC8. Range used for the calibration of the PCORR parameter of HBV. Table 2 shows that the PCORR parameter is used as a multiplier to mitigate the systematic underestimation of P characteristics of some P products, and therefore a range of [1, 2] is used in the optimisation of this parameter. This decision could lead to low KGE values in arid or hyper-arid catchments (see Table 3), where some P datasets overestimate the true (and unknown) P amount. Therefore, I request the authors to extend this range to [0.5, 2] so that the calibration procedure can compensate not only for an underestimation of P but also for an overestimation of it.
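The concern in MC8 can be illustrated with a simple mass-balance sketch: if a multiplier is only allowed to vary within [1, 2], it cannot remove a positive bias in P. The helper below is hypothetical and considers mean P only. An actual calibration optimises KGE, and other HBV parameters may partly compensate.

```python
def best_pcorr(p_true_mean, p_product_mean, bounds):
    """Multiplier that matches the product's mean P to the reference mean,
    clipped to the allowed calibration bounds."""
    lower, upper = bounds
    ideal = p_true_mean / p_product_mean
    return min(max(ideal, lower), upper)

# Hypothetical arid catchment in which the product overestimates mean P by 60 %
p_true, p_product = 250.0, 400.0  # mm/yr, illustrative values only

for bounds in [(1.0, 2.0), (0.5, 2.0)]:
    pcorr = best_pcorr(p_true, p_product, bounds)
    residual_bias = pcorr * p_product / p_true - 1.0
    print(bounds, pcorr, residual_bias)
```

With the narrower range the multiplier stays at its lower bound and the full overestimation remains, while the wider range lets it drop below 1 and remove the bias in the mean.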
-
MC9. Use of an unknown version of the (µ+λ) evolutionary algorithm used to calibrate the HBV hydrological model. The manuscript does not contain a description of the version of the (µ+λ) evolutionary algorithm used to calibrate the HBV hydrological model. From L124, the reader can infer that the DEAP Python software was used to calibrate the HBV model. However, I request the authors to clarify the name and version of the software used to implement the (µ+λ) evolutionary algorithm and to describe how this algorithm was coupled to the (unknown) version of the HBV hydrological model (see MC5). Finally, I request the authors to describe whether they can ensure that the (µ+λ) evolutionary algorithm has converged to a stable KGE value after 1200 model runs (L125) or not.
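As an illustration of the kind of convergence evidence requested in MC9, the following is a generic (µ+λ) evolution strategy written in plain Python, together with a simple plateau check on the best score per generation. It is not the DEAP implementation and not the authors' configuration. The toy objective, population sizes, mutation scale and tolerance are all assumptions.

```python
import random

def mu_plus_lambda(objective, bounds, mu=10, lam=20, generations=50, sigma=0.1, seed=0):
    """Maximise `objective` with a basic (mu + lambda) evolution strategy.

    Returns the best parameter set and the best score after each generation.
    """
    rng = random.Random(seed)
    # Initial parents sampled uniformly within the parameter bounds
    parents = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(mu)]
    scored = [(objective(p), p) for p in parents]
    history = []
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(scored)[1]
            # Gaussian mutation scaled by the parameter range, clipped to bounds
            child = [min(max(x + rng.gauss(0.0, sigma * (hi - lo)), lo), hi)
                     for x, (lo, hi) in zip(parent, bounds)]
            offspring.append((objective(child), child))
        # (mu + lambda) selection: parents and offspring compete together
        scored = sorted(scored + offspring, key=lambda item: item[0], reverse=True)[:mu]
        history.append(scored[0][0])
    return scored[0][1], history

def has_converged(history, window=10, tol=1e-3):
    """True if the best score improved by less than `tol` over the last `window` generations."""
    if len(history) <= window:
        return False
    return history[-1] - history[-1 - window] < tol

def toy_score(params):
    # Stand-in for a calibration score with a maximum of 1 (like KGE)
    return 1.0 - sum((x - 0.3) ** 2 for x in params)

best, history = mu_plus_lambda(toy_score, [(0.0, 1.0)] * 3)
print(round(history[-1], 4), has_converged(history))
```

Because parents survive into the selection pool, the best score can never decrease from one generation to the next, so a flat tail in `history` is a reasonable, if informal, indication that additional model runs would change little.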
-
MC10. Selection of temporal period used for the calibration of the individual catchments. It is not clear from the manuscript whether the period used to calibrate the HBV hydrological model with each P dataset was the same or whether it depended on the data availability of the respective P product. I request the authors to clarify this situation in the manuscript. In the case that the temporal period used for the calibration of each catchment depends on the data availability of each P product, and therefore, it was not the same for all the P products used as forcing in each catchment, I request the authors to use the same temporal period for the calibration of all P products in each catchment, to ensure a fair comparison of the performance of different P datasets in a given catchment. Of course, the temporal periods may be different from one catchment to another, but for the same catchment the same temporal period should be used to calibrate the HBV model with all P datasets.
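The common calibration period suggested in MC10 is the intersection of the coverage periods of all P datasets and of the streamflow record for a given catchment. A minimal sketch follows. The dataset names and dates are placeholders, not the actual coverage of any product.

```python
from datetime import date

def common_period(coverage):
    """Intersection of all coverage periods, or None if they do not overlap.

    coverage: dict mapping a record name to a (start, end) tuple of dates.
    """
    start = max(s for s, _ in coverage.values())
    end = min(e for _, e in coverage.values())
    return (start, end) if start <= end else None

# Placeholder coverage for one catchment (dates are illustrative only)
coverage = {
    "dataset_A": (date(1979, 1, 1), date(2023, 12, 31)),
    "dataset_B": (date(2000, 6, 1), date(2022, 12, 31)),
    "observed_streamflow": (date(1995, 1, 1), date(2015, 12, 31)),
}
print(common_period(coverage))
```

A product with a very short record shrinks the common period for every dataset, which is one practical argument for handling such products separately.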
-
MC11. Based on the boxplots summarising the performance of each of the 23 P datasets used in this study, it is quite surprising that the CPC Unified dataset, which is based solely on rain gauge information and has the coarsest spatial resolution of all P datasets (0.5°), ranked second among all datasets. I request the authors to add a paragraph suggesting possible reasons for this unexpected behaviour.
-
MC12. To provide an initial assessment of the ability of all 23 P datasets used in this study to reproduce the mean annual precipitation at a given location, I request the authors to create a new figure with the mean annual precipitation for 2007-2015 (the longest period for which all datasets have data, after removing the two CMORPH products described in MC2), computed as the average of the mean annual values obtained for each of the 23 P datasets for that period (Pavg). In addition, I request the authors to prepare 23 new figures showing the difference between the mean annual precipitation of each P dataset for 2007-2015 (Pi) and Pavg, i.e., Pi - Pavg. All the figures requested in this comment should be included in the supplementary material only, and they will allow to identify major problems in the representation of mean annual values of a given P dataset in some specific regions of the world.
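The quantities requested in MC12 reduce to an ensemble mean and one anomaly per dataset. A minimal sketch for a single grid cell or catchment follows, with placeholder dataset names and values.

```python
def ensemble_anomalies(mean_annual_p):
    """Return (Pavg, {name: Pi - Pavg}) for one grid cell or catchment.

    mean_annual_p: dict mapping dataset name to mean annual P (mm/yr)
    computed over the same period for every dataset.
    """
    p_avg = sum(mean_annual_p.values()) / len(mean_annual_p)
    anomalies = {name: p_i - p_avg for name, p_i in mean_annual_p.items()}
    return p_avg, anomalies

# Placeholder values (mm/yr), not taken from any real product
p_avg, anomalies = ensemble_anomalies(
    {"dataset_A": 900.0, "dataset_B": 1100.0, "dataset_C": 1000.0}
)
print(p_avg, anomalies)
```

By construction the anomalies sum to zero, so the maps would show departures from the ensemble mean rather than errors against a true reference.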
-
MC13. To facilitate the “generalizability of their findings” (L50, L57) for readers from different countries, I request the authors to add a new figure to the main body of the manuscript: a map showing, in different colours, the KGE values obtained in each catchment. This figure will allow us to identify the spatial distribution of the high and low performance of each P dataset in the simulation of daily streamflows. This new figure will make it possible to support several statements in the “Results and Discussion” section that are currently not supported by any figure in the manuscript.
-
MC14. To facilitate even more the “generalizability of their findings” (L50, L57) to readers from the same country but from catchments with different hydrological regimes, I strongly suggest (and do not request) the authors to make an extra effort and classify the hydrological regimes of each of the 16,295 catchments (e.g., pluvial, glacial, snow-dominated, snow-influenced, tropical). This would allow readers to use the results of the article to select one or more P datasets to use for analysing specific case studies in their own countries. If this suggestion could not be addressed by the authors, I request them to insert three new columns in Table 3: low solid P fraction, medium solid P fraction and high solid P fraction, where the thresholds to distinguish between low, medium and high values of solid P fraction should be proposed by the authors based on their knowledge and the values of solid P fraction of all 16,295 catchments.
-
MC15a. Poor performance of HBV in arid climates. Although the manuscript does not explicitly mention this, it can be inferred that the authors assume that the performance of HBV is likely to be poor in arid climates (L226), because “P in arid regions tends to be brief and intense, making it challenging to detect and model accurately (Beck et al., 2017b; Sun et al., 2018; El Kenawy et al., 2019; Beck et al., 2019a)” (L227-228). However, Seibert and Bergström (2022) mention in their review that HBV is routinely used to model the impacts of climate change on water resources around the world, including regions as arid as the Nile (Booij et al., 2011); therefore, aridity per se should not be a reason to explain poor performance of the HBV model.
- Booij, M. J., Tollenaar, D., van Beek, E., and Kwadijk, J. C. J.: Simulating impacts of climate change on river discharges in the Nile basin, Phys. Chem. Earth, 36, 696–709, https://doi.org/10.1016/j.pce.2011.07.042, 2011.
-
MC15b. Definition of the aridity index. In the main text of the manuscript, arid regions are associated with values of the aridity index greater than 1 (L250-251, L266). However, this association is inconsistent with the definition of the aridity index in Table B1 of Appendix B, where the aridity index is defined as the ratio between mean annual P and potential evapotranspiration, and therefore values greater than 1 would indicate wet rather than dry catchments. Please clarify this discrepancy.
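The discrepancy noted in MC15b comes down to which ratio is used. Under the P / PET definition cited from Table B1, dry catchments have values below 1, and it is the inverse ratio (PET / P) that exceeds 1 in dry catchments. A minimal sketch with hypothetical values and labels:

```python
def aridity_index(mean_annual_p, mean_annual_pet):
    """Aridity index under the P / PET convention."""
    return mean_annual_p / mean_annual_pet

def classify(ai):
    # Under P / PET, values below 1 indicate water-limited (drier) conditions
    return "water-limited (drier)" if ai < 1.0 else "energy-limited (wetter)"

# Hypothetical dry catchment: P = 200 mm/yr, PET = 1600 mm/yr
ai = aridity_index(200.0, 1600.0)
print(ai, classify(ai), 1.0 / ai)
```

Stating the chosen ratio alongside each threshold in the text would remove the ambiguity.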
-
MC16. Efficiency of the filter used to select the study's catchments. In Section 3.2 (Regional performance differences) the authors mention aridity, groundwater use and/or anthropogenic water use as possible explanations for the low performance obtained for several P products in Australia, India, South Korea and Africa. Does this mean that the five criteria used in Section 2.2 to “ensure the suitability of the catchments for the present analysis” (L87) did not work as expected? I request the authors to add a discussion of why the five criteria previously mentioned were not sufficient to filter out catchments that were not suitable for the present analysis. I also request the authors to consider whether it is necessary to add one or more criteria that would allow the presence of irrigation, hydrograph regulation and/or major consumptive water use to be detected, in order to screen out catchments that will not provide reliable results from the analysis. I suggest the authors analyse the criteria used by the Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN; Kumar et al., 2024) to ensure that the streamflows observed in each selected catchment are free from anthropogenic influences.
- Kumar, A., Hannaford, J., Turner, S., Barker, L. J., Dixon, H., Griffin, A., Suman, G., and Armitage, R.: Global trend and drought analysis of near-natural river flows: The ROBIN Initiative, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17249, https://doi.org/10.5194/egusphere-egu24-17249, 2024.
-
MC17. Make the observed streamflow dataset publicly accessible. HESS requests the authors to follow its data policy (https://www.hydrology-and-earth-system-sciences.net/submission.html), which includes a statement on how the underlying research data can be accessed. If the data are not publicly accessible, a detailed explanation of why this is the case is required (e.g. applicable laws, university and research institution policies, funder terms, privacy, intellectual property and licensing agreements, and the ethical context of the research). In addition, the HESS data policy calls for unrestricted access to all data and materials underlying reported findings for which ethical or legal constraints do not apply. It is true that a URL or reference to the data source of the streamflow data used in this study is provided in Table A1. However, a researcher wishing to reproduce the results of this study will never be certain that the data downloaded from each URL corresponds exactly to the original 43,627 stations used in this study. Furthermore, in the hypothetical situation of having downloaded exactly the same 43,627 stations that were originally used in this study, it would not be possible to ensure that applying the five criteria, presented in Section 2.2 for filtering out stations, would result in exactly the 16,295 stations finally analysed in this study. Therefore, in practice, it would not be possible for a researcher to reproduce the results of this study. The entire scientific community will thank the authors of this study for providing public access to the daily streamflow data, the catchment boundaries and the location of the outlet of each catchment in order to improve this dataset for future analyses on a global scale.
Minor comments:
-
In all the manuscript, I ask the authors to use the word “reanalysis” instead of “model” when referring to atmospheric models of the global climate (e.g., ERA5, JRA-3Q), to avoid confusion with the HBV hydrological model used in this study.
-
Provide the full name of all the abbreviations used in the manuscript the first instance they appear, as specified in the “English guidelines and house standards” of HESS (https://www.hydrology-and-earth-system-sciences.net/submission.html). This is particularly important for all the precipitation products, which can not be assumed to be known by the wider scientific community. In addition, please provide a reference for each P dataset the first time they appear in the text.
-
Because CAMELS is a catchment dataset specifically developed for U.S., I request the authors to use CAMELS only for the US datasets, while when referring to CAMELS-like datasets developed for other countries, the individual names of the datasets should be used (e.g., CAMELS-GB, CAMELS-CL) or a generic name different from “CAMELS”.
-
To avoid possible ambiguities, use always in the text “streamflow” instead of “flow”. Also, when using “runoff” instead of “streamflow”, specify how runoff was obtained.
-
L20-21. Provide a reference for the crucial role that the spatio-temporal distribution of P plays in water resources assessment.
-
Table 1. Correct the reference provided for IMERG-Final V7, because Huffman et al. (2019) makes reference to version 6 and not to version 7.
-
Table 1. In the column “Temp. Cov.”, please explain the meaning of “NRT” in the caption of the table, and remove that term (assumed to mean “near real-time”) for all the products whose time latency is larger than 1 day.
-
Table 1. Provide a “Time Latency” value for all the products lacking such information.
-
Table 1, Table 3, Figure 2. Please check whether IMERG-Early V7 was used in this study or not, because L72 mentions only IMERG-Early V6 and not IMERG-Early V7.
-
L69. It mentions that “The datasets fall into two main categories”. However, in L149 it is mentioned that “Among the six main categories of P datasets”, which is consistent with the six categories used in Table 1 (column ‘Data Source’) and Table 3 (column ‘Dataset Type’). I ask the authors to keep six categories in all the manuscript, using ‘Dataset Type’ as a consistent denomination name and using “S, R (reanalysis), G, S+G, S+R, S+R+G” as possible values for this denomination name (instead of “S; R (reanalysis); G; S,G; S,R; S,R,G” as used in Table 1).
CAMELS-like instead of CAMELS.
-
L82. Change “and websites” to “or websites”, because Table A1 provides either a reference or a URL but not both.
-
L103-104. Provide the catchments areas corresponding to the 2.5 and 97.5 percentiles as well.
-
Figure 1. Explain in the caption what is specifically shown in panels a) and b) of this figure.
-
Table 2. Please add a new column “units” to specify the measurement units of each HBV parameter.
-
L122-124. Provide more details about the statement: “Model initialization was done by running the model with 10 years of prior P data, if available. If 10 years of prior P data were not available, the model was run multiple times using the available P data until a total of more than 10 years was accumulated”. In particular, clarify how running the HBV model multiple times compensates for the lack of P data.
-
L130. Remove GDAS from the examples of P datasets with short record, because its data start in 2001, in contrast to the two CMORPH versions, whose data start in 2019.
-
L144. Explain what do you mean by “γ reflects the shape of P probability distribution”.
-
L158. In the sentence “Specifically, gauge data enhance performance in …” do you mean something like “Specifically, bias correction using gauge data enhance performance in ….”?
-
L165-166. Please provide a reference that support the statement about the climatological rain gauge adjustment in IMERG-Late V7. This is requested because to the best of my knowledge the document “IMERG_V07_ReleaseNotes_final_230713.pdf”, only mentions “Applied climatological adjustment to the Final Run for Early and Late Runs”.
-
L174-175. Provide a discussion about the poor performance of PDIR-now in UK, Denmark and Italy.
-
L179. Provide a reference for GDAS.
-
L202. Could you be more specific with the sentence “the importance of improving coverage in data sparse regions due to data sharing limitations” ?
-
L203. Where can we see the “comparison of PCORR parameter values obtained after calibration using different P datasets” ?
-
L209. How is it possible to obtain negative values of the PCORR parameter if the range specified for this parameter in Table 2 was [1, 2]?
-
Figure 2. Add to the caption of this figure the meaning of the horizontal black line shown in each boxplot.
-
L237-L241. To avoid confusion, please use the same attribute names used in Figure 4 and Appendix B (e.g., use “Mean PET” instead of “low Mean PET”).
-
L240. Develop more the idea “…, as frontal P is prevalent under these conditions”.
-
L243. Please introduce the concept “Rain Gauge Density map” before using it here.
-
L272. Correct “JRA-3”
-
L275. Explain the meaning of TOVS-to-ATOVS.
-
L277-280. Where can we see the low performance obtained by PDIR-Now in Italy and Denmark, as well as the low performance obtained by JRA-3Q in Thailand?
-
L288. To improve the clarity of the text, please change “bias-adjustment techniques” by “bias-adjustment techniques of P datasets”.
-
L310-312. Can you provide any number to support the statement “our approach may slightly overestimate the relative performance of gauge-based and model-based datasets compared to satellite-only datasets"?
-
L313. Remove GDAS from the examples of P datasets with short record, because its data start in 2001, in contrast to the two CMORPH versions, whose data start in 2019.
-
L317-321. I suggest to move these lines into a new section termed “Future work”.
-
L331. Given that GPM+SM2RAIN performed best among all the satellite-only P datasets, and considering that the developers of that product are among the authors of this work, can you provide some description of the reasons that prevent updating this product at least once a year?
-
L334. Stating that MSWEP is a “gauge-based” dataset gives the wrong idea that this product is only based on rain gauge information. I suggest to be more specific here and specify that this product uses information from rain gauges, among other sources.
-
L339-340. The statement “while arid regions exhibited overall poor performance, with model-based datasets slightly outperforming others” is not correct, because Table 3 shows that IMERG-Final V7, GPCP v3.2 and CPC Unified outperformed reanalysis datasets in arid regions. Please correct.
-
In the sections “Results and Discussion” and “Conclusions” please provide some analysis of the performance of the P datasets in mountainous regions, which is of utmost interest for the wider hydrological community.
-
In the Section “Conclusions” please mention something about the catchment attributes that would allow to predict -to some extent- a good performance of the P datasets, which is of utmost interest for the wider hydrological community.
-
L359. NOAA is written twice. Correct.
-
L373. Change the capital “O” used in “Observed”.
-
L377. Mention in the text where the radiation and humidity data are used in this work.
-
Table A1. Please separate the “Data source” column into two different columns: “Institution name” and “Country”, to have better information about the data source used for the observed streamflow data.
-
Table B1. Indicate the measurement unit used for the attribute “Rain gauge density”.
-
Table B1. Incorrect citation to Legates and Bogart (2009). Please correct.
-
Table B1. Considering the existence of the attribute “Permafrost fraction”, why the attribute “Glacier fraction” was not included in the analysis?
-
L388-394. Please provide the correct acknowledgment to each one of the P datasets used in this study, as requested by each data source provider.
-
L399. There is an incorrect character in the reference. Correct it.
-
L503-508. This reference is repeated twice. Correct it.
-
L612-615. This reference is repeated twice. Correct it.
-
L631. Correct the error in the URL.
Citation: https://doi.org/10.5194/egusphere-2024-4194-RC1