https://doi.org/10.5194/egusphere-2026-2036
29 Apr 2026
Status: this preprint is open for discussion.

Evaluating the Statistical Agreement between Gridded Evapotranspiration Data Sets in the Conterminous United States via Triple Collocation

Keith Doore, Thomas M. Over, Timothy O. Hodson, and Sydney S. Foks

Abstract. Evapotranspiration (ET) is a critical component of hydrologic budgets and one of the most difficult to measure, which can create substantial inconsistencies among ET datasets. This is a particular problem for benchmarking the performance of hydrologic model simulations, because no ground-truth gridded data exist for verifying ET datasets. Commonly, some form of collocation analysis is used to estimate the error variance of ET data from multiple independent datasets. However, this technique only assesses the error variances and does not estimate the biases of the datasets relative to the truth, which can be more important for some applications. Although the biases relative to the truth cannot be determined, the relative biases between datasets can be. To assess these, we developed a novel method that combines the temporal median of the relative bias between datasets with the error covariance matrices estimated from the Extended Collocation method to derive dataset agreement probabilities. We then applied this method to six gridded monthly ET datasets that cover the Conterminous United States (CONUS): SSEBop, GLEAM, ERA5-Land, NLDAS-2, TerraClimate, and the water balance ET dataset from Reitz et al. (2023a). From these probabilities, we found that all but one dataset pair had >70 % of grid cells with p-values > 0.16 (and >85 % of grid cells with p-values > 0.05) across CONUS over the full time period, which indicates reasonable agreement. When split by season, winter has 36.8 % of grid cells across all dataset pairs with p-values > 0.16 (56.7 % with p-values > 0.05), while spring, summer, and fall have 41.9 % (63.3 %), 54.2 % (74.4 %), and 63.3 % (82.2 %), respectively. However, in certain regions of concern for water resources management, the agreement probabilities split by season showed that the Central Valley had a majority of dataset pairs with p-values < 0.005 in the summer. This may be directly attributed to the lack of an irrigation component in the GLEAM, ERA5-Land, NLDAS-2, and TerraClimate datasets.
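
To illustrate the kind of collocation analysis the abstract refers to, the following is a minimal sketch of the classical covariance-based triple collocation estimator of error variances, applied to synthetic data. It is not the authors' workflow: the abstract names the Extended Collocation method and an agreement-probability step, neither of which is reproduced here. The function name, the synthetic "truth" series, and the chosen biases and noise levels are all assumptions made for illustration, and the estimator assumes three datasets with mutually independent, zero-mean errors that are linearly related to the same truth.

```python
import numpy as np


def triple_collocation_error_variances(x, y, z):
    """Classical triple collocation (covariance notation).

    Assumes x, y, z are each a linear function of the same unknown truth
    plus zero-mean errors that are independent of the truth and of each
    other. Returns the estimated error variance of each series, expressed
    in that series' own units.
    """
    c = np.cov(np.vstack([x, y, z]))  # 3x3 sample covariance matrix
    var_x = c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]
    var_y = c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2]
    var_z = c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1]
    return np.array([var_x, var_y, var_z])


# Synthetic example (hypothetical values, not from the preprint).
rng = np.random.default_rng(0)
n = 100_000
truth = rng.gamma(shape=4.0, scale=15.0, size=n)  # stand-in for monthly ET

# Three "datasets" with different additive/multiplicative biases and
# independent errors with standard deviations 5, 8, and 3.
x = truth + rng.normal(0.0, 5.0, n)
y = 10.0 + 0.9 * truth + rng.normal(0.0, 8.0, n)
z = -5.0 + 1.1 * truth + rng.normal(0.0, 3.0, n)

est = triple_collocation_error_variances(x, y, z)
print(est)  # should be close to the true error variances 25, 64, 9
```

Note that the estimator recovers the error variances even though the three series carry different biases, which is the limitation the abstract points out: the error variances alone say nothing about how far apart the datasets sit from one another.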

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Model code and software

Workflow Notebooks for Evaluating the Statistical Agreement between Gridded Evapotranspiration Data Sets in the Conterminous United States via Triple Collocation, K. J. Doore et al., https://doi.org/10.5066/P1VN9ENA

Interactive computing environment

Workflow Notebooks for Evaluating the Statistical Agreement between Gridded Evapotranspiration Data Sets in the Conterminous United States via Triple Collocation, K. J. Doore et al., https://doi.org/10.5066/P1VN9ENA

Latest update: 29 Apr 2026
Short summary
Evapotranspiration (ET) is hard to measure, and ET datasets often disagree, with few ground-truth observations available to assess their accuracy. This work presents a method to assess agreement and relative bias among ET datasets. Most of the conterminous US showed reasonable agreement, except in the mountainous West and the Southeast. Agreement varies by season and is weakest in winter. In the Central Valley, summer disagreement may stem from datasets that lack an irrigation component. The method pinpoints such issues for future improvement.