An Adaptive Method to Estimate Evapotranspiration using Satellite and Reanalysis Products
Abstract. Accurate estimation of evapotranspiration (ET) is critical for hydrological, agricultural, and climate-related applications. However, spatially and temporally consistent ET datasets are often limited, particularly in regions like Ireland, where cloud cover is high and ground-based observations are sparse. This study evaluates ten global, operational open-access ET products by comparing them to Penman-Monteith (PM) reference values derived from weather station data across 22 locations in Ireland between 2019 and 2023. Systematic errors were identified in all ET products, varying across sites, seasons, and years. An adaptive bias correction (AB) method was applied, which dynamically adjusts each product based on recent errors. Although the AB method significantly improved individual ET estimates, no single product consistently exhibited superior performance under all conditions. To further enhance ET accuracy, a novel Combination (COM) method was introduced. This method assigns dynamic weights to each bias-corrected ET product based on recent skill scores, enabling the creation of an optimally merged ET estimate. Unlike traditional static statistical methods, which are interpretable but inflexible, and machine learning approaches, which are adaptive but opaque and data-intensive, the COM method offers a transparent, computationally efficient, and interpretable solution. It requires minimal historical data and runs efficiently on non-specialised systems, making it particularly suitable for operational settings. Results show that the merged COM product outperformed all individual ET datasets, achieving lower errors and stronger correlations with PM observations. Given the persistent cloud cover and variable satellite retrieval accuracy in regions like the Ireland, the ability to adapt to recent performance represents a significant advancement. 
Overall, the proposed adaptive merging framework provides a scalable, lightweight solution for improving ET monitoring. This method holds promise for enhancing operational hydrology, agricultural decision-making, and climate impact assessments in Ireland and other regions facing similar challenges.
This manuscript presents an adaptive framework for estimating evapotranspiration (ET) by combining ten satellite-derived and reanalysis ET products over Ireland. The authors first construct a grass-reference ET benchmark using the Penman–Monteith equation and ground-based data, then evaluate individual products against this reference. Building on this, they propose an AB preprocessing scheme that (i) applies seasonal bias correction and (ii) dynamically updates combination weights based on recent skill scores (error, bias, RMSE, correlation).
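As a rough illustration of the two-step idea summarized above, the sketch below removes each product's mean error over a trailing window and then merges the corrected products with weights based on recent skill. The additive form of the correction, the inverse-RMSE weighting, the window length, and the function name `adaptive_merge` are all assumptions made for illustration; they are not taken from the manuscript, whose exact formulas may differ.

```python
import numpy as np

def adaptive_merge(products, reference, window=4):
    """Illustrative only. products: (n_products, n_steps); reference: (n_steps,)."""
    products = np.asarray(products, dtype=float)
    reference = np.asarray(reference, dtype=float)
    n_prod, n_steps = products.shape
    corrected = np.full_like(products, np.nan)
    merged = np.full(n_steps, np.nan)
    for t in range(window, n_steps):
        past = slice(t - window, t)  # only data before step t is used
        # Step 1 (assumed additive): mean error of each product over the window
        bias = (products[:, past] - reference[past]).mean(axis=1)
        corrected[:, t] = products[:, t] - bias
        # Step 2 (assumed inverse-RMSE): skill of the bias-corrected history
        resid = products[:, past] - bias[:, None] - reference[past]
        rmse = np.sqrt((resid ** 2).mean(axis=1))
        weights = 1.0 / (rmse + 1e-6)  # small constant avoids division by zero
        weights /= weights.sum()
        merged[t] = weights @ corrected[:, t]
    return corrected, merged
```

The first `window` steps are left as NaN because no error history exists yet, which is one reason the choice of window length matters for short records.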
Overall, the paper is good and well-written, with a clear structure, strong methodological grounding, and meaningful results. However, several important clarifications and revisions are needed before it can be considered for publication.
Major comments:
1) In Section 2, the manuscript states: “Data were extracted for the period 2019–2023, the earliest interval where all 10 ET products overlapped completely. Although 2018 also offered full coverage, it was excluded to avoid bias from that year’s extreme European heatwave.”
This statement appears to be factually incorrect or at least misleading. Most of the selected satellite and reanalysis ET datasets (GLEAM, ERA5-Land, GLDAS, MOD16, MERRA-2, WaPOR, SSEBop, JRA-3Q, LSA-SAF) have global coverage starting well before 2019, typically in the 2000s or even earlier. Please correct this statement or provide a more precise justification, for example the earliest period with homogeneous versions, gap-filled “GF” products, post-reprocessing consistency, or stable input forcing. I recommend adding the temporal coverage of each product to Table 1, together with the exact version used (e.g., GLEAM v4.1a, WaPOR v3, SSEBop v6, ERA5-Land, MERRA-2, etc.).
2) The manuscript notes that 2018 “offered full coverage” but was excluded “to avoid bias from that year’s extreme European heatwave.” The paper aims to develop an adaptive method that should be robust to varying conditions, including extremes. Excluding a documented extreme year may weaken the claim that the method is suitable for operational and climate-related applications, where extremes are precisely the periods of greatest interest. Since the 2018 data are available, a sensitivity test showing how including 2018 affects the skill scores would add value to the paper; no such test is currently shown. If 2018 must be excluded, please justify this more clearly (e.g., known data quality anomalies or product discontinuities), and not only on the grounds that it was an extreme meteorological year.
3) Given that the core of this study is multi-product fusion, the temporal coverage explanation in Section 2 is too brief and currently causes confusion. A clearer justification of the analysis period—supported by a table summarizing coverage dates, version numbers, processing levels, and potential reprocessing events—would greatly improve the transparency and reproducibility of the study.
Minor comments:
1) The manuscript states: “For methodology development under well-maintained synoptic station conditions, Ks = Kc = 1 … We therefore assume the grass reference ETo represents actual ET from the grass surface at these stations (ET = ETo).”
This assumption is standard in FAO-56 terminology, but it requires proper citation such as Allen, R. G., Pereira, L. S., Raes, D., & Smith, M. (1998). Crop Evapotranspiration—Guidelines for Computing Crop Water Requirements. FAO Irrigation and Drainage Paper 56.
2) The sentence “10-day resolution composites were uniformly distributed into daily averages, then reaggregated into 8-day summed totals” describes a temporal disaggregation–reaggregation procedure that implicitly assumes constant ET across each 10-day interval. While this approach is dimensionally consistent, it represents a substantial physical simplification because ET varies significantly from day to day with changes in radiation, temperature, humidity, and wind. Please justify this methodological choice and clarify how it affects the comparison with the 8-day reference values. If such methods have precedent in the remote-sensing ET literature, please provide an appropriate citation to support this assumption.
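To make the concern concrete, the following minimal sketch reproduces the kind of procedure the quoted sentence describes: each 10-day total is spread evenly over its days and the daily series is then summed into 8-day windows. The fixed 10-day composite length, the alignment of both series to the same start day, the input values, and the function name `dekad_to_8day` are illustrative assumptions, not details taken from the manuscript.

```python
import numpy as np

def dekad_to_8day(dekad_totals, dekad_len=10, window=8):
    """Spread each 10-day total uniformly over its days, then sum into 8-day windows."""
    dekad_totals = np.asarray(dekad_totals, dtype=float)
    # Uniform disaggregation: every day in a composite gets total / dekad_len
    daily = np.repeat(dekad_totals / dekad_len, dekad_len)
    # Reaggregation into complete 8-day windows; a trailing partial window is dropped
    n_full = daily.size // window
    eight_day = daily[: n_full * window].reshape(n_full, window).sum(axis=1)
    return daily, eight_day

# Hypothetical 10-day ET totals in mm
daily, eight_day = dekad_to_8day([20.0, 30.0, 10.0, 40.0])
print(eight_day)
```

The total is conserved whenever the record divides evenly into 8-day windows, as it does here, but every 8-day value that straddles a composite boundary becomes a blend of two flat daily rates, so any real day-to-day variability within a composite is lost.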
3) In the Introduction, the paper compares its results to machine learning approaches; however, since the study does not directly evaluate ML models—and doing so would be outside the scope—I recommend tempering the degree of comparison. Machine learning and bias-correction methods have been widely applied in recent work, such as “Analysis of historical global warming impacts on climatological trends for the partially gauged Hirmand River Basin based on multiple data products and bias correction methods.” I strongly recommend considering and citing this study to provide a more balanced context and to strengthen the discussion of existing methodological alternatives. This will also help position your method as one of the potential complementary approaches within the broader suite of emerging ET estimation techniques.
4) Figures 4 and 5 currently appear in the Methods section, but they clearly present results of parameter testing (the sensitivity of the AB and COM window sizes to performance metrics). These figures are therefore conceptually part of the Results rather than the methodology.
5) Typos and acronyms: In the Abstract, “regions like the Ireland” should be “regions like Ireland” (drop “the”). Please check for spacing and unit inconsistencies such as “8days”, “mm/8days”, and “Km” vs “km”, and make them consistent (e.g., “8 days”, “mm per 8 days”, “km”). The Penman–Monteith reference is referred to variously as “Penman–Monteith”, “PM”, and “PM model”; please standardize the phrasing, e.g., “Penman–Monteith (PM) reference ET” on first use and “PM” thereafter. AB and COM are clearly defined, but in the Methods and Results it would help to remind the reader once (e.g., “COM (combined product)”) when they are first mentioned in Section 4.
6) In the Introduction: “Remote sensing and reanalysis products provide ET estimates with broad spatial coverage, making them particularly useful for regions with sparse ground-based measurements (Li et al., 2009).” The citation supporting this sentence is dated. I strongly recommend considering and citing “Assimilation of Sentinel‐Based Leaf Area Index for Modeling Surface‐Ground Water Interactions in Irrigation Districts” to strengthen it.
7) Some figures (e.g., Figures 6, 7, 11, 13, 15) are well described, but you might briefly restate key acronyms in captions (e.g., “PM = Penman–Monteith reference ET; COM = combined ET product”) so figures can be interpreted independently of the main text.
8) In Section 5.1 and Figure 15, you distinguish “coastal” vs “inland” stations. It would be helpful to briefly state how this classification was made (e.g., distance threshold from coastline, visual assessment), or add a note to Figure 1 or the text.
9) The study focuses on well-maintained synoptic stations over predominantly grassland surfaces in Ireland, using PM-based grass reference ET as the benchmark. While this is appropriate for method development, it would be useful to expand the Discussion to address transferability of the AB/COM framework to (i) other land-cover types (e.g., crops, forests), (ii) more water-limited or arid climates, and (iii) regions with sparser or lower-quality meteorological data.
A short paragraph explicitly outlining key limitations and assumptions (e.g., reliance on a high-quality PM benchmark, grass reference conditions, relatively humid maritime climate) would help readers understand in which contexts the method is expected to perform well and where additional adaptation or testing would be needed.
10) Since all skill scores are computed relative to the PM-based benchmark, it would be helpful to briefly discuss the uncertainty in the reference ET itself (e.g., effects of gap-filled radiation, wind, humidity; representativeness of station-scale PM ET for the product pixel). Even a short qualitative statement or a reference to typical PM uncertainties would help frame the evaluation.
After these revisions are completed, I believe the paper will be of high quality and suitable for publication.