the Creative Commons Attribution 4.0 License.
Learning Evaporative Fraction with Memory
Abstract. Evaporative fraction (EF), the ratio of latent heat flux to the sum of sensible and latent heat fluxes, is a key metric of surface energy partitioning and water stress. Recognizing the importance of soil moisture and vegetation memory effects, we developed a machine learning (ML) model using Long Short-Term Memory (LSTM) units, which capture memory effects, to predict EF based on eddy-covariance data from the combined ICOS, AmeriFlux, and FLUXNET2015 Tier 1 dataset across different plant functional types (PFTs). The results show that the model can accurately capture and reconstruct EF dynamics, particularly in the dry season and during drydowns, using routinely available weather observations (e.g., precipitation, net radiation, air temperature, and vapor pressure deficit (VPD)) together with static variables (PFT and soil properties). Specifically, there is a strong correlation (R² of 0.72) between the ensemble mean EF predictions and the observations on the test set, across sites spanning a large climate and ecosystem gradient. We further employ explainable ML techniques to elucidate the drivers of EF while accounting for the memory effect. Precipitation and VPD are the two main drivers at woody savanna (WSA), savanna (SAV), open shrubland (OSH), and grassland (GRA) sites, while air temperature is the dominant controlling factor at most forest sites, comprising deciduous broadleaf forest (DBF), evergreen needleleaf forest (ENF), and mixed forest (MF). Additionally, our findings reveal varying memory effects across PFTs, as indicated by the contributions of antecedent time steps via integrated gradients. Specifically, GRA and WSA sites exhibit lower memory-effect contributions than forested sites. A detailed analysis of memory effects indicates a strong relationship with rooting depth, soil water holding capacity, and plant water use strategies, which collectively regulate the time scales of droughts.
Notably, the learned memory effects across diverse PFTs could potentially serve as proxies for inferring vegetation rooting depth and assessing plant water stress conditions. Our findings underscore the crucial influence of meteorological memory effects on EF predictions, which is particularly important for estimating future water stress as the frequency and intensity of droughts are expected to rise.
Competing interests: Pierre Gentine is on the editorial board of HESS.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: closed
- RC1: 'Comment on egusphere-2025-365', Anonymous Referee #1, 12 Mar 2025
  - AC2: 'Reply on RC1', Wenli Zhao, 02 May 2025
    The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-365/egusphere-2025-365-AC2-supplement.pdf
- RC2: 'Comment on egusphere-2025-365', Anonymous Referee #2, 25 Mar 2025
Review of Learning Evaporative Fraction with Memory in HESS
https://doi.org/10.5194/egusphere-2025-365
Preprint. Discussion started: 10 February 2025

Summary: This manuscript uses machine learning models to estimate daily evaporative fraction (EF) at flux tower sites using daily precipitation, net radiation, air temperature, vapor pressure deficit, and wind speed from the tower, and daily satellite-derived LAI, along with long-term soil texture characteristics, plant functional type, annual mean tower precipitation, annual mean tower air temperature, and annual mean tower net radiation. The manuscript uses Neural Networks with a Long Short-Term Memory (LSTM) layer to encode input memory up to 365 days. Their model architecture is one 128 node LSTM layer x 1 fully-connected dense 36 node layer x 1 output dense layer w/ 1 neuron. The manuscript compares this versus a NN with 128 x 128 x 36 x 1 fully-connected nodes as a baseline. The results indicate that the LSTM models have typical R² values of around 0.6-0.7 for daily EF using 0-365 days of memory respectively. By training multiple models at each memory length, they find that the ensemble average estimate each day from 20 models is somewhat better than individual models (R² of 0.64-0.73 for 0-365 days respectively). By averaging 20 instances of the 365-day LSTM they achieve RMSE values of 0.13 across their test data set (1 tower held out per PFT). The fully connected (non-LSTM) NN has lower R² values for individual models and for an ensemble mean at all input time series lengths. The ensemble mean model of the 20 365-day LSTM models is shown to follow temporal patterns of daily EF dynamics across many precipitation events and the periods between them (Fig. 3) for highlighted test sites and periods.
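To make the ensemble averaging and the two skill scores in this summary concrete, here is a minimal sketch in plain Python. The function names and the toy numbers are illustrative assumptions, not taken from the manuscript, and R² is assumed here to mean the squared Pearson correlation (the abstract describes it as a correlation); the authors may have used a different definition.

```python
import math

def ensemble_mean(member_predictions):
    """Average daily predictions across ensemble members.

    member_predictions: list of equal-length lists, one list per model.
    """
    n_members = len(member_predictions)
    n_days = len(member_predictions[0])
    return [sum(m[d] for m in member_predictions) / n_members
            for d in range(n_days)]

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def r_squared(pred, obs):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov * cov / (var_p * var_o)

# Contrived toy data: two members whose errors have opposite signs,
# so their mean matches the observations almost exactly.
obs = [0.2, 0.4, 0.6, 0.8]
member_1 = [0.1, 0.5, 0.5, 0.9]
member_2 = [0.3, 0.3, 0.7, 0.7]
ens = ensemble_mean([member_1, member_2])

print("member 1 RMSE:", rmse(member_1, obs))
print("ensemble RMSE:", rmse(ens, obs))
print("ensemble R2:  ", r_squared(ens, obs))
```

The toy case is deliberately extreme: averaging only helps to the extent that member errors are not shared, which is why the gain reported for the real 20-member ensemble is modest.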
The manuscript further attributes predictor importance using an Integrated Gradient method, and suggests that daily precipitation, daily air temperature, and daily VPD are usually dominant predictors (when using all 365 days of input data per daily prediction), with a tendency toward temperature in more humid biomes and towards precipitation and VPD in more arid biomes. The manuscript also breaks down the importance of more proximate versus further antecedent terms to discuss which variables’ “memories” are more significant, and compares this with site rooting depth, sand fraction, and aridity index. The manuscript concludes that their LSTM model delivers robust EF-prediction performance, adequately captures EF dynamics, that Ta and P are “the most influential driver of EF prediction”, and the memory effects are a function of site characteristics (sand fraction, rooting depth, AI).

I agree with the motivating notion of the study that predicting EF (and evaporation) is difficult and poorly observationally-constrained beyond eddy covariance tower sites. I also had some concerns about the approach and conclusions drawn from the study, as well as some comments about the manuscript itself, which I detail below.

Major Comments:
- The place of Machine Learning in this study. I am not averse to the use of ML to answer science questions, particularly for data-driven prediction. In this context though, I didn’t fully understand why ML was necessary to answer any specific research questions. One motivation given in the introduction was that EF is tied to soil moisture, but that soil moisture estimates are “difficult or even impossible to acquire.” This is certainly true from some perspectives, but this method uses net radiation from eddy covariance towers as an input variable, which is certainly more scarcely observed than soil moisture. Additionally, while not ideal, simple indices such as SPI are fairly decent proxies of soil moisture, accounting for “memory” processes, even with some physical justification, and can be estimated globally. Beyond this, even completely mechanistic models of evaporation can be represented if given precipitation, VPD, Ta, wind speed, and net radiation (particularly with daily LAI, soil texture, and PFT provided as well). I recognize that observations and models do not show the same realities all the time, particularly at different spatial and time-scales, but this still strikes me as an enormous amount of input data, which really should be enough to use physical models.
- If it were the case that physical models represented EF poorly and ML methods were much better, then this might be an ideal alternative. But in this case, I was surprised by the RMSE of the models, given the wealth of input data. These models are using 20 [models] x ( 365 [days] x 6 [variables] + 5 [site variables] ) to predict each daily EF value, along with a huge number of parameters, but the typical daily error is greater than 10%!! This makes me concerned that there are some failings somewhere in the model construction / fitting / tuning, etc. 
 If one used an information criterion (AIC, BIC, DIC), I would assume that it would suggest a model of EF = 0.5 as much better than the LSTM, and possibly EF = mean_annual_precip / 2000mm as better still. I don’t suggest that you do this, but the manuscript needs to demonstrate that the methods are justifiable somehow.
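For scale, the input count quoted in this comment works out as follows. This is plain arithmetic on the reviewer's own figures (20 models, 365 days, 6 daily variables, 5 site variables), not a number reported in the manuscript:

```latex
20 \times (365 \times 6 + 5) = 20 \times 2195 = 43\,900
```

That is, roughly 44 thousand input values feed each daily ensemble-mean EF estimate, or 2195 per individual model.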
- I think the NN baseline is a fine idea to see how memory effects determine daily EF values, but only if the NN is itself convincingly good as a predictor. As a first step, I strongly suggest some simpler baseline tests:
- I am not certain that an RMSE of 0.13 beats the daily climatology of EF for the given sites. Recognizing that we would not know the climatology of unseen sites, I would still hope that the model can do better than the seasonal cycle of EF. I would think that there could be much larger errors than 0.13 immediately following rain in arid sites, but that most days might be easily within that range.
- A site-blind soil moisture proxy. I would also not be surprised if a model such as EF = beta0 * tanh( beta1* SPI + beta2 ), or something similarly parameterized, might outperform an RMSE of 0.13 as well, at least in temperate conditions. This could be fit site-by-site, biome-by-biome, or globally. This would then be a function of a single variable (P), with some memory from the SPI calculation (or any other simple weighted average).
- Something slightly more physical, such as a simple soil moisture box model with PET-driven evaporation, particularly if using observed net radiation. This would again give the LSTM the chance to demonstrate why lagged VPD (for example) data from a year ago is a reasonable predictor.
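Two of the simpler baselines suggested above can be sketched in a few lines of plain Python. This is a hedged illustration, not the reviewer's or the authors' implementation: the antecedent precipitation index stands in for SPI as the "simple weighted average" the reviewer allows, the bucket model treats the ratio of actual to potential evaporation as a rough stand-in for EF, and every function name and parameter value (`decay`, `capacity`, `initial_storage`, the `beta` coefficients) is hypothetical.

```python
import math

def antecedent_precip_index(precip, decay=0.9):
    """Exponentially weighted running sum of daily precipitation (mm).

    A simple memory-carrying wetness proxy; `decay` is an assumed value.
    """
    api, out = 0.0, []
    for p in precip:
        api = decay * api + p
        out.append(api)
    return out

def ef_tanh(wetness, beta0, beta1, beta2):
    """Functional form from the comment: EF = beta0 * tanh(beta1 * x + beta2)."""
    return beta0 * math.tanh(beta1 * wetness + beta2)

def bucket_ef(precip, pet, capacity=100.0, initial_storage=50.0):
    """Single-bucket soil moisture model with PET-driven evaporation.

    precip, pet: daily series in mm. Returns the daily ratio ET / PET
    (equal to relative storage here), used as a crude EF proxy.
    """
    storage, ef_proxy = initial_storage, []
    for p, e_pot in zip(precip, pet):
        storage = min(storage + p, capacity)   # water above capacity is lost
        stress = storage / capacity            # 0 (dry) to 1 (saturated)
        et = min(stress * e_pot, storage)      # supply-limited evaporation
        storage -= et
        ef_proxy.append(stress)
    return ef_proxy

# Illustrative drydown: one rain event followed by dry days.
rain = [20.0, 0.0, 0.0, 0.0, 0.0]
pet = [4.0] * 5
print(bucket_ef(rain, pet))
print([ef_tanh(x, 0.8, 0.05, 0.0) for x in antecedent_precip_index(rain)])
```

In practice the coefficients would be fitted against observed EF (site-by-site, biome-by-biome, or globally, as the comment suggests), and the resulting RMSE compared with the 0.13 reported for the LSTM ensemble.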
 
- The introduction could do a better job of motivating the idea of memory in these physical variables and why LSTM is an appropriate tool to represent it. I find it intuitive that slowly varying processes like soil moisture, which integrates precipitation and evaporation, need memory at daily scale, where they are very autocorrelated. So to some degree this is a model of soil moisture. But since the variable the manuscript is trying to predict is EF, I think a few mechanisms should be explained more deeply. For example, talk through the idea of how the last year’s Ta reflects the water-demand of the atmosphere and therefore water losses from the soil (or conversely, that it represents high sensible heat fluxes and so suggests that soils have been dry for the year), and then tie that to the idea that soil moisture is the dominant predictor.
- The language throughout could use quite a bit of editing and proof-reading.
- The fact that the models substantially perform better when used as an ensemble mean would not be concerning for differently-parameterized or differently-structured physical models. For an ML model with this level of complexity, I have a hard time grasping why the individual models would perform so much worse. Each model has so many non-linear interacting terms and so many parameters, surely it could also account for a simple mean of the same model with tweaks to the parameters? I wonder if the training and tuning of the individual models is not as rigorous as it could be. Otherwise, you could just train another model on the input/output of the 20 models, sampled evenly, and get a single, better model. The NN training-validation-retraining process is not my area of expertise, but there is certainly literature on this.
- I unfortunately find it difficult to evaluate the last three figures at present, until the earlier methods are either adjusted or more directly shown to be robust and high-skill.
Minor comments:
- I think the idea of “memory” was not very clearly laid out. It would help to be a little more precise. Is memory the same as predictors of integrated variables? Is it the same as the idea of spectral decomposition when one is using daily data to estimate processes that aren’t easy to model at daily scale? Is EF itself a process with memory, and if so, how so?
- Line 74: “…the lens of the memory effect, a perspective that becomes increasingly relevant as climate change escalates the frequency and severity of drought conditions.” I’m not sure I’m following what this means.
Citation: https://doi.org/10.5194/egusphere-2025-365-RC2
  - AC1: 'Reply on RC2', Wenli Zhao, 02 May 2025
    The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-365/egusphere-2025-365-AC1-supplement.pdf
Manuscript number: HESS-2025-365
Title: Learning Evaporative Fraction with Memory
Zhao et al. developed a machine learning model to predict the daily evaporative fraction (EF) across 67 eddy-covariance flux sites with different plant functional types (PFTs) worldwide. A Long Short-Term Memory (LSTM) unit is used to examine the effect of previous climatic conditions (e.g., precipitation, air temperature) – analogous to a memory effect – on EF predictions. The manuscript highlights that memory effects – ranging from 7 to 365 days – largely contribute to EF dynamic predictions, particularly across forest sites, likely due to deep roots and water use strategies compared to grassland and savanna sites. More specifically, the results show that air temperature and precipitation, rather than net radiation, emerged as the most influential drivers of EF predictions. The study provides new insights into how PFTs, soil properties, and the climate characteristics of the previous year (i) impacted EF dynamics in the following year (i + 1). This has the potential to improve the understanding of water and carbon fluxes under less frequent but more intense precipitation events due to climate change. Below are my comments: