Learning Evaporative Fraction with Memory
Abstract. Evaporative fraction (EF), defined as the ratio of latent heat flux to the sum of sensible and latent heat flux, is a key metric of surface energy partitioning and an indicator of plant water stress. Recognizing the role of vegetation memory effects, we developed an explainable machine learning (ML) model based on a Long Short-Term Memory (LSTM) architecture, which explicitly incorporates memory effects, to investigate the mechanisms underlying EF dynamics. The model was trained using data from 90 eddy-covariance sites across diverse plant functional types (PFTs), compiled from the ICOS, AmeriFlux, and FLUXNET2015 Tier 1 datasets. It accurately captures EF dynamics – particularly during post-rainfall pulses and soil moisture dry-down events – using only routinely available meteorological inputs (e.g., precipitation, radiation, air temperature, vapor pressure deficit) and static site attributes (e.g., PFT, soil properties). The ensemble mean predictions showed strong agreement with observations (R² = 0.82) across sites spanning broad climate and ecosystem gradients. Using explainable ML techniques, we identified precipitation and vapor pressure deficit as the primary drivers of EF in woody savanna, savanna, open shrubland, and grassland ecosystems, while air temperature emerged as the dominant factor in deciduous broadleaf, evergreen needleleaf, and mixed forests. Furthermore, Expected Gradients revealed variation in memory contributions across PFTs, with evergreen broadleaf forests and savannas exhibiting stronger influences from antecedent conditions compared to grasslands. These memory effects are strongly associated with rooting depth, soil water-holding capacity, and plant water use strategies, which collectively determine the time scales of drought response. Notably, the learned memory patterns could serve as proxies for inferring rooting depth and assessing plant water stress. Our findings underscore the critical role of meteorological memory effects in EF prediction and highlight their relevance for anticipating vegetation water stress under increasing drought frequency and intensity.
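For reference, the EF definition quoted in the abstract, written out explicitly (the symbols LE for latent heat flux and H for sensible heat flux are my notation, not necessarily the manuscript's), is:

\[ \mathrm{EF} = \frac{LE}{H + LE} \]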
This manuscript applies a relatively new machine learning model that effectively captures the temporal variability of evaporative fraction (EF). The model shows strong agreement with observations, demonstrating its capability to represent the dynamics of EF across different PFTs and climate zones. The authors also quantitatively assess the influence of surface hydrometeorological drivers on vegetation memory, providing valuable insights into soil-plant-atmosphere interactions.
Overall, I find this study to be of interest, with potential for publication. However, several aspects of the methodological description and the presentation of the results require further clarification so that readers can fully understand and evaluate the work.
Major comments:
Specific comments:
9: What are vegetation memory effects? Please explain.
10: I am not an expert in ML methods; what is the difference between explainable ML and a regular ML method? Please explain.
11: Should be "vegetation memory effects"
14: What is the advantage of this approach compared to Surface Flux Equilibrium (SFE) theory, which also requires only routine weather station data to estimate EF?
17-19: Which of these corresponds to the "water-limited" regime and which to the "energy-limited" regime?
24: The text previously says "vegetation memory effects"; please be consistent.
31: I do not think you need to emphasize the root zone here, since the SM-EF coupling at the surface soil layer should be stronger.
36: I feel the cause-and-effect framing of this paragraph should be rephrased as how vegetation memory influences EF, rather than the other way around. The goal of this study is to predict EF, and vegetation memory is one of its key drivers. Alternatively, this content could be placed after the description of EF prediction.
42: Please explain the terminology the first time it appears.
74: Again, what is the difference between "explainable ML" and a regular ML method?
115: Did you also mask records with an energy balance residual larger than a threshold (e.g., (Rn - Gs) - (LH + SH) > 30 W/m2)? Please state this explicitly in the main text.
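To make the suggestion concrete, below is a minimal sketch of the kind of screening I have in mind (only an assumption about how such a filter could be implemented; the column names follow FLUXNET-style conventions but are hypothetical here, and the 30 W/m2 threshold is purely illustrative):

```python
import numpy as np
import pandas as pd

def mask_energy_imbalance(df: pd.DataFrame, threshold: float = 30.0) -> pd.DataFrame:
    """Drop records whose surface energy balance residual exceeds the threshold.

    Hypothetical column names, all in W/m2:
    NETRAD (net radiation Rn), G (soil heat flux), LE (latent heat), H (sensible heat).
    """
    residual = (df["NETRAD"] - df["G"]) - (df["LE"] + df["H"])
    return df.loc[np.abs(residual) <= threshold]
```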
121: What is "corrected" LH? Please explain.
143: Section 3.2: I suggest the authors explicitly describe how each baseline model differs from the LSTM. To a non-expert in ML, the setup of the FNN looks similar to that of the LSTM, so why does the FNN perform worse? Additionally, why did the authors include the SPI-based model, and why does it perform so poorly (R² < 0.1)? Please add the relevant discussion.
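To illustrate the kind of contrast that would help non-expert readers, here is a minimal sketch (not the authors' actual configuration; layer sizes are arbitrary): an FNN maps each time step's inputs to EF independently, whereas an LSTM carries hidden and cell states across time steps and can therefore encode antecedent (memory) conditions.

```python
import torch
import torch.nn as nn

n_features, hidden = 8, 64  # arbitrary sizes, for illustration only

# FNN baseline: each time step is mapped to EF independently (no memory).
fnn = nn.Sequential(
    nn.Linear(n_features, hidden),
    nn.ReLU(),
    nn.Linear(hidden, 1),
)

# LSTM: hidden/cell states propagate information from earlier time steps.
lstm = nn.LSTM(input_size=n_features, hidden_size=hidden, batch_first=True)
head = nn.Linear(hidden, 1)

x = torch.randn(4, 365, n_features)  # (batch, time, features)
ef_fnn = fnn(x)                      # (4, 365, 1); no temporal context
out, _ = lstm(x)                     # hidden states carry antecedent information
ef_lstm = head(out)                  # (4, 365, 1); uses memory of past inputs
```

Spelling out this structural difference (and the rationale for including the SPI-based baseline) in Section 3.2 would make the performance gap easier to interpret.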
145: Figure 3 uses the label "FNN"; please keep the terminology consistent.
176: I suggest the authors also add a brief description of Expected Gradients (EG) in the main text, since it is an important component of this study.
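Even a one-line schematic definition would suffice. As I understand the method, Expected Gradients generalizes Integrated Gradients by averaging attributions over baselines x' drawn from the training data (my notation, not the manuscript's):

\[ \mathrm{EG}_i(x) = \mathbb{E}_{x' \sim D,\; \alpha \sim U(0,1)}\!\left[ (x_i - x'_i)\, \frac{\partial f\big(x' + \alpha\,(x - x')\big)}{\partial x_i} \right] \]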
212: -1.21: is this a typo? Please clarify how R² can be negative with a magnitude larger than 1.
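For context: if R² here is the coefficient of determination,

\[ R^2 = 1 - \frac{\sum_t (y_t - \hat{y}_t)^2}{\sum_t (y_t - \bar{y})^2}, \]

then values below -1 are possible whenever the model performs worse than simply predicting the observed mean, since the ratio is unbounded. Stating which definition is used (coefficient of determination versus squared correlation) would resolve the confusion.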
216: “Figure 4” should be “Figure 3”
293: The content from here to line 299 could be moved to the Methods section.
302: Please indicate that this refers to shortwave radiation (RAD).
307-309: Isn't air temperature correlated with radiation?
351: Figure 8: Can you rearrange this figure so that the x-axis ranges from shallow to deep rooting depth? Also, how are the contributions normalized? Why is the combined contribution from all variables lower than that from precipitation alone? Does this indicate that there could be negative feedbacks between variables?
356: Should be “temporal EF machine learning model”.