Unrecognised water limitation is a main source of uncertainty for models of terrestrial photosynthesis
Abstract. Quantification of environmental controls on ecosystem photosynthesis is essential to understand the impacts of climate change and extreme events on the carbon cycle and the provisioning of ecosystem services. Machine learning models have become popular for simulating terrestrial ecosystem photosynthesis because of their predictive skill, but they often do not consider temporal dependencies in the data, even though process understanding suggests that these should exist. Here, we investigate how accounting for temporal structure in models affects the prediction of ecosystem photosynthesis. Using time-series measurements of ecosystem fluxes paired with measurements of meteorological variables from a network of globally distributed sites (N = 109) and remotely sensed vegetation indices, we train three different models to predict ecosystem gross primary production (GPP): a mechanistic, theory-based photosynthesis model, a memoryless multilayer perceptron (MLP), and a recurrent neural network (Long Short-Term Memory, LSTM). Through comparisons of patterns in model error, we assess the ability of these models to predict GPP across a wide diversity of ecosystems and climates, and to account for temporal dependencies, with a focus on the effects of low rooting-zone moisture and freezing air temperatures. We find that both deep learning models outperform the mechanistic model, and that the LSTM performs best, with an R² of 0.74 for spatial out-of-sample predictions. In particular, model skill is consistently good across moist sites with strong seasonality. Model error tends to increase with increasing potential cumulative water deficits, particularly in ecosystems with evergreen vegetation. Generalisation patterns reveal that the LSTM tends to be more successful than the MLP in simulating GPP in dry environments, suggesting an advantage of recurrent models in those conditions. However, large variability in model skill across relatively dry sites remains.
Insufficient information on the exposure and response to water stress, and on the related effects on GPP, appears to be a dominant source of error in modelling ecosystem fluxes across the globe. With the increasing frequency of hydroclimatic extreme events, effects of water limitation are expected to become more prevalent, which calls for models that better represent the impact of water limitation on ecosystem function.