This work is distributed under the Creative Commons Attribution 4.0 License.
On the added value of sequential deep learning for upscaling evapotranspiration
Abstract. Estimating ecosystem-atmosphere fluxes such as evapotranspiration (ET) robustly and at global scale remains a challenge. Machine learning (ML) based methods have shown promising results for such upscaling, providing a complementary methodology that is independent of process-based and semi-empirical approaches. However, a systematic evaluation of the skill and robustness of different ML approaches is an active field of research that requires further investigation. In particular, deep learning approaches in the time domain have not been explored systematically for this task.
In this study, we compared instantaneous (i.e., non-sequential) models, namely extreme gradient boosting (XGBoost) and a fully connected neural network (FCN), with sequential models, namely a long short-term memory (LSTM) model and a temporal convolutional network (TCN), for the modeling and upscaling of ET. We compared different types of covariates (meteorological, remote sensing, and plant functional types) and their impact on model performance at the site level in a cross-validation setup. For the upscaling from site to global coverage, we drove the models with the best-performing combination of covariates, meteorological and remote sensing observations, using globally available gridded data. To evaluate and compare the robustness of the modeling approaches, we generated a cross-validation-based ensemble of upscaled ET, compared the ensemble mean and variance among models, and contrasted them with independent global ET data.
We found that the sequential models outperformed the instantaneous models (FCN and XGBoost) in cross-validation, although their advantage diminished with the inclusion of remote-sensing-based predictors. The generated patterns of global ET variability were overall highly consistent across all ML models. However, the sequential models yielded 6–9 % lower globally integrated ET than their non-sequential counterparts and than estimates from independent land surface models, likely due to their greater vulnerability to shifts in the predictor distributions from site-level training data to global prediction data. In terms of global integrals, the neural network ensembles showed a sizable spread across training data subsets, exceeding the differences among neural network variants. XGBoost showed a smaller ensemble spread than the neural networks, particularly under conditions poorly represented in the training data.
Our findings highlight non-linear model responses to biases in the training data and underscore the need for improved upscaling methodologies, which could be achieved by increasing the amount and quality of training data or by extracting more targeted features that represent spatial variability. Approaches such as knowledge-guided ML, which encourages physically consistent results while harnessing the efficiency of ML, or transfer learning should be investigated. Deep learning for flux upscaling holds great promise, but remedies for its vulnerability to shifts in the training data distribution, especially for sequential models, still need the community's attention.
Status: open (until 24 Dec 2024)
RC1: 'Comment on egusphere-2024-2896', Simon Besnard, 07 Nov 2024
Overall impression
Kraft et al. comprehensively review the current state of evapotranspiration upscaling methods. The paper is well-written and includes excellent figures and discussions, making it a valuable contribution to Biogeosciences' scope. The experimental setup is thoughtfully constructed, though I have some feedback regarding the comparative design of the model configurations. Given the quality of the study, it could proceed with either minor or major revisions depending on whether the authors choose to incorporate additional model experiments as suggested.

Specific comments
L15: Consider adding quantitative information when discussing model performance differences between the sequential and non-sequential models to make the comparative strengths more transparent for readers.

Lag variables in non-sequential models: Have you considered explicitly adding lagged variables to the non-sequential models? This would offer a more balanced comparison, as it would simulate past dynamics without the complexity of models like LSTM. For example, incorporating lagged climate variables into XGBoost could provide insights into whether sequential models are uniquely beneficial in capturing temporal patterns.
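The lagged-covariate suggestion above could be implemented as a simple feature-engineering step. The following is a minimal sketch (function name and lag choices are illustrative, not from the manuscript) of how lagged copies of the climate drivers could be stacked column-wise so that a non-sequential learner such as XGBoost sees a fixed window of past conditions:

```python
import numpy as np

def add_lagged_features(X, lags):
    """Append lagged copies of each covariate column.

    X: (n_timesteps, n_features) array of e.g. meteorological drivers.
    lags: positive integer lags in time steps.
    Rows without a complete history are dropped, so the output has
    n_timesteps - max(lags) rows and n_features * (1 + len(lags)) columns.
    """
    max_lag = max(lags)
    blocks = [X[max_lag:]]  # contemporaneous values
    for lag in lags:
        blocks.append(X[max_lag - lag : X.shape[0] - lag])
    return np.hstack(blocks)

# Toy example: 10 time steps, 2 covariates, lags of 1 and 7 steps
X = np.arange(20, dtype=float).reshape(10, 2)
X_lagged = add_lagged_features(X, lags=[1, 7])
print(X_lagged.shape)  # (3, 6)
```

The resulting matrix could be passed directly to a gradient-boosting regressor; how far back the lags should reach (days vs. weeks) would itself be a tuning choice.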
Model selection – self-attention models: The paper mentions TCN and self-attention as alternatives to LSTM, yet only TCN was tested. Could you elaborate on why the self-attention models were excluded from the experiments?
L127-129: Precipitation appears to be absent from the meteorological variables. Was there a reason for this omission? Precipitation likely impacts ET indirectly through soil moisture, which could be a significant predictor in capturing temporal dynamics.
L129: For clarity, it would help readers if you briefly explained the significance of the time derivative of potential shortwave irradiation and what processes it represents within the context of ET.
L175: The statement "The remote sensing and PFT covariates were repeated in time to obtain uniform inputs" could be clarified. Does this mean daily remote sensing data were kept constant at a sub-daily scale? If so, it’s worth discussing if the model accounts for the sub-daily variations, as metrics like LST and NDWI are not entirely invariant within a day.
L204-205: Were the same hyperparameters used for each fold in cross-validation? This clarification would help assess whether the variation within model ensembles arises from differences in training data subsets or distinct hyperparameter settings.
Fig 4 – PFT impact on TCN and LSTM models: In Figure 4, adding PFT as a predictor seems to penalize TCN performance, yet it enhances LSTM’s accuracy in capturing interannual variability. Could you discuss this divergence?
Fig 4 – Model sensitivity to hyperparameters: Are the displayed results limited to the best models, or could the performance of the other 19 models (with variation bars) be included to show the sensitivity to hyperparameter choices?
Fig 4 – Mean-site results: Presenting model performance metrics related to spatial variability, such as the mean site performance, could be informative.
L254-255: While PFT doesn’t enhance site-level predictions, could it mitigate extrapolation errors during upscaling? Including this consideration in the discussion of the scaling-up section may add valuable insight. Or would the spread for the sequential model change with or without PFT?
L286-288: Can you test the hypothesis about observation biases with synthetic data or a process-based model simulating extreme events and disturbances? This might strengthen the argument about model vulnerability to changes in predictor distributions.
L410-411: Jung et al. (2020) introduced an extrapolation index that might be useful here. Plotting the model spread against this index could demonstrate that model uncertainty correlates with areas requiring more extrapolation, supporting your discussion points.
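Jung et al. (2020) define their extrapolation index on dissimilarity in predictor space; a minimal nearest-neighbour variant (a simplified stand-in, not their exact formulation) that could be plotted against the ensemble spread might look like:

```python
import numpy as np

def extrapolation_index(X_train, X_grid):
    """Minimum Euclidean distance from each prediction point to the
    training set, after z-scoring with training statistics.
    Larger values indicate stronger extrapolation."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0) + 1e-12  # avoid division by zero
    Zt = (X_train - mu) / sd
    Zg = (X_grid - mu) / sd
    # pairwise distances, shape (n_grid, n_train)
    d = np.sqrt(((Zg[:, None, :] - Zt[None, :, :]) ** 2).sum(-1))
    return d.min(axis=1)

# Toy check: a grid point matching a training sample gets index ~0,
# a point far outside the training cloud gets a large index
X_train = np.array([[0.0, 0.0], [2.0, 2.0]])
X_grid = np.array([[0.0, 0.0], [4.0, 4.0]])
ei = extrapolation_index(X_train, X_grid)
```

A scatter of this index per grid cell against the cross-validation ensemble spread would directly test whether model uncertainty grows where the predictors require more extrapolation.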
References:
Jung, Martin, et al.: "Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach", Biogeosciences, 17(5), 1343–1365, 2020.

Citation: https://doi.org/10.5194/egusphere-2024-2896-RC1
Viewed
- HTML: 184
- PDF: 38
- XML: 9
- Total: 231
- BibTeX: 4
- EndNote: 5