https://doi.org/10.5194/egusphere-2024-2896
10 Oct 2024
Status: this preprint is open for discussion.

On the added value of sequential deep learning for upscaling evapotranspiration

Basil Kraft, Jacob A. Nelson, Sophia Walther, Fabian Gans, Ulrich Weber, Gregory Duveiller, Markus Reichstein, Weijie Zhang, Marc Rußwurm, Devis Tuia, Marco Körner, Zayd Mahmoud Hamdi, and Martin Jung

Abstract. Estimating ecosystem-atmosphere fluxes such as evapotranspiration (ET) robustly and at the global scale remains a challenge. Machine learning (ML)-based methods have shown promising results for such upscaling, providing a complementary methodology that is independent of process-based and semi-empirical approaches. However, a systematic evaluation of the skill and robustness of different ML approaches is an active field of research that requires further investigation. In particular, deep learning approaches in the time domain have not been explored systematically for this task.

In this study, we compared instantaneous (i.e., non-sequential) models, namely extreme gradient boosting (XGBoost) and a fully connected neural network (FCN), with sequential models, namely a long short-term memory (LSTM) network and a temporal convolutional network (TCN), for the modeling and upscaling of ET. We compared different types of covariates (meteorological, remote sensing, and plant functional types) and their impact on model performance at the site level in a cross-validation setup. For the upscaling from site level to global coverage, we drove the models with globally available gridded data corresponding to the best-performing combination of covariates, namely meteorological and remote sensing observations. To evaluate and compare the robustness of the modeling approaches, we generated a cross-validation-based ensemble of upscaled ET, compared the ensemble mean and variance among models, and contrasted them with independent global ET data.
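As an illustration of the site-level evaluation described above, the sketch below sets up a leave-sites-out cross-validation on synthetic data. It is not the authors' code: the gradient-boosted trees and the multilayer perceptron stand in for XGBoost and the FCN, and the site IDs, covariates, and pseudo-ET target are placeholders (the sequential LSTM/TCN models would additionally consume windows of past time steps).

```python
# Minimal sketch (assumed setup, not the study's pipeline): leave-sites-out
# cross-validation for ET prediction, comparing two instantaneous models.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GroupKFold
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n_sites, steps_per_site, n_features = 20, 365, 8

# Synthetic daily covariates (stand-ins for meteorology + remote sensing) per site.
X = rng.normal(size=(n_sites * steps_per_site, n_features))
y = 2.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=len(X))  # pseudo-ET
groups = np.repeat(np.arange(n_sites), steps_per_site)  # one group per site

models = {
    "tree_ensemble (XGBoost stand-in)": GradientBoostingRegressor(),
    "fcn_like_mlp": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500),
}

# Each fold holds out entire sites, mimicking the site-level generalization
# test used for model comparison.
cv = GroupKFold(n_splits=5)
for name, model in models.items():
    scores = []
    for train_idx, test_idx in cv.split(X, y, groups):
        model.fit(X[train_idx], y[train_idx])
        scores.append(r2_score(y[test_idx], model.predict(X[test_idx])))
    print(f"{name}: mean R2 = {np.mean(scores):.3f}")
```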

We found that the sequential models (LSTM and TCN) performed better than the instantaneous models (FCN and XGBoost) in cross-validation, although this advantage diminished when remote-sensing-based predictors were included. Overall, the generated patterns of global ET variability were highly consistent across all ML models. However, the sequential models yielded 6–9 % lower globally integrated ET than their non-sequential counterparts and than estimates from independent land surface models, likely due to their greater vulnerability to shifts in the predictor distributions between site-level training data and global prediction data. In terms of global integrals, the neural network ensembles showed a sizable spread arising from the different training data subsets, which exceeded the differences among the neural network variants. XGBoost showed a smaller ensemble spread than the neural networks, in particular where conditions were poorly represented in the training data.
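For concreteness, the following sketch shows how an ensemble mean, a per-cell ensemble spread, and area-weighted global means per ensemble member can be computed from a cross-validation ensemble of upscaled ET. The gridded fields are synthetic, a regular lat-lon grid is assumed, and this is not the study's pipeline.

```python
# Minimal sketch (assumed regular lat-lon grid, synthetic fields): ensemble
# statistics of upscaled ET across cross-validation ensemble members.
import numpy as np

n_members, n_lat, n_lon = 10, 90, 180
lat = np.linspace(-89, 89, n_lat)

# Placeholder ensemble of annual ET fields in mm yr-1 (members x lat x lon).
rng = np.random.default_rng(1)
et_ensemble = rng.normal(loc=500.0, scale=50.0, size=(n_members, n_lat, n_lon))

ensemble_mean = et_ensemble.mean(axis=0)    # per-cell ensemble mean
ensemble_spread = et_ensemble.std(axis=0)   # per-cell ensemble spread

# Area weights proportional to cos(latitude), normalized over the grid.
weights = np.cos(np.deg2rad(lat))[:, None] * np.ones((n_lat, n_lon))
weights /= weights.sum()

# Area-weighted global mean ET per ensemble member; the variation of these
# values across members corresponds to the ensemble spread of global
# integrals discussed above.
global_means = (et_ensemble * weights).sum(axis=(1, 2))
print("Global mean ET per member (mm yr-1):", np.round(global_means, 1))
print("Across-member spread (mm yr-1):", round(global_means.std(), 2))
```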

Our findings highlight non-linear model responses to biases in the training data and underscore the need for improved upscaling methodologies, which could be achieved by increasing the amount and quality of training data or by extracting more targeted features that represent spatial variability. Approaches such as knowledge-guided ML, which encourages physically consistent results while harnessing the efficiency of ML, or transfer learning should be investigated. Deep learning for flux upscaling holds great promise, but remedies for its vulnerability to shifts in the training data distribution, especially for sequential models, still require attention from the community.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Status: open (until 24 Dec 2024)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on egusphere-2024-2896', Simon Besnard, 07 Nov 2024

Viewed

Total article views: 278 (including HTML, PDF, and XML)
  • HTML: 212
  • PDF: 42
  • XML: 24
  • Total: 278
  • BibTeX: 5
  • EndNote: 7
Cumulative views and downloads (calculated since 10 Oct 2024)

Viewed (geographical distribution)

Total article views: 268 (including HTML, PDF, and XML); thereof, 268 with geography defined and 0 with unknown origin.
Latest update: 13 Dec 2024
Short summary
Global evapotranspiration (ET) can be estimated using machine learning (ML) models optimized on local data and applied to global data. This study explores whether sequential neural networks, which consider past data, perform better than models that do not. The findings show that sequential models struggle with global upscaling, likely due to their sensitivity to data shifts from local to global scales. To improve ML-based upscaling, additional data or integration of physical knowledge is needed.