the Creative Commons Attribution 4.0 License.
Earth system models might overestimate the local plant productivity response to temperature–moisture extremes
Abstract. Compound temperature-moisture extremes, such as droughts or hot-wet extremes, have a pronounced and sometimes long-lasting impact on vegetation productivity. Accurate simulation of the involved processes by emission-driven Earth system models (ESMs) is crucial for inferring future terrestrial carbon uptake. However, ESMs often exhibit biases in the frequency and intensity of climate and weather extremes. Their ability to reproduce observed impacts of extreme atmospheric conditions on gross primary productivity (GPP) is therefore unclear. Comprehensive assessments of the statistical link between compound events and vegetation productivity beyond individual regions or event types are rare. Here, we scrutinize the relationship between temperature-moisture extremes and exceptionally low or high vegetation productivity in two state-of-the-art ESMs, CESM2 and MPI-ESM1.2, and gauge their performance relative to observation-constrained data. We find that temperature-moisture extremes modulate vegetation productivity in observations and models. The global-scale strength and timing of the statistical relationship agree well between observation-based data and model output. However, this agreement deteriorates towards smaller spatial scales, especially in the low latitudes. Here, an overestimated coupling strength by both models, likely related to biased rates of soil moisture change, suggests potentially unrealistic evaporative feedbacks, exaggerated drainage, or inadequate effective water-holding capacity in ESMs. Nevertheless, all data sources identify coherent significant relationships for all combinations of temperature-moisture and GPP extremes. This result highlights both beneficial and detrimental influences of temperature-moisture compound events on vegetation productivity and the importance of comprehensive assessments beyond single event types for capturing the net effect of climatic extremes on the biosphere. 
Further research should examine whether overestimated plant productivity responses to extreme conditions are a recurring phenomenon across all Earth system models. It could also investigate non-stationarity and nonlinearity of the relationships between climatic and vegetation extremes under climate change.
Competing interests: Kira Rehfeld is a member of the editorial board of Earth System Dynamics. Besides, the authors have no other competing interests to declare.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2026-626', Anonymous Referee #1, 10 Jun 2026
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2026/egusphere-2026-626/egusphere-2026-626-RC1-supplement.pdf
Citation: https://doi.org/10.5194/egusphere-2026-626-RC1
RC2: 'Comment on egusphere-2026-626', Anonymous Referee #2, 19 Jun 2026
This study investigates the impacts of compound temperature–moisture extremes on vegetation productivity using Earth system models and gridded GPP products. The topic is relevant, and the attempt to compare model-based and observation-based estimates of the coupling between climate extremes and GPP extremes is valuable. The manuscript suggests that compound extremes can have both beneficial and detrimental effects on GPP, but that Earth system models may overestimate the strength of the coupling compared with FLUXCOM-based estimates.
However, I have several concerns regarding the robustness and interpretation of the results. In particular, I think the manuscript would benefit from a more careful discussion of the limitations of FLUXCOM as a validation dataset for GPP extremes, a clearer justification and interpretation of the co-occurrence rate metric, and substantial improvements to the clarity of the Results and Discussion sections. I outline these points below.
Major comments
1. Reliability of FLUXCOM for validating GPP extremes
A central part of the manuscript relies on FLUXCOM GPP as an observational benchmark for evaluating the co-occurrence between compound climate extremes and GPP extremes. However, FLUXCOM is a machine-learning-based upscaling product trained on eddy-covariance tower observations and remote-sensing/climate predictors. It is therefore not a direct observational estimate of GPP. In particular, it largely underestimates interannual variability and may not fully capture extreme GPP anomalies (Jung et al., 2020; Nelson et al., 2024). This could strongly affect the inferred frequency, magnitude, and timing of GPP extremes. Therefore, the manuscript should more explicitly discuss whether FLUXCOM is suitable for validating GPP extremes.
I also recommend avoiding the term “observations” when referring to FLUXCOM, unless it is carefully qualified. Terms such as “FLUXCOM GPP estimates” would be more accurate.
The authors may also consider using FLUXCOM-X at daily scale, which would better match the daily ERA5 climate data used in the study. FLUXCOM-X also includes more flux sites and has been reported to show higher consistency with independent atmospheric carbon-cycle constraints than previous FLUXCOM versions. They should discuss how the choice of FLUXCOM product may influence the conclusions.
2. Interpretation and reliability of the co-occurrence rate metric
The co-occurrence rate metric is central to the manuscript, but its interpretation as a response-time indicator is not fully convincing. In particular, the peak timing of the co-occurrence rate may not necessarily represent the time required for GPP to respond to a compound climate extreme.
For example, consider an idealized case in which GPP extremes occur at months t = 2–8 and 10, while compound climate extremes occur at t = 1 and 3–9. In this case, the co-occurrence rate could peak at a lag of one month. However, this does not necessarily mean that the dominant GPP response time is one month. For example, GPP extremes from t = 3–8 can be induced by concurrent climate extremes, while only two cases, t = 2 and t = 10, are consistent with a one-month delayed response. The metric could therefore identify a lagged peak even though the dominant relationship is concurrent.
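The idealized case above can be checked with a short sketch. It assumes the co-occurrence rate at lag k is the fraction of GPP extremes at month t that coincide with a compound climate extreme at month t − k; this definition, the function name `cooccurrence_rate`, and the variable names are illustrative assumptions and may differ from the metric used in the manuscript.

```python
# Idealized monthly event series taken from the example in the text.
gpp_extremes = {2, 3, 4, 5, 6, 7, 8, 10}
climate_extremes = {1, 3, 4, 5, 6, 7, 8, 9}


def cooccurrence_rate(gpp, climate, lag):
    """Fraction of GPP extremes at month t with a climate extreme at month t - lag.

    Assumed definition for illustration only; the manuscript's metric may differ.
    """
    hits = sum(1 for t in gpp if (t - lag) in climate)
    return hits / len(gpp)


# Co-occurrence rate for lags of 0 to 3 months.
rates = {lag: cooccurrence_rate(gpp_extremes, climate_extremes, lag) for lag in range(4)}

# Lag with the highest rate (the "peak time" under this assumed definition).
peak_lag = max(rates, key=rates.get)

# GPP extremes that have a climate extreme one month earlier but none concurrently.
only_lagged = sorted(
    t for t in gpp_extremes
    if (t - 1) in climate_extremes and t not in climate_extremes
)

print(rates)
print(peak_lag, only_lagged)
```

Under this assumed definition, the lag-1 rate (7 of 8 GPP extremes) exceeds the concurrent rate (6 of 8), even though only the GPP extremes at t = 2 and t = 10 lack a concurrent climate extreme.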
This issue becomes problematic because the same GPP extreme event can be counted multiple times across different time lags. For example, if a GPP extreme occurs at time t and compound climate extremes occur both at t and t − 1, the same GPP event contributes to both the concurrent and lagged co-occurrence rates. This makes it difficult to interpret the lagged co-occurrence rate as evidence that climate extremes at t − 1 increased the likelihood of a GPP extreme at t.
Therefore, I recommend that the authors reconsider the interpretation of the “peak time” metric. At minimum, the manuscript should clearly state that the peak timing of the co-occurrence rate does not necessarily represent a physiological response time. The authors could also test an alternative event-matching framework that avoids double-counting GPP extremes across different lags, for example by assigning each GPP extreme to the nearest or strongest associated compound climate extreme within a defined time window.
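One possible form of such an event-matching framework is sketched below. It assigns each GPP extreme to the nearest preceding (or concurrent) compound climate extreme within a fixed window, so every GPP extreme is counted at most once. The function name `match_nearest`, the three-month window, and the idealized series (GPP extremes at t = 2–8 and 10, climate extremes at t = 1 and 3–9) are assumptions chosen for illustration, not the authors' method.

```python
from collections import Counter

# Idealized monthly event series (hypothetical data for illustration).
gpp_extremes = {2, 3, 4, 5, 6, 7, 8, 10}
climate_extremes = {1, 3, 4, 5, 6, 7, 8, 9}


def match_nearest(gpp, climate, max_lag):
    """Assign each GPP extreme to the smallest lag with a matching climate extreme.

    Each GPP extreme contributes to exactly one lag (or to none if no climate
    extreme falls within max_lag months), which avoids double counting.
    """
    counts = Counter()
    for t in sorted(gpp):
        for lag in range(max_lag + 1):
            if (t - lag) in climate:
                counts[lag] += 1
                break  # stop at the nearest match so the event is counted once
    return counts


lag_counts = match_nearest(gpp_extremes, climate_extremes, max_lag=3)
dominant_lag = max(lag_counts, key=lag_counts.get)

print(dict(lag_counts))
print(dominant_lag)
```

With nearest-event matching, six of the eight GPP extremes are attributed to lag 0 and two to lag 1, so the dominant relationship in this idealized series is concurrent. Matching to the strongest rather than the nearest climate extreme would require event magnitudes, which are not part of this sketch.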
This concern is especially relevant for Figure 2d, f, and h, where some peak timings exceed 8 months. Such long delays are difficult to interpret as direct vegetation response times. They may partly reflect seasonal memory or legacy effects, but these effects are normally not more substantial than the concurrent effects unless tree mortality occurs (see Fig. S3 in Yu et al., 2025). The manuscript should discuss this more carefully.
3. Soil moisture depths are different for ERA5 and different ESMs
The manuscript states that soil moisture is chosen because it directly reflects water available to plants. Please clarify which soil moisture depth is used. ERA5 soil moisture and ESM soil moisture represent different soil depths, and the soil depth is different among different ESMs; this difference may affect the comparison of GPP sensitivity to soil moisture. The manuscript should explain how this issue is handled or discuss it as a limitation.
4. Improve the clarity and support of the Discussion
The discussion section needs substantial clarification. Several statements introduce assumptions about mechanisms, model biases, soil moisture memory, or vegetation–climate coupling without enough explanation or supporting references. The discussion would be stronger if the authors more clearly separated:
- what is directly shown by the results,
- what is inferred mechanistically and supported by reference, and
- what remains speculative.
For example, the manuscript states that climate extremes pre-condition vegetation towards productivity extremes more strongly in ESMs than in the observational data, but the supporting evidence in the Results section should be shown and cited more explicitly.
The discussion should also include more references to support mechanistic interpretations, e.g. regarding soil-moisture memory, effective integration time, and precipitation versus soil moisture as indicators of water availability. In several places, the current discussion raises interesting points but does not sufficiently explain how they connect to the results.
Other comments
Ln 28: “diminishing” => “diminished”.
Ln 29: “high GPP” may be better phrased as “positive GPP anomaly” or “positive GPP extreme”.
Ln 39: The phrase “increased likelihoods of compound events” is unclear. Does this refer to an increasing trend in all compound-event types? Please clarify.
Ln 47: The statement that “ESMs may continue to underestimate the likelihoods of temperature–moisture events and productivity extremes individually” needs more explanation and supporting references. It is not clear whether this refers to biases in temperature extremes, moisture extremes, compound-event frequency, GPP extremes, or their joint occurrence.
Ln 56: The connection between plant functional types, soil heterogeneity, and root-zone nutrient dynamics and the main research question is not clear. Please explain how these structural model simplifications are expected to influence the simulated coupling between compound climate extremes and GPP extremes. This paragraph could potentially be merged with the following paragraph to improve flow.
Ln 65: Please clarify why this is challenging. Is the difficulty related to hydrological variability, plant physiological responses, soil moisture memory, or model structure?
Ln 107: The sentence “These differences between the ESMs are attractive for assessing the link between temperature–moisture and GPP extremes...” is unclear. Why are these differences “attractive”? What specific mechanistic insights are expected? Biases and commonalities in what?
Figure 4d, f: The different peak timings among compound-event types are interesting and deserve more discussion. What mechanisms could explain why different temperature–moisture combinations show different lag structures?
Figure 5: Consider adding direct labels above each subplot, such as “hot–dry–high GPP” or “hot–dry–low GPP”, to make the figure easier to interpret.
Ln 367: Please explain the concept of “effective integration time” and provide a reference.
Ln 368: The sentence is confusing. It is unclear what “it” and “these” refer to. This part may fit better in the Introduction, where the authors could explain why results may differ depending on whether precipitation or soil moisture is used as the indicator of moisture availability.
Ln 375: Please explain explicitly what is meant by “different mechanistic aspects”. Which mechanisms are being contrasted?
Ln 380: The manuscript should further discuss why maps based on temperature–precipitation extremes exhibit strong dipole patterns.
Ln 409: The manuscript argues that models and FLUXCOM agree at the regional scale. Please provide stronger evidence for this claim. For example, the authors could include a supplementary version of Figure 4C-F separated by IPCC regions.
Ln 422–425: The claim that “climate extremes pre-condition vegetation towards productivity extremes more strongly in the ESMs than in the observational data” needs clearer support from the Results section. Please also clarify the logical connection between the first two sentences of this paragraph, as the current writing is difficult to follow.
Ln 426–427: Please clarify what type of biases are being discussed.
References:
Jung, M., Schwalm, C., Migliavacca, M., Walther, S., Camps-Valls, G., Koirala, S., Anthoni, P., Besnard, S., Bodesheim, P., Carvalhais, N., Chevallier, F., Gans, F., Goll, D. S., Haverd, V., Köhler, P., Ichii, K., Jain, A. K., Liu, J., Lombardozzi, D., Nabel, J. E. M. S., Nelson, J. A., O’Sullivan, M., Pallandt, M., Papale, D., Peters, W., Pongratz, J., Rödenbeck, C., Sitch, S., Tramontana, G., Walker, A., Weber, U., and Reichstein, M.: Scaling carbon fluxes from eddy covariance sites to globe: synthesis and evaluation of the FLUXCOM approach, Biogeosciences, 17, 1343–1365, https://doi.org/10.5194/bg-17-1343-2020, 2020.
Nelson, J. A., Walther, S., Gans, F., Kraft, B., Weber, U., Novick, K., Buchmann, N., Migliavacca, M., Wohlfahrt, G., Šigut, L., Ibrom, A., Papale, D., Göckede, M., Duveiller, G., Knohl, A., Hörtnagl, L., Scott, R. L., Dušek, J., Zhang, W., Hamdi, Z. M., Reichstein, M., Aranda-Barranco, S., Ardö, J., Op de Beeck, M., Billesbach, D., Bowling, D., Bracho, R., Brümmer, C., Camps-Valls, G., Chen, S., Cleverly, J. R., Desai, A., Dong, G., El-Madany, T. S., Euskirchen, E. S., Feigenwinter, I., Galvagno, M., Gerosa, G. A., Gielen, B., Goded, I., Goslee, S., Gough, C. M., Heinesch, B., Ichii, K., Jackowicz-Korczynski, M. A., Klosterhalfen, A., Knox, S., Kobayashi, H., Kohonen, K.-M., Korkiakoski, M., Mammarella, I., Gharun, M., Marzuoli, R., Matamala, R., Metzger, S., Montagnani, L., Nicolini, G., O’Halloran, T., Ourcival, J.-M., Peichl, M., Pendall, E., Ruiz Reverter, B., Roland, M., Sabbatini, S., Sachs, T., Schmidt, M., Schwalm, C. R., Shekhar, A., Silberstein, R., Silveira, M. L., Spano, D., Tagesson, T., Tramontana, G., Trotta, C., Turco, F., Vesala, T., Vincke, C., Vitale, D., Vivoni, E. R., Wang, Y., Woodgate, W., Yepez, E. A., Zhang, J., Zona, D., and Jung, M.: X-BASE: the first terrestrial carbon and water flux products from an extended data-driven scaling framework, FLUXCOM-X, Biogeosciences, 21, 5079–5115, https://doi.org/10.5194/bg-21-5079-2024, 2024.
Yu, X., Orth, R., Reichstein, M., Reimers, C., Gomarasca, U., Migliavacca, M., Papale, D., Bahn, M., and Bastos, A.: Widespread but divergent drought legacy effects on gross primary productivity across biomes, Global Change Biology, 31, e70541, https://doi.org/10.1111/gcb.70541, 2025.
Citation: https://doi.org/10.5194/egusphere-2026-626-RC2
RC3: 'Comment on egusphere-2026-626', Anonymous Referee #2, 22 Jun 2026
I would also suggest that the authors check the version of FLUXCOM (FLUXCOM-RS or FLUXCOM-RS+METEO) used in the article. If the authors use the FLUXCOM-RS+METEO version, only mean seasonal cycles of RS-based land surface properties and concurrent meteorological input are used in the RS+METEO setup to predict the GPP in each time step (Jung et al., 2020). Therefore, in theory, the FLUXCOM GPP estimates should not be considered to respond to the meteorological conditions in the previous months. The FLUXCOM GPP estimates also lack the effect of CO2 fertilization (Jung et al., 2020), so the discussion on the long-term trend of GPP due to CO2 fertilization should also be reevaluated.
Citation: https://doi.org/10.5194/egusphere-2026-626-RC3
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 369 | 133 | 26 | 528 | 332 | 28 | 38 |