the Creative Commons Attribution 4.0 License.
Evaluating vegetation indices for monitoring drought and post-drought declines in European forest productivity
Abstract. Drought is causing increasingly severe and widespread negative impacts on forest gross primary productivity (GPP) but modelling these impacts over large spatial scales with remote sensing data is challenging. It is especially problematic in forests which have lower spectral sensitivity to drought compared to other ecosystems and where the timing of vegetation index (VI) response may lag GPP. We tested the ability of 12 MODIS variables (land surface temperature, leaf area index, fraction absorbed photosynthetic active radiation and nine VIs) to capture drought-induced reductions in GPP at 18 forest sites across Europe. Our analysis quantified the time lags between the Standardized Precipitation Evapotranspiration Index, GPP and VI response to drought as well as legacy effects in the first year post-drought. We found that land surface temperature was the only MODIS variable that showed significant change between drought and non-drought reference periods at both deciduous broadleaf and evergreen coniferous forests. At deciduous sites, the Chlorophyll/Carotenoid Index, Normalized Difference Water Index and Normalized Difference Vegetation Index (NDVI) were also significantly reduced during drought while the near infrared reflectance index (NIRv) was significantly reduced at coniferous sites. There were substantial variations in the magnitude and timing of drought response among the VIs which we relate to drought-induced changes in tree physiology and their differences between the five tree species represented at the study sites. VIs related to canopy structure (NDVI, Plant Phenology Index and NIRv) remained low in the first year following drought at both broadleaf and coniferous sites, even though GPP recovered to long-term mean values, implying a significant decoupling between GPP and these VIs post-drought. 
Remote sensing-based GPP models based on these structural indices alone may thus overestimate drought impacts on GPP and underestimate forest resilience to drought.
Competing interests: At least one of the (co-)authors is a member of the editorial board of Biogeosciences.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-1822', Anonymous Referee #1, 20 May 2026
- RC2: 'Comment on egusphere-2026-1822', Anonymous Referee #2, 29 May 2026
General Comments
Previous studies have shown that satellite-driven models of gross primary productivity (GPP) often underestimate the observed response of GPP to drought conditions. This is likely because the vegetation indices (VIs) and products that the models use may not respond to drought as expected. This study evaluates broadleaf and coniferous forest responses to recent droughts in Europe, comparing the response of GPP from 18 flux tower sites to the responses of a suite of MODIS products (mostly VIs). The authors find that many of the VIs (broadly defined here to also refer to other satellite remote sensing products) commonly used in GPP models (e.g., FPAR) do not respond sufficiently to droughts (defined by the Standardized Precipitation Evapotranspiration Index at 3-month span, SPEI-3), or respond inconsistently at best. They find that land surface temperature responds consistently to drought periods, which may be expected, but which suggests that it would be worthwhile to include in GPP models. Furthermore, other vegetation indices such as the older and simpler NDVI have more consistent responses to droughts.
Initially, I thought this study was retreading old ground, but instead it appears to do a rather nice job of succinctly diagnosing why GPP models may not be getting the response to drought in temperate forests quite right. It could use some revision to make its main points clearer and tighten up the analysis, but I believe it will be an important contribution to the GPP literature once edited. I suspect it may be contentious for some in the community, especially given its relatively limited analysis compared to the application scope of global GPP models, but I think that it is a needed case study to kickstart discussions on how to model GPP better using satellite data.
Specific Comments
- This study appears to treat all SPEI-3 droughts the same, and is attempting to find if there is an effect of drought on vegetation indices regardless of drought severity. Was there any ranking or hierarchy in terms of the severity of drought (how negative SPEI-3 is) and whether the effect of drought manifested in a particular VI? There may be a threshold sorting in which drought may always develop in some variable (LST) but less frequently in others (NDVI, NIRv) depending on the intensity of the event. There is some brief discussion related to drought severity at line 320, comparing broadleaf and needleleaf sites, but this could be expanded.
- Last comment in Abstract states: “Remote sensing-based GPP models based on these structural indices alone may thus overestimate drought impacts on GPP and underestimate forest resilience to drought.” Most GPP models include at least some form of meteorological data in addition to vegetation structure metrics. Anything specific? This could be directly tested (see comments below).
- Any particular reason MODIS GPP (MOD17) or other MODIS-derived GPP products like FLUXSAT were not evaluated? This analysis does not need to be done here since it has been evaluated in many previous studies (like Stocker et al 2019, as mentioned) on model development, but it does need to be carefully stated (likely something related to “how are drought impacts being expressed in forests in Europe that are measurable from satellite remote sensing?”)
- In addition, why MODIS and not Sentinel-2? Is this to keep a focus on global products? Just need more of an explanation in the Discussion.
- Line 411: There is some circularity in LST being predictive of SPEI-defined drought, since the SPEI’s PET is derived from modeled temperature and solar radiation variables in ERA5-Land that would directly covary with LST in most situations. This should be noted, but I also agree and think it is a good suggestion that LST should be included in GPP models for drought relationships.
- Line 474: “We suggest that these indices should not be used to capture drought effects on GPP.” This is a big suggestion since FPAR is used for the MODIS GPP product. Would you be willing to suggest that NDVI would be a better variable to use instead of FPAR in this product?
- I would change Figure 3a-f and Figure 4a-f into box plots or into points for the mean with SD ranges. The very small bars, which mark the dates that are actually close to the measured GPP and SPEI dates, are what should be the focus of the plots (in my view), and the small size of the bars detracts from this emphasis.
Technical Comments
Abstract needs to be more explicit about where the GPP numbers are coming from, please state that they are coming from eddy covariance sites using data from FLUXNET (or ICOS, whichever the authors prefer).
Graphical abstract: Need to specify the units on the y-axis (these are SDs?)
Line 81: Introduction “Yet no previous studies have quantified the lag between drought start and the timing of GPP and VI response for a wide range of different indices in forest ecosystems.” This is a key justification for the study, please include some variation of this statement in the Abstract if possible.
Lines 43-45: I know the references for these VIs are listed in Table 1, but the references should be here too because this is where they are first introduced and Table 1 doesn’t appear until much later in the document.
Line 115: Why was a 15-day moving average needed/chosen for LST?
Line 119: Why were NDVI and EVI calculated instead of using the existing MODIS products (MOD/MYD13)? These products are what many MODIS users would likely use instead.
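For reference, the two indices in question can be computed from surface reflectance with their standard formulas. The sketch below is illustrative only: the function and argument names are hypothetical, the EVI coefficients are the standard MODIS values, and it says nothing about how the authors actually processed their reflectance data.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index with the standard MODIS coefficients.

    gain, c1, c2 and soil are the usual G, C1, C2 and L terms.
    """
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)


# Hypothetical reflectance values (unitless, 0-1) for a dense canopy.
print(ndvi(0.4, 0.1))
print(evi(0.4, 0.1, 0.05))
```

Comparing values computed this way against the MOD/MYD13 products for a few sites would show whether the choice of source changes the drought signal.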
Line 119: Were these daily or 8-day MODIS reflectance products, or something different? I assume this is the 8-day product since it’s smoothed using TIMESAT to make daily products. Any more details that can be provided about settings for the TIMESAT smoothing? Spline spans, knots, order, etc.
Line 126: Was this a regular linear regression slope, or Sen’s slope with a Mann-Kendall test (which would be more appropriate for time series and less sensitive to outliers)?
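To make the suggested alternative concrete, here is a minimal sketch of Sen's slope and the Mann-Kendall test using only the Python standard library. It assumes evenly spaced observations and applies no correction for ties or autocorrelation, so it illustrates the technique rather than providing a drop-in replacement for the authors' analysis. The function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def sens_slope(values):
    """Median of all pairwise slopes (Theil-Sen estimator), evenly spaced data."""
    slopes = [(values[j] - values[i]) / (j - i)
              for i, j in combinations(range(len(values)), 2)]
    return median(slopes)


def mann_kendall(values):
    """Return (S, z, two-sided p) for the Mann-Kendall trend test.

    Simplification: no correction for tied values or autocorrelation.
    """
    n = len(values)
    # S counts concordant minus discordant pairs.
    s = sum((values[j] > values[i]) - (values[j] < values[i])
            for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


# A single outlier (100) does not move the median slope away from 1.
print(sens_slope([0, 1, 2, 3, 100, 5, 6]))
print(mann_kendall([5, 4, 3, 2, 1]))
```

The first call shows why the estimator is less sensitive to outliers than an ordinary least-squares slope: the one anomalous value affects only a minority of the pairwise slopes.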
Line 134: Please define MAD
Line 135: Similar to earlier comments, what form of slope and test calculation was used? Why was 15 days chosen for smoothing?
Line 151: Any reference for discarding values outside +/- 6 z-score SPEI?
Line 160: These z-score values selected are reasonable, but why choose different thresholds for SPEI (-1.5) and GPP (-1)? What happens if these values are adjusted? Don’t necessarily need to do some kind of sensitivity analysis here, but would be good to know or reference.
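A quick way to see how sensitive event detection is to the threshold is to count contiguous runs below each candidate value. The sketch below is a hypothetical helper, not the authors' detection procedure, and the z-score series is invented for illustration.

```python
def runs_below(z_scores, threshold, min_length=1):
    """Return (start, end) index pairs of contiguous runs with z < threshold."""
    runs, start = [], None
    for i, z in enumerate(z_scores):
        if z < threshold and start is None:
            start = i  # a run begins
        elif z >= threshold and start is not None:
            if i - start >= min_length:
                runs.append((start, i - 1))
            start = None  # the run has ended
    # Close a run that extends to the end of the series.
    if start is not None and len(z_scores) - start >= min_length:
        runs.append((start, len(z_scores) - 1))
    return runs


# Invented z-score series; compare the two thresholds discussed above.
z = [0.2, -1.2, -1.6, -1.8, -0.9, 0.1, -1.1, -0.3]
for threshold in (-1.5, -1.0):
    print(threshold, runs_below(z, threshold))
```

On this toy series the stricter threshold yields one event and the looser one yields two, with an earlier start date, which is the kind of difference a short sensitivity check would reveal.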
Line 205: I would think that the droughts where all the VIs were less than -1 z-scores would be an indicator of the severity of these particular droughts. A broad impact like this may not be normally expected. Might be worth including some comments on this in the Discussion.
Line 215, Figure 2, panel b: How can we be sure this decline in NDVI preceding the SPEI identified drought is also a drought and is related? NDVI z-score could have dropped before the drought event for an unrelated reason and might not have anything to do with the drought itself. If it is likely a drought, is this a suggestion that SPEI-3 is too long a time span and SPEI-1 month would be more appropriate due to the quickness of the response?
Line 318 and Figure 6 caption near Line 355: These groupings are tree genus, not tree species.
Line 370: Figure 7 may be better to include in a supplemental section since no significant differences were found for any pre- and post-drought comparisons
Line 433: Pinus and Fagus are both genera, not species.
Line 486: I think this should be referencing Figure A2 instead. Figure 2a only shows NDVI, while A2 shows all the VIs for DE-Hai.
Citation: https://doi.org/10.5194/egusphere-2026-1822-RC2
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 338 | 73 | 23 | 434 | 21 | 18 |
This manuscript evaluates the ability of multiple MODIS-derived vegetation indices (VIs) and related satellite variables to capture drought-induced declines in forest gross primary productivity (GPP) across 18 European forest flux sites. By comparing the timing and magnitude of drought responses in GPP and 12 VIs, the authors show that LST most consistently tracks drought impacts across both broadleaf and coniferous forests, while structural indices such as NDVI, PPI, and NIRv often exhibit delayed or persistent post-drought responses. The topic is interesting, and the manuscript combines flux observations and remote sensing data in a potentially valuable way. However, several issues currently limit the robustness of the conclusions, and I therefore recommend a Major Revision. Detailed suggestions follow below:
Most of the VIs used in this study are derived from a single MODIS product family (MCD43A4), which raises the concern that the reported conclusions may be strongly influenced by the characteristics of a specific data source. The authors are encouraged to include an additional MODIS dataset to verify the robustness of the results. In addition, the LST data were not directly derived from standard MODIS LST products (e.g., MOD11A2) but from the FluxnetEO database. Therefore, the superior performance of LST reported in this study may partly reflect the quality or processing advantages of the FluxnetEO product rather than the intrinsic advantage of LST itself. This issue should be clarified and discussed more carefully.
The preprocessing procedures for the MODIS data should be described in greater detail. For example, the manuscript does not clearly explain how cloud contamination was removed or corrected. Similarly, the criteria used to define “good quality” flux observations (Line 131) should also be explicitly described.
In Line 155, the removal of highly collinear variables should be conducted using more standardized approaches, such as variance inflation factor (VIF) analysis, rather than relying solely on pairwise correlation coefficients.
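As an illustration of the suggested check, the sketch below computes VIFs from the inverse of the predictor correlation matrix, using the identity that the j-th diagonal element of that inverse equals the VIF of predictor j. It assumes NumPy is available; the function name and the synthetic data are hypothetical and do not represent the manuscript's variables.

```python
import numpy as np


def variance_inflation_factors(X):
    """VIF for each column of X (rows = observations, columns = predictors).

    Uses VIF_j = [R^-1]_jj, where R is the predictor correlation matrix.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic example: the second predictor is nearly a copy of the first,
# while the third is independent of both.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vif)
```

Unlike a pairwise correlation screen, this flags a predictor that is well explained by a combination of several others, even when no single pairwise correlation is high. Common rules of thumb treat VIF above 5 or 10 as problematic, though the cutoff is a judgment call.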
I found the interpretation of Figures 3 and 4 somewhat confusing. For example, Figure 3 suggests that NDVI Datestart occurs approximately 12 days earlier than GPP Datestart, whereas Figure 4 indicates that NDVI occurs about 3 days earlier than SPEI and GPP about 2 days earlier than SPEI, implying only a ~1 day difference between NDVI and GPP. The relationship between these two figures and the interpretation should therefore be clarified more carefully.
In Fig.5, drought periods are effectively defined based on reductions in GPP, and the same periods are then used to evaluate the responses of other VIs. This framework may not provide a fully fair comparison among variables. It would be more appropriate to define drought periods independently using SPEI and then compare the responses of GPP and all VIs within the same externally defined drought window.
In Fig.7, the interpretation currently relies largely on visual comparison. More rigorous statistical significance testing should be included to support claims regarding post-drought differences and potential decoupling between GPP and VIs.
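One simple option for paired pre- and post-drought values is an exact sign-flip permutation test on the per-site differences, which makes no normality assumption. The sketch below is a hypothetical illustration with invented numbers; it enumerates every sign assignment, so it is practical only for small samples (18 sites gives 262,144 combinations).

```python
from itertools import product


def paired_sign_flip_test(before, after):
    """Exact two-sided sign-flip permutation test on paired differences.

    Returns the fraction of sign assignments whose absolute summed
    difference is at least as large as the observed one.
    """
    diffs = [a - b for b, a in zip(before, after)]
    observed = abs(sum(diffs))
    count = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            count += 1
    return count / total


# Invented z-scores for six sites: every site is lower after drought.
pre = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
post = [-1.0, -1.0, -1.0, -1.0, -1.0, -1.0]
print(paired_sign_flip_test(pre, post))
```

With many indices tested in parallel, the resulting p-values would also need a multiple-comparison adjustment.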
The Discussion section should be reorganized. At present, much of the discussion repeats the Results rather than providing deeper interpretation, synthesis, or broader implications. In addition, given that different VIs exhibit contrasting drought responses and recovery patterns, it would be more effective to compare and discuss the different VIs together within a single section, rather than discussing each index separately.