the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
pyVPRM: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era
Abstract. The Vegetation Photosynthesis and Respiration Model (VPRM) is a well-established tool to estimate carbon exchange fluxes between the atmosphere and the biosphere. The gross primary production (GPP) and respiration (Reco) of the ecosystem are modelled separately at high spatial and temporal resolution using the satellite-derived Enhanced Vegetation Index (EVI) and Land Surface Water Index (LSWI), as well as meteorological variables for solar irradiance and surface temperature. The net ecosystem exchange (NEE) is calculated as the difference between the gross fluxes GPP and Respiration. VPRM is widely used as a biospheric flux model in atmospheric transport modeling, most often on scales ranging from city to continent, but also in studies of biospheric carbon budgets and their changes with climate extremes. Historically, satellite-based surface reflectances from the 500-m-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) have been used to determine the EVI and LSWI. However, MODIS is reaching the end of its lifetime and will soon be decommissioned. Therefore, we present an updated version of VPRM, pyVPRM, which provides a software framework with a modular structure that can be used with various satellite products, land cover maps, meteorological data sources, and VPRM model parameterizations. Our tool naturally provides an interface to use satellite data from Sentinel-2, MODIS and VIIRS, as well as global high-resolution land cover classification maps from the Copernicus Dynamic Land Cover Collection 3 and ESA World Cover at 100 m and 10 m resolution, respectively. Neither product is static, hence dynamic changes of the land cover from year to year can be represented. Using Sentinel-2, ecosystem fluxes can be calculated at a resolution of up to 20 m, providing more accurate flux estimates in heterogeneous landscapes like croplands and allowing to resolve small-scale vegetation patches as common in urban areas. In contrast, VIIRS data are at the same resolution as MODIS, and thus provide for continuity once MODIS is discontinued, requiring only minor adjustments to the VPRM data preprocessing. In addition, pyVPRM improves the data handling, for example for snow-covered scenes. This paper presents the pyVPRM framework, discusses changes and improvements compared to previous VPRM implementations, and provides VPRM parameters for the European domain based on indices calculated from MODIS, Sentinel-2 and VIIRS using a new, wind-speed-optimized selection of eddy-covariance observations from 97 flux tower sites. Using pyVPRM and the new parameters we observe significant improvements in the estimation of the European carbon budget. The results are well conform with those from inversion studies.
- Preprint
(13353 KB) - Metadata XML
- BibTeX
- EndNote
Status: open (until 18 Mar 2025)
-
CEC1: 'Comment on egusphere-2024-3692 - No compliance with the policy of the journal', Juan Antonio Añel, 12 Feb 2025
reply
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
You have archived your code on GitHub. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Moreover, this is clearly stated in our policy, which we would have expected you read before submitting your manuscript. Therefore, the current situation with your manuscript is irregular. Please, publish your code in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible, as we can not accept manuscripts in Discussions that do not comply with our policy.
Also, you do not provide permanent repositories for the data that you use, but website links, and in some cases they are simple links to main portals, not the specific data files that you use. This does not comply with our policy either. Please provide repositories (links and DOIs) containing the specific files that you use in your manuscript.
Note that you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript containing the DOI and links of the new repositories.
Finally, if you do not fix this problem, we will have to reject your manuscript for publication in our journal.Juan A. Añel
Geosci. Model Dev. Executive EditorCitation: https://doi.org/10.5194/egusphere-2024-3692-CEC1 -
AC1: 'Reply on CEC1', Theo Glauch, 26 Feb 2025
reply
Dear editors,
on behalf of the authors I want to apologize for the problems with the "Code and data availability" sections. Indeed, this slipped through in our final revisions. I'm posting the corrected version of the section below.
All the best,
Theo Glauch
--------------------------
The current version of pyVPRM is available from the project website: https://github.com/tglauch/pyVPRM under the MIT licence. The exact version of the model used to produce the results used in this paper is archived on Zenodo (https://doi.org/10.5281/zenodo.14216613 (v3.0)). Data processing scripts are specific to the DLR Terrabyte cluster that has been used to calculate the results. They are provided on request. The flux tower data can be retrieved through https://doi.org/10.18160/06dk-vryh (ICOS) and https://doi.org/10.1038/s41597-020-0534-3 (FLUXNET). The MODIS and VIIRS satellite datasets are publicly available through LP DAAC: MOD09GA (https://doi.org/10.5067/modis/mod09ga.061), MYD09GA (https://doi.org/10.5067/modis/myd09ga.061), MOD09A1 (https://doi.org/10.5067/modis/mod09a1.061), MYD09A1 (https://doi.org/10.5067/modis/myd09a1.061), VNP09GA (https://doi.org/10.5067/viirs/vnp09ga.002), VNP09H1 (https://doi.org/10.5067/viirs/vnp09h1.002). Sentinel-2 Collection 1 Level-2A data are publicly distributed through the Copernicus Data Space Ecosystem (http://dx.doi.org/10.5270/S2_-znk9xsj). The land cover maps are accessible through https://doi.org/10.5281/zenodo.3939049 (Copernicus Land Cover Service), https://doi.org/10.5281/zenodo.7254220 (ESA World Cover) and https://doi.org/10.1016/j.rse.2006.01.020 (SYNMAP).
Citation: https://doi.org/10.5194/egusphere-2024-3692-AC1
-
AC1: 'Reply on CEC1', Theo Glauch, 26 Feb 2025
reply
-
RC1: 'Comment on egusphere-2024-3692', Anonymous Referee #1, 05 Mar 2025
reply
Review on „pyVPRM: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era” by Theo Glauch et al. submitted to Geoscientific Model development
The paper describes a new version of the previously published VPRM model (which estimates biogenic CO2 exchange for large land surfaces in gridded form after fitting parameters that relate proxies from satellite remote sensing and meteo models to CO2 exchange as measured by reference stations). The most important updates are with respect to the remote-sensing data sources that can be handled, in particular their spatial resolution, the user friendliness (modular, public python package) and parameter fitting procedure.
The paper addresses a relevant scientific modelling question within the scope of GMD with novel concepts and data. The presented example results demonstrate a substantial advance, especially in the realistic representation of crop ecosystems, which particularly suffered from mismatch between remote sensing pixels and station footprints due to strong differences in the sowing and senescence / harvest dates of crops between adjacent fields.
The language is fluent and clear almost everywhere and the figures are mostly of high quality. Together with the published open-source code and the appendix, the should allow for reproduction of the results.
There are a few issues: The structure of the paper is at some points a bit arbitrary, creating an impression of jumping back and forth that may confuse readers. Some knowledge of the remote sensing and modelling community is taken for granted, while some descriptions of the eddy-covariance measurements used for the fitting are imprecise or misleading; in one case (relation between wind speed and footprint size) this may have led to a poor decision in data treatment. A regionally suspiciously large annual net respiration in the results is not discussed. These issues are described in detail below, together with minor needs or suggestions for improvement in particular sentences, figures or tables.
I think that the paper can provide a very helpful reference and guideline for the many usage cases the new model will experience, once these issues are resolved. While some thoughts, future suggestions and questions on the analysis itself are included below, it is probably sufficient to revise the clarity of its description and adequate discussion of problems in the paper.
Introduction
L28: A word like “occur” seems missing after “Earth system”
L34: Before or after the last sentence of the 1st paragraph, consider also setting the net flux (already mentioned before) into relation to anthropogenic fluxes. GPP and respiration mostly cancel out each other and while comparing their magnitude to anthropogenic emissions is impressive, doing the same for their net effect gives a more realistic (and still important) idea why we should care about better constraining or even managing these fluxes.
L45: consider mentioning the 3 fluxes in a more instructive order and way, like: “GPP, respiration and consequently NEE…”
L60 ff.: Letter order in VRPM a typo? The rest of the paragraph from “Different approaches” is somewhat unclear. I guess you intend to mention past work by others that anyone could easily include to the new pyVPRM thanks to its modular structure, but it hasn’t been done yet?
Section 2
L78-70: The dropping of the class savanna raises 2 questions: First, while the title and start of this section suggest it is about how the model in general / in its traditional version / in its various other earlier implementations works, at this point (for the first time) the reader wonders if accidently you already stepped ahead to describe a recent implementation inside pyVPRM. Please clarify and also check if this also applies to other points in the rest of the section.
Second, while there may or may not be good reasons to drop the class savanna, increased resolution is not a sufficient one for all (maybe not even most) savannas. E.g. equally distributed trees with 16 m crown diameter and 30 m stem distance at 20 m resolution would cover no pixel with more than 50% tree and leave none of them tree-free either. I see it’s cumbersome to re-run the example if you cannot give a better reason, but maybe then you could stress it’s just an example and has nothing to do with limitations of the actual pyVPRM.
L91: Maybe add something like “in our convention”, because the usual one is that while NEE is negative if there is uptake, GPP is still always positive (i.e. NEE = R – GPP), which may be a bit of nuisance but still makes most sense given the words the abbreviations stand for, and their history in biology.
L120: What does the general assumption of grassland being xeric imply for the use of this vegetation class in the humid temperate zone (where we have lots of anthropogenic grasslands) and tundra ecosystems? Again (like dropping savanna), this is not super critical for the model description but maybe for the evaluation over Europe.
Section 3
Table 1: If Journal policy is as usual, remember to overwrite (not underwrite) tables.
Section 3.1.1: Since some descriptions before were about the specific examples in the paper rather than about inherent settings of pyVPRM, maybe continue to clarify here whether the chosen parameters for Eq. 10-12 are purely a part of the example, hardcoded, or (most likely?) something in between, e.g. default settings of functions the user can redefine in the function call.
L201: Describing eddy covariance footprints like this is a bit tricky, since various parameters (distance of peak weight from station, x% (e.g. 50 to 90) contour line maximum distance, or a similar percentage but for the crosswind-integrated footprint) can differ hugely from each other. ICOS downloads contain e.g. the distance where the cross-wind integrated footprint cumulative probability is 90% (variable FETCH_90), which for DE-RuS is on average (median) over time around 85 (87) m, which is larger (and intuitively better matching the flux community’s understsnding of typical footprint sizes) than your given dimension of 50 m but still less than the distance from station to any field edge. Alternatively, for a 2Dmap of the footprint climatology you could refer to the site labelling report (https://docs.google.com/document/d/1ksB7EZi7hOSJhpgCkDrSJzo2JBL2V174et9m7eqAPL4/edit?tab=t.0, p. 12) or maybe as a more citeable source to Fig. 2b in https://doi.org/10.5194/bg-21-2051-2024. In the following it might be worthwhile considering to use the actual crops which nicely support your hypothesis: In 2022 the field itself had potatoes, which are only planted in April, take time to grow and last until fall. The eastern and southeastern neighbouring fields in contrast had winter wheat, which was sown the fall before, starts photosynthesis early, and is typically harvested around in July. Unfortunately including this kind of information (at least for the field itself) in the ICOS metadata is still work in progress, but you could cite (if journal guidelines allow) “pers. comm. Schmidt 2025”. Marius Schmidt is the main site PI, gave me this information but can also be contacted directly. I am aware that the above information is too long for the paper, so please just regard it as an offering to pick from.
L210/Sect. 3.2-3.3: Briefly mentioning the fitting of the standard model with 4 independent parameters at the beginning of this section raises a number of clarity questions: Does in this case “the standard model” also apply to your example implementation, the results of which are shown at the end of 3.3 in Figure 4? Is the fitting reference and methodology used to generate Fig. 4 already the one only described later in section 4? If yes it creates a bit of an impression of jumping back and forth that can easily confuse readers, and might be resolved in different ways, all of which probably have their pros and cons: Reconsider the section order, abstain from already showing example results in Figure 4, use non-fitted default parameters for it, or (at least) insert some information at the start of 3.2 clarifying explicitly or implicitly that the fitting will be explained later.
L215/Table 4: It does not get clear to outsiders what the numbers in the table stand for – internal land cover codes of the two schemes?
L254/Fig. 5: Why does the URB fraction subpanel suggest 100% coverage at sea?
Section 4
L257 and also Eq. 9, “linear”: Note that process understanding suggests that the temperature dependence of respiration is far from linear. The chosen equation shape can probably be justified (especially since the Tlow clipping eliminates the worst possible effects) by reasons like simplicity, robustness or community tradition, but it should be avoided to use a wording that suggests to some readers the relation is actually linear.
L264: To avoid misunderstandings maybe it would be good to choose a wording making clear that not FLUXNET as such provides data only up to 2015, but rather the (currently still most up-to-date but hopefully replaced soon) FLUXNET2015 dataset.
L271: The part “which statistically determines vertical fluxes from turbulence (eddies) in the wind field” is a bit unclear in the sense that it does not give any helpful explanation to readers unfamiliar with the methods, and puts a strange focus on arbitrary aspects of the methods for those familiar. If you want to stay short, maybe end at the first comma with a citation (a textbook, early paper like https://doi.org/10.1175/1520-0469(1951)008<0135:TMOVTO>2.0.CO;2 , or influential paper like https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2). Or try something like “which determines vertical fluxes from turbulence-resolving (i.e. sub-second) measurements of the vertical wind component and the quantity of interest, in our case CO2” (with or without citation).
L279: Is this selection somehow documented in a reproducible way? Or are all towers mentioned in the manuscript (Figure and table) already filtered for this criterion? Maybe more for your interest than relevant for the level of detail of the manuscript: All ICOS and some FLUXNET site datasets already include a filter where data not representative of the target land use (according to a footprint model run by the site PI in case of FLUXNET or centrally in case of ICOS) never enter the final gap-filling procedure. This may help (in the future) to also consider some more sites you now removed, if atypical land use is only found in a certain wind direction.
L286: It is a (maybe not common but repeatedly occurring) misconception that footprints as such get larger with increasing wind speed. Because the vertical and cross-wind turbulence intensity (and thus the dispersion of any upwind signal) scale with wind speed itself, a direct cause-and-effect relation used in footprint models only exists for quantities such as Obukhov length (a parameter including the effect of unstable or stable temperature stratification) or variance of vertical or (horizontal cross-) wind, but not for wind speed itself. A secondary (scattered, unreliable and differing between sites) relation of wind speed to footprint size may result from this, however not necessarily in the direction you assume (e.g. very calm nights are often associated with very stable stratification and thus very long footprints). Also, it is not documented how you did this prioritization (P.S.: It is at L296, but the reader cannot anticipate it here). That said, for most (except the very highest) towers your choice to use the smallest possible patch for the 500 m resolution EVI/LSWI sources, and 3 x 3 for Sentinel-2 probably is a good way to catch the most important part of the footprint. Luckily, the way PIs select their tower locations and your additional screening of sites (L279) will probably make the result pretty much insensitive to the exact choice of the smoothing (e.g. 60x60 vs. 120 x 120 m). For future implementations you might want to consider the footprint information given in some (at least all ICOS) time series along with the fluxes (see comment on L201). For the 20 m resolution also a “ring-like” average of just the outer 8 pixels could slightly improve results, since the station infrastructure itself (by its pure presence, unwanted vegetation damage or because a small clearing was chosen for the setup) might affect the EVI/LSWI readings for the central pixel, while its contribution to the flux measurement will be small due to the fact that the peak of the footprint is always some (deca) meters upwind of the station.
L296: If the only reason for the inverse wind speed scaling function is the one mentioned above, I would advocate to drop it.
L300-303: Very elegant and consistent solution, avoiding to effectively do the source partitioning modelling of NEE into GPP and R twice, with inconsistent parameter sets. A minor drawback might be that your four parameters are per vegetation class (and strongly simplified for respiration, see comment on L257) while the source partitioning in the provided EC flux dataset allow for (more realistic) locally varying parameters. Therefore a future sensitivity study on this choice (probably outside the scope of this paper) might still be interesting.
Section 5 to 7
L331: change yes, but (unlike all evidence presented so far) in this case I’m not sure whether for the better. It is a bit hard to “read” the continuous colour scale but it seems to suggest net annual losses of 1 kg m-2 yr-1 in some large regions with very low soil and biomass carbon stocks. Especially if it is kg C and not kg CO2 (please clarify) and goes on for several consecutive years (which we cannot know from the figure), this is hardly possible (both from the typical range of actually measured annual fluxes, and theoretically from the flux to stock ratio). It should be followed up in the discussion section how realistic these large net respiration events are. One possible reason may be the fitting of parameters per vegetation class independent of region or climate, which may have been partly compensated by the discussed errors in the old model version.
Figure 9: Check the “green” and “white” at the end of the caption. To me they look blue vs. yellow, and if matching blue with green and white with yellow it gives a meaning opposite to the legend within the figure.
L444-445: “describing different the categories”: check word order
Code and data availability, references, acknowledgements
L471: In the original submission (if I didn’t overlook anything) neither the acknowledgement, nor the references or a separate dataset citation section (many journals want it in both) acknowledge the data contribution of the FLUXNET and ICOS site PIs. For ICOS there is a good description how to do it best at https://www.icos-cp.eu/how-to-cite, which might also serve as a blueprint for the used non-ICOS FLUXNET sites. Most of these guys are not officially paid to produce the data, but scientists sacrificing a considerable amount of their time to make them available to the community, and continuation critically depends on traceability of their efforts. P.S.: In reaction to a slightly different comment by the editor on the journal policy compliance, the authors now posted a code and data availability section which briefly mentions DOIs to both datasets but does not automatically address all of the above (which of course they couldn’t know at the time)
Appendix
Table C1-C7: It did not get clear to me why only one year per site was chosen and why used sites or site-years can differ between the VPRM models for the different satellite products. If I didn’t overlook it earlier in the methods section, please clarify. Given that the height but not the width of the tables challenge the page dimensions and a lot of information is redundant between the tables, would it be an option to present only one (two page long) table with more columns, one for each of MODIS, Sentinel-2 and VIIRS?
Citation: https://doi.org/10.5194/egusphere-2024-3692-RC1 -
RC2: 'Comment on egusphere-2024-3692', Anonymous Referee #2, 06 Mar 2025
reply
This is a well-written and clear article that presents a promising contribution to Geoscientific Model Development. The study is compelling, and I believe it is well-suited for publication in this journal. Before I can recommend acceptance, I would appreciate clarification on a few points, mostly related to curiosities and minor refinements.
Curiosities:
- Impact of temporal resolution when transitioning from MODIS to Sentinel-2:
You currently use 8-day MODIS products, while Sentinel-2 has a slightly longer revisit cycle. How does this difference affect the lowess smoothing? While you briefly mention this at the end of Section 3.1, I believe the issue deserves a bit more discussion. - Source of shortwave (SW) radiation for PAR calculation:
You mention in a footnote that this comes from ERA-5 reanalysis, but this point is somewhat understated. It would be beneficial to explicitly describe this in the main text. - Stratifying VPRM beyond broad forest types:
Given recent advances in AI and remote sensing for characterizing canopy height texture (e.g., https://gmd.copernicus.org/articles/18/337/2025/gmd-18-337-2025.html), would it make sense in the future to refine VPRM parameterization beyond the current broad categories (evergreen, deciduous, and mixed forests)? - Use of Sentinel-1 for cloud gap-filling:
With growing research on using Sentinel-1 to interpolate missing Sentinel-2 data, have you considered incorporating S1 to address cloud cover issues? This could potentially make your smoothing better and avoid part of the issues evoked at the end of section 3.1. - Alternative to filtering for low wind speed periods:
Could recent methods for separating advection (large-scale) from turbulence (e.g., https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3243/) provide an alternative to prioritizing time periods with relatively low wind speeds? I mean, if you can always remove the advection, you do not need to filter out these periods, which represent quite an important part of your calibration datasets, I guess? - Missing results for deciduous and mixed forests (Fig. 8):
Given that Fig. 6 indicates flux tower sites in the deciduous and mixed forest categories, is there a reason why results for these categories are not also shown in Fig. 8? It would be great to be able to see that. - Source of EVI and LSWI in Section 3.4:
The section mentions using forecasted temperature and SW data, but what method is used to generate EVI and LSWI? Is there an extrapolation process involved?
Minor comments:
- L42: Add a reference for the bottom-up approach.
- P44: It would improve coherence to briefly mention similar models (e.g., TBMs relying on remote sensing) before introducing the specific model you are improving.
- P60: The discussion on approaches for improving VPRM feels slightly out of place here. Wouldn’t it fit better earlier in the introduction, before detailing your proposed improvements?
- L91: Where is the negative sign?
- L109: Strengthen the link between Equations 6 and 7 by explicitly referencing TH_full_leaf_expansion in Equation 6.
- L146: The phrasing is slightly awkward - please consider rewording.
- L238: There is an inconsistency between Equations 9 and 13 - why is the respiration equation parameter β replaced with δ?
- L309: “This question will be investigated in further studies.” I’m looking forward to this! This is an exciting question from a remote sensing perspective.
- L384: “In the future, the parameter estimation can be further improved by taking into account the (time-dependent) footprint of the eddy-covariance towers.” Could you rephrase please?
Citation: https://doi.org/10.5194/egusphere-2024-3692-RC2 - Impact of temporal resolution when transitioning from MODIS to Sentinel-2:
Viewed
HTML | XML | Total | BibTeX | EndNote | |
---|---|---|---|---|---|
193 | 56 | 8 | 257 | 2 | 4 |
- HTML: 193
- PDF: 56
- XML: 8
- Total: 257
- BibTeX: 2
- EndNote: 4
Viewed (geographical distribution)
Country | # | Views | % |
---|
Total: | 0 |
HTML: | 0 |
PDF: | 0 |
XML: | 0 |
- 1