<em>pyVPRM</em>: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era

Glauch, Theo; Marshall, Julia; Gerbig, Christoph; Botía, Santiago; Gałkowski, Michał; Vardag, Sanam N.; Butz, André

doi:https://doi.org/10.5194/egusphere-2024-3692

Preprints

https://doi.org/10.5194/egusphere-2024-3692

Preprints

21 Jan 2025

| 21 Jan 2025

pyVPRM: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era

Theo Glauch, Julia Marshall, Christoph Gerbig, Santiago Botía, Michał Gałkowski, Sanam N. Vardag, and André Butz

Abstract. The Vegetation Photosynthesis and Respiration Model (VPRM) is a well-established tool to estimate carbon exchange fluxes between the atmosphere and the biosphere. The gross primary production (GPP) and respiration (R_eco) of the ecosystem are modelled separately at high spatial and temporal resolution using the satellite-derived Enhanced Vegetation Index (EVI) and Land Surface Water Index (LSWI), as well as meteorological variables for solar irradiance and surface temperature. The net ecosystem exchange (NEE) is calculated as the difference between the gross fluxes GPP and Respiration. VPRM is widely used as a biospheric flux model in atmospheric transport modeling, most often on scales ranging from city to continent, but also in studies of biospheric carbon budgets and their changes with climate extremes. Historically, satellite-based surface reflectances from the 500-m-resolution Moderate Resolution Imaging Spectroradiometer (MODIS) have been used to determine the EVI and LSWI. However, MODIS is reaching the end of its lifetime and will soon be decommissioned. Therefore, we present an updated version of VPRM, pyVPRM, which provides a software framework with a modular structure that can be used with various satellite products, land cover maps, meteorological data sources, and VPRM model parameterizations. Our tool naturally provides an interface to use satellite data from Sentinel-2, MODIS and VIIRS, as well as global high-resolution land cover classification maps from the Copernicus Dynamic Land Cover Collection 3 and ESA World Cover at 100 m and 10 m resolution, respectively. Neither product is static, hence dynamic changes of the land cover from year to year can be represented. Using Sentinel-2, ecosystem fluxes can be calculated at a resolution of up to 20 m, providing more accurate flux estimates in heterogeneous landscapes like croplands and allowing to resolve small-scale vegetation patches as common in urban areas. In contrast, VIIRS data are at the same resolution as MODIS, and thus provide for continuity once MODIS is discontinued, requiring only minor adjustments to the VPRM data preprocessing. In addition, pyVPRM improves the data handling, for example for snow-covered scenes. This paper presents the pyVPRM framework, discusses changes and improvements compared to previous VPRM implementations, and provides VPRM parameters for the European domain based on indices calculated from MODIS, Sentinel-2 and VIIRS using a new, wind-speed-optimized selection of eddy-covariance observations from 97 flux tower sites. Using pyVPRM and the new parameters we observe significant improvements in the estimation of the European carbon budget. The results are well conform with those from inversion studies.

Received: 26 Nov 2024 – Discussion started: 21 Jan 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 13353 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Preprint (13353 KB)

Download & links

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

30 Jul 2025

pyVPRM: a next-generation vegetation photosynthesis and respiration model for the post-MODIS era

Theo Glauch, Julia Marshall, Christoph Gerbig, Santiago Botía, Michał Gałkowski, Sanam N. Vardag, and André Butz

Geosci. Model Dev., 18, 4713–4742, https://doi.org/10.5194/gmd-18-4713-2025,https://doi.org/10.5194/gmd-18-4713-2025, 2025

Short summary

Theo Glauch, Julia Marshall, Christoph Gerbig, Santiago Botía, Michał Gałkowski, Sanam N. Vardag, and André Butz

Interactive discussion

Status: closed

CEC1:
'Comment on egusphere-2024-3692 - No compliance with the policy of the journal', Juan Antonio Añel, 12 Feb 2025

Dear authors,

Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".

https://www.geoscientific-model-development.net/policies/code_and_data_policy.html

You have archived your code on GitHub. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Moreover, this is clearly stated in our policy, which we would have expected you read before submitting your manuscript. Therefore, the current situation with your manuscript is irregular. Please, publish your code in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible, as we can not accept manuscripts in Discussions that do not comply with our policy.

Also, you do not provide permanent repositories for the data that you use, but website links, and in some cases they are simple links to main portals, not the specific data files that you use. This does not comply with our policy either. Please provide repositories (links and DOIs) containing the specific files that you use in your manuscript.

Note that you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript containing the DOI and links of the new repositories.

Finally, if you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2024-3692-CEC1
- AC1: 'Reply on CEC1', Theo Glauch, 26 Feb 2025
  
  Dear editors,
  on behalf of the authors I want to apologize for the problems with the "Code and data availability" sections. Indeed, this slipped through in our final revisions. I'm posting the corrected version of the section below.
  All the best,
  Theo Glauch
  --------------------------
  The current version of pyVPRM is available from the project website: https://github.com/tglauch/pyVPRM under the MIT licence. The exact version of the model used to produce the results used in this paper is archived on Zenodo (https://doi.org/10.5281/zenodo.14216613 (v3.0)). Data processing scripts are specific to the DLR Terrabyte cluster that has been used to calculate the results. They are provided on request. The flux tower data can be retrieved through https://doi.org/10.18160/06dk-vryh (ICOS) and https://doi.org/10.1038/s41597-020-0534-3 (FLUXNET). The MODIS and VIIRS satellite datasets are publicly available through LP DAAC: MOD09GA (https://doi.org/10.5067/modis/mod09ga.061), MYD09GA (https://doi.org/10.5067/modis/myd09ga.061), MOD09A1 (https://doi.org/10.5067/modis/mod09a1.061), MYD09A1 (https://doi.org/10.5067/modis/myd09a1.061), VNP09GA (https://doi.org/10.5067/viirs/vnp09ga.002), VNP09H1 (https://doi.org/10.5067/viirs/vnp09h1.002). Sentinel-2 Collection 1 Level-2A data are publicly distributed through the Copernicus Data Space Ecosystem (http://dx.doi.org/10.5270/S2_-znk9xsj). The land cover maps are accessible through https://doi.org/10.5281/zenodo.3939049 (Copernicus Land Cover Service), https://doi.org/10.5281/zenodo.7254220 (ESA World Cover) and https://doi.org/10.1016/j.rse.2006.01.020 (SYNMAP).
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC1
RC1:
'Comment on egusphere-2024-3692', Anonymous Referee #1, 05 Mar 2025

Review on „pyVPRM: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era” by Theo Glauch et al. submitted to Geoscientific Model development
The paper describes a new version of the previously published VPRM model (which estimates biogenic CO₂ exchange for large land surfaces in gridded form after fitting parameters that relate proxies from satellite remote sensing and meteo models to CO₂ exchange as measured by reference stations). The most important updates are with respect to the remote-sensing data sources that can be handled, in particular their spatial resolution, the user friendliness (modular, public python package) and parameter fitting procedure.
The paper addresses a relevant scientific modelling question within the scope of GMD with novel concepts and data. The presented example results demonstrate a substantial advance, especially in the realistic representation of crop ecosystems, which particularly suffered from mismatch between remote sensing pixels and station footprints due to strong differences in the sowing and senescence / harvest dates of crops between adjacent fields.
The language is fluent and clear almost everywhere and the figures are mostly of high quality. Together with the published open-source code and the appendix, the should allow for reproduction of the results.
There are a few issues: The structure of the paper is at some points a bit arbitrary, creating an impression of jumping back and forth that may confuse readers. Some knowledge of the remote sensing and modelling community is taken for granted, while some descriptions of the eddy-covariance measurements used for the fitting are imprecise or misleading; in one case (relation between wind speed and footprint size) this may have led to a poor decision in data treatment. A regionally suspiciously large annual net respiration in the results is not discussed. These issues are described in detail below, together with minor needs or suggestions for improvement in particular sentences, figures or tables.
I think that the paper can provide a very helpful reference and guideline for the many usage cases the new model will experience, once these issues are resolved. While some thoughts, future suggestions and questions on the analysis itself are included below, it is probably sufficient to revise the clarity of its description and adequate discussion of problems in the paper.
Introduction
L28: A word like “occur” seems missing after “Earth system”
L34: Before or after the last sentence of the 1^st paragraph, consider also setting the net flux (already mentioned before) into relation to anthropogenic fluxes. GPP and respiration mostly cancel out each other and while comparing their magnitude to anthropogenic emissions is impressive, doing the same for their net effect gives a more realistic (and still important) idea why we should care about better constraining or even managing these fluxes.
L45: consider mentioning the 3 fluxes in a more instructive order and way, like: “GPP, respiration and consequently NEE…”
L60 ff.: Letter order in VRPM a typo? The rest of the paragraph from “Different approaches” is somewhat unclear. I guess you intend to mention past work by others that anyone could easily include to the new pyVPRM thanks to its modular structure, but it hasn’t been done yet?
Section 2
L78-70: The dropping of the class savanna raises 2 questions: First, while the title and start of this section suggest it is about how the model in general / in its traditional version / in its various other earlier implementations works, at this point (for the first time) the reader wonders if accidently you already stepped ahead to describe a recent implementation inside pyVPRM. Please clarify and also check if this also applies to other points in the rest of the section.
Second, while there may or may not be good reasons to drop the class savanna, increased resolution is not a sufficient one for all (maybe not even most) savannas. E.g. equally distributed trees with 16 m crown diameter and 30 m stem distance at 20 m resolution would cover no pixel with more than 50% tree and leave none of them tree-free either. I see it’s cumbersome to re-run the example if you cannot give a better reason, but maybe then you could stress it’s just an example and has nothing to do with limitations of the actual pyVPRM.
L91: Maybe add something like “in our convention”, because the usual one is that while NEE is negative if there is uptake, GPP is still always positive (i.e. NEE = R – GPP), which may be a bit of nuisance but still makes most sense given the words the abbreviations stand for, and their history in biology.
L120: What does the general assumption of grassland being xeric imply for the use of this vegetation class in the humid temperate zone (where we have lots of anthropogenic grasslands) and tundra ecosystems? Again (like dropping savanna), this is not super critical for the model description but maybe for the evaluation over Europe.
Section 3
Table 1: If Journal policy is as usual, remember to overwrite (not underwrite) tables.
Section 3.1.1: Since some descriptions before were about the specific examples in the paper rather than about inherent settings of pyVPRM, maybe continue to clarify here whether the chosen parameters for Eq. 10-12 are purely a part of the example, hardcoded, or (most likely?) something in between, e.g. default settings of functions the user can redefine in the function call.
L201: Describing eddy covariance footprints like this is a bit tricky, since various parameters (distance of peak weight from station, x% (e.g. 50 to 90) contour line maximum distance, or a similar percentage but for the crosswind-integrated footprint) can differ hugely from each other. ICOS downloads contain e.g. the distance where the cross-wind integrated footprint cumulative probability is 90% (variable FETCH_90), which for DE-RuS is on average (median) over time around 85 (87) m, which is larger (and intuitively better matching the flux community’s understsnding of typical footprint sizes) than your given dimension of 50 m but still less than the distance from station to any field edge. Alternatively, for a 2Dmap of the footprint climatology you could refer to the site labelling report (https://docs.google.com/document/d/1ksB7EZi7hOSJhpgCkDrSJzo2JBL2V174et9m7eqAPL4/edit?tab=t.0, p. 12) or maybe as a more citeable source to Fig. 2b in https://doi.org/10.5194/bg-21-2051-2024. In the following it might be worthwhile considering to use the actual crops which nicely support your hypothesis: In 2022 the field itself had potatoes, which are only planted in April, take time to grow and last until fall. The eastern and southeastern neighbouring fields in contrast had winter wheat, which was sown the fall before, starts photosynthesis early, and is typically harvested around in July. Unfortunately including this kind of information (at least for the field itself) in the ICOS metadata is still work in progress, but you could cite (if journal guidelines allow) “pers. comm. Schmidt 2025”. Marius Schmidt is the main site PI, gave me this information but can also be contacted directly. I am aware that the above information is too long for the paper, so please just regard it as an offering to pick from.
L210/Sect. 3.2-3.3: Briefly mentioning the fitting of the standard model with 4 independent parameters at the beginning of this section raises a number of clarity questions: Does in this case “the standard model” also apply to your example implementation, the results of which are shown at the end of 3.3 in Figure 4? Is the fitting reference and methodology used to generate Fig. 4 already the one only described later in section 4? If yes it creates a bit of an impression of jumping back and forth that can easily confuse readers, and might be resolved in different ways, all of which probably have their pros and cons: Reconsider the section order, abstain from already showing example results in Figure 4, use non-fitted default parameters for it, or (at least) insert some information at the start of 3.2 clarifying explicitly or implicitly that the fitting will be explained later.
L215/Table 4: It does not get clear to outsiders what the numbers in the table stand for – internal land cover codes of the two schemes?
L254/Fig. 5: Why does the URB fraction subpanel suggest 100% coverage at sea?
Section 4
L257 and also Eq. 9, “linear”: Note that process understanding suggests that the temperature dependence of respiration is far from linear. The chosen equation shape can probably be justified (especially since the Tlow clipping eliminates the worst possible effects) by reasons like simplicity, robustness or community tradition, but it should be avoided to use a wording that suggests to some readers the relation is actually linear.
L264: To avoid misunderstandings maybe it would be good to choose a wording making clear that not FLUXNET as such provides data only up to 2015, but rather the (currently still most up-to-date but hopefully replaced soon) FLUXNET2015 dataset.
L271: The part “which statistically determines vertical fluxes from turbulence (eddies) in the wind field” is a bit unclear in the sense that it does not give any helpful explanation to readers unfamiliar with the methods, and puts a strange focus on arbitrary aspects of the methods for those familiar. If you want to stay short, maybe end at the first comma with a citation (a textbook, early paper like https://doi.org/10.1175/1520-0469(1951)008<0135:TMOVTO>2.0.CO;2 , or influential paper like https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2). Or try something like “which determines vertical fluxes from turbulence-resolving (i.e. sub-second) measurements of the vertical wind component and the quantity of interest, in our case CO₂” (with or without citation).
L279: Is this selection somehow documented in a reproducible way? Or are all towers mentioned in the manuscript (Figure and table) already filtered for this criterion? Maybe more for your interest than relevant for the level of detail of the manuscript: All ICOS and some FLUXNET site datasets already include a filter where data not representative of the target land use (according to a footprint model run by the site PI in case of FLUXNET or centrally in case of ICOS) never enter the final gap-filling procedure. This may help (in the future) to also consider some more sites you now removed, if atypical land use is only found in a certain wind direction.
L286: It is a (maybe not common but repeatedly occurring) misconception that footprints as such get larger with increasing wind speed. Because the vertical and cross-wind turbulence intensity (and thus the dispersion of any upwind signal) scale with wind speed itself, a direct cause-and-effect relation used in footprint models only exists for quantities such as Obukhov length (a parameter including the effect of unstable or stable temperature stratification) or variance of vertical or (horizontal cross-) wind, but not for wind speed itself. A secondary (scattered, unreliable and differing between sites) relation of wind speed to footprint size may result from this, however not necessarily in the direction you assume (e.g. very calm nights are often associated with very stable stratification and thus very long footprints). Also, it is not documented how you did this prioritization (P.S.: It is at L296, but the reader cannot anticipate it here). That said, for most (except the very highest) towers your choice to use the smallest possible patch for the 500 m resolution EVI/LSWI sources, and 3 x 3 for Sentinel-2 probably is a good way to catch the most important part of the footprint. Luckily, the way PIs select their tower locations and your additional screening of sites (L279) will probably make the result pretty much insensitive to the exact choice of the smoothing (e.g. 60x60 vs. 120 x 120 m). For future implementations you might want to consider the footprint information given in some (at least all ICOS) time series along with the fluxes (see comment on L201). For the 20 m resolution also a “ring-like” average of just the outer 8 pixels could slightly improve results, since the station infrastructure itself (by its pure presence, unwanted vegetation damage or because a small clearing was chosen for the setup) might affect the EVI/LSWI readings for the central pixel, while its contribution to the flux measurement will be small due to the fact that the peak of the footprint is always some (deca) meters upwind of the station.
L296: If the only reason for the inverse wind speed scaling function is the one mentioned above, I would advocate to drop it.
L300-303: Very elegant and consistent solution, avoiding to effectively do the source partitioning modelling of NEE into GPP and R twice, with inconsistent parameter sets. A minor drawback might be that your four parameters are per vegetation class (and strongly simplified for respiration, see comment on L257) while the source partitioning in the provided EC flux dataset allow for (more realistic) locally varying parameters. Therefore a future sensitivity study on this choice (probably outside the scope of this paper) might still be interesting.
Section 5 to 7
L331: change yes, but (unlike all evidence presented so far) in this case I’m not sure whether for the better. It is a bit hard to “read” the continuous colour scale but it seems to suggest net annual losses of 1 kg m^-2 yr^-1 in some large regions with very low soil and biomass carbon stocks. Especially if it is kg C and not kg CO2 (please clarify) and goes on for several consecutive years (which we cannot know from the figure), this is hardly possible (both from the typical range of actually measured annual fluxes, and theoretically from the flux to stock ratio). It should be followed up in the discussion section how realistic these large net respiration events are. One possible reason may be the fitting of parameters per vegetation class independent of region or climate, which may have been partly compensated by the discussed errors in the old model version.
Figure 9: Check the “green” and “white” at the end of the caption. To me they look blue vs. yellow, and if matching blue with green and white with yellow it gives a meaning opposite to the legend within the figure.
L444-445: “describing different the categories”: check word order
Code and data availability, references, acknowledgements
L471: In the original submission (if I didn’t overlook anything) neither the acknowledgement, nor the references or a separate dataset citation section (many journals want it in both) acknowledge the data contribution of the FLUXNET and ICOS site PIs. For ICOS there is a good description how to do it best at https://www.icos-cp.eu/how-to-cite, which might also serve as a blueprint for the used non-ICOS FLUXNET sites. Most of these guys are not officially paid to produce the data, but scientists sacrificing a considerable amount of their time to make them available to the community, and continuation critically depends on traceability of their efforts. P.S.: In reaction to a slightly different comment by the editor on the journal policy compliance, the authors now posted a code and data availability section which briefly mentions DOIs to both datasets but does not automatically address all of the above (which of course they couldn’t know at the time)
Appendix
Table C1-C7: It did not get clear to me why only one year per site was chosen and why used sites or site-years can differ between the VPRM models for the different satellite products. If I didn’t overlook it earlier in the methods section, please clarify. Given that the height but not the width of the tables challenge the page dimensions and a lot of information is redundant between the tables, would it be an option to present only one (two page long) table with more columns, one for each of MODIS, Sentinel-2 and VIIRS?

Citation: https://doi.org/10.5194/egusphere-2024-3692-RC1
- AC3: 'Reply on RC1', Theo Glauch, 27 Mar 2025
  
  Thanks for reviewing our paper and providing such useful and detailed feedback. This was very helpful. You find our responses below. The original questions are in boldface and our answers in italic.
  
  The paper describes a new version of the previously published VPRM model (which estimates biogenic CO2 exchange for large land surfaces in gridded form after fitting parameters that relate proxies from satellite remote sensing and meteo models to CO2 exchange as measured by reference stations). The most important updates are with respect to the remote-sensing data sources that can be handled, in particular their spatial resolution, the user friendliness (modular, public python package) and parameter fitting procedure.
  The paper addresses a relevant scientific modelling question within the scope of GMD with novel concepts and data. The presented example results demonstrate a substantial advance, especially in the realistic representation of crop ecosystems, which particularly suffered from mismatch between remote sensing pixels and station footprints due to strong differences in the sowing and senescence / harvest dates of crops between adjacent fields.
  The language is fluent and clear almost everywhere and the figures are mostly of high quality. Together with the published open-source code and the appendix, the should allow for reproduction of the results.
  There are a few issues: The structure of the paper is at some points a bit arbitrary, creating an impression of jumping back and forth that may confuse readers. Some knowledge of the remote sensing and modelling community is taken for granted, while some descriptions of the eddy-covariance measurements used for the fitting are imprecise or misleading; in one case (relation between wind speed and footprint size) this may have led to a poor decision in data treatment. A regionally suspiciously large annual net respiration in the results is not discussed. These issues are described in detail below, together with minor needs or suggestions for improvement in particular sentences, figures or tables.
  I think that the paper can provide a very helpful reference and guideline for the many usage cases the new model will experience, once these issues are resolved. While some thoughts, future suggestions and questions on the analysis itself are included below, it is probably sufficient to revise the clarity of its description and adequate discussion of problems in the paper.
  L28: A word like “occur” seems missing after “Earth system”
  The grammar of the sentence was indeed a bit confusing. We changed the beginning to “Carbon cycling occurs…”.
  L34: Before or after the last sentence of the 1st paragraph, consider also setting the net flux (already mentioned before) into relation to anthropogenic fluxes. GPP and respiration mostly cancel out each other and while comparing their magnitude to anthropogenic emissions is impressive, doing the same for their net effect gives a more realistic (and still important) idea why we should care about better constraining or even managing these fluxes.
  We’ve added a sentence on the net flux at the very end of the paragraph. The comparison between GPP and anthropogenic emissions is the important one because those are the fluxes that dominate the observed CO2 concentration on short timescales, as usually considered by atmospheric transport models.
  “Over longer time scales GPP and respiration are nearly in balance with a net carbon sink of around 3 GtC per year ( Friedlingstein et al., 2023)” (Line 35f)
  L45: consider mentioning the 3 fluxes in a more instructive order and way, like: “GPP, respiration and consequently NEE…”
  Done.
  L60 ff.: Letter order in VRPM a typo? The rest of the paragraph from “Different approaches” is somewhat unclear. I guess you intend to mention past work by others that anyone could easily include to the new pyVPRM thanks to its modular structure, but it hasn’t been done yet?
  Yes. That’s a typo! Well caught. Thanks. Your interpretation of the paragraph is correct. We have adjusted the text to be concise.
  L78-70: The dropping of the class savanna raises 2 questions: First, while the title and start of this section suggest it is about how the model in general / in its traditional version / in its various other earlier implementations works, at this point (for the first time) the reader wonders if accidently you already stepped ahead to describe a recent implementation inside pyVPRM. Please clarify and also check if this also applies to other points in the rest of the section.
  Second, while there may or may not be good reasons to drop the class savanna, increased resolution is not a sufficient one for all (maybe not even most) savannas. E.g. equally distributed trees with 16 m crown diameter and 30 m stem distance at 20 m resolution would cover no pixel with more than 50% tree and leave none of them tree-free either. I see it’s cumbersome to re-run the example if you cannot give a better reason, but maybe then you could stress it’s just an example and has nothing to do with limitations of the actual pyVPRM.
  You are right. We have decided to restructure the text a little bit. We have removed the discussion on changes in the land cover class from Sec. 2 and instead moved it to Sect. 3.2 where we discuss the default land cover products of pyVPRM. Removing savanna is also motivated by the fact that this is not a dedicated class in the Copernicus Land Cover and the ESA WorldCover. If savanna is needed for a region, land cover maps containing this land cover class should be used.
  L91: Maybe add something like “in our convention”, because the usual one is that while NEE is negative if there is uptake, GPP is still always positive (i.e. NEE = R – GPP), which may be a bit of nuisance but still makes most sense given the words the abbreviations stand for, and their history in biology.
  We have removed that sentence, because we indeed use the convention you are suggesting, i.e. NEE = R – GPP.
  L120: What does the general assumption of grassland being xeric imply for the use of this vegetation class in the humid temperate zone (where we have lots of anthropogenic grasslands) and tundra ecosystems? Again (like dropping savanna), this is not super critical for the model description but maybe for the evaluation over Europe.
  Grassland is a complicated vegetation class, because it covers a wide range of ecosystems: managed and non-managed, xeric and non-xeric, high- and low-altitude. All this is relevant in Europe. In the present paper we have decided to comply with the current implementation in WRF with all its caveats. Without being able to resolve this within the scope of this paper we have added a few sentences about those uncertainties in the discussion of the European flux results in Sect. 6.
  “Finally, grasslands pose a special challenge as they can be managed/unmanaged, xeric/non-xeric and high/low altitude. Most European flux tower sites currently measure managed grasslands in central Europe. Adding flux towers in the dry (Mediterranean) and polar (Scandinavian) regions could be used to calculate VPRM parameters for those different types for grasslands.“ (Line 410 ff)
  Table 1: If Journal policy is as usual, remember to overwrite (not underwrite) tables.
  Done.
  Section 3.1.1: Since some descriptions before were about the specific examples in the paper rather than about inherent settings of pyVPRM, maybe continue to clarify here whether the chosen parameters for Eq. 10-12 are purely a part of the example, hardcoded, or (most likely?) something in between, e.g. default settings of functions the user can redefine in the function call.
  Done. The paragraph now reads:
  “In pyVPRM, both EVI and LSWI are calculated within the VPRM preprocessor class (see Fig. 1 whenever a new satellite image is added to the instance (using the add_sat_img(.) function). The implementation of pyVPRM allows the user to adjust the free parameters in the function call or even add an entirely different implementation of satellite indices.” (Line 175ff)
  L201: Describing eddy covariance footprints like this is a bit tricky, since various parameters (distance of peak weight from station, x% (e.g. 50 to 90) contour line maximum distance, or a similar percentage but for the crosswind-integrated footprint) can differ hugely from each other. ICOS downloads contain e.g. the distance where the cross-wind integrated footprint cumulative probability is 90% (variable FETCH_90), which for DE-RuS is on average (median) over time around 85 (87) m, which is larger (and intuitively better matching the flux community’s understsnding of typical footprint sizes) than your given dimension of 50 m but still less than the distance from station to any field edge. Alternatively, for a 2Dmap of the footprint climatology you could refer to the site labelling report (https://docs.google.com/document/d/1ksB7EZi7hOSJhpgCkDrSJzo2JBL2V174et9m7eqAPL4/edit?tab=t.0, p. 12) or maybe as a more citeable source to Fig. 2b in https://doi.org/10.5194/bg-21-2051-2024. In the following it might be worthwhile considering to use the actual crops which nicely support your hypothesis: In 2022 the field itself had potatoes, which are only planted in April, take time to grow and last until fall. The eastern and southeastern neighbouring fields in contrast had winter wheat, which was sown the fall before, starts photosynthesis early, and is typically harvested around in July. Unfortunately including this kind of information (at least for the field itself) in the ICOS metadata is still work in progress, but you could cite (if journal guidelines allow) “pers. comm. Schmidt 2025”. Marius Schmidt is the main site PI, gave me this information but can also be contacted directly. I am aware that the above information is too long for the paper, so please just regard it as an offering to pick from.
  Thanks a lot for the detailed information! We have updated the discussion on the footprint following your suggestions and also added information on the crop types as it provides valuable information to interpret the plots. As a citation we added the personal communication. From the journal guidelines we didn’t see any reason why this shouldn’t be possible.
  L210/Sect. 3.2-3.3: Briefly mentioning the fitting of the standard model with 4 independent parameters at the beginning of this section raises a number of clarity questions: Does in this case “the standard model” also apply to your example implementation, the results of which are shown at the end of 3.3 in Figure 4? Is the fitting reference and methodology used to generate Fig. 4 already the one only described later in section 4? If yes it creates a bit of an impression of jumping back and forth that can easily confuse readers, and might be resolved in different ways, all of which probably have their pros and cons: Reconsider the section order, abstain from already showing example results in Figure 4, use non-fitted default parameters for it, or (at least) insert some information at the start of 3.2 clarifying explicitly or implicitly that the fitting will be explained later.
  Well noticed. We have tried to clarify this by a) putting a statement at the beginning of Sect. 2 that the standard VPRM model is the one that is used throughout this paper and b) by referencing Sect.2 again at the beginning of Sect 3.2. In order to avoid confusion about Figure 4 we have moved the figure and the corresponding text to the “Results” section, after showing the VPRM parameters for Sentinel-2.
  L215/Table 4: It does not get clear to outsiders what the numbers in the table stand for – internal land cover codes of the two schemes?
  Your interpretation is correct. We have added a sentence to the caption to clarify this for the reader.
  “The numbers represent the internal land cover codes of the two data products.”
  L254/Fig. 5: Why does the URB fraction subpanel suggest 100% coverage at sea?
  This is explained in the caption. URB covers not only urban areas but all non-vegetated surfaces.
  L257 and also Eq. 9, “linear”: Note that process understanding suggests that the temperature dependence of respiration is far from linear. The chosen equation shape can probably be justified (especially since the Tlow clipping eliminates the worst possible effects) by reasons like simplicity, robustness or community tradition, but it should be avoided to use a wording that suggests to some readers the relation is actually linear.
  True. We also discuss this in line 375ff. We have also rephrased the sentence in L257 to clarify that this is a simplified model.
  L264: To avoid misunderstandings maybe it would be good to choose a wording making clear that not FLUXNET as such provides data only up to 2015, but rather the (currently still most up-to-date but hopefully replaced soon) FLUXNET2015 dataset.
  Done.
  L271: The part “which statistically determines vertical fluxes from turbulence (eddies) in the wind field” is a bit unclear in the sense that it does not give any helpful explanation to readers unfamiliar with the methods, and puts a strange focus on arbitrary aspects of the methods for those familiar. If you want to stay short, maybe end at the first comma with a citation (a textbook, early paper like https://doi.org/10.1175/1520-0469(1951)008<0135:TMOVTO>2.0.CO;2 , or influential paper like https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2). Or try something like “which determines vertical fluxes from turbulence-resolving (i.e. sub-second) measurements of the vertical wind component and the quantity of interest, in our case CO2” (with or without citation).
  Thanks! We have used your suggestion to replace our description of the eddy covariance method. We have also added the references for completeness.
  L279: Is this selection somehow documented in a reproducible way? Or are all towers mentioned in the manuscript (Figure and table) already filtered for this criterion? Maybe more for your interest than relevant for the level of detail of the manuscript: All ICOS and some FLUXNET site datasets already include a filter where data not representative of the target land use (according to a footprint model run by the site PI in case of FLUXNET or centrally in case of ICOS) never enter the final gap-filling procedure. This may help (in the future) to also consider some more sites you now removed, if atypical land use is only found in a certain wind direction.
  Yes, all sites mentioned in this manuscript are filtered for this criterion. We are currently working on integrating a time-dependent flux-tower-footprint-model into the parameter estimation. This should help to overcome those problems in the future.
  L286: It is a (maybe not common but repeatedly occurring) misconception that footprints as such get larger with increasing wind speed. Because the vertical and cross-wind turbulence intensity (and thus the dispersion of any upwind signal) scale with wind speed itself, a direct cause-and-effect relation used in footprint models only exists for quantities such as Obukhov length (a parameter including the effect of unstable or stable temperature stratification) or variance of vertical or (horizontal cross-) wind, but not for wind speed itself. A secondary (scattered, unreliable and differing between sites) relation of wind speed to footprint size may result from this, however not necessarily in the direction you assume (e.g. very calm nights are often associated with very stable stratification and thus very long footprints). Also, it is not documented how you did this prioritization (P.S.: It is at L296, but the reader cannot anticipate it here). That said, for most (except the very highest) towers your choice to use the smallest possible patch for the 500 m resolution EVI/LSWI sources, and 3 x 3 for Sentinel-2 probably is a good way to catch the most important part of the footprint. Luckily, the way PIs select their tower locations and your additional screening of sites (L279) will probably make the result pretty much insensitive to the exact choice of the smoothing (e.g. 60x60 vs. 120 x 120 m). For future implementations you might want to consider the footprint information given in some (at least all ICOS) time series along with the fluxes (see comment on L201). For the 20 m resolution also a “ring-like” average of just the outer 8 pixels could slightly improve results, since the station infrastructure itself (by its pure presence, unwanted vegetation damage or because a small clearing was chosen for the setup) might affect the EVI/LSWI readings for the central pixel, while its contribution to the flux measurement will be small due to the fact that the peak of the footprint is always some (deca) meters upwind of the station.
  Thank you for the detailed discussion on the challenges of incorporating footprint knowledge. For the present paper our main goal was to resolve major problems with the footprint in previous versions of VPRM. It is absolutely true that the new solution also has caveats. We are currently working on including a (simplified) footprint model into the parameter estimation. This way we can select satellite pixels (especially for Sentinel-2 ) on an hourly scale depending on the footprint’s distribution. Ultimately, this would avoid confusion in inhomogeneous landscapes and therefore provide a more reliable estimate of the VPRM parameters. This is a major change to the current way of implementation and therefore not scope of this paper.
  L296: If the only reason for the inverse wind speed scaling function is the one mentioned above, I would advocate to drop it.
  We follow the reviewers suggestion to drop the (inverse) wind speed weighting as it is indeed not doing what we were hoping. A first attempt to replace the wind-speed weighting with a filter on the footprint size is not feasible as FLUXNET does not provide the FETCH_90 variable. Hence, we have decided to remove the weighting entirely. The impact on the VPRM parameters is, however, minor. We have adjusted the parameter table accordingly and removed the wind-speed filtering from the abstract, section 4.1, 4.2 and 4.3. For the future it remains the goal to incorporate a footprint model to overcome the current limitations.
  L300-303: Very elegant and consistent solution, avoiding to effectively do the source partitioning modelling of NEE into GPP and R twice, with inconsistent parameter sets. A minor drawback might be that your four parameters are per vegetation class (and strongly simplified for respiration, see comment on L257) while the source partitioning in the provided EC flux dataset allow for (more realistic) locally varying parameters. Therefore a future sensitivity study on this choice (probably outside the scope of this paper) might still be interesting.
  Thanks for the comment. We are currently working on a paper that discusses the various uncertainties in the VPRM approach. This would be a nice location for such a sensitivity study.
  L331: change yes, but (unlike all evidence presented so far) in this case I’m not sure whether for the better. It is a bit hard to “read” the continuous colour scale but it seems to suggest net annual losses of 1 kg m-2 yr-1 in some large regions with very low soil and biomass carbon stocks. Especially if it is kg C and not kg CO2 (please clarify) and goes on for several consecutive years (which we cannot know from the figure), this is hardly possible (both from the typical range of actually measured annual fluxes, and theoretically from the flux to stock ratio). It should be followed up in the discussion section how realistic these large net respiration events are. One possible reason may be the fitting of parameters per vegetation class independent of region or climate, which may have been partly compensated by the discussed errors in the old model version.
  We believe the paper has shown good evidence that the new model is better on site level. Still, it is true that uncertainty increases highly outside the regions and climates of the fitting data. We have added more discussion to the already existing one along the lines you suggest. Also it is true that the unit in the plot was not complete - it is kgC/(yr m^2).
  Figure 9: Check the “green” and “white” at the end of the caption. To me they look blue vs. yellow, and if matching blue with green and white with yellow it gives a meaning opposite to the legend within the figure.
  Well spotted. We have corrected this mistake.
  L444-445: “describing different the categories”: check word order
  The sentence was indeed misleading. It now reads “Configuration (yaml) files defining the VPRM vegetation classes for different land cover maps are found in pyVPRM/vprm_configs and are required to map the land cover categories to VPRM classes.”
  L471: In the original submission (if I didn’t overlook anything) neither the acknowledgement, nor the references or a separate dataset citation section (many journals want it in both) acknowledge the data contribution of the FLUXNET and ICOS site PIs. For ICOS there is a good description how to do it best at https://www.icos-cp.eu/how-to-cite, which might also serve as a blueprint for the used non-ICOS FLUXNET sites. Most of these guys are not officially paid to produce the data, but scientists sacrificing a considerable amount of their time to make them available to the community, and continuation critically depends on traceability of their efforts. P.S.: In reaction to a slightly different comment by the editor on the journal policy compliance, the authors now posted a code and data availability section which briefly mentions DOIs to both datasets but does not automatically address all of the above (which of course they couldn’t know at the time)
  We have added FLUXNET and ICOS to the acknowledgement and adjusted the data availability section to comply with the ICOS standards. Both datasets are also referenced within the main text.
  Table C1-C7: It did not get clear to me why only one year per site was chosen and why used sites or site-years can differ between the VPRM models for the different satellite products. If I didn’t overlook it earlier in the methods section, please clarify. Given that the height but not the width of the tables challenge the page dimensions and a lot of information is redundant between the tables, would it be an option to present only one (two page long) table with more columns, one for each of MODIS, Sentinel-2 and VIIRS?
  Good suggestion. We have adapted the table accordingly. In addition we have added two sentences in the main text to explain the background of the data selection. “To be consistent with previous VPRM versions (Gerbig and Koch, 2021) we use only one year of flux tower data per site. The year is selected as the year within the data acquisition period of the respective satellite with the maximum amount of available flux tower data.” (Line 284f)
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC3
RC2:
'Comment on egusphere-2024-3692', Anonymous Referee #2, 06 Mar 2025
This is a well-written and clear article that presents a promising contribution to Geoscientific Model Development. The study is compelling, and I believe it is well-suited for publication in this journal. Before I can recommend acceptance, I would appreciate clarification on a few points, mostly related to curiosities and minor refinements.
Curiosities:
Impact of temporal resolution when transitioning from MODIS to Sentinel-2:

You currently use 8-day MODIS products, while Sentinel-2 has a slightly longer revisit cycle. How does this difference affect the lowess smoothing? While you briefly mention this at the end of Section 3.1, I believe the issue deserves a bit more discussion.

Source of shortwave (SW) radiation for PAR calculation:

You mention in a footnote that this comes from ERA-5 reanalysis, but this point is somewhat understated. It would be beneficial to explicitly describe this in the main text.

Stratifying VPRM beyond broad forest types:

Given recent advances in AI and remote sensing for characterizing canopy height texture (e.g., https://gmd.copernicus.org/articles/18/337/2025/gmd-18-337-2025.html), would it make sense in the future to refine VPRM parameterization beyond the current broad categories (evergreen, deciduous, and mixed forests)?

Use of Sentinel-1 for cloud gap-filling:

With growing research on using Sentinel-1 to interpolate missing Sentinel-2 data, have you considered incorporating S1 to address cloud cover issues? This could potentially make your smoothing better and avoid part of the issues evoked at the end of section 3.1.

Alternative to filtering for low wind speed periods:

Could recent methods for separating advection (large-scale) from turbulence (e.g., https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3243/) provide an alternative to prioritizing time periods with relatively low wind speeds? I mean, if you can always remove the advection, you do not need to filter out these periods, which represent quite an important part of your calibration datasets, I guess?

Missing results for deciduous and mixed forests (Fig. 8):

Given that Fig. 6 indicates flux tower sites in the deciduous and mixed forest categories, is there a reason why results for these categories are not also shown in Fig. 8? It would be great to be able to see that.

Source of EVI and LSWI in Section 3.4:

The section mentions using forecasted temperature and SW data, but what method is used to generate EVI and LSWI? Is there an extrapolation process involved?

Minor comments:
L42: Add a reference for the bottom-up approach.

P44: It would improve coherence to briefly mention similar models (e.g., TBMs relying on remote sensing) before introducing the specific model you are improving.

P60: The discussion on approaches for improving VPRM feels slightly out of place here. Wouldn’t it fit better earlier in the introduction, before detailing your proposed improvements?

L91: Where is the negative sign?

L109: Strengthen the link between Equations 6 and 7 by explicitly referencing TH_full_leaf_expansion in Equation 6.

L146: The phrasing is slightly awkward - please consider rewording.

L238: There is an inconsistency between Equations 9 and 13 - why is the respiration equation parameter β replaced with δ?

L309: “This question will be investigated in further studies.” I’m looking forward to this! This is an exciting question from a remote sensing perspective.

L384: “In the future, the parameter estimation can be further improved by taking into account the (time-dependent) footprint of the eddy-covariance towers.” Could you rephrase please?
Citation: https://doi.org/10.5194/egusphere-2024-3692-RC2
- AC2: 'Reply on RC2', Theo Glauch, 27 Mar 2025
  
  Thanks a lot for the feedback and all the valuable comments. Our answers are below in italic. The line numbers refer to the line numbers of the revised manuscript.
  This is a well-written and clear article that presents a promising contribution to Geoscientific Model Development. The study is compelling, and I believe it is well-suited for publication in this journal. Before I can recommend acceptance, I would appreciate clarification on a few points, mostly related to curiosities and minor refinements.
  Curiosities:
  Impact of temporal resolution when transitioning from MODIS to Sentinel-2:
  
  You currently use 8-day MODIS products, while Sentinel-2 has a slightly longer revisit cycle. How does this difference affect the lowess smoothing? While you briefly mention this at the end of Section 3.1, I believe the issue deserves a bit more discussion.
  In fact we are using daily MODIS products, but the usage of 8-day products is also possible in pyVPRM. We have clarified this in the text at the end of section 3.1. Moreover we have added a small study at three very homogeneous test sites to illustrate the difference between the Sentinel-2 and MODIS curves (Figure 3 in the updated manuscript). They are overall pretty compatible. Sentinel-2 shows a bit more instability. Those fluctuations are, however, much smaller than the absolute values of EVI and LSWI.
  Source of shortwave (SW) radiation for PAR calculation:
  
  You mention in a footnote that this comes from ERA-5 reanalysis, but this point is somewhat understated. It would be beneficial to explicitly describe this in the main text.
  We have moved the footnote to the main text with slight adjustments.
  Stratifying VPRM beyond broad forest types:
  
  Given recent advances in AI and remote sensing for characterizing canopy height texture (e.g., https://gmd.copernicus.org/articles/18/337/2025/gmd-18-337-2025.html), would it make sense in the future to refine VPRM parameterization beyond the current broad categories (evergreen, deciduous, and mixed forests)?
  Yes, adding more (homogeneous) classes in the future is certainly valuable. The major limitation comes from the amount of eddy covariance for fitting VPRM parameters. With the growing network of stations the VPRM classes can be revised in the future.
  Use of Sentinel-1 for cloud gap-filling:
  
  With growing research on using Sentinel-1 to interpolate missing Sentinel-2 data, have you considered incorporating S1 to address cloud cover issues? This could potentially make your smoothing better and avoid part of the issues evoked at the end of section 3.1.
  Yes we are in contact with research groups that work on hybrid satellite data products like combined Sentinel-2 and Sentinel-3 or Landsat data. Since the pyVPRM framework is implemented in a way that allows different satellite products to be easily exchanged or added, such data products can be used in the near future.
  Alternative to filtering for low wind speed periods:
  
  Could recent methods for separating advection (large-scale) from turbulence (e.g., https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3243/) provide an alternative to prioritizing time periods with relatively low wind speeds? I mean, if you can always remove the advection, you do not need to filter out these periods, which represent quite an important part of your calibration datasets, I guess?
  In response to the questions of reviewer 1 we have removed the wind speed filtering as it is not working exactly the way we were expecting.
  In general the best approach would be to use a (semi-analytical) model for the tower footprint to get the optimal match between footprint and satellite observation. We are already working on this, but it’s beyond the scope of this paper. The method you are suggesting could be an alternative approach to that.
  Missing results for deciduous and mixed forests (Fig. 8):
  
  Given that Fig. 6 indicates flux tower sites in the deciduous and mixed forest categories, is there a reason why results for these categories are not also shown in Fig. 8? It would be great to be able to see that.
  We have added the same plots for deciduous and mixed forest to the appendix and refer to them in the text (Fig G1)
  Source of EVI and LSWI in Section 3.4:
  
  The section mentions using forecasted temperature and SW data, but what method is used to generate EVI and LSWI? Is there an extrapolation process involved
  This was misleading. We don’t use VPRM to generate forecasts, although it would probably be possible for a few days in advance since vegetation phenology is only evolving slowly (https://gmd.copernicus.org/articles/13/4091/2020/gmd-13-4091-2020.pdf). What was meant here, is that VPRM is driven by the internal meteorology of the atmospheric model. In other words, VPRM runs online coupled to the meteorology of the numerical weather prediction model.
  Minor comments:
  L42: Add a reference for the bottom-up approach.
  We have added a reference that discusses the differences between top-down and bottom-up approach ESSD - Global anthropogenic CO2 emissions and uncertainties as a prior for Earth system modelling and data assimilation
  P44: It would improve coherence to briefly mention similar models (e.g., TBMs relying on remote sensing) before introducing the specific model you are improving.
  We have added a sentence and a number of references for that.
  “There are several TBMs that utilize remote-sensing data to locally estimate the plant and water dynamics (Nelson et al., 2024; Gerbig and Koch, 2021; Randerson et al., 1996; Running and Zhao, 2021).” (Line 45f)
  P60: The discussion on approaches for improving VPRM feels slightly out of place here. Wouldn’t it fit better earlier in the introduction, before detailing your proposed improvements?
  Okay. We’ve moved it below the paragraph where we discuss remote-sensing-based TBMs and VPRM.
  L91: Where is the negative sign?
  Well spotted. We’ve removed this sentence. It was a leftover from a different sign convention.
  L109: Strengthen the link between Equations 6 and 7 by explicitly referencing TH_full_leaf_expansion in Equation 6.
  Done.
  L146: The phrasing is slightly awkward - please consider rewording.
  Done. Rephrased to: “In its current implementation pyVPRM provides an interface for satellite data products from MODIS, VIIRS, and Sentinel-2. Due to the modular structure, pyVPRM can be extended to other missions (e.g. Landsat) or fusion products of different satellite missions, e.g. .Moreno-Martínez et al. (2020).” (Line 148ff)
  L238: There is an inconsistency between Equations 9 and 13 - why is the respiration equation parameter β replaced with δ?
  This was an error, thanks for spotting it. We have corrected it..
  L309: “This question will be investigated in further studies.” I’m looking forward to this! This is an exciting question from a remote sensing perspective.
  Agreed. It’d be especially interesting to know if there is a way to calibrate the EVIs and LSWI against each other. That’d allow to fit the model on the high resolution Sentinel-2 data which better matches the station observation and use it with the lower resolution MODIS/VIIRS data. (Line 417 ff)
  L384: “In the future, the parameter estimation can be further improved by taking into account the (time-dependent) footprint of the eddy-covariance towers.” Could you rephrase please?
  We have rephrased the sentence (together with the preceding one) to “Especially for heterogeneous land cover types such as croplands and grasslands, the new VPRM parameters enhance the coherence between the model and flux tower data. However, the current VPRM parameter estimation still has limitations. Notably, it does not account for the time-dependent footprint of the flux towers, which remains a key area for improvement in future research.” (Line 416 ff)
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC2

Interactive discussion

Status: closed

CEC1:
'Comment on egusphere-2024-3692 - No compliance with the policy of the journal', Juan Antonio Añel, 12 Feb 2025

Dear authors,

Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".

https://www.geoscientific-model-development.net/policies/code_and_data_policy.html

You have archived your code on GitHub. However, GitHub is not a suitable repository for scientific publication. GitHub itself instructs authors to use other long-term archival and publishing alternatives, such as Zenodo. Moreover, this is clearly stated in our policy, which we would have expected you read before submitting your manuscript. Therefore, the current situation with your manuscript is irregular. Please, publish your code in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible, as we can not accept manuscripts in Discussions that do not comply with our policy.

Also, you do not provide permanent repositories for the data that you use, but website links, and in some cases they are simple links to main portals, not the specific data files that you use. This does not comply with our policy either. Please provide repositories (links and DOIs) containing the specific files that you use in your manuscript.

Note that you must include a modified 'Code and Data Availability' section in a potentially reviewed manuscript containing the DOI and links of the new repositories.

Finally, if you do not fix this problem, we will have to reject your manuscript for publication in our journal.
Juan A. Añel

Geosci. Model Dev. Executive Editor

Citation: https://doi.org/10.5194/egusphere-2024-3692-CEC1
- AC1: 'Reply on CEC1', Theo Glauch, 26 Feb 2025
  
  Dear editors,
  on behalf of the authors I want to apologize for the problems with the "Code and data availability" sections. Indeed, this slipped through in our final revisions. I'm posting the corrected version of the section below.
  All the best,
  Theo Glauch
  --------------------------
  The current version of pyVPRM is available from the project website: https://github.com/tglauch/pyVPRM under the MIT licence. The exact version of the model used to produce the results used in this paper is archived on Zenodo (https://doi.org/10.5281/zenodo.14216613 (v3.0)). Data processing scripts are specific to the DLR Terrabyte cluster that has been used to calculate the results. They are provided on request. The flux tower data can be retrieved through https://doi.org/10.18160/06dk-vryh (ICOS) and https://doi.org/10.1038/s41597-020-0534-3 (FLUXNET). The MODIS and VIIRS satellite datasets are publicly available through LP DAAC: MOD09GA (https://doi.org/10.5067/modis/mod09ga.061), MYD09GA (https://doi.org/10.5067/modis/myd09ga.061), MOD09A1 (https://doi.org/10.5067/modis/mod09a1.061), MYD09A1 (https://doi.org/10.5067/modis/myd09a1.061), VNP09GA (https://doi.org/10.5067/viirs/vnp09ga.002), VNP09H1 (https://doi.org/10.5067/viirs/vnp09h1.002). Sentinel-2 Collection 1 Level-2A data are publicly distributed through the Copernicus Data Space Ecosystem (http://dx.doi.org/10.5270/S2_-znk9xsj). The land cover maps are accessible through https://doi.org/10.5281/zenodo.3939049 (Copernicus Land Cover Service), https://doi.org/10.5281/zenodo.7254220 (ESA World Cover) and https://doi.org/10.1016/j.rse.2006.01.020 (SYNMAP).
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC1
RC1:
'Comment on egusphere-2024-3692', Anonymous Referee #1, 05 Mar 2025

Review on „pyVPRM: A next-generation Vegetation Photosynthesis and Respiration Model for the post-MODIS era” by Theo Glauch et al. submitted to Geoscientific Model development
The paper describes a new version of the previously published VPRM model (which estimates biogenic CO₂ exchange for large land surfaces in gridded form after fitting parameters that relate proxies from satellite remote sensing and meteo models to CO₂ exchange as measured by reference stations). The most important updates are with respect to the remote-sensing data sources that can be handled, in particular their spatial resolution, the user friendliness (modular, public python package) and parameter fitting procedure.
The paper addresses a relevant scientific modelling question within the scope of GMD with novel concepts and data. The presented example results demonstrate a substantial advance, especially in the realistic representation of crop ecosystems, which particularly suffered from mismatch between remote sensing pixels and station footprints due to strong differences in the sowing and senescence / harvest dates of crops between adjacent fields.
The language is fluent and clear almost everywhere and the figures are mostly of high quality. Together with the published open-source code and the appendix, the should allow for reproduction of the results.
There are a few issues: The structure of the paper is at some points a bit arbitrary, creating an impression of jumping back and forth that may confuse readers. Some knowledge of the remote sensing and modelling community is taken for granted, while some descriptions of the eddy-covariance measurements used for the fitting are imprecise or misleading; in one case (relation between wind speed and footprint size) this may have led to a poor decision in data treatment. A regionally suspiciously large annual net respiration in the results is not discussed. These issues are described in detail below, together with minor needs or suggestions for improvement in particular sentences, figures or tables.
I think that the paper can provide a very helpful reference and guideline for the many usage cases the new model will experience, once these issues are resolved. While some thoughts, future suggestions and questions on the analysis itself are included below, it is probably sufficient to revise the clarity of its description and adequate discussion of problems in the paper.
Introduction
L28: A word like “occur” seems missing after “Earth system”
L34: Before or after the last sentence of the 1^st paragraph, consider also setting the net flux (already mentioned before) into relation to anthropogenic fluxes. GPP and respiration mostly cancel out each other and while comparing their magnitude to anthropogenic emissions is impressive, doing the same for their net effect gives a more realistic (and still important) idea why we should care about better constraining or even managing these fluxes.
L45: consider mentioning the 3 fluxes in a more instructive order and way, like: “GPP, respiration and consequently NEE…”
L60 ff.: Letter order in VRPM a typo? The rest of the paragraph from “Different approaches” is somewhat unclear. I guess you intend to mention past work by others that anyone could easily include to the new pyVPRM thanks to its modular structure, but it hasn’t been done yet?
Section 2
L78-70: The dropping of the class savanna raises 2 questions: First, while the title and start of this section suggest it is about how the model in general / in its traditional version / in its various other earlier implementations works, at this point (for the first time) the reader wonders if accidently you already stepped ahead to describe a recent implementation inside pyVPRM. Please clarify and also check if this also applies to other points in the rest of the section.
Second, while there may or may not be good reasons to drop the class savanna, increased resolution is not a sufficient one for all (maybe not even most) savannas. E.g. equally distributed trees with 16 m crown diameter and 30 m stem distance at 20 m resolution would cover no pixel with more than 50% tree and leave none of them tree-free either. I see it’s cumbersome to re-run the example if you cannot give a better reason, but maybe then you could stress it’s just an example and has nothing to do with limitations of the actual pyVPRM.
L91: Maybe add something like “in our convention”, because the usual one is that while NEE is negative if there is uptake, GPP is still always positive (i.e. NEE = R – GPP), which may be a bit of nuisance but still makes most sense given the words the abbreviations stand for, and their history in biology.
L120: What does the general assumption of grassland being xeric imply for the use of this vegetation class in the humid temperate zone (where we have lots of anthropogenic grasslands) and tundra ecosystems? Again (like dropping savanna), this is not super critical for the model description but maybe for the evaluation over Europe.
Section 3
Table 1: If Journal policy is as usual, remember to overwrite (not underwrite) tables.
Section 3.1.1: Since some descriptions before were about the specific examples in the paper rather than about inherent settings of pyVPRM, maybe continue to clarify here whether the chosen parameters for Eq. 10-12 are purely a part of the example, hardcoded, or (most likely?) something in between, e.g. default settings of functions the user can redefine in the function call.
L201: Describing eddy covariance footprints like this is a bit tricky, since various parameters (distance of peak weight from station, x% (e.g. 50 to 90) contour line maximum distance, or a similar percentage but for the crosswind-integrated footprint) can differ hugely from each other. ICOS downloads contain e.g. the distance where the cross-wind integrated footprint cumulative probability is 90% (variable FETCH_90), which for DE-RuS is on average (median) over time around 85 (87) m, which is larger (and intuitively better matching the flux community’s understsnding of typical footprint sizes) than your given dimension of 50 m but still less than the distance from station to any field edge. Alternatively, for a 2Dmap of the footprint climatology you could refer to the site labelling report (https://docs.google.com/document/d/1ksB7EZi7hOSJhpgCkDrSJzo2JBL2V174et9m7eqAPL4/edit?tab=t.0, p. 12) or maybe as a more citeable source to Fig. 2b in https://doi.org/10.5194/bg-21-2051-2024. In the following it might be worthwhile considering to use the actual crops which nicely support your hypothesis: In 2022 the field itself had potatoes, which are only planted in April, take time to grow and last until fall. The eastern and southeastern neighbouring fields in contrast had winter wheat, which was sown the fall before, starts photosynthesis early, and is typically harvested around in July. Unfortunately including this kind of information (at least for the field itself) in the ICOS metadata is still work in progress, but you could cite (if journal guidelines allow) “pers. comm. Schmidt 2025”. Marius Schmidt is the main site PI, gave me this information but can also be contacted directly. I am aware that the above information is too long for the paper, so please just regard it as an offering to pick from.
L210/Sect. 3.2-3.3: Briefly mentioning the fitting of the standard model with 4 independent parameters at the beginning of this section raises a number of clarity questions: Does in this case “the standard model” also apply to your example implementation, the results of which are shown at the end of 3.3 in Figure 4? Is the fitting reference and methodology used to generate Fig. 4 already the one only described later in section 4? If yes it creates a bit of an impression of jumping back and forth that can easily confuse readers, and might be resolved in different ways, all of which probably have their pros and cons: Reconsider the section order, abstain from already showing example results in Figure 4, use non-fitted default parameters for it, or (at least) insert some information at the start of 3.2 clarifying explicitly or implicitly that the fitting will be explained later.
L215/Table 4: It does not get clear to outsiders what the numbers in the table stand for – internal land cover codes of the two schemes?
L254/Fig. 5: Why does the URB fraction subpanel suggest 100% coverage at sea?
Section 4
L257 and also Eq. 9, “linear”: Note that process understanding suggests that the temperature dependence of respiration is far from linear. The chosen equation shape can probably be justified (especially since the Tlow clipping eliminates the worst possible effects) by reasons like simplicity, robustness or community tradition, but it should be avoided to use a wording that suggests to some readers the relation is actually linear.
L264: To avoid misunderstandings maybe it would be good to choose a wording making clear that not FLUXNET as such provides data only up to 2015, but rather the (currently still most up-to-date but hopefully replaced soon) FLUXNET2015 dataset.
L271: The part “which statistically determines vertical fluxes from turbulence (eddies) in the wind field” is a bit unclear in the sense that it does not give any helpful explanation to readers unfamiliar with the methods, and puts a strange focus on arbitrary aspects of the methods for those familiar. If you want to stay short, maybe end at the first comma with a citation (a textbook, early paper like https://doi.org/10.1175/1520-0469(1951)008<0135:TMOVTO>2.0.CO;2 , or influential paper like https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2). Or try something like “which determines vertical fluxes from turbulence-resolving (i.e. sub-second) measurements of the vertical wind component and the quantity of interest, in our case CO₂” (with or without citation).
L279: Is this selection somehow documented in a reproducible way? Or are all towers mentioned in the manuscript (Figure and table) already filtered for this criterion? Maybe more for your interest than relevant for the level of detail of the manuscript: All ICOS and some FLUXNET site datasets already include a filter where data not representative of the target land use (according to a footprint model run by the site PI in case of FLUXNET or centrally in case of ICOS) never enter the final gap-filling procedure. This may help (in the future) to also consider some more sites you now removed, if atypical land use is only found in a certain wind direction.
L286: It is a (maybe not common but repeatedly occurring) misconception that footprints as such get larger with increasing wind speed. Because the vertical and cross-wind turbulence intensity (and thus the dispersion of any upwind signal) scale with wind speed itself, a direct cause-and-effect relation used in footprint models only exists for quantities such as Obukhov length (a parameter including the effect of unstable or stable temperature stratification) or variance of vertical or (horizontal cross-) wind, but not for wind speed itself. A secondary (scattered, unreliable and differing between sites) relation of wind speed to footprint size may result from this, however not necessarily in the direction you assume (e.g. very calm nights are often associated with very stable stratification and thus very long footprints). Also, it is not documented how you did this prioritization (P.S.: It is at L296, but the reader cannot anticipate it here). That said, for most (except the very highest) towers your choice to use the smallest possible patch for the 500 m resolution EVI/LSWI sources, and 3 x 3 for Sentinel-2 probably is a good way to catch the most important part of the footprint. Luckily, the way PIs select their tower locations and your additional screening of sites (L279) will probably make the result pretty much insensitive to the exact choice of the smoothing (e.g. 60x60 vs. 120 x 120 m). For future implementations you might want to consider the footprint information given in some (at least all ICOS) time series along with the fluxes (see comment on L201). For the 20 m resolution also a “ring-like” average of just the outer 8 pixels could slightly improve results, since the station infrastructure itself (by its pure presence, unwanted vegetation damage or because a small clearing was chosen for the setup) might affect the EVI/LSWI readings for the central pixel, while its contribution to the flux measurement will be small due to the fact that the peak of the footprint is always some (deca) meters upwind of the station.
L296: If the only reason for the inverse wind speed scaling function is the one mentioned above, I would advocate to drop it.
L300-303: Very elegant and consistent solution, avoiding to effectively do the source partitioning modelling of NEE into GPP and R twice, with inconsistent parameter sets. A minor drawback might be that your four parameters are per vegetation class (and strongly simplified for respiration, see comment on L257) while the source partitioning in the provided EC flux dataset allow for (more realistic) locally varying parameters. Therefore a future sensitivity study on this choice (probably outside the scope of this paper) might still be interesting.
Section 5 to 7
L331: change yes, but (unlike all evidence presented so far) in this case I’m not sure whether for the better. It is a bit hard to “read” the continuous colour scale but it seems to suggest net annual losses of 1 kg m^-2 yr^-1 in some large regions with very low soil and biomass carbon stocks. Especially if it is kg C and not kg CO2 (please clarify) and goes on for several consecutive years (which we cannot know from the figure), this is hardly possible (both from the typical range of actually measured annual fluxes, and theoretically from the flux to stock ratio). It should be followed up in the discussion section how realistic these large net respiration events are. One possible reason may be the fitting of parameters per vegetation class independent of region or climate, which may have been partly compensated by the discussed errors in the old model version.
Figure 9: Check the “green” and “white” at the end of the caption. To me they look blue vs. yellow, and if matching blue with green and white with yellow it gives a meaning opposite to the legend within the figure.
L444-445: “describing different the categories”: check word order
Code and data availability, references, acknowledgements
L471: In the original submission (if I didn’t overlook anything) neither the acknowledgement, nor the references or a separate dataset citation section (many journals want it in both) acknowledge the data contribution of the FLUXNET and ICOS site PIs. For ICOS there is a good description how to do it best at https://www.icos-cp.eu/how-to-cite, which might also serve as a blueprint for the used non-ICOS FLUXNET sites. Most of these guys are not officially paid to produce the data, but scientists sacrificing a considerable amount of their time to make them available to the community, and continuation critically depends on traceability of their efforts. P.S.: In reaction to a slightly different comment by the editor on the journal policy compliance, the authors now posted a code and data availability section which briefly mentions DOIs to both datasets but does not automatically address all of the above (which of course they couldn’t know at the time)
Appendix
Table C1-C7: It did not get clear to me why only one year per site was chosen and why used sites or site-years can differ between the VPRM models for the different satellite products. If I didn’t overlook it earlier in the methods section, please clarify. Given that the height but not the width of the tables challenge the page dimensions and a lot of information is redundant between the tables, would it be an option to present only one (two page long) table with more columns, one for each of MODIS, Sentinel-2 and VIIRS?

Citation: https://doi.org/10.5194/egusphere-2024-3692-RC1
- AC3: 'Reply on RC1', Theo Glauch, 27 Mar 2025
  
  Thanks for reviewing our paper and providing such useful and detailed feedback. This was very helpful. You find our responses below. The original questions are in boldface and our answers in italic.
  
  The paper describes a new version of the previously published VPRM model (which estimates biogenic CO2 exchange for large land surfaces in gridded form after fitting parameters that relate proxies from satellite remote sensing and meteo models to CO2 exchange as measured by reference stations). The most important updates are with respect to the remote-sensing data sources that can be handled, in particular their spatial resolution, the user friendliness (modular, public python package) and parameter fitting procedure.
  The paper addresses a relevant scientific modelling question within the scope of GMD with novel concepts and data. The presented example results demonstrate a substantial advance, especially in the realistic representation of crop ecosystems, which particularly suffered from mismatch between remote sensing pixels and station footprints due to strong differences in the sowing and senescence / harvest dates of crops between adjacent fields.
  The language is fluent and clear almost everywhere and the figures are mostly of high quality. Together with the published open-source code and the appendix, the should allow for reproduction of the results.
  There are a few issues: The structure of the paper is at some points a bit arbitrary, creating an impression of jumping back and forth that may confuse readers. Some knowledge of the remote sensing and modelling community is taken for granted, while some descriptions of the eddy-covariance measurements used for the fitting are imprecise or misleading; in one case (relation between wind speed and footprint size) this may have led to a poor decision in data treatment. A regionally suspiciously large annual net respiration in the results is not discussed. These issues are described in detail below, together with minor needs or suggestions for improvement in particular sentences, figures or tables.
  I think that the paper can provide a very helpful reference and guideline for the many usage cases the new model will experience, once these issues are resolved. While some thoughts, future suggestions and questions on the analysis itself are included below, it is probably sufficient to revise the clarity of its description and adequate discussion of problems in the paper.
  L28: A word like “occur” seems missing after “Earth system”
  The grammar of the sentence was indeed a bit confusing. We changed the beginning to “Carbon cycling occurs…”.
  L34: Before or after the last sentence of the 1st paragraph, consider also setting the net flux (already mentioned before) into relation to anthropogenic fluxes. GPP and respiration mostly cancel out each other and while comparing their magnitude to anthropogenic emissions is impressive, doing the same for their net effect gives a more realistic (and still important) idea why we should care about better constraining or even managing these fluxes.
  We’ve added a sentence on the net flux at the very end of the paragraph. The comparison between GPP and anthropogenic emissions is the important one because those are the fluxes that dominate the observed CO2 concentration on short timescales, as usually considered by atmospheric transport models.
  “Over longer time scales GPP and respiration are nearly in balance with a net carbon sink of around 3 GtC per year ( Friedlingstein et al., 2023)” (Line 35f)
  L45: consider mentioning the 3 fluxes in a more instructive order and way, like: “GPP, respiration and consequently NEE…”
  Done.
  L60 ff.: Letter order in VRPM a typo? The rest of the paragraph from “Different approaches” is somewhat unclear. I guess you intend to mention past work by others that anyone could easily include to the new pyVPRM thanks to its modular structure, but it hasn’t been done yet?
  Yes. That’s a typo! Well caught. Thanks. Your interpretation of the paragraph is correct. We have adjusted the text to be concise.
  L78-70: The dropping of the class savanna raises 2 questions: First, while the title and start of this section suggest it is about how the model in general / in its traditional version / in its various other earlier implementations works, at this point (for the first time) the reader wonders if accidently you already stepped ahead to describe a recent implementation inside pyVPRM. Please clarify and also check if this also applies to other points in the rest of the section.
  Second, while there may or may not be good reasons to drop the class savanna, increased resolution is not a sufficient one for all (maybe not even most) savannas. E.g. equally distributed trees with 16 m crown diameter and 30 m stem distance at 20 m resolution would cover no pixel with more than 50% tree and leave none of them tree-free either. I see it’s cumbersome to re-run the example if you cannot give a better reason, but maybe then you could stress it’s just an example and has nothing to do with limitations of the actual pyVPRM.
  You are right. We have decided to restructure the text a little bit. We have removed the discussion on changes in the land cover class from Sec. 2 and instead moved it to Sect. 3.2 where we discuss the default land cover products of pyVPRM. Removing savanna is also motivated by the fact that this is not a dedicated class in the Copernicus Land Cover and the ESA WorldCover. If savanna is needed for a region, land cover maps containing this land cover class should be used.
  L91: Maybe add something like “in our convention”, because the usual one is that while NEE is negative if there is uptake, GPP is still always positive (i.e. NEE = R – GPP), which may be a bit of nuisance but still makes most sense given the words the abbreviations stand for, and their history in biology.
  We have removed that sentence, because we indeed use the convention you are suggesting, i.e. NEE = R – GPP.
  L120: What does the general assumption of grassland being xeric imply for the use of this vegetation class in the humid temperate zone (where we have lots of anthropogenic grasslands) and tundra ecosystems? Again (like dropping savanna), this is not super critical for the model description but maybe for the evaluation over Europe.
  Grassland is a complicated vegetation class, because it covers a wide range of ecosystems: managed and non-managed, xeric and non-xeric, high- and low-altitude. All this is relevant in Europe. In the present paper we have decided to comply with the current implementation in WRF with all its caveats. Without being able to resolve this within the scope of this paper we have added a few sentences about those uncertainties in the discussion of the European flux results in Sect. 6.
  “Finally, grasslands pose a special challenge as they can be managed/unmanaged, xeric/non-xeric and high/low altitude. Most European flux tower sites currently measure managed grasslands in central Europe. Adding flux towers in the dry (Mediterranean) and polar (Scandinavian) regions could be used to calculate VPRM parameters for those different types for grasslands.“ (Line 410 ff)
  Table 1: If Journal policy is as usual, remember to overwrite (not underwrite) tables.
  Done.
  Section 3.1.1: Since some descriptions before were about the specific examples in the paper rather than about inherent settings of pyVPRM, maybe continue to clarify here whether the chosen parameters for Eq. 10-12 are purely a part of the example, hardcoded, or (most likely?) something in between, e.g. default settings of functions the user can redefine in the function call.
  Done. The paragraph now reads:
  “In pyVPRM, both EVI and LSWI are calculated within the VPRM preprocessor class (see Fig. 1 whenever a new satellite image is added to the instance (using the add_sat_img(.) function). The implementation of pyVPRM allows the user to adjust the free parameters in the function call or even add an entirely different implementation of satellite indices.” (Line 175ff)
  L201: Describing eddy covariance footprints like this is a bit tricky, since various parameters (distance of peak weight from station, x% (e.g. 50 to 90) contour line maximum distance, or a similar percentage but for the crosswind-integrated footprint) can differ hugely from each other. ICOS downloads contain e.g. the distance where the cross-wind integrated footprint cumulative probability is 90% (variable FETCH_90), which for DE-RuS is on average (median) over time around 85 (87) m, which is larger (and intuitively better matching the flux community’s understsnding of typical footprint sizes) than your given dimension of 50 m but still less than the distance from station to any field edge. Alternatively, for a 2Dmap of the footprint climatology you could refer to the site labelling report (https://docs.google.com/document/d/1ksB7EZi7hOSJhpgCkDrSJzo2JBL2V174et9m7eqAPL4/edit?tab=t.0, p. 12) or maybe as a more citeable source to Fig. 2b in https://doi.org/10.5194/bg-21-2051-2024. In the following it might be worthwhile considering to use the actual crops which nicely support your hypothesis: In 2022 the field itself had potatoes, which are only planted in April, take time to grow and last until fall. The eastern and southeastern neighbouring fields in contrast had winter wheat, which was sown the fall before, starts photosynthesis early, and is typically harvested around in July. Unfortunately including this kind of information (at least for the field itself) in the ICOS metadata is still work in progress, but you could cite (if journal guidelines allow) “pers. comm. Schmidt 2025”. Marius Schmidt is the main site PI, gave me this information but can also be contacted directly. I am aware that the above information is too long for the paper, so please just regard it as an offering to pick from.
  Thanks a lot for the detailed information! We have updated the discussion on the footprint following your suggestions and also added information on the crop types as it provides valuable information to interpret the plots. As a citation we added the personal communication. From the journal guidelines we didn’t see any reason why this shouldn’t be possible.
  L210/Sect. 3.2-3.3: Briefly mentioning the fitting of the standard model with 4 independent parameters at the beginning of this section raises a number of clarity questions: Does in this case “the standard model” also apply to your example implementation, the results of which are shown at the end of 3.3 in Figure 4? Is the fitting reference and methodology used to generate Fig. 4 already the one only described later in section 4? If yes it creates a bit of an impression of jumping back and forth that can easily confuse readers, and might be resolved in different ways, all of which probably have their pros and cons: Reconsider the section order, abstain from already showing example results in Figure 4, use non-fitted default parameters for it, or (at least) insert some information at the start of 3.2 clarifying explicitly or implicitly that the fitting will be explained later.
  Well noticed. We have tried to clarify this by a) putting a statement at the beginning of Sect. 2 that the standard VPRM model is the one that is used throughout this paper and b) by referencing Sect.2 again at the beginning of Sect 3.2. In order to avoid confusion about Figure 4 we have moved the figure and the corresponding text to the “Results” section, after showing the VPRM parameters for Sentinel-2.
  L215/Table 4: It does not get clear to outsiders what the numbers in the table stand for – internal land cover codes of the two schemes?
  Your interpretation is correct. We have added a sentence to the caption to clarify this for the reader.
  “The numbers represent the internal land cover codes of the two data products.”
  L254/Fig. 5: Why does the URB fraction subpanel suggest 100% coverage at sea?
  This is explained in the caption. URB covers not only urban areas but all non-vegetated surfaces.
  L257 and also Eq. 9, “linear”: Note that process understanding suggests that the temperature dependence of respiration is far from linear. The chosen equation shape can probably be justified (especially since the Tlow clipping eliminates the worst possible effects) by reasons like simplicity, robustness or community tradition, but it should be avoided to use a wording that suggests to some readers the relation is actually linear.
  True. We also discuss this in line 375ff. We have also rephrased the sentence in L257 to clarify that this is a simplified model.
  L264: To avoid misunderstandings maybe it would be good to choose a wording making clear that not FLUXNET as such provides data only up to 2015, but rather the (currently still most up-to-date but hopefully replaced soon) FLUXNET2015 dataset.
  Done.
  L271: The part “which statistically determines vertical fluxes from turbulence (eddies) in the wind field” is a bit unclear in the sense that it does not give any helpful explanation to readers unfamiliar with the methods, and puts a strange focus on arbitrary aspects of the methods for those familiar. If you want to stay short, maybe end at the first comma with a citation (a textbook, early paper like https://doi.org/10.1175/1520-0469(1951)008<0135:TMOVTO>2.0.CO;2 , or influential paper like https://doi.org/10.1175/1520-0477(2001)082<2415:FANTTS>2.3.CO;2). Or try something like “which determines vertical fluxes from turbulence-resolving (i.e. sub-second) measurements of the vertical wind component and the quantity of interest, in our case CO2” (with or without citation).
  Thanks! We have used your suggestion to replace our description of the eddy covariance method. We have also added the references for completeness.
  L279: Is this selection somehow documented in a reproducible way? Or are all towers mentioned in the manuscript (Figure and table) already filtered for this criterion? Maybe more for your interest than relevant for the level of detail of the manuscript: All ICOS and some FLUXNET site datasets already include a filter where data not representative of the target land use (according to a footprint model run by the site PI in case of FLUXNET or centrally in case of ICOS) never enter the final gap-filling procedure. This may help (in the future) to also consider some more sites you now removed, if atypical land use is only found in a certain wind direction.
  Yes, all sites mentioned in this manuscript are filtered for this criterion. We are currently working on integrating a time-dependent flux-tower-footprint-model into the parameter estimation. This should help to overcome those problems in the future.
  L286: It is a (maybe not common but repeatedly occurring) misconception that footprints as such get larger with increasing wind speed. Because the vertical and cross-wind turbulence intensity (and thus the dispersion of any upwind signal) scale with wind speed itself, a direct cause-and-effect relation used in footprint models only exists for quantities such as Obukhov length (a parameter including the effect of unstable or stable temperature stratification) or variance of vertical or (horizontal cross-) wind, but not for wind speed itself. A secondary (scattered, unreliable and differing between sites) relation of wind speed to footprint size may result from this, however not necessarily in the direction you assume (e.g. very calm nights are often associated with very stable stratification and thus very long footprints). Also, it is not documented how you did this prioritization (P.S.: It is at L296, but the reader cannot anticipate it here). That said, for most (except the very highest) towers your choice to use the smallest possible patch for the 500 m resolution EVI/LSWI sources, and 3 x 3 for Sentinel-2 probably is a good way to catch the most important part of the footprint. Luckily, the way PIs select their tower locations and your additional screening of sites (L279) will probably make the result pretty much insensitive to the exact choice of the smoothing (e.g. 60x60 vs. 120 x 120 m). For future implementations you might want to consider the footprint information given in some (at least all ICOS) time series along with the fluxes (see comment on L201). For the 20 m resolution also a “ring-like” average of just the outer 8 pixels could slightly improve results, since the station infrastructure itself (by its pure presence, unwanted vegetation damage or because a small clearing was chosen for the setup) might affect the EVI/LSWI readings for the central pixel, while its contribution to the flux measurement will be small due to the fact that the peak of the footprint is always some (deca) meters upwind of the station.
  Thank you for the detailed discussion on the challenges of incorporating footprint knowledge. For the present paper our main goal was to resolve major problems with the footprint in previous versions of VPRM. It is absolutely true that the new solution also has caveats. We are currently working on including a (simplified) footprint model into the parameter estimation. This way we can select satellite pixels (especially for Sentinel-2 ) on an hourly scale depending on the footprint’s distribution. Ultimately, this would avoid confusion in inhomogeneous landscapes and therefore provide a more reliable estimate of the VPRM parameters. This is a major change to the current way of implementation and therefore not scope of this paper.
  L296: If the only reason for the inverse wind speed scaling function is the one mentioned above, I would advocate to drop it.
  We follow the reviewers suggestion to drop the (inverse) wind speed weighting as it is indeed not doing what we were hoping. A first attempt to replace the wind-speed weighting with a filter on the footprint size is not feasible as FLUXNET does not provide the FETCH_90 variable. Hence, we have decided to remove the weighting entirely. The impact on the VPRM parameters is, however, minor. We have adjusted the parameter table accordingly and removed the wind-speed filtering from the abstract, section 4.1, 4.2 and 4.3. For the future it remains the goal to incorporate a footprint model to overcome the current limitations.
  L300-303: Very elegant and consistent solution, avoiding to effectively do the source partitioning modelling of NEE into GPP and R twice, with inconsistent parameter sets. A minor drawback might be that your four parameters are per vegetation class (and strongly simplified for respiration, see comment on L257) while the source partitioning in the provided EC flux dataset allow for (more realistic) locally varying parameters. Therefore a future sensitivity study on this choice (probably outside the scope of this paper) might still be interesting.
  Thanks for the comment. We are currently working on a paper that discusses the various uncertainties in the VPRM approach. This would be a nice location for such a sensitivity study.
  L331: change yes, but (unlike all evidence presented so far) in this case I’m not sure whether for the better. It is a bit hard to “read” the continuous colour scale but it seems to suggest net annual losses of 1 kg m-2 yr-1 in some large regions with very low soil and biomass carbon stocks. Especially if it is kg C and not kg CO2 (please clarify) and goes on for several consecutive years (which we cannot know from the figure), this is hardly possible (both from the typical range of actually measured annual fluxes, and theoretically from the flux to stock ratio). It should be followed up in the discussion section how realistic these large net respiration events are. One possible reason may be the fitting of parameters per vegetation class independent of region or climate, which may have been partly compensated by the discussed errors in the old model version.
  We believe the paper has shown good evidence that the new model is better on site level. Still, it is true that uncertainty increases highly outside the regions and climates of the fitting data. We have added more discussion to the already existing one along the lines you suggest. Also it is true that the unit in the plot was not complete - it is kgC/(yr m^2).
  Figure 9: Check the “green” and “white” at the end of the caption. To me they look blue vs. yellow, and if matching blue with green and white with yellow it gives a meaning opposite to the legend within the figure.
  Well spotted. We have corrected this mistake.
  L444-445: “describing different the categories”: check word order
  The sentence was indeed misleading. It now reads “Configuration (yaml) files defining the VPRM vegetation classes for different land cover maps are found in pyVPRM/vprm_configs and are required to map the land cover categories to VPRM classes.”
  L471: In the original submission (if I didn’t overlook anything) neither the acknowledgement, nor the references or a separate dataset citation section (many journals want it in both) acknowledge the data contribution of the FLUXNET and ICOS site PIs. For ICOS there is a good description how to do it best at https://www.icos-cp.eu/how-to-cite, which might also serve as a blueprint for the used non-ICOS FLUXNET sites. Most of these guys are not officially paid to produce the data, but scientists sacrificing a considerable amount of their time to make them available to the community, and continuation critically depends on traceability of their efforts. P.S.: In reaction to a slightly different comment by the editor on the journal policy compliance, the authors now posted a code and data availability section which briefly mentions DOIs to both datasets but does not automatically address all of the above (which of course they couldn’t know at the time)
  We have added FLUXNET and ICOS to the acknowledgement and adjusted the data availability section to comply with the ICOS standards. Both datasets are also referenced within the main text.
  Table C1-C7: It did not get clear to me why only one year per site was chosen and why used sites or site-years can differ between the VPRM models for the different satellite products. If I didn’t overlook it earlier in the methods section, please clarify. Given that the height but not the width of the tables challenge the page dimensions and a lot of information is redundant between the tables, would it be an option to present only one (two page long) table with more columns, one for each of MODIS, Sentinel-2 and VIIRS?
  Good suggestion. We have adapted the table accordingly. In addition we have added two sentences in the main text to explain the background of the data selection. “To be consistent with previous VPRM versions (Gerbig and Koch, 2021) we use only one year of flux tower data per site. The year is selected as the year within the data acquisition period of the respective satellite with the maximum amount of available flux tower data.” (Line 284f)
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC3
RC2:
'Comment on egusphere-2024-3692', Anonymous Referee #2, 06 Mar 2025
This is a well-written and clear article that presents a promising contribution to Geoscientific Model Development. The study is compelling, and I believe it is well-suited for publication in this journal. Before I can recommend acceptance, I would appreciate clarification on a few points, mostly related to curiosities and minor refinements.
Curiosities:
Impact of temporal resolution when transitioning from MODIS to Sentinel-2:

You currently use 8-day MODIS products, while Sentinel-2 has a slightly longer revisit cycle. How does this difference affect the lowess smoothing? While you briefly mention this at the end of Section 3.1, I believe the issue deserves a bit more discussion.

Source of shortwave (SW) radiation for PAR calculation:

You mention in a footnote that this comes from ERA-5 reanalysis, but this point is somewhat understated. It would be beneficial to explicitly describe this in the main text.

Stratifying VPRM beyond broad forest types:

Given recent advances in AI and remote sensing for characterizing canopy height texture (e.g., https://gmd.copernicus.org/articles/18/337/2025/gmd-18-337-2025.html), would it make sense in the future to refine VPRM parameterization beyond the current broad categories (evergreen, deciduous, and mixed forests)?

Use of Sentinel-1 for cloud gap-filling:

With growing research on using Sentinel-1 to interpolate missing Sentinel-2 data, have you considered incorporating S1 to address cloud cover issues? This could potentially make your smoothing better and avoid part of the issues evoked at the end of section 3.1.

Alternative to filtering for low wind speed periods:

Could recent methods for separating advection (large-scale) from turbulence (e.g., https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3243/) provide an alternative to prioritizing time periods with relatively low wind speeds? I mean, if you can always remove the advection, you do not need to filter out these periods, which represent quite an important part of your calibration datasets, I guess?

Missing results for deciduous and mixed forests (Fig. 8):

Given that Fig. 6 indicates flux tower sites in the deciduous and mixed forest categories, is there a reason why results for these categories are not also shown in Fig. 8? It would be great to be able to see that.

Source of EVI and LSWI in Section 3.4:

The section mentions using forecasted temperature and SW data, but what method is used to generate EVI and LSWI? Is there an extrapolation process involved?

Minor comments:
L42: Add a reference for the bottom-up approach.

P44: It would improve coherence to briefly mention similar models (e.g., TBMs relying on remote sensing) before introducing the specific model you are improving.

P60: The discussion on approaches for improving VPRM feels slightly out of place here. Wouldn’t it fit better earlier in the introduction, before detailing your proposed improvements?

L91: Where is the negative sign?

L109: Strengthen the link between Equations 6 and 7 by explicitly referencing TH_full_leaf_expansion in Equation 6.

L146: The phrasing is slightly awkward - please consider rewording.

L238: There is an inconsistency between Equations 9 and 13 - why is the respiration equation parameter β replaced with δ?

L309: “This question will be investigated in further studies.” I’m looking forward to this! This is an exciting question from a remote sensing perspective.

L384: “In the future, the parameter estimation can be further improved by taking into account the (time-dependent) footprint of the eddy-covariance towers.” Could you rephrase please?
Citation: https://doi.org/10.5194/egusphere-2024-3692-RC2
- AC2: 'Reply on RC2', Theo Glauch, 27 Mar 2025
  
  Thanks a lot for the feedback and all the valuable comments. Our answers are below in italic. The line numbers refer to the line numbers of the revised manuscript.
  This is a well-written and clear article that presents a promising contribution to Geoscientific Model Development. The study is compelling, and I believe it is well-suited for publication in this journal. Before I can recommend acceptance, I would appreciate clarification on a few points, mostly related to curiosities and minor refinements.
  Curiosities:
  Impact of temporal resolution when transitioning from MODIS to Sentinel-2:
  
  You currently use 8-day MODIS products, while Sentinel-2 has a slightly longer revisit cycle. How does this difference affect the lowess smoothing? While you briefly mention this at the end of Section 3.1, I believe the issue deserves a bit more discussion.
  In fact we are using daily MODIS products, but the usage of 8-day products is also possible in pyVPRM. We have clarified this in the text at the end of section 3.1. Moreover we have added a small study at three very homogeneous test sites to illustrate the difference between the Sentinel-2 and MODIS curves (Figure 3 in the updated manuscript). They are overall pretty compatible. Sentinel-2 shows a bit more instability. Those fluctuations are, however, much smaller than the absolute values of EVI and LSWI.
  Source of shortwave (SW) radiation for PAR calculation:
  
  You mention in a footnote that this comes from ERA-5 reanalysis, but this point is somewhat understated. It would be beneficial to explicitly describe this in the main text.
  We have moved the footnote to the main text with slight adjustments.
  Stratifying VPRM beyond broad forest types:
  
  Given recent advances in AI and remote sensing for characterizing canopy height texture (e.g., https://gmd.copernicus.org/articles/18/337/2025/gmd-18-337-2025.html), would it make sense in the future to refine VPRM parameterization beyond the current broad categories (evergreen, deciduous, and mixed forests)?
  Yes, adding more (homogeneous) classes in the future is certainly valuable. The major limitation comes from the amount of eddy covariance for fitting VPRM parameters. With the growing network of stations the VPRM classes can be revised in the future.
  Use of Sentinel-1 for cloud gap-filling:
  
  With growing research on using Sentinel-1 to interpolate missing Sentinel-2 data, have you considered incorporating S1 to address cloud cover issues? This could potentially make your smoothing better and avoid part of the issues evoked at the end of section 3.1.
  Yes we are in contact with research groups that work on hybrid satellite data products like combined Sentinel-2 and Sentinel-3 or Landsat data. Since the pyVPRM framework is implemented in a way that allows different satellite products to be easily exchanged or added, such data products can be used in the near future.
  Alternative to filtering for low wind speed periods:
  
  Could recent methods for separating advection (large-scale) from turbulence (e.g., https://egusphere.copernicus.org/preprints/2024/egusphere-2024-3243/) provide an alternative to prioritizing time periods with relatively low wind speeds? I mean, if you can always remove the advection, you do not need to filter out these periods, which represent quite an important part of your calibration datasets, I guess?
  In response to the questions of reviewer 1 we have removed the wind speed filtering as it is not working exactly the way we were expecting.
  In general the best approach would be to use a (semi-analytical) model for the tower footprint to get the optimal match between footprint and satellite observation. We are already working on this, but it’s beyond the scope of this paper. The method you are suggesting could be an alternative approach to that.
  Missing results for deciduous and mixed forests (Fig. 8):
  
  Given that Fig. 6 indicates flux tower sites in the deciduous and mixed forest categories, is there a reason why results for these categories are not also shown in Fig. 8? It would be great to be able to see that.
  We have added the same plots for deciduous and mixed forest to the appendix and refer to them in the text (Fig G1)
  Source of EVI and LSWI in Section 3.4:
  
  The section mentions using forecasted temperature and SW data, but what method is used to generate EVI and LSWI? Is there an extrapolation process involved
  This was misleading. We don’t use VPRM to generate forecasts, although it would probably be possible for a few days in advance since vegetation phenology is only evolving slowly (https://gmd.copernicus.org/articles/13/4091/2020/gmd-13-4091-2020.pdf). What was meant here, is that VPRM is driven by the internal meteorology of the atmospheric model. In other words, VPRM runs online coupled to the meteorology of the numerical weather prediction model.
  Minor comments:
  L42: Add a reference for the bottom-up approach.
  We have added a reference that discusses the differences between top-down and bottom-up approach ESSD - Global anthropogenic CO2 emissions and uncertainties as a prior for Earth system modelling and data assimilation
  P44: It would improve coherence to briefly mention similar models (e.g., TBMs relying on remote sensing) before introducing the specific model you are improving.
  We have added a sentence and a number of references for that.
  “There are several TBMs that utilize remote-sensing data to locally estimate the plant and water dynamics (Nelson et al., 2024; Gerbig and Koch, 2021; Randerson et al., 1996; Running and Zhao, 2021).” (Line 45f)
  P60: The discussion on approaches for improving VPRM feels slightly out of place here. Wouldn’t it fit better earlier in the introduction, before detailing your proposed improvements?
  Okay. We’ve moved it below the paragraph where we discuss remote-sensing-based TBMs and VPRM.
  L91: Where is the negative sign?
  Well spotted. We’ve removed this sentence. It was a leftover from a different sign convention.
  L109: Strengthen the link between Equations 6 and 7 by explicitly referencing TH_full_leaf_expansion in Equation 6.
  Done.
  L146: The phrasing is slightly awkward - please consider rewording.
  Done. Rephrased to: “In its current implementation pyVPRM provides an interface for satellite data products from MODIS, VIIRS, and Sentinel-2. Due to the modular structure, pyVPRM can be extended to other missions (e.g. Landsat) or fusion products of different satellite missions, e.g. .Moreno-Martínez et al. (2020).” (Line 148ff)
  L238: There is an inconsistency between Equations 9 and 13 - why is the respiration equation parameter β replaced with δ?
  This was an error, thanks for spotting it. We have corrected it..
  L309: “This question will be investigated in further studies.” I’m looking forward to this! This is an exciting question from a remote sensing perspective.
  Agreed. It’d be especially interesting to know if there is a way to calibrate the EVIs and LSWI against each other. That’d allow to fit the model on the high resolution Sentinel-2 data which better matches the station observation and use it with the lower resolution MODIS/VIIRS data. (Line 417 ff)
  L384: “In the future, the parameter estimation can be further improved by taking into account the (time-dependent) footprint of the eddy-covariance towers.” Could you rephrase please?
  We have rephrased the sentence (together with the preceding one) to “Especially for heterogeneous land cover types such as croplands and grasslands, the new VPRM parameters enhance the coherence between the model and flux tower data. However, the current VPRM parameter estimation still has limitations. Notably, it does not account for the time-dependent footprint of the flux towers, which remains a key area for improvement in future research.” (Line 416 ff)
  
  Citation: https://doi.org/10.5194/egusphere-2024-3692-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Theo Glauch on behalf of the Authors (27 Mar 2025) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (12 Apr 2025) by Makoto Saito

RR by Anonymous Referee #1 (07 May 2025)

ED: Publish as is (22 May 2025) by Makoto Saito

AR by Theo Glauch on behalf of the Authors (24 May 2025) Manuscript

Journal article(s) based on this preprint

30 Jul 2025

pyVPRM: a next-generation vegetation photosynthesis and respiration model for the post-MODIS era

Theo Glauch, Julia Marshall, Christoph Gerbig, Santiago Botía, Michał Gałkowski, Sanam N. Vardag, and André Butz

Geosci. Model Dev., 18, 4713–4742, https://doi.org/10.5194/gmd-18-4713-2025,https://doi.org/10.5194/gmd-18-4713-2025, 2025

Short summary

Theo Glauch, Julia Marshall, Christoph Gerbig, Santiago Botía, Michał Gałkowski, Sanam N. Vardag, and André Butz

Viewed

Total article views: 852 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
646	185	21	852	22	30

HTML: 646
PDF: 185
XML: 21
Total: 852
BibTeX: 22
EndNote: 30

Views and downloads (calculated since 21 Jan 2025)

Month	HTML	PDF	XML	Total
Jan 2025	102	20	5	127
Feb 2025	80	34	3	117
Mar 2025	100	31	4	135
Apr 2025	85	27	2	114
May 2025	81	30	0	111
Jun 2025	111	8	5	124
Jul 2025	87	35	2	124

Cumulative views and downloads (calculated since 21 Jan 2025)

Month	HTML	PDF	XML	Total
Jan 2025	102	20	5	127
Feb 2025	80	34	3	117
Mar 2025	100	31	4	135
Apr 2025	85	27	2	114
May 2025	81	30	0	111
Jun 2025	111	8	5	124
Jul 2025	87	35	2	124

Viewed (geographical distribution)

Total article views: 865 (including HTML, PDF, and XML) Thereof 865 with geography defined and 0 with unknown origin.

Country	#	Views	%

Latest update: 30 Jul 2025

Download

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Preprint (13353 KB)
Metadata XML

Short summary

The Vegetation Photosynthesis and Respiration Model (VPRM) estimates carbon exchange between the atmosphere and biosphere by modeling gross primary production and respiration using satellite data and weather variables. Our new version, pyVPRM, supports diverse satellite products like Sentinel-2, MODIS, VIIRS and new land cover maps, enabling high spatial and temporal resolution. This improves flux estimates, especially in complex landscapes, and ensures continuity as MODIS nears decommissioning.


Total:	0
HTML:	0
PDF:	0
XML:	0