Sensitivity and Uncertainty Analysis of China's Terrestrial Carbon-Water Cycle Using a Dynamic Global Vegetation Model
Abstract. Parameter uncertainty in Dynamic Global Vegetation Models (DGVMs) substantially impacts the reliability of carbon-water cycle simulations. Using the LPJ-GUESS model at 13 sites across China's diverse ecosystems, this study employed a multi-method (Morris, eFAST, Sobol') sensitivity analysis on 39 key parameters to assess their impacts on nine carbon-water cycle variables. Our results revealed that the model's behavior is co-dominated by both core physiological parameters, often hard-coded in the source, and plant functional type-specific traits. This finding suggests limitations in the common practice of focusing calibration solely on user-adjusted files. Furthermore, these parameter controls are highly context-dependent, shifting based on both the target process (e.g., carbon uptake as opposed to water flux) and the regional climate, where arid ecosystems respond most strongly to water-use parameters. The multi-method approach also highlighted that the influence of many parameters is mediated through complex interactions rather than direct effects alone. Consequently, this complex web of sensitivities propagates into contrasting patterns of model uncertainty: arid ecosystems exhibit the highest relative uncertainty, making predictions more uncertain, while humid, productive ecosystems show the largest absolute uncertainty, posing a challenge for carbon budgeting. These findings provide a scientific basis for developing targeted, region-specific parameterization strategies to reduce model uncertainty and improve assessments of terrestrial carbon sink functions.
This discussion paper presents a global sensitivity analysis of the LPJ-GUESS model that also considers changes in sensitivities induced by different climatic zones across China.
The overall framing, conceptualization and presentation of the manuscript are very good. The language is easily accessible, the questions and methods are clear and relevant for the vegetation modelling community, the figures are nice, and the general logic of the manuscript is clearly understandable. The topic fits well within the scope of Biogeosciences (BG), but given the technical nature of the manuscript, it would also fit within the scope of Geoscientific Model Development (GMD).
While the overall logic of the manuscript is reasonably straightforward, I had several major concerns and requests for clarifications that should be addressed before publication:
1) The motivation for the parameter ranges should be better explained and the current ranges should possibly be changed. Parameter ranges should always be informed by ecological plausibility. If nothing is known about the effect of a parameter, an initial manual sensitivity analysis may be helpful to determine appropriate ranges. This is in any case better than setting parameters to technically possible min/max ranges or ad-hoc relative values such as +/- 25%. While important to resolve, I do not expect this to have major impacts on the results.
2) In the data availability statement, the authors provide download links for model code and data, but this would likely not be sufficient to reproduce the results, as neither the analysis code nor the modified model code are provided (I assume the LPJ-GUESS code must have been modified to change the hard-coded parameter values the authors refer to). For a revision, I suggest the authors link to a repository that contains their entire project, including (as far as copyright allows) their modified model code, the data and their analysis code (e.g. the code to calculate sensitivity indices as well as the code used to generate the figures), as well as the intermediate results (i.e. the sensitivity indices) that were used to generate the figures. Please make sure that the results are computationally reproducible, e.g. by appropriately fixing random seeds before running the sensitivity analysis (see the sketch after this list).
3) An interesting but also a bit unusual aspect of this study is that the authors used and compared several sensitivity indices in parallel. This is interesting, both to learn about the differences between the indices, and to explore the robustness of the conclusions, but I would have liked to see more analysis and discussion about the reasons for differences where they occur. Also, I had concerns about the standardization that is employed and would ask the authors to consider removing it (see detailed comments).
4) The number of parameter draws / model runs to calculate the sensitivity indices seems to be worryingly low. In particular Sobol' indices often require a much larger number of iterations to converge, but the sample size for the Morris screening also seems rather low. It is critical that the authors demonstrate that the results are stable. This could be done either by repeating the analysis several times and showing that results are qualitatively identical, or by bootstrapping the existing simulations (which would be less computationally demanding).
5) The authors conclude from the fact that there are regional changes in sensitivity that we need region-specific parametrizations of the model. This appears to be a misinterpretation of the results. If we introduced regionalized model parameters, sensitivities would change between regions, but the fact that the authors find regional changes in sensitivity says nothing about the sense of regional model parameters. Rather, it says that the climatic drivers interact with the parameters in producing the outputs. This is also the interpretation in (Oberpriller et al., 2022), which is cited in the study.
6) Overall, the discussion of the results seemed in part very speculative to me (see detailed comments below). I would ask the authors to consider if their conclusions in the discussion are directly supported by the results generated in the sensitivity analysis.
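To make point 2 concrete: here is a minimal sketch of what reproducible sampling could look like, assuming, purely for illustration, that the design matrices are generated in Python with SALib (the parameter names and ranges are placeholders, not the authors' actual setup):

    # Minimal sketch: reproducible Morris sampling (assumed Python/SALib setup).
    # Parameter names and bounds are illustrative placeholders.
    import numpy as np
    from SALib.sample import morris as morris_sample

    problem = {
        "num_vars": 2,
        "names": ["alpha_a", "gmin"],        # hypothetical parameter names
        "bounds": [[0.4, 0.9], [0.1, 1.0]],  # hypothetical plausible ranges
    }

    # Fixing the seed makes the trajectories, and hence all downstream
    # sensitivity indices, reproducible.
    X = morris_sample.sample(problem, N=50, num_levels=4, seed=42)
    np.savetxt("morris_design.txt", X)  # archive the design with the project

Archiving such scripts together with the design matrices and model outputs in the project repository would go a long way towards computational reproducibility.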
I hope the following, line-specific comments will be of further help to the authors in revising this manuscript.
LINE-SPECIFIC COMMENTS
*************************
L18: I’m not sure how people typically interact with the model. Don’t people also change parameters that are hard-coded in the source, or change the parameters included in the parameter files?
L68 It’s not clear to me if you cite the reference as support for the statement, or as an example of a study which did pay attention to this. If the latter, cite this paper as: (but see XXX)
L74 OK, the previous sensitivity analyses were conducted in Europe, but why would we expect that the results are different in China? Maybe you could make this more explicit by contrasting the climatic ranges of your study to previous studies (are you effectively considering a wider range of ecosystems?)
L80 Again, not being an LPJ-GUESS expert, I'm not sure how the interaction with the model typically works in practice, but my understanding is that it is not uncommon for users to modify the code directly, or to change parameters through the parameter files?
L81 Is this a new paragraph here? Seems sensible.
L88 I agree with the statement but what I don’t understand is your motivation to run these 3 methods in parallel, as the more complex methods deliver everything the simpler methods deliver and more. Reading on, it seems to me that the purpose of applying multiple methods is rather that you want to compare if they give different results? If so, please make this more transparent and explain your reasoning (do you have references for the fact that the different sensitivity methods produce similar results?). If there is no reason to expect differences, it seems computationally wasteful and potentially confusing to run 3 indices, rather than concentrating sampling effort on calculating one index properly with a larger sample size.
L89 13 sites doesn't seem like a lot. If they differ in multiple climate variables, I'm not sure how you will later disentangle which aspect of the climate is responsible for the change in sensitivity, given such a small set.
L105 Why does it matter that the sites are undisturbed? Isn’t the model only driven by climate? Would sensitivity calculations be affected by that?
L116 Overall, I did not understand the reasoning behind the site selection. Why so few sites? Was this mostly to limit computational costs? And what kind of representativeness are we aiming for here? Representative of the climatic space, ecological (PFT) space, or geographic space?
L130 Driving or Driver?
L183 I don't agree with this argument. It can sometimes make sense to set parameter ranges to extreme values in an SA to test model behavior, but in the context of this study, where your goal is to explore model sensitivity in a realistic setting, parameters should be set within the range of plausible values. This logically follows from the fact that the results of the global SA will depend on the ranges, and sensitivities calculated for a range that is biologically implausible do not seem sensible. The argument that you have to make subjective assumptions to set the range is not convincing either – yes, it's true, but setting the range to implausible maximum values clearly seems worse than guessing a plausible range. Moreover, in my experience, one should be very careful with too wide ranges in a global SA. The reason is that by setting one or more parameters to extreme values, the entire model sensitivity changes, e.g. because the plant doesn't grow any more. This can lead to inappropriate conclusions about the sensitivity of the other parameters (see the sketch below for a cheap diagnostic).
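One could flag this cheaply by checking the fraction of parameter draws for which the simulated vegetation effectively dies; a minimal sketch, assuming Python and that Y holds, e.g., the simulated GPP values for the sampled design (both assumptions are mine, for illustration only):

    import numpy as np

    # Fraction of runs in which the simulated plant effectively does not
    # grow; such degenerate runs distort the variance decomposition for
    # all other parameters. Y is assumed to hold the model outputs.
    dead_fraction = np.mean(Y < 1e-3)
    if dead_fraction > 0.05:
        print(f"{dead_fraction:.0%} degenerate runs; consider narrowing ranges")

If a substantial share of runs turns out to be degenerate, that is a strong hint that the ranges are too wide.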
L190 Uniform makes sense here, but similar to above, a uniform distribution does not reflect an absence of prior information; it makes the assumption that all values are equally likely a priori.
L240 It's costly, yes, but given that these computational costs must be incurred anyway, what is the advantage of running the other methods as well and comparing?
L248 Qualifiers such as "systematically" and "comprehensive", here and elsewhere, could be removed. Just say what you did and leave it to the reader to decide if they find it comprehensive.
L263 How do you know it's sufficient? You could e.g. explore if results are stable when repeating the analysis, or how they react to changing the sample size. Your Morris sample sizes in Table 2 seem too small to me to yield stable estimates. I also have doubts about the stability of the other methods.
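Regarding the bootstrap option from my general comment 4: if the analysis is done with SALib (an assumption on my part), bootstrap confidence intervals are already built into the analyzers; a minimal sketch, reusing the problem, X and Y from the sampling step:

    from SALib.analyze import morris as morris_analyze

    # num_resamples controls the bootstrap behind the confidence
    # intervals; mu_star_conf values that are wide relative to mu_star
    # would indicate that the sample size is too small for stable rankings.
    res = morris_analyze.analyze(problem, X, Y, num_resamples=1000,
                                 conf_level=0.95, num_levels=4)
    print(res["mu_star"], res["mu_star_conf"])

Reporting such intervals (or the spread across repeated analyses with different seeds) would directly address the stability question.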
L274 Not sure about the standardization of the sensitivity indices. eFAST and Sobol' are both variance-based, so in principle they should provide similar values, even without standardization. There can be differences due to interactions, but these are meaningful and can be interpreted. Morris is a different method, because it is effectively a derivative-based method. Likely, the values are neither the same, nor completely linearly related to eFAST and Sobol'. This, by the way, could also be an important aspect to consider for your discussion of differences later. In any case, the standardization adds more complexity to the analysis, and it could also mask potential methodological problems and sampling uncertainty. Please consider if it could be removed without harming your analysis goals.
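For reference (my notation, not taken from the manuscript): the variance-based indices are already normalized by the total output variance,

    S_i  = Var( E[Y | X_i] ) / Var(Y)
    S_Ti = 1 - Var( E[Y | X_~i] ) / Var(Y)

so eFAST and Sobol' estimates both lie in [0, 1] by construction and are directly comparable without any further standardization; only Morris operates on a different, derivative-like scale.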
Fig. 4: the consistency analysis is nice, although the reasons for the variation could be better explored. One reason is likely the limited sample size of the SA methods. This could be explored by re-running the analysis and quantifying the sampling variation of the estimates. Then, it would be interesting to discuss if there are any systematic differences not explained by sampling variation, and if so, why those occur. What is also not clear to me is which sensitivities you are comparing here. eFAST, for example, can produce both total-order and direct / first-order indices. Which of those is used for the comparison?
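Once the choice of index is made explicit, the consistency could also be quantified rather than only visualized; a minimal sketch, assuming SALib-style result dictionaries (the variable names efast_res and sobol_res are mine, the authors' data structures may differ):

    from scipy.stats import spearmanr

    # Rank correlation between total-order indices from two methods; a
    # high rho would suggest the remaining scatter is sampling noise
    # rather than systematic disagreement between the methods.
    rho, p = spearmanr(efast_res["ST"], sobol_res["ST"])

Combined with the bootstrap intervals suggested above, this would help separate sampling variation from genuine methodological differences.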
L570 For this entire section: I didn't understand why this observation deserves an entire subsection. Yes, relative uncertainty is not absolute uncertainty, and that matters for particular applications. But isn't this the case for most modelling studies and even for empirical studies? For example, considering diurnal differences in uncertainties, we often find that absolute uncertainties in carbon fluxes are higher during daytime whereas relative uncertainties are higher at nighttime. The reason for such patterns is usually that the underlying uncertainties act multiplicatively; LPJ-GUESS, like virtually all process-based ecosystem models, likewise has a largely multiplicative structure (GPP emerges from the product of radiation, efficiency terms, water availability modifiers, and so on), which means that uncertainties are mostly relative and absolute uncertainties increase with the mean output. I do not understand why you think the spatial / climatic pattern you observed is particularly noteworthy and warrants an entire section or some rethinking of the issue of relative / absolute uncertainty. Were you surprised by the differences of relative / absolute uncertainty that your results uncovered? To me, the result that absolute uncertainties are higher in areas that have large values seems expected.
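To spell out why this is expected (a generic statistical identity, not something specific to this manuscript): if the output uncertainty is multiplicative, Y = mu * eps with E[eps] = 1, then

    SD(Y) = mu * SD(eps)

so the relative uncertainty SD(Y)/mu is constant while the absolute uncertainty scales linearly with the mean; larger fluxes trivially carry larger absolute uncertainties.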
L609 See my general comment 5 – this seems to be a big misunderstanding. The sensitivity changes because the drivers / context change. If you are in a high-water environment, the model will not be very sensitive to parameters that regulate water use efficiency. This is no indication that you need a regional parameterization. These statements need to be changed.
L612 See comments above, I didn't understand these arguments.
L625 You could note that this is done in other sensitivity studies on LPJ-GUESS and briefly summarize the results of such an analysis.
L663 See my general comment 5.