the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Contextualizing Pan-Tropical Allometric Models for Biomass Estimation
Abstract. Allometric Models (AMs) play a central role in monitoring and mitigating climate change, as they provide accurate estimates of the biomass and carbon sequestered by trees from non-destructive, easy-to-obtain physical measurements. Unfortunately, practitioners spend considerable effort researching, qualifying and choosing AMs for specific growth conditions. To overcome this situation, Chave et al. (2014) developed a pan-tropical AM with accuracy equivalent to that of local, site-specific AMs. We improve on this result by incorporating contextual information pertaining to growth conditions in a Machine Learning (ML) model, achieving a 17 % reduction in Mean Absolute Error (MAE) as measured on hold-out data. This breakthrough should have an important impact on applications such as national forest inventories, carbon certifications and calibration of satellite-based biomass maps to field data. Finally, we propose a principled method to estimate how much additional error one can expect when applying a given AM to shifting conditions, and provide a data-driven safety check to practitioners.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6341', Anonymous Referee #1, 22 Feb 2026
AC1: 'Reply on RC1', Eustache Diemert, 18 Mar 2026
We thank Anonymous Referee 1 for their review RC1 and answer remarks and concerns on the following topics: contribution areas, intended readership, and significance of the work.
NB: [x] designates additional references in this answer whereas (author, year) refers to manuscript references.
Contribution Areas
We’d like to recall that - as described in the abstract - this paper proposes 2 main contributions:
- #1: a more precise pan-tropical allometric model
- #2: a method to predict the suitability of a given allometric model to new application conditions.
First of all, we remark that the five potential contribution types listed by RC1 only apply to contribution #1. Out of these 5 areas, our work targets (2) better representation of growth conditions, (3) correct evaluation of new models and (4) precision improvements over the previous generation of models. We disagree with RC1, who assumes that our contribution targets only (4). On (2), as described in Section 2.2, we recall that our proposed model does incorporate a better representation of several important ecological factors: continent, primary vs secondary forest, forest type, altitude, rainfall and dry months. These factors are all missing from the Chave baseline model, and it is thus biologically plausible that incorporating them would increase the precision of predictions. On (3), as described in Section 2.2.2, we perform a cross-validation over 30 random splits and report 3 error metrics on both the training and testing parts. This method is standard for evaluating Machine Learning (ML) models and produces valid confidence intervals that capture the randomness of the data and of the model coefficients. The work of (Sileshi 2014), while important in classical forestry modelling, is not directly applicable to ML models: as they are over-parametrized, traditional goodness-of-fit statistics are too easy to overfit. We would be curious to understand how our procedure can be criticized or improved when assessing ML-based allometric models. As an example, [2] uses the same metrics and a near-identical experimental procedure to assess the prediction quality of ML models on a related AGB estimation problem.
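To make the evaluation protocol concrete, the following is a minimal sketch of a repeated random-split evaluation reporting R², RMSE and MAE on the held-out part. It is not the code used in the manuscript: the 20 % test fraction, the linear stand-in model `fit_line`, the synthetic data and all function names are assumptions chosen for illustration, and only test scores are reported here.

```python
import math
import random
import statistics


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; a stand-in for any model under test."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


def repeated_random_splits(xs, ys, n_splits=30, test_fraction=0.2, seed=0):
    """Return {metric: (mean, standard deviation)} of test scores over random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(xs)))
    scores = {"r2": [], "rmse": [], "mae": []}
    for _ in range(n_splits):
        idx = list(range(len(xs)))
        rng.shuffle(idx)  # a fresh random partition for every split
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        y_true = [ys[i] for i in test_idx]
        y_pred = [a + b * xs[i] for i in test_idx]
        scores["r2"].append(r2(y_true, y_pred))
        scores["rmse"].append(rmse(y_true, y_pred))
        scores["mae"].append(mae(y_true, y_pred))
    return {name: (statistics.mean(v), statistics.stdev(v)) for name, v in scores.items()}


# Synthetic log(diameter) -> log(AGB) data, for illustration only.
data_rng = random.Random(42)
log_d = [data_rng.uniform(1.6, 5.0) for _ in range(200)]
log_agb = [-2.0 + 2.5 * x + data_rng.gauss(0.0, 0.3) for x in log_d]
summary = repeated_random_splits(log_d, log_agb)
for name, (mean, std) in summary.items():
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

The spread of each metric across the 30 splits is what supports an interval around the reported mean; swapping `fit_line` for another model leaves the rest of the procedure unchanged.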
Finally, we assume that a valid contribution may tackle only some (in our case 3) of the 5 areas pointed out by RC1. In our view, interesting research may contribute to only some of these goals and still be scientifically sound and worth disseminating in the community.
On our contribution #2 (method for assessing suitability of an allometric model to new conditions), we agree with RC1 that it tackles “an important problem”. We would have hoped that this contribution would be discussed more in this review, as it has important methodological implications for the community. Contrary to what RC1 reports, this contribution is announced in the last part of the abstract and highlighted clearly in the Introduction p.2 l.30.
Intended Readership
We selected the EGU Biogeosciences (BG) journal because of its declared focus on “cutting across the boundaries of established sciences and achieve an interdisciplinary view of the interactions [between the biological, chemical, and physical processes …]”. In that respect we expect the readership of this journal to be far more extensive than the forestry practitioners community that RC1 assumes to be the only potential target. We note that methodologically heavy papers tackling a problem close to our contribution #2 seem to have found their audience in this journal [1].
Our intention from the outset has been to introduce recent statistical learning techniques to the established field of allometry research. In that respect we expect our work to be of interest firstly to forward-looking scientists and researchers aiming to modernize the field and apply novel ideas and techniques to important, established problems (especially contribution #2). Forestry practitioners with “no prior knowledge in statistical learning methods” (quoting RC1) are not our primary target, as we agree that such readers would value theoretical developments less and practical examples based on established theory more. We note, though, that a quick search returns 496 papers using ML techniques in EGU and BG journals, indicating that an interested readership exists in this venue. More broadly, we noted in the Introduction of the paper the recent development of ML-based allometric models, see (Dutta Roy and Debbarma, 2024; Wongchai et al., 2022) for instance.
We are also not at ease with the comment of RC1 that we wrote this paper “presumably motivated by the fact to reach out […] to clients, also, given the competing interests”. We declared our potential conflicts of interest in good faith and we assume that relevant, solid scientific work can (and arguably should, in a number of cases) be funded by private organizations. In any case this assertion is not falsifiable and should thus not appear in a scientific debate over the merits of a given piece of research work. In addition, we noted p.8 l.28 that our work, when applied to carbon sequestration projects, would imply issuing fewer carbon credits than the baseline from Chave (see also Figure 4). We don’t see how this could be of mercantile benefit to our employer or to its clients either. Nor do we see how this comment from RC1 could shed light on the soundness and relevance of our contributions.
Moreover, another assertion of RC1 seems overly antagonistic: “the whole text is full of technicalities that serve no other apparent purpose than making an impression on the non-specialized reader”. Again, this assertion is not falsifiable and should not be brought into scientific debate. On the contrary, we make a point of giving enough details so that readers with good enough knowledge of statistical learning can assess our methods and reproduce our experiments. Technically heavy sections such as 2.3.1 and 2.3.2 are necessary to explain the mathematical foundations of the work, in particular when it comes to the transfer learning theory that underpins the development of the model that predicts the suitability of an allometric equation to new conditions. Likewise in Section 2.2.1, for the sake of reproducible science, we are bound to give enough details so that a trained practitioner can reproduce our experimental results in whatever programming language or computing environment she chooses to use. We believe that RC1 has his/her own readership in mind, but that appears to give a somewhat particular angle on the contributions of the paper, ignoring methodological contributions that would appeal to a larger, more fundamentally inclined or technically savvy audience.
Significance - Contribution #1: a more precise pan-tropical allometric model
We disagree with RC1 on the notion that “none of these complex models perform so well in a cross-validation test relative to the baseline”. In fact, Table 3 shows that the proposed COFARM-NN model improves upon the Chave baseline on all 3 metrics (R², RMSE, MAE), with improvements of 17% in MAE and 14% in RMSE. Moreover, the calibration curve in Figure 4 explicitly demonstrates that the proposed model consistently makes fewer errors than the baseline on all tree sizes. We could debate endlessly about how much error reduction would be necessary to make a “substantial improvement”, yet the reported experiments show i) statistical significance, ii) consistency over AGB ranges and iii) a double-digit relative magnitude of the improvement with the proposed model. We also note that, indeed, some complex models such as HGBRT perform just on par with the Chave baseline, indicating that more complex models are not sufficient per se but rather that the incorporation of relevant ecological information should be designed carefully to provide precision benefits. Simpler power models with access to the same information, such as ContextualChave, improve MAE by 7% only, confirming that the proposed COFARM-NN model more than doubles the benefits of incorporating additional ecological information compared to traditional approaches. In our view these two examples provide a sharp contrast and help to grasp the significance of the proposed model.
An interesting discussion related to the RC1 comment, and one that we could emphasize more, pertains to the tradeoff between model complexity and improved precision. ContextualChave has 7 parameters (Equation 3) and improves MAE by 7%, the COFARM model has 28 parameters and improves MAE by 14%, whereas the COFARM-NN model has 132 parameters (Figure 3) and improves MAE by 17%. All models may be seen as frugal by modern standards and are tractable to learn and use even on a low-end laptop computer. Ultimately we believe users may prefer different options based on non-quantitative aspects such as explainability. In any case this work offers a number of options to practitioners and researchers that improve over existing baselines.
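The tradeoff above can be read as a diminishing marginal gain per added parameter. The short sketch below tabulates it from the figures quoted in this reply; the 2-parameter count assigned to the Chave baseline and the helper name `marginal_gains` are assumptions for illustration, and differences between relative improvements are treated loosely as percentage points.

```python
# (model name, number of parameters, relative MAE improvement over the baseline in %).
# Counts and gains for the three contextual models are those quoted in this reply;
# the 2-parameter baseline is an assumption made for illustration.
MODELS = [
    ("Chave baseline", 2, 0.0),
    ("ContextualChave", 7, 7.0),
    ("COFARM", 28, 14.0),
    ("COFARM-NN", 132, 17.0),
]


def marginal_gains(models):
    """For each model, compare with the previous (simpler) one in the list.

    Returns rows of (name, added parameters, added MAE gain in points,
    added gain per added parameter).
    """
    rows = []
    for (_, p_prev, g_prev), (name, p, g) in zip(models, models[1:]):
        added_params = p - p_prev
        added_gain = g - g_prev
        rows.append((name, added_params, added_gain, added_gain / added_params))
    return rows


rows = marginal_gains(MODELS)
for name, added_params, added_gain, per_param in rows:
    print(f"{name}: +{added_params} parameters, +{added_gain:.0f} points, "
          f"{per_param:.3f} points per parameter")
```

Each step up in complexity still reduces the error, but by less per parameter than the step before, which is why the choice between the three models may reasonably rest on considerations such as explainability.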
Significance - Contribution #2: a method to predict the suitability of a given allometric model to new application conditions
Again, we would have hoped for a more thorough review of this contribution. We believe RC1 misses the significance of our results. The remark that “In practice, Table 2 only demonstrates that some covariates are significant predictors in the regression exercise” is misleading: Table 2 describes the experimental setup and not the results of the experiments. In that respect the relevant information from Table 2 is that different kinds of shifts in growth conditions (e.g. when applying an allometric equation developed for a Dry Forest to a Moist Forest) produce different magnitudes of shift in the data distribution. To the best of our knowledge, the fact that shifts in the data distribution (i.e. tree measurements) are informative of the additional biomass prediction error incurred when applying an allometric equation to a new site is a novel result, not previously reported in the allometry literature. Moreover, the success of the additional prediction error model can be observed in Figure 5, where predicted and observed additional errors are compared for a variety of shifts in altitude, continent, dry months, forest type and rainfall, as well as random controls. The fact that we can achieve an R² of 0.832 when predicting this additional error is in our view a major feat and a non-trivial finding. What it means is that, for the first time to our knowledge, practitioners have a reliable, data-driven procedure to quantify the suitability of a given allometric model to new sites. This is illustrated graphically in Figure 6 and discussed in Section 4.3.1.
Additional References
[1] Picard et al. 2025 “Selecting allometric equations to estimate forest biomass from plot rather than individual-level predictive performance”, in BG, 22, 1413–1426, 2025
[2] Contreras et al. 2025 “Multi-source remote sensing for large-scale biomass estimation in Mediterranean olive orchards using GEDI LiDAR and machine learning”, in BG, 22, 7625–7646, 2025
Citation: https://doi.org/10.5194/egusphere-2025-6341-AC1
AC2: 'Proposed Manuscript Update', Eustache Diemert, 18 Mar 2026
After reviewing Anonymous Referee 1 comments we propose to improve the following:
- add a discussion on model complexity vs improved precision in Section 3.1
- add a link to the open source code of the experiments at the start of Section 3 -> https://github.com/PUR-Projet/contextual_allometric_models
Citation: https://doi.org/10.5194/egusphere-2025-6341-AC2
RC2: 'Comment on egusphere-2025-6341', Anonymous Referee #2, 27 May 2026
General comments
This study proposes a Machine Learning approach to contextualize allometric models for tree biomass prediction. Contextualization means adding environmental variables to the predictors of the allometric model. A model for accounting for the additional error caused by shifting environmental conditions is also proposed. The method is applied to a subset of the tree biomass dataset of Chave et al. (2014). The approach proposed is original. The point of view of Statistical Learning and Domain Adaptation that is adopted in this study is also very interesting.
Contextualizing allometric models with environmental information is indeed an important subject. The authors present it as a new idea, but it has actually been addressed by several studies. Chave et al. (2014) proposed for instance an allometric model depending on tree diameter, species wood density, and an index E that was formed from climatic variables (Eq. (7) in their paper). This index was used to modulate the height-diameter allometry depending on the environmental conditions. Another example is the study by Wang et al. (2023, https://doi.org/10.1038/s41598-023-28843-2) who compiled allometric equations of the kind $\mathrm{AGB} = aD^{b}$ or $\mathrm{AGB} = a(D^{2}H)^{b}$ globally, and subsequently modelled the dependence of the a and b coefficients on environmental variables at the global level. It would be useful to introduce the issue of model contextualization in the context of previous studies.
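For readers less familiar with this family of equations, a power law AGB = a·D^b becomes a straight line after taking logarithms, log(AGB) = log(a) + b·log(D), so its two coefficients can be obtained by ordinary least squares. The sketch below illustrates this on noiseless synthetic values; the function name `fit_power_law` and the numbers are hypothetical, and the correction usually applied when back-transforming noisy log-scale fits is deliberately left out.

```python
import math


def fit_power_law(diameters, biomasses):
    """Fit AGB = a * D**b by least squares on log-transformed values.

    Returns (a, b). No back-transformation bias correction is applied.
    """
    xs = [math.log(d) for d in diameters]
    ys = [math.log(m) for m in biomasses]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


# Hypothetical trees: diameters in cm, biomass generated from a = 0.1, b = 2.5.
diameters = [5.0, 10.0, 20.0, 40.0, 80.0]
biomasses = [0.1 * d ** 2.5 for d in diameters]
a_hat, b_hat = fit_power_law(diameters, biomasses)
print(f"a = {a_hat:.4f}, b = {b_hat:.4f}")
```

Contextualization in the sense of the studies cited above amounts to letting a and b, or additional terms, depend on environmental variables rather than being single global constants.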
The fact that contextual information reduces the residual error of allometric models is not new and similar gains in predictive performance can be obtained even with classical modelling approaches. For instance, the contextual equation (7) by Chave et al. (2014) reports a residual standard error of 0.413, but this residual standard error increases to 0.517 when dropping the E index from the model predictors. Hence, contextualisation in this case brought a 20% improvement in the residual standard error. The gain provided by machine learning approaches must therefore be put into perspective in light of the gain that contextualization provides in any case.
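The 20% figure quoted above follows directly from the two residual standard errors, taking the model without the E index as the reference:

```latex
\[
\frac{0.517 - 0.413}{0.517} \;=\; \frac{0.104}{0.517} \;\approx\; 0.201 \;\approx\; 20\,\%
\]
```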
One concern with machine learning approaches is the reusability of the models. Chave et al.’s pantropical model is easy to use because you only need to know the two coefficients α and β of the model. With machine learning models, there are often hidden parameters that are difficult to communicate, so that the fitted models are hardly reusable. In the present case, the authors should provide a way for anybody to reuse the models they fitted.
The hidden parameters in machine learning models bring another question regarding the comparison of the model predictive performances. The predictive performance of a model should be penalized by the number of free parameters used by the model. In the present case, Chave et al.’s model, which has two free parameters, and the machine learning models are compared on the basis of predictive performance criteria (RMSE, MAE, R²) that do not account for the number of parameters, which is not very fair. Random cross-validation is not very appropriate either because it maintains the same data structure in the calibration and in the validation datasets. Machine learning techniques are often very good at detecting these data structures, but may fail to extrapolate to data structures for which they have not been trained. In contrast, models grounded in biological principles may be more robust. Therefore, the authors should provide an estimate of the number of free parameters involved in each model. They should also use non-random block cross-validation rather than random cross-validation. Here, the blocks could consist of the different sites, or of the different distribution shifts defined in §2.3.3.
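As an illustration of the block cross-validation suggested above, the sketch below builds one fold per site, so that all trees from the held-out site are excluded from calibration. The site labels and the helper name `leave_one_group_out` are hypothetical; the same splitting logic would apply to blocks defined by distribution shifts instead of sites.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one fold per group.

    Every observation of the held-out group goes to the test set, so the
    model is never evaluated on a block it has seen during calibration.
    """
    by_group = defaultdict(list)
    for index, group in enumerate(groups):
        by_group[group].append(index)
    for group in sorted(by_group):
        test_idx = by_group[group]
        train_idx = [i for i, g in enumerate(groups) if g != group]
        yield group, train_idx, test_idx


# Hypothetical site label for each tree in a dataset.
sites = ["site_A", "site_A", "site_B", "site_B", "site_B", "site_C"]
folds = list(leave_one_group_out(sites))
for held_out, train_idx, test_idx in folds:
    print(f"held out {held_out}: train on {train_idx}, test on {test_idx}")
```

Compared with random splits, the errors obtained this way measure how well a model extrapolates to a site it was not calibrated on, which is closer to how a pan-tropical model is used in practice.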
Some effort should be made to clarify mathematical notations and acronyms. Some notations and acronyms are used without any definition.
Specific comments
P1L36, “early efforts dating from the work of Smith and Brand (1983)”: there are many earlier works on tree allometry. The concept of allometry was coined by Huxley & Teissier in 1936 (https://doi.org/10.1038/137780b0). The GlobAllomeTree database (http://www.globallometree.org) lists 2640 allometric equations that were published before 1983. The oldest equation in this database dates from 1947 (https://pub.epsilon.slu.se/9900/1/medd_statens_skogsforskningsinst_036_03.pdf).
P2L28: the acronym MAE is used without being defined. The same acronym is subsequently used at P4L15 without being defined either. Please clarify.
P3L65-78: many mathematical notations are not defined. Please define them. “Predictor” is used as a synonym of “model” (the model h is called a predictor), whereas “predictor” in statistical modelling usually refers to an explanatory variable of the model. Please clarify.
P4L12: shouldn’t we read $\Delta_{s,t}L$ instead of $\Delta_{s,t}$?
Table 2: the reference values are called “threshold values” in the text (P4L40-45) so the column label “Reference” and the table caption should be changed to “Threshold” to be consistent with the text. The meaning of the right arrows is unclear.
P5L56, “HGBRT […] behaves surprisingly badly”: please clarify that you refer to the predictive performance criteria on the test dataset, because the predictive performance criteria on the training dataset are actually quite good for the HGBRT model.
Figure 4 and P4L67-68: the meaning of Fig. 4 is unclear. What does the y-axis of the figure correspond to? The density of what is shown as a grey area? Presumably it is the density of distribution of log(AGB) in a dataset, but which dataset? The text speaks of “calibration curve” but this concept has not been introduced in the Methods section. Please clarify in the Methods section what is shown in Fig. 4.
P6L44-45, “reduced error in AGB estimation should decrease uncertainty”: isn’t it tautological? Please rephrase.
Figure 5: the red area is explained in the text but not in the figure caption. Please also explain it in the figure caption so that the figure is self-explanatory.
P8L13: making tree biomass primary data available was one of the objectives of the GlobAllomeTree platform (http://www.globallometree.org).
P8L28: what do you mean by “more conservative biomass estimates”?
Typos
- Parentheses around citations are often mistyped (“introduced in (Chave et al., 2005)” instead of “introduced in Chave et al. (2005)”, etc.)
- P4L27: remove the comma at the beginning of the line.
- P4L47: type “≥” instead of “>=”.
- P5L70, “is be suboptimal”: remove “be”.
Citation: https://doi.org/10.5194/egusphere-2025-6341-RC2
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 637 | 0 | 16 | 653 | 0 | 0 |
Biomass estimation models are used to estimate tree biomass from simple physical measurements, but choosing the right model for specific conditions is difficult. This paper reanalyzes a global dataset by developing a machine learning model. The resulting model is claimed to be much improved as it reduces prediction error by 17%. Improved biomass estimation models should be based on (1) larger sample sizes, (2) a better representation of a wider range of real-life conditions. Also, developers should ensure that (3) the resulting models are correctly evaluated using the proper goodness of fit methods (Sileshi 2014), and (4) minimize the goodness of fit and show significant improvements over the previous generation of models. Finally they should ensure that (5) they are easy to implement for a wide range of practitioners (including private owners, forestry consultants, academics, and businesses), and the conditions of their use are clearly stated.
This paper addresses one of the five goals, namely goal 4. It does nothing to address the crucial aspects (1) and (2), and the practical implementation of the method (condition 5) is likely to be more complex rather than simpler. Code availability in Python but also R is not reported, and the fact that the authors have a competing interest in this development may explain this situation. Unfortunately, it is unlikely that this manuscript will make a valuable addition to the academic literature. Below are major issues with this study, none of which, unfortunately, is fixable with a minor revision of the text.
First, the argument that including more predictors in a statistical model generally improves its fit to observed data is as old as regression theory. Because the problem at hand is simple (non-linear regression of a single predicted variable), it makes it clear that adding environmental variables as predictors improves the fit. That model (3), vastly more complex than model (1), leads to a reduction of only 17% of the MAE should raise the question of whether this tremendous increase in complexity is worth the effort. There is no clear answer to this question in this manuscript because it is predicated on the assumption that the 4004-tree dataset encapsulates the full universe of possibilities. This is a serious shortcoming. In fact, looking at the main result, the 17% reduction in MAE, this result is reported in Table 3. It is shown that none of these complex models perform so well in a cross-validation test relative to the baseline reported in the first column. Gains in RMSE and MAE are at best modest, so this method is better seen as a proof of concept rather than a breakthrough result for biomass regression models.
Second, the text is written for an audience of data scientists, and it misses its potential audience. For a readership with a training in data science, this study is an application of established methods. It may be of interest for the data science community precisely because it is so simple, in which case the manuscript should be submitted to a journal of statistical learning. The intention to submit to Biogeosciences is presumably motivated by the fact to reach out to the user community (and to clients, also, given the competing interests). Users (foresters, or private actors) will however likely find this text totally opaque. Section 2.2.1 is a case in point ("context-agnostic baselines", "we adjunct a L2 regularization", "hyper-parameter optimization", "target encoding") but the whole text is full of technicalities that serve no other apparent purpose than making an impression on the non-specialized reader. If the goal is to reach out to the user community, the recommendation is to take a radically different approach and explain each and every step, assuming no prior knowledge in statistical learning methods. This would imply to drastically cut down the material presented, to provide worked-out examples of applications, and most importantly to make fully open access all the methods and scripts (both in Python and R, the latter being a more go-to language in the forestry community).
Third and last, section 2.3 seeks to explore situations where a biomass regression model may be applied outside of the conditions under which it was calibrated. In principle this is an important problem. In practice, Table 2 only demonstrates that some covariates are significant predictors in the regression exercise, which was the assumption at the outset. Notably, none of this is reported in the abstract. It would take more practical case studies for this theoretical section to be a convincing addition to the literature on biomass estimation models.