the Creative Commons Attribution 4.0 License.
mLDNDCv1.0: A Machine Learning-based Surrogate of LandscapeDNDC for Optimising Cropping Systems in Denmark
Abstract. Optimising Danish arable management is critical for reducing greenhouse‐gas (GHG) emissions and nitrogen (N) losses while maintaining or even improving crop productivity and soil health. Process-based models such as LandscapeDNDC can simulate the effects of management on agroecosystem functioning. However, their computational demand limits large-scale optimisation. Here we present mLDNDCv1.0, a tree-based machine-learning surrogate of LandscapeDNDC that allows for the rapid exploration of large decision spaces without sacrificing mechanistic fidelity. We generated a synthetic training set of >45 million LandscapeDNDC simulations from a full factorial of soils, climate (2011–2020), and management options for winter wheat. We benchmarked gradient-boosted tree algorithms (LightGBM, XGBoost, CatBoost) on predictive performance. XGBoost delivered the most accurate and stable predictions for the core indicators in this study: soil N2O emissions (R2 = 0.81), NO3− leaching (R2 = 0.84), yield (R2 = 0.93), and soil-organic-carbon stock changes (R2 = 0.86). The model maintained high accuracy when confronted with real management and environmental settings that reflected true operating conditions. Coupling mLDNDC with the multi-objective evolutionary algorithm NSGA-II allowed us to optimise millions of management combinations across all winter wheat fields in Denmark. Pareto-optimal solutions reduced N2O emissions by 27.5 ± 4.5 % and NO3− leaching by 27 ± 3.0 %. These solutions also increased grain yield by 8.5 ± 1.5 % and soil-organic-carbon stocks by 1.2 ± 0.1 %, and improved nitrogen-use efficiency (NUE) by 10 ± 2 %, while turning the system into a net GHG sink (2200 ± 400 Mg CO2-eq ha−1 yr−1). These gains were achieved without increasing total fertiliser input. They arose from re-allocating mineral and organic fertiliser N input, adjusting incorporation depth, and optimising residue, catch-crop, and irrigation practices.
Thus, mLDNDC provides a scalable, transparent framework for country-wide optimisation and real-time decision support in climate-smart agriculture.
Status: final response (author comments only)
-
CEC1: 'Comment on egusphere-2026-294 - No compliance with the policy of the journal', Juan Antonio Añel, 25 Mar 2026
-
AC1: 'Reply on CEC1', Jaber Rahimi, 26 Mar 2026
Dear Dr. Añel,
Thank you for your message and for highlighting this issue.
We have now released an updated version of the dataset on Zenodo to support the reproducibility of our study. The repository includes the harmonized field-level dataset for winter wheat in Denmark used for training the machine-learning surrogate model (mLDNDC), together with the associated feature engineering outputs at the field level.
https://doi.org/10.5281/zenodo.18573225
This dataset contains the management information and derived variables necessary to run and reproduce the surrogate modeling framework and optimization presented in our study. While some components of the original SmartField dataset are subject to data protection constraints (e.g., field coordinates), we have ensured that all essential inputs required to train and use the ML model are included in the repository.
We believe that this fulfills the requirements of the Code and Data Policy and allows reviewers and readers to reproduce the key results of the manuscript.
Could you please confirm whether this is sufficient for the manuscript to proceed in the discussion and review process?
Thank you very much for your guidance.
Best regards,
Dr. Jaber Rahimi, on behalf of the authors
Citation: https://doi.org/10.5194/egusphere-2026-294-AC1
-
CEC2: 'Reply on AC1', Juan Antonio Añel, 28 Mar 2026
Dear authors,
Many thanks for the quick reply. I can confirm that now the current version of your manuscript is in compliance with the policy of the journal.
Juan A. Añel
Geosci. Model Dev. Executive Editor
Citation: https://doi.org/10.5194/egusphere-2026-294-CEC2
-
RC1: 'Comment on egusphere-2026-294', Anonymous Referee #1, 12 Apr 2026
Overall, I think the manuscript presents an interesting and well-executed workflow. That said, several aspects need to be clarified or toned down before the results can be fully supported.
First, I think the manuscript consistently overstates the level of validation and generalization achieved. The surrogate is trained and evaluated primarily on synthetic data generated by the same process-based model. The "actual space" evaluation still relies on model-generated outputs, and independent validation is effectively limited to crop yield. In that sense, what is demonstrated here is a good emulation of LandscapeDNDC within the sampled domain, rather than generalization to real-world conditions. This is particularly relevant for N2O, NO3 leaching, and SOC, where no independent validation is provided. I would suggest toning down statements around "real-world conditions", "transferability", and decision-support applicability, and making it clearer that the results are conditional on the model and the constructed dataset.
Second, I found the description of the validation workflow somewhat unclear. The manuscript mentions an 80/20 train-test split, but also refers to five-fold cross-validation. It is not clear whether cross-validation was restricted to the training subset or applied to the full dataset, which would compromise the independence of the test set. This should be clarified. More importantly, even if implemented correctly, the current validation design is not very demanding. A random split within a synthetic factorial dataset mainly tests interpolation within a highly structured space. It does not provide a strong assessment of generalization across management regimes or environmental conditions. I would therefore be more cautious in interpreting the reported performance as evidence of transferability.
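The two designs contrasted in this comment can be illustrated with a minimal sketch. It is not the authors' code: the row count, the regime labels, and all function names are hypothetical. It shows cross-validation folds drawn only from the training subset, so that the 20 % test set stays untouched, and a group-wise hold-out that withholds one whole management regime instead of a random sample.

```python
import random

def train_test_split_indices(n_rows, test_fraction=0.2, seed=0):
    """Randomly hold out a test set; returns (train_idx, test_idx)."""
    rng = random.Random(seed)
    idx = list(range(n_rows))
    rng.shuffle(idx)
    n_test = int(round(n_rows * test_fraction))
    return idx[n_test:], idx[:n_test]

def k_fold_on_training(train_idx, k=5):
    """Yield (fit_idx, val_idx) folds drawn from the training subset only."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, val

def leave_group_out(groups, held_out):
    """Hold out every row belonging to one group (e.g. a management regime)."""
    train = [i for i, g in enumerate(groups) if g != held_out]
    test = [i for i, g in enumerate(groups) if g == held_out]
    return train, test

# Hypothetical toy data: 100 rows, each tagged with a made-up regime label.
regimes = ["mineral_only", "organic_only", "mixed", "irrigated"]
groups = [regimes[i % 4] for i in range(100)]

# Design 1: random 80/20 split, with 5-fold CV restricted to the 80 % part.
train_idx, test_idx = train_test_split_indices(len(groups))
folds = list(k_fold_on_training(train_idx))

# Design 2: withhold one entire regime to probe extrapolation across regimes.
grp_train, grp_test = leave_group_out(groups, "irrigated")
```

In the first design the test rows never enter any fold, which is the independence the comment asks about. The second design is the more demanding test, because the held-out regime is absent from training altogether.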
Third, the whole framework stands on the synthetic dataset and the way the decision space is defined, but this part remains somewhat under-specified. In particular, it is not entirely clear how unrealistic or inconsistent combinations were identified and removed, or how sensitive the results are to the chosen parameter ranges and constraints. Given how strongly both the surrogate and the optimization depend on these choices, I think this deserves a more explicit discussion.
Specific points follow,
l24, this is too strong given what is actually done. A tree-based surrogate trained on model outputs does not preserve mechanistic fidelity in any strict sense. At best, it approximates the response surface of the parent model within the sampled domain.
l30, this is somewhat misleading. The so-called "real" conditions still rely on model-generated outputs, not independent observations (except partially for yield). This is not a true real-world validation, and I would say the sentence overstates the level of external validation.
l32, this sounds more general than it is. It would be more precise to state that the optimization is conditional on the model and the defined decision space.
l36, 2200 ± 400 Mg CO2-eq ha-1 yr-1 is highly questionable. The magnitude appears unrealistically large for cropland systems.
l39, the study demonstrates a model-based optimization workflow, not a validated decision-support system. "Real-time decision support" in particular is not demonstrated.
l133, it is not clear whether the design leads to unrealistic combinations, especially when multiple categorical and continuous variables are combined. This needs more justification.
l148, the optimization results will be entirely conditioned on these boundaries, yet it is not clear how they were defined, how strict they are, or how sensitive results are to these choices.
l214, This is a key step but not described in sufficient detail. How many combinations were removed? How sensitive are results to these rules? This directly affects the training distribution.
l245, despite this, the manuscript assumes that prior validation is sufficient for all variables and conditions considered here, which may need explicitly stating here.
l305, I would say that a simple random split may not be sufficient to test generalization across management regimes.
l351, but validation against yield alone is not sufficient to claim general transferability, especially for the other target variables.
l404-406, I think it effectively removes all trade-offs and selects only "win–win" solutions. As a result, the reported improvements are no longer representative of the Pareto space but of a heavily filtered subset. The authors should clarify this point.
l483, this applies only to yield, but other variables are not independently validated. The sentence generalizes beyond what is actually shown.
l485, not fully convinced; agreement with yield alone does not demonstrate that the model captures underlying processes, especially for nitrogen and carbon dynamics.
l548, that's true, yet the interpretation in the section occasionally goes beyond this (consistency check that the surrogate has not learned spurious relationships). Given that the surrogate is trained on model-generated outputs, the SHAP analysis primarily reflects the behavior of the parent model instead of independent evidence of processes. This distinction could be made more explicit in the section.
Citation: https://doi.org/10.5194/egusphere-2026-294-RC1
-
AC2: 'Reply on RC1', Jaber Rahimi, 20 May 2026
General remarks
We are grateful for the evaluation of the manuscript and have addressed the issues and questions point by point. Doing so will certainly improve the comprehensiveness of the text.
Specific answers to the remarks (RC: Reviewer’s Comment; AC: Authors’ Response)
Reviewer 1
RC: l24, this is too strong given what is actually done. A tree-based surrogate trained on model outputs does not preserve mechanistic fidelity in any strict sense. At best, it approximates the response surface of the parent model within the sampled domain.
AC: Thank you for your valuable feedback. We have reviewed the manuscript and updated it as shown below.
l22-24
“Here we present mLDNDCv1.0, a tree-based machine-learning surrogate of LandscapeDNDC that allows for the rapid exploration of large decision spaces while maintaining high fidelity to the parent process-based model's input-output behaviour.”
RC: l30, this is somewhat misleading. The so-called "real" conditions still rely on model-generated outputs, not independent observations (except partially for yield). This is not a true real-world validation, and I would say the sentence overstates the level of external validation.
AC: Thank you for this valuable observation. We acknowledge that our original phrasing overstated the nature of the validation. To clarify: the surrogate model was trained on synthetic management data with corresponding LandscapeDNDC-simulated outputs (N2O, NO3, yield, and SOC) as labels. For validation, real field activity data were used as inputs to both the process-based model and the surrogate, and the surrogate predictions were compared against the LandscapeDNDC outputs. We recognise that this constitutes a model-to-model comparison rather than independent real-world validation. However, for crop yield, we were additionally able to compare the surrogate predictions against observed yield data not derived from any model. We have revised the text to clearly distinguish between these two levels of validation.
l30-32
"When evaluated on real field activity data from Denmark, the surrogate model closely reproduced the process-based model outputs, with particularly strong agreement for crop yield, which was further corroborated by independent observational data."
RC: l32, this sounds more general than it is. It would be more precise to state that the optimization is conditional on the model and the defined decision space.
AC: Thank you for this observation. We agree that the original phrasing could be interpreted as implying a general, model-independent optimisation. We have revised the text to explicitly state that the optimisation operates within a predefined decision space and is conditional on the surrogate model. The identified optimal solutions are therefore subject to both the boundaries of the decision variables considered and the fidelity of the surrogate to the underlying process-based model. We believe the revised wording makes this conditionality clear.
l33-35
“Coupling mLDNDC with the multi-objective evolutionary algorithm NSGA-II allowed us to optimise millions of management combinations within a predefined decision boundary across all winter wheat fields in Denmark.”
RC: l36, 2200 ± 400 Mg CO2-eq ha-1 yr-1 is highly questionable. The magnitude appears unrealistically large for cropland systems.
AC: Thank you for flagging this. Upon review, we identified a unit error. The reported value of 2200 ± 400 Mg CO2-eq yr⁻¹ represents the aggregate annual soil-net GHG balance across all winter wheat fields considered, not a per-hectare estimate. The inclusion of ha⁻¹ was erroneous. We have corrected the unit in both the main text (line 36, now line 38) and the Figure 7 caption, which now reads:
l774-775
“Figure 7. Soil-net GHG balance (Mg CO₂-eq yr⁻¹) under baseline and optimised management scenarios, averaged over the 10-year period 2011–2020; negative values indicate greater net climate-mitigation potential.”
RC: l39, the study demonstrates a model-based optimization workflow, not a validated decision-support system. "Real-time decision support" in particular is not demonstrated.
AC: Thank you for this important distinction. While development as a real-time and validated decision support tool remains a future goal, we recognise this is beyond the scope of the current work and have ensured the revised manuscript does not imply otherwise. We have removed all references to real-time decision support and rephrased the paragraph to reflect accurately what this study contributes, as shown below.
l40-42
“Thus, mLDNDC provides a scalable, transparent framework for country-wide scenario comparison and strategic planning at an annual time scale in climate-smart agriculture.”
RC: l133, it is not clear whether the design leads to unrealistic combinations, especially when multiple categorical and continuous variables are combined. This needs more justification.
AC: Thank you for this valuable feedback. We address unrealistic combinations using two complementary filters: field-level activity data and scientific reality.
First, field-level activity data served as a baseline to identify and remove implausible combinations. For instance, fertiliser replication (a variable representing how many times fertiliser was applied) could still carry a non-zero value even when both synthetic and organic fertiliser amounts are 0 kg, which is scientifically meaningless. Such rows were removed.
Second, scientific reality was applied to constrain continuous variable ranges. Based on observed field activity data, there are realistic thresholds below which a second or third fertiliser application would not occur in practice. For example, a 30 kg fertiliser amount would not realistically be applied two or three times in Denmark. A comprehensive set of conditions derived from both field data and domain knowledge was therefore used to filter such combinations.
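A minimal sketch of this kind of rule-based screening is given below, for illustration only. The factor levels, the per-application threshold, and the rule names are assumptions chosen for the example and do not come from the manuscript or the Zenodo dataset. Counting removals per rule is one simple way to report how much each rule shapes the retained set.

```python
from itertools import product

# Hypothetical factor levels; not the study's actual design.
mineral_n = [0, 30, 120]   # kg N, assumed levels
organic_n = [0, 60]        # kg N, assumed levels
applications = [1, 2, 3]   # number of fertiliser applications

MIN_N_PER_APPLICATION = 25  # assumed threshold, illustrative only

def zero_input_with_applications(row):
    """No fertiliser at all, yet a non-zero number of applications."""
    mineral, organic, n_apps = row
    return mineral + organic == 0 and n_apps > 0

def too_little_per_application(row):
    """Total amount too small to be split into two or three applications."""
    mineral, organic, n_apps = row
    return n_apps >= 2 and (mineral + organic) < n_apps * MIN_N_PER_APPLICATION

# Rules are checked in order; a row is attributed to the first rule it violates.
rules = {
    "zero_input_with_applications": zero_input_with_applications,
    "too_little_per_application": too_little_per_application,
}

removed = {name: 0 for name in rules}
kept = []
for row in product(mineral_n, organic_n, applications):
    violated = next((name for name, rule in rules.items() if rule(row)), None)
    if violated is None:
        kept.append(row)
    else:
        removed[violated] += 1

print(f"generated={len(kept) + sum(removed.values())} kept={len(kept)} removed={removed}")
```

With these assumed levels, a total of 30 kg combined with two or three applications is dropped by the second rule, mirroring the example given above.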
We have further clarified this in the manuscript and updated the relevant section accordingly.
l135-142
“Second, we varied these variables within their agronomic limits using a full factorial design to generate a large set of unique management combinations representing the full range of plausible practices within the Danish agricultural context. While the factorial design is comprehensive by nature, it inevitably produces unrealistic combinations. These were systematically filtered out based on agronomic constraints and consistency with observed field practice. For instance, fertiliser replication (the number of times fertiliser is applied) was set to zero wherever both synthetic and organic fertiliser amounts were 0 kg, and application frequency was constrained by realistic quantity thresholds. For example, a total fertiliser amount of 30 kg would not feasibly support two or three separate applications.”
RC: l148, the optimization results will be entirely conditioned on these boundaries, yet it is not clear how they were defined, how strict they are, or how sensitive results are to these choices.
AC: Thank you for this valuable feedback. You are correct that the optimisation results are conditioned on the defined boundaries. These boundaries were derived directly from field-level activity data, capturing the observed range of current agricultural practices and regulations in Denmark. The rationale is intentional: the goal is to discover management options that fields have not yet explored but that remain within the bounds of current agricultural reality, ensuring that optimised strategies are practically feasible rather than theoretically ideal.
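One way to read "derived directly from field-level activity data" is sketched below, purely as an illustration. The records, variable names, and values are made up, and taking the observed minimum and maximum of each variable is an assumption about how such bounds could be formed, not a description of the study's procedure.

```python
# Hypothetical field records; variable names and values are made up.
observed = [
    {"mineral_n": 150.0, "incorporation_depth_cm": 10.0},
    {"mineral_n": 90.0, "incorporation_depth_cm": 25.0},
    {"mineral_n": 180.0, "incorporation_depth_cm": 15.0},
]

def observed_bounds(records):
    """Return {variable: (min, max)} over all records."""
    bounds = {}
    for rec in records:
        for name, value in rec.items():
            lo, hi = bounds.get(name, (value, value))
            bounds[name] = (min(lo, value), max(hi, value))
    return bounds

def within_bounds(candidate, bounds):
    """True if every variable of a candidate lies inside its observed range."""
    return all(bounds[k][0] <= v <= bounds[k][1] for k, v in candidate.items())

bounds = observed_bounds(observed)

# A candidate combining values no single field used, but inside every range.
ok = within_bounds({"mineral_n": 120.0, "incorporation_depth_cm": 20.0}, bounds)

# A candidate outside the observed fertiliser range is rejected.
too_high = within_bounds({"mineral_n": 220.0, "incorporation_depth_cm": 20.0}, bounds)
```

Under this reading, the optimiser can propose combinations that no field has used so far, but never a single value beyond what has been recorded.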
Regarding strictness, the boundaries are data-driven and therefore reflect what has actually been practised in Denmark, making them neither arbitrarily conservative nor overly permissive. We acknowledge, however, that results are inherently conditioned on these choices, and we have made this more transparent in the updated manuscript as follows.
l155-161
“Management boundaries were identified from Danish agricultural registry data and agronomic guidelines to represent the observed range of current agricultural practices in Denmark. These boundaries served as reference points for defining the limits of each management variable. A management library was then constructed, encapsulating all possible combinations of these management parameters for further use in the factorial design stage. The optimisation algorithm operates within these boundaries, ensuring that identified management strategies are grounded in current agricultural reality while still revealing unexplored but feasible options.”
RC: l214, This is a key step but not described in sufficient detail. How many combinations were removed? How sensitive are results to these rules? This directly affects the training distribution.
AC: Thank you for raising this point. We agree that this filtering step is critical and warranted more detailed description. We have expanded the relevant section accordingly.
In total, approximately 9.1 million rows were removed from the roughly 13.6 million generated combinations, retaining around 4.5 million realistic management scenarios for further use. Regarding sensitivity, the filtering rules are grounded in field-level activity data and scientific knowledge rather than arbitrary thresholds, which limits subjectivity in their application. Rather than distorting the training distribution, this filtering step improves it by removing implausible combinations that would otherwise introduce noise and reduce model reliability. The retained dataset therefore reflects a more focused and realistic representation of the management space, which we argue is preferable for training a model intended to support practical decision-making. The number of combinations removed is stated in Section 2.2.
The manuscript has been updated as follows.
l224-228
“Therefore, the synthetic dataset was systematically screened and cleaned using rule-based plausibility checks to eliminate such combinations and ensure that the remaining scenarios reflect realistic and interpretable management configurations. Rather than undermining the training distribution, this step refines it by concentrating the training data on agronomically plausible scenarios, improving both model reliability and the interpretability of optimisation results.”
RC: l245, despite this, the manuscript assumes that prior validation is sufficient for all variables and conditions considered here, which may need explicitly stating here.
AC: We appreciate the reviewer’s comment and agree that our original wording could be read as implying overly comprehensive validation across all variables and conditions. We have revised the manuscript and now explicitly state that LDNDC has been evaluated against multiple independent outputs (SOC, N2O, yield) under diverse conditions, which supports the consistency of its process representation and justifies its use as the parent model for our surrogate. At the same time, we emphasise that neither LDNDC nor the surrogate can be assumed to be correct in all applications, and that use of the surrogate requires careful evaluation, as would be the case for the parent model. The paragraph has now been updated as shown below.
l257-263
"Given the extensive international and Danish-level validation of LDNDC cited above, and its demonstrated agreement with observations for multiple quantities (e.g., SOC dynamics, N₂O fluxes, crop yields) under diverse conditions, the model's process representation is well-supported across a range of conditions. LDNDC therefore provides a defensible foundation for the surrogate-modelling work carried out here. Nevertheless, neither LDNDC nor its surrogate can be assumed valid for every variable, management regime, or environmental condition considered. Applications of the surrogate therefore require careful evaluation, as would be expected for the parent process-based model itself."
RC: l305, I would say that a simple random split may not be sufficient to test generalization across management regimes.
AC: Thank you for this valuable feedback. We appreciate this important point. It is worth clarifying that the random split at this stage served a specific and limited purpose: to benchmark and select the best-performing model for full training, rather than to assess generalisation across management regimes. XGBoost was identified as the best-performing model through this comparison and was subsequently taken forward for hyperparameter tuning. The final model was then trained on the full dataset using cross-validation with the tuned hyperparameters, which provides a more robust assessment of generalisation performance. The manuscript has been updated as follows.
l321-325
“Before training on the full data, it is vital to select which model will be used for this final training. The dataset was therefore split into two parts: 80% for model training and the remaining 20% withheld as an independent test set for evaluating generalisation performance. This ratio is widely adopted in machine learning practice as it balances the data available for learning with the need for reliable evaluation (Kuhn & Johnson, 2013).”
RC: l351, but validation against yield alone is not sufficient to claim general transferability, especially for the other target variables.
AC: Thank you for this valuable feedback. We fully acknowledge this limitation. External validation data were only available for winter wheat yield, which constrained our ability to validate transferability across all target variables. We have therefore been careful not to claim general transferability beyond yield in the manuscript. The limitation of lacking external validation data for the remaining target variables has been explicitly stated in Section 4 as part of the study's limitations. The relevant manuscript text has been updated as follows.
l368-379
“This dataset represents real-world observations across diverse management and environmental conditions, providing a meaningful basis for evaluating model transferability with respect to winter wheat yield beyond the simulation domain. While site-scale observations for N₂O, NO₃, and SOC exist in Denmark and have been previously evaluated (Aderele et al., 2025; Grados et al., 2024; Kollmer, 2023; Rahimi et al., 2024), national-scale validation data for these variables were unavailable, precluding transferability assessment at the scale considered in this study. It should be noted, however, that LDNDC has also been evaluated at site scale for winter wheat systems in other European contexts, including Germany (Haas et al., 2021; Kasper et al., 2018; Molina-Herrera et al., 2016), supporting confidence in its broader process representation. Nonetheless, transferability claims in this study remain limited to yield, and this is acknowledged as a key limitation. Future work should seek to validate surrogate performance across the full set of agroecosystem indicators as suitable observational datasets become available."
RC: l404-406, I think it effectively removes all trade-offs and selects only "win–win" solutions. As a result, the reported improvements are no longer representative of the Pareto space but of a heavily filtered subset. The authors should clarify this point.
AC: We thank the reviewer for the valuable feedback. The filtering step was intentionally designed to retain only management solutions that simultaneously outperform the baseline across all target indicators, effectively selecting "win-win" solutions. We acknowledge that this approach does not represent the full Pareto space but rather a filtered subset of it. This choice was deliberate, as the practical goal of the study is to recommend management strategies that offer clear and unambiguous improvements for farmers and policymakers, rather than solutions that involve trade-offs between indicators. Nevertheless, we recognise that this should be stated transparently, and the manuscript has been updated accordingly.
l431-438
“Specifically, an optimised solution was retained only if it produced lower nitrous oxide emissions and lower nitrate leaching while achieving higher crop yield and greater soil organic carbon content than the baseline management for that field. This requirement ensured that recommended solutions deliver clear improvements across all key sustainability indicators rather than merely redistributing environmental impact. It is important to note that the reported improvements therefore reflect a filtered subset of the full Pareto space, specifically those solutions meeting all four criteria simultaneously, and should be interpreted as such rather than as a representation of all available trade-off solutions.”
RC: l483, this applies only to yield, but other variables are not independently validated. The sentence generalizes beyond what is actually shown.
AC: Thank you for this observation. We agree that the original sentence generalised beyond what was directly demonstrated. The independent validation using Statistics Denmark yield data applies solely to winter wheat yield, and we have revised the manuscript to state this explicitly. While external observational datasets for the remaining target variables were unavailable, we note that the surrogate model shows strong agreement with the parent process-based model (LandscapeDNDC) across all output variables, as demonstrated in the cross-validation results, which provides confidence in its predictive capacity more broadly. The absence of independent external validation for these variables is acknowledged as a limitation in Section 4. The relevant manuscript text has been updated as follows.
l512-519
"When model predictions on national data were compared to Denmark national yield data for winter wheat from Statistics Denmark (https://www.dst.dk), the fully trained XGBoost model achieved an R² of 0.77, indicating strong agreement between predicted and observed winter wheat yields (Figure S.1). This independent validation was limited to yield, as national-scale estimates for the remaining target variables are unavailable at the crop level. However, the surrogate model's close reproduction of LandscapeDNDC outputs across all target variables in cross-validation suggests reliable predictive performance beyond yield, though independent confirmation remains a priority for future work."
RC: l485, not fully convinced; agreement with yield alone does not demonstrate that the model captures underlying processes, especially for nitrogen and carbon dynamics.
AC: Thank you for this comment. We respectfully note that the surrogate model is designed to approximate the outputs of LandscapeDNDC, a process-based model whose representation of nitrogen and carbon dynamics has been extensively validated in prior studies (as detailed in Section 2.2.1). The surrogate's role is not to independently capture biophysical processes, but to faithfully reproduce the behaviour of the parent model. Our cross-validation results demonstrate that the surrogate achieves this across all target variables, including nitrogen and carbon indicators. The agreement with independent yield observations provides an additional layer of confidence by confirming that both the PBM and the surrogate produce outputs consistent with real-world data for at least one key variable. We have revised the text to clarify this distinction between process representation, which is inherited from the PBM, and predictive skill, which is evaluated through both cross-validation and independent yield data.
l519-527
“This level of performance demonstrates the surrogate model's predictive skill for winter wheat yield at a national scale and, together with the strong cross-validation agreement across all target variables, supports the model's capacity to faithfully approximate LandscapeDNDC's process-based outputs. Since the surrogate inherits its representation of nitrogen and carbon dynamics from the parent model, whose process fidelity has been established through extensive prior validation, agreement with independent yield observations provides additional confidence in the overall modelling framework. Comprehensive validation of nitrogen and carbon indicators against independent estimates at the crop level remains a priority for future work as suitable datasets become available.”
RC: l548, that's true, yet the interpretation in the section occasionally goes beyond this (consistency check that the surrogate has not learned spurious relationships). Given that the surrogate is trained on model-generated outputs, the SHAP analysis primarily reflects the behavior of the parent model instead of independent evidence of processes. This distinction could be made more explicit in the section.
AC: Thank you for this observation. We agree that this distinction warrants explicit statement and have updated the manuscript accordingly. As the surrogate is trained on LandscapeDNDC outputs, the SHAP analysis naturally reflects the behaviour of the parent model, and we do not claim otherwise. However, we would note that this is precisely the purpose of the analysis in our context: to verify that the surrogate has faithfully learned the input-output relationships encoded in the process-based model rather than fitting to spurious patterns in the training data. The fact that the resulting SHAP patterns align with well-established agronomic relationships documented in the literature (as referenced throughout the section) provides two-fold assurance: first, that the surrogate reliably reproduces the parent model's process logic, and second, that the parent model's own representations are consistent with domain knowledge. We have revised the text to make the interpretive scope of the SHAP analysis explicit.
l589-596
"Since the surrogate is trained on LandscapeDNDC-generated outputs, the SHAP analysis presented here reflects the learned behaviour of the parent model rather than providing independent empirical evidence of biophysical processes. This is by design: the analysis serves to verify that the surrogate faithfully captures the process logic of the parent model rather than fitting to spurious patterns. The consistency of the resulting SHAP attributions with agronomically established relationships reported in the referenced literature provides confidence in both the surrogate's fidelity and the coherence of the parent model's process representations."
Citation: https://doi.org/10.5194/egusphere-2026-294-AC2
AC2: 'Reply on RC1', Jaber Rahimi, 20 May 2026
RC2: 'Comment on egusphere-2026-294', Anonymous Referee #2, 19 Apr 2026
This study makes a good contribution by demonstrating a hybrid approach that integrates a process-based model with machine learning to optimize agroecosystem management practices. The previous reviewer has already provided an excellent and thorough review. Please see below some points that would help the authors improve the manuscript:
The first issue is the mechanistic fidelity claim. The paper repeatedly claims to preserve mechanistic fidelity, but the surrogate model actually preserves only the input-output mapping of the process-based model, not its mechanistic processes, and it cannot report the intermediate state variables. This distinction should be stated clearly.
The second issue is path dependency and legacy effects in agricultural systems. This approach ignores the trajectory-dependent nature of agroecosystem state variables. Soil organic carbon accrual, microbial community composition, and other soil-health-related variables are slow variables whose current states constrain future management responses and have feedback effects. The discussion should acknowledge that the two-year categorical rotation variable is an insufficient proxy for these cumulative legacy dynamics.
The third issue is temporal specificity and the real-time decision support claim. The surrogate predicts annual totals from static or seasonally aggregated features. Claiming suitability for real-time decision support overstates the tool's capability. Under operational farming conditions, real-time decision making is much more difficult because of within-season interaction effects of uncertain weather, crop phenology, and soil moisture dynamics, among others. It would be better to describe it as a scenario-comparison and strategic planning tool.
Overall, the manuscript reads well. As reported by the first reviewer, the authors need to address the circular validation architecture used in the study as a limitation. Also, it would be better to be more concise in the SHAP discussion sections, as the post-hoc interpretability of a statistical surrogate does not substitute for mechanistic diagnosis.
Citation: https://doi.org/10.5194/egusphere-2026-294-RC2
AC3: 'Reply on RC2', Jaber Rahimi, 20 May 2026
General remarks
We are grateful for the evaluation of our manuscript. We have addressed the issues and questions raised point by point below, which has improved the comprehensiveness of the text.
Specific answers to the remarks (RC: Reviewer’s Comment; AC: Authors’ Response)
Reviewer 2
RC: The first issue is the mechanistic fidelity claim. The paper repeatedly claims to preserve mechanistic fidelity, but the surrogate model actually preserves only the input-output mapping of the process-based model, not its mechanistic processes, and it cannot report the intermediate state variables. This distinction should be stated clearly.
AC: We thank the reviewer for this clarification and agree that the phrasing "mechanistic fidelity" could be misinterpreted. Our intended meaning is that the surrogate preserves the input-output behaviour of a mechanistic model, i.e., it reproduces LandscapeDNDC's responses with high accuracy, rather than that it replicates the underlying process representations or intermediate state variables. We have revised the manuscript to make this distinction explicit, replacing "mechanistic fidelity" with language that more precisely describes what is preserved (e.g., "fidelity to the process-based model's predictions").
l22-24:
“Here we present mLDNDCv1.0, a tree-based machine-learning surrogate of LandscapeDNDC that allows for the rapid exploration of large decision spaces while maintaining high fidelity to the parent process-based model's input-output behaviour.”
RC: The second issue is path dependency and legacy effects in agricultural systems. This approach ignores the trajectory-dependent nature of agroecosystem state variables. Soil organic carbon accrual, microbial community composition, and other soil-health-related variables are slow variables whose current states constrain future management responses and have feedback effects. The discussion should acknowledge that the two-year categorical rotation variable is an insufficient proxy for these cumulative legacy dynamics.
AC: We thank the reviewer for raising this important point. We agree that agroecosystem state variables such as soil organic carbon and microbial community composition are trajectory-dependent, and that a two-year categorical rotation variable cannot fully capture these cumulative legacy dynamics. We note that LandscapeDNDC does internally simulate these slow variables over the full modelling period, and so the surrogate's training data implicitly reflects their influence on outputs. However, the surrogate itself has no explicit representation of soil state trajectories, and cannot generalise to management histories outside those captured in the training simulations. We have added a paragraph to Section 4 discussing this design boundary and outlining how future versions could incorporate temporally explicit management sequences or slow state variables as additional inputs.
l863-873:
"The current surrogate framework represents management history through a categorical two-year crop rotation variable, which indexes distinct management trajectories as simulated by LandscapeDNDC. Because the process-based model tracks trajectory-dependent state variables, including soil organic carbon accrual and microbial community development, internally over the full modelling period, the surrogate's training data implicitly reflects their influence on modelled outputs for the rotation sequences considered. However, the surrogate does not explicitly represent the evolution of these slow state variables, and its applicability is therefore bounded by the management histories included in the training simulations. Future development could extend the framework by incorporating temporally explicit management sequences or slow state variables as additional inputs, further enhancing its capacity to represent path-dependent system behaviour across a broader range of management histories."
RC: The third issue is temporal specificity and the real-time decision support claim. The surrogate predicts annual totals from static or seasonally aggregated features. Claiming suitability for real-time decision support overstates the tool's capability. Under operational farming conditions, real-time decision making is much more difficult because of within-season interaction effects of uncertain weather, crop phenology, and soil moisture dynamics, among others. It would be better to describe it as a scenario-comparison and strategic planning tool.
AC: Thank you for this point. We agree that the original phrasing did not accurately reflect the temporal resolution of the surrogate, which operates on annual totals derived from static or seasonally aggregated inputs. The current framework is designed for evaluating and comparing management strategies at a planning level rather than for guiding within-season operational decisions, which would require resolving sub-annual dynamics that are beyond the scope of mLDNDC in its present form. We have revised the relevant text in the conclusions to describe the tool in terms consistent with its actual capabilities.
l40-42:
“Thus, mLDNDC provides a scalable, transparent framework for country-wide scenario comparison and strategic planning at an annual time scale in climate-smart agriculture.”
RC: Overall, the manuscript reads well. As reported by the first reviewer, the authors need to address the circular validation architecture used in the study as a limitation. Also, it would be better to be more concise in the SHAP discussion sections, as the post-hoc interpretability of a statistical surrogate does not substitute for mechanistic diagnosis.
AC: Thank you for this feedback. We have made the SHAP discussion sections more explicit and revised the text to clearly situate the interpretability analysis within its appropriate scope. Regarding the circular validation architecture, this has been addressed as a limitation in Section 4, as also requested by the first reviewer.
l589-596:
"Since the surrogate is trained on LandscapeDNDC-generated outputs, the SHAP analysis presented here reflects the learned behaviour of the parent model rather than providing independent empirical evidence of biophysical processes. This is by design: the analysis serves to verify that the surrogate faithfully captures the process logic of the parent model rather than fitting to spurious patterns. The consistency of the resulting SHAP attributions with agronomically established relationships reported in the referenced literature provides confidence in both the surrogate's fidelity and the coherence of the parent model's process representations."
Citation: https://doi.org/10.5194/egusphere-2026-294-AC3
Dear authors,
Unfortunately, after checking your manuscript, it has come to our attention that it does not comply with our "Code and Data Policy".
https://www.geoscientific-model-development.net/policies/code_and_data_policy.html
Having checked the Code and Data Availability section and the repository that you provide for the data, we have not found the harmonized field-level data from the SmartField project that you have used. If we have missed it, please let us know by replying to this comment, and disregard the remainder of this comment.
This issue should have been noticed earlier; because of it, your manuscript should not have been accepted for Discussions or peer review in the journal. The current situation is therefore irregular.
The GMD review and publication process depends on reviewers and community commentators being able to access, during the discussion phase, the code and data on which a manuscript depends, and on ensuring the provenance of replicability of the published papers for years after their publication. Please, therefore, publish your data in one of the appropriate repositories and reply to this comment with the relevant information (link and a permanent identifier for it (e.g. DOI)) as soon as possible. We cannot have manuscripts under discussion that do not comply with our policy.
The 'Code and Data Availability' section must also be modified to cite the new repository locations, and corresponding references added to the bibliography.
I must note that if you do not fix this problem, we cannot continue with the peer-review process or accept your manuscript for publication in GMD.
Juan A. Añel
Geosci. Model Dev. Executive Editor