the Creative Commons Attribution 4.0 License.
Ensemble Agroecosystem Modeling Enhances Predictions of Crop Yields and Soil Carbon Across the United States
Abstract. Accurately estimating crop yields and soil organic carbon (SOC) dynamics is essential for agricultural planning, carbon accounting, and sustainable land management. However, process-based agroecosystem models often produce divergent estimates due to variations in model structure, parameterization, and underlying assumptions. In this study, we developed a multi-model ensemble framework that integrates three widely used process-based models, Daily Century (DAYCENT), DeNitrification DeComposition (DNDC), and Ecosystem model (ECOSYS), to simulate crop yields and SOC stock changes (0–30 cm) across cultivated lands of the continental United States (CONUS) at 4 km² spatial resolution. Each model was parameterized using harmonized environmental, soil, and management datasets and evaluated using observed crop yields from the National Agricultural Statistics Service and measured SOC data from the Rapid Carbon Assessment. For the baseline period (2014–2023) under conventional corn-soybean rotation, the ensemble mean showed strong agreement with observations (corn: 7.7 vs. 8.5 Mg ha⁻¹, RMSE = 3.0; soybean: 2.5 vs. 3.0 Mg ha⁻¹, RMSE = 1.0), while simulated SOC stocks (5.5 vs. 4.8 kg C m⁻², RMSE = 2.5) closely matched measured data. Spatially, the ensemble model projected SOC gains in the Midwest and Southeastern regions and losses in the Great Plains and Western United States, underscoring the importance of region-specific management practices. Overall, the ensemble approach improved predictive accuracy and reduced uncertainty relative to individual models, providing a scalable pathway for robust, data-driven assessments of soil carbon and crop productivity across U.S. agroecosystems.
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-1094', Anonymous Referee #1, 02 Apr 2026
- AC1: 'Comment on egusphere-2026-1094', Sagar Gautam, 05 May 2026
- RC2: 'Comment on egusphere-2026-1094', Anonymous Referee #2, 10 May 2026
This manuscript egusphere-2026-1094 presents a multi-model ensemble framework combining three process-based agroecosystem models (DAYCENT, DNDC, ECOSYS) to simulate corn and soybean yields as well as soil organic carbon (SOC) stock changes across the continental United States (CONUS) from 2014–2023. The study is spatially comprehensive (4 km² resolution) and leverages harmonized environmental, soil, and management datasets. The ensemble median generally outperforms individual models when compared against NASS yield data and RaCA SOC measurements, showing reduced RMSE and improved central tendency. The authors conclude that ensemble modeling reduces structural uncertainty and provides a more robust basis for carbon accounting and sustainable agricultural policy.
While the topic is timely and relevant to SOIL, the manuscript contains several conceptual, methodological, and presentational issues that need substantial revision before publication. Please find my comments below.
Major comments:
1. The manuscript oscillates between claiming the ensemble “represents” uncertainty (e.g., lines 38–39, Page 3) and “reduces” uncertainty (e.g., lines 17–18, Page 2). These are different scientific goals. Please clarify: is the ensemble intended to characterize the range of plausible outcomes from structural variability, or to improve predictive accuracy via error cancellation? The framing affects the interpretation of RMSE improvements and the value of including biased models like ECOSYS.
2. ECOSYS consistently underperforms for both corn and SOC. The manuscript acknowledges this but does not convincingly justify why ECOSYS should be retained in the ensemble beyond “complementary strengths.” If ECOSYS is mechanistically more detailed but poorly calibrated for CONUS, its inclusion may degrade rather than enhance ensemble performance. Please provide a clearer rationale, or consider a sensitivity analysis excluding ECOSYS.
3. Section 2.3 uses only RMSE for model evaluation and parameter fine-tuning. RMSE does not penalize model complexity, nor does it account for systematic bias or pattern similarity. Please add additional metrics (e.g., Nash-Sutcliffe efficiency, percent bias, or Akaike Information Criterion as you suggested) to better characterize model performance. Also clarify whether the same data were used for calibration and validation to avoid overfitting.
4. In the method section (Section 2.3), it appears that the same datasets (NASS crop yield data and RaCA SOC data) have been used for both model parameterization (i.e., fine-tuning to minimize RMSE) and subsequent model validation. This practice risks overfitting and can lead to overly optimistic performance metrics, including the reported RMSE improvements for the ensemble. Please clarify whether any form of data separation (e.g., temporal or spatial holdout, cross-validation) was applied. If not, the reported agreement between simulated and observed values may reflect calibration rather than true predictive skill, and the ensemble’s apparent advantage could be overstated.
5. The manuscript states that “model parameterization was conducted to minimize RMSE between observed and predicted values” but does not describe the spatial/temporal splitting of NASS and RaCA data. If the same years and locations used for calibration are also used for evaluation, the reported RMSE values likely underestimate true prediction uncertainty. Please discuss how this might affect the apparent ensemble improvement.
6. The finding that the Midwest and Southeast show SOC gains while the Great Plains and West show losses under corn-soybean rotation is important but under-discussed. Why would the same rotation cause SOC losses in drier/western regions? Is this due to lower baseline SOC, different decomposition rates, or management differences? Similarly, DNDC projected SOC losses exceeding 0.01 kg C m⁻² yr⁻¹ in high-yielding zones. Please expand the mechanistic interpretation.
7. The caption and title of Figure 6 currently read “ECOSYS model projected SOC change…” but the figure shows (and text describes) multiple models and the ensemble. This must be corrected to “Agroecosystem model projected…” or similar.
8. You state that the ensemble “captures uncertainty”, but with only three models, it is unlikely that their spread represents the full range of uncertainty across all existing agroecosystem models. Please add a discussion of how the three-model spread compares to published multi-model ensemble studies (e.g., AgMIP) and whether the results are robust to inclusion of other models (e.g., APSIM, EPIC).
Minor comments:
1. Line 13–14 (Page 2): Add units to RMSE values
2. Lines 17-18 (Page 4): The claim of “actionable information for policymakers” is unsupported by the current results (baseline only, no management comparisons). Please tone down or reframe.
3. Line 24 (Page 4): Change “projections” to “estimates” or “simulations” since the study does not forecast future conditions.
4. Line 27 (Page 5): Typo: “where grown” → “were grown.”
5. Section 2.2 (Page 5–6): The model descriptions are dense and lack comparative synthesis. Please add a summary table or paragraph comparing key differences: process formulation (mechanistic vs. semi-empirical), required inputs, treatment of belowground processes, and yield determination logic. Also explicitly state why these three models were selected over others.
6. Figure 2: Use a color-blind friendly palette. Also explain why the ensemble shows many outliers despite lower RMSE.
7. Line 17 (Page 7): Delete the extra “a” before “an RMSE of 2.7.”
8. Line 43–44 (Page 8): You suggest ECOSYS bias may come from coupled root-canopy processes. Could you quantify/support this (e.g., from sensitivity analyses) using your model output?
9. Line numbers are renewed in every page.
Citation: https://doi.org/10.5194/egusphere-2026-1094-RC2
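Major comments 4 and 5 above ask whether calibration and evaluation data were kept apart. One possible way to do this is a spatial block holdout, sketched below. This is only an illustration: the site records, the 2-degree block size, the number of folds, and the function names `spatial_fold` and `split_sites` are assumptions made here, not the procedure used in the manuscript.

```python
import math


def spatial_fold(lat, lon, block_deg=2.0, n_folds=5):
    """Map a location to a fold via the coarse lat/lon block that contains it.

    Sites in the same block always share a fold, so neighbouring
    (spatially correlated) sites are never split between calibration
    and evaluation.
    """
    row = math.floor(lat / block_deg)
    col = math.floor(lon / block_deg)
    # Deterministic mixing of the block indices; Python's % returns a
    # non-negative result for a positive modulus, even with negative inputs.
    return (row * 31 + col) % n_folds


def split_sites(sites, held_out_fold, block_deg=2.0, n_folds=5):
    """Return (calibration, evaluation) lists for one held-out fold."""
    calibration, evaluation = [], []
    for site in sites:
        fold = spatial_fold(site["lat"], site["lon"], block_deg, n_folds)
        (evaluation if fold == held_out_fold else calibration).append(site)
    return calibration, evaluation


# Hypothetical site records (illustrative only, not RaCA or NASS data).
sites = [
    {"id": "a", "lat": 41.2, "lon": -93.5},
    {"id": "b", "lat": 41.9, "lon": -93.1},
    {"id": "c", "lat": 35.0, "lon": -101.3},
    {"id": "d", "lat": 33.4, "lon": -84.2},
    {"id": "e", "lat": 46.8, "lon": -100.7},
]

for k in range(5):
    calibration, evaluation = split_sites(sites, held_out_fold=k)
    print(k, [s["id"] for s in calibration], [s["id"] for s in evaluation])
```

Parameters would then be tuned on the calibration list only, with RMSE and the other metrics reported on the evaluation list, repeated for each fold. A temporal holdout (for example, calibrating on some years and evaluating on the rest) would follow the same pattern.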
The study by Gautam et al. presents an ensemble framework based on three established, process-based agroecosystem models. This is used to simulate yields of corn and soybean and SOC across the USA over a relatively long period. The study makes use of large datasets for model calibration and clearly shows the improvement provided by the ensemble compared to the use of each single model, while discussing the results not only in terms of model reliability but also introducing environmental issues and policy implications. Also, the manuscript is rather concise and provides, for the most part, precise information. I think that this work is relevant for SOIL and its readers, and it fits the special issue "Advances in dynamic soil modelling across scales" well.
However, I have some comments, especially about how the results are shown/discussed and how certain datasets are described. I am sure that the authors can address my comments and I think that this will not modify the main findings in a substantial way. But I believe that this will provide a stronger manuscript that is clearer for the readers of SOIL and will have more impact. Please find below my main concerns, followed by minor points, and by more detailed comments.
Main points:
Minor points:
Detailed comments:
The use of page-wise line numbers complicated the referencing to parts of the text. I will use PxxLxx to refer to page and line. I suggest using a continuous line numbering in the revised manuscript.
P03L07: what management of agroecosystem? Sustainable management? Or something more specific was intended here to then transition to SOC management and sequestration?
P03L11: I think that a comma after “crop rotation” and after “Merr.)” would help readability.
P03L13: would it be better to write that rotations “can improve” instead of “improve”? I feel that pointing at a possibility would be more appropriate here.
P03L20-34: this paragraph offers a well-made description of the three models used in the manuscript. I do not think it is necessary to mention other models here. However, in the discussion part, a brief mention of the possible benefits and drawbacks of including additional models in the ensemble (or even substituting a current model), with possible examples, would benefit the reader and the discussion.
P04L11: Is the 4km resolution sufficient for carbon accounting? Also, in the conclusions, carbon credits quantification is mentioned. I do not challenge the value of the presented methodology and results. But I wonder if the scale is sufficient for accurate carbon crediting. Especially given the fact that a lot of attention in this topic is currently shifting towards small scales like field-scales.
P04L14-16: I think this study has the potential to “emphasize the spatial variability” but the main and minor review points that I have listed above, currently hinder this possibility.
P04L31-37: In describing these datasets, their scales and resolutions are not provided but they would be useful to the reader.
Section 2.2: the vertical discretization of the soil layers is not discussed but should be mentioned when relevant in one or more of the three models.
P05L12: missing a space after the comma.
P05L23: which kind of stress conditions? Mentioning the most important would be useful.
P06L37: This sentence is similar to other sentences across the text that mention the higher performance of the model ensemble compared to single models. Although I wonder how likely it is that this model ensemble would perform worse than the single models it is built on (maybe a short addition on this in the discussion could be beneficial), I agree that this improved performance should be mentioned in the appropriate parts of the text, as already done by the authors. The following discussion section does a good job in illustrating the strengths and weaknesses of each model and how they converge in the higher performance of the ensemble. What is missing is a discussion on the value of this higher performance, both generally and spatially: on the one hand because P09L33-34 is a bit generic, on the other because the spatial distribution of the improvements of the ensemble is not explicit in the figures.
P08L29-31: I agree that the ensemble suggests this, but I think that the capacity of conventional corn-soybean systems to enhance carbon sequestration should either be supported by references or be presented as a possibility.
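On the question of whether the ensemble can perform worse than its members, a short derivation may help. It assumes, for illustration only, that the ensemble is the unweighted mean of M member predictions (the manuscript also refers to an ensemble median, for which this bound does not apply directly). The symbols below are introduced here and are not taken from the manuscript: y_i is observation i of n, and ŷ_{m,i} is the prediction of model m.

```latex
\[
\hat{y}_i = \frac{1}{M}\sum_{m=1}^{M}\hat{y}_{m,i},
\qquad
e_{m,i} = \hat{y}_{m,i} - y_i,
\qquad
\mathrm{RMSE}_m = \sqrt{\frac{1}{n}\sum_{i=1}^{n} e_{m,i}^{2}}
\]
\[
\mathrm{RMSE}_{\mathrm{ens}}
= \sqrt{\frac{1}{n}\sum_{i=1}^{n}\Bigl(\frac{1}{M}\sum_{m=1}^{M} e_{m,i}\Bigr)^{2}}
\;\le\;
\frac{1}{M}\sum_{m=1}^{M}\mathrm{RMSE}_m
\]
```

The inequality follows from the triangle inequality for the Euclidean norm. Under this assumption the mean ensemble can never be worse than the average member, but nothing guarantees that it beats the best single member, which is why a comparison against each individual model remains informative.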
P08L43-35: I would suggest improving the readability of this sentence.
P08L46 to P09L1-3: also of this sentence.
P09L15: this is another case in which an explicit description of data distribution would help because, up to now, spatial density of calibration was not discussed.
P09L15-18: also here, please check sentence readability.
P09L20-23: I think Grace & Robertson indicate such potential when certain regenerative farming practices are adopted. Also, Wu mentions scenarios with shifts in crop rotation. Does this compare well to the rotations used in the ensemble of this study? Also, the area of the corn belt, as shown in Figure 6, seems to show both increases (up to 2 t/ha per year if I am correct) and decreases (down to -2 t/ha per year) in SOC. Providing an average value of this increase for the area that is discussed would be beneficial for this discussion. At the same time, it would be interesting to discuss how this increase interacts with the discrepancy between the ensemble results and the RaCA dataset shown in Figure 3 (where the ensemble overestimates the RaCA median by some 15 %). While I agree that the ensemble offers the best comparison in this figure, I think it is necessary to provide a clear discussion about how significant this increase per year is, in the entire corn belt, with respect to the accuracy of the ensemble itself.
P09L36-46: This paragraph nicely discusses some limitations of the study. However, I think the spatial resolution should also be addressed. 4 km is likely a good trade-off, given the necessity of a supercomputer to run the simulations. But the implications of this resolution for different applications are not discussed and, in cases like carbon credit quantification (mentioned in the conclusions), the scale becomes important, as such activity likely needs higher spatial resolutions.
P09L40: please double check the grammar.
P10L15: I think that, for carbon crediting and regenerative agriculture initiatives, it has the potential to do these things. However, this should be presented more clearly as an outlook of this ensemble that will probably require further model testing or development. For example, because regenerative practices are not included in the current simulations and, probably, the resolution of 4 km is not sufficient for accurate carbon crediting.
P12L28: In reference “FAO. (2025). Agricultural land (% of land area). Retrieved from: https://ourworldindata.org/grapher/agricultural-land-percent-land-area”, the hyperlink does not work.
Figure 2 and Figure 3: please use a consistent number of decimals. Also, in the second panel of Figure 2, is the RMSE of ECOSYS 1.1? It seems to deviate most from NASS and yet has a smaller RMSE than DAYCENT. Having something more than RMSE, like model efficiency and RMSE as % of the maximum measured value (or something similar), would help in better reading these results.
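The complementary metrics mentioned here could be computed along the following lines. This is a minimal sketch with made-up numbers: the function names, the sign convention for percent bias (positive means overestimation, which differs between papers), and the example values are assumptions made here, not results from the manuscript.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def rmse_percent_of_max(obs, sim):
    """RMSE expressed as a percentage of the largest observed value."""
    return 100.0 * rmse(obs, sim) / max(obs)


def nash_sutcliffe(obs, sim):
    """Model efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def percent_bias(obs, sim):
    """Percent bias; positive means overestimation (assumed convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


# Hypothetical yields in Mg ha^-1 (illustrative only, not NASS data).
observed = [8.0, 9.0, 10.0, 11.0]
simulated = [7.0, 9.0, 9.0, 11.0]

print(f"RMSE        = {rmse(observed, simulated):.2f}")
print(f"RMSE (%max) = {rmse_percent_of_max(observed, simulated):.1f}")
print(f"NSE         = {nash_sutcliffe(observed, simulated):.2f}")
print(f"PBIAS (%)   = {percent_bias(observed, simulated):.1f}")
```

Reporting efficiency and bias next to RMSE separates a model that is consistently offset from one that scatters around the observations, which RMSE alone cannot distinguish.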
Figure 6: Most values are concentrated between -0.2 and 0.2 kg C m⁻² yr⁻¹. Would it be more readable if the colour ramp were stretched over this range, maybe maintaining different colours for negative and positive values? Or would it get too complicated to read?