Ensemble Agroecosystem Modeling Enhances Predictions of Crop Yields and Soil Carbon Across the United States
Abstract. Accurately estimating crop yields and soil organic carbon (SOC) dynamics is essential for agricultural planning, carbon accounting, and sustainable land management. However, process-based agroecosystem models often produce divergent estimates due to variations in model structure, parameterization, and underlying assumptions. In this study, we developed a multi-model ensemble framework that integrates three widely used process-based models-Daily Century (DAYCENT), DeNitrification DeComposition (DNDC), and Ecosystem model (ECOSYS)-to simulate crop yields and SOC stock changes (0–30 cm) across cultivated lands of the continental United States (CONUS) at 4 km2 spatial resolution. Each model was parameterized using harmonized environmental, soil, and management datasets and evaluated using observed crop yields from the National Agricultural Statistics Service and measured SOC data from the Rapid Carbon Assessment. For the baseline period (2014-2023) under conventional corn-soybean rotation, the ensemble mean showed strong agreement with observations (corn: 7.7 vs. 8.5 Mg ha-1, RMSE = 3.0; soybean: 2.5 vs. 3.0 Mg ha-1, RMSE = 1.0), while simulated SOC stocks (5.5 vs. 4.8 kg C m-2, RMSE = 2.5) closely matched measured data. Spatially, the ensemble model projected SOC gains in the Midwest and Southeastern regions and losses in the Great Plains and Western United States, underscoring the importance of region-specific management practices. Overall, the ensemble approach improved predictive accuracy and reduced uncertainty relative to individual models, providing a scalable pathway for robust, data driven assessments of soil carbon and crop productivity across U.S. agroecosystems.
The study by Gautam et al. presents an ensemble framework based on three established, process-based agroecosystem models. This is used to simulate the yield of corn and soybean and SOC across the USA over a relatively long period. The study makes use of large datasets for model calibration and clearly shows the improvement provided by the ensemble compared to the use of each single model, while discussing the results not only in terms of model reliability but also in terms of environmental issues and policy implications. Also, the manuscript is rather concise and provides, for the most part, precise information. I think that this work is relevant for SOIL and its readers, and it fits the special issue "Advances in dynamic soil modelling across scales" well.
However, I have some comments, especially about how the results are shown and discussed and how certain datasets are described. I am sure that the authors can address my comments, and I think that this will not modify the main findings in a substantial way. But I believe that this will provide a stronger manuscript that is clearer for the readers of SOIL and will have more impact. Please find below my main concerns, followed by minor points and by more detailed comments.
Main points:
Minor points:
Detailed comments:
The use of page-wise line numbers complicates referencing parts of the text. I will use PxxLxx to refer to page and line. I suggest using continuous line numbering in the revised manuscript.
P03L07: what management of agroecosystems? Sustainable management? Or was something more specific intended here, to then transition to SOC management and sequestration?
P03L11: I think that a comma after “crop rotation” and after “Merr.)” would help readability.
P03L13: would it be better to write that rotations “can improve” instead of “improve”? I feel that pointing at a possibility would be more appropriate here.
P03L20-34: this paragraph offers a well-made description of the three models used in the manuscript. I do not think it is necessary to mention other models here. However, in the discussion part, a brief mention of the possible benefits and drawbacks of including additional models in the ensemble (or even substituting a current model), with possible examples, would benefit the reader and the discussion.
P04L11: Is the 4 km resolution sufficient for carbon accounting? Also, in the conclusions, carbon credit quantification is mentioned. I do not challenge the value of the presented methodology and results, but I wonder if the scale is sufficient for accurate carbon crediting, especially given that a lot of attention on this topic is currently shifting towards small scales such as the field scale.
P04L14-16: I think this study has the potential to “emphasize the spatial variability”, but the main and minor review points that I have listed above currently hinder this possibility.
P04L31-37: In describing these datasets, their scales and resolutions are not provided but they would be useful to the reader.
Section 2.2: the vertical discretization of the soil layers is not discussed but should be mentioned when relevant in one or more of the three models.
P05L12: missing a space after the comma.
P05L23: which kind of stress conditions? Mentioning the most important ones would be useful.
P06L37: This sentence is similar to other sentences across the text that mention the higher performance of the model ensemble compared to single models. Although I wonder how likely it is that this model ensemble would perform worse than the single models it is built on (maybe a short addition on this in the discussion could be beneficial), I agree that this improved performance should be mentioned in the appropriate parts of the text, as already done by the authors. The following discussion section does a good job in illustrating the strengths and weaknesses of each model and how they converge in the higher performance of the ensemble. What is missing is a discussion on the value of this higher performance, both generally and spatially: on the one hand, because P09L33-34 is a bit generic; on the other, because the spatial distribution of the improvements of the ensemble is not explicit in the figures.
P08L29-31: I agree that the ensemble suggests this, but I think that the capacity of conventional corn-soybean systems to enhance carbon sequestration should either be supported by references or be presented as a possibility.
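To make the question about ensemble versus single-model performance concrete, here is a minimal sketch with purely hypothetical numbers (none of them come from the manuscript; the member names are placeholders). It illustrates a general property of an unweighted ensemble mean: its mean squared error cannot exceed the average of the member mean squared errors, but it can still be worse than the best single member.

```python
import math

def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical observations and member predictions, for illustration only.
obs = [8.0, 9.0, 10.0, 11.0]
members = {
    "model_a": [8.2, 9.1, 9.9, 11.2],     # small, unbiased errors
    "model_b": [10.0, 11.0, 12.0, 13.0],  # constant +2.0 bias
    "model_c": [9.5, 10.5, 11.5, 12.5],   # constant +1.5 bias
}

# Unweighted ensemble mean at each site.
ensemble = [sum(vals) / len(members) for vals in zip(*members.values())]

member_rmse = {name: rmse(pred, obs) for name, pred in members.items()}
ensemble_rmse = rmse(ensemble, obs)
mean_member_mse = sum(r ** 2 for r in member_rmse.values()) / len(member_rmse)

print("member RMSE:", member_rmse)
print("ensemble RMSE:", ensemble_rmse)
# Ensemble MSE is bounded by the average member MSE (convexity of squared error) ...
print("ensemble MSE <= mean member MSE:", ensemble_rmse ** 2 <= mean_member_mse)
# ... but the ensemble can still lose against the best individual member.
print("ensemble worse than best member:", ensemble_rmse > min(member_rmse.values()))
```

In this constructed case two members share a bias of the same sign, so averaging does not cancel it and the best member outperforms the ensemble. A short remark in the discussion on when this can happen would be enough.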
P08L43-35: I would suggest improving the readability of this sentence.
P08L46 to P09L1-3: the readability of this sentence could also be improved.
P09L15: this is another case in which an explicit description of data distribution would help because, up to now, spatial density of calibration was not discussed.
P09L15-18: also here, please check sentence readability.
P09L20-23: I think Grace & Robertson indicate such potential when certain regenerative farming practices are adopted. Also, Wu mentions scenarios with shifts in crop rotation. Does this compare well to the rotations used in the ensemble of this study? Also, the area of the corn belt, as shown in Figure 6, seems to show both increases (up to 2 t/ha per year, if I am correct) and decreases (down to -2 t/ha per year) in SOC. Providing an average value of this increase for the area that is discussed would be beneficial for this discussion. At the same time, it would be interesting to discuss how this increase interacts with the discrepancy between the ensemble results and the RaCA dataset shown in Figure 3 (where the ensemble overestimates the RaCA median by some 15 %). While I agree that the ensemble offers the best comparison in this figure, I think it is necessary to provide a clear discussion about how significant this increase per year is, in the entire corn belt, with respect to the accuracy of the ensemble itself.
P09L36-46: This paragraph nicely discusses some limitations of the study. However, I think the spatial resolution should also be addressed. A resolution of 4 km is likely a good trade-off, given the necessity of a supercomputer to run the simulations. But the implications of this resolution for different applications are not discussed and, in cases like carbon credit quantification (mentioned in the conclusions), the scale becomes important, as such an activity likely needs higher spatial resolutions.
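As a rough illustration of the comparison I have in mind, the arithmetic below sets the annual SOC change against the mismatch in SOC stocks. It is a sketch under stated assumptions: the stock values and RMSE are the ones quoted in the abstract (used here as a stand-in for the Figure 3 medians), and the rate is the upper end of the range I read from Figure 6. Dividing a stock error by a rate is only a heuristic for scale, not a formal significance test.

```python
def t_per_ha_to_kg_per_m2(value_t_ha):
    """Convert t ha-1 to kg m-2 (1 t = 1000 kg, 1 ha = 10000 m2)."""
    return value_t_ha * 1000.0 / 10000.0

# Values quoted in the abstract (kg C m-2).
simulated_soc = 5.5
measured_soc = 4.8
rmse_soc = 2.5

# Assumption: upper end of the range read from Figure 6, 2 t C ha-1 yr-1.
rate = t_per_ha_to_kg_per_m2(2.0)  # kg C m-2 yr-1

bias = simulated_soc - measured_soc
relative_bias_pct = 100.0 * bias / measured_soc

# Years of change at this rate needed to equal the stock bias and the RMSE.
years_to_equal_bias = bias / rate
years_to_equal_rmse = rmse_soc / rate

print(f"rate: {rate:.2f} kg C m-2 yr-1")
print(f"bias: {bias:.2f} kg C m-2 ({relative_bias_pct:.1f} % of measured)")
print(f"years to equal bias: {years_to_equal_bias:.1f}")
print(f"years to equal RMSE: {years_to_equal_rmse:.1f}")
```

Under these assumptions, even the largest mapped rate needs several years to match the mean stock bias and more than the 10-year baseline period to match the RMSE, which is why an explicit discussion of the two magnitudes would help.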
P09L40: please double check the grammar.
P10L15: I think that, for carbon crediting and regenerative agriculture initiatives, it has the potential to do these things. However, this should be presented more clearly as an outlook for this ensemble that will probably require further model testing or development, for example because regenerative practices are not included in the current simulations and because the resolution of 4 km is probably not sufficient for accurate carbon crediting.
P12L28: In reference “FAO. (2025). Agricultural land (% of land area). Retrieved from: https://ourworldindata.org/grapher/agricultural-land-percent-land-area”, the hyperlink does not work.
Figure 2 and Figure 3: please use a consistent number of decimals. Also, in the second panel of Figure 2, is the RMSE of Ecosys 1.1? It seems to deviate most from NASS and yet has a smaller RMSE than Daycent. Having something more than RMSE, like model efficiency and RMSE as % of the maximum measured value (or something similar), would help in better reading these results.
Figure 6: Most values are concentrated between -0.2 and 0.2 kg C m-2 yr-1. Would it be more readable if the colour ramp were stretched in this range, maybe maintaining different colours for negative and positive values? Or would it get too complicated to read?
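For clarity on the additional metrics suggested for Figures 2 and 3, the sketch below computes RMSE, RMSE as a percentage of the maximum observed value, and a model efficiency in the Nash-Sutcliffe form. The function names and the sample values are hypothetical and only illustrate the definitions; the authors may prefer other normalisations (for example by the observed mean or range).

```python
import math

def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def nrmse_percent_of_max(pred, obs):
    """RMSE expressed as a percentage of the maximum observed value."""
    return 100.0 * rmse(pred, obs) / max(obs)

def model_efficiency(pred, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than predicting the observed mean, negative values are worse."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical yields (Mg ha-1), for illustration only.
obs = [2.0, 2.5, 3.0, 3.5, 4.0]
pred = [2.4, 2.6, 3.1, 3.3, 3.6]

print(f"RMSE: {rmse(pred, obs):.2f} Mg ha-1")
print(f"RMSE as % of max: {nrmse_percent_of_max(pred, obs):.1f} %")
print(f"Model efficiency: {model_efficiency(pred, obs):.2f}")
```

Reporting a normalised error and an efficiency alongside RMSE would make it easier to compare the corn and soybean panels, whose yield magnitudes differ.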