Ensemble Agroecosystem Modeling Enhances Predictions of Crop Yields and Soil Carbon Across the United States

Gautam, Sagar; Jung, Chang Gyo; Mishra, Umakant; Lal, Rattan; Lorenz, Klaus; Tang, Jinyun; Presley, DeAnn Ricks; Franzluebbers, Alan J.

doi:10.5194/egusphere-2026-1094

09 Mar 2026

Abstract. Accurately estimating crop yields and soil organic carbon (SOC) dynamics is essential for agricultural planning, carbon accounting, and sustainable land management. However, process-based agroecosystem models often produce divergent estimates due to variations in model structure, parameterization, and underlying assumptions. In this study, we developed a multi-model ensemble framework that integrates three widely used process-based models, Daily Century (DAYCENT), DeNitrification DeComposition (DNDC), and Ecosystem model (ECOSYS), to simulate crop yields and SOC stock changes (0–30 cm) across cultivated lands of the continental United States (CONUS) at 4 km² spatial resolution. Each model was parameterized using harmonized environmental, soil, and management datasets and evaluated using observed crop yields from the National Agricultural Statistics Service and measured SOC data from the Rapid Carbon Assessment. For the baseline period (2014–2023) under conventional corn-soybean rotation, the ensemble mean showed strong agreement with observations (corn: 7.7 vs. 8.5 Mg ha^-1, RMSE = 3.0; soybean: 2.5 vs. 3.0 Mg ha^-1, RMSE = 1.0), while simulated SOC stocks (5.5 vs. 4.8 kg C m^-2, RMSE = 2.5) closely matched measured data. Spatially, the ensemble model projected SOC gains in the Midwest and Southeastern regions and losses in the Great Plains and Western United States, underscoring the importance of region-specific management practices. Overall, the ensemble approach improved predictive accuracy and reduced uncertainty relative to individual models, providing a scalable pathway for robust, data-driven assessments of soil carbon and crop productivity across U.S. agroecosystems.

Received: 25 Feb 2026 – Discussion started: 09 Mar 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1:
'Comment on egusphere-2026-1094', Anonymous Referee #1, 02 Apr 2026
The study by Gautam et al. presents an ensemble framework based on three established, process-based agroecosystem models. This is used to simulate corn and soybean yields and SOC across the USA over a relatively long period. The study makes use of large datasets for model calibration and clearly shows the improvement provided by the ensemble compared to the use of each single model, while discussing the results not only in terms of model reliability but also introducing environmental issues and policy implications. Also, the manuscript is rather concise and provides, for the most part, precise information. I think that this work is relevant for SOIL and its readers, and it fits the special issue "Advances in dynamic soil modelling across scales" well.
However, I have some comments, especially about how the results are shown/discussed and how certain datasets are described. I am sure that the authors can address my comments and I think that this will not modify the main findings in a substantial way. But I believe that this will provide a stronger manuscript that is clearer for the readers of SOIL and will have more impact. Please find below my main concerns, followed by minor points and by more detailed comments.
Main points:
Lines 23-26 Page 6 state: “Each of the three models was validated” and “Model parameterization was conducted to minimize the root mean square error (RMSE)”. If I understand correctly, minimization of RMSE for yield and SOC was performed. I think this is calibration rather than validation because, if I again understand correctly, all available data were used to minimize RMSE, leaving no independent dataset for model validation. This paragraph is rather short, and I may have misinterpreted it. I thus suggest that the authors revise the text and the wording. Also, model validation through 80/20 or split sampling would be a nice addition to the manuscript, but it is true that validation is not mentioned elsewhere in the current text, so it may not be one of the objectives of this work. Alternatively, I would include additional metrics like Nash-Sutcliffe Efficiency and R2, and discuss the results and the possibly different outcomes of these metrics. Also, the RMSE and respective measurement units are consistently provided throughout the manuscript, which is well done. But I think that adding the RMSE as a percentage of the maximum value (or an alternative strategy that provides the same benefit) would help the reader better understand the magnitude of these RMSE values, at least in some key points in which results are discussed.
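The additional metrics requested here can be sketched in a few lines. The following is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, percent bias uses the "positive means overestimation" sign convention (conventions differ between fields), and the RMSE values quoted in the abstract are assumed to carry the same units as the quantities they accompany. Because the maximum observed values are not reported on this page, the relative RMSE is expressed against the observed mean instead.

```python
import math


def rmse(obs, sim):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((s - o) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def percent_bias(obs, sim):
    """Percent bias; positive values mean the simulation overestimates (assumed convention)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def relative_rmse(rmse_value, reference):
    """RMSE as a percentage of a reference value (here: the observed mean)."""
    return 100.0 * rmse_value / reference


# (simulated mean, observed mean, RMSE) as quoted in the abstract.
REPORTED = {
    "corn yield": (7.7, 8.5, 3.0),     # Mg ha^-1
    "soybean yield": (2.5, 3.0, 1.0),  # Mg ha^-1
    "SOC stock": (5.5, 4.8, 2.5),      # kg C m^-2
}

summary = {}
for name, (sim_mean, obs_mean, err) in REPORTED.items():
    mean_bias_pct = 100.0 * (sim_mean - obs_mean) / obs_mean
    summary[name] = (mean_bias_pct, relative_rmse(err, obs_mean))
    print(f"{name}: mean bias {summary[name][0]:+.1f} %, "
          f"RMSE = {summary[name][1]:.1f} % of observed mean")
```

Applied to the abstract's figures, this gives an RMSE of roughly 35 % of the observed mean for corn, 33 % for soybean, and 52 % for SOC stocks, with mean biases of about -9 %, -17 %, and +15 % respectively. Relative values of this kind are what would let a reader judge how large the reported errors are.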

I am unfortunately not very familiar with the datasets that were used to calibrate the models and calculate RMSE values. I think that the current text does not provide the reader with sufficient information about the yield and SOC datasets mentioned in Section 2.3. For example, the approximate number of points, their distribution, and their spatial patterns across the modelled domain are not clear. This also has repercussions on the readability of figures 4, 5, and 6 and on those parts of the text (e.g., P08L12-14) where the spatial distribution of “more coherent and robust estimates of the ensemble across diverse agroclimatic zones” is mentioned. It is true that the text discusses some key local spatial patterns, like in the Mid-West corn belt and the Mississippi river basin. But the reader cannot currently see the spatial match between the models and the yield/SOC datasets, and I think a spatial understanding of the model performance is important given the large scale of the study and the conclusions that are provided. I do not know what the best way is to show the spatial patterns of the measurements and the agreement between models and calibration data. It is possible that maps showing measured values at the same 4 km scale would not be sufficiently readable or informative. An alternative/addition could be to show maps of the RMSE distribution, maybe only for the ensemble results. I am however sure that the authors can find a suitable way to address this point.

In my view, the colours in figures 4, 5, and 6 are not intuitive. I agree with the use of colour classes instead of a continuous colour ramp, and with the use of one colour scale for yield (figures 4 and 5) and one for SOC (Figure 6). Also, the current scales are readable for colour-blind readers, which is nice indeed. But I find the current scales difficult to read. For example, 3-5 Mg/ha and 11-13 Mg/ha have similar colours in figures 4 and 5, although they represent very different values. The same applies to the 0-0.2 and 1-1.2 SOC intervals. I would suggest ramping the scale with more continuous colours.

Data availability: I am not sure that all data are contained within the article, as stated in the data availability statement. While one can obtain the datasets used for yield and SOC, other data and results, such as simulation results, cannot be obtained from the text or from citations, unless I overlooked something in the text. Copernicus journals require that data that correspond to journal articles are deposited in reliable (public) data repositories. I would argue that simulation results, at least, should be made available in a readable format. However, it is sometimes the case that these datasets also involve code for results (e.g., to get plots and statistics from simulations) and materials for figures. Please check the Copernicus data policy for this.

Minor points:
The spatial resolution was set to 4 km. While it is not my intent to challenge this choice, I believe the readers would benefit from a brief explanation of the reasons behind it. Whether it is computation time, the spatial distribution of calibration data, or policy-related, I think this is an interesting detail.

I am not good with names, and I had trouble pinpointing states on the US map. While recognising states on the US map is trivial for some, I think that labelling specific states that are mentioned in the text would be useful.

Detailed comments:
The use of page-wise line numbers complicated the referencing to parts of the text. I will use PxxLxx to refer to page and line. I suggest using a continuous line numbering in the revised manuscript.
P03L07: what management of agroecosystem? Sustainable management? Or something more specific was intended here to then transition to SOC management and sequestration?
P03L11: I think that a comma after “crop rotation” and after “Merr.)” would help readability.
P03L13: better to use rotations “can improve” instead of “improve”? I feel that pointing at a possibility would be more appropriate here.
P03L20-34: this paragraph offers a well-made description of the three models used in the manuscript. I do not think it is necessary to mention other models here. However, in the discussion part, a brief mention of the possible benefits and drawbacks of including additional models in the ensemble (or even substituting a current model), with possible examples, would benefit the reader and the discussion.
P04L11: Is the 4km resolution sufficient for carbon accounting? Also, in the conclusions, carbon credits quantification is mentioned. I do not challenge the value of the presented methodology and results. But I wonder if the scale is sufficient for accurate carbon crediting. Especially given the fact that a lot of attention in this topic is currently shifting towards small scales like field-scales.
P04L14-16: I think this study has the potential to “emphasize the spatial variability”, but the main and minor review points that I have listed above currently hinder this possibility.
P04L31-37: In describing these datasets, their scales and resolutions are not provided but they would be useful to the reader.
Section 2.2: the vertical discretization of the soil layers is not discussed but should be mentioned when relevant in one or more of the three models.
P05L12: missing a space after the comma.
P05L23: which kind of stress conditions? Mentioning the most important would be useful.
P06L37: This sentence is similar to other sentences across the text that mention the higher performance of the model ensemble compared to single models. Although I wonder how likely it is that this model ensemble would perform worse than the single models it is built on (maybe a short addition on this in the discussion could be beneficial), I agree that this improved performance should be mentioned in the appropriate parts of the text, as already done by the authors. The following discussion section does a good job in illustrating the strengths and weaknesses of each model and how they converge in the higher performance of the ensemble. What is missing is a discussion of the value of this higher performance, both generally and spatially: on the one hand because P09L33-34 is a bit generic, on the other because the spatial distribution of the improvements of the ensemble is not explicit in the figures.
P08L29-31: I agree that the ensemble suggests this, but I think that the capacity of conventional corn-soybean systems to enhance carbon sequestration should either be supported by references or be presented as a possibility.
P08L43-35: I would suggest improving the readability of this sentence.
P08L46 to P09L1-3: also of this sentence.
P09L15: this is another case in which an explicit description of data distribution would help because, up to now, spatial density of calibration was not discussed.
P09L15-18: also here, please check sentence readability.
P09L20-23: I think Grace & Robertson indicate such potential when certain regenerative farming practices are adopted. Also, Wu mentions scenarios with shifts in crop rotation. Does this compare well to the rotations used in the ensemble of this study? Also, the area of the corn belt, as shown in figure 6, seems to show both increases (up to 2 t/ha per year if I am correct) and decreases (down to -2 t/ha per year) in SOC. Providing an average value of this increase for the area that is discussed would be beneficial for this discussion. At the same time, it would be interesting to discuss how this increase interacts with the discrepancy between the ensemble results and the RaCA dataset shown in figure 3 (where the ensemble overestimates the RaCA median by some 15 %). While I agree that the ensemble offers the best comparison in this figure, I think it is necessary to provide a clear discussion about how significant this increase per year is, in the entire corn belt, with respect to the accuracy of the ensemble itself.
P09L36-46: This paragraph nicely discusses some limitations of the study. However, I think the spatial resolution should also be addressed. 4 km is likely a good trade-off, given the necessity of a supercomputer to run the simulations. But the implications of this resolution for different applications are not discussed and, in cases like carbon credit quantification (mentioned in the conclusions), the scale becomes important as such activity likely needs higher spatial resolutions.
P09L40: please double check the grammar.
P10L15: I think that, for carbon crediting and regenerative agriculture initiatives, it has the potential to do these things. However, this should be presented more clearly as an outlook for this ensemble that will probably require further model testing or development, for example because regenerative practices are not included in the current simulations and, probably, the resolution of 4 km is not sufficient for accurate carbon crediting.
P12L28: In reference “FAO. (2025). Agricultural land (% of land area). Retrieved from: https://ourworldindata.org/grapher/agricultural-land-percent-land-area”, the hyperlink does not work.
Figure 2 and Figure 3: please use a consistent number of decimals. Also, in the second panel of Figure 2, is the RMSE of ECOSYS 1.1? It seems to deviate most from NASS and yet has a smaller RMSE than DAYCENT. Having something more than RMSE, like model efficiency and RMSE as % of the maximum value of the measurements (or something similar), would help in better reading these results.
Figure 6: Most values are concentrated between -0.2 and 0.2 kg C m-2 yr-1. Would it be more readable if the colour ramp were stretched in this range, maybe maintaining different colours for negative and positive values? Or would it get too complicated to read?
Citation: https://doi.org/10.5194/egusphere-2026-1094-RC1
- AC1: 'Comment on egusphere-2026-1094', Sagar Gautam, 05 May 2026
  
  Dear Editor and Reviewers,
  Thank you for the helpful feedback on our manuscript. We have carefully considered all comments and provided a point-by-point response in the attached document.
  Best Regards,
  Sagar Gautam (On behalf of all co-authors)
  
  Citation: https://doi.org/10.5194/egusphere-2026-1094-AC1
RC2:
'Comment on egusphere-2026-1094', Anonymous Referee #2, 10 May 2026

This manuscript egusphere-2026-1094 presents a multi-model ensemble framework combining three process-based agroecosystem models (DAYCENT, DNDC, ECOSYS) to simulate corn and soybean yields as well as soil organic carbon (SOC) stock changes across the continental United States (CONUS) from 2014–2023. The study is spatially comprehensive (4 km² resolution) and leverages harmonized environmental, soil, and management datasets. The ensemble median generally outperforms individual models when compared against NASS yield data and RaCA SOC measurements, showing reduced RMSE and improved central tendency. The authors conclude that ensemble modeling reduces structural uncertainty and provides a more robust basis for carbon accounting and sustainable agricultural policy.
While the topic is timely and relevant to SOIL, the manuscript contains several conceptual, methodological, and presentational issues that need substantial revision before publication. Please find my comments below.
Major comments:

1. The manuscript oscillates between claiming the ensemble “represents” uncertainty (e.g., lines 38–39, Page 3) and “reduces” uncertainty (e.g., lines 17–18, Page 2). These are different scientific goals. Please clarify: is the ensemble intended to characterize the range of plausible outcomes from structural variability, or to improve predictive accuracy via error cancellation? The framing affects the interpretation of RMSE improvements and the value of including biased models like ECOSYS.

2. ECOSYS consistently underperforms for both corn and SOC. The manuscript acknowledges this but does not convincingly justify why ECOSYS should be retained in the ensemble beyond “complementary strengths.” If ECOSYS is mechanistically more detailed but poorly calibrated for CONUS, its inclusion may degrade rather than enhance ensemble performance. Please provide a clearer rationale, or consider a sensitivity analysis excluding ECOSYS.

3. Section 2.3 uses only RMSE for model evaluation and parameter fine-tuning. RMSE does not penalize model complexity, nor does it account for systematic bias or pattern similarity. Please add additional metrics (e.g., Nash-Sutcliffe efficiency, percent bias, or Akaike Information Criterion as you suggested) to better characterize model performance. Also clarify whether the same data were used for calibration and validation to avoid overfitting.

4. In the method section (Section 2.3), it appears that the same datasets (NASS crop yield data and RaCA SOC data) have been used for both model parameterization (i.e., fine-tuning to minimize RMSE) and subsequent model validation. This practice risks overfitting and can lead to overly optimistic performance metrics, including the reported RMSE improvements for the ensemble. Please clarify whether any form of data separation (e.g., temporal or spatial holdout, cross-validation) was applied. If not, the reported agreement between simulated and observed values may reflect calibration rather than true predictive skill, and the ensemble’s apparent advantage could be overstated.

5. The manuscript states that “model parameterization was conducted to minimize RMSE between observed and predicted values” but does not describe the spatial/temporal splitting of NASS and RaCA data. If the same years and locations used for calibration are also used for evaluation, the reported RMSE values likely underestimate true prediction uncertainty. Please discuss how this might affect the apparent ensemble improvement.

6. The finding that the Midwest and Southeast show SOC gains while the Great Plains and West show losses under corn-soybean rotation is important but under-discussed. Why would the same rotation cause SOC losses in drier/western regions? Is this due to lower baseline SOC, different decomposition rates, or management differences? Similarly, DNDC projected SOC losses exceeding 0.01 kg C m⁻² yr⁻¹ in high-yielding zones. Please expand the mechanistic interpretation.

7. The caption and title of Figure 6 currently read “ECOSYS model projected SOC change…” but the figure shows (and text describes) multiple models and the ensemble. This must be corrected to “Agroecosystem model projected…” or similar.

8. You state that the ensemble “captures uncertainty”, but with only three models, it is unlikely that their spread represents the full range of uncertainty across all existing agroecosystem models. Please add a discussion of how the three-model spread compares to published multi-model ensemble studies (e.g., AgMIP) and whether the results are robust to inclusion of other models (e.g., APSIM, EPIC).
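Two of the checks requested in the major comments above, an evaluation on held-out data and a sensitivity analysis that drops one ensemble member, can be sketched as follows. This is an illustration on synthetic numbers only: the member names `model_A`, `model_B`, and `model_C`, their biases and noise levels, the 80/20 random split, and the use of a simple mean-bias correction as a stand-in for calibration are all assumptions made for the example, not the procedure or results of the manuscript.

```python
import math
import random
from itertools import combinations


def rmse(obs, sim):
    """Root mean square error between paired observed and simulated values."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def ensemble_mean(members):
    """Site-by-site mean across a list of equal-length member simulations."""
    return [sum(vals) / len(vals) for vals in zip(*members)]


def take(values, indices):
    """Select the entries of `values` at the given site indices."""
    return [values[i] for i in indices]


random.seed(0)
n_sites = 200

# Synthetic "observed" yields (Mg/ha) and three synthetic members.
obs = [random.uniform(4.0, 12.0) for _ in range(n_sites)]
specs = {"model_A": (0.5, 1.5), "model_B": (-0.3, 1.5), "model_C": (-2.0, 1.5)}
sims = {
    name: [o + bias + random.gauss(0.0, sd) for o in obs]
    for name, (bias, sd) in specs.items()
}

# 80/20 split: calibration sites are never used for evaluation.
idx = list(range(n_sites))
random.shuffle(idx)
cut = n_sites * 8 // 10
calib_idx, eval_idx = idx[:cut], idx[cut:]

# Stand-in for calibration: estimate each member's mean bias on
# calibration sites only, then remove it everywhere.
corrected = {}
for name, sim in sims.items():
    bias = sum(sim[i] - obs[i] for i in calib_idx) / len(calib_idx)
    corrected[name] = [s - bias for s in sim]

# Score the full ensemble and every leave-one-member-out ensemble
# on the held-out sites.
obs_eval = take(obs, eval_idx)
results = {}
for k in (3, 2):
    for combo in combinations(sorted(corrected), k):
        ens = ensemble_mean([take(corrected[m], eval_idx) for m in combo])
        results[combo] = rmse(obs_eval, ens)

for combo, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{' + '.join(combo)}: held-out RMSE = {score:.2f}")
```

Comparing the three-member score with the two-member scores indicates whether a given member helps or degrades the ensemble on data it was not tuned against. Replacing `corrected` with the raw `sims` in the scoring loop gives the same comparison without the calibration stand-in, which is the case where a strongly biased member matters most.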
Minor comments:

1. Line 13–14 (Page 2): Add units to RMSE values

2. Lines 17-18 (Page 4): The claim of “actionable information for policymakers” is unsupported by the current results (baseline only, no management comparisons). Please tone down or reframe.

3. Line 24 (Page 4): Change “projections” to “estimates” or “simulations” since the study does not forecast future conditions.

4. Line 27 (Page 5): Typo: “where grown” → “were grown.”

5. Section 2.2 (Page 5–6): The model descriptions are dense and lack comparative synthesis. Please add a summary table or paragraph comparing key differences: process formulation (mechanistic vs. semi-empirical), required inputs, treatment of belowground processes, and yield determination logic. Also explicitly state why these three models were selected over others.

6. Figure 2: Use a color-blind friendly palette. Also explain why the ensemble shows many outliers despite lower RMSE.

7. Line 17 (Page 7): Delete the extra “a” before “an RMSE of 2.7.”

8. Line 43–44 (Page 8): You suggest ECOSYS bias may come from coupled root-canopy processes. Could you quantify/support this (e.g., from sensitivity analyses) using your model output?

9. Line numbers are renewed in every page.

Citation: https://doi.org/10.5194/egusphere-2026-1094-RC2
- AC2: 'Reply on RC2', Sagar Gautam, 20 May 2026
  
  Dear Editor and Reviewers,
  Thank you for the helpful feedback on our manuscript. We have carefully considered all comments and provided a point-by-point response in the attached document.
  Best Regards,
  Sagar Gautam (On behalf of all co-authors)
  
  Citation: https://doi.org/10.5194/egusphere-2026-1094-AC2

Viewed

Total article views: 1,438 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
876	498	64	1,438	79	92

Views and downloads (calculated since 09 Mar 2026)

Month	HTML	PDF	XML	Total
Mar 2026	636	359	51	1,046
Apr 2026	116	56	3	175
May 2026	78	52	7	137
Jun 2026	22	12	0	34
Jul 2026	24	19	3	46

Latest update: 01 Aug 2026

Short summary

We developed an ensemble approach that combines three agroecosystem models to predict crop yields and changes in soil carbon across the United States. The ensemble results were more accurate and consistent than those of the individual models. They closely matched observed yield data and soil carbon measurements while reducing the uncertainty of the individual models. This work improves our ability to track carbon change and supports carbon farming, climate action, and land management.

