the Creative Commons Attribution 4.0 License.
Advancing Crop Modeling and Data Assimilation Using AquaCrop v7.2 in NASA's Land Information System Framework v7.5
Abstract. This paper introduces the open-source AquaCrop v7.2 model as a new process-based crop model within NASA's Land Information System Framework (LISF) v7.5. The LISF enables high-performance crop modeling with efficient geospatial data handling, and paves the way for scalable satellite data assimilation into AquaCrop. Through three exploratory showcases, we demonstrate the current capabilities of AquaCrop in the LISF, along with topics for future development. First, coarse-scale crop growth simulations with various crop parameterizations are performed over Europe. Satellite-based estimates of land surface phenology are used to inform spatially variable crop parameters. These parameters improve canopy cover simulations in growing degree days compared to using uniform crop parameters in calendar days. Second, ensembles of coarse-scale simulations over Europe are created by perturbing meteorological forcings and soil moisture. The resulting uncertainties in root-zone soil moisture and biomass are often greater in water-limited regions than elsewhere. The third showcase aims to improve fine-scale agricultural simulations through satellite data assimilation. Fine-scale canopy cover observations are assimilated with an ensemble Kalman filter to update the crop state over winter wheat fields in the Piedmont region of Italy. The state updating is beneficial for the intermediary biomass estimates, but leads to only small improvements in yield estimates relative to reference data. This is due to strong model (parameter) constraints and limitations in the assimilated satellite observations and reference yield data. The showcases highlight pathways to improve or advance future crop estimates, e.g. through crop parameter updating and multi-sensor and multi-variate data assimilation.
Status: open (until 01 Dec 2025)
- RC1: 'Comment on egusphere-2025-4417', Anonymous Referee #1, 06 Nov 2025
- AC1: 'Reply on RC1', Gabriëlle De Lannoy, 06 Nov 2025
We thank the reviewer for the timely and constructive comments. We list the comments in bold font below and provide our answers in normal font, with suggestions for updated text in italic (additions are underlined). Line numbers refer to the submitted manuscript.
1. Line 5. Please specify the coarse-scale, for example: coarse-scale (>10km?)
Sure, we will add in the abstract that the resolution is 0.1 degree.
2. For showcase 2, I don’t understand how exactly the perturbations on shortwave radiation, precipitation, and soil moisture are performed. In my understanding, perturbations are used to understand the model sensitivity to changes in the forcing data or other targets. Typically, large ensemble simulations are performed with an increase or decrease of a target (using precipitation as an example) to understand how crop growth responds to variations in precipitation. Some studies perform such perturbations one at a time to understand the impact of a single perturbation on the simulation. In Table 2, I see that SW and P are multiplied by a standard deviation ratio. Is that correct? If so, then SW and P are both decreased and soil moisture is increased. This type of perturbation seems odd to me because it only shows how the crops respond to decreased SW and P. Why do these perturbation experiments matter for the study?
The perturbation setup follows what is done in state-of-the-art ensemble land surface modeling (e.g. Kumar et al., 2008; Heyvaert et al., 2023). The goal is to obtain a full random error (uncertainty) estimate on the model simulations, introduced by a combination of all possible errors (perturbations) in the input. We chose to perturb the input forcing and state variables (assuming that random errors in the parameters are implicitly captured in errors on the state variables) to get an integrated dynamic estimate of the uncertainty in CC, biomass and other output variables. See also our response to comment 3 below.
We will re-order the sentence describing the purpose of the ensembles as follows:
L.275: “...to create an ensemble of crop model trajectories to determine the model sensitivity to these aspects individually. However, ensembles are also used to quantify (i) the total time-varying uncertainty of the simulation output, or forecast error, and (ii) the correlation of the forecast errors between the various simulated variables. These dynamic ensemble uncertainty estimates are particularly important for DA.... most ensemble simulations with AquaCrop have been performed to study the model sensitivity to crop parameters, and not to estimate the total model uncertainty in response to errors in the state or meteorological estimates.”
The SW and P are perturbed through multiplication with a factor 1+/- a random number taken from a distribution with standard deviation (std) 0.3 or 0.4. For example, for a std of 0.3, 68% of the distribution of the multiplication factor is thus randomly sampled between 0.7 and 1.3.
We will add in the caption of Table 2: “The std is shown relative to the mean (1 or 0) multiplicative or additive perturbation value.”
Furthermore, we will update the text with a reference to earlier studies that use similar tables for land surface modelling, and we will repeat details from the Table 2 caption in the text:
L.286: “The perturbation parameters are spatially and temporally constant, as summarized in Table 2. The setup is inspired by state-of-the-art land surface data assimilation studies (Kumar et al., 2008; Heyvaert et al., 2023). The resulting random perturbations are applied (i) hourly to the hourly MERRA-2 shortwave radiation and precipitation, with a 24 hour temporal autocorrelation to obtain meaningful daily perturbed forcings as input to AquaCrop, and (ii) daily to the soil moisture estimates without any temporal autocorrelation. The hourly perturbed MERRA-2 data are converted to daily AquaCrop forcing input of ETo and P as in Busschaert et al. (2022).”
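For readers unfamiliar with this type of perturbation, the mechanics described above (a multiplicative factor of 1 +/- a random number with a given standard deviation, with temporal autocorrelation for the hourly forcings) can be sketched as follows. This is a minimal illustrative example only, not the actual LISF implementation; the function name and the AR(1) formulation of the autocorrelation are our own assumptions.

```python
import numpy as np

def multiplicative_perturbations(n_steps, n_ens, std, tau_steps, rng):
    """Zero-mean AR(1) perturbation series, returned as a factor (1 + x).

    tau_steps: e-folding autocorrelation length in time steps
    (e.g. 24 for a 24 h autocorrelation on hourly forcings).
    """
    phi = np.exp(-1.0 / tau_steps)             # AR(1) coefficient
    innov_std = std * np.sqrt(1.0 - phi**2)    # keeps the marginal std equal to std
    x = np.zeros((n_steps, n_ens))
    x[0] = rng.normal(0.0, std, n_ens)
    for t in range(1, n_steps):
        x[t] = phi * x[t - 1] + rng.normal(0.0, innov_std, n_ens)
    # In practice the factor would be truncated at 0 for precipitation.
    return 1.0 + x

rng = np.random.default_rng(0)
# 720 hourly steps (30 days), 24 members, std 0.3 as for SW in Table 2
factors = multiplicative_perturbations(n_steps=720, n_ens=24, std=0.3,
                                       tau_steps=24, rng=rng)
# perturbed_sw = factors * hourly_sw[:, None]
```

Each member thus receives its own random trajectory of factors; combining independently perturbed SW, P, and soil moisture yields one ensemble member per combination (see comment 4).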
3. Line 290. Why is a perturbation bias correction needed here? Again, in my understanding, a perturbation experiment just varies the forcing data within an acceptable range to see how crop growth responds to these changes.
The crop model is non-linear and the zero-centered perturbations to the input can lead to biases in the output. The perturbation bias correction ensures that the ensemble open loop remains unbiased relative to the deterministic simulation as mentioned in the paper. The ensembles in this study are meant to estimate the integrated uncertainty only (to serve later in a data assimilation system), not systematic deviations or biases.
We will edit this as follows:
“to keep the soil moisture ensembles centered around the unperturbed deterministic simulation, and avoid that biases in soil moisture propagate into the biomass uncertainty estimates.”
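To illustrate, one simple form of such a re-centering is an additive shift of each ensemble so that its mean matches the unperturbed deterministic value. The sketch below is our own simplified illustration under that assumption; the actual correction in LISF may be formulated differently (e.g. applied to the perturbation fields themselves rather than to the states).

```python
import numpy as np

def recenter(ensemble, deterministic):
    """Shift the ensemble (last axis = members) so that its mean equals
    the unperturbed deterministic value, leaving the spread unchanged."""
    bias = ensemble.mean(axis=-1, keepdims=True) - deterministic[..., None]
    return ensemble - bias

# 4-member perturbed soil moisture [-] at one grid cell (illustrative values)
sm_ens = np.array([[0.28, 0.35, 0.31, 0.40]])
sm_det = np.array([0.30])  # unperturbed deterministic simulation
centered = recenter(sm_ens, sm_det)
```

The shift removes the bias that the nonlinear model introduces when zero-centered input perturbations propagate to the output, while preserving the ensemble spread that quantifies the uncertainty.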
4. Line 292. What are these 24 members? With which perturbation combinations?
The perturbation combinations are random. One member is one entire model trajectory using a combination of slightly perturbed SW, P, and soil moisture in multiple compartments. The perturbations are random, and different at each time step (even if there is some autocorrelation in the perturbations for the meteorological forcings). References will be added as proposed in response to comment 2.
5. Line 294. Why use only three years of results?
This is a showcase, and 3 years already make the point that we are able to construct good ensemble uncertainty estimates and provide scientific insight into them. Also note that ensemble simulations are computationally intensive.
6. For Showcase 3, I think the goal is to compare the DA results with the original results, so why was the ensemble simulation performed? What is the relationship between the ensemble runs in Showcases 2 and 3? I suggest deleting the OL results because they distract from the main points of Showcase 3.
The DA simulations are based on an ensemble simulation to obtain forecast uncertainty estimates. The reference without DA is thus an ensemble open loop (OL). Because of nonlinearities, the ensemble mean OL and the deterministic simulation are never perfectly identical, and we want to disentangle this effect from the DA update process. We therefore prefer to keep the OL results in the paper, in line with most DA publications, for transparency.
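To make the relationship between the OL and the DA run concrete: the OL provides the forecast ensemble, and the ensemble Kalman filter analysis nudges each member toward the (perturbed) observation in proportion to the forecast uncertainty. The scalar sketch below, for a directly observed state such as canopy cover, is our own simplified illustration under textbook EnKF assumptions, not the LISF EnKF code.

```python
import numpy as np

def enkf_update(x_ens, y_obs, obs_err_std, rng):
    """Scalar EnKF analysis for a directly observed state:
    x_a = x_f + K * (y_pert - x_f), with K = var(x_f) / (var(x_f) + R)."""
    var_f = np.var(x_ens, ddof=1)                 # forecast (OL) error variance
    K = var_f / (var_f + obs_err_std**2)          # Kalman gain
    # Perturbed observations preserve the analysis ensemble spread
    y_pert = y_obs + rng.normal(0.0, obs_err_std, x_ens.size)
    return x_ens + K * (y_pert - x_ens)

rng = np.random.default_rng(1)
cc_forecast = rng.normal(0.55, 0.05, 24)          # 24-member CC forecast [-]
cc_analysis = enkf_update(cc_forecast, y_obs=0.65, obs_err_std=0.03, rng=rng)
```

Without the OL ensemble there would be no forecast error variance, and hence no Kalman gain: this is why the ensemble run is an intrinsic part of the DA system rather than a separate experiment.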
7. Line 354. “The years 2017 through 2023” why the simulations are carried out for these years? Is it because of observation availability?
This is because of the combined availability of crop maps, yield data, and assimilated observations.
8. Table 3. Please explain why DA does not generate a better yield simulation.
This is explained in L. 467-474.
9. Figure 10. Please show a plot for obs vs Det.
We can add the figure in the supplement, if needed, but it would distract from the main message of what the DA is doing relative to its reference OL. By adding the deterministic and OL runs, we would need to introduce more discussion of why ensembles deteriorate the deterministic run, which is in fact beyond the goal of this showcase. See also our response to comment 6.
10. Line 482. Could you perform a deeper process-based analysis to confirm that there are parameter constraints?
Some of these parameter constraints are explained in Lines 464-466, but might have been lost in the discussion. We will explicitly add the word “parameter” in these lines and add references to the respective sections for clarity:
“CC_i cannot be updated above the CC_pot,sf,i parameter (Section 2.2, Appendix A)....and the yield range is limited by the CC_pot,sd,i and HI_o parameters.”
Model code and software
AquaCrop v7.2 Gabriëlle J. M. De Lannoy et al. https://github.com/KUL-RSDA/AquaCrop
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 199 | 64 | 12 | 275 | 12 | 11 |
This work presents crop modeling using the AquaCrop model within NASA’s Land Information System Framework (LISF). First, the authors describe how crop growth is simulated in the AquaCrop model. Then, they present three experiments that inform new LISF features in parameter perturbation and data assimilation. They find that updated crop parameters based on satellite estimates of land surface phenology improve canopy cover simulations. Assimilating crop state observations over winter wheat fields in the Piedmont region of Italy improved biomass and canopy cover estimates, though not crop yield. The manuscript is well written. However, some comments need to be addressed before it can be considered for publication.
Comments:
Line 5. Please specify the coarse-scale, for example: coarse-scale (>10km?)
For showcase 2, I don’t understand how exactly the perturbations on shortwave radiation, precipitation, and soil moisture are performed. In my understanding, perturbations are used to understand the model sensitivity to changes in the forcing data or other targets. Typically, large ensemble simulations are performed with an increase or decrease of a target (using precipitation as an example) to understand how crop growth responds to variations in precipitation. Some studies perform such perturbations one at a time to understand the impact of a single perturbation on the simulation. In Table 2, I see that SW and P are multiplied by a standard deviation ratio. Is that correct? If so, then SW and P are both decreased and soil moisture is increased. This type of perturbation seems odd to me because it only shows how the crops respond to decreased SW and P. Why do these perturbation experiments matter for the study?
Line 290. Why is a perturbation bias correction needed here? Again, in my understanding, a perturbation experiment just varies the forcing data within an acceptable range to see how crop growth responds to these changes.
Line 292. What are these 24 members? With which perturbation combinations?
Line 294. Why use only three years of results?
For Showcase 3, I think the goal is to compare the DA results with the original results, so why was the ensemble simulation performed? What is the relationship between the ensemble runs in Showcases 2 and 3? I suggest deleting the OL results because they distract from the main points of Showcase 3.
Line 354. “The years 2017 through 2023” why the simulations are carried out for these years? Is it because of observation availability?
Table 3. Please explain why DA does not generate a better yield simulation.
Figure 10. Please show a plot for obs vs Det.
Line 482. Could you perform a deeper process-based analysis to confirm that there are parameter constraints?