the Creative Commons Attribution 4.0 License.
A GLUE-based assessment of WaTEM/SEDEM for simulating soil erosion, transport, and deposition in soil conservation optimised agricultural watersheds
Abstract. Soil erosion models are essential tools for soil conservation planning. Although these models are generally well tested against plot and field data for in-field soil management, challenges arise when scaling up to the landscape level, where sediment trapping along landscape features becomes increasingly critical. At this scale, a separate analysis of model performance in representing erosion, sediment transport, and deposition processes is both challenging and often lacking. In this study, we assessed the capacity of the spatially distributed erosion and sediment transport model WaTEM/SEDEM to simulate sediment yields in six micro-scale watersheds ranging from 0.8 to 7.8 ha, monitored over eight years from 1994 to 2001. The watersheds comprised two groups: four field-dominated watersheds characterised by arable land with minimal landscape structures, and two structure-dominated watersheds featuring a combination of arable land and linear landscape structures (mainly grassed waterways along thalwegs) that minimise sediment connectivity. This setup enabled a separate analysis of model performance for both watershed groups. A Generalised Likelihood Uncertainty Estimation (GLUE) framework was employed to account for measurement and model uncertainties across multiple spatiotemporal scales. Our results show that while WaTEM/SEDEM generally captured the magnitude of the very low measured sediment yields in the monitored watersheds, the model did not meet our pre-defined limits of acceptability when operating on annual timesteps. However, WaTEM/SEDEM's performance improved substantially when model realisations were aggregated across the eight-year monitoring period and over the two watershed groups, with mean absolute errors of 0.11 t ha⁻¹ yr⁻¹ for field-dominated and 0.18 t ha⁻¹ yr⁻¹ for structure-dominated watersheds.
Our findings demonstrate that the model can represent the influence of soil conservation measures on reducing soil erosion and sediment delivery but performs better for long-term conservation planning at larger scales than for precise annual predictions in individual micro-scale watersheds with specific conservation practices.
Competing interests: Pedro V. G. Batista and Peter Fiener serve as Topic Editor and Executive Editor of SOIL, respectively.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 03 Dec 2025)
- RC1: 'Comment on egusphere-2025-3391', Anonymous Referee #1, 02 Oct 2025
- RC2: 'Comment on egusphere-2025-3391', Joris Eekhout, 12 Nov 2025
Review of “A GLUE-based assessment of WaTEM/SEDEM for simulating soil erosion, transport, and deposition in soil conservation optimised agricultural watersheds” submitted to SOIL for consideration for publication.
The manuscript describes a model application of the WaTEM/SEDEM model in six agricultural fields in Germany. The authors applied a GLUE approach to assess the parameter space of a number of model parameters that are relevant for in-field and structural conservation measures. The authors show that the model gives relatively good results at the field-scale, especially when considering the long-term trends, but largely overestimates sediment yield when structural conservation measures are considered.
In general, the manuscript is very well written and is accompanied by clear figures and tables. However, some aspects of the manuscript need revision. These mostly relate to the objectives and the description of the applied methods.
The objectives should be better defined. The first objective focusses on testing the model’s capabilities. This does not seem very ambitious: no matter what the outcome is, this objective will always be achieved. Please refine this objective to make it more ambitious. The second objective seems related to the GLUE approach; I suggest explicitly including the GLUE approach in this objective. The last objective is similarly unambitious (either testing or analysing something will always be achieved). The concept of aggregating the data using different spatiotemporal resolutions has not been mentioned in the Introduction. I was expecting that this would involve different spatial and temporal resolutions (different cell sizes and time steps, for instance). However, this is not the case at all. The authors instead use the long-term median model outcome rather than the annual outcomes (the way USLE-type models should actually be used), and the spatial aggregation relates to the two different conservation types considered. I’m not sure whether this needs to be included in an objective.
Below I have provided specific comments to the text, figures and tables.
Specific comments
Lines 43-44: I was expecting the two strategies already in this sentence. I suggest adding after the colon “(i) in-field control measures and (ii) off-site sediment transport control structures”. It would be even better to make a clearer distinction, such as on-site and off-site measures.
Lines 43-46: In-field measures also frequently have the aim to increase infiltration and reduce runoff generation or to increase surface roughness to reduce flow velocities. This definition of in-field measures can be made a bit broader.
Line 47: Replace the first “and” with a comma.
Lines 50-51: Field demonstration would be highly feasible at the scale the authors are working and likely more convincing for stakeholders. Models are indeed valuable tools, but likely more for scenario evaluation, for instance, for different configurations of on-site and off-site measures.
Lines 72-73: What is the difference between meso-scale watersheds and large-scale catchments? Please clarify in the text.
Line 80: Replace “at the” with “in a”.
Line 81: Replace “in large-catchment sediment yield observations” with “at larger scales”.
Line 87: Replace the first “and” with a comma.
Lines 94-96: These two sentences are a bit difficult to understand. For instance, what do the authors mean by “These behavioural models”? Behavioural models were not mentioned in the previous sentence.
Lines 101-102: Does the second objective refer to a sensitivity analysis, or is it related to the application of the GLUE method? Please clarify in the text.
Lines 102-103: The third objective seems unrelated to the information provided in the Introduction or is this also related to the GLUE method (seems unlikely)? If not, please provide a short introduction on how differences in spatiotemporal resolutions impact soil erosion model outcomes.
Line 105: Replace “from” with “for”.
Lines 121-125: The main difference between the two different systems seems to be that W05 and W06 include grass strips, while the other study areas don’t. The other study areas also include retention ponds, which I consider to be a structural conservation measure. Please clarify in the text.
Line 127: Looking at Figure 1, it does not seem that the fields are arranged parallel to the contour lines (assuming that the curved lines indicate the contour lines). Please clarify in the text.
Line 135: What is meant by F15-F18? Are these different configurations of the crop rotation? Please clarify.
Figure 1: I had to study the figure quite a bit to work out which study area belonged to which field. I suggest including another, smaller panel where the fields are better indicated, with different colours, for instance. It would also be useful to get some more information about the contour lines, to give the reader an idea of the slopes in the study area. Moreover, the differences in crop rotation are not that clear from this figure. To which fields do the different F-codes belong?
Line 151: By “aliquot”, do the authors mean “sample”?
Lines 148-156: How were runoff and sediment totals estimated using this system? Was this continuously monitored or estimated after each event? Were the sediment samples further analysed for grain size distribution? Please clarify in the text.
Lines 161-164: The authors included a code availability statement saying that the code is available on reasonable request. I strongly suggest making the code publicly available through an open-source repository such as GitHub or Zenodo.
Lines 197-223: I suggest restructuring this subsection. The first paragraph in particular introduces several concepts that are described in more detail later on, which might be confusing for many readers. Please add 1-2 sentences explaining what will follow in this subsection.
Lines 197-202: This seems to be a bit confusing. What do the authors mean by “combining seasonal rainfall erosivity with temporal changes in soil cover”? Is the rainfall erosivity used to calculate the crop factor? How is the SLR calculated and how is the SLR related to the crop factor? Please clarify in the text.
Lines 203-205: Change “bi-weekly measurements” to “bi-weekly crop and residue cover measurements” and remove the sentence in line 205.
Lines 205-207: How were the bi-weekly measurements translated to daily cover values? What is meant by standardised crop development? Please clarify.
Lines 238-240: Were these values obtained from the literature or included in a further analysis, e.g. GLUE? Please clarify.
Lines 260-261: How was this standard deviation applied? Please clarify.
Line 270: So the runoff samples are collected in a barrel. This has not been described under Data.
Lines 317-319: Do the authors mean here that the study areas were subdivided into field- and structure-dominated systems? That was already defined much earlier in this section. Why is there a need to repeat it here? Please clarify.
Lines 319-320: How did the authors aggregate the median values between field- and structure-dominated systems, by taking the average? Please clarify.
Lines 343-344: By “model performance”, do the authors mean “simulations”?
Lines 344-345: It seems that the median of the median is higher than 0.3 t/ha/yr (based on the boxplot in panel j of Figure 3), but here the authors suggest 0.24 t/ha/yr. Please explain what this value is based on.
Lines 345-347: Similar to the previous comment. The simulated median of the median is higher than 0.5 t/ha/yr (panel d of Figure 4). Please explain what the 0.15 t/ha/yr is based on.
Lines 356-358: Does this mean that the model is performing better in W04 and worse in W05?
Table 2: If the Simulated SY is indeed the median of the median, then these values do not align with what is shown in Figures 3 and 4. See previous comments about this.
Figure 7: It seems that negative values are erosion and positive values deposition, please indicate this in the figure caption.
Lines 476-479: In that sense, it would be logical to also apply the model using long-term average rainfall erosivity values for the different study areas, instead of taking the median of the annual results.
Lines 520-521: The TC is controlled by any value of kTC/A, not only high values. Please revise the sentence accordingly.
Lines 520-526: What exactly is the point the authors want to make here? That TC remains high enough to transport all sediment without causing deposition? The question should be whether this coincides with the observations, or whether the inclusion of retention ponds has a large influence on the modelled processes. (OK, this is further explained in the subsequent paragraphs. I suggest adding 1-2 sentences explaining the likely reasons for this behaviour in the model, which are then elaborated in the following paragraphs.)
Citation: https://doi.org/10.5194/egusphere-2025-3391-RC2
The authors investigate the use of a soil erosion and sediment delivery model (WaTEM/SEDEM: WS) on a well-instrumented catchment in Southern Germany. WS is replicated in Python and implemented within a GLUE approach, assuming prior uncertainty on input parameters and predictive target (sediment load) measurements. The study adopts a limits-of-acceptability approach, comparing model realisations against measurements to test the model's ability to perform adequate simulations in the presence of uncertainty. Model realisations are tested using both annual timesteps (i.e. annually aggregated sediment yield) and the average annual sediment yield (i.e. the yearly average sediment load) in several microcatchments, allowing comparisons at different levels of spatial and temporal aggregation. Although the average annual sediment yield is the intended predictive target for WS, testing WS at different temporal resolutions is nevertheless both scientifically and practically interesting.
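As an illustration of the limits-of-acceptability idea summarised above, the following minimal Python sketch samples a single parameter from a uniform prior, runs a toy stand-in model, and keeps only the realisations whose simulated yields fall within observation-based limits in every year. The toy model, the parameter range, the observations, and the limits of plus or minus 50 % are all hypothetical and are not taken from the manuscript.

```python
import numpy as np

def toy_model(k, erosivity):
    """Hypothetical stand-in for an erosion model: yield scales with k and erosivity."""
    return k * erosivity

def behavioural_mask(simulated, lower, upper):
    """True for realisations that lie inside the limits for every observation."""
    return np.all((simulated >= lower) & (simulated <= upper), axis=1)

rng = np.random.default_rng(42)

# Hypothetical annual erosivity index and observed sediment yields (t/ha/yr).
erosivity = np.array([0.8, 1.0, 1.5, 0.6])
observed = np.array([0.16, 0.20, 0.30, 0.12])

# Limits of acceptability: here simply +/-50 % of each observation (an assumption).
lower, upper = 0.5 * observed, 1.5 * observed

# Uniform prior on the single parameter k (hypothetical range).
k_samples = rng.uniform(0.0, 1.0, size=10_000)

# One row per realisation, one column per year.
simulated = toy_model(k_samples[:, None], erosivity[None, :])

mask = behavioural_mask(simulated, lower, upper)
behavioural_k = k_samples[mask]
print(f"{mask.sum()} of {mask.size} realisations are behavioural")
```

Requiring acceptance in every year is the strict reading of the approach; a study could equally accept realisations that meet the limits for a stated fraction of observations, which is a design choice rather than part of this sketch.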
Firstly, I commend the authors on their work. The manuscript reads well, and I believe the general research question is of interest to readers from multiple backgrounds. In terms of the scientific advances offered for soil erosion modelling, I find that the conclusion reiterates what is perhaps the most fundamental concept of the USLE (that temporal aggregation is required to generate acceptable predictions, according to the central limit theorem), which isn't necessarily surprising. Reiterating this point is nevertheless useful; however, one could argue that “why” is more relevant than “if”. While the study correctly demonstrates that the model fails at the annual timestep, it offers limited insight into the reasons for this failure. A key value of testing a model outside its intended use case is to diagnose its structural weaknesses; this potential is not fully realised here.
Linked to this broader point is the main methodological critique I have of the manuscript. Despite using an implementation design which is intended to understand model drawbacks and accept only a subset of simulations, the authors do not consider uncertainty on the individual USLE parameters but instead opt for an error surface on the gross erosion predictions. Given that a subset of these parameters is propagated into the transport capacity (i.e. L, S, R and K), the setup potentially ignores important parameter interactions which may impact both the uncertainty quantification and the insights into the model’s shortcomings. Considering the simplistic design of the model, parameter lumping seems avoidable. The choice of approach also introduces a circularity into the argumentation of the study, where the lack of acceptable model realisations at the annual timestep is attributed to unconsidered uncertainty in the input factors (e.g. the (bio)physical impacts of conservation tillage on overland flow and erosion), which could otherwise have been considered in the modelling approach.
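The interaction at issue can be made concrete with a deliberately simplified sketch. It assumes gross erosion E = R·K·LS·C·P and a transport capacity of the general form TC = kTC·R·K·T, where T stands in for the topographic term; the function names and all numerical values are hypothetical and do not reproduce the authors' implementation. Perturbing R directly shifts E and TC together, whereas a lumped multiplier on E leaves TC untouched and therefore changes the balance between sediment supply and transport capacity.

```python
def gross_erosion(R, K, LS, C, P):
    """USLE-type gross erosion as the product of its factors."""
    return R * K * LS * C * P

def transport_capacity(ktc, R, K, topo):
    """Simplified transport capacity; `topo` is a placeholder topographic term."""
    return ktc * R * K * topo

# Hypothetical baseline values (units omitted for brevity).
base = dict(R=70.0, K=0.4, LS=1.2, C=0.1, P=1.0)
ktc, topo = 0.05, 1.0
error = 1.5  # a 50 % upward perturbation

# (a) Lumped error: multiply gross erosion only; TC is unaffected.
e_lumped = error * gross_erosion(**base)
tc_lumped = transport_capacity(ktc, base["R"], base["K"], topo)

# (b) Factor-level error: perturb R, which feeds both E and TC.
perturbed = {**base, "R": error * base["R"]}
e_factor = gross_erosion(**perturbed)
tc_factor = transport_capacity(ktc, perturbed["R"], perturbed["K"], topo)

ratio_base = gross_erosion(**base) / transport_capacity(ktc, base["R"], base["K"], topo)
print(f"E/TC baseline: {ratio_base:.2f}")
print(f"E/TC lumped:   {e_lumped / tc_lumped:.2f}")
print(f"E/TC factor:   {e_factor / tc_factor:.2f}")
```

In this toy setting the same 50 % error raises E/TC by half when applied as a lumped multiplier but leaves it unchanged when applied to R, which is the kind of interaction a lumped error surface cannot represent.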
Secondly, the study implements a replication of the WS code but neither benchmarks it against the standard model nor releases the source code openly. I am in favour of replications of models such as WS in popular programming languages such as Python (which, although slower on average, permit easy data integration and parallelization, as mentioned by the authors), but without at least one benchmarking use case the current implementation lacks reproducibility and falls short of good modelling practice.
I deem both points necessary to address before recommending the study for publication in SOIL. I have included additional points below on a section-by-section basis.
Kind regards,
Introduction:
No introduction is given on temporal resolution and how it influences the model assumptions, constrains the model parameters, and influences equifinality. WaTEM/SEDEM simulates the central tendency, not the temporal variability. Finer timescales can mean more variability through time compared to through space (i.e. in the long-term annual average), which has obvious implications for the (required) parameter sensitivity. The use of the model in a case where it is tested on the annual dynamics of sediment load should therefore be clarified, and the assumed impact on model equifinality compared to the long-term simulation should be mentioned.
L54-57 - can you add evidence regarding the most used model? I suggest adding a citation.
L66-68: I suggest being more specific and mentioning which conservation measures haven’t been evaluated. It should be stated how they differ from the grass and non-arable elements which are commonly represented in the model. Or is it that they are not typically evaluated with real data in studies?
L73-75: This lumps measurements at vastly different spatial scales (e.g. erosion plots to large watersheds) into the same context, despite having considerably different scale-related implications when considering sediment delivery.
L69-77: This paragraph would benefit from addressing the practical considerations in the modelling process, since the implications depend on the objective of the modeller. Many modelling efforts seek acceptable sediment yield predictions and use models with this predictive target that nonetheless produce intermediate spatially distributed estimates. Others do require accurate spatial estimations with an acceptable level of uncertainty at their representative spatial and temporal scale.
Methods:
Currently I don’t see a justification for the selection of the priors given the driving processes of erosion. Why are error distributions considered uniform for all parameter distributions at all considered time scales? Is it not the case that erosion may be driven by low-probability rainfall events or high-intensity bursts? I suggest including justifications which match the nature of the driving processes, particularly in the case of changing temporal scales. In such a well-measured watershed, is it not possible to constrain the uncertainty components? It is later discussed that short windows of coincidence between bare soil and heavy rainfall can be critical, which would manifest as high uncertainty in the C-factor and R-factor. This is arguably the advantage of generating synthetic data.
As mentioned above, lumping everything into an error parameter on gross erosion poorly represents the individual contributions of sub-parameters, their interactions, and identifiability. At present, I miss a justification for this. What about the contribution of (combinations of) sub-factors to erosion and sediment transport realisations?
Regarding the general model implementation, a German USLE formulation is used in place of the typical RUSLE formulation. Can the authors show time series data of the annual parameter inputs used? It would be helpful for the reader to know the distributions of the input values. I would also suggest a discussion on what impact this may have on the model, plus the consequences for comparing parameter values (e.g. ktc) with other studies given the parameter compensation effects.
What about stream initiation and transition to channelised flow? In WS, there are various ways to consider the stream channel initiation by digitizing channels or considering a flow accumulation threshold. I would recommend mentioning this.
L197: The word seasonal is ambiguous in this case. I would also suggest using a mathematical formulation for the C-factor, showing how the SLR is generated and combined with rainfall erosivity and at what time scale. It’s also of general interest to the reader to know what these SLR and C-factor values are for both arable and grassland, and how they change through time. The literature reference for the SLR formulation is also grey literature, so more details are justified.
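For readers unfamiliar with the calculation being requested, the conventional USLE-type crop factor is an erosivity-weighted mean of soil loss ratios over the periods of a rotation: C = sum(SLR_i · EI_i) / sum(EI_i). The sketch below shows this arithmetic in Python with hypothetical period values. It does not reproduce the authors' implementation, whose time step and SLR formulation are exactly what this comment asks them to specify.

```python
def c_factor(slr, erosivity_share):
    """Erosivity-weighted mean soil loss ratio (conventional USLE-type C-factor).

    slr             -- soil loss ratio per period (0 = full protection, 1 = bare fallow)
    erosivity_share -- erosivity (EI) falling in each period, in any consistent unit
    """
    if len(slr) != len(erosivity_share):
        raise ValueError("slr and erosivity_share must have the same length")
    total = sum(erosivity_share)
    if total <= 0:
        raise ValueError("total erosivity must be positive")
    return sum(s * e for s, e in zip(slr, erosivity_share)) / total

# Hypothetical four-period example (values are illustrative only).
slr = [0.40, 0.10, 0.05, 0.20]          # seedbed, growth, full cover, after harvest
ei_share = [10.0, 40.0, 40.0, 10.0]     # per cent of annual erosivity per period

print(round(c_factor(slr, ei_share), 3))
```

Because the weights are erosivity shares, a period with a high SLR contributes little to C if hardly any erosive rain falls in it, which is why the time scale at which SLR and erosivity are paired matters.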
L297-304: How does the calculation of likelihoods vary between temporal aggregations? For individual years is this done by comparing the time series or individual simulations?
L316-324: Can the authors justify the use of the median? This would assume an underestimation of the total sum due to the positively skewed nature of sediment yield, which would have obvious implications for watershed management. Typical applications of WaTEM/SEDEM are applied to the mean.
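The concern can be illustrated with a small, purely hypothetical set of annual sediment yields in which one year dominates the total. The numbers below are invented for illustration and are not taken from the manuscript.

```python
import statistics

# Hypothetical annual sediment yields (t/ha/yr) for eight years; one extreme year.
annual_sy = [0.02, 0.03, 0.05, 0.05, 0.08, 0.10, 0.17, 1.50]

median_sy = statistics.median(annual_sy)
mean_sy = statistics.mean(annual_sy)

# Total export implied by each summary statistic over the record length.
n_years = len(annual_sy)
total_observed = sum(annual_sy)
total_from_median = median_sy * n_years

print(f"median = {median_sy:.3f}, mean = {mean_sy:.3f}")
print(f"total: observed = {total_observed:.2f}, implied by median = {total_from_median:.2f}")
```

With these invented numbers the median (0.065) is roughly a quarter of the mean (0.25), so scaling the median by the record length recovers only about a quarter of the total export.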
Results:
The results are in general concisely presented. However, the spatial analysis section is overly brief. Do the multiple model runs which were made to address equifinality not give significantly more spatial information in addition to the median? What is the spatial variability of the behavioural predictions? Only the median is currently given but arguably one of the advantages of multiple realisations in WS is that you can get some idea of the variability in the spatial patterns from acceptable simulations. This is also useful to know the added spatial information which can be achieved for land management.
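One way to summarise the spatial information requested here is to compute per-cell quantiles across the stack of behavioural realisations. The sketch below assumes the behavioural net erosion/deposition maps are available as a NumPy array of shape (n_runs, n_rows, n_cols); the array name and the random stand-in data are hypothetical.

```python
import numpy as np

def per_cell_summary(stack, lower_q=5.0, upper_q=95.0):
    """Per-cell median and inter-quantile range across behavioural realisations.

    stack -- array of shape (n_runs, n_rows, n_cols); NaN marks missing values
    """
    median = np.nanmedian(stack, axis=0)
    lo, hi = np.nanpercentile(stack, [lower_q, upper_q], axis=0)
    return median, hi - lo

# Stand-in data: 200 hypothetical behavioural maps on a 4 x 5 grid.
rng = np.random.default_rng(0)
behavioural_maps = rng.normal(loc=-0.2, scale=0.1, size=(200, 4, 5))

median_map, spread_map = per_cell_summary(behavioural_maps)
print(median_map.shape, spread_map.shape)
```

Mapping the spread alongside the median would show where behavioural simulations agree on the erosion or deposition pattern and where the pattern is poorly constrained.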
Discussion:
A typical explanation for the lack of global ktc parameters is the existence of unconsidered processes. Is this the case, or is it more of an inadequacy to capture the system behaviour? Can the field data give insights on this?
L459-467: This is somewhat difficult to follow. Is including this uncertainty in the input parameters not the purpose of using GLUE?
“Conservation landscapes” is combining multiple physical characteristics of the agricultural watersheds together, which have differing roles in soil erosion and sediment delivery through on-site and off-site effects. Is it due to grassed areas or conservation tillage? Indeed, grassed areas are commonly applied in the model through land use elements and grass buffer strips. So one could argue that they are indeed commonly applied in the model, but the impacts of conservation tillage on erosion and overland flow generation less so. It would help to separate conservation landscapes into their specific elements.
Can the authors elaborate on the effect of using the USLE formulation for Germany versus the typical RUSLE formulation used in WaTEM/SEDEM? Indeed, I expect the model to be better calibrated for the German agri-environmental conditions on which it was developed; however, there are differences compared to the RUSLE formulation which are worth mentioning.
Conclusion:
L584-585: I didn’t see this point addressed in the manuscript. Is it not the case that the gross erosion estimates from the USLE factors overestimate the rates based on the most likely error surface values?