A GLUE-based assessment of WaTEM/SEDEM for simulating soil erosion, transport, and deposition in soil conservation optimised agricultural watersheds
Abstract. Soil erosion models are essential tools for soil conservation planning. Although these models are generally well-tested against plot and field data for in-field soil management, challenges arise when scaling up to the landscape level, where sediment trapping along landscape features becomes increasingly critical. At this scale, a separate analysis of model performance in representing erosion, sediment transport, and deposition processes is both challenging and often lacking. In this study, we assessed the capacity of the spatially distributed erosion and sediment transport model WaTEM/SEDEM to simulate sediment yields in six micro-scale watersheds ranging from 0.8 to 7.8 ha, monitored over eight years from 1994 to 2001. The watersheds comprised two groups: four field-dominated watersheds characterised by arable land with minimal landscape structures, and two structure-dominated watersheds featuring a combination of arable land and linear landscape structures (mainly grassed waterways along thalwegs) that minimise sediment connectivity. This setup enabled a separate analysis of model performance for both watershed groups. A Generalised Likelihood Uncertainty Estimation (GLUE) framework was employed to account for measurement and model uncertainties across multiple spatiotemporal scales. Our results show that while WaTEM/SEDEM generally captured the magnitude of the very low measured sediment yields in the monitored watersheds, the model did not meet our pre-defined limits of acceptability when operating on annual timesteps. However, WaTEM/SEDEM's performance improved substantially when model realisations were aggregated across the eight-year monitoring period and over the two watershed groups, with mean absolute errors of 0.11 t ha⁻¹ yr⁻¹ for field-dominated and 0.18 t ha⁻¹ yr⁻¹ for structure-dominated watersheds.
Our findings demonstrate that the model can represent the influence of soil conservation measures on reducing soil erosion and sediment delivery but performs better for long-term conservation planning at larger scales than for precise annual predictions in individual micro-scale watersheds with specific conservation practices.
Competing interests: Pedro V. G. Batista and Peter Fiener serve as Topic Editor and Executive Editor of SOIL, respectively.
The authors investigate the use of a soil erosion and sediment delivery model (WaTEM/SEDEM: WS) on a well-instrumented catchment in Southern Germany. WS is replicated in Python and implemented within a GLUE approach, assuming prior uncertainty on input parameters and on measurements of the predictive target (sediment load). The study adopts a limits-of-acceptability approach, comparing model realisations against measurements to test the model's ability to perform adequate simulations in the presence of uncertainty. Model realisations are tested using both annual timesteps (i.e. annually aggregated sediment yield) and the average annual sediment yield (i.e. the yearly average sediment load) in several microcatchments, allowing comparisons at different levels of spatial and temporal aggregation. Although the average annual sediment yield is the intended predictive target for WS, testing WS at different temporal resolutions is nevertheless both scientifically and practically interesting.
Firstly, I commend the authors on their work. The manuscript reads well, and I believe the general research question is of interest to readers from multiple backgrounds. In terms of the scientific advancement offered for soil erosion modelling, I find that the conclusion reiterates what is perhaps the most fundamental concept of the USLE: that temporal aggregation is required to generate acceptable predictions, in line with the central limit theorem. This isn't necessarily surprising. Reiterating this point is nevertheless useful, but one could argue that “why” is more relevant than “if”. While the study correctly demonstrates that the model fails at the annual timestep, it offers limited insight into the reasons for this failure. A key value of testing a model outside its intended use case is to diagnose its structural weaknesses; this potential is not fully realised here.
Linked to this broader point is my main methodological critique of the manuscript. Despite using an implementation design which is intended to reveal model drawbacks and accept only a subset of simulations, the authors do not consider uncertainty in the individual USLE parameters but instead opt for an error surface on the gross erosion predictions. Given that a subset of these parameters is propagated into the transport capacity (i.e. L, S, R and K), the setup potentially ignores important parameter interactions which may affect both the uncertainty quantification and the insights into the model’s shortcomings. Considering the simplistic design of the model, parameter lumping seems avoidable. The choice of approach also introduces a circularity into the argumentation of the study: the lack of acceptable model realisations at the annual timestep is attributed to unconsidered uncertainty in the input factors (e.g. the (bio)physical impacts of conservation tillage on overland flow and erosion), which could otherwise have been considered in the modelling approach.
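To make the concern about lumping concrete, the following is a minimal sketch and not the authors' implementation. All central values, sampling ranges, the coefficient `KTC`, and the simplified `transport_capacity` function are hypothetical and chosen only for illustration. It contrasts sampling each USLE factor separately with applying one error multiplier to gross erosion.

```python
import random
from statistics import pstdev

rng = random.Random(42)
N = 5000

# Hypothetical central values for the USLE factors (illustration only).
R0, K0, LS0, C0, P0 = 70.0, 0.4, 1.2, 0.05, 1.0
KTC = 10.0  # hypothetical transport capacity coefficient


def transport_capacity(r, k, ls):
    # Simplified stand-in (assumption): TC scales with R, K and a topographic term.
    return KTC * r * k * ls


def per_factor_draw():
    """Perturb each factor separately, so R, K and LS errors reach both A and TC."""
    r = R0 * rng.uniform(0.7, 1.3)
    k = K0 * rng.uniform(0.7, 1.3)
    ls = LS0 * rng.uniform(0.8, 1.2)
    c = C0 * rng.uniform(0.5, 1.5)
    gross_erosion = r * k * ls * c * P0
    return gross_erosion, transport_capacity(r, k, ls)


def lumped_draw():
    """Perturb gross erosion with one multiplier; TC only sees the central values."""
    gross_erosion = R0 * K0 * LS0 * C0 * P0 * rng.uniform(0.5, 1.5)
    return gross_erosion, transport_capacity(R0, K0, LS0)


tc_per_factor = [per_factor_draw()[1] for _ in range(N)]
tc_lumped = [lumped_draw()[1] for _ in range(N)]

print("TC spread, per-factor sampling:", round(pstdev(tc_per_factor), 2))
print("TC spread, lumped error term:", pstdev(tc_lumped))
```

In this toy setup the lumped error term leaves the transport capacity unperturbed, whereas per-factor sampling carries the R, K and LS uncertainty into both gross erosion and transport capacity. That shared dependence is the interaction the comment refers to.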
Secondly, the study implements a replication of the WS code but neither benchmarks it against the standard model nor releases the source code openly. I am in favour of replicating models such as WS in popular programming languages such as Python (which, although slower on average, permits easy data integration and parallelization, as mentioned by the authors), but without at least one benchmarking use case the current implementation lacks reproducibility and falls short of good modelling practice.
I deem both points necessary to address before recommending the study for publication in SOIL. I have included additional points below on a section-by-section basis.
Kind regards,
Introduction:
The introduction does not address temporal resolution: how it shapes the model assumptions, constrains the model parameters, and affects equifinality. WaTEM/SEDEM simulates the central tendency, not the temporal variability. At finer timescales, variability through time can exceed variability through space (as represented in the long-term annual average), which has obvious implications for the (required) parameter sensitivity. The rationale for testing the model against the annual dynamics of sediment load should therefore be clarified, along with the expected impact on model equifinality compared to the long-term simulation.
L54-57: Can you add evidence regarding the most used model? I suggest adding a citation.
L66-68: I suggest being more specific and mentioning which conservation measures haven’t been evaluated. It should be stated how they differ from the grass and non-arable elements which are commonly represented in the model. Or is it that they are not typically evaluated with real data in studies?
L73-75: This lumps measurements at vastly different spatial scales (e.g. erosion plots to large watersheds) into the same context, despite having considerably different scale-related implications when considering sediment delivery.
L69-77: This paragraph would benefit from addressing the practical side of the modelling process, since the implications depend on the objective of the modeller. Many modelling efforts seek acceptable sediment yield predictions and use models with this predictive target that produce spatially distributed estimates only as an intermediate product. Others do require accurate spatial estimations with an acceptable level of uncertainty at their representative spatial and temporal scale.
Methods:
Currently I don’t see a justification for the selection of the priors given the driving processes of erosion. Why are the error distributions assumed to be uniform for all parameters at all considered time scales? Is it not the case that erosion may be driven by low-probability rainfall events or high-intensity bursts? I suggest including justifications which match the nature of the driving processes, particularly in the case of changing temporal scales. In such a well-measured watershed, is it not possible to constrain the uncertainty components? It is later discussed that short windows of coincidence between bare soil and heavy rainfall can be critical, which would manifest as high uncertainty in the C-factor and R-factor. This is arguably the advantage of generating synthetic data.
As mentioned above, lumping everything into an error parameter on gross erosion poorly represents the individual contributions of sub-parameters, their interactions, and identifiability. At present, I miss a justification for this. What are the contributions of the individual sub-factors, and of their combinations, to the erosion and sediment transport realisations?
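As a rough illustration of why the shape of the prior matters, here is a minimal sketch under stated assumptions. The multiplier ranges, the lognormal parameters, and the threshold are hypothetical and are not taken from the manuscript. It compares how often a uniform prior and a positively skewed prior on an annual erosivity multiplier produce an unusually erosive year.

```python
import random

rng = random.Random(0)
N = 10000
THRESHOLD = 1.5  # hypothetical multiplier marking an unusually erosive year

# Uniform prior on the multiplier (bounded at 1.5 by construction).
uniform_draws = [rng.uniform(0.5, 1.5) for _ in range(N)]
# Positively skewed prior with median 1 (hypothetical lognormal parameters).
lognormal_draws = [rng.lognormvariate(0.0, 0.5) for _ in range(N)]


def exceedance(draws, threshold):
    """Fraction of draws strictly above the threshold."""
    return sum(d > threshold for d in draws) / len(draws)


frac_uniform = exceedance(uniform_draws, THRESHOLD)
frac_lognormal = exceedance(lognormal_draws, THRESHOLD)

print("P(multiplier > 1.5), uniform prior:", frac_uniform)
print("P(multiplier > 1.5), lognormal prior:", round(frac_lognormal, 3))
```

A bounded uniform prior cannot generate the rare high-erosivity years at all, while the skewed prior does so in a non-negligible share of draws. Whether a lognormal form is appropriate for this watershed is an open question; the sketch only shows that the choice is consequential at the annual timestep.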
Regarding the general model implementation, a German USLE formulation is used in place of the typical RUSLE formulation. Can the authors show time series data of the annual parameter inputs used? It would be helpful for the reader to know the distributions of the input values. I would also suggest a discussion on what impact this may have on the model, plus the consequences for comparing parameter values (e.g. ktc) with other studies given the parameter compensation effects.
What about stream initiation and transition to channelised flow? In WS, there are various ways to consider the stream channel initiation by digitizing channels or considering a flow accumulation threshold. I would recommend mentioning this.
L197: The word seasonal is ambiguous in this case. I would also suggest using a mathematical formulation for the C-factor, showing how the SLR is generated and combined with rainfall erosivity and at what time scale. It’s also of general interest to the reader to know what these SLR and C-factor values are for both arable and grassland, and how they change through time. The literature reference for the SLR formulation is also grey literature, so more details are justified.
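For reference, the erosivity-weighted form commonly given for the (R)USLE C-factor is shown below. Whether the manuscript's German USLE implementation follows exactly this form, and over which periods $i$ it is evaluated, is an assumption that the authors would need to confirm.

```latex
C = \frac{\sum_{i=1}^{n} \mathrm{SLR}_i \, EI_i}{\sum_{i=1}^{n} EI_i}
```

Here $\mathrm{SLR}_i$ is the soil loss ratio in period $i$ (e.g. a crop stage or a month), $EI_i$ is the rainfall erosivity in the same period, and $n$ is the number of periods over which the C-factor is aggregated.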
L297-304: How does the calculation of likelihoods vary between temporal aggregations? For individual years is this done by comparing the time series or individual simulations?
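The distinction asked about here can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function names, the simulated values, and the limits are hypothetical, and averaging the annual limits to obtain aggregated limits is an assumption made only for this example.

```python
from statistics import mean


def behavioural_annual(sim, lower, upper):
    """Accept only if every annual value lies within that year's limits."""
    return all(lo <= s <= hi for s, lo, hi in zip(sim, lower, upper))


def behavioural_aggregated(sim, lower, upper):
    """Accept if the multi-year mean lies within the averaged limits (assumption)."""
    return mean(lower) <= mean(sim) <= mean(upper)


# Hypothetical values in t ha^-1 yr^-1 for one realisation over four years.
sim = [0.10, 0.40, 0.05, 0.25]
lower = [0.05, 0.10, 0.02, 0.30]
upper = [0.20, 0.50, 0.10, 0.60]

annual_ok = behavioural_annual(sim, lower, upper)  # year 4 is below its lower limit
aggregated_ok = behavioural_aggregated(sim, lower, upper)

print("Behavioural at annual timestep:", annual_ok)
print("Behavioural when aggregated:", aggregated_ok)
```

In this example the same realisation is rejected when every year must fall within its limits but accepted once the series is aggregated. This is why the manuscript should state explicitly which of the two comparisons is used at each temporal scale.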
L316-324: Can the authors justify the use of the median? Given the positively skewed nature of sediment yield, the median would underestimate the total sum, which would have obvious implications for watershed management. Applications of WaTEM/SEDEM typically target the mean.
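The gap between median and mean under positive skew can be illustrated with a minimal sketch. The lognormal distribution and its parameters are hypothetical stand-ins for a skewed sediment yield distribution and are not fitted to the manuscript's data.

```python
import random
from statistics import mean, median

rng = random.Random(1)

# Hypothetical positively skewed sediment yields (lognormal, illustration only).
yields = [rng.lognormvariate(0.0, 1.0) for _ in range(10000)]

mean_yield = mean(yields)
median_yield = median(yields)

print("Mean:", round(mean_yield, 2))
print("Median:", round(median_yield, 2))
print("Median / mean:", round(median_yield / mean_yield, 2))
```

For such a distribution the median lies clearly below the mean, so multiplying the median by the number of years or realisations understates the total load.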
Results:
The results are in general concisely presented. However, the spatial analysis section is overly brief. Do the multiple model runs made to address equifinality not provide significantly more spatial information than the median alone? What is the spatial variability of the behavioural predictions? Only the median is currently given, but arguably one of the advantages of multiple realisations in WS is that the acceptable simulations give some idea of the variability in the spatial patterns. Reporting this would also show what additional spatial information is available for land management.
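One way such spatial variability could be summarised is a per-cell spread map alongside the median map. The following is a minimal sketch with hypothetical data: five behavioural realisations on a tiny 1 x 2 grid. The helper names `per_cell` and `iqr` are illustrative and not part of any WS implementation.

```python
from statistics import median, quantiles

# Hypothetical behavioural realisations: five 1 x 2 grids of net erosion.
realisations = [
    [[1.0, 2.0]],
    [[2.0, 2.0]],
    [[3.0, 2.0]],
    [[4.0, 2.0]],
    [[5.0, 2.0]],
]


def per_cell(grids, func):
    """Apply func to the values of each cell across all realisations."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [[func([g[i][j] for g in grids]) for j in range(cols)]
            for i in range(rows)]


def iqr(values):
    """Interquartile range of one cell across realisations."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


median_map = per_cell(realisations, median)
iqr_map = per_cell(realisations, iqr)

print("Median map:", median_map)
print("IQR map:", iqr_map)
```

The first cell has a wide spread across behavioural runs and the second has none, a difference that a median map alone would hide.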
Discussion:
A typical explanation for the lack of global ktc parameters is the existence of unconsidered processes. Is this the case, or is it more of an inadequacy to capture the system behaviour? Can the field data give insights on this?
L459-467: This is somewhat difficult to follow. Is including this uncertainty in the input parameters not the purpose of using GLUE?
“Conservation landscapes” combines multiple physical characteristics of the agricultural watersheds, which play differing roles in soil erosion and sediment delivery through on-site and off-site effects. Is the effect due to grassed areas or to conservation tillage? Grassed areas are commonly represented in the model through land use elements and grass buffer strips, whereas the impacts of conservation tillage on erosion and overland flow generation are less commonly represented. It would help to separate conservation landscapes into their specific elements.
Can the authors elaborate on the effect of using the USLE formulation for Germany versus the typical RUSLE formulation used in WaTEM/SEDEM? I expect the model to be better calibrated for the German agri-environmental conditions for which this formulation was developed; however, there are differences compared to the RUSLE formulation which are worth mentioning.
Conclusion:
L584-585: I didn’t see this point addressed in the manuscript. Is it not the case that the gross erosion estimates from the USLE factors overestimate the rates based on the most likely error surface values?