On the potential of a low-complexity model to decompose the temporal dynamics of soil erosion and sediment delivery

Matthews, Francis; Panagos, Panos; Fendrich, Arthur; Verstraeten, Gert

doi:https://doi.org/10.5194/egusphere-2023-2693

Preprints

11 Dec 2023

Status: this preprint has been withdrawn by the authors.

Abstract. Testing and improving the capacity of soil erosion and sediment delivery models to simulate the intra-annual dynamics of climatic drivers and disturbances (e.g. vegetation clearcutting, tillage events, wildfires) is critical to understand the drivers of the system variability. In seasonally changing agricultural catchments, explicit temporal dynamics are typically neglected within many soil erosion modelling approaches, in favour of a focus on the long-term annual average as the predictive target. Here, we approach the trade-off between the need for model simplicity and temporally-dynamic predictions by testing the ability of a low-complexity, spatially distributed model (WaTEM/SEDEM) to decompose the 15-day dynamics of soil erosion and sediment yield. A standardised parameterisation and implementation routine was applied to four well-studied catchments in North-West Europe with open-access validation data. Through the testing of several alternative model spatial and connectivity structures, including the addition of an empirical runoff coefficient, we show that a temporally-static calibration of transport capacity cannot adequately replicate the relative seasonal decoupling of gross (on-site) soil erosion and sediment delivery. Instead, embedding seasonality into the calibration routine significantly improved the model performance, revealing a negative relationship between gross (pixel-scale soil displacement) and net erosion (stream channel sediment load) throughout the year. By incorporating temporal dynamics, the relative net effect is a reduction in the magnitudes of the spatially-distributed sediment fluxes at aggregated timescales, compared to a temporally-lumped approach. Published catchment observations suggest that the efficacy of sediment delivery via overland flow is strongly reduced in the summer by abundant vegetative boundaries and increased in the winter via soil crusting and its promotion of runoff.
Models operating at temporally-aggregated timescales should account for the possibility of decoupling in time and space between gross erosion and sediment delivery in arable catchment systems, related to alternations between transport- and detachment-limited sediment transport capacity states. Despite the complexities involved in the temporal downscaling of WaTEM/SEDEM, we show the utility of this approach to: 1) identify key missing information components requiring attention to reduce error in gross erosion predictions (e.g. more consideration of antecedent soil conditions), 2) form a basis for strategically adding physical process-representation, with a focus on maintaining low model complexity while improving predictive skill, and 3) better understand the spatial and temporal interdependencies within soil erosion models when undertaking upscaling exercises.

Received: 13 Nov 2023 – Discussion started: 11 Dec 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2023-2693', Pedro Batista, 04 Jan 2024

General comments
The manuscript investigates the performance of a widely used soil erosion and sediment delivery model (WaTEM/SEDEM) for simulating soil erosion and sediment yield for four catchments in Northwest Europe. The catchment data were taken from a collaborative open-access database and a new open-source model code was implemented in Python. The model, which is usually applied in annual or long-term average annual time steps (as inherited from the RUSLE), is downscaled to a 15-day temporal resolution in order to address the temporal variability of erosion processes in agricultural catchments. Two approaches for model calibration are evaluated: (i) a ‘temporally static’ one, in which a calibrated parameter set is assumed to be constant during the period of the simulation, and (ii) a ‘multitemporal’ one, in which two parameters for different transport capacity equations are calibrated on a monthly basis (i.e., one calibrated parameter set for each month of the year per catchment). In both cases, sediment yield data calculated from catchment outlet measurements are used to calibrate and test the model. No temporal and/or spatial split-off testing is employed, and the same data used for calibration are used for evaluating model performance. Both calibration approaches rely on an optimisation function to define a single best-fit parameter set that minimises the differences between measured and modelled outlet sediment yields. The results show an increase in model accuracy with the use of the monthly calibration routine and the authors conclude that this approach improves temporal representation of soil redistribution processes.
While I appreciate the general motivation of the manuscript, as well as the use of open-access code and catchment data, I have serious concerns regarding the model evaluation approach employed by the authors. The methods for calibrating and testing the model do not consider the uncertainty in the model or in the input data. Moreover, model evaluation is not performed with independent data (i.e., not used during calibration), which can be misleading. This is particularly problematic for spatially distributed erosion models being calibrated against sediment yield data, as models are capable of mimicking outlet sediment flows while misrepresenting internal catchment dynamics. Although spatially explicit soil redistribution rates based on field measurements are available for two of the test catchments, this information was not explicitly incorporated into the model calibration and evaluation processes. Hence, I do not think the modelling methodology is sufficiently sound to evaluate the value of dynamic input data and monthly calibration routines to improve process representation in erosion and sediment delivery models. These issues and several others are discussed in detail in the specific comments below.
I also found that the scientific writing is often not precise and that the figures could be largely improved (again, please see the specific comments).
Although I see potential in this work, I believe that the changes necessary to address the issues in the manuscript would lead to essentially new results and a new paper, and therefore I cannot recommend this for publication. I do think there is a lot of value in improving the parameterisation of simple erosion and sediment delivery models with dynamic, high-resolution spatiotemporal data – but more calibration (i.e., parameter optimisation) is not the answer, in my opinion.
I hope my comments are at all useful.
With very best wishes,
Pedro
Specific comments
L40: Increased in comparison to which baseline?
L48-50: Perhaps “began being developed in the 1930s” would be more precise.
Please note that the references for the USLE and the RUSLE are missing in this paragraph. Moreover, the use of the model names and acronyms is not consistent or not defined. For instance, you could consider rephrasing to “[…] the popularity of the USLE and its revised version, the RUSLE (Renard et al., 1997) […]”.
L59: Can you give an example of these non-linear internal dynamics?
L65: I understand ‘dynamic timescales’ as timescales that change. Is this what you are trying to convey? Or finer and/or different timescales? Maybe it would be good to define this somewhere.
L65: Which phenomena?
L68: What is meant by soil-erosion dynamics here? Are you talking about processes? Temporal variability?
L70: Why deterministic?
L78-80: I do not see your point here. Models should not be complex because they are tested against outlet sediment yields? Wouldn’t it be better to strive for better testing data?
L107: Based on the topics and references you covered above, do you believe it is scientifically sound to validate a spatially distributed erosion model based on outlet sediment yields? Perhaps using terms like “tested” or “compared” would attenuate the issue. See Beven and Young (2013) and Oreskes (1998) regarding modelling semantics.
L109: This sounds very Bayesian. What is meant by posterior here? Why not just outputs?
L109-110: I think the research question could be more precisely stated, e.g., “how accurately can WS simulate 15-day sediment yields with a temporally static calibration?”
L118: Who suggests < 10 km² for this? For instance, I am looking at Fig. 4 from de Vente and Poesen (2005) and their conceptual model includes gully and bank erosion, as well as floodplain deposition, at smaller scales than 10 km².
L140-143: I found this exclusion poorly justified. I think one of the main reasons for employing dynamic, continuous-simulation erosion models with finer temporal resolution is precisely to represent extreme, episodic events. This is because a single extreme event can dominate the sediment load signal in small watersheds for several years (Fiener et al., 2019). Hence, the ability of models to simulate such high magnitude low frequency events (even if lumped within a given time period) should be scrutinised.
Moreover, based on your rationale for excluding the extreme events, shouldn’t the event in March (I can’t tell the exact month from Figure 2) 1999 also be removed from the Kinderveld catchment dataset?
By the way, I think Figure 2 could use some work. The legend could be placed outside the upper panel, as it is applicable to all panels and the legend symbols can be confused with the actual data points. More importantly, I strongly suggest getting rid of the pie charts (see this blog entry https://www.ataccama.com/blog/why-pie-charts-are-evil).
Figure 1: What are the grey-scaled rasters in the figure? I reckon these are catchment DEMs, though they seem to be missing from the legend.
L164: Why hybrid?
L186-187: How exactly is the parcel connectivity implemented? Does this mean a percentage of the sediment is dropped at the field borders? Or it only affects the contributing area/flow accumulation and thus the LS factor?
L198-199: I haven’t looked at the code yet, but this sounds great. Thank you for sharing the model code!
L200-210: “Single optimised/calibrated parameter pair” makes me worried… See the Beven (2006) reference you cited above.
Figure 3: Are the RUSLE factors considered parameters or variables?
L243-244: According to Van Rompaey et al. (2001), the ktc parameter “can be interpreted as the theoretical upslope distance that is needed to produce enough sediment to reach the transport capacity at the grid cell, assuming a uniform slope and discharge”. Based on this definition, how do you interpret these differences in magnitude in the calibrated ktc values for the different TC equations? Moreover, what was the parameter space sampled during the optimisation procedure?
L251-252: This is a great improvement on the model! Could you add another sentence briefly explaining how this diffusive deposition is simulated?
L255: I think there is a word missing here.
L256: Is the SLR also an independent variable for the dynamic model application?
L259: Where is this information given in Table 1?
L262: Why is EU coverage relevant here? Do you wish to test the model or the model + the EU-available data?
Table 2: How are the field parcel data incorporated into the model and to the PTEF/Parcel Connectivity parameterisation? How are roads, paths, and field borders represented with a 25 m spatial resolution (assuming the model spatial resolution is being inherited from the DEM)?
Table 2: Is there no information on crop management (crop rotation, tillage type and orientation, etc) per field parcel in the catchments?
L271: Wouldn’t temporally dynamic parameters be considered variables?
L271-275: Where/when do you use the annual (or average annual?) C factor? I imagine that for the 15-day resolution model you use the SLR as input. This needs to be clarified here.
L280: The SLR is not a decomposed C factor, is it? It is a soil loss ratio, as you explain.
Equation 7: I am surprised to find out that this relation is crop-management independent – did I understand this correctly? This would mean that, for instance, the soil loss ratio for a conventionally tilled potato field would be equivalent to mulch-tilled wheat, if both have the same fractional (canopy!) cover, which in your case is estimated from, again, a crop-independent NDVI relationship. Such an approach would introduce a lot of uncertainty to the model parameterisation, which would need to be represented/quantified, particularly during calibration due to equifinality issues – do you agree? For instance, in Germany, the SLR for a soil cover < 10% ranges from 0.08 to 0.94 depending on the crop and tillage type (Schwertmann et al. 1987). What is the land use and what are the typical crop rotations for your catchments?
L298: This demonstrates that the NDVI is a good predictor for vegetation cover, which indeed it should be, right? But how does this relation evidence a good correspondence between predicted and observed crop dynamics?
Figure 4: I found this figure very confusing. The solid lines, which are missing from the figure legend, are hard to visualise. The legend inside the upper panel is again confusing.
L314: Does this mean you performed a temporal split-off test?
L322-332: I had a hard time understanding the calibration procedure. I suggest reformulating so that the methodology is clearly and simply stated for the reader (e.g., what are the parameters actually being calibrated, for which temporal resolution, what data are used to condition the ktc parameters, what kind of split-off testing is employed – or not – and so on).
Importantly, if I understood correctly, what you call “different connectivity scenarios” are actually part of the parameter optimisation procedure, in which you calibrate the trapping efficiency and parcel connectivity parameters, at least according to the supplementary material (S1.5). This critical information needs to be explicitly stated in the manuscript.
Moreover, I can’t say I understand the part about the sediment delivery ratio (SDR) thresholds, which, again, seems like important information that should be clearly explained in the manuscript – not in the supplementary material. In any case, this approach seems to rely on catchment-lumped SDRs calculated from RUSLE-estimated gross erosion rates and measured outlet sediment yields. This seems to assume that the RUSLE-estimated erosion rates are somehow true, so that deviations from expected delivery ratios would be caused by parameterisation errors or the occurrence of other processes than rill and interrill erosion (lines 125-135 from the SI). I am not sure I agree, as this assumption apparently neglects (RUSLE) model error, which can be quite large given the discrepancies between the data/purpose the model was developed with/for and the settings where it is being applied. Hence, using these SDR thresholds as part of the conditioning process does not seem prudent to me, at least the way this is currently justified. Maybe I misunderstood something, which in any case is not optimal.
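To make the concern about lumped SDRs concrete, the following is a minimal sketch of how such a ratio could be computed and how an error in the gross erosion estimate propagates into it. The function name, the ±30 % relative error, and all numbers are illustrative assumptions; none of them are taken from the manuscript or its supplement.

```python
def lumped_sdr(sediment_yield_t, gross_erosion_t):
    """Catchment-lumped sediment delivery ratio: outlet yield / gross erosion."""
    if gross_erosion_t <= 0:
        raise ValueError("gross erosion must be positive")
    return sediment_yield_t / gross_erosion_t


# Hypothetical values for one 15-day period (tonnes).
measured_yield = 12.0
rusle_gross_erosion = 60.0

sdr_central = lumped_sdr(measured_yield, rusle_gross_erosion)

# Assumed +/-30 % relative error on the RUSLE estimate (illustrative only).
rel_error = 0.30
sdr_low = lumped_sdr(measured_yield, rusle_gross_erosion * (1 + rel_error))
sdr_high = lumped_sdr(measured_yield, rusle_gross_erosion * (1 - rel_error))

print(f"SDR = {sdr_central:.2f} (range {sdr_low:.2f} to {sdr_high:.2f})")
```

Under these assumed numbers the SDR spans a wide interval even though the measured yield is held fixed, which is why a threshold on the SDR implicitly treats the gross erosion estimate as error-free.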
L334: What about the trapping efficiency and parcel connectivity parameters? And the SDR thresholds? It seems misleading to state that only two parameters are being calibrated.
L335-336: Sounds like quite a magical parameter! How does this compare to the definition of the ktc parameter stated above?
L336-339: Great to have an interpretation of the results, but do you think this is enough to open the calibration black box? You risk affirming the consequent without additional independent and spatial data to support your interpretation.
L340-350: So, the same data are used for forcing and testing the model? This hardly seems justifiable, considering the temporal and spatial data available for your catchments. These data would allow for different types of split-off tests and for an evaluation of the transferability of the calibration procedure.
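A temporal split-off test of the kind referred to above could look like the following sketch. It uses synthetic data and a simple least-squares scale factor as a hypothetical stand-in for the calibrated transport capacity parameter; it is not the WaTEM/SEDEM calibration routine, and the "observed" series is deliberately generated independently of the "modelled" one.

```python
import numpy as np


def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def fit_scale(pred, obs):
    """Least-squares scale factor k minimising sum((obs - k * pred)^2)."""
    denom = np.sum(pred ** 2)
    return np.sum(pred * obs) / denom if denom > 0 else 0.0


rng = np.random.default_rng(0)
n_years, per_year = 6, 24                      # 24 periods of ~15 days per year
month = np.tile(np.repeat(np.arange(12), 2), n_years)
year = np.repeat(np.arange(n_years), per_year)

# Synthetic series (assumption): the observations carry no signal from pred.
pred = rng.gamma(2.0, 1.0, size=month.size)    # hypothetical modelled gross erosion
obs = rng.gamma(2.0, 1.0, size=month.size)     # hypothetical observed sediment yield

cal = year < 3                                 # first three years: calibration
val = ~cal                                     # last three years: held out

# Static calibration: one factor for the whole calibration period.
k_static = fit_scale(pred[cal], obs[cal])
sim_static = k_static * pred

# Monthly calibration: one factor per calendar month (12 parameters).
k_month = np.array([
    fit_scale(pred[cal & (month == m)], obs[cal & (month == m)])
    for m in range(12)
])
sim_month = k_month[month] * pred

nse_cal_static = nse(obs[cal], sim_static[cal])
nse_cal_month = nse(obs[cal], sim_month[cal])
nse_val_static = nse(obs[val], sim_static[val])
nse_val_month = nse(obs[val], sim_month[val])

print(f"calibration NSE: static={nse_cal_static:.2f}, monthly={nse_cal_month:.2f}")
print(f"validation  NSE: static={nse_val_static:.2f}, monthly={nse_val_month:.2f}")
```

By construction, the monthly fit can never score worse than the static fit on the calibration years, because the static solution is a special case of the monthly one. Only the held-out years indicate whether the extra parameters carry transferable information.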
Perhaps more importantly, why did you not account for the equifinality issue during calibration? Even with a small number of parameters being calibrated (which I am not sure is the case here), there are several model realisations able to mimic the outlet sediment data if we consider the degrees of freedom afforded to spatially distributed models, and the errors in the input and forcing data – particularly with a monthly calibration. All of these well-known issues, as well as methods for addressing them, are described in some of the references you cite in the manuscript.
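One common way of representing equifinality is to retain every parameter set that reproduces the observations acceptably, rather than a single optimum. The sketch below does this in the spirit of GLUE for a hypothetical two-parameter toy model with synthetic data. The model form, the uniform sampling ranges, the NSE threshold of 0.5, and the unweighted percentile bounds are all simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic forcing and observations (assumption): obs = k * forcing^b * noise.
forcing = rng.gamma(2.0, 1.0, size=48)
obs = 0.8 * forcing ** 1.2 * rng.lognormal(0.0, 0.2, size=48)

# Monte Carlo sampling from uniform ranges (hypothetical parameter space).
n = 5000
k = rng.uniform(0.1, 2.0, n)
b = rng.uniform(0.5, 2.0, n)

# Run the toy model for every sampled pair; result has shape (n, 48).
sims = k[:, None] * forcing[None, :] ** b[:, None]

# Nash-Sutcliffe efficiency of every realisation.
sse = np.sum((sims - obs) ** 2, axis=1)
scores = 1.0 - sse / np.sum((obs - obs.mean()) ** 2)

# Keep all "behavioural" sets above a subjective threshold.
threshold = 0.5
behavioural = scores >= threshold
if not behavioural.any():
    raise RuntimeError("no behavioural parameter sets; widen the sampled ranges")

# Unweighted 5-95 % prediction bounds from the behavioural ensemble.
lower, upper = np.percentile(sims[behavioural], [5, 95], axis=0)

print(f"{behavioural.sum()} of {n} parameter sets are behavioural")
print(f"k range among them: {k[behavioural].min():.2f} to {k[behavioural].max():.2f}")
print(f"b range among them: {b[behavioural].min():.2f} to {b[behavioural].max():.2f}")
```

The spread of `k` and `b` among the behavioural sets shows how differently parameterised realisations can fit the same outlet series, and the bounds give a prediction range instead of a single best-fit line.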
Moreover, why didn’t you use the spatially distributed erosion data from the Ganspoel and Kinderveld catchments to calibrate and/or test the model? The data were specifically collected for this purpose, as stated in the title of Van Oost et al. (2005).
L352: Why does avoiding a cross-validation prevent overfitting?
L362-365: If I understood this correctly, the measured sediment loads do not correlate with the simulated erosion rates from the RUSLE for most (3) of the catchments. Hence, it seems like (i) there are other processes not simulated by the model that are affecting the sediment yield or (ii) the model is not fit for purpose. I imagine now that if you include a monthly calibration your results will improve, especially since there was no split-off testing or uncertainty estimation. Do you reckon this means the model improved as a representation of the system or it simply improved its capacity to mimic the forcing data?
Table 3: The SDR information could be better explained, I am not sure what SDR (max) means. I would also like to see all calibrated parameters here, not just this lumped “connectivity index” (which sounds too much like other indices used in connectivity research, e.g. Borselli et al., 2008).
I am again surprised by the variability in orders of magnitude of the values for the calibrated ktc parameters. Seems like the parameter has been stripped of its original physical meaning and became an adjustment factor to fit the forcing data.
What was the parameter space you sampled? Apparently, you gave the model a lot of room to fit the forcing data.
L380-385: Yes, as expected we see a “boost in model performance”, as you call it. This would be great if it had been achieved by improving the dynamic model parameterisation with measured data. What we see here seems to be the result of an increase in the freedom the model has been afforded to fit the sediment yield data. I assume that if you do a weekly or daily calibration the results would be even more accurate – despite the fact that USLE predictions are known to deteriorate at finer timescales (Risse et al., 1993). Hence, the erosion predictions get worse; but the sediment yields are more accurately simulated – isn’t this in principle contradictory for small catchments with a predominance of rill and interrill erosion and negligible channel processes (i.e., your reasons/assumptions for choosing the test catchments)? That is, if these assumptions are true, shouldn’t the sediment yield be largely explained by the hillslope erosion rates, particularly for 15-day timesteps? In Table 2 we see that the catchments in which predicted erosion rates do not correlate with the measured sediment yields now display the highest NSE values. What does this mean?
I really think it is a great idea to improve the ktc parameterisation in W/S to account for temporal variability in roughness, vegetation cover, etc. But more calibration without uncertainty estimation and using the same (outlet) data for forcing and testing the model is not the answer, in my opinion.
L398: I would not say there is evidence that this calibration corrects any errors – it might simply compensate one error with another (e.g., Pontes et al., 2021).
L398-410: One could also say that during summer there is an overprediction of gross erosion rates which is compensated by calibrating the ktc parameter with very low values that increase hillslope deposition. How do the estimated deposition rates and patterns compare with the measured data in Van Oost et al. (2005)?
L415-420: Aren’t these additional signs of the model compensating for under- and overprediction of erosion by means of calibration?
L438-440: What does moderate capacity mean? Could you give us numbers please? Moreover, what are we supposed to look at in Figure 9c and d? What is modelled and what is measured there? I found Figure 9 to be not very informative and hard to interpret.
Importantly, how do the modelled soil redistribution rates compare with the measured redistribution data in Van Oost et al. (2005)? What about the Ganspoel catchment?
L465: But the BRVL catchment showed no correlation between SSL and gross erosion predictions (r = 0.04) and, at the same time, the highest NSE values following the monthly calibration… Did I understand something wrong here?
L467: It can compensate the error; but does it improve the model’s representation of the system? If we would only care about accurately simulating the sediment yield, why would we need a spatially distributed model anyhow?
L476-477: Again, these are not measured gross erosion rates.
L481: “Winter runoff events dominate the runoff and SSL budget in the BRVL and FDTL catchments (Grangeon et al., 2022), representing cases in which the seasonal dynamics of predicted gross erosion and measured SSL were inverted.”
Then doesn’t this potentially indicate that (i) the model is not fit-for-purpose for simulating these catchments in this temporal resolution and that (ii) you are compensating the spurious temporal simulations of internal soil redistribution by means of calibration of the sediment yield?
L491: I agree, but here you have access to the great spatially distributed erosion data from the Kinderveld and Ganspoel catchments. Why haven’t you used them?
This whole discussion made me think about how the concepts of gross erosion and sediment delivery ratio are somewhat inadequate and how perhaps we would be better off thinking in terms of travel distances (Parsons et al., 2004, 2009).
L545-546: Where is this provided in Table 1? Uncertainty estimation was indeed missing here.
L551: I would argue that in the monthly calibration (which seems to be the precise term here – not ‘multitemporal calibration’) you define an optimisation routine to mimic the 15-day sediment load data, but you haven’t provided evidence this is achieved for the right reasons. In fact, the calibration might be compensating for errors in the model and the model parameterisation (see comments above). Hence, I do not think it is sound to state that this calibration serves as a “proxy for missing parameter information or process components”.
L566: I don’t really get how you are using the term “deterministic” throughout the text. What is deterministic model performance?
L572: Can you give the reader a brief explanation of this matter here and refer to Steegen and Govers (2001) for details?
L587-588: “Nevertheless, non-linear temporal differences in ktc back-propagate over the landscape to change the magnitudes of erosion and deposition (Fig. 9).” I am not sure I get this. Can you be more direct? To be honest, I did not get where you wanted to go with this paragraph.
L610-612: Same here: “The spatial characteristics of soil erosion represent the source of the cascading environmental impacts, arguably making them the key prediction target (Vigiak et al., 2006; Jetten et al., 2003; Merritt et al., 2003). While error on the spatial patterns of soil erosion and sediment transport can confound within the spatially lumped sediment yield, the spatial patterns can remain poor (Jetten et al., 2003)”.
L627: Do you mean at the erosion-plot scale? Moreover, how does model testing reduce the uncertainty in the predictions? Does quantifying model error reduce model uncertainty?
In general, I had a hard time understanding where you wanted to go with section 5.3, which seems somewhat speculative and decoupled from your actual results.
L646-647: What is non-linear seasonality? In any case, wouldn’t it be more accurate to state that “reasonable model performance” (what is reasonable anyway?) was only achieved after a monthly calibration of the ktc parameters? And that the best-fit calibrated parameter set was not tested against independent data (i.e. not used during calibration)?
L661-664: I strongly disagree that the monthly parameter optimisation procedure suggested here improves temporal process representation, due to all the above-mentioned reasons.
References
Beven, K. J.: A manifesto for the equifinality thesis, J. Hydrol., 320(1–2), 18–36, doi:10.1016/j.jhydrol.2005.07.007, 2006.
Beven, K. J. and Young, P.: A guide to good practice in modeling semantics for authors and referees, Water Resour. Res., 49(8), 5092–5098, doi:10.1002/wrcr.20393, 2013.
Borselli, L., Cassi, P. and Torri, D.: Prolegomena to sediment and flow connectivity in the landscape: A GIS and field numerical assessment, Catena, 75(3), 268–277, doi:10.1016/j.catena.2008.07.006, 2008.
Fiener, P., Wilken, F. and Auerswald, K.: Filling the gap between plot and landscape scale - Eight years of soil erosion monitoring in 14 adjacent watersheds under soil conservation at Scheyern, Southern Germany, Adv. Geosci., 48, 31–48, doi:10.5194/adgeo-48-31-2019, 2019.
Van Oost, K., Govers, G., Cerdan, O., Thauré, D., Van Rompaey, A., Steegen, A., Nachtergaele, J., Takken, I. and Poesen, J.: Spatially distributed data for erosion model calibration and validation: The Ganspoel and Kinderveld datasets, Catena, 61(2–3), 105–121, doi:10.1016/j.catena.2005.03.001, 2005.
Oreskes, N.: Evaluation (not validation) of quantitative models, Environ. Health Perspect., 106(6), 1453–1460, doi:10.1289/ehp.98106s61453, 1998.
Parsons, A. J., Wainwright, J., Powell, D. M., Kaduk, J. and Brazier, R. E.: A conceptual model for determining soil erosion by water, Earth Surf. Process. Landforms, 29(10), 1293–1302, doi:10.1002/esp.1096, 2004.
Parsons, A. J., Wainwright, J., Brazier, R. E. and Powell, D. M.: Is sediment delivery a fallacy?, Earth Surf. Process. Landforms, 34, 155–161, doi:10.1002/esp, 2009.
Pontes, L. M., Batista, P. V. G., Silva, B. P. C., Viola, M. R., da Rocha, H. R. and Silva, M. L. N.: Assessing sediment yield and streamflow with SWAT model in a small sub-basin of the Cantareira system, Rev. Bras. Cienc. do Solo, 45, doi:10.36783/18069657rbcs20200140, 2021.
Risse, L. M., Nearing, M. A., Laflen, J. M. and Nicks, A. D.: Error Assessment in the Universal Soil Loss Equation, Soil Sci. Soc. Am. J., 57(1987), 825, doi:10.2136/sssaj1993.03615995005700030032x, 1993.
Van Rompaey, A. J. J., Verstraeten, G., Van Oost, K., Govers, G. and Poesen, J.: Modelling mean annual sediment yield using a distributed approach, Earth Surf. Process. Landforms, 26(11), 1221–1236, doi:10.1002/esp.275, 2001.

Citation: https://doi.org/10.5194/egusphere-2023-2693-RC1
RC2: 'Comment on egusphere-2023-2693', Anonymous Referee #2, 22 Jan 2024
General comments
This paper proposed applying the WaTEM/SEDEM model to four well-documented catchments and improving its temporal resolution to a 15-day time step to better represent seasonality effects on modelled sediment fluxes. This improvement is made in response to the perceived lack of temporally explicit modelling approaches in soil erosion modelling. To this end, a temporally varying transport capacity is developed and included using a two-step modelling approach: i) the transport capacity is fixed over the simulation period and is then ii) calibrated on a monthly basis. One of the main results is that using a constant transport capacity parameter throughout the hydrological year yields unsatisfactory results, while the inclusion of two time-varying transport capacity parameters significantly improved model performance.
The study relies on the use of open access data, used for model calibration and evaluation, to improve an open-access model. Indeed, one of the study outputs is the provision of a Python routine for WaTEM/SEDEM applications. The application of the model is made on three different catchments, and one nested sub catchment. The authors propose to build on the idea of exploring alternative modelling attempts based on increased data availability, supporting more expert-based approaches instead of adding more complexity in existing models (as stated in l. 80-83). While this idea is appealing, I have several serious concerns regarding this work:
Several papers have already addressed the topic of erosion and sediment transfers in the European loess belt, and none of them was referenced: for instance, the early work of Jetten et al. (1999) and Van Dijk & Kwaad (1996), Evrard et al. (2009, 2010), or recent papers such as Landemaine et al. (2023).

One strong argument of the authors is the lack of time-dependent soil erosion models in the literature (l. 15-16). I would strongly disagree with this hypothesis, as numerous time-explicit soil erosion models exist in the literature (e.g. WEPP, EROSION3D, EUROSEM, LISEM, KINEROS). The question of time-dependent variables in soil erosion modelling has also been addressed in empirical or process-based models (e.g. CREAM, SWAT, STREAM, PESERA…). How does this study build on existing approaches and why was a new methodology needed?

To enhance the model’s performance, the authors surprisingly choose to include additional complexity in the model through the transport capacity parameters, which seems to contradict the paper’s working hypothesis.

Modelling catchments with areas on the order of 100–1000 ha at a 15-day time step is highly questionable. It is not consistent with the time scale at which soil erosion processes are expected to occur.

It is unclear how the dataset on which the modelling approach is based was processed. In particular, in such low-order catchments as the BRVL and FDTL catchments, sediment load cannot be estimated from single concentration values, due to high frequency variations in both discharge and suspended sediment concentration at the flood event scale, including hysteresis effects.

The resulting model performance is limited (NSE between 39% and 63%, mean NSE=48%), indicating that the main driving factors of the catchments’ erosion and sediment dynamics were not adequately captured by the proposed modelling approach.

I therefore do not recommend publication. Please find additional comments below.
Specific comments
While model application to previously studied catchments with extensive available datasets should be a strength of this modelling study, the authors surprisingly discarded existing data. It is surprising to read that general databases were preferred over data that were specifically derived for the studied catchments. For example, plots delineation and roads network were derived from combined Integrated Administrative and Control System and Open Street Map data according to Figure 3. Why not use the specific data developed for the studied catchments, as illustrated in Matthews et al. (2023 – Figure 3) and Grangeon et al. (2022 – Figure 8) for the Kinderveld (and possibly Ganspoel), BRVL and the nested FDTL catchments (if I understood correctly data availability described in these two papers) respectively? While I understand the intention of developing a unique workflow for future applications in other catchments, it is unclear why the authors chose to discard this unique opportunity to evaluate an important source of uncertainty in input data for models, a foreseen shortcoming for future model applications on other catchments.
Moreover, the authors did not describe how they processed the raw data to establish the database used for model evaluation. The inability of readers to evaluate how the sediment load was calculated for the BRVL and FDTL catchments is concerning, as experimental values are key in a study intending to evaluate the benefit of a new model parameterization. I understand that ‘Event-variable timestep’ for the Ganspoel and Kinderveld catchments refers to the use of high-frequency water height and turbidity measurements transformed into discharge and Suspended Sediment Concentration (SSC) through gauging and sampling operations (if this is correct, it should be explicitly stated). But how can Suspended Sediment Load be calculated at the runoff event scale using ‘a singular aggregated sediment load’?
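To make the concern concrete, the sketch below contrasts an event load obtained by integrating the product of discharge and SSC over time with a load obtained from a single concentration value multiplied by the event runoff volume. It is only an illustration under stated assumptions: the flood event is synthetic, the function names are hypothetical, and nothing here reflects how the BRVL or FDTL data were actually processed.

```python
import numpy as np

def _trapezoid(values, time_s):
    """Trapezoidal integration of `values` over `time_s` (seconds)."""
    values = np.asarray(values, dtype=float)
    time_s = np.asarray(time_s, dtype=float)
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(time_s)))

def event_load_kg(time_s, discharge_m3s, ssc_g_per_l):
    """Event sediment load as the time integral of Q(t) * C(t).
    1 g/L equals 1 kg/m3, so Q [m3/s] * C [kg/m3] is a flux in kg/s."""
    flux = np.asarray(discharge_m3s, float) * np.asarray(ssc_g_per_l, float)
    return _trapezoid(flux, time_s)

def event_volume_m3(time_s, discharge_m3s):
    """Event runoff volume as the time integral of Q(t)."""
    return _trapezoid(discharge_m3s, time_s)

# Synthetic flood event sampled every 10 minutes (illustrative values only).
# SSC peaks before discharge, i.e. a clockwise hysteresis loop.
t = np.arange(7) * 600.0
q = np.array([0.0, 0.2, 0.5, 0.3, 0.1, 0.05, 0.0])   # m3/s
c = np.array([0.0, 8.0, 6.0, 2.0, 1.0, 0.5, 0.0])    # g/L

integrated = event_load_kg(t, q, c)
# Load inferred from one sample taken on the falling limb (index 4).
single_sample = event_volume_m3(t, q) * c[4]

print(f"integrated load: {integrated:.0f} kg")
print(f"single-sample load: {single_sample:.0f} kg")
```

With these synthetic values the single-sample estimate is several times smaller than the integrated load; a sample taken on the rising limb would instead overestimate it. The size and sign of the error depend entirely on when the sample is taken relative to the hysteresis loop.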
The authors chose to decompose the 15-day dynamics of soil erosion and sediment transfers, based on the claim that ‘explicit temporal dynamics are typically neglected within many soil erosion modelling approaches in favour of a focus on the long-term annual average as the predictive target’ (l.16). First, this is a highly questionable statement, as numerous erosion and sediment transfer models exist (see, for example, the models used in the intercomparison proposed by Baartman et al., 2020); none of these models neglect temporal dynamics. One may also consider the widely used SWAT model (Arnold et al., 1998), which is another illustration of the inaccuracy of this assertion. Moreover, addressing the erosion dynamics of catchments on the order of ~100–1000 ha using a 15-day time step seems a large temporal window for results aggregation relative to the catchments’ response time. In the end, if the model is evaluated against aggregated values, what justifies this choice relative to e.g. one or several months?
The model calibration procedure is unclear. As far as I understand, the model evaluation does not involve splitting the data into training and testing datasets, which may be a concern for adequate model evaluation. The model results are considered satisfactory, while Table 4 indicates that only the total sediment mass is adequately reproduced by the proposed modelling approach. Indeed, with a mean NSE over the four catchments of 48%, the modelled temporal dynamics cannot be considered adequately simulated. This seems like an issue in a paper focusing on the improved temporal representation of a model originally developed to reproduce the total sediment mass.
The authors based their study on the hypothesis that the model fails to reproduce the ‘multitemporal sediment yield’ because of inappropriate transport capacity parameterization, and therefore propose some improvements to the WaTEM/SEDEM model. However, in agricultural catchments, seasonality in sediment loads measured at the catchment outlet may result from variation in gross erosion. What justifies the focus on improving the transport capacity? From the literature cited in the manuscript, the reader can find existing data on the studied catchments that could have been used to complement the analysis with an in-depth discussion of the relative effects of the seasonal variations of both the transport capacity and erosion rates in these agricultural catchments.
Technical comments
Table 1: Please detail how the sediment load was calculated when using a singular aggregated value per event. Please add an order of magnitude for the number of measurements in the high-frequency Ganspoel and Kinderveld catchments.
Figure 2: It is unusual to present discharge and suspended sediment load aggregated by events. It is recommended to use a more traditional time series plot including discharge, suspended sediment concentration and rainfall, as it gives a significant amount of additional information on e.g. stream intermittence, flood event occurrence and characteristics, and hysteresis.
l.16: What characteristics are ‘seasonally changing’?
l.29: While soil crusting and vegetative boundaries (do you refer to grass strips and edges?) are recognized as important factors governing runoff dynamics, they were not explicitly studied in this paper. I would suggest removing this part from the abstract.
l.140-143: I found it surprising that the extreme event recorded during the monitoring period was excluded. It is rare to find datasets including significant rainfall-runoff events for analysis, as they are usually challenging to measure with acceptable accuracy. Moreover, it is recognized that most of a catchment's sediment flux occurs during the largest rainfall-runoff events. Such data are precious and should be analysed in detail instead of being discarded.
l.146: susceptible to infiltration-excess.
l.175: Please define SIR.
l.179-180: Should not this be written ‘intensively cultivated catchments’?
l.192-194: It is not clear how the 15-day temporal window was defined. In particular, it seems contradictory to underline the issue of defining ‘consistent thresholds of rainfall-runoff initiation when modelling discrete event episodes’ and then propose a framework based on an arbitrarily fixed 15-day threshold for modelling with parameters varying on a monthly basis.
l.306-321: The model calibration procedure is not clear. The first paragraph refers to the traditional approach to calibrating WaTEM/SEDEM, an approach discarded here, am I correct? If so, it should be removed. The second paragraph would mean that no split between distinct calibration and validation datasets is performed, is this correct? If so, it is a significant issue in the modelling approach.
l.324: How was a ‘connectivity scenario’ defined?
l.326-328: Why not consider maximizing the NSE in the calibration procedure, which would account for both the temporality and the sediment mass?
l.362-366: The purpose of the comparison between RUSLE models and measurements is not clear here.
l.416-419: The interpretation of relationships with low correlation coefficients is questionable. I would not recommend reporting correlation coefficients to three significant digits.
l.504-511: The introduction of pixel-based CN runoff coefficient in this study is questionable, considering that runoff was not directly evaluated in the model.
References
Arnold J.G., Srinivasan R., Muttiah R.S., Williams J.R. (1998). Large area hydrologic modelling and assessment part I: Model development. Journal of the American Water Resources Association, 34(1):73-89.
Baartman J. E. M., Nunes J. P., Masselink R., Darboux F., Bielders C., Degre A., Cantreul V., Cerdan O., Grangeon T., Fiener P., Wilken F., Schindewolf M., Wainwright J. (2020). What do models tell us about water and sediment connectivity? Geomorphology, 367, Article 107300. https://doi.org/10.1016/j.geomorph.2020.107300
Evrard O., Cerdan O., Van Wesemael B., Chauvet M., Le Bissonnais Y., Raclot D., Vandaele K., Andrieux P., Bielders C. (2009). Reliability of an expert-based runoff and erosion model: Applications of STREAM to different environments. Catena, 78(2):129-141.
Evrard O., Nord G., Cerdan O., Souchère V., Le Bissonnais Y., Bonté P. (2010). Modelling the impact of land use change and rainfall seasonality on sediment export from an agricultural catchment of the northwestern European loess belt. Agriculture, Ecosystems & Environment, 138(1-2):83-94.
Grangeon T., Vandromme R., Pak L.T., et al. Dynamic parameterization of soil surface characteristics for hydrological models in agricultural catchments. Catena 214, 106257 (2022). https://doi.org/10.1016/j.catena.2022.106257.
Jetten V., de Roo A., Favis-Mortlock D. (1999). Evaluation of field-scale and catchment-scale soil erosion models. Catena, 37:521-541.
Landemaine V., Cerdan O., Grangeon T., Vandromme R., Laignel B., Evrard O., Salvador-Blanes S., Laceby P. (2023). Saturation-excess overland flow in the European loess belt: An underestimated process? International Soil and Water Conservation Research, 11(4):688-699. https://doi.org/10.1016/j.iswcr.2023.03.004
Matthews F., Verstraeten G., Borrelli P. et al. EUSEDcollab: a network of data from European catchments to monitor net soil erosion by water. Sci Data 10, 515 (2023). https://doi.org/10.1038/s41597-023-02393-8
Van Dijk P.M., Kwaad F.J.P.M. (1996) Runoff generation and soil erosion in small agricultural catchments with loess-derived soils. Hydrological Processes, 10(8):1049-1059.
Citation: https://doi.org/10.5194/egusphere-2023-2693-RC2
AC1: 'Comment on egusphere-2023-2693', Francis Matthews, 04 Mar 2024

We thank both Reviewers for providing comprehensive reviews of this manuscript. In the attached file we provide author responses to all comments raised by the reviewers. We combine these into a single document, as numerous points raised, especially the clear positioning and justification of the manuscript, are relevant to both reviews. Based on this response, we strongly defend the approach and scientific value of the manuscript (see the general justification of the manuscript), but we will make significant, specific changes to the manuscript in line with the responses provided.
Please see all rebuttals and intended manuscript changes in the attached document.
Thank you on behalf of all co-authors,

Citation: https://doi.org/10.5194/egusphere-2023-2693-AC1

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2023-2693', Pedro Batista, 04 Jan 2024

General comments
The manuscript investigates the performance of a widely used soil erosion and sediment delivery model (WaTEM/SEDEM) for simulating soil erosion and sediment yield for four catchments in Northwest Europe. The catchment data were taken from a collaborative open-access database and a new open-source model code was implemented in Python. The model, which is usually applied in annual or long-term average annual time steps (as inherited from the RUSLE), is downscaled to a 15-day temporal resolution in order to address the temporal variability of erosion processes in agricultural catchments. Two approaches for model calibration are evaluated: (i) a ‘temporally static’ one, in which a calibrated parameter set is assumed to be constant during the period of the simulation, and (ii) a ‘multitemporal’ one, in which two parameters for different transport capacity equations are calibrated on a monthly basis (i.e., one calibrated parameter set for each month of the year per catchment). In both cases, sediment yield data calculated from catchment outlet measurements are used to calibrate and test the model. No temporal and/or spatial split-off testing is employed, and the same data used for calibration are used for evaluating model performance. Both calibration approaches rely on an optimisation function to define a single best-fit parameter set that minimises the differences between measured and modelled outlet sediment yields. The results show an increase in model accuracy with the use of the monthly calibration routine and the authors conclude that this approach improves temporal representation of soil redistribution processes.
While I appreciate the general motivation of the manuscript, as well as the use of open-access code and catchment data, I have serious concerns regarding the model evaluation approach employed by the authors. The methods for calibrating and testing the model do not consider the uncertainty in the model or in the input data. Moreover, model evaluation is not performed with independent data (i.e., not used during calibration), which can be misleading. This is particularly problematic for spatially distributed erosion models being calibrated against sediment yield data, as models are capable of mimicking outlet sediment flows while misrepresenting internal catchment dynamics. Although spatially explicit soil redistribution rates based on field measurements are available for two of the test catchments, this information was not explicitly incorporated into the model calibration and evaluation processes. Hence, I do not think the modelling methodology is sufficiently sound to evaluate the value of dynamic input data and monthly calibration routines to improve process representation in erosion and sediment delivery models. These issues and several others are discussed in detail in the specific comments below.
I also found that the scientific writing is often not precise and that the figures could be largely improved (again, please see the specific comments).
Although I see potential in this work, I believe that the changes necessary to address the issues in the manuscript would lead to essentially new results and a new paper, and therefore I cannot recommend this for publication. I do think there is a lot of value in improving the parameterisation of simple erosion and sediment delivery models with dynamic, high-resolution spatiotemporal data – but more calibration (i.e., parameter optimisation) is not the answer, in my opinion.
I hope my comments are at all useful.
With very best wishes,
Pedro
Specific comments
L40: Increased in comparison to which baseline?
L48-50: Perhaps “began being developed in the 1930s” would be more precise.
Please note that the references for the USLE and the RUSLE are missing in this paragraph. Moreover, the use of the model names and acronyms is not consistent or not defined. For instance, you could consider rephrasing to “[…] the popularity of the USLE and its revised version, the RUSLE (Renard et al., 1997) [..]”.
L59: Can you give an example of these non-linear internal dynamics?
L65: I understand ‘dynamic timescales’ as timescales that change. Is this what you are trying to convey? Or finer and/or different timescales? Maybe it would be good to define this somewhere.
L65: Which phenomena?
L68: What is meant by soil-erosion dynamics here? Are you talking about processes? Temporal variability?
L70: Why deterministic?
L78-80: I do not see your point here. Models should not be complex because they are tested against outlet sediment yields? Wouldn’t it be better to strive for better testing data?
L107: Based on the topics and references you covered above, do you believe it is scientifically sound to validate a spatially distributed erosion model based on outlet sediment yields? Perhaps using terms like “tested” or “compared” would attenuate the issue. See Beven and Young (2013) and Oreskes (1998) regarding modelling semantics.
L109: This sounds very Bayesian. What is meant by posterior here? Why not just outputs?
L109-110: I think the research question could be more precisely stated, e.g., “how accurately can WS simulate 15-day sediment yields with a temporally static calibration?”
L118: Who suggests < 10 km² for this? For instance, I am looking at Fig. 4 from de Vente and Poesen (2005) and their conceptual model includes gully and bank erosion, as well as floodplain deposition, at smaller scales than 10 km².
L140-143: I found this exclusion poorly justified. I think one of the main reasons for employing dynamic, continuous-simulation erosion models with finer temporal resolution is precisely to represent extreme, episodic events. This is because a single extreme event can dominate the sediment load signal in small watersheds for several years (Fiener et al., 2019). Hence, the ability of models to simulate such high magnitude low frequency events (even if lumped within a given time period) should be scrutinised.
Moreover, based on your rationale for excluding the extreme events, shouldn’t the event in March (I can’t tell the exact month from Figure 2) 1999 also be removed from the Kinderveld catchment dataset?
By the way, I think Figure 2 could use some work. The legend could be placed outside the upper panel, as it is applicable to all panels and the legend symbols can be confused with the actual data points. More importantly, I strongly suggest getting rid of the pie charts (see this blog entry https://www.ataccama.com/blog/why-pie-charts-are-evil).
Figure 1: What are the grey-scaled rasters in the figure? I reckon these are catchment DEMs, though they seem to be missing from the legend.
L164: Why hybrid?
L186-187: How exactly is the parcel connectivity implemented? Does this mean a percentage of the sediment is dropped at the field borders? Or it only affects the contributing area/flow accumulation and thus the LS factor?
L198-199: I haven’t looked at the code yet, but this sounds great. Thank you for sharing the model code!
L200-210: “Single optimised/calibrated parameter pair” makes me worried… See the Beven (2006) reference you cited above.
Figure 3: Are the RUSLE factors considered parameters or variables?
L243-244: According to Van Rompaey et al. (2001), the ktc parameter “can be interpreted as the theoretical upslope distance that is needed to produce enough sediment to reach the transport capacity at the grid cell, assuming a uniform slope and discharge”. Based on this definition, how do you interpret these differences in magnitude in the calibrated ktc values for the different TC equations? Moreover, what was the parameter space sampled during the optimisation procedure?
L251-252: This is a great improvement on the model! Could you add another sentence briefly explaining how this diffusive deposition is simulated?
L255: I think there is a word missing here.
L256: Is the SLR also an independent variable for the dynamic model application?
L259: Where is this information given in Table 1?
L262: Why is EU coverage relevant here? Do you wish to test the model or the model + the EU-available data?
Table 2: How are the field parcel data incorporated into the model and to the PTEF/Parcel Connectivity parameterisation? How are roads, paths, and field borders represented with a 25 m spatial resolution (assuming the model spatial resolution is being inherited from the DEM)?
Table 2: Is there no information on crop management (crop rotation, tillage type and orientation, etc) per field parcel in the catchments?
L271: Wouldn’t temporally dynamic parameters be considered variables?
L271-275: Where/when do you use the annual (or average annual?) C factor? I imagine that for the 15-day resolution model you use the SLR as input. This needs to be clarified here.
L280: The SLR is not a decomposed C factor, is it? It is a soil loss ratio, as you explain.
Equation 7: I am surprised to find out that this relation is crop-management independent; did I understand this correctly? This would mean that, for instance, the soil loss ratio for a conventionally tilled potato field would be equivalent to that of mulch-tilled wheat, if both have the same fractional (canopy!) cover, which in your case is estimated from, again, a crop-independent NDVI relationship. Such an approach would introduce a lot of uncertainty to the model parameterisation, which would need to be represented/quantified, particularly during calibration due to equifinality issues; do you agree? For instance, in Germany, the SLR for a soil cover < 10% ranges from 0.08 to 0.94 depending on the crop and tillage type (Schwertmann et al. 1987). What is the land use and what are the typical crop rotations for your catchments?
L298: This demonstrates that the NDVI is a good predictor for vegetation cover, which indeed it should be, right? But how does this relation evidence a good correspondence between predicted and observed crop dynamics?
Figure 4: I found this figure very confusing. The solid lines, which are missing from the figure legend, are hard to visualise. The legend inside the upper panel is again confusing.
L314: Does this mean you performed a temporal split-off test?
L322-332: I had a hard time understanding the calibration procedure. I suggest reformulating so that the methodology is clearly and simply stated for the reader (e.g., what are the parameters actually being calibrated, for which temporal resolution, what data are used to condition the ktc parameters, what kind of split-off testing is employed – or not – and so on).
Importantly, if I understood correctly, what you call “different connectivity scenarios” are actually part of the parameter optimisation procedure, in which you calibrate the trapping efficiency and parcel connectivity parameters, at least according to the supplementary material (S1.5). This critical information needs to be explicitly stated in the manuscript.
Moreover, I can’t say I understand the part about the sediment delivery ratio (SDR) thresholds, which, again, seems like important information that should be clearly explained in the manuscript, not in the supplementary material. In any case, this approach seems to rely on catchment-lumped SDRs calculated from RUSLE-estimated gross erosion rates and measured outlet sediment yields. This seems to assume that the RUSLE-estimated erosion rates are somehow true, so that deviations from expected delivery ratios would be caused by parameterisation errors or the occurrence of processes other than rill and interrill erosion (lines 125-135 from the SI). I am not sure I agree, as this assumption apparently neglects (RUSLE) model error, which can be quite large given the discrepancies between the data/purpose the model was developed with/for and the settings where it is being applied. Hence, using these SDR thresholds as part of the conditioning process does not seem prudent to me, at least the way this is currently justified. Maybe I misunderstood something, which in any case is not optimal.
L334: What about the trapping efficiency and parcel connectivity parameters? And the SDR thresholds? It seems misleading to state that only two parameters are being calibrated.
L335-336: Sounds like quite a magical parameter! How does this compare to the definition of the ktc parameter stated above?
L336-339: Great to have an interpretation of the results, but do you think this is enough to open the calibration black box? You risk affirming the consequent without additional independent and spatial data to support your interpretation.
L340-350: So, the same data are used for forcing and testing the model? This hardly seems justifiable, considering the temporal and spatial data available for your catchments. These data would allow for different types of split-off tests and for an evaluation of the transferability of the calibration procedure.
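As an illustration of what a temporal split-off test could look like, the sketch below calibrates a single parameter on one part of a record and evaluates it on the remainder. Everything in it is a stand-in: the data are synthetic, the one-parameter linear model and the `fit_scale` helper are hypothetical, and the four "years" of roughly 24 fifteen-day steps are assumed only to mirror the temporal resolution discussed here. It is not the WaTEM/SEDEM calibration routine.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def fit_scale(x, y):
    """Least-squares k for y ~ k * x; a stand-in for a calibration routine."""
    return float(np.dot(x, y) / np.dot(x, x))

rng = np.random.default_rng(0)

# Synthetic record: 4 hypothetical years of 24 fifteen-day steps each.
steps_per_year = 24
year = np.repeat(np.arange(4), steps_per_year)
x = rng.gamma(shape=2.0, scale=1.0, size=year.size)   # hypothetical predictor
y = 0.3 * x + rng.normal(0.0, 0.2, size=year.size)    # synthetic "observations"

# Temporal split: calibrate on years 0-1, test on years 2-3.
calib = year < 2
test = ~calib

k = fit_scale(x[calib], y[calib])
nse_calib = nse(y[calib], k * x[calib])   # performance on data used for fitting
nse_test = nse(y[test], k * x[test])      # performance on independent data

print(f"k = {k:.3f}, NSE (calibration) = {nse_calib:.2f}, NSE (test) = {nse_test:.2f}")
```

Only the test-period score is an out-of-sample measure. The same pattern extends to a spatial split, for example calibrating on one catchment and testing on another.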
Perhaps more importantly, why did you not account for the equifinality issue during calibration? Even with a small number of parameters being calibrated (which I am not sure is the case here), there are several model realisations able to mimic the outlet sediment data if we consider the degrees of freedom afforded to spatially distributed models, and the errors in the input and forcing data – particularly with a monthly calibration. All of these well-known issues, as well as methods for addressing them, are described in some of the references you cite in the manuscript.
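The equifinality issue can be sketched with a toy example in which two parameters compensate for each other. The model below (outlet load equals erosion scaling times delivery fraction times a predictor) is deliberately simplistic and hypothetical, as are the parameter ranges and the NSE threshold of 0.8 used to call a parameter set "behavioural"; none of it comes from the manuscript. The approach is a plain Monte Carlo sampling in the spirit of the methods discussed by Beven (2006), not a full uncertainty analysis.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

rng = np.random.default_rng(1)

# Synthetic outlet record generated with a true product a * b of 0.3.
x = rng.gamma(shape=2.0, scale=1.0, size=60)       # hypothetical predictor
obs = 0.3 * x + rng.normal(0.0, 0.1, size=60)      # synthetic outlet loads

# Sample the two-parameter space uniformly (ranges are assumptions).
n_samples = 5000
a = rng.uniform(0.1, 2.0, n_samples)    # hypothetical gross erosion scaling
b = rng.uniform(0.05, 1.0, n_samples)   # hypothetical delivery fraction

scores = np.array([nse(obs, ai * bi * x) for ai, bi in zip(a, b)])

# Retain every parameter set above an (arbitrary) acceptability threshold.
behavioural = scores > 0.8

print("behavioural sets:", int(behavioural.sum()))
print("range of a:", a[behavioural].min().round(2), "to", a[behavioural].max().round(2))
print("range of b:", b[behavioural].min().round(2), "to", b[behavioural].max().round(2))
```

Many parameter sets fit the outlet data about equally well while implying very different gross erosion and delivery, so outlet data alone cannot discriminate between them. Internal, spatially distributed observations would be needed to narrow the behavioural set.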
Moreover, why didn’t you use the spatially distributed erosion data from the Ganspoel and Kinderveld catchments to calibrate and/or test the model? The data were specifically collected for this purpose, as stated in the title of Van Oost et al. (2005).
L352: Why does avoiding a cross-validation prevent overfitting?
L362-365: If I understood this correctly, the measured sediment loads do not correlate with the simulated erosion rates from the RUSLE for most (3) of the catchments. Hence, it seems like (i) there are other processes not simulated by the model that are affecting the sediment yield or (ii) the model is not fit for purpose. I imagine now that if you include a monthly calibration your results will improve, especially since there was no split-off testing or uncertainty estimation. Do you reckon this means the model improved as a representation of the system or it simply improved its capacity to mimic the forcing data?
Table 3: The SDR information could be better explained; I am not sure what SDR (max) means. I would also like to see all calibrated parameters here, not just this lumped “connectivity index” (which sounds too much like other indices used in connectivity research, e.g. Borselli et al., 2008).
I am again surprised by the variability in orders of magnitude of the values for the calibrated ktc parameters. Seems like the parameter has been stripped of its original physical meaning and became an adjustment factor to fit the forcing data.
What was the parameter space you sampled? Apparently, you gave the model a lot of room to fit the forcing data.
L380-385: Yes, as expected we see a “boost in model performance”, as you call it. This would be great if it had been achieved by improving the dynamic model parameterisation with measured data. What we see here seems to be the result of an increase in the freedom the model has been afforded to fit the sediment yield data. I assume that if you do a weekly or daily calibration the results would be even more accurate, despite the fact that USLE predictions are known to deteriorate at finer timescales (Risse et al., 1993). Hence, the erosion predictions get worse, but the sediment yields are more accurately simulated. Isn’t this in principle contradictory for small catchments with a predominance of rill and interrill erosion and negligible channel processes (i.e., your reasons/assumptions for choosing the test catchments)? That is, if these assumptions are true, shouldn’t the sediment yield be largely explained by the hillslope erosion rates, particularly for 15-day timesteps? In Table 2 we see that the catchments in which predicted erosion rates do not correlate with the measured sediment yields now display the highest NSE values. What does this mean?
I really think it is a great idea to improve the ktc parameterisation in W/S to account for temporal variability in roughness, vegetation cover, etc. But more calibration without uncertainty estimation and using the same (outlet) data for forcing and testing the model is not the answer, in my opinion.
L398: I would not say there is evidence that this calibration corrects any errors – it might simply compensate one error with another (e.g., Pontes et al., 2021).
L398-410: One could also say that during summer there is an overprediction of gross erosion rates which is compensated by calibrating the ktc parameter with very low values that increase hillslope deposition. How do the estimated deposition rates and patterns compare with the measured data in Van Oost et al. (2005)?
L415-420: Aren’t these additional signs of the model compensating for under- and overprediction of erosion by means of calibration?
L438-440: What does moderate capacity mean? Could you give us numbers please? Moreover, what are we supposed to look at in Figure 9c and d? What is modelled and what is measured there? I found Figure 9 to be not very informative and hard to interpret.
Importantly, how do the modelled soil redistribution rates compare with the measured redistribution data in Van Oost et al. (2005)? What about the Ganspoel catchment?
L465: But the BRVL catchment showed no correlation between SSL and gross erosion predictions (r = 0.04) and, at the same time, the highest NSE values following the monthly calibration… Did I understand something wrong here?
L467: It can compensate the error; but does it improve the model’s representation of the system? If we would only care about accurately simulating the sediment yield, why would we need a spatially distributed model anyhow?
L476-477: Again, these are not measured gross erosion rates.
L481: “Winter runoff events dominate the runoff and SSL budget in the BRVL and FDTL catchments (Grangeon et al., 2022), representing cases in which the seasonal dynamics of predicted gross erosion and measured SSL were inverted.”
Then doesn’t this potentially indicate that (i) the model is not fit-for-purpose for simulating these catchments in this temporal resolution and that (ii) you are compensating the spurious temporal simulations of internal soil redistribution by means of calibration of the sediment yield?
L491: I agree, but here you have access to the great spatially distributed erosion data from the Kinderveld and Ganspoel catchments. Why haven’t you used them?
This whole discussion made me think about how the concepts of gross erosion and sediment delivery ratio are somewhat inadequate and how perhaps we would be better off thinking in terms of travel distances (Parsons et al., 2004, 2009).
L545-546: Where is this provided in Table 1? Uncertainty estimation was indeed missing here.
L551: I would argue that in the monthly calibration (which seems to be the precise term here – not ‘multitemporal calibration’) you define an optimisation routine to mimic the 15-day sediment load data, but you haven’t provided evidence this is achieved for the right reasons. In fact, the calibration might be compensating for errors in the model and the model parameterisation (see comments above). Hence, I do not think it is sound to state that this calibration serves as a “proxy for missing parameter information or process components”.
L566: I don’t really get how you are using the term “deterministic” throughout the text. What is deterministic model performance?
L572: Can you give the reader a brief explanation of this matter here and refer to Steegen and Govers (2001) for details?
L587-588: “Nevertheless, non-linear temporal differences in ktc back-propagate over the landscape to change the magnitudes of erosion and deposition (Fig. 9).” I am not sure I get this. Can you be more direct? To be honest, I did not get where you wanted to go with this paragraph.
L610-612: Same here: “The spatial characteristics of soil erosion represent the source of the cascading environmental impacts, arguably making them the key prediction target (Vigiak et al., 2006; Jetten et al., 2003; Merritt et al., 2003). While error on the spatial patterns of soil erosion and sediment transport can confound within the spatially lumped sediment yield, the spatial patterns can remain poor (Jetten et al., 2003)”.
L627: Do you mean at the erosion-plot scale? Moreover, how does model testing reduce the uncertainty in the predictions? Does quantifying model error reduce model uncertainty?
In general, I had a hard time understanding where you wanted to go with section 5.3, which seems somewhat speculative and decoupled from your actual results.
L646-647: What is non-linear seasonality? In any case, wouldn’t it be more accurate to state that “reasonable model performance” (what is reasonable anyway?) was only achieved after a monthly calibration of the ktc parameters? And that the best-fit calibrated parameter set was not tested against independent data (i.e. not used during calibration)?
L661-664: I strongly disagree that the monthly parameter optimisation procedure suggested here improves temporal process representation, due to all the above-mentioned reasons.
References
Beven, K. J.: A manifesto for the equifinality thesis, J. Hydrol., 320(1–2), 18–36, doi:10.1016/j.jhydrol.2005.07.007, 2006.
Beven, K. J. and Young, P.: A guide to good practice in modeling semantics for authors and referees, Water Resour. Res., 49(8), 5092–5098, doi:10.1002/wrcr.20393, 2013.
Borselli, L., Cassi, P. and Torri, D.: Prolegomena to sediment and flow connectivity in the landscape: A GIS and field numerical assessment, Catena, 75(3), 268–277, doi:10.1016/j.catena.2008.07.006, 2008.
Fiener, P., Wilken, F. and Auerswald, K.: Filling the gap between plot and landscape scale - Eight years of soil erosion monitoring in 14 adjacent watersheds under soil conservation at Scheyern, Southern Germany, Adv. Geosci., 48, 31–48, doi:10.5194/adgeo-48-31-2019, 2019.
Van Oost, K., Govers, G., Cerdan, O., Thauré, D., Van Rompaey, A., Steegen, A., Nachtergaele, J., Takken, I. and Poesen, J.: Spatially distributed data for erosion model calibration and validation: The Ganspoel and Kinderveld datasets, Catena, 61(2–3), 105–121, doi:10.1016/j.catena.2005.03.001, 2005.
Oreskes, N.: Evaluation (not validation) of quantitative models, Environ. Health Perspect., 106(6), 1453–1460, doi:10.1289/ehp.98106s61453, 1998.
Parsons, A. J., Wainwright, J., Powell, D. M., Kaduk, J. and Brazier, R. E.: A conceptual model for determining soil erosion by water, Earth Surf. Process. Landforms, 29(10), 1293–1302, doi:10.1002/esp.1096, 2004.
Parsons, A. J., Wainwright, J., Brazier, R. E. and Powell, D. M.: Is sediment delivery a fallacy?, Earth Surf. Process. Landforms, 34, 155–161, doi:10.1002/esp, 2009.
Pontes, L. M., Batista, P. V. G., Silva, B. P. C., Viola, M. R., da Rocha, H. R. and Silva, M. L. N.: Assessing sediment yield and streamflow with SWAT model in a small sub-basin of the Cantareira System, Rev. Bras. Cienc. do Solo, 45, doi:10.36783/18069657rbcs20200140, 2021.
Risse, L. M., Nearing, M. A., Laflen, J. M. and Nicks, A. D.: Error Assessment in the Universal Soil Loss Equation, Soil Sci. Soc. Am. J., 57(1987), 825, doi:10.2136/sssaj1993.03615995005700030032x, 1993.
Van Rompaey, A. J. J., Verstraeten, G., Van Oost, K., Govers, G. and Poesen, J.: Modelling mean annual sediment yield using a distributed approach, Earth Surf. Process. Landforms, 26(11), 1221–1236, doi:10.1002/esp.275, 2001.

Citation: https://doi.org/10.5194/egusphere-2023-2693-RC1
RC2:
'Comment on egusphere-2023-2693', Anonymous Referee #2, 22 Jan 2024
General comments
This paper proposes applying the WaTEM/SEDEM model to four well-documented catchments and improving its temporal resolution to a 15-day time step to better represent seasonality effects on modelled sediment fluxes. This improvement is made in response to the perceived lack of temporally explicit modelling approaches in soil erosion modelling. To this end, a temporally varying transport capacity is developed and included using a two-step modelling approach: i) the transport capacity is fixed over the simulation period and is then ii) calibrated on a monthly basis. One of the main results is that using a constant transport capacity parameter throughout the hydrological year yields unsatisfactory results, while the inclusion of two time-varying transport capacity parameters significantly improved model performance.
The study relies on the use of open-access data, used for model calibration and evaluation, to improve an open-access model. Indeed, one of the study outputs is the provision of a Python routine for WaTEM/SEDEM applications. The model is applied to three different catchments and one nested sub-catchment. The authors propose to build on the idea of exploring alternative modelling attempts based on increased data availability, supporting more expert-based approaches instead of adding more complexity to existing models (as stated in l. 80-83). While this idea is appealing, I have several serious concerns regarding this work:
Several papers have already addressed the topic of erosion and sediment transfers in the European loess belt, and none of them is referenced. For instance, the early work of Jetten et al. (1999) and Van Dijk & Kwaad (1996), Evrard et al. (2009, 2010), or recent papers such as Landemaine et al. (2023).

One strong argument of the authors is the lack of time-dependent soil erosion models in the literature (l. 15-16). I strongly disagree with this premise, as numerous time-explicit soil erosion models exist in the literature (e.g. WEPP, EROSION3D, EUROSEM, LISEM, KINEROS). The question of time-dependent variables in soil erosion modelling has also been addressed in empirical or process-based models (e.g. CREAM, SWAT, STREAM, PESERA…). How does this study build on existing approaches and why was a new methodology needed?

To enhance the model’s performance, the authors surprisingly chose to include additional complexity in the model through the transport capacity parameters, which seems to contradict the paper’s working hypothesis.

Modelling catchments with areas on the order of 100–1000 ha at a 15-day time step is highly questionable. It is not consistent with the time scale at which soil erosion processes are expected to occur.

It is unclear how the dataset on which the modelling approach is based was processed. In particular, in such low-order catchments as the BRVL and FDTL catchments, sediment load cannot be estimated from single concentration values, due to high-frequency variations in both discharge and suspended sediment concentration at the flood event scale, including hysteresis effects.

The resulting model performance is limited (NSE between 39% and 63%, mean NSE=48%), indicating that the main driving factors of the catchments’ erosion and sediment dynamics were not adequately captured by the proposed modelling approach.
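For reference, the Nash-Sutcliffe efficiency (NSE) quoted above compares the squared model errors with the variance of the observations about their mean. The following is a minimal Python sketch, not code from the manuscript: the function name `nash_sutcliffe` and all yield values are hypothetical and chosen only for illustration. It also shows that a simulation can reproduce the total sediment mass exactly and still score a negative NSE when the timing is wrong.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - (sum of squared errors / sum of squared
    deviations of the observations from their mean)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    if ss_obs == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1.0 - sse / ss_obs


# Hypothetical 15-day sediment yields in tonnes (NOT data from the manuscript).
obs = [2.0, 0.5, 8.0, 1.0, 0.2, 4.0]
sim = [1.5, 1.0, 5.0, 2.0, 0.5, 3.5]

# Same values as obs but in reversed order: identical total mass, wrong timing.
sim_wrong_timing = list(reversed(obs))

nse_perfect = nash_sutcliffe(obs, obs)
nse_mean_model = nash_sutcliffe(obs, [sum(obs) / len(obs)] * len(obs))
nse_sim = nash_sutcliffe(obs, sim)
nse_wrong_timing = nash_sutcliffe(obs, sim_wrong_timing)

print(nse_perfect, nse_mean_model, nse_sim, nse_wrong_timing)
```

An NSE of 1 indicates a perfect fit, while 0 indicates a simulation that performs no better than predicting the observed mean for every time step.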

I therefore do not recommend publication. Please find additional comments below.
Specific comments
While model application to previously studied catchments with extensive available datasets should be a strength of this modelling study, the authors surprisingly discarded existing data. It is surprising to read that general databases were preferred over data that were specifically derived for the studied catchments. For example, plot delineation and the road network were derived from combined Integrated Administration and Control System and Open Street Map data according to Figure 3. Why not use the specific data developed for the studied catchments, as illustrated in Matthews et al. (2023 – Figure 3) and Grangeon et al. (2022 – Figure 8) for the Kinderveld (and possibly Ganspoel), BRVL and the nested FDTL catchments (if I understood correctly data availability described in these two papers) respectively? While I understand the intention of developing a unique workflow for future applications in other catchments, it is unclear why the authors chose to discard this unique opportunity to evaluate an important source of uncertainty in input data for models, a foreseeable shortcoming for future model applications on other catchments.
Moreover, the authors did not describe how they processed the raw data to establish the database used for model evaluation. The inability of readers to evaluate how the sediment load was calculated for the BRVL and FDTL catchments is concerning, since experimental values are key in a study intending to evaluate the benefit of a new model parameterization. I understand that ‘Event-variable timestep’ for the Ganspoel and Kinderveld catchments refers to the use of high-frequency water height and turbidity measurements transformed into discharge and Suspended Sediment Concentration (SSC) with gauging and sampling operations (if this is correct, it should be explicitly stated). But how can Suspended Sediment Load be calculated at the runoff event scale using ‘a singular aggregated sediment load’?
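To make the concern concrete, an event sediment load is conventionally obtained by integrating the product of discharge and SSC over the event. The sketch below assumes discharge in m³/s, SSC in kg/m³ and time in seconds. The function names and the event values are hypothetical, and it does not describe how the authors processed their data. It contrasts the integrated load with estimates based on a single concentration sample multiplied by the runoff volume.

```python
def trapezoid(times_s, values):
    """Trapezoidal integral of `values` over `times_s`."""
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (values[i] + values[i - 1]) * dt
    return total


def event_sediment_load(times_s, discharge_m3s, ssc_kg_m3):
    """Event load in kg: integral of Q(t) * SSC(t) dt."""
    flux_kg_s = [q * c for q, c in zip(discharge_m3s, ssc_kg_m3)]
    return trapezoid(times_s, flux_kg_s)


# Hypothetical flood event sampled every 10 minutes (NOT measured data).
# SSC peaks before discharge, i.e. a clockwise hysteresis loop.
times = [0, 600, 1200, 1800, 2400, 3000]
discharge = [0.0, 0.2, 0.5, 0.3, 0.1, 0.0]
ssc = [0.0, 6.0, 4.0, 1.5, 0.5, 0.0]

load_kg = event_sediment_load(times, discharge, ssc)
volume_m3 = trapezoid(times, discharge)

# Load estimated from one grab sample times the event runoff volume.
single_rising = ssc[1] * volume_m3   # sample taken on the rising limb
single_falling = ssc[3] * volume_m3  # sample taken on the falling limb

print(load_kg, single_rising, single_falling)
```

In this hypothetical event, the single-sample estimates bracket the integrated load and differ from each other by a factor of four, depending only on when the sample was taken.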
The authors chose to decompose the 15-day dynamics of soil erosion and sediment transfers, based on the claim that ‘explicit temporal dynamics are typically neglected within many soil erosion modelling approaches in favour of a focus on the long-term annual average as the predictive target’ (l.16). First, this is a highly questionable statement, as numerous erosion and sediment transfer models exist: see, for example, the models used in the intercomparison proposed by Baartman et al. (2020), none of which neglect temporal dynamics. The widely used SWAT model (Arnold et al., 1998) is another illustration of the inaccuracy of this assertion. Moreover, for catchments on the order of ~100–1000 ha, a 15-day time step seems a large temporal window for results aggregation relative to the catchments’ response time. In the end, if the model is evaluated against aggregated values, what justifies this choice relative to, e.g., one or several months?
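The dependence of aggregated values on the chosen window can be illustrated with a small Python sketch. The function `aggregate_by_window`, the day indices and the loads are hypothetical and are not taken from the manuscript or its datasets. The sketch only shows how the same events are grouped differently under 15-day and 30-day windows.

```python
from collections import defaultdict


def aggregate_by_window(events, window_days):
    """Sum event loads into consecutive fixed windows of `window_days` days.

    `events` is a list of (day_index, load) pairs with day_index counted
    from 0. Returns a dict mapping window index to total load.
    """
    totals = defaultdict(float)
    for day, load in events:
        totals[day // window_days] += load
    return dict(totals)


# Hypothetical runoff events: (day since start of record, sediment load in t).
events = [(3, 1.2), (10, 0.4), (17, 5.0), (44, 0.8), (46, 2.1)]

by_15 = aggregate_by_window(events, 15)
by_30 = aggregate_by_window(events, 30)

print(by_15)
print(by_30)
```

Note that the events on days 44 and 46 fall into different 15-day windows although they occur two days apart, whereas a 30-day window groups them together. The fixed window boundary, not the process, decides the split.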
The model calibration procedure is unclear. As far as I understand, the model evaluation does not involve splitting the data into training and testing datasets, which may be a concern for adequate model evaluation. The model results are considered satisfactory, while Table 4 indicates that only the total sediment mass is adequately reproduced by the proposed modelling approach. Indeed, with a mean NSE over the four catchments of 48%, the modelled temporal dynamics cannot be considered adequately simulated. This seems like an issue in a paper focusing on the improved temporal representation of a model originally developed to reproduce the total sediment mass.
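A split-sample evaluation of the kind referred to here can be sketched as follows. This is a minimal illustration under strong assumptions: a single hypothetical multiplier `k` stands in for a calibrated parameter, `driver` stands in for an uncalibrated model output, and all numbers are invented. It is not the authors' calibration routine, nor the WaTEM/SEDEM transport capacity equation.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency of `sim` against `obs`."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def calibrate_scale(driver, obs, candidates):
    """Grid search: return the multiplier k that maximises NSE of k * driver."""
    return max(candidates, key=lambda k: nse(obs, [k * d for d in driver]))


# Hypothetical series (NOT data from the manuscript).
driver = [1.0, 4.0, 2.0, 0.5, 3.0, 6.0, 1.5, 2.5]
obs = [0.6, 2.1, 0.9, 0.2, 1.4, 3.3, 0.6, 1.5]

# First half is used for calibration, second half is held out for evaluation.
split = len(obs) // 2
candidates = [i / 100 for i in range(10, 101)]  # k from 0.10 to 1.00

k_best = calibrate_scale(driver[:split], obs[:split], candidates)
nse_cal = nse(obs[:split], [k_best * d for d in driver[:split]])
nse_val = nse(obs[split:], [k_best * d for d in driver[split:]])

print(k_best, nse_cal, nse_val)
```

Only `nse_val` measures predictive skill, because the held-out period played no role in selecting `k_best`. Reporting the calibration score alone would overstate performance.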
The authors based their study on the hypothesis that the model fails to reproduce the ‘multitemporal sediment yield’ because of inappropriate transport capacity parameterization, and therefore propose some improvements for the WaTEM/SEDEM model. However, in agricultural catchments, seasonality in sediment loads measured at the catchment outlet may result from variation in gross erosion. What justifies the focus on improving the transport capacity? From the literature cited in the manuscript, the reader can find existing data on the studied catchments that could have been used to complement the analysis with an in-depth discussion of the relative effects of the seasonal variations of both the transport capacity and erosion rates in these agricultural catchments.
Technical comments
Table 1: Please detail how sediment load was calculated when using a singular aggregated value per event. Please add an order of magnitude of the measurements for the high-frequency Ganspoel and Kinderveld catchments.
Figure 2: It is unusual to present discharge and suspended sediment load aggregated by events. It is recommended to use more traditional time-series plots including discharge, suspended sediment concentration and rainfall, as they give a significant amount of additional information on, e.g., stream intermittence, flood event occurrence and characteristics, and hysteresis.
l.16: What characteristics are ‘seasonally changing’?
l.29: While soil crusting and vegetative boundaries (do you refer to grass strips and edges?) are recognized as important factors governing runoff dynamics, they were not explicitly studied in this paper. I would suggest removing this part from the abstract.
l.140-143: I found it surprising that the extreme event recorded during the monitoring period was excluded. It is rare to find datasets including significant rainfall-runoff events for analysis, as they are usually challenging to measure with acceptable accuracy. Moreover, it is recognized that most of a catchment’s sediment flux occurs during the largest rainfall-runoff events. Such data are precious and should be analysed in detail instead of being discarded.
l.146: susceptible to infiltration-excess.
l.175: Please define SIR.
l.179-180: Should this not be written ‘intensively cultivated catchments’?
l.192-194: It is not clear how the 15-day temporal window was defined. In particular, it seems contradictory to underline the issue of defining ‘consistent thresholds of rainfall-runoff initiation when modelling discrete event episodes’ and then propose a framework based on an arbitrarily fixed 15-day threshold for modelling with parameters varying on a monthly basis.
l.306-321: The model calibration procedure is not clear. The first paragraph refers to the traditional approach to calibrating WaTEM/SEDEM, an approach discarded here, am I correct? If so, it should be removed. The second paragraph would mean that no splitting between distinct calibration/validation datasets is performed, is this correct? If so, it is a significant issue in the modelling approach.
l.324: How was a ‘connectivity scenario’ defined?
l.326-328: Why not consider maximizing the NSE in the calibration procedure, which would account for both the temporality and the sediment mass?
l.362-366: The purpose of the comparison between RUSLE models and measurements is not clear here.
l.416-419: The interpretation of relationships with low correlation coefficients is questionable. I would not recommend using three significant digits for correlation coefficients.
l.504-511: The introduction of a pixel-based CN runoff coefficient in this study is questionable, considering that runoff was not directly evaluated in the model.
References
Arnold J.G., Srinivasan R., Muttiah R.S., Williams J.R. (1998). Large area hydrologic modelling and assessment part I: Model development. Journal of the American Water Resources Association, 34(1):73-89.
Baartman J. E. M., Nunes J. P., Masselink R., Darboux F., Bielders C., Degre A., Cantreul V., Cerdan O., Grangeon T., Fiener P., Wilken F., Schindewolf M., Wainwright J. (2020). What do models tell us about water and sediment connectivity? Geomorphology, 367, Article 107300. https://doi.org/10.1016/j.geomorph.2020.107300
Evrard O., Cerdan O., Van Wesemael B., Chauvet M., Le Bissonnais Y., Raclot D., Vandaele K., Andrieux P., Bielders C. (2009). Reliability of an expert-based runoff and erosion model: Applications of STREAM to different environments. Catena, 78(2):129-141.
Evrard O., Nord G., Cerdan O., Souchère V., Le Bissonnais Y., Bonté P. (2010). Modelling the impact of land use change and rainfall seasonality on sediment export from an agricultural catchment of the northwestern European loess belt. Agriculture, Ecosystems & Environment, 138(1-2):83-94.
Grangeon T., Vandromme R., Pak L.T., et al. Dynamic parameterization of soil surface characteristics for hydrological models in agricultural catchments. Catena 214, 106257 (2022). https://doi.org/10.1016/j.catena.2022.106257.
Jetten V., de Roo A., Favis-Mortlock D. (1999). Evaluation of field-scale and catchment-scale soil erosion models. Catena, 37:521-541.
Landemaine V., Cerdan O., Grangeon T., Vandromme R., Laignel B., Evrard O., Salvador-Blanes S., Laceby P. (2023). Saturation-excess overland flow in the European loess belt: An underestimated process? International Soil and Water Conservation Research, 11(4):688-699. https://doi.org/10.1016/j.iswcr.2023.03.004
Matthews F., Verstraeten G., Borrelli P. et al. EUSEDcollab: a network of data from European catchments to monitor net soil erosion by water. Sci Data 10, 515 (2023). https://doi.org/10.1038/s41597-023-02393-8
Van Dijk P.M., Kwaad F.J.P.M. (1996) Runoff generation and soil erosion in small agricultural catchments with loess-derived soils. Hydrological Processes, 10(8):1049-1059.
Citation: https://doi.org/10.5194/egusphere-2023-2693-RC2
AC1: 'Comment on egusphere-2023-2693', Francis Matthews, 04 Mar 2024

We thank both Reviewers for providing comprehensive reviews of this manuscript. In the attached file we provide author responses to all comments raised by the reviewers. We combine these into a single document because numerous points raised, especially the clear positioning and justification of the manuscript, are relevant to both reviews. Based on this response, we strongly defend the approach and scientific value of the manuscript (see the general justification of the manuscript), but we will make significant, specific changes to the manuscript in line with the responses provided.
Please see all rebuttals and intended manuscript changes in the attached document.
Thank you on behalf of all co-authors,

Citation: https://doi.org/10.5194/egusphere-2023-2693-AC1


Supplement

https://doi.org/10.5194/egusphere-2023-2693-supplement


Viewed

Total article views: 961 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
642	267	52	961	105	53	71


Views and downloads (calculated since 11 Dec 2023)

Month	HTML	PDF	XML	Total
Dec 2023	121	46	10	177
Jan 2024	103	16	4	123
Feb 2024	39	13	5	57
Mar 2024	39	20	3	62
Apr 2024	25	15	8	48
May 2024	14	9	2	25
Jun 2024	43	13	4	60
Jul 2024	26	6	4	36
Aug 2024	17	7	5	29
Sep 2024	11	3	0	14
Oct 2024	9	6	0	15
Nov 2024	3	3	0	6
Dec 2024	8	7	0	15
Jan 2025	5	6	0	11
Feb 2025	11	11	1	23
Mar 2025	14	12	1	27
Apr 2025	9	6	0	15
May 2025	11	6	0	17
Jun 2025	13	20	0	33
Jul 2025	12	8	3	23
Aug 2025	35	8	0	43
Sep 2025	62	14	1	77
Oct 2025	12	12	1	25


Viewed (geographical distribution)

Total article views: 946 (including HTML, PDF, and XML) Thereof 946 with geography defined and 0 with unknown origin.


Latest update: 26 Oct 2025


Short summary

We assess if a simplistic model can simulate the timing of soil erosion and sediment transport (delivery) in several small agricultural catchments in North-West Europe. The findings show that the loss of soil in fields and the delivery of sediment to streams are related in complex (non-linear) ways through time which impact our knowledge of soil redistribution. Furthermore, we show how adaptations of simplistic models can be used to reveal the missing processes which require future developments.

