Predicting rut depth with soil moisture estimates from ERA5-Land and in-situ measurements

Schönauer, Marian; Ågren, Anneli M.; Katzensteiner, Klaus; Hartsch, Florian; Arp, Paul; Drollinger, Simon; Jaeger, Dirk

https://doi.org/10.5281/zenodo.8269412

07 Sep 2023

Abstract. Spatiotemporal modelling is an innovative way of predicting soil moisture and has promising applications in supporting sustainable forest operations. One such application is the prediction of rutting, since rutting can cause severe damage to forest soils and ecological functions.

In this work, we used ERA5-Land soil moisture retrievals and several topographic indices to model the response variable, in-situ soil water content, by means of a random forest model. We then correlated the predicted soil moisture with rut depth from different trials.

Our spatiotemporal modelling approach successfully predicted soil moisture with a Kendall’s rank correlation coefficient of 0.62 (R² of 64 %). The final model included the topographic depth-to-water index, slope, stream power index, and topographic wetness index, as well as temporal components such as numeric variables derived from the date and ERA5-Land soil moisture retrievals. These retrievals proved to be the most important predictor in the model, indicating a large temporal variation. The prediction of rut depth was also successful, resulting in a Kendall’s correlation coefficient of 0.63.

Our results demonstrate that by using data from several sources, including ERA5-Land retrievals, topographic indices and in-situ soil moisture measurements, we can accurately predict soil moisture and use this information to predict rut depth. This has practical applications in reducing the impact of heavy machinery on forest soils and avoiding wet areas during forest operations.

Received: 22 Aug 2023 – Discussion started: 07 Sep 2023

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

20 Jun 2024

Soil moisture modeling with ERA5-Land retrievals, topographic indices, and in situ measurements and its use for predicting ruts

Marian Schönauer, Anneli M. Ågren, Klaus Katzensteiner, Florian Hartsch, Paul Arp, Simon Drollinger, and Dirk Jaeger

Hydrol. Earth Syst. Sci., 28, 2617–2633, https://doi.org/10.5194/hess-28-2617-2024, 2024

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2023-1908', Anonymous Referee #1, 18 Oct 2023

In this study, the authors estimated soil moisture on a few sites using a combination of multi-sourced spatiotemporal variables. The topic is interesting and definitely within the scope of the journal. However, I found a couple of major issues that significantly affected the quality and completeness of this work.
There are a few comments that the authors may consider.
Major issues:
1. As indicated by the title and throughout the manuscript, this work is aimed at predicting rut depth with relevant predictors. However, no rut depth values were predicted. Instead, the model predicted soil moisture. The performance was then evaluated by investigating the correlation between the rut depth and the predicted moisture using Kendall’s correlation coefficient.
I do not think this is the correct approach. Either you need to predict rut depth directly and compare the predictions with the measured rut depth, or you build a second model to predict rut depth using the predicted soil moisture from the previous model and then compare the predicted rut depth with the observations. Simply showing the correlation between the predicted soil moisture and the measured rut depth is not enough, as this is not an apples-to-apples comparison. Given the above reasoning, the current work is incomplete.
2. ERA5 is a very coarse product with spatial resolution at kilometer level. Given the size of the study areas shown in Figure 7, I don’t think ERA5 can provide enough useful information in terms of spatial variability. You may want to justify your selection.
3. The structure of some parts of the manuscript is very confusing. Most contents in the discussion section should be in the introduction section. You discuss results you got in the discussion section, not the motivation, existing efforts, and current gaps.
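
The second option named in major issue 1 (a second model that turns predicted soil moisture into a rut depth prediction, which is then compared with observations) could look roughly like the sketch below. It only illustrates the referee's suggestion and is not the authors' method: the linear form, the leave-one-out evaluation, the function names `fit_line` and `leave_one_out_rmse`, and all numbers are assumptions made for this example.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_rmse(x, y):
    """Predict each rut depth from a line fitted without that observation."""
    squared_errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        squared_errors.append((intercept + slope * x[i] - y[i]) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Made-up values for illustration only, not data from the study.
swc_pred = [22.0, 31.5, 27.0, 40.0, 35.5]   # stage 1 output, vol.-%
rut_depth_cm = [3.0, 9.0, 4.5, 8.0, 12.0]   # field measurement, cm

intercept, slope = fit_line(swc_pred, rut_depth_cm)
rmse_cm = leave_one_out_rmse(swc_pred, rut_depth_cm)
```

Because the error is expressed in centimetres of rut depth, predictions and observations are compared on the same scale, which is the like-for-like comparison asked for here.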

Minor issues:
1. Line 217: Why scale the τ values with 0.99 but not 0.98 or other numbers?
2. The bottom part of Figure 2: As I said previously, the comparison between RD and SWC is not fair. Instead of a comparison, what was really investigated in this manuscript is the correlation between RD and predicted SWC.
3. I suppose some of your input data items may have different spatial resolutions. Maybe consider adding some explanation of how you unify them.
4. Similar to the third major issue above, some titles in section 3 may not be appropriate and are not informative. For example, 3.3 Rut depth data. If it is more about the data, it shouldn’t be in the results section.
5. Line 221: The usage of “Therefore” is confusing, as I did not see any causation between the previous sentence and the one that follows.
6. Line 346: Wrong usage of “Despite”. You talked about the advantages of the manually obtained dataset at the beginning and you recommended manual datasets at the end. I suppose “Therefore” is the correct word to use here.

Citation: https://doi.org/10.5194/egusphere-2023-1908-RC1
- AC1: 'Reply on RC1', Marian Schönauer, 27 Oct 2023
  
  Dear referee,
  We would like to express our gratitude for your valuable input and for dedicating your time to support us in improving the manuscript/model. In this open discussion, we aim to address a major concern raised, issue #1:
  We acknowledge the concern regarding the validation of the model trained on soil moisture for rut depth, and we agree that this validation approach would not be appropriate. Our intention was to validate the model for soil moisture content (SMC) using a repeated cross-validation procedure. Subsequently, we sought to explore the feasibility of utilizing SMC estimates for predicting rut depth, which yielded some promising results.
  During our internal revisions, we did contemplate the idea of using rut depth data to train a separate model. However, we faced limitations in terms of available data, and it may be worthwhile to explore this further. As a result, the comparison between rut depths and SMC estimates emerged as more of a practical application of spatiotemporal SMC maps. Given the nature of these variables, it is possible that their units are different, much like how a dependent variable (e.g., tree height) often differs in scale from the independent variable(s) (e.g., tree age) in a linear model.
  One solution we have considered is scaling the SMC values to match the unit of centimeters, similar to rut depth. We would appreciate hearing any alternative suggestions or ideas you may have regarding this matter.
  Kind regards,
  Marian Schönauer
  On behalf of the authors
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC1
- AC2: 'Reply on RC1', Marian Schönauer, 24 Nov 2023
  
  Author's Response to Referee Comments (https://doi.org/10.5194/egusphere-2023-1908-RC1)
  
  Dear Editor Yongping Wei, dear Referee,
  we would like to express our sincere gratitude for your dedicated efforts in reviewing our manuscript and for your constructive feedback. The recommendations provided have proven to be invaluable and instrumental in enhancing the overall quality of our work.
  Best regards,
  Marian Schönauer
  On behalf of the authors
  
  Please find our responses below, starting with a ‘#’.
  RC: In this study, the authors estimated soil moisture on a few sites using a combination of multi-sourced spatiotemporal variables. The topic is interesting and definitely within the scope of the journal. However, I found a couple of major issues that significantly affected the quality and completeness of this work.
  There are a few comments that the authors may consider.
  Major issues:
  As indicated by the title and throughout the manuscript, this work is aimed at predicting rut depth with relevant predictors. However, no rut depth values were predicted. Instead, the model predicted soil moisture. The performance was then evaluated by investigating the correlation between the rut depth and the predicted moisture using Kendall’s correlation coefficient.
  
  I do not think this is the correct approach. Either you need to predict rut depth directly and compare the predictions with the measured rut depth, or you build a second model to predict rut depth using the predicted soil moisture from the previous model and then compare the predicted rut depth with the observations. Simply showing the correlation between the predicted soil moisture and the measured rut depth is not enough, as this is not an apples-to-apples comparison. Given the above reasoning, the current work is incomplete.
  
  # We have already addressed this comment (https://doi.org/10.5194/egusphere-2023-1908-AC1) in a previous response, but for completeness, we are including it in this document.
  # During our internal revisions, we considered using rut depth data to train a separate model. However, the available data were too limited for this, and the idea may be worth exploring further. Consequently, the comparison between rut depths and soil moisture content (SMC) estimates serves as a practical application of spatiotemporal SMC maps. Although the two variables have different units, we maintain that correlating them is valid. We used Kendall’s rank correlation because it is robust to outliers and does not require the data to be (approximately) normally distributed, making it suitable for analyzing relationships in a wide range of situations.
  # We recognize that the emphasis on predicting rut depth may have been too strong. Therefore, we propose changing the title to 'Soil Moisture Modeling with ERA5-Land Retrievals and In-Situ Measurements and Its Application for Rut Prediction', and will present and discuss the results with more caution. We ask for the referee's understanding in retaining this correlation, which is fundamental to our work.
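
The point that a rank correlation can relate two variables with different units can be illustrated with a small sketch. It computes Kendall's tau-a from scratch and shows that rescaling one variable leaves the coefficient unchanged. The function name `kendall_tau_a`, the simplified tie handling (tied pairs are ignored, so this is not tau-b), and the numbers are assumptions for illustration; they are not taken from the manuscript.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant; the
    tie-corrected tau-b is not implemented in this sketch.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up illustration values, not data from the study.
swc_pred_percent = [22.0, 31.5, 27.0, 40.0, 35.5]   # predicted SWC, vol.-%
rut_depth_cm = [3.0, 9.0, 4.5, 8.0, 12.0]           # measured rut depth, cm

tau = kendall_tau_a(swc_pred_percent, rut_depth_cm)

# Any strictly increasing rescaling (here: percent to fraction, plus an
# offset) preserves the ranks, so tau stays the same.
swc_rescaled = [0.01 * v + 5.0 for v in swc_pred_percent]
tau_rescaled = kendall_tau_a(swc_rescaled, rut_depth_cm)
```

With these example values, 8 of the 10 pairs are concordant and 2 are discordant, so both calls give a tau of 0.6. Converting soil moisture to another unit therefore would not change the reported correlation.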
  ERA5 is a very coarse product with spatial resolution at kilometer level. Given the size of the study areas shown in Figure 7, I don’t think ERA5 can provide enough useful information in terms of spatial variability. You may want to justify your selection.
  
  # Thanks for this recommendation. We will expand the information about ERA5-Land and its advantages in the Material and Methods section. In a global validation of ERA5-Land and NASA Soil Moisture Active Passive with in-situ Soil Water Content (SWC) measurements, Muñoz-Sabater et al. (2021) stated that the bias for ERA5-Land soil moisture retrievals can be high. However, overall, the Root Mean Square Error (RMSE) was low, indicating a good quality representation of the temporal heterogeneity of SWC. The modeling approach employed in our work does not rely on absolute values of the ERA-derived variables, but rather on changes to adjust the model predictions for different seasons. We are confident that ERA5-Land has been a solid choice for this work. One drawback, of course, is the coarse spatial resolution. However, given that seasonal changes are more crucial for our models, the impact of this drawback is limited.
  # Prior to this work, we validated different datasets, including NASA's SMAP, the drought monitor from the German Weather Service (DWD), and soil moisture retrievals from Sentinel-1 (specifically surface soil moisture (SSM) and soil water index (SWI) for different soil depths (2-100 cm)). ERA5-Land gave a good representation of the temporal variations of SWC at our sites. Surprisingly, the Sentinel-1 data, with a spatial resolution of 1x1 km, did not perform better than the 9x9 km data from ERA5-Land. We speculate that Sentinel-1, which uses the relatively short-wavelength C-band, is less effective in predicting soil water status.
  # ERA5-Land utilizes advanced modeling techniques and assimilation of various observational data sources to generate high-quality, global-scale datasets. We believe that this approach leads to the most robust estimates in our region.
  The structure of some parts of the manuscript is very confusing. Most contents in the discussion section should be in the introduction section. You discuss results you got in the discussion section, not the motivation, existing efforts, and current gaps.
  
  # We appreciate this comment and intend to rework the discussion/introduction in a revised version of the manuscript.
  Minor issues:
  Line 217: Why scale the τ values with 0.99 but not 0.98 or other numbers?
  
  # With this threshold, we aimed at selecting models that demonstrated nearly maximum goodness-of-fit. However, we wanted to penalize models with more features, as they are typically prone to overfitting, are more complicated and require more data (Occam’s razor). The threshold of 0.99 * max τ, or -1%, is to some degree arbitrary, we agree. Nevertheless, it is within a range that aligns with our intended selection of still very good models and is consistent with previous work (e.g. Hauglin et al., 2021).
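
One plausible reading of this selection rule is sketched below: among all candidate models, keep those whose τ reaches at least 0.99 times the best τ, then pick the one with the fewest features. The function name `select_parsimonious`, the tuple layout, and the example values are hypothetical, and the sketch assumes positive τ values. The actual procedure in the manuscript may differ in detail.

```python
def select_parsimonious(candidates, tolerance=0.99):
    """Return the candidate with the fewest features whose tau is at
    least `tolerance` times the best tau among all candidates.

    `candidates` is a list of (n_features, tau) tuples; if several
    eligible candidates have the same number of features, the one
    with the higher tau is returned.
    """
    if not candidates:
        raise ValueError("no candidate models given")
    best_tau = max(tau for _, tau in candidates)
    threshold = tolerance * best_tau
    eligible = [c for c in candidates if c[1] >= threshold]
    return min(eligible, key=lambda c: (c[0], -c[1]))


# Hypothetical results of a feature-reduction run (not from the study).
runs = [(12, 0.620), (9, 0.622), (6, 0.618), (4, 0.611), (2, 0.540)]
chosen = select_parsimonious(runs)
```

Here the best τ is 0.622, so the threshold is about 0.616 and the six-feature model is chosen: it loses less than 1 % of the maximum τ while using fewer predictors. A tolerance of 0.98 would instead admit the four-feature model, which shows how the choice of threshold trades goodness-of-fit against model size.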
  The bottom part of Figure 2: As I said previously, the comparison between RD and SWC is not fair. Instead of a comparison, what was really investigated in this manuscript is the correlation between RD and predicted SWC.
  
  # We have made adjustments to the figure, but would like to continue to compare and evaluate the correlations between rut depth (RD) and the predicted values of soil water content (SWC). The rationale for correlating RD with SWC has been discussed above, and we hope to have convincingly addressed any concerns raised by the referee.
  
  I suppose some of your input data items may have different spatial resolutions. Maybe consider adding some explanation of how you unify them.
  
  # Thanks for mentioning this. We will add this information. In principle, we just stacked different maps (with different resolutions), and extracted the raster values at spatial points (i.e. measuring or sensor positions).
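
The stacking and point extraction described in this reply can be sketched without resampling: each layer keeps its own cell size, and each measuring position looks up the cell that contains it. The `Raster` class, the function `extract_at_points`, the grid origins, and all values are hypothetical. The sketch assumes north-up grids in a shared projected coordinate system, which the reply does not state.

```python
from dataclasses import dataclass
from math import floor


@dataclass
class Raster:
    """North-up grid: values[row][col], row 0 at the top edge (y_max)."""
    name: str
    x_min: float
    y_max: float
    cell_size: float
    values: list

    def sample(self, x, y):
        """Value of the cell containing (x, y), or None outside the grid."""
        col = floor((x - self.x_min) / self.cell_size)
        row = floor((self.y_max - y) / self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def extract_at_points(rasters, points):
    """Build one record per point with one column per raster layer."""
    return [
        {"x": x, "y": y, **{r.name: r.sample(x, y) for r in rasters}}
        for x, y in points
    ]


# Hypothetical layers in one projected coordinate system (metres).
dtw = Raster("DTW", 0.0, 30.0, 15.0, [[0.2, 1.4], [0.0, 0.8]])   # 15 m cells
era = Raster("SWC_ERA", 0.0, 9000.0, 9000.0, [[0.31]])           # one 9 km cell
sensor_points = [(5.0, 25.0), (20.0, 10.0)]

table = extract_at_points([dtw, era], sensor_points)
```

In this toy setup both points fall into the same coarse cell and receive the same `SWC_ERA` value, while their `DTW` values differ. This matches the situation in which the coarse layer carries temporal rather than spatial information.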
  Similar to the third major issue above, some titles in section 3 may not be appropriate and are not informative. For example, 3.3 Rut depth data. If it is more about the data, it shouldn’t be in the results section.
  
  # We fully agree and would like to update the headings; for example, change '3.3 Rut depth data' to '3.3 Interrelations between rut depth and SWC.' Additionally, we propose adding sub-headings in the Material and Methods (M&M) section, such as 'Comparisons between model predictions and RD or SWC_CORE,' before the last paragraph of the M&M. The rearrangement of parts of the M&M provides additional clarity.
  Line 221: The usage of “Therefore” is confusing, as I did not see any causation between the previous sentence and the one that follows.
  
  # Thanks for noticing. We have rephrased the sentences in question.
  Line 346: Wrong usage of “Despite”. You talked about the advantages of the manually obtained dataset at the beginning and you recommended manual datasets at the end. I suppose “Therefore” is the correct word to use here.
  
  # We changed the text as recommended.
  
  We would like to express our gratitude to the referee for their valuable feedback. We want to assure them that we are fully capable of implementing the recommended changes, as discussed above. We expect the referee's insightful comments to substantially improve the manuscript.
  
  References
  Hauglin, M., Rahlf, J., Schumacher, J., Astrup, R., and Breidenbach, J. (2021). Large scale mapping of forest attributes using heterogeneous sets of airborne laser scanning and National Forest Inventory data. Forest Ecosystems 8, 65. doi: 10.1186/s40663-021-00338-4
  Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., et al. (2021). ERA5-Land: a state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 13, 4349–4383. doi: 10.5194/essd-13-4349-2021
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC2
RC2: 'Comment on egusphere-2023-1908', Anonymous Referee #2, 27 Oct 2023

The authors predicted soil moisture for a study site based on a statistical model, considering different variables (e.g., depth-to-water (DTW), topographic wetness index, soil moisture satellite estimates). First, they identified which variables are more closely related to soil moisture, then they used the identified variables for soil moisture prediction. They used two different soil moisture estimates collected in the field to validate the model (IMT and SSN). The content of the manuscript is relevant and the data collected in the field are valuable, but the results lack further interpretation and discussion. For instance, what is the physical meaning of having DTW as an important predicting variable? Is this expected? Why is DTW a better predictor in one case (IMT), but ERA-5 is the best predictor in the other case (SSN)? What are the differences in the observations that could lead to that? Also, predicted soil moisture is related to rutting depth only visually in a map. The actual relation between soil moisture and rutting depth needs to be further discussed, otherwise the title of the paper is inconsistent with what it delivers. There are many things like these that need to be better explained (outlined below). Moreover, another concern is that there are two study sites (A and B). In site A, the different measurements overlap in space, so they are comparable. In site B, the different measurements do not overlap in space, so it is unclear to what extent they can be compared. This needs to at least be explained/discussed further in the manuscript. I hope my comments can be useful and that the authors can improve the manuscript by adding some clarifications.
Major comments:
L124, L246: What is the explanation behind adding Year as an explanatory variable? I understand using Month and Season, as a given month or season could be consistently wetter than a different month or season. But it’s not clear to me how a different year could be important to predict soil moisture. This could be relevant in a climate change study, when you have more than 50 years of data for analysis, for instance. Here, you only have a few years of data, which are not representative of extremes, so I am not sure I understand what is the reasoning in incorporating it as an explanatory variable. Moreover, it would be interesting to comment on the discussion on the impact that variables Month or Season could have in different climates (e.g., would they be relevant in places where temperature and precipitation are constant throughout the year?)
Figure 1: it looks like in Site1, IMT, SSN measurements and trials overlap spatially. In Site 2, it looks like the IMT measurements were performed in a different location than SSN + trials. This needs to be clearly stated in the methods section. In the results/discussion section, this issue needs to be addressed as well. I would like to see the Kendall’s correlations (e.g., Figure 5 and 6) drawn separately for site 1 and Site 2. Maybe for Site 1, correlations could be better than Site 2, because of the spatial variations in sampling locations. It could be that you decide to focus on the results mainly from Site 1, because the measurements are more consistent there.
In the methods section, the timing between the different measurements and trials is not clear to me. IMT were collected monthly between Sep 2019 and Oct 2020 (L113). The SSN time coverage is not clear. The trials were conducted in Mar 2021 and Oct 2020 (L189 and L190). But then, in the results section, in Figure 3, it looks like trials were conducted in Mar 2021 and maybe Oct 2022? Please state clearly in the Methods section the time coverage of each measurement and trial. Isn’t it relevant that with SSN you have measurements during/after the trials, whereas for IMT, you only have measurements before the trials? This should be mentioned and the impacts of this different coverage in time should be discussed.
Section 3.1: discuss for instance whether IMT and SSN are recording the same variable or not. According to the methods section, IMT measures soil moisture at 6cm, whereas SSN at 10cm. Is this a big difference or not?
L159: How accurate is the ERA5-Land soil moisture retrieval? The resolution is 9 km x 9 km! So your site is mainly only covered by only one pixel. I think this mismatch in spatial resolution should be mentioned in the discussion when analyzing the results.
L161: Moreover, you stated that the ERA-5 is based on ground-based observations as well. I assume that the data is more accurate for regions where there are ground-based observations, and less accurate for regions where there are not. Was ground-based soil moisture data near your study sites used to “calibrate”/inform the ERA-5 product?
Figure 3: are these time series from Site 1, Site 2, or both? Please make this clear.
Figure 3: based on the Kendall’s coefficients, it looks like correlations in green/layer2/deeper soil are higher than correlations in blue/layer1/top soil. I think it’s important to mention that in the results section and to discuss it in the Discussion section.
3.2. Soil moisture models: for SSN, the most important variable was DTW25 and for IMT, the most important variable was SWC_ERA. Can you explain why these variables turned out to be the most important ones? Was this what you expected? Are there physical reasons for that? You could maybe discuss this more in the Discussion section.
L280-281: I think it’s important to mention in the text that none of the models were significant for Trial 2. In the end, you made the decision of which model to use based on Trial 1. It’s important to discuss more in the Discussion about these differences in Trial 1 and Trial 2.
Figure 7: I am not sure it is fair to use a model to predict soil moisture in trial 1 and trial 2, if you just used the data from trial 1 to train the model. Please discuss this.
The analysis of Figure 7 in general is a bit too shallow. Please discuss more the results of the model. Are predicted soil moisture and rut depth consistent? What is the correlation between these two variables?*** Are the results of the model what you expected? Is the spatialization really relevant here (i.e., are there heterogeneities that the spatial model was able to represent?). And I ask this particularly because I didn’t understand Figure 1: for site B, the measurements of IMT and SSN do not overlap. How did you compare them? If you compared them regardless of the spatial variability, maybe spatial variability is actually not that relevant here.
***One way to check the correlation between two variables that have different units is to use the Spearman rank correlation. It takes into account the ranking between the different variables, not their absolute values.
Can you use the results of the model to actually predict rut depth? If not, please at least discuss this and why not. Otherwise, I think the title of the paper needs to be updated, because in the end rut depth was not predicted as stated in the title.
L315-320: you discuss the relation between rutting and DTW reported by other studies. But, in your study, what were the outcomes in terms of rutting depth and relation with DTW? This is not clear.
There seems to be important differences between Trial 1 and Trial 2, and they have different conditions (wet/dry). Would precipitation data be useful as a proxy for rut depth as well? If yes, comment about this in the discussion. If not, just ignore this comment.
In section 4.1. Importance of predictive systems, you mention the importance of “predictive systems”, and, by that, I understand that one could use the methodology described in the paper to predict rut depth in the future. And one of the highlights of the methodology here is that it incorporates the temporal variability, by using the ERA-5 product. However, ERA-5 data could not be used to predict rutting depth in the future, given that the ERA-5 data is for the present, it is not a forecast for the future. And this brings back the question as to whether precipitation could be an important proxy, because, for precipitation, there are forecasts available.
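
The Spearman rank correlation mentioned in the footnote to the Figure 7 comment can be written down in a few lines: both variables are replaced by their ranks, and the Pearson correlation of those ranks is computed. The helper names and the example numbers below are assumptions for illustration and are not data from the study.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustration values.
swc_pred = [0.22, 0.31, 0.27, 0.40, 0.35]   # predicted SWC, m3 m-3
rut_depth_cm = [3.0, 9.0, 4.5, 8.0, 12.0]   # measured rut depth, cm

rho = spearman_rho(swc_pred, rut_depth_cm)
```

For these values the ranks are (1, 3, 2, 5, 4) and (1, 4, 2, 3, 5), which gives a rho of 0.7. Only the ordering of the observations enters the result, so the differing units of soil moisture and rut depth play no role.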

Minor comments:
L15-16: “(…) to model the response variable, in-situ soil water content” -> “(…) to model the soil water content” – I see no need for the “response variable”, as it is a very vague term.
L20: “(…) as well as temporal components such as numeric variables derived from date and (…)” – “numeric variables” is too vague.
L34: I find it a bit strange to put a reference in the middle of a statement. “Soil compaction as a consequence of harvesting operations (Eliasson, 2005; etc) is detrimental to a (…)”. What is the statement that these citations refer to?
Section 2.1.3. Soil maps: what is the information on these soil maps? Is it soil texture? Please state it.
L60: Agren et al. (2021), used -> without the comma
L91: when you mention the predictor variables here, it would be nice to have an idea of which variables are these. For instance, add in parenthesis (e.g., topographic indexes, soil texture)
I think it would make the paper much clearer if you added a sub-section in the beginning of Material and Methods called “Study design” or “Study overview”. In this section, you could present Figure 2, and you write in a little bit more detail what is written as the caption of Figure 2. I think having an overview of the study design before reading the technicalities of the methods section would be very helpful.
L110-111 and L117-118: both are “known to be temporally wet or sensitive machine traffic” – it’s repetitive. Add this sentence only one time referring to both sites.
L143: what does SPI stand for?
L148: how is “Basin” a variable? Is it the basin area?
L154: how were you “able to gather maps” ? What is the source for these maps?
L149-L150: I couldn’t find any justification for re-sampling to 15m x 15m based on Agren et al. (2014). Please explain where this number comes from. Also, in general, please add clarification on how datasets with different spatial resolutions were merged.
I think the methods section is too long. Some steps are described as a “recipe” instead of a scientific paper. For instance:
L130: “ (…) inserted into the attributed of a shapefile”, L169: “(…) merged with in-situ data” – these small technical steps of merging/formatting two datasets don’t need to be detailed.
I would avoid using full sentences to describe the names of variables. For instance, in L138-139, “(…) of the following sizes: 0.25 ha (DTW025), 1.00ha (DTW1), 4.00 ha (DTW4)”. In L 153-155: “(…) scale of 1:5,000 from forest site surveys (Soil05).” In L156-157, “(…) scale of 1:50,000 are available for the entirety of North Rhine-Westphalia (Soil50)”
L159-162: I would rephrase it as: “ERA-5 Land is a global (…) , including soil moisture [m³ m^-3] at the top soil layer (0-7cm) and at a depth of 7-28 cm. The soil moisture at the top soil layer is retrieved by assimilating satellite and ground-based observations”. The names of the variables Volumetric soil water layer 1 and 2 are not relevant. I don’t think you use them further in the manuscript. If you do, then you can add their names in parenthesis, but if not, I see no reason why they should be mentioned.
In Figure 2, in the second row (predictors), there are some gray lines in the back, which seem to be connecting “soil maps” and the ERA graph to “add data”. However, I assume “topogr. indices” and “Month Season” are supposed to be included as well?
Figure 3: a legend on the side with the colors and the names of the variables would be helpful.
Figure 3: no need to say “the figure displays”
Figure 3: the names ‘layer 1’ and ‘layer 2’ are not really relevant here. I think it would be more relevant to provide what these layers refer to: layer 1 (top soil) and layer 2 (7-28cm depth).
L232-233: “Soil water content was measured (…) August 2020” – I consider this fits in Methods, not Results.
2.2 Rut depth data: make it clear here what is different between Trials 1 and 2. Trial 1 is in a wet condition and Trial 2 is in a dry environment, right? That’s why it is expected that SWC1 > SWC2. This makes the interpretation of Figure 6 afterwards easier.
Figure 6: what do the black lines and the black values refer to? Make it clear in the caption.
L346: but in site 2, the locations of IMT and SSN are different.
L272: “SWC_PRED proved to be a better predictor of rut depth”, particularly for Trial 1 (in wetter conditions).
L317: ground water -> groundwater
L316-318: 65% + 93% is more than 100%. I don’t think I understand this sentence. Please reformulate it.
L316-318: proximity to groundwater and DTW are two different things, no?
Discussion: section 4.1. Importance of predictive systems: I think I would move this in the very end of the discussion.

Citation: https://doi.org/10.5194/egusphere-2023-1908-RC2
- AC3: 'Reply on RC2', Marian Schönauer, 24 Nov 2023
  
  Author's Response to Referee Comments (https://doi.org/10.5194/egusphere-2023-1908-RC2)
  
  Dear Editor Yongping Wei, dear Referee,
  we would like to express our sincere gratitude for your dedicated efforts in reviewing our manuscript and for your constructive feedback. The recommendations provided have proven to be invaluable and instrumental in enhancing the overall quality of our work.
  Best regards,
  Marian Schönauer
  On behalf of the authors
  
  Please find our responses below the respective comments of the reviewers, starting with a ‘#’. In this regard, we would like to note a change in the naming of both trials: the former Trial 1 is now referred to as Trial_WET, and the former Trial 2 as Trial_DRY.
  RC: The authors predicted soil moisture for a study site based on a statistical model, considering different variables (e.g., depth-to-water (DTW), topographic wetness index, soil moisture satellite estimates). First, they identified which variables are more closely related to soil moisture, then they used the identified variables for soil moisture prediction. They used two different soil moisture estimates collected in the field to validate the model (IMT and SSN). The content of the manuscript is relevant and the data collected in the field are valuable, but the results lack further interpretation and discussion.
  For instance, what is the physical meaning of having DTW as an important predicting variable? Is this expected?
  # We initially anticipated DTW to be one of the most influential predictors, but our assumption was partially disproven, as illustrated in Figure 4.
  # In the final model (IMT), SWC_ERAL2 has been identified as the most important variable, followed by Month and Season. It is noteworthy that in data with broader spatial coverage (i.e. IMT), in contrast to the SSN data, dynamic variables took precedence over static predictor variables. Surprisingly, when modeling SSN data, characterized by high temporal resolution and low spatial resolution, DTW025 remained the most influential variable. One might have anticipated the opposite, expecting a topographic index to play a central role in modeling IMT data, and dynamic SWC_ERA variables to dominate the modeling of SSN data.
  # We presume that the low spatial variations of SWC in comparison to temporal variations, inadequately represented by the provided topographic information, may have contributed to this unexpected outcome. Furthermore, the wider spatial coverage in the IMT data likely resulted in more robust averages of SWC, leading to a stronger correlation with the coarse spatial data of ERA5-Land (9x9 km). On the contrary, the SSN data, originating from areas with a size of 100x100 m and known for their temporal wetness, could explain the heightened importance of DTW025. Some sensors might have measured constant water saturation, thereby inflating the explanatory power of topographic information. These assumptions are speculative, and further research in this direction is warranted.
  # In the feature reductions of IMT and SSN data, SWC_ERAL2 (7-28 cm soil depth) dominated over SWC_ERAL1 (0-7 cm). This aligns with in-situ measurements of SWC by the SSN, conducted at a soil depth of approximately 10 cm. Even for the IMT data, where SWC was measured in the top 6 cm of soil, SWC_ERAL2 yielded a better goodness-of-fit compared to SWC_ERAL1.
  # We hypothesize that the prevalence of open lands as the dominant land cover form in the ERA5-Land raster cell contributed to the superior fit of SWC_ERAL2. Grasslands typically exhibit higher temporal heterogeneity of soil moisture compared to forests (James et al., 2003). This temporal heterogeneity tends to decrease with deeper soil layers (Tromp-van Meerveld and McDonnell, 2006). Therefore, the stronger correlation between SWC_ERAL2 and SWC, as well as its higher importance within the random forests, seems reasonable. The disparity between SWC_ERA and in-situ SWC can be attributed to the high transpiration rates in forests, as opposed to grass (Kelliher et al., 1993).
  # We will incorporate this in the revised manuscript.
  Why is DTW a better predictor in one case (IMT), but ERA-5 is the best predictor in the other case (SSN)? What are the differences in the observations that could lead to that?
# Frankly, this result came as a surprise to us. We would have expected a topographic index to be selected as the most important predictor for the IMT data and a temporal predictor for the SSN data. We would have argued that SSN captures temporal variability more than spatial variability, and therefore, a dynamic variable (SWC_ERA) would be crucial. However, we obtained the opposite result, which is interpreted in our response to the comment above.
  Also, predicted soil moisture is related with rutting depth only visually in a map.
  # To enhance the connection between the figures (formerly Figure 6 and Figure 7), we will integrate the maps illustrating raster predictions with scatterplots depicting the correlations between rutting depth (RD) and soil water content (SWC, Figure 1 in this document). This combined presentation aims to reinforce the visual link between these elements.
  Figure 1. Updated Figure 6 and 7.
  The actual relation between soil moisture and rutting depth needs to be further discussed, otherwise the title of the paper is inconsistent with what it delivers.
  # We fully agree and aim to emphasize the interrelations between soil moisture and soil deformation to address this concern in the initial part of the discussion.
  There are many things like these that need to be better explained (outlined below). Moreover, another concern is that there are two study sites (A and B).
  In site A, both different measurements overlap in space, so they are comparable. In site B, the different measurements do not overlap in space, so it’s concerning to what extent they can be compared. This needs to at least be explained/discussed further in the manuscript.
  # Certainly, we appreciate this observation. The IMT data was collected in close proximity to the rut depth measurements (Site 1), or with a distance of up to 1.3 km (Site 2). However, the spatial distance between the IMT training data and the rut depth data did not seem to be crucial for the accuracy of predicting rut depth, since Kendall’s τ between RD and SWC_PRED was similar for both sites (Figure 3 of this document). During the feature reduction, soil information was excluded in the initial stages, likely due to the relatively homogenous soil properties on the relatively small study sites.
  I hope my comments can be useful and that the authors can improve the manuscript by adding some clarifications.
  # Yes, they definitely are. Thank you very much for your suggestions of improvement. We highly appreciate the efforts, as they have brought attention to some shortcomings, enabling us to improve the manuscript and clarify the reliability of our findings.
  Major comments:
  L124, L246: What is the explanation behind adding Year as an explanatory variable? I understand using Month and Season, as a given month or season could be consistently wetter than a different month or season. But it’s not clear to me how a different year could be important to predict soil moisture. This could be relevant in a climate change study, when you have more than 50 years of data for analysis, for instance. Here, you only have a few years of data, which are not representative of extremes, so I am not sure I understand what is the reasoning in incorporating it as an explanatory variable. Moreover, it would be interesting to comment on the discussion on the impact that variables Month or Season could have in different climates (e.g., would they be relevant in places where temperature and precipitation are constant throughout the year?)
# We had long-term trends in mind and therefore included Year as a numerical variable. Changes in the generation of ERA5-Land retrievals could also occur, which Year could potentially account for. However, as we agree with the referee's concern, we have removed Year from the predictor variables.
  Figure 1: it looks like in Site1, IMT, SSN measurements and trials overlap spatially. In Site 2, it looks like the IMT measurements were performed in a different location than SSN + trials. This needs to be clearly stated in the methods section.
  # We fully agree, and included information about the distance in the Material and Methods. We will also address this issue in the results and discussion section.
  In the results/discussion section, this issue needs to be addressed as well. I would like to see the Kendall’s correlations (e.g., Figure 5 and 6) drawn separately for site 1 and Site 2. Maybe for Site 1, correlations could be better than Site 2, because of the spatial variations in sampling locations. It could be that you decide to focus on the results mainly from Site 1, because the measurements are more consistent there.
# The significance of the correlation is borderline, probably due to a low number of observations and the strong heterogeneity of forest soils and machine-soil interaction. It seems questionable to us whether a separation by site would be helpful, since actual forest operations can be conducted on a large scale, and no grouping according to sites will be employed. We seek a general predictive system to avoid deep ruts over the entire area of interest. Therefore, we would like to keep the joint presentation of RD vs. SWC in Figure 2 (of this document), but we have added the separated analysis in Figure 3 (this document), which will be added to the Appendix of the manuscript.
  Figure 2. The updated Figure 7. Variables Year and Basin were excluded.
  # Some additional information when the analysis was done separately:
# Correlation coefficients were higher on Site B (=Site 2; naming the sites A and B, as in the figure with the maps, makes this easier), where the SWC values were gathered at some distance from the rut depth measurements. The model's accuracy did not benefit from the proximity of the in-situ SWC data used for modeling. The number of observations shrinks to 2 in the case of separation.
Figure 3. Rut depth (RD) was determined after four passes of a forwarder, driving on Site A and Site B, during two seasons (Trial_WET and Trial_DRY, conducted under different moisture conditions). RD was compared to SWC values determined for undisturbed soil cores (A), to SWC values predicted by a random forest model trained on manually obtained IMT measurements (B), and to SWC values predicted by a model trained with data from a continuously measuring soil sensor network (SSN, C). Correlations were evaluated using Kendall's τ. The correlation of all values is given in black. Significance levels are indicated by *** for p<0.001, ** for 0.001-0.01, * for 0.01-0.05, (*) for 0.05-0.10, and 'ns' for p>0.10.
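# The significance symbols in the caption above can be written as a simple lookup. This is a minimal sketch and not the code used in the study; the function name is hypothetical, and the assignment of the exact boundary values (for example p = 0.05 to '(*)') is an assumption, since the caption does not state it.

```python
def significance_label(p_value: float) -> str:
    """Map a p-value to the significance symbol listed in the caption.

    Assumption: a boundary value belongs to the weaker class
    (p = 0.05 gives '(*)'), except p = 0.10, which is still '(*)'
    because the caption reserves 'ns' for p > 0.10.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    if p_value <= 0.10:
        return "(*)"
    return "ns"
```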
  
  In the methods section, it’s not clear to me the timing between different measurements and trials. IMT were collected monthly between Sep 2019 and Oct 2020 (L113). The SSN time coverage is not clear. The trials were conducted on Mar 2021 and Oct 2020 (L189 and L190). But then, in the results section, in Figure 3, it looks like trials were conducted in Mar 2021 and maybe Oct 2022?
  Please state in the Methods section clearly the time coverage between each measurement and trial.
  # Thank you for the hint. The SSN was launched in December 2019 and its data was obtained from continuously measuring sensors. Data until 2022-12-31 was included. Date of Trial_DRY was 2022-10-11. We will add this information and correct the typo.
  Isn’t it relevant that with SSN you have measurements during/after the trials, whereas for IMT, you only have measurements before the trials? This should be mentioned and the impacts of this different coverage in time should be discussed.
  # Since the year as a predictor variable has been excluded as suggested, the temporal discrepancy should not be too important anymore. We could add some information regarding this concern in the Discussion and we believe that the temporal discrepancy is obvious from Figure 3 (manuscript). To actually determine the influence of temporal coverage on predictions, further research would be necessary. In this work, it is not possible to distinguish between the impact caused by the time lag between IMT data and the trials and other factors such as spatial coverage, measuring design, etc.
  Section 3.1: discuss for instance whether IMT and SSN are recording the same variable or not. According to the methods section, IMT measures soil moisture at 6cm, whereas SSN at 10cm. Is this a big difference or not?
# Both techniques measure the same parameter, volumetric soil water content, at least in principle. Certainly, the technology is different (refer to sensors in Figure 2, manuscript), but in this work, our focus was not on comparing the measuring results of both devices. Another difference arises from the depth of the measuring range. While soil depth can have a strong effect on soil water content (SWC), the vertical difference here is small, and we assume that the spatial heterogeneity of SWC is much larger than the difference of a few centimeters in soil depth. In addition, it should be noted that the modeling of SWC was conducted separately for each dataset (IMT and SSN), except for the mixed analysis in Appendix A. Therefore, we believe that potential differences between the in-situ measurements are not too significant for this study, although this could not be verified. Nevertheless, a general agreement between SSN and IMT data can be observed in Figure 3 (manuscript).
  L159: How accurate is the ERA5-Land soil moisture retrieval? The resolution is 9 km x 9 km! So your site is mainly only covered by only one pixel. I think this mismatch in spatial resolution should be mentioned in the discussion when analyzing the results.
  # ERA5 captures the 'regional' (9 x 9 km) temporal variability quite well, but within that region, there is substantial spatial soil moisture variability on a smaller scale. To address this, the main procedure of this work involved combining digital terrain indices that capture local-scale variability and merging that information with ERA5 data. This downscaling approach was employed to showcase a pathway for utilizing ERA5 retrievals for high-resolution predictions, considering the local-scale soil moisture variability not adequately captured by the original ERA5 resolution.
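# To illustrate how the coarse temporal signal and the local terrain information end up in one set of predictors, the following minimal sketch assembles a single predictor row. It is not the code used in the study. The function name, the dictionary layout, the numeric values and the month-to-season mapping (meteorological seasons of the Northern Hemisphere) are assumptions made for illustration only.

```python
from datetime import date

# Assumption: meteorological seasons of the Northern Hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def build_predictor_row(static_indices: dict, era5_by_date: dict, day: date) -> dict:
    """Combine static terrain indices of one position with the dynamic
    ERA5-Land value and the date-derived variables of one day."""
    row = dict(static_indices)            # spatial signal: differs between positions
    row["SWC_ERA"] = era5_by_date[day]    # temporal signal: same for all positions in the cell
    row["Month"] = day.month
    row["Season"] = SEASON_BY_MONTH[day.month]
    return row


# Hypothetical values for one position on the date of Trial_DRY.
terrain = {"DTW025": 0.4, "TWI": 9.1, "Slope": 3.5}
era5 = {date(2022, 10, 11): 0.28}
row = build_predictor_row(terrain, era5, date(2022, 10, 11))
```

# All positions within one ERA5-Land cell share the same SWC_ERA value on a given day, so any spatial differences in the predictions for that day must come from the terrain indices.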
  L161: Moreover, you stated that the ERA-5 is based on ground-based observations as well. I assume that the data is more accurate for regions where there are ground-based observations, and less accurate for regions where there are not. Was ground-based soil moisture data near your study sites used to “calibrate”/inform the ERA-5 product?
  # We were not able to ascertain whether observation data from nearby stations was directly used by ERA5-Land. However, in response to this comment, the text should be corrected since 'ERA5-Land does not assimilate observations directly. The observations influence the land surface evolution via the atmospheric forcing. Forcing air temperature, humidity, and pressure are corrected using a daily lapse rate derived from ERA5' (Muñoz-Sabater et al., 2021). We assume that even if a measuring site of ERA5-Land was 'nearby,' the spatial heterogeneity of soil water content (SWC) and differences between habitats would make it very unlikely that the distance between our study sites and the nearest ground-truthing site would have a strong influence on the accuracy.
  Figure 3: are these time series from Site 1, Site 2, or both? Please make this clear.
  # The data originates from both sites, encompassing a total of 2 × 9 measuring positions. We have updated the figure caption accordingly.
  Figure 3: based on the Kendall’s coefficients, it looks like correlations in green/layer2/deeper soil are higher than correlations in blue/layer1/top soil. I think it’s important to mention that in the results section. And to discuss it in the discussion session.
# That is correct. SWC_ERA from layer 2 (7-28 cm) resulted in a better representation of in-situ SWC than layer 1. We have added the information to the manuscript and highlighted the differences in the Results section, as already described on page 1 ("# In the feature reductions of IMT and SSN data...").
  3.2. Soil moisture models: for SSN, the most important variable was DTW25 and for IMT, the most important variable was SWC_ERA. Can you explain why these variables turned out to be the most important ones? Was this what you expected? Are there physical reasons for that? You could maybe discuss this more in the Discussion section.
  # Frankly, we did not expect this. Please find the replies to these questions above.
  L280-281: I think it’s important to mention in the text that none of the models were significant for Trial 2. In the end, you made the decision of which model to use based on Trial 1.
  # The correlation between the outputs of the SWC models and RD data was not utilized to select the final models. We aimed to improve clarity by re-arranging the method section and providing more details. The best SWC model was selected based on cross-validation. Subsequently, this best model was employed to make SWC predictions for the days of the field trials (with a forwarder) and these predictions were then compared with the resulting RD. Furthermore, Figure 2 was updated to reflect these changes. We have endeavoured to ensure the clarity of the manuscript and assume that the implemented suggestions for improvement have improved the quality of the manuscript.
  
  Figure 4. Updated flow chart.
  It’s important to discuss more in the Discussion about these differences in Trial 1 and Trial 2.
# Yes, we agree, and added the following information to the discussion: Although a strong association between rut depth (RD) and predicted values of SWC was detected, the influence of differences between the trials is obvious. However, the ranges of RD for each trial were consistent with the SWC predictions. During Trial_WET, a significant correlation between RD and SWC_PRED was shown. We assume that the wetter conditions during this trial, in which soils are destabilized (Hillel, 1998; McNabb et al., 2001), enhanced the predictive power of topographic indices representing soil water distributions. For example, DTW025 overlapped with surface water in depressions as seen in the field campaigns for Trial_WET. During Trial_DRY, SWC along the measuring sections was most likely below the threshold at which soils become susceptible to deformation. For example, Poltorak et al. (2018) stated that ruts only occurred on soils with an SWC above 50%, and SWC during Trial_DRY was below 30%.
  Figure 7: I am not sure it is fair to use a model to predict soil moisture in trial 1 and trial 2, if you just used the data from trial 1 to train the model. Please discuss this.
# We did not incorporate data from the RD measurements in the modeling of SWC. To enhance clarity, we would reorganize the sections in the methods to ensure a distinct separation between SWC modeling and the validation of these outputs with rut depth data. At the end of the modeling section, we would conclude with the statement “The final models were then used to make prediction rasters of SWC_PRED, which were visually evaluated. Subsequently, the outputs of the final models (built solely on IMT and SSN data) were compared to rut depths and SWC at the machine operating trails.”
  The analysis of Figure 7 in general is a bit too shallow. Please discuss more the results of the model. Are predicted soil moisture and rut depth consistent? What is the correlation between these two variables (One way to check the correlation between two variables that have different units is to use the Spearman rank correlation. It takes into account the ranking between the different variables, not their absolute values.)
# In the updated Figure 2 (this manuscript), we have integrated the correlation between RD and predicted (as well as observed) values of SWC with the raster predictions. We hope that this new figure is more intuitive than the previous one. We believe that raster predictions play a crucial role in practical applications of SWC modeling, particularly for generating maps that can be used in day-to-day work. Our intention was to underscore the potential for creating daily maps through this modeling approach.
  # The coefficients provided by Spearman and Kendall rank correlation methods exhibit considerable similarity. Spearman's method offers the advantage of computational simplicity, a historical consideration rooted in the era when correlations were manually calculated. However, in modern computational settings, Kendall's coefficient of correlation is often considered more robust and is generally the preferred choice due to its desirable statistical properties.
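# Both coefficients operate on ranks only, which is why they can relate variables with different units such as SWC and RD. The following minimal sketch computes Kendall's τ (the tau-b variant, which corrects for ties) and Spearman's ρ with the Python standard library. It is an illustration, not the code used in the study; the function names and the example values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                      # tied in both variables: ignored
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


def average_ranks(values):
    """Ranks starting at 1; tied values receive their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: predicted SWC (vol. %) and rut depth (cm).
swc_pred = [22, 30, 35, 41, 48, 55]
rut_depth = [3, 5, 4, 9, 8, 14]
tau = kendall_tau_b(swc_pred, rut_depth)
rho = spearman_rho(swc_pred, rut_depth)
```

# For these hypothetical values, 13 of the 15 pairs are concordant and 2 are discordant, so τ = 11/15, while ρ is somewhat higher.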
  Are the results of the model what you expected? Is the spatialization really relevant here (i.e., are there heterogeneities that the spatial model was able to represent?). And I ask this particularly because I didn’t understand Figure 1: for site B, the measurements of IMT and SSN do not overlap. How did you compare them? If you compared them regardless of the spatial variability, maybe spatial variability is actually not that relevant here.
  # While it would have been feasible to conduct cross-validation of the SWC models separately for each site, our primary focus was not on site-specific effects but rather on the general impact of topography on soil moisture dynamics. We asserted that both sites were relatively comparable in terms of soil and stand properties, and our objective was to leverage the varied positions of SSN sensors or IMT measurements to derive topographically driven patterns.
  # It's important to note that the IMT data was not directly compared with the SSN data. These datasets from different sources were treated independently in the modeling process. Consequently, this form of spatial variability does not appear relevant to our analysis. We hope that discussing this issue provides more clarity about the manuscript.
  Can you use the results of the model to actually predict rut depth? If not, please at least discuss this and why not. Otherwise, I think the title of the paper needs to be updated, because in the end rut depth was not predicted as stated in the title.
  # We could update the title to 'Soil moisture modeling with ERA5-Land retrievals and in-situ measurements and its use for predicting ruts' to reduce the emphasis on 'predicting rut depth.'
# While we assert that predicting rut depth using the modeled values of SWC is possible, it is crucial to approach the results with extreme caution. Prompted by the comments of referee 2, we devoted careful consideration to spatial variation, leading us to construct a semivariogram using rut depth data and SWC data from both sources (Figure 5, below). It appears that the rut depth data might exhibit some degree of spatial covariation. Although this realization is unfortunate for our study, we appreciate that the referee's comments have helped identify this critical issue. We have adjusted the manuscript to present the results in a fair manner and hope that future research can build upon this knowledge, particularly by creating better study designs for surveying predictions of rutting in the proposed manner.
  Figure 5. Semivariogram of rut depth (A) and in-situ soil water content (B).
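# For readers unfamiliar with semivariograms, the following minimal sketch shows how an empirical semivariogram can be computed with the classical estimator γ(h) = Σ(z_i − z_j)² / (2 N(h)), where N(h) is the number of point pairs in the distance bin h. It is an illustration only, not the code behind Figure 5. The function name, the planar coordinates in metres, the isotropic distance bins and the example values are assumptions.

```python
from math import hypot


def empirical_semivariogram(points, values, bin_width, n_bins):
    """Return a list of (lag_centre, semivariance, n_pairs) per distance bin.

    points: sequence of (x, y) coordinates in a planar system (e.g. metres).
    values: measurements at these points (e.g. rut depth or SWC).
    Bins without any point pair are omitted from the result.
    """
    if bin_width <= 0 or n_bins < 1:
        raise ValueError("bin_width must be positive and n_bins at least 1")
    if len(points) != len(values):
        raise ValueError("points and values need the same length")
    squared_diff_sums = [0.0] * n_bins
    pair_counts = [0] * n_bins
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distance = hypot(points[i][0] - points[j][0],
                             points[i][1] - points[j][1])
            bin_index = int(distance // bin_width)
            if bin_index < n_bins:        # pairs beyond the last bin are dropped
                squared_diff_sums[bin_index] += (values[i] - values[j]) ** 2
                pair_counts[bin_index] += 1
    return [
        ((b + 0.5) * bin_width, squared_diff_sums[b] / (2 * pair_counts[b]), pair_counts[b])
        for b in range(n_bins)
        if pair_counts[b] > 0
    ]


# Hypothetical transect: four points 1 m apart with a linear trend.
variogram = empirical_semivariogram(
    points=[(0, 0), (1, 0), (2, 0), (3, 0)],
    values=[0.0, 1.0, 2.0, 3.0],
    bin_width=1.0,
    n_bins=3,
)
```

# A semivariance that increases with lag distance, as in this hypothetical transect, indicates that nearby observations are more alike than distant ones, which is the kind of spatial covariation referred to above.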
  L315-320: you discuss the relation between rutting and DTW reported by other studies. But, in your study, what were the outcomes in terms of rutting depth and relation with DTW? This is not clear.
# Considering the significance of the topographic indices DTW and TWI in the development of the SWC models, we will compare RD with both indices in the revised version of the manuscript. For now, Figure 6 (below) might be of interest.
  
  Figure 6. Comparison between topographic indices and rut depth.
  There seems to be important differences between Trial 1 and Trial 2, and they have different conditions (wet/dry). Would precipitation data be useful as a proxy for rut depth as well? If yes, comment about this in the discussion. If not, just ignore this comment.
  # Precipitation is a significant driver in the soil-plant-atmosphere continuum and undeniably influences soil moisture. However, numerous other drivers contribute to this continuum, encompassing factors such as evapotranspiration, interception, lateral and vertical flow. Modeling these complex interactions can be extensive, and we hold the belief that the retrievals of ERA5-Land provide a robust means to integrate soil moisture dynamics.
  In section 4.1. Importance of predictive systems, you mention the importance of “predictive systems”, and, by that, I understand that one could use the methodology described in the paper to predict rut depth in the future. And one of the highlights of the methodology here is that it incorporates the temporal variability, by using the ERA-5 product. However, ERA-5 data could not be used to predict rutting depth in the future, given that the ERA-5 data is for the present, it is not a forecast for the future. And this brings back the question as to whether precipitation could be an important proxy, because, for precipitation, there are forecasts available.
  # The term 'Prediction' used in this work excludes forecasts into the mid-range future but aims to predict rut depths that can be anticipated in a forest operation conducted today or tomorrow. While we have gained some experiences with forecasts of soil moisture content, we have observed that uncertainties can be quite high. Consequently, we decided to focus on current predictions rather than extrapolations with potentially high biases.
# However, it's worth noting that there are attempts to extend predictions into medium-range forecasting, as demonstrated by efforts such as those by the Finnish Meteorological Institute. We added information on this topic to the discussion.
  Minor comments:
  L15-16: “(…) to model the response variable, in-situ soil water content” -> “(…) to model the soil water content” – I see no need for the “response variable”, as it is a very vague term.
# Thank you for the hint. We omitted the “response variable” as recommended.
  L20: “(…) as well as temporal components such as numeric variables derived from date and (…)” – “numeric variables” is too vague.
  # After removing Year as a predictor, Month and Season could be found in the final models. We have updated the abstract accordingly.
  L34: I find it a bit strange to put a reference in a middle of a statement. “Soil compaction as a consequence of harvesting operations (Eliasson, 2005; etc) is detrimental to a (…)”. What is the statement that these citations refer to?
  # We rephrased the sentence: 'Soil compaction is a consequence of harvesting,' as reported by Eliasson (2005), etc. to explain the negative consequences for ecological functions.
  Section 2.1.3. Soil maps: what is the information on these soil maps? Is it soil texture? Please state it.
# We used the (categorical) soil type information and added the information to the Materials and Methods section.
  L60: Agren et al. (2021), used -> without the comma
  # Done.
  L91: when you mention the predictor variables here, it would be nice to have an idea of which variables are these. For instance, add in parenthesis (e.g., topographic indexes, soil texture)
  # We have changed the statement as recommended.
  I think it would make the paper much clearer if you added a sub-section in the beginning of Material and Methods called “Study design” or “Study overview”. In this section, you could present Figure 2, and you write in a little bit more detail what is written as the caption of Figure 2. I think having an overview of the study design before reading the technicalities of the methods section would be very helpful.
  # That is a good idea, we implemented a short overview to introduce the method section.
  L110-111 and L117-118: both are “known to be temporally wet or sensitive machine traffic” – it’s repetitive. Add this sentence only one time referring to both sites.
  # Yes, that's true. We have removed the repetitions, thanks for mentioning.
  L143: what SPI stands for?
  # stream power index
  L148: how is “Basin” a variable? Is it the basin area?
# “Basin” is a factorial variable which defines the catchment in which the positions are included. We removed Basin from the predictor variables to avoid confusion. Since all SSN positions at a site fell into the same Basin, it was discarded very early in the IMT modelling anyway.
  L154: how were you “able to gather maps”? What is the source for these maps?
  # The maps were provided by the Geological Survey. We adjusted the text accordingly.
  L149-L150: I couldn’t find any justification for re-sampling to 15m x 15m based on Agren et al. (2014). Please explain where this number comes from.
# We added a clarification of the 15x15 m resolution in the Materials and Methods section. This resolution has been shown to exhibit a strong correlation with SWC and can be assumed to be more robust (Ågren et al., 2014), as observed in prior work where resolutions ranging from 1 to 20 m were tested (data not shown). Larson et al. (2022) made a more thorough evaluation for more sites and indices. In Larson’s study, TWI was best predicted at 16 m, which is close to 15 m.
  Also, in general, please add clarification on how datasets with different spatial resolutions were merged.
  # We added this information. In principle, we just stacked different maps (with different resolutions), and extracted the raster values at spatial points (i.e. measuring or sensor positions).
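# The stacking and point extraction described above can be sketched as follows. This is a simplified illustration with the Python standard library, not the code used in the study. It assumes north-up rasters in a common planar coordinate system with square cells; the class, the layer names and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class Raster:
    """A north-up raster with square cells in a planar coordinate system."""
    x_min: float                      # x coordinate of the left edge
    y_max: float                      # y coordinate of the top edge
    cell_size: float                  # edge length of one cell (same unit as x, y)
    rows: Sequence[Sequence[float]]   # rows[0] is the northernmost row

    def value_at(self, x: float, y: float) -> Optional[float]:
        """Value of the cell containing (x, y), or None outside the raster."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if 0 <= row < len(self.rows) and 0 <= col < len(self.rows[row]):
            return self.rows[row][col]
        return None


def extract_stack(stack: Dict[str, Raster], x: float, y: float) -> Dict[str, Optional[float]]:
    """Extract one value per layer at a point; layers may differ in resolution."""
    return {name: raster.value_at(x, y) for name, raster in stack.items()}


# Hypothetical stack: a 15 m terrain index and a single 9 km ERA5-Land cell.
stack = {
    "TWI": Raster(x_min=0.0, y_max=30.0, cell_size=15.0,
                  rows=[[1.0, 2.0], [3.0, 4.0]]),
    "SWC_ERA": Raster(x_min=-4000.0, y_max=5000.0, cell_size=9000.0,
                      rows=[[0.31]]),
}
at_sensor = extract_stack(stack, x=20.0, y=10.0)
```

# Because each layer is queried with the same coordinates, no resampling to a common grid is needed for the point data; every layer simply returns the value of the cell that contains the measuring or sensor position.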
  I think the methods section is too long. Some steps are described as a “recipe” instead of a scientific paper. For instance: L130: “ (…) inserted into the attributed of a shapefile”, L169: “(…) merged with in-situ data” – these small technical steps of merging/formatting two datasets don’t need to be detailed.
  # To provide sufficient information in the methods section is a delicate balance – between overly lengthy descriptions and insufficient detail. We aim to ensure that interested (possibly also inexperienced) scientists can replicate the methods. Readers less interested in detail can skim through the text. We hope for the referee's approval.
  I would avoid using full sentences to describe the names of variables. For instance, in L138-139, “(…) of the following sizes: 0.25 ha (DTW025), 1.00ha (DTW1), 4.00 ha (DTW4)”. In L 153-155: “(…) scale of 1:5,000 from forest site surveys (Soil05).” In L156-157, “(…) scale of 1:50,000 are available for the entirety of North Rhine-Westphalia (Soil50)”
  # We agree and changed the descriptions as recommended.
  L159-162: I would rephrase it as: “ERA-5 Land is a global (…) , including soil moisture [m³ m^-3] at the top soil layer (0-7cm) and at a depth of 7-28 cm. The soil moisture at the top soil layer is retrieved by assimilating satellite and ground-based observations”. The names of the variables Volumetric soil water layer 1 and 2 are not relevant. I don’t think you use them further in the manuscript. If you do, then you can add their names in parenthesis, but if not, I see no reason why they should be mentioned.
  # We agree and rephrased the sentence, but kept the information on the layers, since it is needed for the feature selection.
  In Figure 2, in the second row (predictors), there are some gray lines in the back, which seem to be connecting “soil maps” and the ERA graph to “add data”. However, I assume “topogr. indices” and “Month Season” are supposed to be included as well?
  # Certainly. We updated the flow chart (Figure 4, this document).
  Figure 3: a legend on the side with the colors and the names of the variables would be helpful.
# A legend was added, as suggested.
  Figure 3: no need to say “the figure displays”
  # This sentence part was removed from the caption, as recommended.
  Figure 3: the names ‘layer 1’ and ‘layer 2’ are not really relevant here. I think it would be more relevant to provide what these layers refer to: layer 1 (top soil) and layer 2 (7-28cm depth).
  # Changed as recommended.
  L232-233: “Soil water content was measured (…) August 2020” – I consider this fits in Methods, not Results.
# The sentence was removed. The abbreviations IMT and SSN are explained again at this point as a reminder.
  2.2 Rut depth data: make it clear here what is different between Trials 1 and 2. Trial 1 is in a wet condition and Trial 2 is in a dry environment, right? That’s why it is expected that SWC1 > SWC2. This makes the interpretation of Figure 6 afterwards easier.
# We changed the names of the trials to Trial_WET (=Trial 1) and Trial_DRY (=Trial 2) to increase readability, and added the required information.
  Figure 6: what does the black lines and the black values refer to? Make it clear in the caption.
  # The caption was updated.
  L346: but in site 2, the locations of IMT and SSN are different.
  # Certainly, this issue is addressed in the revised manuscript.
  L272: “SWC_PRED proved to be a better predictor of rut depth”, particularly for Trial 1 (in wetter conditions).
  # We have changed that as recommended.
  L317: ground water -> groundwater
  L316-318: 65% + 93% is more than 100%. I don’t think I understand this sentence. Please reformulate it.
  L316-318: proximity to groundwater and DTW are two different things, no?
  # It is misleading, we agree. The new formulation could be: „For example, Vega-Nieva et al. (2009) found that 65% of ruts deeper than 25 cm were located in areas with a DTW value of less than 1 m, and 93% of these ruts occurred in areas with DTW values less than 10 m.”.
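# The two percentages in the reformulated sentence are cumulative shares over nested thresholds (every location with DTW below 1 m also has DTW below 10 m), so they are not expected to sum to 100%. The following minimal sketch illustrates this with hypothetical DTW values; it does not reproduce the data of Vega-Nieva et al. (2009), and the function name is an assumption.

```python
def share_below(values, threshold):
    """Fraction of values that are strictly below the threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical DTW values (m) at ten locations with deep ruts.
dtw_at_deep_ruts = [0.2, 0.4, 0.5, 0.8, 0.9, 0.95, 2.0, 5.0, 8.0, 12.0]

share_dtw_below_1m = share_below(dtw_at_deep_ruts, 1.0)    # 6 of 10 locations
share_dtw_below_10m = share_below(dtw_at_deep_ruts, 10.0)  # 9 of 10 locations
```

# The second share includes all locations counted in the first, so it can never be smaller.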
  Discussion: section 4.1. Importance of predictive systems: I think I would move this in the very end of the discussion.
# We agree that moving section 4.1 is an alternative but would like to stick with the current storyline. We hope for the reviewer's understanding.
  
  We would like to express our gratitude to the referee for their valuable feedback. We want to assure them that we are fully capable of implementing the recommended changes, as discussed above. We think that the insightful and valuable comments provided by the referee resulted in a substantial improvement of the manuscript.
  
  References
  Ågren, A., Lidberg, W., Strömgren, M., Ogilvie, J., and Arp, P. (2014). Evaluating digital terrain indices for soil wetness mapping – a Swedish case study. Hydrology and Earth System Sciences 18, 3623–3634. doi: 10.5194/hess-18-3623-2014
  Hillel, D. (1998). Environmental soil physics: Fundamentals, applications, and environmental considerations. San Diego, California: Elsevier.
  James, S. E., Pärtel, M., Wilson, S. D., and Peltzer, D. A. (2003). Temporal heterogeneity of soil moisture in grassland and forest. Journal of Ecology, 234–239.
  Kelliher, F. M., Leuning, R., and Schulze, E. D. (1993). Evaporation and canopy characteristics of coniferous forests and grasslands. Oecologia 95, 153–163. doi: 10.1007/BF00323485
  Larson, J., Lidberg, W., Ågren, A. M., and Laudon, H. (2022). Predicting soil moisture conditions across a heterogeneous boreal catchment using terrain indices. Hydrology and Earth System Sciences 26, 4837–4851. doi: 10.5194/hess-26-4837-2022
  McNabb, D. H., Startsev, A. D., and Nguyen, H. (2001). Soil Wetness and Traffic Level Effects on Bulk Density and Air-Filled Porosity of Compacted Boreal Forest Soils. Soil Science Society of America Journal 65, 1238–1247. doi: 10.2136/sssaj2001.6541238x
  Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., et al. (2021). ERA5-Land: a state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 13, 4349–4383. doi: 10.5194/essd-13-4349-2021
  Poltorak, B. J., Labelle, E. R., and Jaeger, D. (2018). Soil displacement during ground-based mechanized forest operations using mixed-wood brush mats. Soil and Tillage Research 179, 96–104. doi: 10.1016/j.still.2018.02.005
  Tromp-van Meerveld, H. J., and McDonnell, J. J. (2006). On the interrelations between topography, soil depth, soil moisture, transpiration rates and species distribution at the hillslope scale. Advances in Water Resources 29, 293–310. doi: 10.1016/j.advwatres.2005.02.016
  Vega-Nieva, D. J., Murphy, P. N. C., Castonguay, M., Ogilvie, J., and Arp, P. (2009). A modular terrain model for daily variations in machine-specific forest soil trafficability. Canadian Journal of Soil Science 89, 93–109. doi: 10.4141/CJSS06033
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC3

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2023-1908', Anonymous Referee #1, 18 Oct 2023

In this study, the authors estimated soil moisture on a few sites using a combination of multi-sourced spatiotemporal variables. The topic is interesting and definitely within the scope of the journal. However, I found a couple of major issues that significantly affected the quality and completeness of this work.
There are a few comments that the authors may consider.
Major issues:
1. As indicated by the title and throughout the manuscript, this work is aimed at predicting rut depth with relevant predictors. However, no rut depth values were predicted. Instead, the model predicted soil moisture. The performance was then evaluated by investigating the correlation between the rut depth and the predicted moisture using Kendall’s correlation coefficient.
I do not think this is the correct approach. Either you need to predict rut depth directly and compare the predictions with the measured rut depth, or you build a second model to predict rut depth using the predicted soil moisture from the previous model and then compare the predicted rut depth with the observations. Simply showing the correlation between the predicted soil moisture and the measured rut depth is not enough, as this is not an apple-to-apple comparison. Given the above reasoning, the current work is incomplete.
2. ERA5 is a very coarse product with spatial resolution at kilometer level. Given the size of the study areas shown in Figure 7, I don’t think ERA5 can provide enough useful information in terms of spatial variability. You may want to justify your selection.
3. The structure of some parts of the manuscript is very confusing. Most contents in the discussion section should be in the introduction section. You discuss results you got in the discussion section, not the motivation, existing efforts, and current gaps.

Minor issues:
1. Line 217: Why scale the τ values with 0.99 but not 0.98 or other numbers?
2. The bottom part of Figure 2: As I said previously, the comparison between RD and SWC is not fair. Instead of a comparison, what was really investigated in this manuscript is the correlation between RD and predicted SWC.
3. I suppose some of your input data items may have different spatial resolutions. Maybe consider adding some explanation of how you unify them.
4. Similar to the one mentioned in the third major issue mentioned above, some titles in section 3 may not be appropriate and are not informative. For example, 3.3 Rut depth data. If it is more about the data, it shouldn’t be in the result section.
5. Line 221: The usage of “Therefore” is confusing, as I did not see any causation between the previous sentence and the one that follows.
6. Line 346: Wrong usage of “Despite”. You talked about the advantages of the manually obtained dataset at the beginning and you recommended manual datasets at the end. I suppose “Therefore” is the correct word to use here.

Citation: https://doi.org/10.5194/egusphere-2023-1908-RC1
- AC1: 'Reply on RC1', Marian Schönauer, 27 Oct 2023
  
  Dear referee,
  We would like to express our gratitude for your valuable input and for dedicating your time to support us in improving the manuscript/model. In this open discussion, we aim to address a major concern raised, issue #1:
  We acknowledge the concern regarding the validation of the model trained on soil moisture for rut depth, and we agree that this validation approach would not be appropriate. Our intention was to validate the model for soil moisture content (SMC) using a repeated cross-validation procedure. Subsequently, we sought to explore the feasibility of utilizing SMC estimates for predicting rut depth, which yielded some promising results.
  During our internal revisions, we did contemplate using rut depth data to train a separate model. However, the available data were limited, so this remains worth exploring further. As a result, the comparison between rut depths and SMC estimates emerged as more of a practical application of spatiotemporal SMC maps. The two variables naturally have different units, much as a dependent variable (e.g., tree height) often differs in scale from the independent variable(s) (e.g., tree age) in a linear model.
  One solution we have considered is scaling the SMC values to match the unit of centimeters, similar to rut depth. We would appreciate hearing any alternative suggestions or ideas you may have regarding this matter.
  Kind regards,
  Marian Schönauer
  On behalf of the authors
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC1
- AC2: 'Reply on RC1', Marian Schönauer, 24 Nov 2023
  Author's Response to
  Referee Comments (https://doi.org/10.5194/egusphere-2023-1908-RC1)
  
  Dear Editor Yongping Wei, dear Referee,
  we would like to express our sincere gratitude for your dedicated efforts in reviewing our manuscript and for your constructive feedback. The recommendations provided have proven to be invaluable and instrumental in enhancing the overall quality of our work.
  Best regards,
  Marian Schönauer
  On behalf of the authors
  
  Please find our responses below, starting with a ‘#’.
  RC: In this study, the authors estimated soil moisture on a few sites using a combination of multi-sourced spatiotemporal variables. The topic is interesting and definitely within the scope of the journal. However, I found a couple of major issues that significantly affected the quality and completeness of this work.
  There are a few comments that the authors may consider.
  Major issues:
  As indicated by the title and throughout the manuscript, this work is aimed at predicting rut depth with relevant predictors. However, no rut depth values were predicted. Instead, the model predicted soil moisture. The performance was then evaluated by investigating the correlation between the rut depth and the predicted moisture using Kendall’s correlation coefficient.
  
  I do not think this is the correct approach. Either you need to predict rut depth directly and compare the predictions with the measured rut depth, or you build a second model to predict rut depth using the predicted soil moisture from the previous model and then compare the predicted rut depth with the observations. Simply showing the correlation between the predicted soil moisture and the measured rut depth is not enough, as this is not an apple-to-apple comparison. Given the above reasoning, the current work is incomplete.
  
  # We have already addressed this comment (https://doi.org/10.5194/egusphere-2023-1908-AC1) in a previous response, but for completeness, we are including it in this document.
  # During our internal revisions, we considered using rut depth data to train a separate model. However, the available data were too limited for this, and we leave it for further exploration. Consequently, the comparison between rut depths and soil moisture content (SMC) estimates serves as a practical application of spatiotemporal SMC maps. We acknowledge that the two variables have different units, but we maintain that comparing them is valid: a rank correlation depends only on the ordering of the values, not on their units. We used Kendall’s rank correlation, since it is known to be robust to outliers and does not require the data to be (approximately) normally distributed, making it suitable for analyzing relationships in a wide range of situations.
  # We recognize that the emphasis on predicting rut depth may have been too strong. Therefore, we propose changing the title to 'Soil Moisture Modeling with ERA5-Land Retrievals and In-Situ Measurements and Its Application for Rut Prediction', and will present and discuss the results with more caution. We ask for the referee's understanding in retaining this correlation, which is fundamental to our work.
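The point that a rank correlation is indifferent to units can be checked with a minimal sketch. The function below is a plain Kendall's tau-a (no tie correction), written here for illustration rather than taken from the authors' workflow, and the SWC and rut depth values are hypothetical numbers, not data from the study.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant; no tie
    correction is applied, so this sketch is meant for tie-free data.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length (at least 2)")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical values for illustration only (not study data).
swc_vol_percent = [22.0, 31.5, 27.0, 40.2, 35.1]  # predicted SWC in vol%
rut_depth_cm = [3.0, 9.5, 4.1, 14.0, 8.0]         # measured rut depth in cm

tau_original = kendall_tau_a(swc_vol_percent, rut_depth_cm)

# Re-express SWC as a fraction (m3 m-3): a monotonic change of units.
swc_fraction = [v / 100.0 for v in swc_vol_percent]
tau_rescaled = kendall_tau_a(swc_fraction, rut_depth_cm)
```

Because only the ordering of the values enters the statistic, `tau_original` and `tau_rescaled` are identical. The same holds for any strictly increasing transformation of either variable.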
  ERA5 is a very coarse product with spatial resolution at kilometer level. Given the size of the study areas shown in Figure 7, I don’t think ERA5 can provide enough useful information in terms of spatial variability. You may want to justify your selection.
  
  # Thanks for this recommendation. We will expand the information about ERA5-Land and its advantages in the Material and Methods section. In a global validation of ERA5-Land and NASA Soil Moisture Active Passive with in-situ Soil Water Content (SWC) measurements, Muñoz-Sabater et al. (2021) stated that the bias for ERA5-Land soil moisture retrievals can be high. However, overall, the Root Mean Square Error (RMSE) was low, indicating a good quality representation of the temporal heterogeneity of SWC. The modeling approach employed in our work does not rely on absolute values of the ERA-derived variables, but rather on changes to adjust the model predictions for different seasons. We are confident that ERA5-Land has been a solid choice for this work. One drawback, of course, is the coarse spatial resolution. However, given that seasonal changes are more crucial for our models, the impact of this drawback is limited.
  # Prior to this work, we validated different datasets, including NASA's SMAP, the drought monitor from the German Weather Service (DWD), and soil moisture retrievals from Sentinel-1, specifically surface soil moisture (SSM) and the soil water index (SWI) for different soil depths (2-100 cm). ERA5-Land resulted in a good representation of temporal variations of SWC at our sites. Surprisingly, the Sentinel-1 data, with a spatial resolution of 1x1 km, did not perform better than the 9x9 km data from ERA5-Land. We speculate that Sentinel-1, using the relatively short-wavelength C-band, is less effective in predicting soil water status.
  # ERA5-Land utilizes advanced modeling techniques and assimilation of various observational data sources to generate high-quality, global-scale datasets. We believe that this approach leads to the most robust estimates in our region.
  The structure of some parts of the manuscript is very confusing. Most contents in the discussion section should be in the introduction section. You discuss results you got in the discussion section, not the motivation, existing efforts, and current gaps.
  
  # We appreciate this comment and intend to rework the discussion/introduction in a revised version of the manuscript.
  Minor issues:
  Line 217: Why scale the τ values with 0.99 but not 0.98 or other numbers?
  
  # With this threshold, we aimed at selecting models that demonstrated nearly maximum goodness-of-fit. However, we wanted to penalize models with more features, as they are typically prone to overfitting, are more complicated and require more data (Occam’s razor). The threshold of 0.99 * max τ, or -1%, is to some degree arbitrary, we agree. Nevertheless, it is within a range that aligns with our intended selection of still very good models and is consistent with previous work (e.g. Hauglin et al., 2021).
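The selection rule described in this response can be sketched as follows: among all candidate models whose τ reaches at least 0.99 × max τ, keep the one with the fewest features. The candidate list and its τ values are hypothetical, and breaking ties in the number of features by the higher τ is an assumption that the response does not state.

```python
def select_parsimonious(candidates, tolerance=0.99):
    """Return the candidate with the fewest features whose tau is at
    least `tolerance` times the best tau among all candidates.

    `candidates` is a list of (n_features, tau) tuples with positive tau.
    Ties in the number of features are broken by the higher tau
    (an assumption made for this sketch).
    """
    if not candidates:
        raise ValueError("no candidate models given")
    best_tau = max(tau for _, tau in candidates)
    threshold = tolerance * best_tau
    eligible = [c for c in candidates if c[1] >= threshold]
    return min(eligible, key=lambda c: (c[0], -c[1]))


# Hypothetical (n_features, tau) pairs from a feature-reduction run.
candidates = [(12, 0.620), (9, 0.619), (6, 0.615), (4, 0.590)]
chosen = select_parsimonious(candidates)
```

With these illustrative numbers the threshold is 0.6138, so the six-feature model is selected: it loses less than 1 % of the maximum τ while using half the features of the best-scoring model.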
  The bottom part of Figure 2: As I said previously, the comparison between RD and SWC is not fair. Instead of a comparison, what was really investigated in this manuscript is the correlation between RD and predicted SWC.
  
  # We have made adjustments to the figure, but would like to continue to compare and evaluate the correlations between rut depth (RD) and the predicted values of soil water content (SWC). The rationale for correlating RD with SWC has been discussed above, and we hope to have convincingly addressed any concerns raised by the referee.
  
  I suppose some of your input data items may have different spatial resolutions. Maybe consider adding some explanation of how you unify them.
  
  # Thanks for mentioning this. We will add this information. In principle, we just stacked different maps (with different resolutions), and extracted the raster values at spatial points (i.e. measuring or sensor positions).
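Stacking maps of different resolutions and reading them at point locations can be sketched without any GIS library. The `Raster` class, the layer names and all values below are hypothetical, and the lookup simply returns the value of the cell that contains each point. This is an assumption about the extraction method, not the authors' code.

```python
import math
from dataclasses import dataclass


@dataclass
class Raster:
    """Minimal north-up raster: values[row][col], origin at the upper-left corner."""
    x_min: float
    y_max: float
    cell_size: float
    values: list

    def value_at(self, x, y):
        """Value of the cell that contains point (x, y), or None if outside."""
        col = math.floor((x - self.x_min) / self.cell_size)
        row = math.floor((self.y_max - y) / self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def extract_at_points(stack, points):
    """Look up every layer of `stack` (name -> Raster) at each (x, y) point."""
    return [{name: r.value_at(x, y) for name, r in stack.items()} for x, y in points]


# Hypothetical layers: a 15 m topographic index and one coarse 9 km cell.
stack = {
    "dtw": Raster(0.0, 30.0, 15.0, [[0.2, 1.5], [0.8, 3.0]]),
    "swc_era": Raster(0.0, 9000.0, 9000.0, [[0.31]]),
}
points = [(5.0, 25.0), (20.0, 10.0)]  # hypothetical sensor positions
rows = extract_at_points(stack, points)
```

Each point receives its own value from the fine layer, while both points share the single value of the coarse cell. No resampling to a common grid is needed when values are only read at point locations.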
  Similar to the one mentioned in the third major issue mentioned above, some titles in section 3 may not be appropriate and are not informative. For example, 3.3 Rut depth data. If it is more about the data, it shouldn’t be in the result section.
  
  # We fully agree and would like to update the headings; for example, change '3.3 Rut depth data' to '3.3 Interrelations between rut depth and SWC'. Additionally, we propose adding sub-headings in the Material and Methods (M&M) section, such as 'Comparisons between model predictions and RD or SWC_CORE', before the last paragraph of the M&M. Rearranging parts of the M&M should provide additional clarity.
  Line 221: The usage of “Therefore” is confusing, as I did not see any causation between the previous sentence and the one that follows.
  
  # Thanks for noticing. We have changed the phrasing of the sentences in question.
  Line 346: Wrong usage of “Despite”. You talked about the advantages of the manually obtained dataset at the beginning and you recommended manual datasets at the end. I suppose “Therefore” is the correct word to use here.
  
  # We changed the text as recommended.
  
  We would like to express our gratitude to the referee for their valuable feedback. We want to assure them that we are fully capable of implementing the recommended changes, as discussed above. We expect that the valuable and insightful comments provided by the referee will substantially improve the manuscript.
  
  References
  Hauglin, M., Rahlf, J., Schumacher, J., Astrup, R., and Breidenbach, J. (2021). Large scale mapping of forest attributes using heterogeneous sets of airborne laser scanning and National Forest Inventory data. Forest Ecosystems 8, 65. doi: 10.1186/s40663-021-00338-4
  Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., et al. (2021). ERA5-Land: a state-of-the-art global reanalysis dataset for land applications. Earth Syst. Sci. Data 13, 4349–4383. doi: 10.5194/essd-13-4349-2021
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC2
RC2:
'Comment on egusphere-2023-1908', Anonymous Referee #2, 27 Oct 2023

The authors predicted soil moisture for a study site based on a statistical model, considering different variables (e.g., distance to water (DTW), topographic wetness index, soil moisture satellite estimates). First, they identified which variables are more related with soil moisture, then they used the identified variables for soil moisture prediction. They used two different soil moisture estimates collected in the field to validate the model (IMT and SSN). The content of the manuscript is relevant and the data collected in the field is valuable, but the results lack further interpretation and discussion. For instance, what is the physical meaning of having DTW as an important predicting variable? Is this expected? Why is DTW a better predictor in one case (IMT), but ERA-5 is the best predictor in the other case (SSN)? What are the differences in the observations that could lead to that? Also, predicted soil moisture is related with rutting depth only visually in a map. The actual relation between soil moisture and rutting depth needs to be further discussed, otherwise the title of the paper is inconsistent with what it delivers. There are many things like these that need to be better explained (outlined below). Moreover, another concern is that there are two study sites (A and B). In site A, both different measurements overlap in space, so they are comparable. In site B, the different measurements do not overlap in space, so it’s concerning to what extent they can be compared. This needs to at least be explained/discussed further in the manuscript. I hope my comments can be useful and that the authors can improve the manuscript by adding some clarifications.
Major comments:
L124, L246: What is the explanation behind adding Year as an explanatory variable? I understand using Month and Season, as a given month or season could be consistently wetter than a different month or season. But it’s not clear to me how a different year could be important to predict soil moisture. This could be relevant in a climate change study, when you have more than 50 years of data for analysis, for instance. Here, you only have a few years of data, which are not representative of extremes, so I am not sure I understand what is the reasoning in incorporating it as an explanatory variable. Moreover, it would be interesting to comment on the discussion on the impact that variables Month or Season could have in different climates (e.g., would they be relevant in places where temperature and precipitation are constant throughout the year?)
Figure 1: it looks like in Site1, IMT, SSN measurements and trials overlap spatially. In Site 2, it looks like the IMT measurements were performed in a different location than SSN + trials. This needs to be clearly stated in the methods section. In the results/discussion section, this issue needs to be addressed as well. I would like to see the Kendall’s correlations (e.g., Figure 5 and 6) drawn separately for site 1 and Site 2. Maybe for Site 1, correlations could be better than Site 2, because of the spatial variations in sampling locations. It could be that you decide to focus on the results mainly from Site 1, because the measurements are more consistent there.
In the methods section, it’s not clear to me the timing between different measurements and trials. IMT were collected monthly between Sep 2019 and Oct 2020 (L113). The SSN time coverage is not clear. The trials were conducted on Mar 2021 and Oct 2020 (L189 and L190). But then, in the results section, in Figure 3, it looks like trials were conducted in Mar 2021 and maybe Oct 2022? Please state in the Methods section clearly the time coverage between each measurement and trial. Isn’t it relevant that with SSN you have measurements during/after the trials, whereas for IMT, you only have measurements before the trials? This should be mentioned and the impacts of this different coverage in time should be discussed.
Section 3.1: discuss for instance whether IMT and SSN are recording the same variable or not. According to the methods section, IMT measures soil moisture at 6cm, whereas SSN at 10cm. Is this a big difference or not?
L159: How accurate is the ERA5-Land soil moisture retrieval? The resolution is 9 km x 9 km! So your site is mainly only covered by only one pixel. I think this mismatch in spatial resolution should be mentioned in the discussion when analyzing the results.
L161: Moreover, you stated that the ERA-5 is based on ground-based observations as well. I assume that the data is more accurate for regions where there are ground-based observations, and less accurate for regions where there are not. Was ground-based soil moisture data near your study sites used to “calibrate”/inform the ERA-5 product?
Figure 3: are these time series from Site 1, Site 2, or both? Please make this clear.
Figure 3: based on the Kendall’s coefficients, it looks like correlations in green/layer2/deeper soil are higher than correlations in blue/layer1/top soil. I think it’s important to mention that in the results section. And to discuss it in the discussion section.
3.2. Soil moisture models: for SSN, the most important variable was DTW25 and for IMT, the most important variable was SWC_ERA. Can you explain why these variables turned out to be the most important ones? Was this what you expected? Are there physical reasons for that? You could maybe discuss this more in the Discussion section.
L280-281: I think it’s important to mention in the text that none of the models were significant for Trial 2. In the end, you made the decision of which model to use based on Trial 1. It’s important to discuss more in the Discussion about these differences in Trial 1 and Trial 2.
Figure 7: I am not sure it is fair to use a model to predict soil moisture in trial 1 and trial 2, if you just used the data from trial 1 to train the model. Please discuss this.
The analysis of Figure 7 in general is a bit too shallow. Please discuss more the results of the model. Are predicted soil moisture and rut depth consistent? What is the correlation between these two variables?*** Are the results of the model what you expected? Is the spatialization really relevant here (i.e., are there heterogeneities that the spatial model was able to represent?). And I ask this particularly because I didn’t understand Figure 1: for site B, the measurements of IMT and SSN do not overlap. How did you compare them? If you compared them regardless of the spatial variability, maybe spatial variability is actually not that relevant here.
***One way to check the correlation between two variables that have different units is to use the Spearman rank correlation. It takes into account the ranking between the different variables, not their absolute values.
Can you use the results of the model to actually predict rut depth? If not, please at least discuss this and why not. Otherwise, I think the title of the paper needs to be updated, because in the end rut depth was not predicted as stated in the title.
L315-320: you discuss the relation between rutting and DTW reported by other studies. But, in your study, what were the outcomes in terms of rutting depth and relation with DTW? This is not clear.
There seems to be important differences between Trial 1 and Trial 2, and they have different conditions (wet/dry). Would precipitation data be useful as a proxy for rut depth as well? If yes, comment about this in the discussion. If not, just ignore this comment.
In section 4.1. Importance of predictive systems, you mention the importance of “predictive systems”, and, by that, I understand that one could use the methodology described in the paper to predict rut depth in the future. And one of the highlights of the methodology here is that it incorporates the temporal variability, by using the ERA-5 product. However, ERA-5 data could not be used to predict rutting depth in the future, given that the ERA-5 data is for the present, it is not a forecast for the future. And this brings back the question as to whether precipitation could be an important proxy, because, for precipitation, there are forecasts available.

Minor comments:
L15-16: “(…) to model the response variable, in-situ soil water content” -> “(…) to model the soil water content” – I see no need for the “response variable”, as it is a very vague term.
L20: “(…) as well as temporal components such as numeric variables derived from date and (…)” – “numeric variables” is too vague.
L34: I find it a bit strange to put a reference in a middle of a statement. “Soil compaction as a consequence of harvesting operations (Eliasson, 2005; etc) is detrimental to a (…)”. What is the statement that these citations refer to?
Section 2.1.3. Soil maps: what is the information on these soil maps? Is it soil texture? Please state it.
L60: Agren et al. (2021), used -> without the comma
L91: when you mention the predictor variables here, it would be nice to have an idea of which variables are these. For instance, add in parenthesis (e.g., topographic indexes, soil texture)
I think it would make the paper much clearer if you added a sub-section in the beginning of Material and Methods called “Study design” or “Study overview”. In this section, you could present Figure 2, and you write in a little bit more detail what is written as the caption of Figure 2. I think having an overview of the study design before reading the technicalities of the methods section would be very helpful.
L110-111 and L117-118: both are “known to be temporally wet or sensitive machine traffic” – it’s repetitive. Add this sentence only one time referring to both sites.
L143: what does SPI stand for?
L148: how is “Basin” a variable? Is it the basin area?
L154: how were you “able to gather maps” ? What is the source for these maps?
L149-L150: I couldn’t find any justification for re-sampling to 15m x 15m based on Agren et al. (2014). Please explain where this number comes from. Also, in general, please add clarification on how datasets with different spatial resolutions were merged.
I think the methods section is too long. Some steps are described as a “recipe” instead of a scientific paper. For instance:
L130: “ (…) inserted into the attributed of a shapefile”, L169: “(…) merged with in-situ data” – these small technical steps of merging/formatting two datasets don’t need to be detailed.
I would avoid using full sentences to describe the names of variables. For instance, in L138-139, “(…) of the following sizes: 0.25 ha (DTW025), 1.00ha (DTW1), 4.00 ha (DTW4)”. In L 153-155: “(…) scale of 1:5,000 from forest site surveys (Soil05).” In L156-157, “(…) scale of 1:50,000 are available for the entirety of North Rhine-Westphalia (Soil50)”
L159-162: I would rephrase it as: “ERA-5 Land is a global (…) , including soil moisture [m³ m^-3] at the top soil layer (0-7cm) and at a depth of 7-28 cm. The soil moisture at the top soil layer is retrieved by assimilating satellite and ground-based observations”. The names of the variables Volumetric soil water layer 1 and 2 are not relevant. I don’t think you use them further in the manuscript. If you do, then you can add their names in parenthesis, but if not, I see no reason why they should be mentioned.
In Figure 2, in the second row (predictors), there are some gray lines in the back, which seem to be connecting “soil maps” and the ERA graph to “add data”. However, I assume “topogr. indices” and “Month Season” are supposed to be included as well?
Figure 3: a legend on the side with the colors and the names of the variables would be helpful.
Figure 3: no need to say “the figure displays”
Figure 3: the names ‘layer 1’ and ‘layer 2’ are not really relevant here. I think it would be more relevant to provide what these layers refer to: layer 1 (top soil) and layer 2 (7-28cm depth).
L232-233: “Soil water content was measured (…) August 2020” – I consider this fits in Methods, not Results.
2.2 Rut depth data: make it clear here what is different between Trials 1 and 2. Trial 1 is in a wet condition and Trial 2 is in a dry environment, right? That’s why it is expected that SWC1 > SWC2. This makes the interpretation of Figure 6 afterwards easier.
Figure 6: what do the black lines and the black values refer to? Make it clear in the caption.
L346: but in site 2, the locations of IMT and SSN are different.
L272: “SWC_PRED proved to be a better predictor of rut depth”, particularly for Trial 1 (in wetter conditions).
L317: ground water -> groundwater
L316-318: 65% + 93% is more than 100%. I don’t think I understand this sentence. Please reformulate it.
L316-318: proximity to groundwater and DTW are two different things, no?
Discussion: section 4.1. Importance of predictive systems: I think I would move this in the very end of the discussion.

Citation: https://doi.org/10.5194/egusphere-2023-1908-RC2
- AC3: 'Reply on RC2', Marian Schönauer, 24 Nov 2023
  
  Author's Response to
  Referee Comments (https://doi.org/10.5194/egusphere-2023-1908-RC2)
  
  Dear Editor Yongping Wei, dear Referee,
  we would like to express our sincere gratitude for your dedicated efforts in reviewing our manuscript and for your constructive feedback. The recommendations provided have proven to be invaluable and instrumental in enhancing the overall quality of our work.
  Best regards,
  Marian Schönauer
  On behalf of the authors
  
  Please find our responses below the respective comments of the reviewers, starting with a ‘#’. In this regard, we would like to note a change in the naming of both trials: the former Trial 1 is now referred to as Trial_WET, and the former Trial 2 as Trial_DRY.
  RC: The authors predicted soil moisture for a study site based on a statistical model, considering different variables (e.g., distance to water (DTW), topographic wetness index, soil moisture satellite estimates). First, they identified which variables are more related with soil moisture, then they used the identified variables for soil moisture prediction. They used two different soil moisture estimates collected in the field to validate the model (IMT and SSN). The content of the manuscript is relevant and the data collected in the field is valuable, but the results lack further interpretation and discussion.
  For instance, what is the physical meaning of having DTW as an important predicting variable? Is this expected?
  # We initially anticipated DTW to be one of the most influential predictors, but our assumption was partially disproven, as illustrated in Figure 4.
  # In the final model (IMT), SWC_ERAL2 has been identified as the most important variable, followed by Month and Season. It is noteworthy that in data with broader spatial coverage (i.e. IMT), in contrast to the SSN data, dynamic variables took precedence over static topographic predictors. Surprisingly, when modeling SSN data, characterized by high temporal resolution and low spatial resolution, DTW025 remained the most influential variable. One might have anticipated the opposite, expecting a topographic index to play a central role in modeling IMT data, and dynamic SWC_ERA variables to dominate the modeling of SSN data.
  # We presume that the low spatial variations of SWC in comparison to temporal variations, inadequately represented by the provided topographic information, may have contributed to this unexpected outcome. Furthermore, the wider spatial coverage in the IMT data likely resulted in more robust averages of SWC, leading to a stronger correlation with the coarse spatial data of ERA5-Land (9x9 km). In contrast, the SSN data originate from areas with a size of 100x100 m that are known for their temporal wetness, which could explain the heightened importance of DTW025. Some sensors might have measured constant water saturation, thereby inflating the explanatory power of topographic information. These assumptions are speculative, and further research in this direction is warranted.
  # In the feature reductions of IMT and SSN data, SWC_ERAL2 (7-28 cm soil depth) dominated over SWC_ERAL1 (0-7 cm). This aligns with in-situ measurements of SWC by the SSN, conducted at a soil depth of approximately 10 cm. Even for the IMT data, where SWC was measured in the top 6 cm of soil, SWC_ERAL2 yielded a better goodness-of-fit compared to SWC_ERAL1.
  # We hypothesize that the prevalence of open lands as the dominant land cover form in the ERA5-Land raster cell contributed to the superior fit of SWC_ERAL2. Grasslands typically exhibit higher temporal heterogeneity of soil moisture compared to forests (James et al., 2003). This temporal heterogeneity tends to decrease with deeper soil layers (Tromp-van Meerveld and McDonnell, 2006). Therefore, the stronger correlation between SWC_ERAL2 and SWC, as well as its higher importance within the random forests, seems reasonable. The disparity between SWC_ERA and in-situ SWC can be attributed to the high transpiration rates in forests, as opposed to grass (Kelliher et al., 1993).
  # We will incorporate this in the revised manuscript.
  Why is DTW a better predictor in one case (IMT), but ERA-5 is the best predictor in the other case (SSN)? What are the differences in the observations that could lead to that?
  # Frankly, this result came as a surprise to us. We would have expected a topographic index to be selected as the most important predictor for the IMT data and a temporal predictor for the SSN data. We would have argued that SSN captures temporal variability more than spatial variability, and that a dynamic variable (SWC_ERA) would therefore be crucial. However, we obtained the opposite result, which is interpreted in our reply to the comment above.
  Also, predicted soil moisture is related with rutting depth only visually in a map.
  # To enhance the connection between the figures (formerly Figure 6 and Figure 7), we will integrate the maps illustrating raster predictions with scatterplots depicting the correlations between rutting depth (RD) and soil water content (SWC, Figure 1 in this document). This combined presentation aims to reinforce the visual link between these elements.
  Figure 1. Updated Figure 6 and 7.
  The actual relation between soil moisture and rutting depth needs to be further discussed, otherwise the title of the paper is inconsistent with what it delivers.
  # We fully agree and aim to emphasize the interrelations between soil moisture and soil deformation to address this concern in the initial part of the discussion.
  There are many things like these that need to be better explained (outlined below). Moreover, another concern is that there are two study sites (A and B).
  In site A, both different measurements overlap in space, so they are comparable. In site B, the different measurements do not overlap in space, so it’s concerning to what extent they can be compared. This needs to at least be explained/discussed further in the manuscript.
  # Certainly, we appreciate this observation. The IMT data was collected in close proximity to the rut depth measurements (Site 1), or with a distance of up to 1.3 km (Site 2). However, the spatial distance between the IMT training data and the rut depth data did not seem to be crucial for the accuracy of predicting rut depth, since Kendall’s τ between RD and SWC_PRED was similar for both sites (Figure 3 of this document). During the feature reduction, soil information was excluded in the initial stages, likely due to the relatively homogenous soil properties on the relatively small study sites.
  I hope my comments can be useful and that the authors can improve the manuscript by adding some clarifications.
  # Yes, they definitely are. Thank you very much for your suggestions of improvement. We highly appreciate the efforts, as they have brought attention to some shortcomings, enabling us to improve the manuscript and clarify the reliability of our findings.
  Major comments:
  L124, L246: What is the explanation behind adding Year as an explanatory variable? I understand using Month and Season, as a given month or season could be consistently wetter than a different month or season. But it’s not clear to me how a different year could be important to predict soil moisture. This could be relevant in a climate change study, when you have more than 50 years of data for analysis, for instance. Here, you only have a few years of data, which are not representative of extremes, so I am not sure I understand what is the reasoning in incorporating it as an explanatory variable. Moreover, it would be interesting to comment on the discussion on the impact that variables Month or Season could have in different climates (e.g., would they be relevant in places where temperature and precipitation are constant throughout the year?)
  # We had multi-year trends in mind and therefore included Year as a numerical variable. Changes in the way ERA5-Land retrievals are generated could also occur, which Year could potentially account for. However, as we agree with the referee's concern, we have removed Year from the predictor variables.
  Figure 1: it looks like in Site1, IMT, SSN measurements and trials overlap spatially. In Site 2, it looks like the IMT measurements were performed in a different location than SSN + trials. This needs to be clearly stated in the methods section.
  # We fully agree, and included information about the distance in the Material and Methods. We will also address this issue in the results and discussion section.
  In the results/discussion section, this issue needs to be addressed as well. I would like to see the Kendall’s correlations (e.g., Figure 5 and 6) drawn separately for site 1 and Site 2. Maybe for Site 1, correlations could be better than Site 2, because of the spatial variations in sampling locations. It could be that you decide to focus on the results mainly from Site 1, because the measurements are more consistent there.
  # The significance of the correlation is borderline, probably due to the low number of observations and the strong heterogeneity of forest soils and machine-soil interaction. It seems questionable to us whether a separation by site would be helpful, since actual forest operations can be conducted on a large scale, where no grouping by site would be applied. We seek a general predictive system to avoid deep ruts over the entire area of interest. Therefore, we would like to keep the joint presentation of RD vs. SWC in Figure 2 (of this document), but we have added the separate analysis in Figure 3 (this document), which will be added to the Appendix of the manuscript.
  Figure 2. The updated Figure 7. Variables Year and Basin were excluded.
  # Some additional information when the analysis was done separately:
  # Correlation coefficients were higher on Site B (= Site 2; naming the sites A and B, following the map figure, makes this easier), where the SWC values were gathered at some distance from the rut depth measurements. The model's accuracy did not benefit from the proximity of the in-situ SWC data used for modeling. The number of observations shrinks down to 2 in the case of separation.
  Figure 3. Rut depth (RD) was determined after four passes of a forwarder, driving on Site A and Site B, during two seasons (Trial_WET and Trial_DRY, conducted under different moisture conditions). RD was compared to SWC values determined from undisturbed soil cores (A), to SWC values predicted by a random forest model trained on manually obtained IMT measurements (B), and to SWC values predicted by a model trained on data from a continuously measuring soil sensor network (SSN, C). Correlations were evaluated using Kendall's τ. The correlation of all values is given in black. Significance levels are indicated by *** for p<0.001, ** for 0.001-0.01, * for 0.01-0.05, (*) for 0.05-0.10, and 'ns' for p>0.10.
  
  In the methods section, it’s not clear to me the timing between different measurements and trials. IMT were collected monthly between Sep 2019 and Oct 2020 (L113). The SSN time coverage is not clear. The trials were conducted on Mar 2021 and Oct 2020 (L189 and L190). But then, in the results section, in Figure 3, it looks like trials were conducted in Mar 2021 and maybe Oct 2022?
  Please state in the Methods section clearly the time coverage between each measurement and trial.
  # Thank you for the hint. The SSN was launched in December 2019 and its data was obtained from continuously measuring sensors. Data until 2022-12-31 was included. Date of Trial_DRY was 2022-10-11. We will add this information and correct the typo.
  Isn’t it relevant that with SSN you have measurements during/after the trials, whereas for IMT, you only have measurements before the trials? This should be mentioned and the impacts of this different coverage in time should be discussed.
  # Since the year as a predictor variable has been excluded as suggested, the temporal discrepancy should not be too important anymore. We could add some information regarding this concern in the Discussion and we believe that the temporal discrepancy is obvious from Figure 3 (manuscript). To actually determine the influence of temporal coverage on predictions, further research would be necessary. In this work, it is not possible to distinguish between the impact caused by the time lag between IMT data and the trials and other factors such as spatial coverage, measuring design, etc.
  Section 3.1: discuss for instance whether IMT and SSN are recording the same variable or not. According to the methods section, IMT measures soil moisture at 6cm, whereas SSN at 10cm. Is this a big difference or not?
  # Both techniques measure the same parameter, volumetric soil water content, at least in principle. Certainly, the technology is different (refer to sensors in Figure 2, manuscript), but in this work, our focus was not on comparing the measuring results of both devices. Another difference arises from the depth of the measuring range. While soil depth can have a strong effect on soil water content (SWC), the vertical difference here is small, and we assume that the spatial heterogeneity of SWC is much larger than the difference of a few centimeters in soil depth. In addition, it should be noted that the modeling of SWC was conducted separately for each dataset (IMT and SSN), except for the mixed analysis in Appendix A. Therefore, we believe that potential differences between the in-situ measurements are not critical for this study, although we could not verify this. Nevertheless, a general agreement between SSN and IMT data can be observed in Figure 3 (manuscript).
  L159: How accurate is the ERA5-Land soil moisture retrieval? The resolution is 9 km x 9 km! So your site is mainly only covered by only one pixel. I think this mismatch in spatial resolution should be mentioned in the discussion when analyzing the results.
  # ERA5 captures the 'regional' (9 x 9 km) temporal variability quite well, but within that region, there is substantial spatial soil moisture variability on a smaller scale. To address this, the main procedure of this work involved combining digital terrain indices that capture local-scale variability and merging that information with ERA5 data. This downscaling approach was employed to showcase a pathway for utilizing ERA5 retrievals for high-resolution predictions, considering the local-scale soil moisture variability not adequately captured by the original ERA5 resolution.
  L161: Moreover, you stated that the ERA-5 is based on ground-based observations as well. I assume that the data is more accurate for regions where there are ground-based observations, and less accurate for regions where there are not. Was ground-based soil moisture data near your study sites used to “calibrate”/inform the ERA-5 product?
  # We were not able to ascertain whether observation data from nearby stations was directly used by ERA5-Land. However, in response to this comment, the text should be corrected since 'ERA5-Land does not assimilate observations directly. The observations influence the land surface evolution via the atmospheric forcing. Forcing air temperature, humidity, and pressure are corrected using a daily lapse rate derived from ERA5' (Muñoz-Sabater et al., 2021). We assume that even if a measuring site of ERA5-Land was 'nearby,' the spatial heterogeneity of soil water content (SWC) and differences between habitats would make it very unlikely that the distance between our study sites and the nearest ground-truthing site would have a strong influence on the accuracy.
  Figure 3: are these time series from Site 1, Site 2, or both? Please make this clear.
  # The data originates from both sites, encompassing a total of 2 × 9 measuring positions. We have updated the figure caption accordingly.
  Figure 3: based on the Kendall’s coefficients, it looks like correlations in green/layer2/deeper soil are higher than correlations in blue/layer1/top soil. I think it’s important to mention that in the results section. And to discuss it in the discussion session.
  # That is correct. SWC_ERA from layer 2 (7-28 cm) represented in-situ SWC better than layer 1. We have added this information to the manuscript and highlighted the differences in the results section, as already described on page 1 ("# In the feature reductions of IMT and SSN data...").
  3.2. Soil moisture models: for SSN, the most important variable was DTW25 and for IMT, the most important variable was SWC_ERA. Can you explain why these variables turned out to be the most important ones? Was this what you expected? Are there physical reasons for that? You could maybe discuss this more in the Discussion section.
  # Frankly, we did not expect this. Please find the replies to these questions above.
  L280-281: I think it’s important to mention in the text that none of the models were significant for Trial 2. In the end, you made the decision of which model to use based on Trial 1.
  # The correlation between the outputs of the SWC models and RD data was not utilized to select the final models. We aimed to improve clarity by re-arranging the method section and providing more details. The best SWC model was selected based on cross-validation. Subsequently, this best model was employed to make SWC predictions for the days of the field trials (with a forwarder) and these predictions were then compared with the resulting RD. Furthermore, Figure 2 was updated to reflect these changes. We have endeavoured to ensure the clarity of the manuscript and assume that the implemented suggestions for improvement have improved the quality of the manuscript.
  
  Figure 4. Updated flow chart.
  It’s important to discuss more in the Discussion about these differences in Trial 1 and Trial 2.
  # Yes, we agree, and added the following information to the discussion: Although a strong association between rut depth (RD) and predicted values of SWC was detected, the influence of differences between the trials is obvious. However, the ranges of RD for each trial were consistent with the SWC predictions. During Trial_WET, a significant correlation between RD and SWC_PRED was shown. We assume that the wetter conditions during this trial, in which soils are destabilized (Hillel, 1998; McNabb et al., 2001), enhanced the predictive power of topographic indices representing soil water distribution. For example, DTW025 overlapped with surface water in depressions, as seen in the field campaigns for Trial_WET. During Trial_DRY, SWC along the measuring sections was most likely below the threshold at which soils become susceptible to deformation. For example, Poltorak et al. (2018) stated that ruts only occurred on soils with an SWC above 50%, and SWC during Trial_DRY was below 30%.
  Figure 7: I am not sure it is fair to use a model to predict soil moisture in trial 1 and trial 2, if you just used the data from trial 1 to train the model. Please discuss this.
  # We did not incorporate data from the RD measurements in the modeling of SWC. To enhance clarity, we would reorganize the sections in the methods to ensure a distinct separation between SWC modeling and the validation of these outputs with rut depth data. At the end of the modeling section, we would conclude with the statement: “The final models were then used to create prediction rasters of SWC_PRED, which were visually evaluated. Subsequently, the outputs of the final models (built solely on IMT and SSN data) were compared to rut depths and SWC at the machine operating trails.”
  The analysis of Figure 7 in general is a bit too shallow. Please discuss more the results of the model. Are predicted soil moisture and rut depth consistent? What is the correlation between these two variables (One way to check the correlation between two variables that have different units is to use the Spearman rank correlation. It takes into account the ranking between the different variables, not their absolute values.)
  # In the updated Figure 2 (this document), we have integrated the correlation between RD and predicted (as well as observed) values of SWC with the raster predictions. We hope that this new figure is more intuitive than the previous one. We believe that raster predictions play a crucial role in practical applications of SWC modeling, particularly for generating maps that can be used in day-to-day work. Our intention was to underscore the potential for creating daily maps through this modeling approach.
  # The coefficients provided by Spearman and Kendall rank correlation methods exhibit considerable similarity. Spearman's method offers the advantage of computational simplicity, a historical consideration rooted in the era when correlations were manually calculated. However, in modern computational settings, Kendall's coefficient of correlation is often considered more robust and is generally the preferred choice due to its desirable statistical properties.
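  # As an illustration of how the two rank correlation coefficients compare, the following is a minimal sketch in plain Python. The values of `swc_pred` and `rut_depth` are hypothetical and are not data from the study; the functions implement Kendall's τ (tau-b, with tie correction) and Spearman's ρ (Pearson correlation of average ranks) and are not the software used for the manuscript.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = ties_x = ties_y = n_pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        n_pairs += 1
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    return (concordant - discordant) / sqrt((n_pairs - ties_x) * (n_pairs - ties_y))


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example values (not from the study):
swc_pred = [22.0, 27.5, 31.0, 35.5, 41.0, 48.0]   # predicted SWC, vol.-%
rut_depth = [3.0, 2.5, 6.0, 5.5, 9.0, 14.0]       # rut depth, cm

tau = kendall_tau_b(swc_pred, rut_depth)
rho = spearman_rho(swc_pred, rut_depth)
print(f"Kendall tau = {tau:.3f}, Spearman rho = {rho:.3f}")
```

  # Both coefficients depend only on the ordering of the values, so they are unaffected by the different units of SWC and RD and by any monotonic transformation of either variable.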
  Are the results of the model what you expected? Is the spatialization really relevant here (i.e., are there heterogeneities that the spatial model was able to represent?). And I ask this particularly because I didn’t understand Figure 1: for site B, the measurements of IMT and SSN do not overlap. How did you compare them? If you compared them regardless of the spatial variability, maybe spatial variability is actually not that relevant here.
  # While it would have been feasible to conduct cross-validation of the SWC models separately for each site, our primary focus was not on site-specific effects but rather on the general impact of topography on soil moisture dynamics. We asserted that both sites were relatively comparable in terms of soil and stand properties, and our objective was to leverage the varied positions of SSN sensors or IMT measurements to derive topographically driven patterns.
  # It's important to note that the IMT data was not directly compared with the SSN data. These datasets from different sources were treated independently in the modeling process. Consequently, this form of spatial variability does not appear relevant to our analysis. We hope that discussing this issue provides more clarity about the manuscript.
  Can you use the results of the model to actually predict rut depth? If not, please at least discuss this and why not. Otherwise, I think the title of the paper needs to be updated, because in the end rut depth was not predicted as stated in the title.
  # We could update the title to 'Soil moisture modeling with ERA5-Land retrievals and in-situ measurements and its use for predicting ruts' to reduce the emphasis on 'predicting rut depth.'
  # While we assert that predicting rut depth using the modeled values of SWC is possible, it is crucial to approach the results with extreme caution. Prompted by the comments of referee 2, we devoted careful consideration to spatial variation, leading us to construct a semivariogram using rut depth data and SWC data from both sources (Figure 5, below). It appears that the rut depth data might exhibit some degree of spatial covariation. Although this realization is unfortunate for our study, we appreciate that the referee's comments have helped identify this critical issue. We have adjusted the manuscript to present the results in a fair manner and hope that future research can build upon this knowledge, particularly by creating better study designs for surveying predictions of rutting in the proposed manner.
  Figure 5. Semivariogram of rut depth (A) and in-situ soil water content (B).
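  # For readers unfamiliar with semivariograms, the following minimal sketch shows how an empirical semivariogram can be computed with Matheron's classical estimator, γ(h) = Σ (z_i − z_j)² / (2 N(h)), summed over the N(h) point pairs whose separation falls into lag bin h. The coordinates, values, bin width and function name are hypothetical; this is not the procedure or software used to produce Figure 5.

```python
from itertools import combinations
from math import hypot


def empirical_semivariogram(coords, values, bin_width, n_bins):
    """Return a list of (bin_centre, semivariance, n_pairs) per lag bin.

    The semivariance is None for bins that contain no point pairs.
    """
    sq_diff_sums = [0.0] * n_bins
    pair_counts = [0] * n_bins
    for i, j in combinations(range(len(values)), 2):
        dist = hypot(coords[i][0] - coords[j][0], coords[i][1] - coords[j][1])
        b = int(dist // bin_width)
        if b < n_bins:  # pairs beyond the last bin are ignored
            sq_diff_sums[b] += (values[i] - values[j]) ** 2
            pair_counts[b] += 1
    result = []
    for b in range(n_bins):
        gamma = sq_diff_sums[b] / (2 * pair_counts[b]) if pair_counts[b] else None
        result.append(((b + 0.5) * bin_width, gamma, pair_counts[b]))
    return result


# Hypothetical transect: four positions, 4 m apart, with rut depths in cm.
coords = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0), (12.0, 0.0)]
rut_depth = [1.0, 2.0, 4.0, 7.0]

variogram = empirical_semivariogram(coords, rut_depth, bin_width=5.0, n_bins=3)
for centre, gamma, n_pairs in variogram:
    print(f"lag ~{centre:4.1f} m: gamma = {gamma}, pairs = {n_pairs}")
```

  # A semivariance that increases with lag distance, as in this constructed example, indicates that nearby observations are more similar than distant ones, i.e. that the observations are not spatially independent.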
  L315-320: you discuss the relation between rutting and DTW reported by other studies. But, in your study, what were the outcomes in terms of rutting depth and relation with DTW? This is not clear.
  # Considering the significance of the topographic indices DTW and TWI in the development of the SWC models, we will compare RD with both indices in the revised version of the manuscript. For now, Figure 6 (below) might be of interest.
  
  Figure 6. Comparison between topographic indices and rut depth.
  There seems to be important differences between Trial 1 and Trial 2, and they have different conditions (wet/dry). Would precipitation data be useful as a proxy for rut depth as well? If yes, comment about this in the discussion. If not, just ignore this comment.
  # Precipitation is a significant driver in the soil-plant-atmosphere continuum and undeniably influences soil moisture. However, numerous other drivers contribute to this continuum, encompassing factors such as evapotranspiration, interception, lateral and vertical flow. Modeling these complex interactions can be extensive, and we hold the belief that the retrievals of ERA5-Land provide a robust means to integrate soil moisture dynamics.
  In section 4.1. Importance of predictive systems, you mention the importance of “predictive systems”, and, by that, I understand that one could use the methodology described in the paper to predict rut depth in the future. And one of the highlights of the methodology here is that it incorporates the temporal variability, by using the ERA-5 product. However, ERA-5 data could not be used to predict rutting depth in the future, given that the ERA-5 data is for the present, it is not a forecast for the future. And this brings back the question as to whether precipitation could be an important proxy, because, for precipitation, there are forecasts available.
  # The term 'Prediction' used in this work excludes forecasts into the mid-range future but aims to predict rut depths that can be anticipated in a forest operation conducted today or tomorrow. While we have gained some experiences with forecasts of soil moisture content, we have observed that uncertainties can be quite high. Consequently, we decided to focus on current predictions rather than extrapolations with potentially high biases.
  # However, it's worth noting that there are attempts to extend predictions into medium-range forecasting, as demonstrated by efforts such as those by the Finnish Meteorological Institute. We added additional information on this topic to the discussion.
  Minor comments:
  L15-16: “(…) to model the response variable, in-situ soil water content” -> “(…) to model the soil water content” – I see no need for the “response variable”, as it is a very vague term.
  # Thank you for the hint. We omitted the “response variable” as recommended.
  L20: “(…) as well as temporal components such as numeric variables derived from date and (…)” – “numeric variables” is too vague.
  # After removing Year as a predictor, Month and Season could be found in the final models. We have updated the abstract accordingly.
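  # To make the temporal predictors concrete, the following minimal sketch derives Month and Season from a date. The mapping of months to meteorological seasons and the function name are assumptions made for illustration; the manuscript's exact definition of Season may differ.

```python
from datetime import date

# Assumption: meteorological seasons of the northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def temporal_predictors(day):
    """Return the date-derived predictors for one observation day."""
    return {"Month": day.month, "Season": SEASON_BY_MONTH[day.month]}


# Date of Trial_DRY as stated in this response (2022-10-11).
print(temporal_predictors(date(2022, 10, 11)))
```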
  L34: I find it a bit strange to put a reference in a middle of a statement. “Soil compaction as a consequence of harvesting operations (Eliasson, 2005; etc) is detrimental to a (…)”. What is the statement that these citations refer to?
  # We rephrased the sentence: 'Soil compaction is a consequence of harvesting,' as reported by Eliasson (2005), etc. to explain the negative consequences for ecological functions.
  Section 2.1.3. Soil maps: what is the information on these soil maps? Is it soil texture? Please state it.
  # We used the (categorical) soil type information and added the information to the Materials and Methods section.
  L60: Agren et al. (2021), used -> without the comma
  # Done.
  L91: when you mention the predictor variables here, it would be nice to have an idea of which variables are these. For instance, add in parenthesis (e.g., topographic indexes, soil texture)
  # We have changed the statement as recommended.
  I think it would make the paper much clearer if you added a sub-section in the beginning of Material and Methods called “Study design” or “Study overview”. In this section, you could present Figure 2, and you write in a little bit more detail what is written as the caption of Figure 2. I think having an overview of the study design before reading the technicalities of the methods section would be very helpful.
  # That is a good idea, we implemented a short overview to introduce the method section.
  L110-111 and L117-118: both are “known to be temporally wet or sensitive machine traffic” – it’s repetitive. Add this sentence only one time referring to both sites.
  # Yes, that's true. We have removed the repetitions, thanks for mentioning.
  L143: what SPI stands for?
  # stream power index
  L148: how is “Basin” a variable? Is it the basin area?
  # “Basin” is a factorial variable that defines the catchment in which the positions are located. We removed Basin from the predictor variables to avoid confusion. Since all SSN positions at a site fell into the same Basin, it was discarded very early in the IMT modelling anyway.
  L154: how were you “able to gather maps”? What is the source for these maps?
  # The maps were provided by the Geological Survey. We adjusted the text accordingly.
  L149-L150: I couldn’t find any justification for re-sampling to 15m x 15m based on Agren et al. (2014). Please explain where this number comes from.
  # We added a clarification of 15x15 m in the Materials and Methods section. This resolution has been shown to exhibit a strong correlation with SWC and can be assumed to be more robust (Ågren et al., 2014), as observed in prior work where resolutions ranging from 1 to 20 m were tested (data not shown). Larson et al. (2022) made a more thorough evaluation for more sites and indices. In Larson’s study TWI was best predicted at 16 m, which is in the order of 15 m.
  Also, in general, please add clarification on how datasets with different spatial resolutions were merged.
  # We added this information. In principle, we just stacked different maps (with different resolutions), and extracted the raster values at spatial points (i.e. measuring or sensor positions).
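  # The extraction step can be sketched as follows, assuming a simple nearest-cell lookup. The `Raster` class, the grid values and the coordinates are hypothetical and only illustrate how layers with different cell sizes (here 15 m and 9 km) can be sampled at the same points; they do not reproduce the GIS software used for the manuscript.

```python
from dataclasses import dataclass


@dataclass
class Raster:
    x_min: float      # x coordinate of the left edge
    y_max: float      # y coordinate of the top edge
    cell_size: float  # edge length of a square cell, in map units (m)
    data: list        # rows of cell values, first row = northernmost

    def value_at(self, x, y):
        """Value of the cell containing (x, y), or None outside the grid."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if 0 <= row < len(self.data) and 0 <= col < len(self.data[row]):
            return self.data[row][col]
        return None


def extract(stack, points):
    """Sample every layer of the stack at every point (one dict per point)."""
    return [{name: r.value_at(x, y) for name, r in stack.items()} for x, y in points]


# Hypothetical layers: a 15 m terrain index grid and one 9 km ERA5-Land cell.
stack = {
    "DTW025": Raster(x_min=0.0, y_max=30.0, cell_size=15.0,
                     data=[[0.2, 1.5], [3.0, 0.8]]),
    "SWC_ERAL2": Raster(x_min=-4000.0, y_max=5000.0, cell_size=9000.0,
                        data=[[0.31]]),
}
points = [(5.0, 25.0), (20.0, 5.0)]  # measuring or sensor positions

rows = extract(stack, points)
for row in rows:
    print(row)
```

  # Because each layer is queried in its own grid geometry, no resampling to a common resolution is needed; all points within the same coarse cell simply receive the same SWC_ERAL2 value.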
  I think the methods section is too long. Some steps are described as a “recipe” instead of a scientific paper. For instance: L130: “ (…) inserted into the attributed of a shapefile”, L169: “(…) merged with in-situ data” – these small technical steps of merging/formatting two datasets don’t need to be detailed.
  # To provide sufficient information in the methods section is a delicate balance – between overly lengthy descriptions and insufficient detail. We aim to ensure that interested (possibly also inexperienced) scientists can replicate the methods. Readers less interested in detail can skim through the text. We hope for the referee's approval.
  I would avoid using full sentences to describe the names of variables. For instance, in L138-139, “(…) of the following sizes: 0.25 ha (DTW025), 1.00ha (DTW1), 4.00 ha (DTW4)”. In L 153-155: “(…) scale of 1:5,000 from forest site surveys (Soil05).” In L156-157, “(…) scale of 1:50,000 are available for the entirety of North Rhine-Westphalia (Soil50)”
  # We agree and changed the descriptions as recommended.
  L159-162: I would rephrase it as: “ERA-5 Land is a global (…) , including soil moisture [m³ m^-3] at the top soil layer (0-7cm) and at a depth of 7-28 cm. The soil moisture at the top soil layer is retrieved by assimilating satellite and ground-based observations”. The names of the variables Volumetric soil water layer 1 and 2 are not relevant. I don’t think you use them further in the manuscript. If you do, then you can add their names in parenthesis, but if not, I see no reason why they should be mentioned.
  # We agree and rephrased the sentence, but kept the information on the layers, since it is needed for the feature selection.
  In Figure 2, in the second row (predictors), there are some gray lines in the back, which seem to be connecting “soil maps” and the ERA graph to “add data”. However, I assume “topogr. indices” and “Month Season” are supposed to be included as well?
  # Certainly. We updated the flow chart (Figure 4, this document).
  Figure 3: a legend on the side with the colors and the names of the variables would be helpful.
  # A legend was added, as suggested.
  Figure 3: no need to say “the figure displays”
  # This sentence part was removed from the caption, as recommended.
  Figure 3: the names ‘layer 1’ and ‘layer 2’ are not really relevant here. I think it would be more relevant to provide what these layers refer to: layer 1 (top soil) and layer 2 (7-28cm depth).
  # Changed as recommended.
  L232-233: “Soil water content was measured (…) August 2020” – I consider this fits in Methods, not Results.
  # The sentence was removed. The abbreviations IMT and SSN are described again at this point, as a reminder.
  2.2 Rut depth data: make it clear here what is different between Trials 1 and 2. Trial 1 is in a wet condition and Trial 2 is in a dry environment, right? That’s why it is expected that SWC1 > SWC2. This makes the interpretation of Figure 6 afterwards easier.
  # We changed the names of the trials to Trial_WET (= Trial 1) and Trial_DRY (= Trial 2) to increase readability, and added the required information.
  Figure 6: what does the black lines and the black values refer to? Make it clear in the caption.
  # The caption was updated.
  L346: but in site 2, the locations of IMT and SSN are different.
  # Certainly, this issue is addressed in the revised manuscript.
  L272: “SWC_PRED proved to be a better predictor of rut depth”, particularly for Trial 1 (in wetter conditions).
  # We have changed that as recommended.
  L317: ground water -> groundwater
  L316-318: 65% + 93% is more than 100%. I don’t think I understand this sentence. Please reformulate it.
  L316-318: proximity to groundwater and DTW are two different things, no?
  # It is misleading, we agree. The new formulation could be: „For example, Vega-Nieva et al. (2009) found that 65% of ruts deeper than 25 cm were located in areas with a DTW value of less than 1 m, and 93% of these ruts occurred in areas with DTW values less than 10 m.”.
  Discussion: section 4.1. Importance of predictive systems: I think I would move this in the very end of the discussion.
  # We agree that moving section 4.1 is an alternative but would like to stick with the current storyline. We hope for the reviewer's understanding.
  
  We would like to express our gratitude to the referee for their valuable feedback. We want to assure them that we are fully capable of implementing the recommended changes, as discussed above. We think that the insightful and valuable comments provided by the referee resulted in a substantial improvement of the manuscript.
  
  References
  Ågren, A., Lidberg, W., Strömgren, M., Ogilvie, J., and Arp, P. (2014). Evaluating digital terrain indices for soil wetness mapping – a Swedish case study. Hydrology and Earth System Sciences 18, 3623–3634. doi: 10.5194/hess-18-3623-2014
  Hillel, D. (1998). Environmental soil physics: Fundamentals, applications, and environmental considerations. San Diego, California: Elsevier.
  James, S. E., Pärtel, M., Wilson, S. D., and Peltzer, D. A. (2003). Temporal heterogeneity of soil moisture in grassland and forest. Journal of Ecology, 234–239.
  Kelliher, F. M., Leuning, R., and Schulze, E. D. (1993). Evaporation and canopy characteristics of coniferous forests and grasslands. Oecologia 95, 153–163. doi: 10.1007/BF00323485
  Larson, J., Lidberg, W., Ågren, A. M., and Laudon, H. (2022). Predicting soil moisture conditions across a heterogeneous boreal catchment using terrain indices. Hydrology and Earth System Sciences 26, 4837–4851. doi: 10.5194/hess-26-4837-2022
  McNabb, D. H., Startsev, A. D., and Nguyen, H. (2001). Soil Wetness and Traffic Level Effects on Bulk Density and Air-Filled Porosity of Compacted Boreal Forest Soils. Soil Science Society of America Journal 65, 1238–1247. doi: 10.2136/sssaj2001.6541238x
  Muñoz-Sabater, J., Dutra, E., Agustí-Panareda, A., Albergel, C., Arduini, G., Balsamo, G., et al. (2021). ERA5-Land: a state-of-the-art global reanalysis dataset for land applications. Earth System Science Data 13, 4349–4383. doi: 10.5194/essd-13-4349-2021
  Poltorak, B. J., Labelle, E. R., and Jaeger, D. (2018). Soil displacement during ground-based mechanized forest operations using mixed-wood brush mats. Soil and Tillage Research 179, 96–104. doi: 10.1016/j.still.2018.02.005
  Tromp-van Meerveld, H. J., and McDonnell, J. J. (2006). On the interrelations between topography, soil depth, soil moisture, transpiration rates and species distribution at the hillslope scale. Advances in Water Resources 29, 293–310. doi: 10.1016/j.advwatres.2005.02.016
  Vega-Nieva, D. J., Murphy, P. N. C., Castonguay, M., Ogilvie, J., and Arp, P. (2009). A modular terrain model for daily variations in machine-specific forest soil trafficability. Canadian Journal of Soil Science 89, 93–109. doi: 10.4141/CJSS06033
  
  Citation: https://doi.org/10.5194/egusphere-2023-1908-AC3

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Publish subject to revisions (further review by editor and referees) (27 Nov 2023) by Yongping Wei

AR by Marian Schönauer on behalf of the Authors (12 Dec 2023) Author's response Author's tracked changes Manuscript

ED: Publish subject to revisions (further review by editor and referees) (03 Jan 2024) by Yongping Wei

ED: Referee Nomination & Report Request started (11 Jan 2024) by Yongping Wei

RR by Anonymous Referee #1 (21 Jan 2024)

Suggestions for revision or reasons for rejection

The authors’ answers to most of my questions and the corresponding changes in the manuscript are acceptable. However, there are a few items for which I felt more explanation or changes are needed.
1. The added content is not enough to justify the use of ERA5 in this study. As pointed out by both reviewers, the resolution of the ERA5 product is too low given the size of the study area. Therefore, the most convincing reasons to use ERA5 could be one or both of the following: a) the soil moisture variability within the study area is negligible, and b) there are no other applicable data sources. The authors mentioned the first reason in the response to the other reviewer but not in the revised paper. I believe it’s important to add that in the manuscript to address any potential concerns of future readers.

2. Regarding the paragraph under Section 3.3: I appreciate the modifications the authors made. However, I insist on my previous point that this section should not be there unless a stronger connection to the results is presented. Since the goal of this paper is building models rather than purely reporting field measurements, the discussion of those measurements should be in the Data section, unless the content discusses the impact of those measurements on the modeling results. Right now, the section is just about those measurements. Another way to work around this is to break this paragraph down and put those sentences near the more relevant discussions of results in the sub-sections that follow.

3. It’s fine if the authors insist on keeping the evaluation involving both the RD and the SWC, but more effort will then be needed to differentiate those two evaluations, as one is an apples-to-apples comparison, whereas the other one is not. I would say keeping the existing evaluation between the predicted SWC and RD is probably fine, as the goal for this is to demonstrate the usefulness of the predictions. However, I would definitely expect more quantitative comparisons between the predicted SWC and the SWC measurements, as this comparison tells us how good the model is, which is the main objective of this work. Therefore, I suggest adding a few more metrics, such as RMSE and NSE, to the latter for a more comprehensive evaluation.
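
The metrics named in point 3 can be illustrated with a minimal sketch. It is not taken from the manuscript: the function names and the sample values are hypothetical, RMSE and NSE follow their standard definitions, and Kendall's coefficient is computed in its simplest tau-a form, which assumes no ties.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs, no tie correction."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1
            elif product < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical soil water content values (vol. %), for illustration only.
swc_measured = [20.0, 30.0, 40.0, 50.0]
swc_predicted = [22.0, 28.0, 43.0, 49.0]

print(rmse(swc_measured, swc_predicted))
print(nse(swc_measured, swc_predicted))
print(kendall_tau_a(swc_measured, swc_predicted))
```

RMSE and NSE both depend on the size of the errors, whereas a rank correlation only reflects ordering, so the three values complement each other when predicted and measured SWC are compared directly.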

RR by Anonymous Referee #2 (08 Feb 2024)

Suggestions for revision or reasons for rejection

I have reviewed the manuscript by Schönauer et al. for the second time, and I am honored that so many of my comments were considered useful by the authors and further implemented in the manuscript. I consider the paper much better now. Compared to the previous version, the updated manuscript:
- Is easier to follow (e.g., changing from “Trials 1 and 2” to “Trials Wet and Dry”)
- Addresses and explains many methodological concerns (e.g., locations of measurements in Sites A and B)
- Improves the discussion of the results (e.g., why was DTW025 an important predicting variable?)
- Is more honest regarding the study limitations (e.g., spatial covariation detection).

However, I still have some minor (mainly technical) concerns:
L34: and has shown to be -> and has been shown to be
L100: during this field trials -> during these field trials
Caption Figure 2: dryer -> drier
L120: dezember -> December
L206: dryer -> drier
L206: "subsequent section 2 of the same machine trail (...) (Figure 2, Site A), or in close proximity of section 1 (Site B)" – isn’t it the other way around (Sites A and B)? Based on Figure 2, it looks like in Site A it was in close proximity (i.e., roughly parallel), and in Site B, it continued on the same trail.
Figure 5: the legend indicates that the y-axis is SWC_CORE, but the figure indicates the y-axis as SWC_SR
Figure 7: why is the asterisk in parentheses sometimes [0.28(*)], and other times not [0.34*]?
Captions of several figures: “Significance levels are indicated by *** for p<0.001, ** for 0.001-0.01, * for 0.01-0.05, (*) for 0.05-0.10, and 'ns' for p>0.10” – asterisk in parentheses for 0.05-0.10, but not for the other ranges.
L294: too fragile ?
In Appendix B, in one of the plots, Kendall’s coefficient is NANA?
L177-178: "the main outputs when both datasets were combined can be seen in Appendix A" -> this is a result, and I think it would be more appropriate to mention this in the results section.
L305: particle-to-particle
“Section 4.4.1 Temporal variation was higher than spatial variation”: can you really claim that? Have you quantified temporal and spatial variation, in order to be able to state that one is “higher” than the other? In the text, it is clearer that you mean that the temporal variation was more important than the spatial variation.
L364: the sentence is started with “this indicates”. What is “this” referring to in this context? Please consider re-phrasing it. For instance, the sentence could be: “The temporal variability in soil moisture between the trials was more important in this study than the spatial variability within the relatively small areas where each trial was conducted”.
L367: Sita A -> Site A
L374: wether?
L371-372: Please re-phrase this sentence in a more formal way: "Therefore, we have to admit, that the study design was not ideal for assessing the ability to predict rutting with a spatiotemporal model of SWC, and the results have to be considered with caution." – remove the part about “we have to admit”, and maybe add something along the lines of “limitations” of this study.
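
The significance notation raised in the comments on Figure 7 and the figure captions can be written out as a small lookup. This is only a sketch of the scheme as quoted from the captions above. The function name is hypothetical, and the handling of values that fall exactly on a class boundary is an assumption, since the caption does not specify it.

```python
def significance_label(p_value):
    """Map a p-value to the symbols quoted in the figure captions."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    if p_value < 0.10:
        return "(*)"  # marginal class: the only one written in parentheses
    return "ns"


for p in (0.0005, 0.005, 0.03, 0.07, 0.5):
    print(p, significance_label(p))
```

Under this reading, "(*)" and "*" denote different classes, so a coefficient marked 0.28(*) and one marked 0.34* would differ in significance level and not only in typography.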

ED: Publish subject to revisions (further review by editor and referees) (15 Feb 2024) by Yongping Wei

AR by Marian Schönauer on behalf of the Authors (19 Mar 2024) Author's response Author's tracked changes Manuscript

ED: Referee Nomination & Report Request started (25 Mar 2024) by Yongping Wei

RR by Anonymous Referee #2 (03 Apr 2024)

RR by Anonymous Referee #1 (17 Apr 2024)

ED: Publish subject to technical corrections (20 Apr 2024) by Yongping Wei

AR by Marian Schönauer on behalf of the Authors (22 Apr 2024) Manuscript

Journal article(s) based on this preprint

20 Jun 2024

Soil moisture modeling with ERA5-Land retrievals, topographic indices, and in situ measurements and its use for predicting ruts

Marian Schönauer, Anneli M. Ågren, Klaus Katzensteiner, Florian Hartsch, Paul Arp, Simon Drollinger, and Dirk Jaeger

Hydrol. Earth Syst. Sci., 28, 2617–2633, https://doi.org/10.5194/hess-28-2617-2024, 2024

Viewed

Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.

Total article views: 310 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
310	0	0	310	0	0

Views and downloads (calculated since 07 Sep 2023)

Month	HTML	PDF	XML
Sep 2023	120	0	120
Oct 2023	59	0	59
Nov 2023	24	0	24
Dec 2023	14	0	14
Jan 2024	20	0	20
Feb 2024	15	0	15
Mar 2024	11	0	11
Apr 2024	17	0	17
May 2024	14	0	14
Jun 2024	16	0	16
Jul 2024	0
Aug 2024	0
Sep 2024	0

Viewed (geographical distribution)

Total article views: 305 (including HTML, PDF, and XML) Thereof 305 with geography defined and 0 with unknown origin.

Latest update: 04 Sep 2024

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

This work employs innovative spatiotemporal modeling to predict soil moisture, with implications for sustainable forest management. By correlating predicted soil moisture with rut depth, it addresses a critical concern of soil damage and ecological impact, and its prevention through adequate planning of forest operations.

