the Creative Commons Attribution 4.0 License.
Effects of spatial soil moisture variability in forest plots on simulated groundwater recharge estimates
Abstract. Soil-Vegetation-Atmosphere Transfer (SVAT) models are essential tools for simulating and understanding the dynamic interactions governing water balance components within forest ecosystems. These models are widely employed to predict hydrological responses to environmental change, including the impacts of shifting meteorological conditions on forested landscapes. Despite their usefulness, the reliability of SVAT models is frequently compromised by uncertainties arising from incomplete or imprecise input data. These limitations often result in model assumptions that may lead to over- or underestimation of critical water balance components such as groundwater recharge. To improve the accuracy of SVAT models, observed soil moisture data are integrated to enhance parameterization by aligning simulated outputs with measured values. However, uncertainties remain regarding the selection of representative soil moisture profiles for calibration and the number of measurements necessary to robustly characterize a forest plot. To address these challenges, the present study explores the spatial variability of soil moisture across two forested plots with contrasting soil and vegetation conditions by deploying an extensive network of soil moisture probes in 11 profiles per plot. The influence of soil moisture variability on the adjustment of model input parameters during calibration and its subsequent impact on the computation of groundwater recharge is evaluated. The findings reveal that soil moisture variability was greater at the plot characterized by a heterogeneous soil, both horizontally and with depth, throughout the study period. These patterns of variability are also mirrored in the different parameter sets obtained from the calibration of the LWF Brook90 model, based on the recorded soil moisture time series in each of the 11 profiles per plot.
The most significant variation is observed in the infiltration and hydraulic soil parameters, whereby this is more pronounced at the plot with heterogeneous soil structure. Nevertheless, when examining the groundwater recharge rates calculated using the 30 best-performing parameter sets for each of the 11 profiles, both plots exhibited comparable temporal patterns and in particular similar variations in total volumes of groundwater recharge. These results suggest that model-inherent uncertainties, including parameter interactions, equifinality and dimensional simplifications, have a stronger impact on model outputs than uncertainties arising from variability in soil moisture caused by spatial heterogeneity of soil texture and hydraulic properties within the plot. Taking into account both sources of uncertainty, the application of bootstrapping techniques demonstrated that groundwater recharge could be reliably estimated using data from only 6 to 7 soil profiles per plot, providing a representative picture of its spatial variability. In general, the results indicate that using data from only a few soil profiles is not sufficient to capture the full range of groundwater recharge dynamics.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-4025', Philippe Ackerer, 01 Nov 2025
AC1: 'Reply on RC1', Thomas Fichtner, 12 Dec 2025
We thank the reviewer for the thorough and constructive assessment of our manuscript. The comments have been highly valuable in highlighting aspects of the model setup, process representation and parameter choice that require further clarification or refinement. Several points raised directly help us improve the transparency and robustness of our methodological approach. In the following responses, we address each comment in detail and indicate where revisions or additional adjustments to the model setup or the manuscript will be made.
RC: More information about the SVAT model is necessary to better understand the underlying processes and assumptions: quickflow fraction, multiplier to partially activate drainage, root water uptake, interception, run-off. Are transpiration and evaporation uptake handled in a different way?
AC: We agree that a more detailed description of the underlying processes and assumptions will improve the transparency of our methodology. We will also clarify how surface runoff and other relevant model processes are represented. We will revise the manuscript accordingly and expand the model description to address these points and enhance the overall clarity of the paper.
RC: The 100,000 random combinations are set without considering correlation between parameters, which is not really true for soils of similar texture. Please discuss.
AC: It is correct that the random generation of 100,000 parameter combinations did not explicitly account for known correlations between the parameters, which are typically constrained by soil texture and structure. However, in our study the sampling was not performed on primary texture variables (e.g., sand, silt, clay) but on the hydraulic model parameters (e.g., θs, θr, α, n, Ks). These parameters represent empirical degrees of freedom in the hydraulic functions rather than directly measured soil properties, and the pedogenic correlations between texture variables are therefore not directly applicable. Many of the well-known natural correlations (such as sand–clay or bulk density–porosity relations) are substantially weaker or even absent in the hydraulic parameter space. As a consequence, the lack of imposed correlations is less problematic for the exploration we conducted.
Nevertheless, we agree that correlation-preserving sampling methods (e.g., correlated LHS or copula-based sampling) or parameter correlation structures derived from pedotransfer functions could be used in future studies if the aim is to generate parameter sets that more strictly represent natural soils.
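The independent sampling discussed in this reply can be illustrated with a minimal sketch. The parameter bounds, the log-scale sampling of Ks and the function names below are assumptions made for illustration only; they are not the ranges or the code used in the study. The sketch draws each hydraulic parameter independently and then checks that the resulting sample correlation between θs and θr is close to zero, which is exactly what "no imposed correlation" means in practice.

```python
import math
import random

# Hypothetical sampling bounds for illustration only; these are NOT the
# calibration ranges used in the study.
BOUNDS = {
    "theta_s": (0.30, 0.60),    # saturated water content [-]
    "theta_r": (0.00, 0.15),    # residual water content [-]
    "alpha": (0.5, 15.0),       # MvG alpha [1/m]
    "n": (1.1, 2.5),            # MvG n [-]
    "log10_ks": (-1.0, 3.0),    # log10 of Ks, with Ks in [mm/d]
}


def sample_independent(n_sets, seed=0):
    """Draw n_sets parameter sets, each parameter uniform and independent."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_sets):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}
        # Ks is sampled on a log scale and converted back to mm/d.
        params["ks"] = 10.0 ** params.pop("log10_ks")
        sets.append(params)
    return sets


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


sets = sample_independent(5000)
r = pearson([p["theta_s"] for p in sets], [p["theta_r"] for p in sets])
print(f"sample correlation theta_s vs theta_r: {r:+.3f}")  # close to zero
```

A correlation-preserving scheme would replace `sample_independent` with a sampler that draws from a joint distribution; the diagnostic in `pearson` could then be used to confirm that the intended correlation structure is actually present in the sample.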
RC: How was the warm-up period of three months estimated? (may depend on climatic conditions).
AC: Due to the relatively short data series of less than two years, we decided to use only the period from the beginning of 2023 until the start of data collection in March 2023 as a warm-up period. This is admittedly somewhat too short; normally, the warm-up period should be at least six months long. For the revision of the paper, we will perform the calibration with additional measurement data from 2025 and, in doing so, extend the warm-up period to a minimum of six months.
RC: Figure 5 clearly shows that there are some problems with the soil moisture measurements. These outliers may impact the KGE (everything is squared that gives a higher weight to these singularities). Did you remove these ‘errors’?
AC: We assume that by "problems with the soil moisture measurements" or "outliers" you're referring to the repeated, sudden spikes in soil moisture in some profiles. These are very high soil moisture readings due to waterlogging following heavy rainfall events caused by poorly permeable soil layers. These conditions are consistent with field observations, where we were also able to observe this phenomenon of stagnant water in some soil profiles. Therefore, we believe these are not erroneous measurements.
We agree, however, that outliers in soil moisture measurements can disproportionately affect the Kling-Gupta Efficiency (KGE), due to its squared error component. Prior to calibration, we performed a visual inspection and physical plausibility checks of the soil moisture time series, which allowed us to identify and remove sudden changes that cannot be explained by known rainfall or limited percolation, as well as values outside physically meaningful limits. We will discuss the potential impact of waterlogging on observed soil moisture dynamics and on the calibration in the revised manuscript. In the revised discussion, we will also point out that the model cannot accurately represent certain soil moisture patterns because of structural limitations, such as the lack of a realistic representation of lateral flows.
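To make the sensitivity of the KGE to isolated spikes concrete, the following sketch implements the widely used formulation KGE = 1 - sqrt((r - 1)² + (α - 1)² + (β - 1)²), where r is the linear correlation, α the ratio of standard deviations and β the ratio of means. Whether the study used this variant or a modified one is not stated in the discussion, and the soil moisture values below are invented for illustration.

```python
import math


def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (a-1)^2 + (b-1)^2)."""
    n = len(obs)
    mu_s, mu_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)   # linear correlation
    a = sd_s / sd_o           # variability ratio (simulated / observed)
    b = mu_s / mu_o           # bias ratio (simulated / observed)
    return 1.0 - math.sqrt((r - 1) ** 2 + (a - 1) ** 2 + (b - 1) ** 2)


# Hypothetical volumetric water contents [vol. %]; not data from the study.
sim = [22.0, 21.0, 20.5, 24.0, 23.0, 22.0, 21.5, 21.0]
obs_smooth = [21.5, 21.0, 20.0, 24.5, 23.5, 22.0, 21.0, 20.5]
obs_spike = list(obs_smooth)
obs_spike[3] = 45.0           # one waterlogging-like spike in the observations

print("KGE without spike:", round(kge(sim, obs_smooth), 3))
print("KGE with spike:   ", round(kge(sim, obs_spike), 3))
```

In this toy example a single spike inflates the observed standard deviation, so the variability ratio moves far from 1 and the KGE drops, even though the simulation is unchanged.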
RC: The CV values of Tharandt plots is higher than the CV values of Kienhorst. I agree with the interpretation. However, because the soils are very different, Richards model and MVG models may be more suited for one site than for the other, leading to more variability of the parameter’s values.
AC: We agree that the observed differences in coefficient of variation (CV) between Tharandt and Kienhorst may not solely reflect site heterogeneity, but could also be influenced by the suitability of the underlying model structure. The SVAT model used in this study relies on the Richards equation and Mualem–van Genuchten (MVG) parameterization, which assumes continuous matrix flow. While this approach is widely used, it may be more appropriate for the relatively homogeneous, fine-textured soils at Kienhorst. In soils with strongly structured pore systems or temporary waterlogging as in Tharandt, this assumption can lead to model errors or unrealistic parameter estimates. Therefore, the higher coefficient of variation (CV) in Tharandt may not only be due to site heterogeneity but also to limitations of the model structure itself. We will discuss this in the revised manuscript.
RC: The discussion about model calibrated parameters could be improved. It could be interesting to test if each parameter is statistically different from one soil profile to the other for the same site, and if some correlation can be highlighted (usually, tetaS and tetaR are correlated, due to MVG model formulation where the ‘driving’ parameter is TetaS-TetaR).
AC: We agree that a deeper analysis of the calibrated parameter sets can provide valuable insights into soil profile heterogeneity and parameter interdependence. As suggested, we can assess whether calibrated parameter values differ significantly between the soil profiles at each site by using statistical tests and visualization through correlation matrices and parameter scatter plots. These additions will strengthen the interpretation of the calibrated parameters and help to better contextualize differences between soil profiles and the structural links between hydraulic parameters.
RC: The discussion about the groundwater recharge is interesting. Of course, doing some statistics with only 11 profiles can be discussed but, considering the huge amount of work, the results can be considered as a first-order quantification (6 profiles minimum among 11). It also questions the definition of mean values for a highly non-linear model such as the Richards model. In my opinion, scaling infiltration to the size of a plot is very difficult and may depend on forcing terms (rainy period or dry period). This is nicely shown in the manuscript.
AC: We agree that statistical inference based on 11 profiles has limitations, but consider the results as a first-order quantification of recharge variability under site-specific conditions. We also acknowledge the non-linearity of the Richards model, which complicates the interpretation of mean values. In highly heterogeneous soils, averaging parameters or outputs may obscure critical dynamics, especially during episodic forcing events such as heavy rainfall or drought. Our intention was not to derive statistically robust spatial averages, but to characterize recharge variability across representative soil profiles.
It is also right that the concept of a “mean” value is problematic in non-linear systems, where averaging boundary conditions or soil parameters does not yield an equivalent average system response. This is particularly relevant for infiltration and recharge, which depend strongly on transient forcing conditions (wet versus dry periods) and soil-specific non-linearities. We will clarify this more explicitly in the discussion.
Finally, we agree that scaling infiltration and recharge processes to the plot scale is challenging. The choice of the 20 × 20 m plot size follows the rationale described in Berthelin et al. (2020), who used the same spatial scale to ensure that each plot constitutes a relatively homogeneous unit with respect to slope, aspect, and vegetation structure. A 20 × 20 m area is large enough to integrate over sub-tree variability and local heterogeneities in vegetation cover, while still remaining small enough to avoid merging distinct geomorphological or land-use units. This scale therefore provides a reasonable compromise: it minimizes internal heterogeneity while capturing the variability relevant for infiltration and groundwater recharge at the stand level. Extending this logic to infiltration processes, the plot size is sufficiently large to represent canopy-induced variability in throughfall and interception, yet small enough to maintain a coherent hydrological response of the soil system. We will expand the discussion to reflect these points more explicitly.
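The question of how many of the 11 profiles are needed can be explored with a resampling experiment. The discussion does not describe the bootstrapping procedure in detail, so the following is only one plausible sketch: for each subset size k, k profiles are drawn with replacement, their mean recharge is computed, and the width of the central 95 % interval of those means is recorded. The recharge totals and the function name `bootstrap_interval_width` are hypothetical.

```python
import random
import statistics


def bootstrap_interval_width(values, k, n_boot=2000, seed=0):
    """Width of the central 95 % interval of the mean of k resampled profiles."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=k)) for _ in range(n_boot)
    )
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot) - 1]
    return upper - lower


# Hypothetical annual recharge totals [mm] for 11 profiles; not study data.
recharge_mm = [182.0, 205.0, 160.0, 231.0, 198.0, 175.0,
               214.0, 190.0, 169.0, 222.0, 201.0]

widths = {k: bootstrap_interval_width(recharge_mm, k) for k in range(1, 12)}
for k, width in widths.items():
    print(f"{k:2d} profiles: 95 % interval width of mean recharge = {width:6.1f} mm")
```

Plotting the interval width against k shows where additional profiles stop narrowing the estimate appreciably; the threshold chosen on such a curve is a judgment call and depends on the accuracy required.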
RC: The unit of ksat in figure S3 is unclear.
AC: We will change this in the revised manuscript.
Citation: https://doi.org/10.5194/egusphere-2025-4025-AC1
RC2: 'Comment on egusphere-2025-4025', Anonymous Referee #2, 05 Nov 2025
Dear authors, please find my comments attached.
AC2: 'Reply on RC2', Thomas Fichtner, 12 Dec 2025
We would like to thank the reviewer for the detailed, constructive, and scientifically insightful comments. The feedback has been extremely valuable in helping us to identify aspects of the manuscript where clarification, additional analyses, or methodological refinement are warranted. In particular, the reviewer’s remarks concerning model parameterization, preferential flow processes, groundwater recharge dynamics and the representation of site-specific hydrological conditions have provided important guidance for strengthening both the conceptual framing and the technical robustness of the study.
In the following reply, we address each comment systematically and outline the revisions and the additional steps we will undertake to improve the model setup and the interpretation of the results. Where the reviewer has pointed to methodological limitations, we acknowledge these explicitly and describe how they will be addressed either through revised analyses or through a clearer discussion of uncertainties and assumptions.
Model setup and description
RC: The model setup section needs further elaboration. Please provide more information about the LWF-BROOK90 model itself, not only its soil moisture component.
AC: We agree that a more detailed description of the underlying processes and assumptions will improve the transparency of our methodology. We will also clarify how surface runoff and other relevant model processes are represented. We will revise the manuscript accordingly and expand the model description to address these points and enhance the overall clarity of the paper (please also see our response to the first comment of Referee 1).
RC: It is unclear which model output was used to represent groundwater recharge (GWRCH). I assume it is the drainage from the lowest soil layer rather than outflow from the conceptual subsoil bucket with delay, but this should be explicitly stated and justified.
AC: This is true. By strict definition, groundwater recharge corresponds to the water volumes added to the groundwater table. In our study, groundwater recharge was defined as the drainage flux from the lowest soil layer, without applying an additional conceptual delay or storage bucket. Hence, it is better described as potential or latent recharge. Analysing this potential recharge allows for a direct comparison across soil profiles with varying hydraulic properties, and the study focuses on relative recharge dynamics rather than absolute aquifer recharge timing. We will update our definition of groundwater recharge in the model setup section accordingly.
RC: It should be discussed that the model can only be used to estimate GWRCH under several restrictive assumptions: it is a 1D model with no explicit groundwater module, no lateral flow/routing, and no upward capillary flux representation.
AC: We fully acknowledge that LWF-BROOK90 is a one-dimensional SVAT model without an explicit groundwater module. As such, it does not simulate groundwater table dynamics, lateral flow or routing, nor upward capillary flux from deeper layers. Potential groundwater recharge in this context is defined as vertical drainage beyond the root zone, and should be interpreted as a first-order estimate of potential recharge, rather than actual aquifer replenishment. We will update the model setup section and discussion to explicitly state these assumptions and to caution against overinterpretation of the recharge estimates beyond the scope of the model.
RC: Please provide sources for all parameter values and parameter ranges used in the calibration. How the soil hydraulic parameters were derived for each layer (which PTF, etc.)?
AC: We agree that the manuscript should clearly document the sources and justification for all parameter values and calibration ranges. In the revised version, we will add the corresponding literature sources or databases for all fixed and calibrated parameters to Table S1 in the supplement. We will also clarify how the initial soil hydraulic parameters were obtained for each soil layer. Specifically, θs, θr, α, n and Ks were derived using the pedotransfer function of Wösten et al. (1999), based on measured texture fractions and bulk density for each layer. These PTF-derived values served as initial estimates for the calibration.
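For readers less familiar with the role of θs, θr, α and n, the sketch below evaluates the van Genuchten retention function with the Mualem constraint m = 1 - 1/n. It is a generic textbook form rather than code from LWF-Brook90, and the parameter values are hypothetical, not PTF output for either site. Note that θs and θr enter only through the offset θr and the range θs - θr, which is the structural link between the two parameters raised by Referee 1.

```python
def vg_theta(h, theta_r, theta_s, alpha, n):
    """Volumetric water content at suction h (h >= 0, unit matching 1/alpha)."""
    if h <= 0.0:
        return theta_s                      # saturated at zero suction
    m = 1.0 - 1.0 / n                       # Mualem constraint
    se = (1.0 + (alpha * h) ** n) ** (-m)   # effective saturation in (0, 1]
    return theta_r + (theta_s - theta_r) * se


# Hypothetical parameters for a sandy layer; not values from the study.
params = dict(theta_r=0.03, theta_s=0.40, alpha=4.0, n=1.8)  # alpha in 1/m

for suction_m in (0.0, 0.1, 0.5, 1.0, 10.0, 150.0):
    theta = vg_theta(suction_m, **params)
    print(f"h = {suction_m:6.1f} m  ->  theta = {theta:.3f}")
```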
RC: I was surprised not to find the maximum LAI parameter or its inclusion in the sensitivity analysis. Together with glmax, this is one of the most sensitive parameters in BROOK90. Further, the method used to simulate seasonal LAI dynamics (phenological scheme and parameters) is not described, although it strongly affects evapotranspiration and thus water availability for percolation.
AC: We agree that the maximum leaf area index (LAImax) is one of the most important parameters in the model, as it strongly influences transpiration and thus the partitioning of water between evaporation and percolation. LAImax was given in the input file "Data-NEW_param.csv": 4.0 for the Kienhorst (pine) location and 4.0 for the Tharandt (spruce) location. The seasonal LAI dynamics are contained in the input file "Data-NEW_meteoVeg.csv", where the daily relative LAI value in percent is specified. For the calculation of the relative LAI values, we used a maximum LAI of , the seasonal LAI variation is small. This LAI time series is not calculated by the model itself; it is entirely defined by a phenological model, for which we used the approach of Weis et al. (2012). We will expand the model setup section to describe the phenological scheme used to simulate seasonal LAI dynamics.
RC: Information on meteorological forcing data (source, temporal resolution, spatial representativeness) is missing. If daily data were used together with standard DURATN values, this may strongly affect infiltration and percolation processes.
AC: In the revised manuscript, we will provide the missing information. We aggregated 30-minute data to daily values; these data are contained in the input file "Data-New_meteoVeg.csv". The average number of hours per precipitation event for each of the 12 months is defined in the file "Data-NEW_meteo_storm_durations.csv"; we used a value of 4 here, which the Brook90 documentation describes as a "satisfactory approximation".
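The aggregation from 30-minute records to daily values can be sketched as follows. The record layout (timestamp, precipitation, air temperature), the choice to sum precipitation and average temperature, and the function name `to_daily` are assumptions for illustration; the reply does not specify how each variable was aggregated.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def to_daily(records):
    """Aggregate (timestamp, precip_mm, temp_c) 30-minute records per calendar day.

    Precipitation is summed and temperature is averaged (assumed conventions).
    """
    by_day = defaultdict(list)
    for timestamp, precip, temp in records:
        by_day[timestamp.date()].append((precip, temp))
    daily = {}
    for day, values in sorted(by_day.items()):
        daily[day] = {
            "precip_mm": sum(p for p, _ in values),
            "temp_mean_c": sum(t for _, t in values) / len(values),
        }
    return daily


# Hypothetical two days of 30-minute records (48 per day); not study data.
start = datetime(2023, 3, 1)
records = [
    (start + timedelta(minutes=30 * i),
     0.5 if i % 12 == 0 else 0.0,      # 0.5 mm every six hours
     5.0 + (i % 48) * 0.1)             # simple daily temperature ramp
    for i in range(96)
]

daily = to_daily(records)
for day, values in daily.items():
    print(day, values)
```

Daily sums discard sub-daily rainfall intensity, which is why the model needs a separate storm-duration value (the value of 4 hours mentioned above) to redistribute daily precipitation.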
Soil moisture data and calibration
RC: I appreciate the substantial effort required to collect such a large soil moisture dataset. However, its application within the current modelling setup—given the research questions—raises concerns.
You calibrated the model using the entire soil moisture dataset (20 months), leaving no data for validation. While validation may not be the core focus, it is good (and standard) scientific practice and would strengthen the credibility of your modelling results.
AC: You are right that independent validation is important to check the generalizability of the model. In the present study, the entire data set was used for calibration because the amount of data was limited (data series of less than two years, March 2023 to November 2024). Our measurement campaign continued during the revision of this paper, and with the additional measurement data from 2025 we now have an extended observation period available to perform a split-sample validation (e.g. calibrate on 70%, validate on 30%). We will also extend the warm-up period to a minimum of six months to make sure the model is not affected by estimated initial conditions.
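A chronological split-sample scheme with a warm-up period could look like the sketch below. The 180-day warm-up, the 70/30 split and the function name `split_sample` mirror the figures mentioned in this reply but are otherwise illustrative assumptions; the exact split used in the revision is not specified.

```python
def split_sample(days, warmup_days, calib_fraction=0.7):
    """Drop the warm-up, then split the rest chronologically.

    Returns (calibration_days, validation_days). The split is by time, not
    random, so the validation period lies entirely after the calibration period.
    """
    if not 0.0 < calib_fraction < 1.0:
        raise ValueError("calib_fraction must be between 0 and 1")
    evaluated = days[warmup_days:]
    cut = round(len(evaluated) * calib_fraction)
    return evaluated[:cut], evaluated[cut:]


# Hypothetical simulation of 1000 daily time steps, indexed 0..999.
days = list(range(1000))
calibration, validation = split_sample(days, warmup_days=180)

print("calibration:", calibration[0], "to", calibration[-1])
print("validation: ", validation[0], "to", validation[-1])
```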
RC: Given the large number of calibration runs, I am concerned about potential model overfitting, which may reduce model performance outside the calibration period.
AC: In our opinion, any risk of overfitting is primarily related to the number of calibration parameters rather than the number of calibration runs, which are intended to explore the parameter space. The large number of runs can only reveal the problem. To mitigate this risk, we will substantially reduce the number of calibration parameters in the revised version and constrain their plausible ranges, thereby limiting the potential for parameter compensation and overfitting. The remaining uncertainty in model predictions will be assessed using a split-sample test. These approaches allow us to quantify the robustness of the model and the influence of the remaining degrees of freedom, ensuring that predictions remain reliable beyond the calibration dataset. We will add a discussion of these steps in the revised manuscript to clarify how overfitting is minimized and how residual uncertainty is addressed.
RC: Groundwater recharge is mostly generated during winter, yet only one winter season (2023/2024) is included. This raises questions regarding the robustness of GWRCH estimates. If sensors are still operating, I strongly recommend including additional data and performing a split-sample calibration/validation.
AC: It is correct that our dataset includes only one full winter season (2023/2024), which limits the temporal representativeness of the groundwater recharge estimates. As mentioned before, we will include additional available measurement data from 2025 during the revision of the paper. This will allow us to determine groundwater recharge for an additional winter season and to test the robustness of the recharge estimates. It will help ensure that the recharge estimates are not biased by single-year anomalies and reflect broader climatic variability.
RC: Please discuss how soil moisture measurement uncertainty (±3%) affects calibration results, especially during dry periods where simulated SM values often fall below 5%—resulting in possible relative errors of up to 100%.
AC: This is indeed an important point for the discussion. In dry periods, when soil moisture falls below 5% by volume, an absolute measurement uncertainty of ±3% can lead to very high relative errors. For example, a measured value of 4% could actually lie anywhere between 1% and 7%, which corresponds to a relative deviation of ±75%. During dry periods, the uncertainty is so high that accurate calibration is virtually impossible; the model could appear “incorrect” even though it is within the measurement uncertainty.
We agree that this limits the significance of the calibration during these phases. To account for that, special attention should be paid to the wet phases, in which the relative uncertainty is lower, when evaluating the model quality.
Further, our model does not aim to produce highly accurate soil moisture values during extreme dry conditions, which could indeed be problematic for studies focusing on plant water stress. However, the primary objective of this study is to quantify groundwater recharge. In this context, short-term deviations in very low soil moisture values have limited influence on the cumulative recharge estimates, since recharge is largely controlled by wetter periods when the soil is above the wilting point. Consequently, we expect the influence of measurement uncertainty during dry periods on our key results to be minimal.
We will add a brief discussion of this limitation in the revised manuscript to clarify that while soil moisture uncertainty may affect dry-period values, it does not substantially compromise our assessment of groundwater recharge.
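The relative-error arithmetic in this reply follows directly from dividing the absolute sensor uncertainty by the measured value. A short sketch, with a hypothetical helper name and the ±3 vol. % uncertainty quoted above as the default:

```python
def relative_uncertainty(measured_vol_pct, abs_uncertainty_vol_pct=3.0):
    """Relative uncertainty [%] of a soil moisture reading given in vol. %."""
    return 100.0 * abs_uncertainty_vol_pct / measured_vol_pct


for value in (3.0, 4.0, 5.0, 15.0, 30.0):
    print(f"{value:4.1f} vol. %  ->  +/- {relative_uncertainty(value):5.1f} % relative")
```

The same ±3 vol. % that amounts to ±75 % at a reading of 4 vol. % shrinks to ±10 % at 30 vol. %, which is why the wet phases carry more information for judging model quality.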
Missing reference simulation
RC: I recommend including uncalibrated/reference simulations (using measured vegetation parameters and Mualem–van Genuchten parameters obtained from soil profile data via PTFs) for all 22 profiles. This would help to evaluate the added value of calibration for both SM and GWRCH.
AC: We agree that a comparison between calibrated and uncalibrated models provides valuable insights into the added value of calibration. It allows us to assess how well the model performs with default or literature-based inputs and to identify which profiles or parameters benefit most from calibration. During the revision of the paper, we will run a reference simulation with measured and standard literature values before we start the calibration. The differences in results could be visualized using scatterplots.
Transferability and scientific significance
RC: The most critical point of the study is that the experimental setup appears highly site-specific and dependent on data availability and chosen parameters. As currently presented, the results have very limited transferability and limited practical/scientific added value. This should be discussed more clearly with respect to the intended audience (modellers, foresters, soil scientists, hydrologists). Key results should either be presented with appropriate caution or supported more robustly, especially in terms of why such a complex and expensive setup is necessary for plot-scale GWRCH estimation.
AC: We acknowledge that the experimental setup is highly site-specific, relying on detailed soil profile characterization, vegetation data, and continuous soil moisture monitoring, which limits the direct transferability of the results to other sites. However, the study was designed to explore the feasibility and sensitivity of plot-scale groundwater recharge estimation under realistic field conditions. The complexity of the setup reflects the challenges of capturing recharge dynamics in structured, heterogeneous forest soils, where simplified approaches may fail to represent key processes.
Further, the plots used correspond to the “SuperSites” of the Institute for Forestry Research (FVAs) in Germany, which were selected precisely to represent typical combinations of dominant tree species (e.g., pine/spruce) and associated soil types within the region. While exact combinations of species and soil may be limited in number, these "SuperSites" provide a representative subset of managed forest stands, capturing the main variability in slope, aspect, and vegetation cover relevant for plot-scale groundwater recharge.
During the revision of the manuscript, we will more clearly delineate the scope and limitations of the study in the discussion. We will reflect more deeply on the generalizability, audience relevance, and justification of our experimental design and modeling approach.
The results should be interpreted as a proof-of-concept for high-resolution recharge modeling, rather than a universally applicable framework. For modellers, the study highlights parameter sensitivities and model limitations. For foresters and soil scientists, it provides insight into how forest structure and soil layering influence water availability and recharge. For hydrologists, it demonstrates the need for profile-resolved data to constrain recharge estimates in complex terrains.
We agree that future work should include multi-site comparisons, simplified proxy approaches, and scaling strategies to improve transferability and practical relevance.
Figures
RC: The quality (resolution) of all figures is currently insufficient, making them difficult to interpret.
AC: This could be due to the upload process; we will check that and improve the quality where required.
Scope and title
RC: A substantial portion of the manuscript deals with soil moisture simulation and calibration, whereas groundwater recharge is addressed only briefly. You may wish to reconsider the title or rebalance the manuscript content.
AC: The aim was to highlight the differences in groundwater recharge, based on calibration. But maybe we could add some words to the title as follows:
“Effects of spatial soil moisture variability in forest plots on model calibration and simulated groundwater recharge estimates”.
Specific comments
RC: L52: The motivation for the importance of GWRCH needs to be further elaborated. As presented in the first paragraph, it remains vague.
AC: We will revise that part and add some facts to the motivation.
RC: L76–77: This represents only one of many possible approaches.
AC: We will revise that part and add information on other approaches for reducing parameter uncertainty, such as sensitivity analyses, ensemble simulations, and Bayesian calibration frameworks.
RC: L79–81: This is quite a bold statement and is not universally true (e.g., cases of over-calibration without validation).
AC: We will revise that part and add information on the risks that arise when there is no independent validation or when the calibration is too heavily tailored to a single data set.
RC: L101: Why are plot-scale GWRCH estimations needed rather than multi-site or gridded estimations, which are typically required for practical applications by stakeholders?
AC: Multi-site or gridded estimations are indeed essential for large-scale water balance assessments and operational decision-making. However, they rely on upscaled parameterizations that may miss local-scale heterogeneities and nonlinearities in infiltration and percolation. Plot-scale studies provide the process understanding and high-resolution data necessary to improve these larger-scale models and to guide the selection of representative parameters for broader applications. The focus on plot-scale groundwater recharge allows for an evaluation of model performance under well-controlled conditions, using high-resolution soil moisture data. This level of detail is crucial for understanding site-specific processes such as root-zone dynamics, canopy interception, and soil hydraulic behavior, which are often averaged out or parameterized in large-scale models. We agree that bridging the gap between plot-scale understanding and landscape-scale application is essential, and we see this study as a step toward that goal.
Moreover, the contrasting environmental conditions across the two forest plots provide a basis for assessing model transferability and sensitivity to site characteristics. These insights are intended to inform future upscaling efforts, including the development of parameter regionalization schemes and integration into gridded SVAT frameworks. We will add a discussion in the manuscript emphasizing that plot-scale measurements serve both to quantify first-order recharge variability and to support the development and validation of multi-site or gridded modeling approaches, bridging the gap between detailed process studies and practical large-scale applications.
RC: L115: Please add the Latin names of species, full soil texture classifications, and the averaging period for the meteorological data. Additionally, include soil profile data in the Appendix to assess heterogeneity.
AC: We will add this information to the revised manuscript.
RC: L135–136: If stagnant conditions at this site are well known due to shallow bedrock, what GWRCH can realistically be expected, and how is this addressed in the model setup?
AC: Indeed, the presence of redoximorphic features and perching horizons at the plot scale indicates locally restricted vertical percolation due to low subsoil permeability, which reduces percolation to deeper groundwater. However, the absence of a permanent water table suggests that these conditions persist only for limited periods. Since slopes are almost negligible, we can assume that most of the perched water will slowly find its way to recharge. We therefore believe that this causes no substantial bias in our recharge estimates. However, we acknowledge that this process is not well represented by the model setup and is probably the cause of the inferior simulation quality obtained at the Tharandt site.
To address this limitation more explicitly, we will revise the model setup in a next step. In particular, we aim to improve the representation of near-surface storage dynamics and vertical percolation by adjusting soil hydraulic parameters, refining the layering scheme, and testing variations of the bypass flow to better represent flow conditions. We hope that these adjustments will allow us to capture more realistically the delayed percolation events and the attenuated recharge response that typically arise under stagnant conditions and shallow bedrock.
RC: L144: Do you mean ±3%?
AC: Yes, we will change it.
RC: L149–150: Please elaborate on the calibration process in more detail.
AC: The factory calibration is based on laboratory measurements in reference media (typically quartz sand and deionized water) across a range of volumetric water contents and temperatures. The resulting calibration curves are stored internally in the sensor electronics and automatically applied to the raw TDT signal, converting it into volumetric water content and temperature values. In this study, we relied on this manufacturer calibration rather than performing site-specific calibration, because our focus was on relative soil moisture dynamics and temporal patterns rather than absolute water content values. Previous studies have shown that the factory calibration of SMT100 sensors provides sufficiently robust results for monitoring soil moisture dynamics in a variety of soils (Sprenger et al., 2015; Demand et al., 2019).
We acknowledge that site-specific calibration (e.g., by gravimetric sampling or soil-specific calibration functions) could reduce systematic bias in absolute values, especially in soils with atypical texture or bulk density. However, given the study objectives, the factory calibration was considered adequate.
RC: L165: Add the north direction and satellite/aerial background imagery to better assess spatial heterogeneity.
AC: We will revise the figures.
RC: L193: It is unclear where the parameter calibration ranges come from. For example, cvpd values below 1–1.5 result in unreliable transpiration and should be justified. The same applies to other sensitive parameters as well.
AC: Thank you for pointing out the recommended range for the parameter cvpd. We acknowledge that values between 1.5 and 2.5 kPa are generally considered physiologically plausible thresholds, as transpiration becomes unreliable below this range due to stomatal regulation. In our initial calibration we tested a slightly broader interval to explore model sensitivity and potential site-specific adaptations.
Following your suggestion, we will restrict the calibration range to 1.5–2.5 kPa in the new simulations in connection with the revised manuscript, which ensures that the simulations remain consistent with established physiological knowledge and avoids unrealistic transpiration dynamics. We will also check the value ranges of the other parameters during the revision of the manuscript.
RC: L195: Please include a sensitivity analysis in the Appendix and specify its boundary conditions and variables of interest.
AC: A preliminary sensitivity assessment was conducted earlier in the context of a Master’s thesis, which provided helpful insights for designing the current model setup. However, the assumptions, implementation details, and documentation of that work were not developed to a level suitable for publication, and several aspects would require substantial revision before they could be transparently presented in the Appendix of this manuscript.
However, the sensitivity analysis performed was only one source used in selecting the calibration parameters. The parameter selection was also influenced by literature on BROOK90 applications, site-specific measurements and expert judgments based on the physical characteristics of the study area. These supplementary sources ensured that important processes are adequately represented.
RC: L203: A warm-up period of only 3 months is, in my experience, insufficient for this model setup. A minimum of 6 months (preferably 1 year) is recommended. Please either extend the warm-up period or demonstrate that a longer warm-up does not significantly affect soil moisture at the start of spring 2023.
AC: We agree that the 3-month warm-up period is somewhat too short. As mentioned before, we will extend the warm-up period to a minimum of 6 months during the revision of the paper.
RC: L220: This setup violates one of the core assumptions of the KS test—independence of samples—since the bootstrapped subset contains data from the full set. You should choose a different metric or modify the setup.
AC: We agree that the classical two-sample KS test assumes independent samples, which is violated when comparing a bootstrap subset to the full data set containing that subset. In the revised manuscript, we will therefore no longer use the KS statistic as a formal hypothesis test (with p-values), but purely as a distributional distance measure between the empirical CDF of the subsample and that of the full data set. Our analysis is then based on the magnitude of the KS distance across bootstrap resamples for each sample size n, and we define a practical tolerance threshold for an acceptable deviation. This removes the need for the independence assumption underlying the two-sample KS test and preserves the original intention of assessing how many profiles are needed to obtain a representative recharge distribution.
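A minimal sketch of the distance-based use of the KS statistic described above. It assumes that subsets of n profiles are drawn without replacement from the full set and that the tolerance is freely chosen; the function names, the recharge values, and the threshold of 0.25 are illustrative assumptions, not values from the study.

```python
import random
from bisect import bisect_right


def ks_distance(sample, reference):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample), sorted(reference)
    # Both ECDFs are step functions, so the maximum gap occurs at an observed value.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def share_within_tolerance(values, n, tolerance, n_boot=1000, seed=0):
    """Fraction of size-n subsets whose KS distance to the full set is within tolerance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        subset = rng.sample(values, n)  # assumption: draw without replacement
        if ks_distance(subset, values) <= tolerance:
            hits += 1
    return hits / n_boot


# Hypothetical recharge totals (mm) for 11 profiles; illustrative only.
recharge = [210.0, 225.0, 198.0, 240.0, 232.0, 205.0,
            219.0, 251.0, 190.0, 228.0, 215.0]

for n in (3, 6, 9, 11):
    print(n, share_within_tolerance(recharge, n, tolerance=0.25))
```

No p-value is computed here, so the independence assumption of the two-sample test never enters; the result depends only on the chosen tolerance and the resampling scheme.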
RC: L230: How were raw observations post-processed? Were sensor issues, spikes, or errors encountered? (e.g., see Kienhorst L1–10 cm, red straight line after 23 Sept.)
AC: We applied very little post-processing to the raw data; the graphs show essentially the original measurements. We performed only a visual inspection and physical plausibility checks of the soil moisture time series to identify sudden changes that cannot be explained by known rainfall or limited percolation, as well as values outside physically meaningful limits.
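The two plausibility criteria named above (values outside physically meaningful limits, and sudden changes) could be scripted roughly as follows. This is only a sketch of an automated counterpart to the visual inspection, not the procedure used in the study; the function name, the limits of 0.0 to 0.6 m³/m³, the step threshold of 0.1, and the example series are all hypothetical.

```python
def flag_implausible(theta, lower=0.0, upper=0.6, max_step=0.1):
    """Return indices of readings outside [lower, upper] or with a jump larger than max_step.

    All thresholds are assumed placeholder values, not values from the study.
    """
    flagged = []
    for i, value in enumerate(theta):
        out_of_range = not (lower <= value <= upper)
        # A jump is measured against the preceding reading, so both the rise
        # and the fall of an isolated spike are flagged.
        jump = i > 0 and abs(value - theta[i - 1]) > max_step
        if out_of_range or jump:
            flagged.append(i)
    return flagged


# Hypothetical volumetric water content series (m3/m3).
series = [0.21, 0.22, 0.22, 0.45, 0.23, 0.23, 0.75, 0.24]
print(flag_implausible(series))
```

Flagged readings are only candidates: each one would still need to be compared with the rainfall record before it is treated as a sensor error rather than a real wetting event.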
RC: L245: The spikes observed in Tharandt—do they indicate sensor errors or preferential/bypass flow in some plots?
AC: These spikes or very high soil moisture readings are due to waterlogging following heavy rainfall events caused by poorly permeable soil layers. These conditions are consistent with field observations, where we were also able to observe this phenomenon of stagnant water in some soil profiles. Therefore, we believe these are not erroneous measurements.
RC: L260 (270): What exactly is meant by “simulations” here—an average of the 30 ‘best’ simulations and their uncertainty bandwidth?
AC: Yes, it is an average of the 30 ‘best’ simulations and their uncertainty bandwidth.
RC: L263: There appears to be a substantial underestimation of variance. Please provide the full KGE decomposition to better assess this and discuss it, as it is crucial for estimating downward water flows and, consequently, GWRCH.
AC: We agree that variance underestimation is a relevant aspect when evaluating downward water fluxes and groundwater recharge. However, in our opinion, a full numerical decomposition of KGE components would not substantially change the interpretation of model performance, because the dominant source of mismatch is already known from the comparison of simulated and observed soil moisture dynamics: the model tends to dampen short-term fluctuations, which is a common behaviour of Richards-type models and related soil hydraulic parameterizations. This behavior is also well documented in the literature for forest soils with high small-scale heterogeneity and preferential flow. The aim of the study was to evaluate the model’s ability to reproduce soil moisture dynamics and downward water flows in a holistic manner, rather than to analyze individual components of the efficiency measure.
Given the scope of the paper, we therefore prefer not to present the separate decomposition terms, as this would shift the emphasis away from the main objectives and add considerable detail without altering the interpretation of recharge dynamics. Instead, we highlighted the limitations of the variance representation directly in the discussion and clarified how this may affect the estimation of downward water flows and groundwater recharge.
Instead of providing the full KGE decomposition, we propose to expand the manuscript to discuss the variability component of model performance more explicitly, explaining how reduced short-term variance affects simulated percolation dynamics and how this has been accounted for in our uncertainty assessment. This includes narrowing the parameter ranges in the revised calibration, reducing the number of calibration parameters to avoid over-smoothing effects, and evaluating model performance using an independent validation period.
We hope this clarification adequately addresses the reviewer’s concern, while keeping the focus on the hydrologically relevant implications of variance underestimation rather than on the detailed numerical decomposition of KGE.
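For reference, the decomposition under discussion can be sketched as follows, assuming the original KGE formulation with correlation r, variability ratio alpha (simulated over observed standard deviation), and bias ratio beta (simulated over observed mean). Whether the manuscript uses this or a modified variant is not stated here, and the function name and the example series are illustrative.

```python
import math
from statistics import mean, pstdev


def kge_components(sim, obs):
    """Return r, alpha, beta and the resulting KGE (original formulation assumed).

    Assumes both series are non-constant and the observed mean is non-zero.
    """
    mu_s, mu_o = mean(sim), mean(obs)
    sd_s, sd_o = pstdev(sim), pstdev(obs)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio; below 1 means damped dynamics
    beta = mu_s / mu_o        # bias ratio
    kge = 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return {"r": r, "alpha": alpha, "beta": beta, "kge": kge}


# Hypothetical observed soil moisture and a simulation with halved amplitude.
obs = [0.20, 0.30, 0.25, 0.35]
obs_mean = mean(obs)
sim = [obs_mean + 0.5 * (o - obs_mean) for o in obs]
print(kge_components(sim, obs))
```

In this constructed example the timing and the mean are reproduced perfectly, so the entire loss in KGE comes from the variability term, which is the pattern described above for damped short-term fluctuations.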
RC: L289: To my knowledge, ksnvp is one of the least sensitive parameters in the BROOK90 model, as snow evaporation rarely exceeds 1% of annual precipitation. It is therefore unclear why it is included in the calibration list.
AC: It is correct that ksnvp is generally considered a low-sensitivity parameter, particularly in temperate regions where snow evaporation contributes minimally to the overall water balance. In our initial calibration list, ksnvp was included for completeness, as part of a broader vegetation parameter set. However, we acknowledge that its inclusion may not be justified given its limited influence on simulated outputs under the climatic conditions of our study sites. Therefore, we will exclude ksnvp from the new calibration during the revision of the manuscript and focus on parameters with greater hydrological relevance.
RC: L300: It is unclear whether the CV values represent summarized CV values for all parameters per site. Please explain this in more detail.
AC: We computed the CV for each parameter (e.g. ksmax, alpha, n, glmax, etc.) across the ensemble of accepted parameter sets, and then summarized these values as a site-level average. This provides an indication of parameter uncertainty and identifiability at each location. We will revise the methods section to clarify this calculation and will add a note in the results to explain the interpretation of CV values in the context of model sensitivity and calibration robustness.
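The calculation described above can be sketched as follows. The parameter names are taken from the reply, but the values are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
from statistics import mean, stdev


def site_level_cv(parameter_sets):
    """Per-parameter CV across accepted parameter sets, plus their site-level average.

    parameter_sets: list of dicts mapping parameter name to calibrated value.
    """
    cv = {}
    for name in parameter_sets[0]:
        values = [p[name] for p in parameter_sets]
        # Assumption: sample standard deviation divided by the absolute mean.
        cv[name] = stdev(values) / abs(mean(values))
    return cv, mean(list(cv.values()))


# Hypothetical accepted parameter sets for one site; values are illustrative only.
accepted = [
    {"ksmax": 120.0, "alpha": 3.1, "n": 1.45, "glmax": 0.0050},
    {"ksmax": 310.0, "alpha": 4.0, "n": 1.38, "glmax": 0.0062},
    {"ksmax": 205.0, "alpha": 2.6, "n": 1.52, "glmax": 0.0055},
]
per_parameter, site_average = site_level_cv(accepted)
print(per_parameter)
print(site_average)
```

Because the CV is dimensionless, parameters with very different units can be averaged, but the site-level value then weights every parameter equally regardless of its influence on recharge.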
RC: L301: The increased variation reflects heterogeneity in calibrated soil parameters within the model setup, not actual soil properties (unless laboratory verification was performed).
AC: We agree that part of the observed variation may indeed originate from the calibration procedure and the parameter ranges applied, rather than representing fully verified soil properties. Our intention in presenting these CV values was to highlight the relative variability within the model setup, which reflects both the heterogeneity of the soils at Tharandt and the uncertainty inherent in parameter estimation. We acknowledge that without laboratory-based verification, the CV values should not be interpreted as direct measures of actual soil property variability. To clarify this point, we will revise the manuscript to explicitly state that the reported CVs represent variability in calibrated model parameters, which may differ from true soil heterogeneity. Nevertheless, the consistently higher variation at Tharandt compared to Kienhorst is in line with field observations of more complex soil layering and redoximorphic features at the site, supporting our interpretation of greater heterogeneity.
RC: L325–326: Please discuss these findings in the context of preferential flow (bypass flow) and how it is parameterized in the model.
AC: As part of a revision, we plan to adjust the model setup to better represent the flows from precipitation to groundwater recharge. Among other things, this includes reviewing and modifying the previous assumptions regarding bypass flow. Contrary to previous assumptions, we will deactivate this in the model. In doing so, we want to avoid unrealistically large amounts of water bypassing the upper soil horizon and being transferred directly to deeper layers. Bypass flow can still occur despite deactivation if the top soil layer is fully saturated after a precipitation event. We will discuss the new simulation results in the context of preferential flow (bypass flow), particularly with regard to how the simplified representation of preferential flow paths influences model performance and the estimation of groundwater recharge.
RC: L334–340: The cumulative GWRCH values seem much higher than expected for the studied sites. Please include a table or plot in the Appendix showing the monthly sums of all water balance components (as you are not covering full years). Also, relate GWRCH to seasonal/annual precipitation as a percentage.
AC: We acknowledge that the reported groundwater recharge values may appear elevated without contextual information on the full water balance. To address this, we will add a table to the Appendix showing the monthly sums of all major water balance components (precipitation, evapotranspiration, interception, groundwater recharge, etc.) for each plot and year. Furthermore, we will also express groundwater recharge as a percentage of seasonal/annual precipitation, which allows for a clearer assessment of recharge efficiency and hydrological plausibility.
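Expressing recharge as a percentage of precipitation is a simple ratio of sums; a sketch under assumed inputs is given below. The function name, the period labels, and all numbers are hypothetical and serve only to show the arithmetic.

```python
def recharge_percent(rows):
    """Recharge as a percentage of precipitation, per period and for the whole record.

    rows: list of (label, precipitation_mm, recharge_mm) tuples.
    """
    # Periods without precipitation are skipped to avoid division by zero.
    per_period = {label: 100.0 * r / p for label, p, r in rows if p > 0}
    total_p = sum(p for _, p, _ in rows)
    total_r = sum(r for _, _, r in rows)
    return per_period, 100.0 * total_r / total_p


# Hypothetical monthly sums (mm); illustrative only.
monthly = [
    ("2023-03", 60.0, 25.0),
    ("2023-04", 45.0, 12.0),
    ("2023-05", 30.0, 3.0),
]
per_month, overall = recharge_percent(monthly)
print(per_month)
print(round(overall, 1))
```

The overall percentage is computed from the summed fluxes rather than by averaging the monthly percentages, so dry months with little precipitation do not distort the seasonal value.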
RC: L368–386: You do not demonstrate that the calibrated model reliably estimates all water balance components—only soil moisture at certain depths.
AC: We agree that our calibration and validation approach was limited to soil moisture observations at specific depths, and therefore does not allow for a direct verification of all individual water balance components. Our intention was to assess the model’s ability to reproduce soil moisture dynamics as a proxy for the underlying hydrological processes, rather than to provide a full closure of the water balance. We acknowledge this limitation and will clarify it in the revised manuscript by changing the formulation.
RC: L396–398: I disagree with this statement. It must be supported with model evidence. In many profiles, the model fails to reproduce high soil moisture peaks (likely from intense precipitation events). These peaks may represent preferential flow, which could be a major source of GWRCH (e.g. for Tharandt). Failure to simulate them may lead to significant underestimation of GWRCH during such events.
AC: You are correct that high overall KGE scores do not necessarily imply that all aspects of soil moisture dynamics are represented equally well. In particular, we acknowledge that the model shows limitations in reproducing sharp soil moisture peaks following intense precipitation events. These peaks may indeed be linked to preferential flow processes, which are not explicitly represented in the current model structure and could contribute to groundwater recharge, especially at the Tharandt site.
Our statement on the general realism of the simulations was intended to emphasize that, despite these shortcomings, the model was able to capture the broader soil moisture dynamics with acceptable precision. We will revise the manuscript to clarify this point and to explicitly acknowledge that the inability to reproduce preferential flow events may lead to an underestimation of recharge during such conditions.
In addition, we are currently testing modifications in the calibration setup (e.g., varying the bypass flow parameter and restricting surface runoff) to achieve a better fit of the simulated soil moisture dynamics. These adjustments are expected to improve the representation of peak events and thus provide a more robust estimation of recharge processes.
RC: L419–420: Given the model setup, emphasis on soil moisture, and limited vegetation parameters, the results are unsurprising.
AC: ok
RC: L427–429: This could become a valuable result if you analyze and discuss it in terms of the parameter list, parameter ranges, and degrees of freedom allowed during calibration. Would the conclusions change if different parameters or wider/narrower ranges were used?
AC: The finding that model-based uncertainty exceeds plot-based variability provides valuable insights into the sensitivity of the calibration setup. At the same time, we would like to emphasize that the calibration ranges applied in this study were derived from literature values, pedotransfer functions, and site-specific observations, and were deliberately chosen to balance plausibility with flexibility. We acknowledge that different choices of parameters or wider/narrower ranges could potentially alter the relative contributions of model-based versus plot-based variability. However, a systematic exploration of alternative calibration setups was beyond the scope of this study.
We will revise the manuscript to explicitly highlight this limitation and to discuss that the conclusions drawn here apply within the chosen calibration framework. We agree that future work could benefit from a more detailed sensitivity analysis of parameter ranges and degrees of freedom, which would allow a deeper understanding of how calibration decisions influence uncertainty and variability in recharge estimates.
RC: L464–466: The referenced GWRCH values are not reliable. I could not find any GWRCH data for Tharandt in Goldberg and Bernhofer (2007), and the Kienhorst values are taken from ArcEGMO simulations, which are often known to underestimate ET and therefore overestimate water available for GWRCH.
AC: We will check this during the revision of the manuscript.
Citation: https://doi.org/10.5194/egusphere-2025-4025-AC2
AC2: 'Reply on RC2', Thomas Fichtner, 12 Dec 2025
The manuscript is very well written, the topic is perfectly suited for HESS and the methodology and scientific results are very good (very good is missing in the rating). Figures' quality has to be improved.
I suggest therefore revision.
The following additional comments are provided: