the Creative Commons Attribution 4.0 License.
Drought dynamics across the hydrological cycle – an extensive validation of the National Hydrological Model of Denmark
Abstract. Droughts are gaining attention in temperate regions, as underscored by the severe European droughts of 2018 and 2022. In Denmark, these events caused widespread agricultural losses, degradation of surface waters and ecosystems, and infrastructure damage from soil subsidence. Although historical drought trends in northern Europe are uncertain, climate projections indicate more frequent and intense droughts. Hydrological drought propagation from precipitation deficit to soil moisture, streamflow and groundwater is shaped by topography, soil, vegetation, hydrogeology, and human activity. While streamflow and soil moisture droughts have been widely studied, groundwater droughts remain underexplored despite their importance for baseflow and water supply. In Denmark, where groundwater and surface water are closely linked, and groundwater resources are heavily relied upon, an integrated approach to drought assessment is essential. In this study, we compile a high-quality observational dataset, including soil moisture, streamflow, and groundwater levels, to systematically evaluate model-simulated drought and its propagation throughout all hydrological compartments by the National Hydrological Model of Denmark (DK-model), an integrated, distributed hydrological model. The DK-model’s nationwide coverage, combined with Denmark’s dense hydrological monitoring network, enables a detailed assessment of the model’s ability to simulate drought events. This includes model skill in reproducing observed anomalies, drought response times, and propagation dynamics. The DK-model was found to reproduce drought indices very well for groundwater levels and streamflow compared to respective observational time series. For soil moisture, model performance was lower. Drought propagation, evaluated by accumulation periods for precipitation with optimal correlation to hydrological drought, is likewise reproduced well for streamflow and groundwater. 
In contrast, the model struggles with the soil moisture signal. By evaluating the DK-model’s performance in simulating drought propagation, this study contributes to improving large-scale hydrological drought modelling and enhances the understanding of the strengths and weaknesses of this approach, while increasing its potential for drought analysis, monitoring, and forecasting. The findings provide critical insights into drought dynamics in temperate regions and support sustainable water resource management in a changing climate.
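The accumulation-period analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the data are synthetic, and a plain z-score stands in for the distribution-fitted, seasonally standardized indices (SPI, SDI, SGDI) used in the study. The idea is to accumulate precipitation over n months, standardize it, and pick the n whose series correlates best with a hydrological index.

```python
import numpy as np

def rolling_sum(x, n):
    """Sum over the trailing n values; the first n-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[n - 1:] = csum[n:] - csum[:-n]
    return out

def standardize(x):
    """NaN-aware z-score (assumption: stand-in for a fitted-distribution index)."""
    return (x - np.nanmean(x)) / np.nanstd(x)

def best_accumulation_period(precip, hydro_index, periods):
    """Return (period, r) with the highest Pearson correlation."""
    best = (None, -np.inf)
    for n in periods:
        spi_like = standardize(rolling_sum(precip, n))
        ok = ~np.isnan(spi_like) & ~np.isnan(hydro_index)
        r = np.corrcoef(spi_like[ok], hydro_index[ok])[0, 1]
        if r > best[1]:
            best = (n, r)
    return best

# Synthetic demo: the "hydrological" index is built from 6-month precipitation,
# so the search should recover an accumulation period of 6 months.
rng = np.random.default_rng(0)
precip = rng.gamma(shape=2.0, scale=30.0, size=400)
hydro = standardize(rolling_sum(precip, 6))
period, r = best_accumulation_period(precip, hydro, range(1, 25))
print(period, round(r, 3))
```

With real data, the correlation peak is often broad, which is one reason the referees ask how well defined the optimal accumulation periods are.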
Status: open (until 27 Jan 2026)
RC1: 'Comment on egusphere-2025-5373', Anonymous Referee #1, 18 Dec 2025
AC1: 'Reply on RC1', Raphael Schneider, 22 Dec 2025
We thank the reviewer for their effort and for the thoughtful and constructive review. Revising the manuscript along the reviewer’s suggestions will strengthen it.
We outline here how we plan to address the major comments of the reviewer.
Improve manuscript structure and focus 1: We understand the reviewer’s concerns about the selection of the observational dataset. We will move the selection fully into the Methods and Data section. Moreover, parts of the description of the observational dataset selection, in particular our considerations around filtering the groundwater level dataset (now in sections 2.3.1 and 3.2.1), will be moved into the appendix as suggested.
In this context, we have a question for the reviewer: in the comment “Detailed descriptions of observational datasets and preliminary performance results should be moved to the Appendix.”, what is meant by “preliminary performance results”? Figure 1?
Improve manuscript structure and focus 2: We appreciate the suggestion of the reviewer of showing general model performance not across all calibration wells and streamflow stations, but specifically across the selected wells and stations for the drought analysis. As we feel that the point of the DK-model being a “general-purpose” model is an important part of the story, our suggestion would be to add the performance across the 53 selected wells and 153 selected discharge stations as a second ECDF in Figure 1, panels a and b, and c and d, respectively. Then each of these subplots shows two lines: One representing performance across all calibration data, the second one representing performance in the selected wells/stations.
Improve manuscript structure and focus 3: Again, we appreciate the reviewer’s critical suggestions and acknowledge that the Discussion section can be sharpened and needs to be separated more strictly from the Results section. We will also add a section comparing the DK-model results of our study with those of comparable hydrological models used for drought assessment, as suggested.
Thanks also for pointing out inconsistencies in our definition of “drought”; we will make sure to use precise terminology in the revised manuscript.
The additional minor comments are appreciated and will also be addressed in the revision; we foresee no issues there.
Citation: https://doi.org/10.5194/egusphere-2025-5373-AC1
RC2: 'Reply on AC1', Anonymous Referee #1, 05 Jan 2026
Thank you for your reply. By “preliminary performance results” I was referring to the overall model performance evaluated across all calibration wells and streamflow stations. Relocating this information to the Appendix would help to improve the focus of the manuscript on the drought-related analyses and results.
Citation: https://doi.org/10.5194/egusphere-2025-5373-RC2
AC2: 'Reply on RC2', Raphael Schneider, 05 Jan 2026
Thanks again for the follow-up and clarification. We appreciate the suggestion and understand the wish for a better focus on the *relevant* aspects of model performance. As mentioned before, our suggestion is to keep Figure 1 but add a second ECDF showing the performance across the 53 selected wells and 153 selected discharge stations in panels a and b, and c and d, respectively. We feel that the DK-model being a “general-purpose” model is an important part of the story; hence we would like to keep Figure 1 to give an impression of general model performance. With the suggested addition, general model performance can be compared with performance at the wells/stations selected for the drought analysis.
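The two-line ECDF comparison proposed for Figure 1 could be computed along the following lines. This is only a sketch under stated assumptions: the scores are random placeholders rather than DK-model results, the total station count of 300 is invented for illustration, and `ecdf` is a hypothetical helper name. Only the subset size of 53 is taken from the reply above.

```python
import numpy as np

def ecdf(values):
    """Return sorted values and their cumulative probabilities i/n."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

rng = np.random.default_rng(1)
# Placeholder performance scores for all calibration wells (not real results).
score_all = rng.uniform(-0.2, 0.95, size=300)
# Hypothetical subset standing in for the 53 wells selected for drought analysis.
selected = rng.choice(score_all.size, size=53, replace=False)
score_sel = score_all[selected]

x_all, p_all = ecdf(score_all)
x_sel, p_sel = ecdf(score_sel)

# Each (x, p) pair is one line in the panel; medians give a quick comparison.
print(np.median(score_all), np.median(score_sel))
```

Plotting both `(x_all, p_all)` and `(x_sel, p_sel)` as step lines in the same panel would show at a glance whether the selected wells are systematically better or worse simulated than the full calibration set.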
Citation: https://doi.org/10.5194/egusphere-2025-5373-AC2
RC3: 'Comment on egusphere-2025-5373', Anonymous Referee #2, 13 Jan 2026
Summary:
This manuscript addresses the need to consider different aspects of drought phenomena when evaluating hydrological models’ ability to model drought, a topic relevant within the scope of HESS, and of importance to the hydrological modelling community. Specifically, they assess the general-purpose DK-model’s ability to simulate the temporal variability of normalized soil moisture, streamflow and groundwater, and its ability to capture lag-times between normalized precipitation and the abovementioned hydrological components. By using an operational model, they bridge research and operational hydrology, making the results relevant for both spheres.
A major weakness of the paper is the lack of drought-specific quantitative evaluations of the model, despite that being a main aim of the study. The assessments of drought dynamics are limited to general evaluation metrics for the driest 50% of the time series. Assessments that address drought in particular are needed for the paper to meet its own objective and to be accepted for publication. Other weaknesses that need to be addressed include a consistent lack of benchmarking or justification of what to accept as good metric values, the inclusion of the calibration period in the evaluation period, and the need for a clearer organization of the paper.
The manuscript presents a comprehensive and interesting work in terms of modelling, data selection and lag-time analyses. By properly addressing the abovementioned gaps, as well as the comments below, the manuscript could be suitable for publication.
General comments:
- a) Despite the study’s aim to evaluate drought dynamics, it lacks a clear definition of drought (e.g. occurrences, deficit volumes, durations based on thresholds) and quantitative evaluations based on those definitions. The time series of the lowest 50% values (presented in Fig 4 and Table 5) cannot be considered drought.
- b) The interpretation of the results as satisfactory lacks benchmarking or justification. The manuscript explicitly states that the model performs “good” (L413), “very well” (e.g. L21) and “very good” (L532) for correlations between 0.56 and 0.63 (Table 2). Correlations of 0.56-0.63 are not generally accepted as very good, and references or justifications are needed for such statements. This comment also applies to other evaluation metrics.
- c) The DK-model is calibrated for 2000-2010 and evaluated for 1990-2023. Hence, the model has seen and been adjusted to about 30% of the evaluated data during calibration. In general, models should be evaluated on data not seen during calibration, to avoid overly optimistic results. The authors need to justify their choice of not doing so, and preferably provide evaluation metrics for the unseen part of the evaluation period in an appendix to assure readers that the results are robust.
- d) The manuscript needs a clear organization, including a clear separation between results and discussion, and a Methods section that prepares the reader for the results. The structure would benefit from moving the presentation of the evaluation datasets from Results to Methods, moving new results from Discussion to Results, and introducing methods in the Methods section. Comparable studies/literature should be discussed in the Discussion (currently lacking).
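The split-sample check requested in the comment on the calibration period can be sketched as follows. The assumptions are loud ones: the KGE form used is the common one by Gupta et al. (2009), which the discussion does not confirm as the variant applied to the DK-model; the calibration window is treated as 2000 to 2010 inclusive; and the series are synthetic. `kge` and `split_sample_scores` are hypothetical helper names.

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]          # linear correlation
    alpha = np.std(sim) / np.std(obs)        # variability ratio
    beta = np.mean(sim) / np.mean(obs)       # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def split_sample_scores(years, sim, obs, cal_start=2000, cal_end=2010):
    """Score the calibration years and the remaining (unseen) years separately."""
    years = np.asarray(years)
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    in_cal = (years >= cal_start) & (years <= cal_end)
    return {
        "calibration": kge(sim[in_cal], obs[in_cal]),
        "unseen": kge(sim[~in_cal], obs[~in_cal]),
    }

# Synthetic monthly series for 1990-2023 (placeholder data, not observations).
rng = np.random.default_rng(2)
years = np.repeat(np.arange(1990, 2024), 12)
obs = 10.0 + rng.normal(0.0, 2.0, size=years.size)
sim = obs + rng.normal(0.0, 1.0, size=years.size)

scores = split_sample_scores(years, sim, obs)
cal_share = np.mean((years >= 2000) & (years <= 2010))
print(scores, round(float(cal_share), 3))
```

Under the inclusive-window assumption, the calibration years make up 132 of 408 months, which is in line with the "about 30%" stated in the comment. Reporting both scores side by side would show whether performance degrades outside the calibration period.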
Specific comments:
- Include quantitative results backing your conclusions in the abstract
- Nuance or rephrase the statement about “Denmark’s dense hydrological monitoring network” (it does not apply to soil moisture), and/or alert the reader to the sparse soil moisture data when providing results in the Abstract.
- Suggest shortening the first 15 lines of the Abstract, before the study is presented.
- Other studies /evaluation results using DK-model should be presented in the introduction or methods.
- Methods: add the currently lacking descriptions of the temporal resolution of the model and data.
- L158-160: Briefly describe the meteorological data used (how it is produced)
- L190: Define reference period first time it is mentioned
- L176-192: Clarify why you use all data for optimization (including human influence) when you use fixed abstraction rates and fixed wastewater outflow. I.e. explain why the choices described in sect 2.2.4 do not affect the choice of data used in the optimization described in 2.2.3.
- L202: Link or reference to Jupyter. Is it open or restricted? Information can be provided in the data availability section.
- L205-206: Please clarify. Requiring a minimum of 20 yrs of data while allowing for 20% gaps can be interpreted as requiring a minimum of 16 yrs of data. Is only 20% of the series allowed to have bi-monthly data, or does the 20% refer to gaps larger than two months?
- L208 (and elsewhere): Consider changing “climate” to “meteorological”, “atmospheric” or similar.
- L220: Please specify “criteria”, and in particular specify applied criteria important for your drought analyses.
- L230: Why exclude catchments smaller than 15 km2. The model has a 500m spatial resolution.
- L234-235: Why does this argument not apply for groundwater wells? Please give general information about recharge area or similar to justify why point (i.e. well) observations are ok for evaluation of model at 500m resolution in sect 2.3.1.
- Sect 2.3.1-2.3.3 Please consider moving number of stations and Fig. 2 to here.
- L244: You state “hundreds” based on reference that states “more than 100”. Please consider rephrasing as hundreds may be interpreted as much more.
- L275 and L277: Please add reference to categories, and consider removing “mild drought”, as this implies drought half of the time.
- L275-277: Please consider using (some of) these thresholds as a basis for drought definition (ref major comment a).
- L275-277: Thresholds and names do not align with those used in Fig 9. Please fix.
- Table 1:
  - Please justify choice of Makkink evapotranspiration method in main text.
  - Please define “q-points” in table legend.
  - Please be consistent in abbreviation for groundwater (SGI or SGDI).
  - Please justify the transformation ln(Q) in main text.
- L280: Using 1-month accumulation period for SPI and SPEI is not “most commonly practiced” (3-, 6- and 12-months are generally more common). Please justify your choice, e.g. by your knowledge of drought development in Denmark, or other studies.
- L285-286: Please clarify the advantage of computing on weekly data and resampling to monthly instead of computing directly on monthly data.
- L290: Fitting precipitation to a normal distribution is rare. Can histograms or similar be visualized to back up the choice?
- L290-294: Please quantitatively back up the conclusion from the Kolmogorov-Smirnov normality tests, and add the reasoning behind the choices for SDI, which is currently lacking.
- L301: Briefly define what a “streamflow calculation point” is, either here, or under the description of the model.
- L333-335: The evaluation of how representative the observation points are of the entirety of Denmark does not take into account that you would want the stations to represent the variability of streamflow and groundwater behavior across Denmark, which you yourself highlight as an important aspect, e.g. by separating into two different regions in Fig 8. Please consider discussing the dataset’s representativeness in terms of the different hydrological regimes/behaviors across Denmark in the discussion (or under the presentation of the evaluation datasets).
- Sect 2.4.3: Please align with results (including results in the discussion section) – ref major comment d). When presenting metrics (e.g. correlations and RMSE), it would be beneficial if you stated (and justified) values that indicate good performance and not, as this is unclear in the results section when there are no alternatives to compare the performance with.
- L331: Please state how you aggregated.
- Sect 3.1 is missing – maybe a numbering issue.
- L338: Please include in discussion whether a median KGE of 0.67 is satisfactory, with reference to other modelling studies.
- Figure 1 and L337: State which period is used for the overall performance metric scores.
- L340: Please introduce equation in methods, not results.
- L344-345: Is a mean absolute error of 0.65 for amplitudes of 1.06 consistent with “reasonably well” reproduced? Please justify, or rephrase.
- Sect 3.2: Suggest to move to relevant methods sections, ref previous comments, and delete repetitions.
- Fig 2: Move to methods, and please consider more contrast for points in both a and c to clearly separate Q from SM and shallow from deep. Please provide the threshold between shallow and deep in the figure caption.
- L389 and Fig 3: Example time series: please state the reasons for your choice of these stations, and whether these are near best-cases, medium cases, or other. Can their locations be shown on a map? Please state why you have included this example figure, as this is currently unclear.
- Fig 3: the colour description in the figure caption is the opposite of the figure, and “vertical” should be replaced with “horizontal”.
- Fig 4: If the correlation of the entire period and the dry periods are not comparable, why combine them in the figure?
- Sect 3.3 would benefit from a geographical visualization (i.e. map) or regional summary of the results, to see if the performance values depend on region, to better understand potential reasons for the better and worse results. In Fig 7 (and related text), you underline the different processes and drought signal in different regions in Denmark, however, you currently have no evaluation metrics shown either on a map or for regional averages. Knowing where the model can be more or less trusted is beneficial when using the model operationally, and helpful for further model development.
- L413: move interpretations of results as “good” etc. to discussion.
- L426: define “hydrological drought index”, preferably in methods (this is the first time this term is used)
- Sect 3.4: Is the matching of SPI and streamflow done for the overlapping grid cell or for precipitation over the entire catchment of the streamflow station? If the latter, please state in the methods. If the former, please justify in the methods, as one would expect the entire catchment precipitation (deficit) to affect streamflow. If not using catchment precipitation as the basis, this choice should also be discussed in the discussion section, and how this choice may have affected the results.
- Fig 5:
  - Please clarify in the caption that deep wells also have gray colour for non-significant correlations in (g) and (h).
  - It is not clear whether wells are deep or shallow when there are blue outlines on points representing more than one well in (i).
  - Please comment on the outliers (>20 months for SDI and >50 months for SGDI) in the main text. What can explain them?
- L443: please consider replacing “variability” with “accumulation period” or similar.
- L465-466 vs L493-494: How do similar results for SPI and SPEI align with the statement that “SPI and SPEI values are largely uncorrelated in time”?
- L493-494: provide the correlation value underlying this statement (“largely uncorrelated”)
- Fig 9 and related text: Please state why you chose to show a case study and why you chose this event (and not e.g. a more severe event from the events seen in fig7-8). The choice of SPI2 is also unclear.
- L532: “very good results” – I advise being more modest in the interpretation of the performance results unless you have reference numbers or benchmarking to compare with. This applies throughout the manuscript.
- L532-533: Please state the numbers underlying this statement, or refer to where they can be found.
- L553: Please add reference or justify your statement that the groundwater sensitivity to winter drought is often overlooked
- L562-563: Underpin your statement that there is significant spatial variability in acc. periods for streamflow with a p-value or similar. According to L446-447, the acc. periods are not well defined and hence the spatial variability is not necessarily significant. Please also discuss here that the spatial variability in acc. periods for observed streamflow is much lower (according to fig 5(f)).
- L606: Ref major comment a), “extreme conditions” are not evaluated in this study. Please rephrase to be in line with what is actually evaluated or can be implied based on that.
- L648: Does conventional hydrological model refer to a specific model (e.g. the DK-model)? Please clarify.
- L615: please specify the plans for CRN sensors, e.g. approx. number, locations etc.
- L672: “drought occurrence” not evaluated.
- Data availability section should include meteorological data, as well as the soil moisture, streamflow and groundwater data used for calibration and evaluation of the model.
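The ambiguity raised in the comment on L205-206 (a 20-year minimum combined with a 20% gap allowance) can be made concrete with a small sketch. It implements only one possible reading of the criterion, namely that the record must span at least 20 years and at most 20% of its months may lack observations. The function name, its arguments and the example record are all hypothetical.

```python
import numpy as np

def completeness(has_obs_by_month, min_years=20, max_gap_fraction=0.2):
    """Check a monthly availability mask against a span and gap criterion."""
    has_obs = np.asarray(has_obs_by_month, dtype=bool)
    span_years = has_obs.size / 12
    gap_fraction = 1.0 - has_obs.mean()
    return {
        "span_years": span_years,
        "data_years": has_obs.sum() / 12,
        "gap_fraction": gap_fraction,
        "passes": bool(span_years >= min_years and gap_fraction <= max_gap_fraction),
    }

# A 20-year record with every fifth month missing, i.e. 20% gaps.
record = np.ones(240, dtype=bool)
record[::5] = False
result = completeness(record)
print(result)
```

Under this reading the example record passes while containing only 16 years of actual data, which is exactly the interpretation the referee asks the authors to confirm or rule out.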
Citation: https://doi.org/10.5194/egusphere-2025-5373-RC3
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 318 | 154 | 28 | 500 | 54 | 40 |
Manuscript: egusphere-2025-5373
Journal: Hydrology and Earth System Sciences (HESS)
Title: Drought dynamics across the hydrological cycle – an extensive validation of the National Hydrological Model of Denmark
Authors: Raphael Schneider, Simon Stisen, Mark F. T. Hansen, Mie Andreasen, Bertel Nilsson, Klaus Hinsby, Hans Jørgen Henriksen, and Ida Karlsson Seidenfaden
Summary:
This manuscript presents a comprehensive and innovative evaluation of the DK-model for drought monitoring across multiple hydrological compartments. The authors compile an extensive observational dataset and analyse drought propagation from meteorological to soil moisture, streamflow, and groundwater droughts. This represents a substantial effort and addresses a highly relevant topic for hydrological drought research. The manuscript fits well within the scope of _Hydrology and Earth System Sciences_ and will be of interest to a broad readership.
The paper tackles an important problem and is based on a unique national-scale modelling and observational dataset. The main strengths are the comprehensive evaluation across hydrological compartments and the explicit focus on drought propagation. The main weaknesses concern (i) manuscript structure, (ii) conceptual clarity regarding drought definitions and thresholds, and (iii) limited quantitative discussion of model limitations and comparison to existing studies.
The study is generally well presented, with clear figures and a comprehensive modelling framework. However, the manuscript is currently quite lengthy and would benefit from a clearer separation of results and discussion, as well as a more in-depth discussion of the findings in comparison to other hydrological modelling systems used for drought assessment. After addressing the comments below, the manuscript would be suitable for publication.
Major Comments:
1. Manuscript structure and focus on drought evaluation
The manuscript would benefit from a clearer structure and stronger focus on the core drought-evaluation results. In particular:
- The selection and quality assurance of observational data should be moved fully into the _Methods_ section as data preprocessing.
- Detailed descriptions of observational datasets and preliminary performance results should be moved to the Appendix.
- This restructuring would shorten the manuscript and sharpen the focus on drought-related findings.
- Model performance results (e.g. Figure 1) are shown for all available stations, whereas drought evaluation is conducted only for a selected subset. For consistency and relevance: Performance results should be shown only for the stations used in the drought analysis. Results for all stations can be provided in the Appendix.
3. Definition of drought and drought thresholds
Drought is repeatedly defined as index values below 0, which corresponds to “dry anomalies” rather than drought. Drought classes are introduced in the Introduction, but only values <0 are analysed later.
- Please clarify the conceptual definition of drought used in the study.
- Why are thresholds for moderate or severe drought (e.g. SPI < −1) not analysed?
- Please ensure consistency between definitions, analysis, and interpretation.
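To illustrate the difference between "dry anomalies" (index < 0) and a threshold-based drought definition (e.g. SPI < −1), the following sketch extracts events as consecutive runs below a threshold and reports their start, duration and severity (accumulated deficit below the threshold). It is an illustrative approach, not the definition used in the manuscript; the function name and the example series are hypothetical.

```python
import numpy as np

def drought_events(index, threshold=-1.0):
    """Return (start, duration, severity) for each run with index < threshold."""
    index = np.asarray(index, dtype=float)
    below = index < threshold
    events = []
    start = None
    for i, flag in enumerate(below):
        if flag and start is None:
            start = i                      # a new event begins
        elif not flag and start is not None:
            deficit = float(np.sum(threshold - index[start:i]))
            events.append((start, i - start, deficit))
            start = None                   # the event has ended
    if start is not None:                  # event still running at series end
        deficit = float(np.sum(threshold - index[start:]))
        events.append((start, len(index) - start, deficit))
    return events

# Made-up monthly index values for demonstration.
spi = np.array([0.3, -0.4, -1.2, -1.6, -0.8, 0.1, -1.1, -0.2])
events_moderate = drought_events(spi, threshold=-1.0)
events_anomaly = drought_events(spi, threshold=0.0)
print(events_moderate)
print(events_anomaly)
```

In this toy series the threshold of 0 flags six of eight months in two long events, whereas the threshold of −1 isolates two short events. Event-based statistics such as these could then be compared between observed and simulated indices.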
4. Separation of results and discussion
Results are repeatedly introduced and interpreted in the Discussion section.
- All new results should be presented in the Results section.
- The Discussion should focus on interpretation, comparison with previous studies, and implications.
- Several subsections currently labelled as Discussion (e.g. Sections 4.2 and parts of 4.3) read as Results.
5. Discussion depth and comparison to other models
The discussion is relatively short compared to the breadth of the analysis.
- Please extend the discussion by comparing the DK-model performance to other hydrological models used for drought assessment (e.g. national-scale or continental-scale models).
- Strengths and limitations of the DK-model relative to these systems should be discussed more explicitly.
Specific Comments:
- “Climate projections indicate more frequent and intense droughts” (p.1, LL10):
For Northern Europe, projected drought changes are mixed in the literature. Please nuance this statement (e.g. more summer droughts, fewer winter droughts).
- Abstract (LL21–24):
Statements on model performance are very general. Please include quantitative results.
- p.3, LL71–74:
You state that fewer studies address groundwater and the entire hydrological cycle, yet cite more studies than for other compartments. Please clarify.
- p.4, LL100:
“Weichsel and Saale” – please clarify that these refer to glaciations.
- p.5, LL145:
Is there a more recent reference describing developments over the last three decades?
- p.5, LL154–155:
How thick is the unsaturated/root zone in the model? This is essential for interpreting soil moisture results.
- p.6, LL189–190:
Please quantify the impact of constant abstraction rates and provide supporting material (Appendix). A map showing trends in water consumption (Appendix) would help identify regions where this assumption is most critical.
- p.7, LL215:
Which lag times were tested?
- p.7, LL219–220:
Please clarify the criteria used for expert judgement in data evaluation.
- p.9, LL256ff:
Only introduce indices actually used in the study and explain why they were chosen over alternatives.
- p.9, LL275:
SPI values between 0 and −1 are not drought. Please correct and add references for drought class definitions.
- p.10, LL285:
Why are SMDI and SDI resampled from weekly to monthly?
- p.10, LL289–290:
Are results of the normality test shown?
- p.11, LL305:
Please specify the length of soil moisture time series.
- p.12, LL342:
Interpretation (“little bias”) should be moved to the Discussion and compared with literature.
- p.12, LL344–345:
A mean absolute error of 0.65 m relative to an average amplitude of 1.06 m appears large. Please discuss.
- Figure 1 (p.13):
Show performance only for stations used in drought analysis.
- p.13, LL356:
Consider renaming to “Quality-assured observational dataset for drought evaluation”.
- p.13, LL358:
Figure numbering and narrative would be clearer if the study area and data were introduced before calibration results.
- p.17, LL400–401:
Statements regarding reduced correlation during drought periods should be quantitatively tested and better explained.
- p.17, LL406:
Drought index <0 does not equal drought.
- p.19, LL435:
The colour coding of wells is already explained in the text; repetition is unnecessary.
- p.20, LL440:
Testing accumulation periods up to 60 months for soil moisture with only ~10-year records is not meaningful. Consider limiting this to ~36 months.
- p.25, LL507:
Why was May 2020 chosen? Please provide context.
- p.26, LL518–520:
The discussion starts very generally; consider linking more directly to your results.
- p.26, LL523:
The key research question should already be clearly stated in the Introduction.
- p.26, LL530:
Statements about future soil moisture observations are vague—please provide references or concrete examples.
- p.29, LL611–612:
Please specify the thickness of the root zone. Given that the modelled soil moisture represents the entire unsaturated zone (i.e. a larger volume than observations), one might expect the opposite behaviour. Please explain this discrepancy more clearly.
- p.31, LL656–658:
The model does not appear to capture soil moisture lag times well. Please adjust this statement accordingly.
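The concern about testing 60-month accumulation periods on roughly 10-year soil moisture records (comment on p.20, LL440) comes down to simple arithmetic, sketched below. The function name is hypothetical and the record length of 120 months is the approximate figure from the comment, not an exact value from the manuscript.

```python
def accumulation_sample_sizes(record_months, period):
    """Count overlapping and non-overlapping accumulation windows in a record."""
    overlapping = max(record_months - period + 1, 0)   # one value per month once the window is full
    independent = record_months // period              # windows that share no data
    return overlapping, independent

for period in (12, 36, 60):
    print(period, accumulation_sample_sizes(120, period))
```

For a 120-month record, a 60-month window yields 61 heavily overlapping values but only 2 independent windows, while a 36-month window yields 85 values and 3 independent windows. Correlations based on so few independent samples are hard to interpret, which supports the suggestion to cap the tested accumulation period.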