the Creative Commons Attribution 4.0 License.
Human-Driven Runoff Decline and Hydrological Drought Intensification in Semi-arid Regions in the Last 40 Years Revealed by a Hybrid Physics-Deep Learning Framework
Abstract. Amid accelerating global warming, the intensification of hydrological drought in semi-arid regions has become a critical threat to water security. In this study, we present an integrated framework that couples a physics-based WRF-Hydro model with a deep-learning module (LSTM-Attention) for error correction and attribution analysis. Focusing on the Xilin River Basin, a representative semi-arid catchment, this approach quantifies the relative contributions of climate change and human activities to runoff variations across interannual and intra-annual scales over the 1980–2020 period. We further systematically assess the impact of human activities on hydrological drought. Results reveal a significant declining runoff trend of -23.79×10⁴ m³/a, with an abrupt shift in 2001. During the post-shift period (2001–2020), hydrological drought frequency surged from 7.54 % in the baseline period to 54.58 %. Rapid warming (0.5 °C/10a) caused a sustained increase in potential evapotranspiration, while snow water equivalent decreased significantly at 1.27 mm/a; these dual effects drove the overall runoff decline. April exhibited the most pronounced runoff reduction, accounting for 58.87 % of the annual decrease. In contrast, March runoff increased, primarily due to earlier snowmelt triggered by climate warming. Attribution analysis indicates that human activities were the dominant driver of runoff decline, contributing 61.04 %. These activities exerted dual effects on hydrological drought: alleviating it in 29.58 % of months but intensifying or triggering events in 38.34 % of months. The proposed integrated framework offers a robust tool for hydrological attribution analysis and underscores the critical role of human activities in sustainable water resource management.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6227', Anonymous Referee #1, 01 Mar 2026
AC1: 'Reply on RC1', Dongwei Liu, 11 Mar 2026
We sincerely thank the reviewer for the careful reading of our manuscript and for the detailed and constructive comments. We have carefully considered all comments and prepared a point-by-point response. Please see the attached PDF supplement for the detailed replies. These comments will also be addressed in the revised manuscript.
RC2: 'Comment on egusphere-2025-6227', Anonymous Referee #2, 09 Apr 2026
This manuscript presents a hybrid modelling framework coupling WRF-Hydro with an LSTM-Attention residual corrector to attribute runoff decline and hydrological drought intensification in the Xilin River Basin, Inner Mongolia, over 1980 to 2020. The topic is timely; the intra-annual seasonal attribution is a genuine contribution over most comparable studies and using deep learning strictly as a residual corrector rather than a process replacement is a thoughtful design choice. However, several major methodological concerns must be resolved before the findings can be accepted. I detail these below.
- LSTM hyperparameters are not reported. Number of layers, hidden units, dropout rate, learning rate, optimizer, batch size, sequence length, and early stopping criteria are nowhere in the paper. For an AI-integrated study, this is an unacceptable omission. It is impossible to evaluate whether the network is appropriately sized, whether it is overfitting the training period, or whether the residual learning is physically meaningful.
- The baseline period "no human influence" assumption is contradicted by the authors' own supplemental data. Fig. S4 clearly shows water withdrawal, sheep population, and grazing intensity all increasing monotonically from 1980 onward. The authors acknowledge this as a limitation but never estimate the resulting bias on the 61%/39% attribution split. Presenting this split as a point estimate without uncertainty bounds is scientifically overconfident. Even a simple sensitivity test varying the change-point year by ±2 years, or assuming a linearly increasing baseline human influence, would substantially strengthen the quantitative credibility of the central finding.
- WRF dynamical downscaling configuration is entirely absent. The paper states ERA5 was downscaled to 12.5 km via WRF, but no physics schemes, domain configuration, boundary condition settings, or validation of the downscaled meteorology against station observations are provided anywhere. Since this downscaled forcing drives the entire WRF-Hydro simulation, errors introduced here propagate into all subsequent results. This is a critical gap in methodological transparency.
- Physical consistency of the DL-corrected runoff is never verified. The LSTM corrector adds a learned residual to WRF-Hydro output. The paper never checks whether the corrected runoff series satisfies approximate water balance closure, whether corrections are physically bounded (e.g., no negative runoff), or whether the magnitude of corrections is physically interpretable. A neural network trained to minimize residuals can produce corrections that are statistically optimal but physically spurious, particularly in the simulation period where conditions may differ from training.
- Noah-MP parameterization options are undisclosed. Noah-MP has over 40 configurable physics options governing snow, ET, soil hydrology, and runoff generation. The specific combination used is never stated. Given that the paper's main climatic drivers are PET and snowmelt, both directly controlled by Noah-MP options, this is not a minor omission. Different Noah-MP configurations can yield substantially different attribution results.
- The model abbreviation switches between "WH-LA" (Fig. 5 caption) and "WH-DL" (throughout the text) with no explanation. Pick one and apply it consistently.
- Figure 7's caption is excessively long and explains the three drought classification types that should have been defined in the Methods section. Move the classification definitions to Section 3.4.
- The Discussion repeats several findings verbatim from the Results section (e.g., the 58.87% Snow-M contribution, the 61.04% human attribution). Discussion sections should interpret and contextualize findings, not restate them.
- The paper does not state the number of WRF-Hydro parameters that were calibrated, the calibration algorithm used, or the objective function. This is standard information for any distributed hydrological modelling paper.
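The physical-consistency checks requested in the comment on DL-corrected runoff above could be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the function name `check_corrected_runoff`, the threshold `max_rel_correction`, and the toy numbers are all assumptions, and the volume ratio is only a crude stand-in for a proper water balance closure test.

```python
import numpy as np

def check_corrected_runoff(q_sim, correction, max_rel_correction=1.0):
    """Simple plausibility diagnostics for residual-corrected runoff.

    q_sim: physics-model runoff (hypothetical input, any consistent unit)
    correction: residual predicted by the learned corrector (same unit)
    max_rel_correction: assumed threshold on |correction| / q_sim
    """
    q_sim = np.asarray(q_sim, dtype=float)
    correction = np.asarray(correction, dtype=float)
    q_corr = q_sim + correction

    # Time steps where the corrected runoff becomes negative.
    negative = q_corr < 0.0
    # Relative size of the correction; guard against division by zero.
    rel = np.abs(correction) / np.maximum(q_sim, 1e-9)
    large = rel > max_rel_correction

    return {
        "n_negative": int(negative.sum()),
        "n_large_correction": int(large.sum()),
        # Ratio of corrected to simulated total volume over the record.
        "volume_ratio": float(q_corr.sum() / q_sim.sum()),
        # Corrected series with negative values clipped to zero.
        "q_clipped": np.clip(q_corr, 0.0, None),
    }

# Toy values for illustration only.
report = check_corrected_runoff(
    q_sim=[2.0, 1.0, 0.5, 4.0],
    correction=[-0.5, -1.5, 0.1, 1.0],
)
```

Reporting such counts separately for the training and simulation periods would show whether the corrector behaves differently outside the conditions it was trained on.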
Recommendation
- Major revision required
Citation: https://doi.org/10.5194/egusphere-2025-6227-RC2
RC3: 'Comment on egusphere-2025-6227', Anonymous Referee #3, 14 Apr 2026
This manuscript presents a hybrid modelling framework that couples WRF-Hydro with an LSTM-Attention residual correction module to attribute runoff decline and hydrological drought intensification in the Xilin River Basin from 1980 to 2020. The topic is timely and relevant, particularly given increasing interest in combining physically based models with machine learning. I also appreciate the authors' effort to use deep learning strictly as a residual corrector rather than a full replacement of physical processes, as well as the attempt to examine intra-annual variability.
That said, I have several concerns regarding the conceptual basis and robustness of the attribution results. These issues do not necessarily invalidate the overall modelling framework, but they do significantly weaken the confidence in the central conclusions. I outline these below.
Major comments
1. Conceptual separation between climate change and human activities
The manuscript treats climate change and human activities as two independent drivers, but in practice these are tightly coupled. For example, irrigation and land use changes can modify evapotranspiration and even regional climate, while anthropogenic emissions are themselves a key driver of climate change. Given this, I find it difficult to interpret the reported attribution (e.g., the dominance of human contribution) as a clean physical separation. At present, it reads more like a methodological partition than a true process-based attribution.
2. Attribution framework likely absorbs model error into the “human” component
The approach defines human impact as the residual between observed runoff and model-simulated runoff under climate forcing. This implicitly assumes that the model fully captures climate-driven processes, which is a strong assumption. Any model bias, missing processes, or structural limitations will be included in this residual. Since the manuscript notes that WRF-Hydro has limitations in low-flow conditions, this is particularly concerning for drought analysis. It raises the possibility that part of the reported human contribution is compensating for model deficiencies.
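The concern can be made explicit with a short decomposition. The notation is assumed here for illustration and is not taken from the manuscript: $Q_{\mathrm{obs}}$ is observed runoff, $Q_{\mathrm{sim}}$ is the model runoff under climate forcing only, $Q_{\mathrm{clim}}$ is the true climate-driven runoff, and $\varepsilon$ is the model error.

```latex
\begin{aligned}
\widehat{\Delta Q}_{\mathrm{human}} &= Q_{\mathrm{obs}} - Q_{\mathrm{sim}},
\qquad Q_{\mathrm{sim}} = Q_{\mathrm{clim}} + \varepsilon \\
\Rightarrow\quad
\widehat{\Delta Q}_{\mathrm{human}} &= \left(Q_{\mathrm{obs}} - Q_{\mathrm{clim}}\right) - \varepsilon
= \Delta Q_{\mathrm{human}} - \varepsilon
\end{aligned}
```

Under these assumptions the estimated human component equals the true one only when $\varepsilon = 0$; any systematic low-flow bias in the model enters the human term with opposite sign.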
3. Baseline assumption of negligible human influence
The attribution relies heavily on the assumption that human impacts are negligible during the baseline period (1980–2000). However, it is unlikely that grazing, land use change, and water use were absent during this time. The authors do acknowledge this limitation, but the implications are not explored. Given how central this assumption is, I would strongly encourage the authors to at least test how sensitive their results are to this choice (e.g., shifting the baseline period or allowing for gradual human influence).
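A sensitivity test of this kind could look roughly like the sketch below, which shifts the change-point year and recomputes the human share each time. The attribution scheme inside `human_share` is an assumed, commonly used form and not necessarily the manuscript's; the series are synthetic, and the gradual-influence variant is not covered.

```python
import numpy as np

def human_share(years, q_obs, q_sim, change_year):
    """Share of the runoff change attributed to human activity.

    Assumed scheme (not taken from the manuscript):
      total change = mean(obs, post) - mean(obs, baseline)
      human change = mean(obs, post) - mean(sim, post)
    where q_sim is a climate-only simulation.
    """
    years = np.asarray(years)
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    base = years < change_year
    post = ~base
    total = q_obs[post].mean() - q_obs[base].mean()
    human = q_obs[post].mean() - q_sim[post].mean()
    return human / total

# Synthetic annual series for illustration only.
years = np.arange(1980, 2021)
rng = np.random.default_rng(0)
q_sim = 100.0 - 0.3 * (years - 1980) + rng.normal(0.0, 3.0, years.size)
q_obs = q_sim - np.where(years >= 2001, 15.0, 0.0)

# Vary the change point by +/- 2 years around 2001.
shares = {cy: human_share(years, q_obs, q_sim, cy) for cy in range(1999, 2004)}
spread = max(shares.values()) - min(shares.values())
```

The range of `shares` across change-point years gives a first indication of how much the reported split depends on the baseline definition.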
4. Interpretation of the deep learning correction
I understand the intention to preserve physical interpretability by using LSTM as a residual corrector. However, the residual itself is not purely physical. It likely contains a mix of model error, unresolved processes, and possibly human-induced signals. As a result, the corrected runoff series is not strictly a physically consistent output, which complicates its use for attribution. I think this point needs to be discussed more carefully.
5. Lack of explicit representation of human processes
Human impacts are inferred indirectly rather than explicitly represented in the model. Processes such as irrigation return flow, groundwater abstraction, or soil moisture–atmosphere feedback are not included. Without explicitly modelling these, it is difficult to interpret what the human contribution in this study physically represents.
6. Under-constrained attribution problem
More broadly, the study attempts to partition runoff changes into two components (climate vs human) based on a single observed variable. Without additional constraints, this is an underdetermined problem, and some degree of trade-off between components is unavoidable. This makes the quantitative attribution (e.g., 61% vs 39%) less robust than it may appear.
Minor comments
1. Terminology clarity
It would help to more clearly distinguish between anthropogenic climate change and local human interventions, as the current terminology can be confusing.
2. Uncertainty quantification
The attribution results are presented as point estimates without uncertainty bounds. Given the assumptions involved, some form of uncertainty or sensitivity analysis would greatly strengthen the manuscript.
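One simple way to attach bounds to the point estimate is a percentile bootstrap over years, sketched below. The function name, the attribution formula, and the synthetic inputs are assumptions for illustration; the resampling ignores serial correlation and model-structure uncertainty, so the resulting interval likely understates the real uncertainty.

```python
import numpy as np

def bootstrap_human_share(q_obs_base, q_obs_post, q_sim_post, n_boot=2000, seed=0):
    """Percentile bootstrap interval (95 %) for the human share of runoff change.

    Years are resampled with replacement within each period; post-period
    observations and simulations are resampled as pairs.
    """
    rng = np.random.default_rng(seed)
    q_obs_base = np.asarray(q_obs_base, dtype=float)
    q_obs_post = np.asarray(q_obs_post, dtype=float)
    q_sim_post = np.asarray(q_sim_post, dtype=float)
    shares = np.empty(n_boot)
    for i in range(n_boot):
        b = rng.integers(0, q_obs_base.size, q_obs_base.size)
        p = rng.integers(0, q_obs_post.size, q_obs_post.size)
        total = q_obs_post[p].mean() - q_obs_base[b].mean()
        human = q_obs_post[p].mean() - q_sim_post[p].mean()
        shares[i] = human / total
    lo, hi = np.percentile(shares, [2.5, 97.5])
    return float(lo), float(hi)

# Synthetic annual runoff for illustration only.
rng = np.random.default_rng(1)
q_obs_base = 100.0 + rng.normal(0.0, 3.0, 21)
q_sim_post = 94.0 + rng.normal(0.0, 3.0, 20)
q_obs_post = q_sim_post - 15.0 + rng.normal(0.0, 2.0, 20)

lo, hi = bootstrap_human_share(q_obs_base, q_obs_post, q_sim_post)
```

Reporting the split as an interval `[lo, hi]` rather than a single percentage would make the dependence on sampling variability visible.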
3. Sensitivity to methodological choices
The manuscript would benefit from exploring the sensitivity of the results to choices such as the baseline period, change-point detection, or model configuration.
Citation: https://doi.org/10.5194/egusphere-2025-6227-RC3
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 297 | 168 | 29 | 494 | 63 | 20 | 27 |
Title: Human-Driven Runoff Decline and Hydrological Drought Intensification in Semi-arid Regions in the Last 40 Years Revealed by a Hybrid Physics-Deep Learning Framework
The manuscript investigates runoff decline in semi-arid regions using a hybrid WRF-Hydro and LSTM-Attention framework. While the integration of physical modeling with error correction is interesting, several critical issues regarding data validation and methodology transparency must be addressed.
1. Scalability concerns
In distributed hydrological modeling, it is a standard and reasonable paradigm to conduct in-depth physical mechanism analysis within small- or medium-sized catchments. However, the authors are encouraged to supplement the discussion by clarifying to what extent the core findings can be generalized to other typical basins globally.
2. Validation of Downscaled Data
Line 102 mentions that dynamical downscaling techniques were applied to generate high-resolution meteorological forcing data. While downscaling increases spatial resolution, it does not inherently guarantee improved accuracy, and regional climate models are prone to systematic biases. The manuscript currently lacks a comparative validation between the 12.5 km WRF outputs and actual ground meteorological observations within the basin. Although an LSTM-Attention module is used for error correction, a rigorous physics-based framework requires a quality assessment of the initial meteorological input sources.
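A station-based validation of the downscaled forcing could report standard skill scores such as bias, RMSE, and correlation. The sketch below is a minimal example under assumed inputs: `validation_metrics` is a hypothetical helper, and the paired values are toy numbers, not data from the basin.

```python
import numpy as np

def validation_metrics(model, station):
    """Bias, RMSE and Pearson r between model output and station records.

    Pairs with a missing value (NaN) in either series are dropped.
    """
    model = np.asarray(model, dtype=float)
    station = np.asarray(station, dtype=float)
    valid = ~(np.isnan(model) | np.isnan(station))
    m, o = model[valid], station[valid]
    diff = m - o
    return {
        "n": int(valid.sum()),
        "bias": float(diff.mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
        "r": float(np.corrcoef(m, o)[0, 1]),
    }

# Toy values, e.g. monthly mean temperature at the grid cell nearest a station.
metrics = validation_metrics(
    model=[1.0, 2.5, 3.0, np.nan, 5.5],
    station=[1.5, 2.0, 3.5, 4.0, 5.0],
)
```

Computing these scores per variable (for example precipitation and temperature) and per season would indicate where the forcing is least reliable.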
3. Transparency of WRF-Hydro Parameter Calibration
Line 140 states that "parameters sensitive to hydrological processes were selected for calibration," but the specific parameters are not identified. It is recommended to include a table (in the Supplement) listing the names, physical meanings, initial ranges, and final calibrated values of the key parameters to enhance transparency and reproducibility.
4. Sample Size and Overfitting Prevention
According to Line 159, the training period covers 1980–1996. If monthly data were used, would the sample length be 204 (17 years × 12 months)? Could the authors clarify the exact sample size? Given such a limited dataset, what specific measures were implemented to prevent the deep learning model from overfitting? Additionally, please provide the key hyperparameters to facilitate reproducibility and further academic exchange.
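The sample count in question can be checked with a few lines. The look-back length `seq_len` below is a hypothetical value, since the manuscript's sequence length is not reported, and the window count assumes each training sample is a full window whose target is aligned with its last month.

```python
def monthly_sample_counts(start_year, end_year, seq_len):
    """Number of monthly time steps and of sliding-window samples.

    seq_len is an assumed look-back length in months.
    """
    n_months = (end_year - start_year + 1) * 12
    # One sample per full window of seq_len consecutive months.
    n_windows = max(n_months - seq_len + 1, 0)
    return n_months, n_windows

n_months, n_windows = monthly_sample_counts(1980, 1996, seq_len=12)
# n_months == 204, n_windows == 193
```

With only a couple of hundred samples, the number of trainable parameters in the network matters a great deal, which is why the hyperparameters are needed to judge the overfitting risk.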
5. Link Human Activity Data with Attribution Results
The authors list data regarding human activities in lines 355-365. Please clarify how these multi-source datasets were statistically linked to the "effect of human activities" derived from attribution analysis. While "Correlation analysis" is explicitly mentioned in the technical flowchart (Figure 2), there appears to be a lack of corresponding methodological description or results in the main text. Please confirm if this analysis was performed and provide detailed evidence.
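If a correlation analysis was indeed performed, it might resemble the sketch below, which relates one human-activity indicator to the attributed human effect on runoff. The series are hypothetical, the helper names are assumptions, and the rank function does not handle tied values.

```python
import numpy as np

def ranks(a):
    """Ranks 0..n-1 of the values in a; assumes no ties."""
    return np.argsort(np.argsort(a))

def pearson(x, y):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    return pearson(ranks(x), ranks(y))

# Hypothetical annual series for illustration only.
indicator = np.array([1.0, 1.2, 1.5, 1.9, 2.4, 3.0])           # e.g. water withdrawal
human_effect = np.array([-0.5, -0.9, -1.0, -1.8, -2.1, -3.5])  # attributed runoff change

r = pearson(indicator, human_effect)
rho = spearman(indicator, human_effect)
```

Because both kinds of series tend to trend over time, a high correlation alone would not establish causation; detrending or lagged analysis would be a reasonable complement.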