the Creative Commons Attribution 4.0 License.
Human-Driven Runoff Decline and Hydrological Drought Intensification in Semi-arid Regions in the Last 40 Years Revealed by a Hybrid Physics-Deep Learning Framework
Abstract. Amid accelerating global warming, the intensification of hydrological drought in semi-arid regions has become a critical threat to water security. In this study, we present an integrated framework that couples a physics-based WRF-Hydro model with a deep-learning module (LSTM-Attention) for error correction and attribution analysis. Focusing on the Xilin River Basin, a representative semi-arid catchment, this approach quantifies the relative contributions of climate change and human activities to runoff variations across interannual and intra-annual scales over the 1980–2020 period. We further systematically assess the impact of human activities on hydrological drought. Results reveal a significant declining runoff trend of -23.79×10⁴ m³/a, with an abrupt shift in 2001. During the post-shift period (2001–2020), hydrological drought frequency surged from 7.54 % in the baseline period to 54.58 %. Rapid warming (0.5 °C/10a) caused a sustained increase in potential evapotranspiration, while snow water equivalent decreased significantly at 1.27 mm/a; these dual effects drove the overall runoff decline. April exhibited the most pronounced runoff reduction, accounting for 58.87 % of the annual decrease. In contrast, March runoff increased, primarily due to earlier snowmelt triggered by climate warming. Attribution analysis indicates that human activities were the dominant driver of runoff decline, contributing 61.04 %. These activities exerted dual effects on hydrological drought: alleviating it in 29.58 % of months but intensifying or triggering events in 38.34 % of months. The proposed integrated framework offers a robust tool for hydrological attribution analysis and underscores the critical role of human activities in sustainable water resource management.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-6227', Anonymous Referee #1, 01 Mar 2026
AC1: 'Reply on RC1', Dongwei Liu, 11 Mar 2026
We sincerely thank the reviewer for the careful reading of our manuscript and for the detailed and constructive comments. We have carefully considered all comments and prepared a point-by-point response. Please see the attached PDF supplement for the detailed replies. These comments will also be addressed in the revised manuscript.
RC2: 'Comment on egusphere-2025-6227', Anonymous Referee #2, 09 Apr 2026
This manuscript presents a hybrid modelling framework coupling WRF-Hydro with an LSTM-Attention residual corrector to attribute runoff decline and hydrological drought intensification in the Xilin River Basin, Inner Mongolia, over 1980–2020. The topic is timely, the intra-annual seasonal attribution is a genuine contribution over most comparable studies, and using deep learning strictly as a residual corrector rather than a process replacement is a thoughtful design choice. However, several major methodological concerns must be resolved before the findings can be accepted. I detail these below.
- LSTM hyperparameters are not reported. Number of layers, hidden units, dropout rate, learning rate, optimizer, batch size, sequence length, and early stopping criteria are nowhere in the paper. For an AI-integrated study, this is an unacceptable omission. It is impossible to evaluate whether the network is appropriately sized, whether it is overfitting the training period, or whether the residual learning is physically meaningful.
- The baseline period "no human influence" assumption is contradicted by the authors' own supplemental data. Fig. S4 clearly shows water withdrawal, sheep population, and grazing intensity all increasing monotonically from 1980 onward. The authors acknowledge this as a limitation but never estimate the resulting bias on the 61%/39% attribution split. Presenting this split as a point estimate without uncertainty bounds is scientifically overconfident. Even a simple sensitivity test varying the change-point year by ±2 years, or assuming a linearly increasing baseline human influence, would substantially strengthen the quantitative credibility of the central finding.
- WRF dynamical downscaling configuration is entirely absent. The paper states ERA5 was downscaled to 12.5 km via WRF, but no physics schemes, domain configuration, boundary condition settings, or validation of the downscaled meteorology against station observations are provided anywhere. Since this downscaled forcing drives the entire WRF-Hydro simulation, errors introduced here propagate into all subsequent results. This is a critical gap in methodological transparency.
- Physical consistency of the DL-corrected runoff is never verified. The LSTM corrector adds a learned residual to WRF-Hydro output. The paper never checks whether the corrected runoff series satisfies approximate water balance closure, whether corrections are physically bounded (e.g., no negative runoff), or whether the magnitude of corrections is physically interpretable. A neural network trained to minimize residuals can produce corrections that are statistically optimal but physically spurious, particularly in the simulation period where conditions may differ from training.
- Noah-MP parameterization options are undisclosed. Noah-MP has over 40 configurable physics options governing snow, ET, soil hydrology, and runoff generation. The specific combination used is never stated. Given that the paper's main climatic drivers are PET and snowmelt, both directly controlled by Noah-MP options, this is not a minor omission. Different Noah-MP configurations can yield substantially different attribution results.
- The model abbreviation switches between "WH-LA" (Fig. 5 caption) and "WH-DL" (throughout the text) with no explanation. Pick one and apply it consistently.
- Figure 7's caption is excessively long and explains the three drought classification types that should have been defined in the Methods section. Move the classification definitions to Section 3.4.
- The Discussion repeats several findings verbatim from the Results section (e.g., the 58.87% Snow-M contribution, the 61.04% human attribution). Discussion sections should interpret and contextualize findings, not restate them.
- The paper does not state the number of WRF-Hydro parameters that were calibrated, the calibration algorithm used, or the objective function. This is standard information for any distributed hydrological modelling paper.
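The change-point sensitivity test suggested above could be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the common simulation-based separation in which the climate effect is the difference between simulated natural runoff in the post-shift period and observed baseline runoff, and the human effect is the remaining gap to observed post-shift runoff. The function names, the `q_nat` series (standing in for climate-only model output), and the synthetic numbers are all hypothetical.

```python
import numpy as np

def attribution_split(years, q_obs, q_nat, change_year):
    """Return (climate_share, human_share) in percent for one change-point year.

    Assumed decomposition (not confirmed as the manuscript's formula):
      total   = mean(obs, post) - mean(obs, baseline)
      climate = mean(nat, post) - mean(obs, baseline)
      human   = mean(obs, post) - mean(nat, post)
    """
    years = np.asarray(years)
    q_obs = np.asarray(q_obs, dtype=float)
    q_nat = np.asarray(q_nat, dtype=float)
    base = years < change_year          # baseline: years before the shift
    post = ~base                        # post-shift period
    q_base = q_obs[base].mean()
    total = q_obs[post].mean() - q_base
    climate = q_nat[post].mean() - q_base
    human = q_obs[post].mean() - q_nat[post].mean()
    return 100.0 * climate / total, 100.0 * human / total

def change_point_sensitivity(years, q_obs, q_nat, center_year, half_width=2):
    """Recompute the split for every change-point year in center +/- half_width."""
    return {
        cy: attribution_split(years, q_obs, q_nat, cy)
        for cy in range(center_year - half_width, center_year + half_width + 1)
    }

# Synthetic placeholder series, NOT data from the manuscript.
years = np.arange(1980, 2021)
q_obs = np.where(years < 2001, 100.0, 60.0)   # observed annual runoff
q_nat = np.where(years < 2001, 100.0, 85.0)   # hypothetical climate-only runoff

for cy, (c, h) in change_point_sensitivity(years, q_obs, q_nat, 2001).items():
    print(cy, round(c, 1), round(h, 1))
```

Reporting the range of the human share across the tested years, rather than a single value, would give a first-order uncertainty bound on the attribution split.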
Recommendation
- Major revision required
Citation: https://doi.org/10.5194/egusphere-2025-6227-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 213 | 113 | 22 | 348 | 39 | 13 | 15 |
Title: Human-Driven Runoff Decline and Hydrological Drought Intensification in Semi-arid Regions in the Last 40 Years Revealed by a Hybrid Physics-Deep Learning Framework
The manuscript investigates runoff decline in semi-arid regions using a hybrid WRF-Hydro and LSTM-Attention framework. While the integration of physical modeling with error correction is interesting, several critical issues regarding data validation and methodological transparency must be addressed.
1. Scalability concerns
In distributed hydrological modeling, it is a standard and reasonable paradigm to conduct in-depth physical mechanism analysis within small- or medium-sized catchments. However, the authors are encouraged to supplement the discussion by clarifying to what extent the core findings can be generalized to other typical basins globally.
2. Validation of Downscaled Data
Line 102 mentions that dynamical downscaling techniques were applied to generate high-resolution meteorological forcing data. While downscaling increases spatial resolution, it does not inherently guarantee improved accuracy, and regional climate models are prone to systematic biases. The manuscript currently lacks a comparative validation between the 12.5 km WRF outputs and actual ground meteorological observations within the basin. Although an LSTM-Attention module is used for error correction, a rigorous physics-based framework requires a quality assessment of the initial meteorological input sources.
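As an illustration of the kind of comparison requested here, the sketch below computes three standard skill metrics (mean bias, RMSE, Pearson correlation) between a downscaled series and a co-located station series. The function name and input layout are hypothetical; it assumes both series are already aligned in time and extracted at the station location.

```python
import numpy as np

def validation_metrics(sim, obs):
    """Mean bias, RMSE, and Pearson r between simulated and observed series.

    Pairs where either value is NaN are dropped before computing the metrics.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(sim) | np.isnan(obs))
    sim, obs = sim[valid], obs[valid]
    diff = sim - obs
    return {
        "bias": float(np.mean(diff)),                 # positive = model too high
        "rmse": float(np.sqrt(np.mean(diff ** 2))),
        "r": float(np.corrcoef(sim, obs)[0, 1]),
    }

# Toy values only: a model that is uniformly 1 unit too high.
obs_demo = [1.0, 2.0, 3.0, 4.0]
sim_demo = [2.0, 3.0, 4.0, 5.0]
print(validation_metrics(sim_demo, obs_demo))
```

Applied per variable (for example temperature and precipitation) and per station, a table of such metrics would document how much systematic bias enters WRF-Hydro through the forcing.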
3. Transparency of WRF-Hydro Parameter Calibration
Line 140 states that "parameters sensitive to hydrological processes were selected for calibration," but the specific parameters are not identified. It is recommended to include a table (in the Supplement) listing the names, physical meanings, initial ranges, and final calibrated values of the key parameters to enhance transparency and reproducibility.
4. Sample Size and Overfitting Prevention
According to Line 159, the training period covers 1980–1996. If monthly data were used, this gives only 204 samples (17 years × 12 months); could the authors confirm the exact sample size? Given such a limited dataset, what specific measures were implemented to prevent the deep learning model from overfitting? Additionally, please provide the key hyperparameters to facilitate reproducibility and further academic exchange.
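The sample-size arithmetic can be made explicit with a short sketch. The sliding-window count assumes the LSTM consumes fixed-length input sequences; the sequence length of 12 used below is purely hypothetical, since the manuscript does not report it.

```python
def n_monthly_samples(start_year, end_year):
    """Number of monthly time steps in an inclusive range of years."""
    return (end_year - start_year + 1) * 12

def n_windows(n_samples, seq_len):
    """Number of sliding windows of length seq_len that fit in n_samples steps."""
    return max(n_samples - seq_len + 1, 0)

n_train = n_monthly_samples(1980, 1996)
print(n_train)                 # 204
print(n_windows(n_train, 12))  # 193, for a hypothetical 12-month sequence length
```

With any realistic sequence length the number of distinct training sequences falls below 204, which is why the regularization settings matter here.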
5. Link Human Activity Data with Attribution Results
The authors list data regarding human activities in Lines 355–365. Please clarify how these multi-source datasets were statistically linked to the "effect of human activities" derived from the attribution analysis. While "Correlation analysis" is explicitly mentioned in the technical flowchart (Figure 2), the main text lacks a corresponding methodological description or results. Please confirm whether this analysis was performed and provide detailed evidence.
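One way the requested link could be documented is a correlation between each human-activity indicator and the model-derived human-induced runoff change. The sketch below is only an illustration under stated assumptions: the function name is hypothetical, the inputs are assumed to be annual series of equal length, and a permutation test is used in place of whatever significance test the authors may have applied.

```python
import numpy as np

def pearson_with_permutation(x, y, n_perm=2000, seed=0):
    """Pearson r between x and y with a two-sided permutation p-value."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    rng = np.random.default_rng(seed)
    # Null distribution: correlation after shuffling x, which breaks any link to y.
    perm_r = np.array(
        [np.corrcoef(rng.permutation(x), y)[0, 1] for _ in range(n_perm)]
    )
    p = (np.sum(np.abs(perm_r) >= abs(r)) + 1) / (n_perm + 1)
    return float(r), float(p)

# Toy series only, NOT data from the manuscript.
indicator = np.arange(20, dtype=float)       # e.g. a rising activity index
human_effect = -2.0 * indicator + 3.0        # e.g. human-induced runoff change
r, p = pearson_with_permutation(indicator, human_effect)
print(round(r, 3), p < 0.05)
```

Because two series that both trend over time can correlate spuriously, repeating the test on detrended series would make such evidence more convincing.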