Self-limiting precipitation recycling during event-scale wet episodes in north-western China’s semi-arid transition zone
Abstract. Recent decades have seen a marked climatic wetting across north-western China’s semi-arid transition zone, yet the extent to which this intensification arises from local land–atmosphere feedback or from external moisture inflow remains uncertain. Using hourly station observations and ERA5 reanalysis for 2020–2024, we develop a process-based framework that links event-scale rainfall variability to the dynamic behaviour of precipitation recycling. Hourly recycling rates are derived through a two-reservoir moisture-tracking scheme, and multi-day wet episodes are identified to isolate transient feedback processes. Machine-learning surrogate models (CatBoost, XGBoost, ExtraTrees, Gradient Boosting) emulate the recycling rate as a function of meteorological conditions under a leave-one-event-out design, enabling counterfactual perturbation experiments in which precipitation intensity and key moisture variables are systematically scaled. A dimensionless percentage elasticity (η%) is introduced to quantify the relative response of recycled precipitation to rainfall enhancement. Results from nine regional events reveal a robustly negative η%, indicating that additional rainfall often suppresses rather than reinforces local recycling efficiency—a self-limiting wetting feedback. Cluster analysis distinguishes three physical regimes: (1) cool–moist episodes with initially strong but rapidly saturating coupling, (2) advection-dominated events with nearly linear, externally controlled responses, and (3) warm–dry episodes with weak coupling and near-zero elasticity. Collectively, these findings depict the atmosphere above north-western China as a self-stabilizing hydrological system in which increased precipitation does not necessarily strengthen, and may even weaken, local moisture recycling. 
The proposed event-scale elasticity framework provides a transferable diagnostic for short-term land–atmosphere coupling and for assessing hydrological resilience in arid and semi-arid regions.
General comments
The manuscript “Self-limiting precipitation recycling during event-scale wet episodes in north-western China’s semi-arid transition zone” by Li et al. addresses a timely and relevant topic: precipitation recycling and land–atmosphere feedbacks in a dryland environment. The question of precipitation efficiency during wet episodes in semi-arid climates is scientifically important, and precipitation efficiency in general is the focus of an ongoing debate. Studies addressing it, specifically in dryland regions, are therefore needed.
However, the study, in its current form, appears methodologically over-engineered relative to the available data, and the overall design feels contrived rather than data-driven. A very elaborate modelling and machine-learning framework is applied to a minimal observational basis (five years of data from five stations, yielding only nine events). The resulting analyses rely heavily on this extremely small sample. As a result, the conclusions one can draw are determined by methodological choices rather than robustly supported by large data sets.
Because the dataset is too limited to support the complexity of the framework and the generality of the interpretations, I do not believe the manuscript, in its current form, meets the standards of HESS.
Major comments:
1) The record length for the rainfall stations is very short. While this is acknowledged in the introduction, the record may still be too short to represent the climatology of the region. This is a major problem for (a) the analyses presented in Fig. 1, which are called climatological but actually represent 5 years, and (b) the elasticity results, whose interpretation is implicitly framed in terms of general wetting-feedback behaviour in north-western China. Because both the event sample and the baseline statistics come from a short, recent period, it is unclear how representative these feedback patterns are of longer-term climate conditions, particularly under drier or more variable regimes not captured in the record. The manuscript would benefit from a more explicit discussion of how the limited record length constrains the representativeness of the thresholds, regimes, and inferred feedback behaviours.
2) The methodological framework in the manuscript is very elaborate (moisture tracking + multiple ML surrogates + counterfactual elasticity experiments), yet the observational basis is limited to nine events in a small domain. The manuscript would benefit from clearer justification of why such a complex modelling structure is warranted for this dataset, and from discussion of overfitting risks and extrapolation limits in the surrogate-based perturbation experiments.
3) Related to the above, the description of the study area is very brief and does not adequately explain what makes this region scientifically distinctive. Why was this particular area selected? Beyond noting the presence of topography and low annual precipitation, the manuscript should better characterise the climatological and hydrometeorological context to clarify the broader relevance of the case study. Since, if I’m not mistaken, the study area is not even called by its name, it’s very hard for international readers to grasp what the characteristics of the region are. Could you help the readers?
4) The assumption (L154) that at t0 all atmospheric water is advected clearly disregards the role of moisture already present before the event. Could you justify it? Can you quantify how much of the moisture at t0 of the event was actually advected? For example, if you start the same exercise a few days before every event, what is the ratio between advected and local moisture at t0 of the event?
5) Sect. 2.3.3. Precipitation removal is considered by using ERA5 data. However, ERA5 skill in representing precipitation in drylands is inferior compared to other places. What would have happened if you had used measured precipitation to calculate the precipitation sink?
6) Sect. 2.7. What is the meaning of clustering 9 events into 3 clusters? Is it even necessary to assign the 9 events to different clusters? And can a classification be called “objective” when more parameters enter the analysis than there are events? Also, how did you arrive at the interpretation of the clusters as “cool–moist” etc.? The identified “regimes” may reflect sampling of a few specific synoptic situations rather than robust, recurrent hydrometeorological modes of the region. Please explain more clearly why this step is needed and how you interpret its results. Please also address the validity of the elasticity parameter when it is evaluated separately for each cluster (with n=2 in one of the clusters, and a leave-one-out design).
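To make the spin-up test requested in major comment 4 concrete, here is a minimal sketch based on a toy two-reservoir bucket model. It is not the authors' tracking scheme: the function `track_two_reservoirs`, the well-mixed removal of water, and all flux values are assumptions chosen purely for illustration. The tracker is started 72 h before the event with the whole column labelled as advected, and the local share of column water is read off at t0.

```python
import numpy as np

def track_two_reservoirs(evap, inflow, sink, w_local0, w_adv0):
    """Toy well-mixed two-reservoir tracker (hypothetical, illustration only).

    evap, inflow, sink: hourly fluxes in mm/h; `sink` lumps precipitation
    and advective outflow. Returns local and advected column water (mm)
    after each hourly step.
    """
    w_local, w_adv = float(w_local0), float(w_adv0)
    local, adv = [], []
    for e, a, s in zip(evap, inflow, sink):
        total = w_local + w_adv
        # Well-mixed assumption: sinks remove water from each reservoir
        # in proportion to its share of the column.
        frac_local = w_local / total if total > 0 else 0.0
        removed = min(s, total)
        w_local += e - removed * frac_local
        w_adv += a - removed * (1.0 - frac_local)
        local.append(w_local)
        adv.append(w_adv)
    return np.array(local), np.array(adv)

# Assumed constant fluxes over a 72 h spin-up that ends at the event start t0.
hours = 72
evap = np.full(hours, 0.1)    # local evaporation, mm/h
inflow = np.full(hours, 0.4)  # advective moisture inflow, mm/h
sink = np.full(hours, 0.5)    # precipitation + outflow, mm/h
local, adv = track_two_reservoirs(evap, inflow, sink, w_local0=0.0, w_adv0=15.0)

local_share_t0 = local[-1] / (local[-1] + adv[-1])
print(f"Local share of column water at t0: {local_share_t0:.2f}")
```

Even in this toy setting a local share builds up during the spin-up, so the column is no longer purely advected at t0. Repeating the comparison with the authors' scheme and the real fluxes would quantify how much the L154 assumption biases the recycling rates early in each event.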
Specific comments
L41: “representative” in what way? Similarly, in L80, “key… region” – why?
L44: It is not clear to me what you mean by “coupling”. Is it the coupling between the two days? Also, I don’t understand what this has to do with the confounding effects of complex topography. Please explain.
L80: “In summary” – In my view it’s better to start the last paragraph with a summary of the “why this study is needed” (e.g., in light of the XXX there’s a need to better understand…) without saying “in summary”. It’s a bit strange to read “in summary” when you read the introduction.
L77–79: It seems to me the description of where you can find the explanation of how you derive elasticity is not needed here. The next paragraph describes it again. If needed, you can stick with “(Section 2)” rather than the full explanation.
Figure 1: (a) The basemap in this panel is not clear. Please replace it with something more informative. I would suggest overlaying mean annual precipitation as contours on topography (colours), because these are the two factors that are mentioned in the text. The stations can then be given a label showing the P95 precipitation if needed, but make sure this label does not conflict with the P95 number.
L93: “1100 m asl” is not a topography gradient, but rather elevation. If the stress is on the gradient please describe it (e.g., X m drop over a distance of Y km from west to east, or something in that spirit).
L115: “Temporal gaps” – are there any in the data? If so, especially if they are significant, please describe data availability throughout the study period.
L117–118: It seems to me like the RMSE of ERA5’s precipitation is generally larger than P95. Is it still reasonable to say it “reproduces… magnitude of rainfall events”?
L204: “P95” – is this P95 of all days or of rain days?
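The distinction matters a great deal in a dry climate. As a minimal sketch on synthetic data (the wet-day threshold of 0.1 mm, the wet-day frequency, and the gamma parameters are my assumptions, not values from the manuscript), the two definitions can be compared as follows:

```python
import numpy as np

def p95_thresholds(daily_precip, wet_day_min=0.1):
    """Return (P95 over all days, P95 over wet days only), both in mm/day.

    wet_day_min is an assumed wet-day threshold in mm/day.
    """
    daily_precip = np.asarray(daily_precip, dtype=float)
    wet = daily_precip[daily_precip >= wet_day_min]
    p95_all = np.percentile(daily_precip, 95)
    p95_wet = np.percentile(wet, 95) if wet.size else np.nan
    return p95_all, p95_wet

# Synthetic five-year record with roughly 10 % wet days (assumed values only).
rng = np.random.default_rng(0)
n_days = 5 * 365
is_wet = rng.random(n_days) < 0.10
daily = np.where(is_wet, rng.gamma(shape=0.8, scale=6.0, size=n_days), 0.0)

p95_all, p95_wet = p95_thresholds(daily)
print(f"P95 (all days): {p95_all:.1f} mm, P95 (wet days): {p95_wet:.1f} mm")
```

When most days are dry, the all-day P95 sits near the middle of the wet-day distribution, so the two thresholds select very different sets of events. Stating which definition was used would make the event selection reproducible.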
L204–207: It’s not clear to me whether events must last >=2 d or whether 1-d events also qualify. If only >=2 d events qualify, can you explain why? If 1-d events are included, please make sure the text makes this clear to readers.
L227: What is the definition of “measurable precipitation”?
Sect. 2.6.4: Can you please briefly describe the models? Why were these specific models chosen, and how do they differ from one another?
L244–245: Did all these parameters go into the clustering algorithm? If so, what are the parameters in “etc.”? If not, which parameters went inside?
Section 2.8 states that counterfactual experiments scale groups of variables including P, E, TCWV, and, when relevant, T2M and sp. However, to my understanding, the provided code appears to perturb only precipitation (tp). Please clarify whether this reflects the implementation used for the reported results, and if not, describe how multi-variable, cluster-specific perturbations were actually applied.
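To illustrate why this matters, here is a minimal sketch of precipitation-only versus joint scaling. Everything in it is hypothetical: `toy_surrogate` is a stand-in for a trained model, the feature keys `tp`, `e`, and `tcwv` are assumed names, and the elasticity is a generic percent-change ratio that may differ from the manuscript's η%.

```python
import numpy as np

def scale_features(features, scale, variables):
    """Return a copy of `features` with the listed variables multiplied by `scale`."""
    perturbed = {name: np.array(vals, dtype=float) for name, vals in features.items()}
    for name in variables:
        perturbed[name] = perturbed[name] * scale
    return perturbed

def percent_elasticity(surrogate, features, scale, variables):
    """Relative change in mean predicted recycling per relative change in inputs."""
    base = np.mean(surrogate(scale_features(features, 1.0, [])))
    pert = np.mean(surrogate(scale_features(features, scale, variables)))
    return ((pert - base) / base) / (scale - 1.0)

def toy_surrogate(f):
    # Hypothetical stand-in for a trained model: the recycling ratio rises
    # with evaporation and falls with precipitation and column water vapour.
    return f["e"] / (f["e"] + 0.05 * f["tp"] + 0.01 * f["tcwv"])

# Made-up hourly samples for one event (illustrative values only).
features = {
    "tp": [1.0, 2.0, 4.0],
    "e": [0.10, 0.20, 0.15],
    "tcwv": [20.0, 25.0, 30.0],
}

eta_p_only = percent_elasticity(toy_surrogate, features, 1.10, ["tp"])
eta_joint = percent_elasticity(toy_surrogate, features, 1.10, ["tp", "e", "tcwv"])
print(f"precipitation only: {eta_p_only:.3f}, joint scaling: {eta_joint:.3f}")
```

In this toy case the precipitation-only elasticity is negative, whereas scaling all three variables together leaves the predicted ratio unchanged. The sign and size of the reported elasticity may therefore depend strongly on which variables were actually perturbed, which is why the implementation needs to be documented.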
L313–320: This section repeats what is already written in the data and methods. Please consider removing it. The same holds for L332–334.
Code availability: In the code provided, the function that calculates advection is missing (it is marked “omitted for brevity”). However, this is a core part of the study and would be beneficial for readers. I suggest you upload it.
Similarly, the line “ W_adv = compute_column_water_vapor(ds_era5['q'].isel(time=0), ...) # Simplified” is missing some parts.
AI use: Please make sure to comply with the AI use statement required by Copernicus: “Should you have used AI tools to generate (parts of) your manuscript, please describe the usage either in the Methods section or the Acknowledgements.”
Technical corrections
L129: Consider replacing the colon with a reference to Table 3.
L214: Missing period at the end of the sentence.
L276: Where is “P’” here?