Self-limiting precipitation recycling during event-scale wet episodes in north-western China’s semi-arid transition zone

Li, Ruolin; Cui, Yang; Feng, Qi

doi:10.5194/egusphere-2025-5314

Preprints

https://doi.org/10.5194/egusphere-2025-5314

24 Nov 2025

Ruolin Li, Yang Cui, and Qi Feng

Abstract. Recent decades have seen a marked climatic wetting across north-western China’s semi-arid transition zone, yet the extent to which this intensification arises from local land–atmosphere feedback or from external moisture inflow remains uncertain. Using hourly station observations and ERA5 reanalysis for 2020–2024, we develop a process-based framework that links event-scale rainfall variability to the dynamic behaviour of precipitation recycling. Hourly recycling rates are derived through a two-reservoir moisture-tracking scheme, and multi-day wet episodes are identified to isolate transient feedback processes. Machine-learning surrogate models (CatBoost, XGBoost, ExtraTrees, Gradient Boosting) emulate the recycling rate as a function of meteorological conditions under a leave-one-event-out design, enabling counterfactual perturbation experiments in which precipitation intensity and key moisture variables are systematically scaled. A dimensionless percentage elasticity (η%) is introduced to quantify the relative response of recycled precipitation to rainfall enhancement. Results from nine regional events reveal a robustly negative η%, indicating that additional rainfall often suppresses rather than reinforces local recycling efficiency—a self-limiting wetting feedback. Cluster analysis distinguishes three physical regimes: (1) cool–moist episodes with initially strong but rapidly saturating coupling, (2) advection-dominated events with nearly linear, externally controlled responses, and (3) warm–dry episodes with weak coupling and near-zero elasticity. Collectively, these findings depict the atmosphere above north-western China as a self-stabilizing hydrological system in which increased precipitation does not necessarily strengthen, and may even weaken, local moisture recycling. 
The proposed event-scale elasticity framework provides a transferable diagnostic for short-term land–atmosphere coupling and for assessing hydrological resilience in arid and semi-arid regions.
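
The abstract does not spell out how η% is computed. As a minimal sketch, assuming it follows the usual percentage-elasticity form (percent change in the emulated recycling rate divided by percent change in imposed rainfall), the counterfactual scaling could look like the following. The function names, the 10 % scaling factor, and the toy surrogate `toy_recycling_rate` are hypothetical stand-ins, not the authors' trained models or their exact definition.

```python
def percentage_elasticity(surrogate, baseline_precip, scale=1.10):
    """Percent change in surrogate output per percent change in rainfall.

    Assumed form: eta = (dRR / RR) / (dP / P); not taken from the manuscript.
    """
    base = surrogate(baseline_precip)
    perturbed = surrogate(baseline_precip * scale)
    pct_response = (perturbed - base) / base * 100.0
    pct_forcing = (scale - 1.0) * 100.0
    return pct_response / pct_forcing


def toy_recycling_rate(precip_mm_h, evap_mm_h=0.2):
    """Hypothetical surrogate: the local share shrinks as rainfall intensifies."""
    return evap_mm_h / (evap_mm_h + precip_mm_h)


eta = percentage_elasticity(toy_recycling_rate, baseline_precip=2.0)
print(f"eta% = {eta:.3f}")
```

In this toy case the value is negative, which is the sign the abstract associates with self-limiting recycling; a real application would replace the toy function with a fitted surrogate evaluated on held-out events.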

Received: 31 Oct 2025 – Discussion started: 24 Nov 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2025-5314', Anonymous Referee #1, 03 Feb 2026

General comments

The manuscript “Self-limiting precipitation recycling during event-scale wet episodes in north-western China’s semi-arid transition zone” by Li et al. addresses a timely and relevant topic: precipitation recycling and land–atmosphere feedbacks in a dryland environment. The question of precipitation efficiency during wet episodes in semi-arid climates is scientifically important, and precipitation efficiency in general is the focus of an ongoing debate. Studies addressing it, specifically in dryland regions, are therefore needed.

However, the study, in its current form, appears methodologically over-engineered relative to the available data, and the overall design feels contrived rather than data-driven. A very elaborate modelling and machine-learning framework is applied to a minimal observational basis (five years of data from five stations, yielding only nine events). The resulting analyses rely heavily on this extremely small sample. As a result, the conclusions one can draw are determined by methodological choices rather than robustly supported by large data sets.

Because the dataset is too limited to support the complexity of the framework and the generality of the interpretations, I do not believe the manuscript, in its current form, meets the standards of HESS.

Major comments:

1) The record length for the rainfall stations is very short. While this fact is presented in the introduction, it may still be too short to represent the climatology of the region. This is a major problem for the analyses presented in Fig. 1 (they are called climatological, but actually represent 5 years). Additionally, elasticity results and their interpretation are implicitly discussed in terms of general wetting-feedback behaviour in north-western China. However, because both the event sample and the baseline statistics come from a short, recent period, it is unclear how representative these feedback patterns are of longer-term climate conditions, particularly under drier or more variable regimes not captured in the record. The manuscript would benefit from a more explicit discussion of how the limited record length constrains the representativeness of the thresholds, regimes, and inferred feedback behaviours.

2) The methodological framework in the manuscript is very elaborate (moisture tracking + multiple ML surrogates + counterfactual elasticity experiments), yet the observational basis is limited to nine events in a small domain. The manuscript would benefit from clearer justification of why such a complex modelling structure is warranted for this dataset, and from discussion of overfitting risks and extrapolation limits in the surrogate-based perturbation experiments.

3) Related to the above, the description of the study area is very brief and does not adequately explain what makes this region scientifically distinctive. Why was this particular area selected? Beyond noting the presence of topography and low annual precipitation, the manuscript should better characterise the climatological and hydrometeorological context to clarify the broader relevance of the case study. Since, if I’m not mistaken, the study area is not even called by its name, it’s very hard for international readers to grasp what the characteristics of the region are. Could you help the readers?

4) The assumption (L154) that at t0 all atmospheric water is advected clearly disregards the role of previous moisture in the analysis. Could you justify it? Can you quantify how much of the moisture at t0 of the event was actually advected? For example, if you start the same exercise a few days before every event, what is the ratio between advected and local moisture at t0 of the event?

5) Sect. 2.3.3. Precipitation removal is considered by using ERA5 data. However, ERA5 skill in representing precipitation in drylands is inferior compared to other places. What would have happened if you had used measured precipitation to calculate the precipitation sink?

6) Sect. 2.7. What is the meaning of clustering of 9 events, and with 3 clusters? Is it even needed to represent the 9 events in different clusters? And to make an “objective” classification with more parameters going into the analysis compared to the number of events? Also, how did you get the interpretation of the clusters “cool–moist” etc.? The identified “regimes” may reflect sampling of a few specific synoptic situations rather than robust, recurrent hydrometeorological modes of the region. Please describe better why this process is needed and how you interpret its results. Please address issues like the validity of the elasticity parameter evaluated separately for each cluster (with n=2 in one of the clusters, and a leave-one-out design).

Specific comments

L41: “representative” in what way? Similarly, in L80, “key… region” – why?

L44: It is not clear to me what you mean with “coupling”. Is it the coupling between the two days? Also, I don’t understand what this has to do with the confounding effects of complex topography. Please explain.

L80: “In summary” – In my view it’s better to start the last paragraph with a summary of the “why this study is needed” (e.g., in light of the XXX there’s a need to better understand…) without saying “in summary”. It’s a bit strange to read “in summary” when you read the introduction.

L77–79: It seems to me the description of where you can find the explanation of how you derive elasticity is not needed here. The next paragraph describes it again. If needed, you can stick with “(Section 2)” rather than the full explanation.

Figure 1: (a) The basemap in this panel is not clear. Please replace with something more informative. I would suggest overlying mean annual precipitation as contours over topography (colours), because these are the two factors that are mentioned in the text. Then the stations can be given with a label showing the P95 precipitation if needed, but make sure this label does not conflict with the P95 number.

L93: “1100 m asl” is not a topography gradient, but rather elevation. If the stress is on the gradient please describe it (e.g., X m drop over a distance of Y km from west to east, or something in that spirit).

L115: “Temporal gaps” – are there any in the data? If so, especially if they are significant, please describe data availability throughout the study period.

L117–118: It seems to me like the RMSE of ERA5’s precipitation is generally larger than P95. Is it still reasonable to say it “reproduces… magnitude of rainfall events”?

L204: “P95” – is this P95 of all days or of rain days?

L207–204: It’s not clear to me whether events need to be >=2 d in duration, or they can also be 1-d events. If it’s only >=2 d, can you explain why? If it’s not, make sure this is what readers understand from the text.

L227: What is the definition of “measurable precipitation”?

Sect. 2.6.4: Can you please describe a bit the models? Why these specific models and what are the differences between the different models?

L244–245: Did all these parameters go into the clustering algorithm? If so, what are the parameters in “etc.”? If not, which parameters went inside?

Section 2.8 states that counterfactual experiments scale groups of variables including P, E, TCWV, and, when relevant, T2M and sp. However, to my understanding, the provided code appears to perturb only precipitation (tp). Please clarify whether this reflects the implementation used for the reported results, and if not, describe how multi-variable, cluster-specific perturbations were actually applied.

L313–320: This section repeats what is already written in the data and methods. Please consider removing it. The same holds for L332–334.

Code availability: In the code provided, the function that calculates advection is missing (it is marked “omitted for brevity”). However, this is a core part of the study and would be beneficial for the readers. I suggest you upload it.

Similarly, the line “W_adv = compute_column_water_vapor(ds_era5['q'].isel(time=0), ...) # Simplified” is missing some parts.

AI use: Please make sure to comply with the AI use statement required by Copernicus: “Should you have used AI tools to generate (parts of) your manuscript, please describe the usage either in the Methods section or the Acknowledgements.”

Technical corrections

L129: Consider replacing the colon with a reference to Table 3.

L214: Missing period at the end of the sentence.

L276: Where is “P’” here?

Citation: https://doi.org/10.5194/egusphere-2025-5314-RC1
- AC1: 'Reply on RC1', Ruolin Li, 06 Feb 2026
  
  On behalf of all co-authors, we sincerely thank Referee #1 for the careful, thorough, and constructive evaluation of our manuscript. We particularly appreciate the clear articulation of the central concerns regarding the balance between methodological complexity and the observational basis of our study.
  Regarding the scope and representativeness of the analysis, we fully acknowledge that the 5-year record and nine identified events constrain the representativeness and generality of the inferred feedback behaviour. Our intention was not to derive climatological regimes or broadly applicable thresholds for north-western China. Rather, the study was designed as an exploratory, event-focused analysis—using a small set of well-defined wet episodes as diagnostic case studies to examine the internal consistency and potential self-limiting characteristics of precipitation recycling under specific atmospheric and land-surface conditions.
  We recognize that this intent is not sufficiently clear in the current version. In a revised manuscript, we will explicitly narrow the scope of interpretation, clarify the exploratory and conditional nature of the analytical framework, and more transparently discuss the limitations associated with the short record, small sample size, and extrapolation constraints.
  On the methodological framework and analytical choices: We agree that stronger justification and clearer positioning are required given the limited sample size. The manuscript will be revised to emphasize methodological transparency, uncertainty quantification, and the conditional nature of the counterfactual experiments. We will also reconsider the role and presentation of certain analytical components, with particular attention to their interpretability and statistical robustness under the available data constraints.
  On the study domain and key assumptions: We will substantially strengthen the description and scientific motivation of the study domain and will more explicitly discuss the implications of key methodological choices—including data selection and initial-condition assumptions—for the interpretation of the recycling estimates.
  We believe that addressing these concerns through structural revision and clearer framing—rather than expanding the empirical scope—will better align the manuscript with the standards and expectations of HESS. A comprehensive point-by-point response with a revised manuscript will be submitted following the discussion period.
  We thank the referee again for helping us sharpen the focus and interpretation of this study.
  
  Citation: https://doi.org/10.5194/egusphere-2025-5314-AC1
RC2:
'Comment on egusphere-2025-5314', Anonymous Referee #2, 15 Apr 2026
In this manuscript, the authors use ERA5 and precipitation data to study the impact of precipitation recycling during high-rainfall episodes in north-western China. Although the topic of changes in moisture recycling is interesting and societally relevant, the authors make several methodological choices that are, in my opinion, not justified. In addition, the small observational data set (9 events over 5 years) limits the ability to draw robust conclusions. I therefore believe that the current manuscript is not suitable for publication in HESS. Please see my specific concerns below:
Aim of the study: In the introduction, the authors mention ‘it is still unclear whether such intensification originates from stronger local land-atmosphere feedbacks or from enhanced advection’ (L35) and ‘… to investigate how precipitation recycling responds during individual rainfall episodes under an overall wetter climatic background’ (L45). However, in this study, the authors only study rainfall events of the last 5 years, which makes it difficult to determine how the impact of moisture recycling and advection changed compared to a previously ‘drier’ climate. Could the authors explain how the methodology used provides insights into the questions/problems mentioned in the introduction? In addition, the ERA5 data used in the study provide a nice opportunity to show the wetting trend in this specific study region, rather than relying on existing literature.

Choice of study area: Why did the authors specifically choose this (rather small) study area? Is rainfall data only available for this study region? How representative is it for other regions within north-western China?

Hourly calculation of precipitation recycling rate (section 2.3): the authors make several assumptions in the moisture tracking framework that are, in my opinion, not justified. Could the authors justify the following assumptions (preferably with existing literature):
The recycling rate (RR) is calculated at hourly scale and per ERA5 grid cell (25 × 25 km). In general, the recycling rate depends on the size of the study area, because larger regions are more likely to contain precipitation from locally evaporated moisture. At a global scale, the moisture recycling rate is 1, while at a point source it is 0 (Eltahir & Bras, 1996). At a scale of 25 km, evaporated moisture is very likely to precipitate outside of the region, which makes the RR uncertain. In addition, evaporation does not turn instantly into precipitation. Usually, RR is calculated for longer time scales, so an hourly RR should be better justified.

It is not explicitly mentioned in the methodology, but if all 137 ERA5 vertical levels are used, the vertical column reaches up to about 80 km. At the same time, equations 2 and 3 (L160) implicitly assume a well-mixed vertical layer, which usually only holds within the atmospheric boundary layer (up to about 1 km). The assumption that precipitation is removed proportionally to the layer-wise mass should therefore be better justified.

At the same time, evaporation is only added to the lowest pressure layer of Wloc. If advection is only horizontal, which appears to be the case based on equation 5, the moisture from evaporation is not distributed vertically. As a result, precipitation removes moisture from all vertical layers, but evaporation only adds to the model layer closest to the surface.

The authors assume that Wloc=0 at t=0 (L154). This suggests that at the start of the event, the atmosphere does not contain locally evaporated moisture. It seems unlikely to me that there was no land-atmosphere coupling before the onset of the event. Could the authors justify why they make this assumption, and why no spin-up period was used?

Machine learning surrogate modelling and cluster analysis: The methodology is rather complex for the amount of available data. Training a machine learning model with many variables on just nine events creates a high risk of overfitting and yields little statistical significance. Similar reasoning applies to the cluster analysis. Each cluster has only two or three events. In addition, it would be useful to provide some literature on why there would be different regimes of land-atmosphere coupling under high precipitation events. The methodology mentions ‘the optimal cluster number K=3 was chosen based on silhouette analysis and physical interpretability’ (L247). How was this physical interpretability established? Also, 3 clusters with 9 points are very unstable. What if we add or remove one point?
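
The add-or-remove-one-point question above can be checked directly with a leave-one-event-out stability test. The sketch below is a minimal, standard-library illustration: it clusters all events, reclusters after dropping each event in turn, and reports the Rand index between the two partitions of the remaining events. The k-means routine, its deterministic initialisation, and the nine two-feature `events` are hypothetical placeholders, not the manuscript's implementation or data.

```python
from itertools import combinations


def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centres.append(max(points, key=lambda p: min(dist2(p, c) for c in centres)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def rand_index(a, b):
    """Share of point pairs on which two partitions agree (1.0 = identical)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def loo_stability(points, k):
    """Rand index between full-sample and leave-one-out partitions, per event."""
    full = kmeans(points, k)
    scores = []
    for drop in range(len(points)):
        subset = points[:drop] + points[drop + 1:]
        reference = full[:drop] + full[drop + 1:]
        scores.append(rand_index(reference, kmeans(subset, k)))
    return scores


# Hypothetical standardised features for nine events (illustrative values only).
events = [(-1.2, 0.8), (-0.9, 1.1), (-1.0, 0.3), (0.1, 1.5), (0.4, 1.2),
          (0.2, 0.9), (1.1, -0.9), (1.3, -1.2), (0.8, -0.4)]
scores = loo_stability(events, k=3)
print([round(s, 2) for s in scores])
```

Scores well below 1 for several left-out events would indicate that the partition depends on individual cases, which is the instability the comment is concerned about.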

Alternative methodologies: Could the authors motivate the choice for the methodology? Maybe other methods such as atmospheric models with tracers (e.g. WRF-WVT (Insua-Costa & Miguez-Macho, 2018)) or Lagrangian transport modelling (e.g. FLEXPART (Bakels et al., 2024)) would be more suitable for this case. Otherwise, the authors could extend the nine cases by covering the full ERA5 data record. This would require retrieving also the precipitation from ERA5, but the authors mention that ERA5 is suitable in this case study (L117). This would make the machine learning a bit more robust. In addition, it would also be informative to simply plot time series of precipitation, evaporation, moisture recycling rate and advection. This way, the physical processes may be simpler to grasp.
Citation: https://doi.org/10.5194/egusphere-2025-5314-RC2
- AC2: 'Reply on RC2', Ruolin Li, 20 Apr 2026
  
  Response to Referee #2
  We thank Referee #2 for the critical and constructive review. The comments identify real tensions in the manuscript — most centrally, the gap between the evidential scope of nine event-scale cases and the breadth of claims in the Abstract, Introduction, Discussion, and Conclusions. We accept these concerns and present below a concrete revision plan. Our overarching approach is to reframe the manuscript as an exploratory, event-scale diagnostic study of local moisture recycling during individual high-precipitation episodes, and to systematically reduce overreaching language throughout, including claims relating to regional wetting mechanisms, transferable generalisations ("transferable diagnostic," "self-stabilizing hydrological system"), and future precipitation trend attribution.
  Comment 1: Aim of the study
  Response: We agree that a five-year window of events cannot support comparative attribution across different climatic states. The current Introduction (L35, L45) implies the paper addresses how moisture recycling and advection have changed under a wetter climatic background — a question our data cannot formally test.
  In the revision, we will make a clear separation between background motivation and research object. Specifically, we will delete all introductory sentences that position the paper as explaining regional wetting mechanisms or imply a comparison with a historically drier baseline. The central research question will be recalibrated to: within a set of high-precipitation episodes over a semi-arid transition corridor, how does the local moisture recycling contribution evolve during the event, and does it exhibit a self-limiting character? To provide the climatic context that motivated this event-period selection, we will add a brief regional hydroclimatic context figure using ERA5 long-term data, labelled explicitly as motivation rather than an object of inference. Corresponding revisions to the Abstract, Discussion, and Conclusions will remove language that attributes regional wetting trends or projects future behaviour based on the nine-event sample.
  Comment 2: Choice of study area
  Response: The current wording implying regional representativeness is overstated and we will revise it. The study area will be justified on four criteria of diagnostic suitability: availability of high-temporal-resolution station observations for validation; spatial coherence of the identified precipitation events; the corridor's suitability for examining the co-occurrence of local evaporation signals and external moisture transport under event conditions; and comparatively limited orographic complexity relative to adjacent sub-regions. The revised text will explicitly restrict the scope of inference to this corridor and acknowledge that generalisation to broader north-western China lies beyond the present study.
  Comment 3a: Spatial and temporal scale of hourly RR
  Response: We accept that recycling rate is scale-dependent and that absolute RR values at the 0.25° grid scale carry substantial uncertainty (Eltahir & Bras, 1996).
  We will clarify the intended interpretation in the revision. The hourly grid-scale RR is not designed as a scale-invariant absolute recycling fraction in the sense of Eltahir & Bras (1996). It is used as an event-scale diagnostic index of local-source contribution to precipitation, derived from a simplified column partitioning framework, and its interpretive value lies in relative variability across events and time steps rather than in absolute magnitude. We note that RR is defined as the fraction of total precipitation flux attributable to locally evaporated moisture — consistent with the formulation in Section 2.3 — and this will be stated unambiguously in the revision.
  Following the referee's suggestion, we will add a sensitivity analysis in which hourly RR values are aggregated to 3-hourly and daily means, and will report whether the event-level ordering and sign of η% remain stable under this aggregation.
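
As a rough illustration of the proposed aggregation test, the sketch below bins hourly values into coarser windows and reports two candidate aggregates per window: the arithmetic mean of hourly RR over wet hours, and a precipitation-weighted RR (recycled precipitation divided by total precipitation). Which of these is intended is not stated in the reply, so both are shown; the function name and the sample values are hypothetical.

```python
def aggregate_rr(p_total, p_local, window):
    """Return a list of (mean_rr, weighted_rr) tuples, one per window.

    p_total and p_local are hourly precipitation and its locally sourced
    part (same units); windows without rain yield (None, None).
    """
    out = []
    for start in range(0, len(p_total), window):
        pt = p_total[start:start + window]
        pl = p_local[start:start + window]
        hourly = [loc / tot for tot, loc in zip(pt, pl) if tot > 0]
        mean_rr = sum(hourly) / len(hourly) if hourly else None
        weighted_rr = sum(pl) / sum(pt) if sum(pt) > 0 else None
        out.append((mean_rr, weighted_rr))
    return out


# Hypothetical six-hour sample (mm per hour), aggregated to 3-hourly windows.
p_total = [0.0, 1.0, 4.0, 0.5, 0.0, 2.0]
p_local = [0.0, 0.2, 0.2, 0.1, 0.0, 0.2]
three_hourly = aggregate_rr(p_total, p_local, window=3)
print(three_hourly)
```

The two aggregates diverge whenever the most intense hours have the lowest RR, so a stability check on the sign and ordering of η% would need to state which aggregate is used.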
  Comment 3b: Vertical layers and well-mixed assumption
  Response: We wish to clarify the vertical data structure. The model state variables W_local and W_advected are defined on the ERA5 pressure-level grid (dimensions: time × pressure level × latitude × longitude), using specific humidity q from pressure-level products. The framework therefore does resolve the atmospheric column by pressure level and does not extend to the ~80 km altitude of ERA5's full native model levels. We will state the data product and the vertical extent used explicitly in the revised Section 2.3.
  That said, the referee's deeper concern stands: the framework does not include explicit vertical mixing between pressure levels. Horizontal advection is computed independently at each level using the corresponding layer wind fields, but there is no vertical transport term. The proportional removal of W_local by precipitation is distributed across levels in proportion to each layer's moisture content, but evaporation enters only the lowest pressure level (see also Comment 3c). We will acknowledge this clearly in the revision — the framework is a layer-resolved horizontal transport scheme with a simplified vertical source structure, not a full tracer model with explicit mixing. A diagnostic figure showing the vertical specific humidity profile from ERA5 during selected events will be added to illustrate the vertical distribution of moisture mass and contextualise the simplification.
  Comment 3c: Evaporation added only to the lowest layer; precipitation removed from all layers
  Response: This is a precise and valid observation. In the current implementation, surface evaporation flux is added exclusively to the lowest pressure level of W_local, while precipitation removes moisture from all levels in proportion to each layer's contribution to column-integrated W_local. This source–sink asymmetry is real and we did not address it adequately in the manuscript.
  We acknowledge this as a simplification that does not represent upward turbulent redistribution of evaporated moisture. To evaluate its effect on the main findings, we will conduct a sensitivity test in which evaporation is redistributed uniformly across levels below 850 hPa, and will report whether the sign and event-level ranking of η% remain robust under this modified source distribution. We will also describe and acknowledge the asymmetry explicitly in the revised Section 2.3, without attributing a specific direction of bias to it, as the net effect on η% depends on the interaction among the source term, horizontal advective fluxes, and proportional precipitation removal.
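
The sensitivity test described above can be illustrated with a toy layered column. The sketch below is not the authors' scheme: horizontal advection is reduced to a per-level "flush" fraction that swaps local for advected moisture, "below 850 hPa" is read as pressure of at least 850 hPa, and all levels, amounts, and rates are hypothetical. It compares evaporation added to the lowest level only with evaporation spread evenly over the lower levels, under proportional precipitation removal.

```python
def event_rr(levels_hpa, flush, evap, precip, w_adv0, steps, spread_from_hpa=None):
    """Event-mean recycling rate for a toy layered column (mm per step).

    spread_from_hpa=None puts all evaporation in the lowest level; otherwise
    it is split evenly over levels with pressure >= spread_from_hpa.
    """
    n = len(levels_hpa)
    w_loc = [0.0] * n
    w_adv = list(w_adv0)
    if spread_from_hpa is None:
        targets = [max(range(n), key=lambda i: levels_hpa[i])]
    else:
        targets = [i for i in range(n) if levels_hpa[i] >= spread_from_hpa]
    recycled = 0.0
    for _ in range(steps):
        for i in targets:
            w_loc[i] += evap / len(targets)
        for i in range(n):
            # Crude stand-in for horizontal advection: part of the local
            # moisture in each layer is exchanged for advected moisture.
            moved = flush[i] * w_loc[i]
            w_loc[i] -= moved
            w_adv[i] += moved
        # Precipitation removes the same fraction from every layer, i.e.
        # in proportion to each layer's moisture content.
        frac = precip / (sum(w_loc) + sum(w_adv))
        recycled += frac * sum(w_loc)
        w_loc = [w * (1.0 - frac) for w in w_loc]
        w_adv = [w * (1.0 - frac) for w in w_adv]
    return recycled / (precip * steps)


levels = [1000, 925, 850, 700, 500]
w_adv0 = [6.0, 5.0, 4.0, 3.0, 2.0]          # hypothetical initial moisture, mm
sheared = [0.05, 0.10, 0.20, 0.30, 0.40]    # faster flushing aloft
uniform = [0.10] * 5                        # same flushing at every level

for name, flush in (("sheared", sheared), ("uniform", uniform)):
    lowest = event_rr(levels, flush, 0.2, 1.0, w_adv0, 12)
    spread = event_rr(levels, flush, 0.2, 1.0, w_adv0, 12, spread_from_hpa=850)
    print(name, round(lowest, 4), round(spread, 4))
```

In this toy setting the placement of evaporation matters only when the flushing rate varies with height, which is in line with the statement above that the net effect depends on how the source term interacts with the advective fluxes.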
  Comment 3d: Wloc = 0 at t = 0 — no spin-up
  Response: We confirm that in the current implementation, W_local is initialised to zero at the first time step of each event, with no pre-event integration period. We agree with the referee that this is physically implausible as a representation of total local-origin moisture at event onset. Our intent was to track incremental local-source moisture accumulation during the event itself, but the current text presents this as the full initialisation condition without adequate justification.
  In the revision, we will: (i) rewrite Section 2.3 to clarify that W_local diagnoses event-increment local-source accumulation rather than the total antecedent local-moisture inventory; (ii) introduce a pre-event integration period of several days prior to event onset — the exact duration will be chosen with reference to characteristic atmospheric moisture residence times in the region and will be reported in the revised manuscript; and (iii) present a sensitivity comparison between the original zero-initialisation and the spin-up run, demonstrating whether the sign and relative magnitude of η% are affected. This analysis requires no new data acquisition and will be implemented within the existing computational framework.
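
The comparison in item (iii) can be sketched with a bulk two-reservoir toy model. This is an illustration under stated assumptions, not the manuscript's code: the `flush` fraction is a crude placeholder for horizontal advection, the forcing values and the 72-hour spin-up length are hypothetical, and the column water at event onset is held equal in both runs so that only the local/advected partition differs.

```python
def step(w_loc, w_adv, evap, precip, flush):
    """Advance both reservoirs by one hour; returns (w_loc, w_adv, p_loc) in mm."""
    w_loc += evap                      # local source from surface evaporation
    moved = flush * w_loc              # local moisture exchanged for advected
    w_loc -= moved
    w_adv += moved
    frac = precip / (w_loc + w_adv)    # well-mixed, proportional removal
    p_loc = frac * w_loc               # recycled part of this hour's rain
    return w_loc * (1.0 - frac), w_adv * (1.0 - frac), p_loc


def event_rr(w_loc0, w_adv0, forcing, flush=0.05):
    """Event-total recycled precipitation divided by event-total precipitation."""
    w_loc, w_adv, recycled, total = w_loc0, w_adv0, 0.0, 0.0
    for evap, precip in forcing:
        w_loc, w_adv, p_loc = step(w_loc, w_adv, evap, precip, flush)
        recycled += p_loc
        total += precip
    return recycled / total


def local_fraction_after_spin_up(w_total, hours, evap, flush=0.05):
    """Local share of column water after a dry pre-event integration."""
    w_loc, w_adv = 0.0, w_total
    for _ in range(hours):
        w_loc, w_adv, _ = step(w_loc, w_adv, evap, 0.0, flush)
    return w_loc / (w_loc + w_adv)


w0 = 20.0                              # hypothetical column water at onset, mm
forcing = [(0.1, 0.5)] * 12            # hypothetical (evap, precip) per hour, mm

rr_zero = event_rr(0.0, w0, forcing)   # original zero initialisation
f0 = local_fraction_after_spin_up(w0, hours=72, evap=0.1)
rr_spun = event_rr(f0 * w0, (1.0 - f0) * w0, forcing)
print(round(f0, 3), round(rr_zero, 4), round(rr_spun, 4))
```

Because the toy is linear in the initial local reservoir, any non-zero spin-up share raises the event RR; whether the sign and relative magnitude of η% change is a separate question that needs the perturbation experiments to be rerun on both initialisations.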
  Comment 4: Machine learning surrogate model and cluster analysis
  Response: We fully accept the concern. With nine independent events, overfitting in the surrogate model and statistical instability in the K=3 clustering are genuine limitations that the current manuscript does not adequately acknowledge.
  For the ML surrogate model, we will explicitly reframe it as a response-surface exploration tool. Its role is to characterise how η% varies across the antecedent state space — not to build a generalisable predictive model. We will retain basic model diagnostics as descriptive characterisation of model behaviour, but remove language implying robust validation. Similarly, the counterfactual perturbation analysis (Section 2.8) will be reframed as a heuristic, surrogate-based sensitivity exploration rather than a physically coupled perturbation experiment. Cluster-specific correlation coefficients and p-values computed on sub-samples of two to three events will be removed.
  For the cluster analysis, we will retain the three-group structure as it underlies the existing figures and results narrative, but remove the K-means justification and the associated language ("optimal K=3 based on silhouette analysis," "three physical regimes," "genuine physical regimes rather than statistical artifacts"). In its place, the grouping will be based on pre-defined meteorological threshold criteria — specifically, event-mean large-scale moisture flux magnitude and antecedent near-surface moisture state, both defined independently of the outcome variable η% — and will be described throughout as three descriptive event archetypes rather than statistically derived regimes. This restructuring will require revisions to the relevant Results and Discussion passages, which we are committed to completing.
  Comment 5: Alternative methodologies and additional diagnostics
  Response: We accept that Lagrangian approaches such as FLEXPART (Bakels et al., 2024) and tracer-enabled regional models such as WRF-WVT (Insua-Costa & Miguez-Macho, 2018) offer higher mechanistic fidelity for moisture-source attribution than our Eulerian framework. The current approach was chosen for computational tractability at sub-daily time scales within the existing validation framework. We will make this trade-off explicit in Section 2.3 and identify Lagrangian validation as a priority for future work.
  Regarding expansion to the full ERA5 record: this would unquestionably strengthen statistical inference. However, the present study is built around validation against high-temporal-resolution station observations available for 2020–2024; relaxing this constraint would sacrifice the validation standard on which the analysis rests. The manuscript will be explicitly scoped as nine high-fidelity diagnostic case studies rather than a statistically generalisable regional sample, and multi-decadal extension — ideally combined with Lagrangian moisture tracking — is identified as a natural continuation.
  Following the referee's practical suggestion, we will add event-scale time series figures for precipitation, evaporation, estimated RR, and a moisture transport proxy diagnostic for all nine events. These will improve the physical transparency of the analysis considerably and allow readers to assess event-scale behaviour directly.
  
  Citation: https://doi.org/10.5194/egusphere-2025-5314-AC2


4) The assumption (L154) that at t0 all atmospheric water is advected clearly disregards the role of previous moisture in the analysis. Could you justify it? Can you quantify how much of the moisture at t0 of the event was actually advected? For example, if you start the same exercise a few days before every event, what is the ratio between advected and local moisture at t0 of the event?

5) Sect. 2.3.3. Precipitation removal is considered by using ERA5 data. However, ERA5 skill in representing precipitation in drylands is inferior compared to other places. What would have happened if you had used measured precipitation to calculate the precipitation sink?

6) Sect. 2.7. What is the meaning of clustering of 9 events, and with 3 clusters? Is it even needed to represent the 9 events in different clusters? And to make an “objective” classification with more parameters going into the analysis compared to the number of events? Also, how did you get the interpretation of the clusters “cool–moist” etc.? The identified “regimes” may reflect sampling of a few specific synoptic situations rather than robust, recurrent hydrometeorological modes of the region. Please describe better why this process is needed and how you interpret its results. Please address issues like the validity of the elasticity parameter evaluated separately for each cluster (with n=2 in one of the clusters, and a leave-one-out design).

Specific comments

L41: “representative” in what way? Similarly, in L80, “key… region” – why?

L44: It is not clear to me what you mean by “coupling”. Is it the coupling between the two days? Also, I don’t understand what this has to do with the confounding effects of complex topography. Please explain.

L80: “In summary” – In my view it’s better to start the last paragraph with a summary of the “why this study is needed” (e.g., in light of the XXX there’s a need to better understand…) without saying “in summary”. It’s a bit strange to read “in summary” when you read the introduction.

L77–79: It seems to me the description of where you can find the explanation of how you derive elasticity is not needed here. The next paragraph describes it again. If needed, you can stick with “(Section 2)” rather than the full explanation.

Figure 1: (a) The basemap in this panel is not clear. Please replace with something more informative. I would suggest overlying mean annual precipitation as contours over topography (colours), because these are the two factors that are mentioned in the text. Then the stations can be given with a label showing the P95 precipitation if needed, but make sure this label does not conflict with the P95 number.

L93: “1100 m asl” is not a topography gradient, but rather elevation. If the stress is on the gradient please describe it (e.g., X m drop over a distance of Y km from west to east, or something in that spirit).

L115: “Temporal gaps” – are there any in the data? If so, especially if they are significant, please describe data availability throughout the study period.

L117–118: It seems to me like the RMSE of ERA5’s precipitation is generally larger than P95. Is it still reasonable to say it “reproduces… magnitude of rainfall events”?

L204: “P95” – is this P95 of all days or of rain days?

L207–204: It’s not clear to me whether events need to be >=2 d in duration, or they can also be 1-d events. If it’s only >=2 d, can you explain why? If it’s not, make sure this is what readers understand from the text.

L227: What is the definition of “measurable precipitation”?

Sect. 2.6.4: Can you please describe a bit the models? Why these specific models and what are the differences between the different models?

L244–245: Did all these parameters go into the clustering algorithm? If so, what are the parameters in “etc.”? If not, which parameters went inside?

Section 2.8 states that counterfactual experiments scale groups of variables including P, E, TCWV, and, when relevant, T2M and sp. However, to my understanding, the provided code appears to perturb only precipitation (tp). Please clarify whether this reflects the implementation used for the reported results, and if not, describe how multi-variable, cluster-specific perturbations were actually applied.

L313–320: This section repeats what is already written in the data and methods. Please consider removing it. The same holds for L332–334.
Code availability: In the code provided, the function that calculates advection is missing (it is marked “omitted for brevity”). However, this is a core part of the study, and including it would benefit readers. I suggest you upload it.

Similarly, the line “ W_adv = compute_column_water_vapor(ds_era5['q'].isel(time=0), ...) # Simplified” is missing some parts.
AI use: Please make sure to comply with the AI use statement required by Copernicus: “Should you have used AI tools to generate (parts of) your manuscript, please describe the usage either in the Methods section or the Acknowledgements.”

Technical corrections

L129: Consider replacing the colon with a reference to Table 3.

L214: Missing period at the end of the sentence.

L276: Where is “P’” here?

Citation: https://doi.org/10.5194/egusphere-2025-5314-RC1
- AC1: 'Reply on RC1', Ruolin Li, 06 Feb 2026
  
  On behalf of all co-authors, we sincerely thank Referee #1 for the careful, thorough, and constructive evaluation of our manuscript. We particularly appreciate the clear articulation of the central concerns regarding the balance between methodological complexity and the observational basis of our study.
  Regarding the scope and representativeness of the analysis, we fully acknowledge that the 5-year record and nine identified events constrain the representativeness and generality of the inferred feedback behaviour. Our intention was not to derive climatological regimes or broadly applicable thresholds for north-western China. Rather, the study was designed as an exploratory, event-focused analysis—using a small set of well-defined wet episodes as diagnostic case studies to examine the internal consistency and potential self-limiting characteristics of precipitation recycling under specific atmospheric and land-surface conditions.
  We recognize that this intent is not sufficiently clear in the current version. In a revised manuscript, we will explicitly narrow the scope of interpretation, clarify the exploratory and conditional nature of the analytical framework, and more transparently discuss the limitations associated with the short record, small sample size, and extrapolation constraints.
  On the methodological framework and analytical choices: We agree that stronger justification and clearer positioning are required given the limited sample size. The manuscript will be revised to emphasize methodological transparency, uncertainty quantification, and the conditional nature of the counterfactual experiments. We will also reconsider the role and presentation of certain analytical components, with particular attention to their interpretability and statistical robustness under the available data constraints.
  On the study domain and key assumptions: We will substantially strengthen the description and scientific motivation of the study domain and will more explicitly discuss the implications of key methodological choices—including data selection and initial-condition assumptions—for the interpretation of the recycling estimates.
  We believe that addressing these concerns through structural revision and clearer framing—rather than expanding the empirical scope—will better align the manuscript with the standards and expectations of HESS. A comprehensive point-by-point response with a revised manuscript will be submitted following the discussion period.
  We thank the referee again for helping us sharpen the focus and interpretation of this study.
  
  Citation: https://doi.org/10.5194/egusphere-2025-5314-AC1
RC2:
'Comment on egusphere-2025-5314', Anonymous Referee #2, 15 Apr 2026
In this manuscript, the authors use ERA5 and precipitation data to study the impact of precipitation recycling during high-rainfall episodes in north-western China. Although the topic of changes in moisture recycling is interesting and societally relevant, the authors make several methodological choices that are, in my opinion, not justified. In addition, the small observation data set (9 events over 5 years) limits drawing robust conclusions. I therefore believe that the current manuscript is not suitable for publication in HESS. Please see below my specific concerns:
Aim of the study: In the introduction, the authors mention ‘it is still unclear whether such intensification originates from stronger local land-atmosphere feedbacks or from enhanced advection’ (L35) and ‘… to investigate how precipitation recycling responds during individual rainfall episodes under an overall wetter climatic background’ (L45). However, in this study, the authors only study rainfall events of the last 5 years, which makes it difficult to determine how the impact of moisture recycling and advection changed compared to a previously ‘drier’ climate. Could the authors explain how the methodology used provides insights into the questions/problems mentioned in the introduction? In addition, the ERA5 data that is used in the study provides a nice opportunity to show the wetting trend in this specific study region, rather than relying on existing literature.

Choice of study area: Why did the authors specifically choose this (rather small) study area? Is rainfall data only available for this study region? How representative is it for other regions within north-western China?

Hourly calculation of precipitation recycling rate (section 2.3): the authors make several assumptions in the moisture tracking framework that are, in my opinion, not justified. Could the authors justify the following assumptions (preferably with existing literature):
The recycling rate (RR) is calculated at hourly scale and per ERA5 grid cell (25 × 25 km). In general, the recycling rate depends on the size of the study area, because larger regions are more likely to contain precipitation from locally evaporated moisture. At a global scale, the moisture recycling rate is 1, while at a point source it is 0 (Eltahir & Bras, 1996). At a scale of 25 km, evaporated moisture is very likely to precipitate outside of the region, which makes the RR uncertain. In addition, evaporation does not turn instantly into precipitation. Usually, RR is calculated for longer time scales, so an hourly RR should be better justified.

It is not explicitly mentioned in the methodology, but if all 137 ERA5 vertical levels are used, the vertical column reaches up to about 80 km. At the same time, equations 2 and 3 (L160) implicitly assume a well-mixed vertical layer, which usually only holds within the atmospheric boundary layer (up to about 1 km). The assumption that precipitation is removed proportionally to the layer-wise mass should therefore be better justified.

At the same time, evaporation is only added to the lowest pressure layer of Wloc. If advection is only horizontal, which appears to be the case based on equation 5, the moisture from evaporation is not distributed vertically. As a result, precipitation removes moisture from all vertical layers, but evaporation only adds to the model layer closest to the surface.

The authors assume that Wloc=0 at t=0 (L154). This suggests that at the start of the event, the atmosphere does not contain locally evaporated moisture. It seems unlikely to me that there was no land-atmosphere coupling before the onset of the event. Could the authors justify why they make this assumption, and why no spin-up period was used?

Machine learning surrogate modelling and cluster analysis: The methodology is rather complex for the amount of available data. Training a machine learning model with many variables on just nine events creates a high risk of overfitting and little statistical significance. A similar reasoning can be provided for the cluster analysis. Each cluster has only two or three events. In addition, it would be useful to provide some literature on why there would be different regimes of land-atmosphere coupling under high precipitation events. The methodology mentions ‘the optimal cluster number K=3 was chosen based on silhouette analysis and physical interpretability’ (L247). How was this physical interpretability established? Also, 3 clusters with 9 points are very unstable. What if we add or remove one point?

Alternative methodologies: Could the authors motivate the choice for the methodology? Maybe other methods such as atmospheric models with tracers (e.g. WRF-WVT (Insua-Costa & Miguez-Macho, 2018)) or Lagrangian transport modelling (e.g. FLEXPART (Bakels et al., 2024)) would be more suitable for this case. Otherwise, the authors could extend the nine cases by covering the full ERA5 data record. This would require retrieving also the precipitation from ERA5, but the authors mention that ERA5 is suitable in this case study (L117). This would make the machine learning a bit more robust. In addition, it would also be informative to simply plot time series of precipitation, evaporation, moisture recycling rate and advection. This way, the physical processes may be simpler to grasp.
Citation: https://doi.org/10.5194/egusphere-2025-5314-RC2
- AC2: 'Reply on RC2', Ruolin Li, 20 Apr 2026
  
  Response to Referee #2
  We thank Referee #2 for the critical and constructive review. The comments identify real tensions in the manuscript — most centrally, the gap between the evidential scope of nine event-scale cases and the breadth of claims in the Abstract, Introduction, Discussion, and Conclusions. We accept these concerns and present below a concrete revision plan. Our overarching approach is to reframe the manuscript as an exploratory, event-scale diagnostic study of local moisture recycling during individual high-precipitation episodes, and to systematically reduce overreaching language throughout, including claims relating to regional wetting mechanisms, transferable generalisations ("transferable diagnostic," "self-stabilizing hydrological system"), and future precipitation trend attribution.
  Comment 1: Aim of the study
  Response: We agree that a five-year window of events cannot support comparative attribution across different climatic states. The current Introduction (L35, L45) implies the paper addresses how moisture recycling and advection have changed under a wetter climatic background — a question our data cannot formally test.
  In the revision, we will make a clear separation between background motivation and research object. Specifically, we will delete all introductory sentences that position the paper as explaining regional wetting mechanisms or imply a comparison with a historically drier baseline. The central research question will be recalibrated to: within a set of high-precipitation episodes over a semi-arid transition corridor, how does the local moisture recycling contribution evolve during the event, and does it exhibit a self-limiting character? To provide the climatic context that motivated this event-period selection, we will add a brief regional hydroclimatic context figure using ERA5 long-term data, labelled explicitly as motivation rather than an object of inference. Corresponding revisions to the Abstract, Discussion, and Conclusions will remove language that attributes regional wetting trends or projects future behaviour based on the nine-event sample.
  Comment 2: Choice of study area
  Response: The current wording implying regional representativeness is overstated and we will revise it. The study area will be justified on four criteria of diagnostic suitability: availability of high-temporal-resolution station observations for validation; spatial coherence of the identified precipitation events; the corridor's suitability for examining the co-occurrence of local evaporation signals and external moisture transport under event conditions; and comparatively limited orographic complexity relative to adjacent sub-regions. The revised text will explicitly restrict the scope of inference to this corridor and acknowledge that generalisation to broader north-western China lies beyond the present study.
  Comment 3a: Spatial and temporal scale of hourly RR
  Response: We accept that recycling rate is scale-dependent and that absolute RR values at the 0.25° grid scale carry substantial uncertainty (Eltahir & Bras, 1996).
  We will clarify the intended interpretation in the revision. The hourly grid-scale RR is not designed as a scale-invariant absolute recycling fraction in the sense of Eltahir & Bras (1996). It is used as an event-scale diagnostic index of local-source contribution to precipitation, derived from a simplified column partitioning framework, and its interpretive value lies in relative variability across events and time steps rather than in absolute magnitude. We note that RR is defined as the fraction of total precipitation flux attributable to locally evaporated moisture — consistent with the formulation in Section 2.3 — and this will be stated unambiguously in the revision.
  Following the referee's suggestion, we will add a sensitivity analysis in which hourly RR values are aggregated to 3-hourly and daily means, and will report whether the event-level ordering and sign of η% remain stable under this aggregation.
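The aggregation test described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the sample values, and the precipitation-weighted variant are assumptions introduced here, and the η% values are placeholders used only to show how ordering and sign could be compared across aggregation choices.

```python
from statistics import mean


def block_means(values, width):
    """Mean of consecutive, non-overlapping blocks; a trailing partial block is dropped."""
    n_full = len(values) // width
    return [mean(values[i * width:(i + 1) * width]) for i in range(n_full)]


def precip_weighted_blocks(rr, precip, width):
    """Precipitation-weighted RR per block, sum(RR * P) / sum(P); None for dry blocks."""
    out = []
    for i in range(len(rr) // width):
        block = slice(i * width, (i + 1) * width)
        total_p = sum(precip[block])
        if total_p > 0:
            weighted = sum(r * p for r, p in zip(rr[block], precip[block]))
            out.append(weighted / total_p)
        else:
            out.append(None)
    return out


def ranking(values):
    """Indices of `values` sorted from smallest to largest."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical hourly series (not taken from the manuscript).
rr_hourly = [0.10, 0.20, 0.30, 0.00, 0.40, 0.20]
p_hourly = [1.0, 1.0, 2.0, 0.0, 3.0, 1.0]  # mm per hour

rr_3h_simple = block_means(rr_hourly, 3)
rr_3h_weighted = precip_weighted_blocks(rr_hourly, p_hourly, 3)

# Hypothetical event-level eta% values under two aggregation choices.
eta_hourly = [-0.8, -0.1, -0.4]
eta_daily = [-0.6, -0.2, -0.3]
same_order = ranking(eta_hourly) == ranking(eta_daily)
same_sign = all((a < 0) == (b < 0) for a, b in zip(eta_hourly, eta_daily))
```

A simple mean treats every hour equally, whereas the weighted form gives dry or nearly dry hours little influence; which of the two the revision will use is not stated in the reply.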
  Comment 3b: Vertical layers and well-mixed assumption
  Response: We wish to clarify the vertical data structure. The model state variables W_local and W_advected are defined on the ERA5 pressure-level grid (dimensions: time × pressure level × latitude × longitude), using specific humidity q from pressure-level products. The framework therefore does resolve the atmospheric column by pressure level and does not extend to the ~80 km altitude of ERA5's full native model levels. We will state the data product and the vertical extent used explicitly in the revised Section 2.3.
  That said, the referee's deeper concern stands: the framework does not include explicit vertical mixing between pressure levels. Horizontal advection is computed independently at each level using the corresponding layer wind fields, but there is no vertical transport term. The proportional removal of W_local by precipitation is distributed across levels in proportion to each layer's moisture content, but evaporation enters only the lowest pressure level (see also Comment 3c). We will acknowledge this clearly in the revision — the framework is a layer-resolved horizontal transport scheme with a simplified vertical source structure, not a full tracer model with explicit mixing. A diagnostic figure showing the vertical specific humidity profile from ERA5 during selected events will be added to illustrate the vertical distribution of moisture mass and contextualise the simplification.
  Comment 3c: Evaporation added only to the lowest layer; precipitation removed from all layers
  Response: This is a precise and valid observation. In the current implementation, surface evaporation flux is added exclusively to the lowest pressure level of W_local, while precipitation removes moisture from all levels in proportion to each layer's contribution to column-integrated W_local. This source–sink asymmetry is real and we did not address it adequately in the manuscript.
  We acknowledge this as a simplification that does not represent upward turbulent redistribution of evaporated moisture. To evaluate its effect on the main findings, we will conduct a sensitivity test in which evaporation is redistributed uniformly across levels below 850 hPa, and will report whether the sign and event-level ranking of η% remain robust under this modified source distribution. We will also describe and acknowledge the asymmetry explicitly in the revised Section 2.3, without attributing a specific direction of bias to it, as the net effect on η% depends on the interaction among the source term, horizontal advective fluxes, and proportional precipitation removal.
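A toy column can illustrate why the vertical placement of evaporation matters only through its interaction with level-dependent transport. The sketch below is not the manuscript's scheme. It assumes that "below 850 hPa" means pressure levels at or above 850 hPa, replaces horizontal advection with a hypothetical per-level export fraction, holds the advected reservoir fixed, and takes RR under proportional removal to be the local share of total column water. All names and numbers are illustrative.

```python
def add_evaporation(w_loc, pressures_hpa, evap, mode="lowest", p_top_hpa=850.0):
    """Return a new local-moisture profile after adding evaporation `evap` (mm)."""
    if mode == "lowest":
        # The level nearest the surface is the one with the highest pressure.
        targets = [max(range(len(pressures_hpa)), key=lambda i: pressures_hpa[i])]
    elif mode == "uniform_low":
        # Assumption: "below 850 hPa" is read as pressure >= 850 hPa.
        targets = [i for i, p in enumerate(pressures_hpa) if p >= p_top_hpa]
    else:
        raise ValueError(f"unknown mode: {mode}")
    if not targets:
        raise ValueError("no pressure level available to receive evaporation")
    new = list(w_loc)
    for i in targets:
        new[i] += evap / len(targets)
    return new


def export_local(w_loc, export_frac):
    """Crude stand-in for horizontal advection: each level loses a fixed fraction."""
    return [w * (1.0 - f) for w, f in zip(w_loc, export_frac)]


def recycling_rate(w_loc, w_adv):
    """RR under proportional removal: local share of total column water."""
    total = sum(w_loc) + sum(w_adv)
    return sum(w_loc) / total if total > 0 else 0.0


# Hypothetical five-level column (values are illustrative only).
pressures = [1000.0, 925.0, 850.0, 700.0, 500.0]  # hPa
w_adv = [6.0, 5.0, 4.0, 3.0, 2.0]                 # advected moisture per level (mm)
export_frac = [0.1, 0.2, 0.3, 0.5, 0.7]           # stronger export aloft (assumed)
evap = 3.0                                        # evaporation added this step (mm)

rr = {}
for mode in ("lowest", "uniform_low"):
    w_loc = add_evaporation([0.0] * len(pressures), pressures, evap, mode=mode)
    w_loc = export_local(w_loc, export_frac)
    rr[mode] = recycling_rate(w_loc, w_adv)
```

With these particular export fractions, moisture placed in the lowest level is retained longer, so the lowest-level variant yields the larger RR. A different transport profile could reverse that ordering, which is consistent with the reply's decision not to attribute a fixed direction of bias.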
  Comment 3d: Wloc = 0 at t = 0 — no spin-up
  Response: We confirm that in the current implementation, W_local is initialised to zero at the first time step of each event, with no pre-event integration period. We agree with the referee that this is physically implausible as a representation of total local-origin moisture at event onset. Our intent was to track incremental local-source moisture accumulation during the event itself, but the current text presents this as the full initialisation condition without adequate justification.
  In the revision, we will: (i) rewrite Section 2.3 to clarify that W_local diagnoses event-increment local-source accumulation rather than the total antecedent local-moisture inventory; (ii) introduce a pre-event integration period of several days prior to event onset — the exact duration will be chosen with reference to characteristic atmospheric moisture residence times in the region and will be reported in the revised manuscript; and (iii) present a sensitivity comparison between the original zero-initialisation and the spin-up run, demonstrating whether the sign and relative magnitude of η% are affected. This analysis requires no new data acquisition and will be implemented within the existing computational framework.
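The zero-initialisation versus spin-up comparison in item (iii) can be sketched with a bulk (single-layer) two-reservoir column. This is a hedged illustration under stated assumptions, not the authors' code: the constant export fraction, the forcing values, the 72-hour spin-up length, and the function name `run_bulk` are all hypothetical. Both runs start the event with the same total column water and differ only in how it is partitioned.

```python
def run_bulk(evap, precip, inflow, w_loc0, w_adv0, export_frac=0.05):
    """Integrate a bulk two-reservoir column; return (hourly RR list, w_loc, w_adv)."""
    w_loc, w_adv, rr = w_loc0, w_adv0, []
    for e, p, a in zip(evap, precip, inflow):
        # Local source, then export of both reservoirs, then advected inflow.
        w_loc = (w_loc + e) * (1.0 - export_frac)
        w_adv = w_adv * (1.0 - export_frac) + a
        total = w_loc + w_adv
        wet = p > 0 and total > 0
        # RR is the local share of precipitation under proportional removal.
        rr.append(w_loc / total if wet else None)
        if wet:
            kept = 1.0 - min(p / total, 1.0)
            w_loc *= kept
            w_adv *= kept
    return rr, w_loc, w_adv


SPINUP_H, EVENT_H = 72, 24  # hypothetical durations in hours

# Pre-event integration: evaporation only, no rain (illustrative forcing).
_, w_loc_on, w_adv_on = run_bulk(
    [0.1] * SPINUP_H, [0.0] * SPINUP_H, [1.0] * SPINUP_H, w_loc0=0.0, w_adv0=20.0
)

event = dict(evap=[0.1] * EVENT_H, precip=[1.0] * EVENT_H, inflow=[1.0] * EVENT_H)

# Spin-up run: the event starts from the partition reached before onset.
rr_spinup, _, _ = run_bulk(**event, w_loc0=w_loc_on, w_adv0=w_adv_on)
# Zero-initialisation run: same total water, all of it labelled as advected.
rr_zero, _, _ = run_bulk(**event, w_loc0=0.0, w_adv0=w_loc_on + w_adv_on)

gap = [s - z for s, z in zip(rr_spinup, rr_zero)]
```

In this toy setting the zero-initialised run underestimates RR throughout the event, and the gap narrows as antecedent local moisture is exported and rained out. Whether the same holds for η% in the layered scheme is what the proposed sensitivity comparison would need to show.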
  Comment 4: Machine learning surrogate model and cluster analysis
  Response: We fully accept the concern. With nine independent events, overfitting in the surrogate model and statistical instability in the K=3 clustering are genuine limitations that the current manuscript does not adequately acknowledge.
  For the ML surrogate model, we will explicitly reframe it as a response-surface exploration tool. Its role is to characterise how η% varies across the antecedent state space — not to build a generalisable predictive model. We will retain basic model diagnostics as descriptive characterisation of model behaviour, but remove language implying robust validation. Similarly, the counterfactual perturbation analysis (Section 2.8) will be reframed as a heuristic, surrogate-based sensitivity exploration rather than a physically coupled perturbation experiment. Cluster-specific correlation coefficients and p-values computed on sub-samples of two to three events will be removed.
  For the cluster analysis, we will retain the three-group structure as it underlies the existing figures and results narrative, but remove the K-means justification and the associated language ("optimal K=3 based on silhouette analysis," "three physical regimes," "genuine physical regimes rather than statistical artifacts"). In its place, the grouping will be based on pre-defined meteorological threshold criteria — specifically, event-mean large-scale moisture flux magnitude and antecedent near-surface moisture state, both defined independently of the outcome variable η% — and will be described throughout as three descriptive event archetypes rather than statistically derived regimes. This restructuring will require revisions to the relevant Results and Discussion passages, which we are committed to completing.
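A threshold-based grouping of the kind described above could look like the following sketch. The thresholds, units, event values, and group labels are hypothetical, since the reply does not specify them. The only properties taken from the text are that the two inputs are event-mean moisture flux magnitude and antecedent near-surface moisture state, and that η% is not an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    moisture_flux: float        # event-mean large-scale moisture flux magnitude
    antecedent_moisture: float  # antecedent near-surface moisture state


def classify(event, flux_threshold, moisture_threshold):
    """Assign a descriptive archetype from fixed, pre-defined thresholds."""
    if event.moisture_flux >= flux_threshold:
        return "advection-dominated"
    if event.antecedent_moisture >= moisture_threshold:
        return "moist-antecedent"
    return "dry-antecedent"


# Hypothetical events and thresholds (illustrative values only).
events = [
    Event("E1", moisture_flux=250.0, antecedent_moisture=0.30),
    Event("E2", moisture_flux=120.0, antecedent_moisture=0.35),
    Event("E3", moisture_flux=90.0, antecedent_moisture=0.12),
]
FLUX_THRESHOLD, MOISTURE_THRESHOLD = 200.0, 0.25

labels = {e.name: classify(e, FLUX_THRESHOLD, MOISTURE_THRESHOLD) for e in events}

# Dropping one event leaves the remaining labels unchanged.
labels_without_e1 = {
    e.name: classify(e, FLUX_THRESHOLD, MOISTURE_THRESHOLD) for e in events[1:]
}
```

Because the thresholds are fixed in advance, adding or removing an event cannot change the label of any other event. This addresses the stability concern raised for K-means with nine points, although the choice of thresholds itself still needs justification.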
  Comment 5: Alternative methodologies and additional diagnostics
  Response: We accept that Lagrangian approaches such as FLEXPART (Bakels et al., 2024) and tracer-enabled regional models such as WRF-WVT (Insua-Costa & Miguez-Macho, 2018) offer higher mechanistic fidelity for moisture-source attribution than our Eulerian framework. The current approach was chosen for computational tractability at sub-daily time scales within the existing validation framework. We will make this trade-off explicit in Section 2.3 and identify Lagrangian validation as a priority for future work.
  Regarding expansion to the full ERA5 record: this would unquestionably strengthen statistical inference. However, the present study is built around validation against high-temporal-resolution station observations available for 2020–2024; relaxing this constraint would sacrifice the validation standard on which the analysis rests. The manuscript will be explicitly scoped as nine high-fidelity diagnostic case studies rather than a statistically generalisable regional sample, and multi-decadal extension — ideally combined with Lagrangian moisture tracking — is identified as a natural continuation.
  Following the referee's practical suggestion, we will add event-scale time series figures for precipitation, evaporation, estimated RR, and a moisture transport proxy diagnostic for all nine events. These will improve the physical transparency of the analysis considerably and allow readers to assess event-scale behaviour directly.
  
  Citation: https://doi.org/10.5194/egusphere-2025-5314-AC2


Viewed

Total article views: 2,117 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
1,301	658	158	2,117	1,277	721

Views and downloads (calculated since 24 Nov 2025)

Month	HTML	PDF	XML	Total
Nov 2025	365	55	30	450
Dec 2025	240	99	30	369
Jan 2026	170	145	30	345
Feb 2026	273	127	39	439
Mar 2026	177	177	22	376
Apr 2026	71	47	6	124
May 2026	5	8	1	14

Latest update: 10 May 2026

Short summary

Rainfall can recycle within the atmosphere, meaning that some of the water that falls as rain evaporates and falls again nearby. This study explores how such recycling behaves during short wet periods in north-western China’s semi-arid region. Using weather data and machine learning, we found that stronger rain does not endlessly increase local recycling. This self-limiting feedback helps keep the regional water cycle balanced under a wetter climate.

