Synchronization frequency analysis and stochastic simulation of multisite flood flows based on the complicated vine-copula structure

Yu, Xinting; Guo, Yuxue; Chen, Siwei; Gu, Haiting; Xu, Yue-Ping

doi:10.5194/egusphere-2024-2266

Preprints

https://doi.org/10.5194/egusphere-2024-2266

12 Aug 2024

Abstract. Accurately modeling and predicting flood flows across multiple sites within a watershed presents significant challenges due to potential issues of insufficient accuracy and excessive computational demands in existing methodologies. In response to these challenges, this study introduces a novel approach centered around the use of vine copula models, termed RDV-Copula (Reduced-dimension vine copula construction approach). The core of this methodology lies in its ability to integrate and extract complex data information before constructing the copula function, thus preserving the intricate spatial-temporal connections among multiple sites while substantially reducing the vine copula's complexity. This study performs a synchronization frequency analysis using the devised copula models, offering valuable insights into flood encounter probabilities. Additionally, the innovative approach undergoes validation by comparison with three benchmark models, which vary in dimensions and nature of variable interactions. Furthermore, the study conducts stochastic simulations, exploring both unconditional and conditional scenarios across different vine copula models. Applied in the Shifeng Creek watershed, China, the findings reveal that vine copula models are superior in capturing complex variable relationships, demonstrating significant spatial interconnectivity crucial for flood risk prediction in heavy rainfall events. Interestingly, the study observes that expanding the model's dimensions does not inherently enhance simulation precision. The RDV-Copula method not only captures comprehensive information effectively but also simplifies the vine copula model by reducing its dimensionality and complexity. This study contributes to the field of hydrology by offering a refined method for analyzing and simulating multisite flood flows.

Received: 19 Jul 2024 – Discussion started: 12 Aug 2024

Competing interests: At least one of the (co-)authors is a member of the editorial board of Hydrology and Earth System Sciences.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

15 Jan 2025

Synchronization frequency analysis and stochastic simulation of multi-site flood flows based on the complicated vine copula structure

Xinting Yu, Yue-Ping Xu, Yuxue Guo, Siwei Chen, and Haiting Gu

Hydrol. Earth Syst. Sci., 29, 179–214, https://doi.org/10.5194/hess-29-179-2025, 2025

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2266', Anonymous Referee #1, 30 Aug 2024
This study introduces a novel RDV-Copula (Reduced-dimension vine copula) approach to improve the modeling and prediction of flood flows across multiple sites within a watershed. The method integrates complex spatial-temporal data while reducing the computational complexity of vine copula models. By identifying key variables and constructing RDV-Copula functions, the approach effectively captures spatial-temporal relationships between sites, ensuring accurate flood risk assessment. Applied to the Shifeng Creek watershed, the method revealed strong spatial connectivity, highlighting the increased risk of downstream flooding during heavy rainfall events. Validation against benchmark models showed that increasing model dimensions does not always enhance simulation accuracy, and in some cases, can complicate the model. The RDV-Copula method strikes an optimal balance between information accuracy and simplicity. This approach proves particularly useful for flood risk analysis and management, providing a refined methodology for multisite runoff simulations, and supporting decision-making for flood control and event scheduling.
Overall, the methods appear reliable and the originality of the research is undoubted. The analyses in this study are well organized and the results are reasonable. In addition, the presentation of this article is generally clear. It is a valuable study and within the scope of this journal. Therefore, I recommend minor revisions prior to final acceptance.
General comments for the authors’ reference:
Subsection 2.3.1: In this subsection, the description of the text and the presentation of Figure 3 focus on illustrating the process of how to choose the key variables. But after selecting the key variables, how to construct the RDV-Copula model needs further elaboration. Please supplement this section and modify the picture if necessary.

Subsection 3.2.2.2: Why are these three different sets of structures chosen as benchmark models? What is the significance of the comparison of each set of benchmarks? Please explain how is it possible to validate the effectiveness of the proposed RDV-Copula method by comparing it with the three sets of benchmarks?

Line 418-421: What does the symbol of the “*” in Figure 7(a) indicate? There is no explanation in the text or in the picture.

Line 198-201: The methodology provides a brief introduction to the differences and characteristics of C-vine and D-vine copula. However, there is still a possibility that it may confuse readers who do not have the knowledge of this area. I recommend that some schematic diagrams may be added to the introduction to assist comprehension.

In the process of constructing the joint distribution function, why is only the relationship between yesterday's runoff and today's runoff considered when identifying the relationship in the time dimension? Why not consider the effect of the runoff from two days ago on today's runoff?

Line 378-379: “This chosen structure is then further compared with other copula functions to validate its efficacy.” Based on my understanding, the phrase “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. However, the sentence is somewhat ambiguous, so I am uncertain if my interpretation is correct. Could you please clarify and rewrite the sentence?

I think there is a repetition of the information presented in the table and the picture in Supplement D. Please select one of them (table or picture) for the information presentation.

Line 98-100: “This complexity can complicate the copula's structure determination, inflate computational demands during parameter fitting, and potentially diminish the accuracy of stochastic simulations.” In this sentence, the phrase “copula's structure determination” should be revised to “copula structure's determination”.

Line 365-367: “The subsequent step involves identifying the site with the most significant correlation to its preceding day's inflow, which is then used as a as a variable to represent the temporal relationship on that day.” There is an error in this sentence. “as a” is repeated, please delete the redundant one.

Line 442-443: “The obvious dark colored blocks in the graph indicate the high probabilities of being the high-water or the low-water concurrently.” This sentence seems a bit confused. Please rewrite it to avoid ambiguity.

Line 445-447: “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, recorded at 58.29% with the LX site, 61.25% with the QS site, and 57.15% with the SD site.” The sentence is not clear enough, please revise it and replace the word “recorded”.

Line 653-655: “Depending on the number of hydrometric stations, Wang and Shen (2023b) established the 7-dimensional regular vine (R-vine) copula models to depict the complex and diverse dependence.” Please delete the “the” before “7-dimensional regular vine”. Please replace the “dependence” by “dependencies”.

Line 700: “The conditional simulation is a double-edged sword.” Please remove the “the” before the conditional simulation.
Citation: https://doi.org/10.5194/egusphere-2024-2266-RC1
- AC1:
  'Reply on RC1', Xinting Yu, 19 Sep 2024
  (1) Subsection 2.3.1: In this subsection, the description of the text and the presentation of Figure 3 focus on illustrating the process of how to choose the key variables. But after selecting the key variables, how to construct the RDV-Copula model needs further elaboration. Please supplement this section and modify the picture if necessary.
  Thank you for your suggestion. Figure 3 has been modified. In addition, the manuscript has been supplemented with the entirety of the RDV-Copula model construction. The revised text and figure are shown below:
  Figure 3. Schematic diagram of the RDV-Copula method
  After identifying the "N+1" key variables, the marginal distribution function for each variable is determined, selecting the most appropriate distribution (e.g., Normal, Gamma) based on the statistical characteristics of each variable. Using these marginal distributions, a suitable copula structure is then selected, such as C-Vine or D-Vine, depending on the nature of dependencies among the key variables. Next, for each pair of variables in the chosen vine structure, the most appropriate bivariate copula family (e.g., Gaussian, Clayton, Gumbel) is selected to accurately capture their dependencies. Subsequently, parameters for each selected pair-copula are estimated sequentially using methods like Maximum Likelihood Estimation (MLE). Finally, the constructed copula model is validated using statistical criteria such as the Akaike Information Criterion (AIC).
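The construction steps above can be sketched for a single pair of variables. This is a minimal illustration, not the authors' implementation: it assumes rank-based pseudo-observations in place of fitted parametric marginals, a Gaussian pair copula as the only candidate family, and Kendall's tau inversion in place of MLE. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def pseudo_observations(x):
    """Rank-transform a sample to (0, 1): u_i = rank_i / (n + 1).

    Assumption: used here as a stand-in for a fitted marginal CDF.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


def kendall_tau(x, y):
    """Kendall's tau by direct pair counting (O(n^2), fine for small n)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def gaussian_copula_loglik(u, v, rho):
    """Log-likelihood of the bivariate Gaussian copula with parameter rho."""
    nd = NormalDist()
    ll = 0.0
    for ui, vi in zip(u, v):
        a, b = nd.inv_cdf(ui), nd.inv_cdf(vi)
        ll += (-0.5 * math.log(1.0 - rho ** 2)
               - (rho ** 2 * (a * a + b * b) - 2.0 * rho * a * b)
               / (2.0 * (1.0 - rho ** 2)))
    return ll


def fit_gaussian_pair(x, y):
    """Estimate rho by tau inversion and return (rho, AIC) for one pair."""
    u, v = pseudo_observations(x), pseudo_observations(y)
    rho = math.sin(math.pi * kendall_tau(x, y) / 2.0)
    rho = max(min(rho, 0.999), -0.999)  # keep the density finite
    aic = 2 * 1 - 2.0 * gaussian_copula_loglik(u, v, rho)  # one parameter
    return rho, aic


# Hypothetical toy data standing in for two key variables.
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 2, 5, 4, 6]
rho, aic = fit_gaussian_pair(x, y)
print(f"rho = {rho:.3f}, AIC = {aic:.3f}")
```

In a fuller workflow, each candidate family would typically be fitted to every pair in the chosen vine, and the AIC values compared across families and structures.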
  (2) Subsection 3.2.2.2: Why are these three different sets of structures chosen as benchmark models? What is the significance of the comparison of each set of benchmarks? Please explain how is it possible to validate the effectiveness of the proposed RDV-Copula method by comparing it with the three sets of benchmarks?
  The three different benchmark models were chosen to evaluate the RDV-Copula method across distinct dimensions of complexity and correlation (spatial and temporal) to assess its effectiveness in capturing dependencies. Here's why each set of structures was selected and their significance in validating the proposed RDV-Copula method:
  Benchmark 1: Four-dimensional spatial vine copula (Unconditional)
  Reason for selection: This benchmark focuses solely on the effect of spatial correlations, providing a simpler case where only inflows from the four sites (LSM-LX-QS-SD) on the same day are considered.
  
  Significance: By limiting the model to spatial correlations, Benchmark 1 provides a baseline to compare how well the RDV-Copula captures spatial dependencies. If the RDV-Copula outperforms this benchmark, it demonstrates that including temporal correlations (as RDV-Copula does) improves performance.
  
  Benchmark 2: Eight-dimensional spatial-temporal vine copula (Unconditional)
  Reason for selection: This benchmark extends the analysis by incorporating both spatial and temporal correlations. The inclusion of inflows from both the current and previous day (LSM-LX-QS-SD-LSM1-LX1-QS1-SD1) reflects a more complex dependence structure.
  
  Significance: This model demonstrates the performance when handling both spatial and temporal correlations in an unconditional framework. Comparing it with RDV-Copula shows whether the latter's reduced dimensionality (five-dimensional) or conditional simulation better captures the hydrological dynamics.
  
  Benchmark 3: Eight-dimensional spatial-temporal vine copula (Conditional)
  Reason for selection: This benchmark uses an eight-dimensional structure similar to that of Benchmark 2 but incorporates conditional simulation, assuming that the previous day’s runoff is known.
  
  Significance: By comparing it with RDV-con (which also uses conditional simulation but with a simplified five-dimensional structure), the comparison highlights whether the RDV-Copula’s dimensional reduction sacrifices accuracy or remains effective.
  
  How to validate the effectiveness of the proposed RDV-Copula method:
  Comparing RDV-un with Benchmark 1: If RDV-un (which includes both spatial and temporal correlations) outperforms Benchmark 1 (spatial-only), it validates that considering temporal information adds value.
  
  Comparing RDV-un with Benchmark 2: A comparison with Benchmark 2 (eight-dimensional) demonstrates whether RDV-un's reduced dimensionality preserves or enhances predictive performance, thereby evaluating the RDV-Copula's ability to balance model complexity and accuracy.
  
  Comparing RDV-con with Benchmark 3: RDV-con’s conditional simulation can be compared with Benchmark 3’s approach to assess whether the reduced dimensionality of the RDV-Copula still captures the essential conditional temporal dynamics.
  
  Overall, the comparison with these benchmarks allows for an assessment of whether the RDV-Copula method is both effective in reducing model complexity and capturing critical hydrological relationships, validating its practicality for applications.
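The difference in complexity between these models can be made concrete by counting pair copulas. A regular vine on d variables has d - 1 trees and d(d - 1)/2 pair copulas; the short sketch below applies that count to the dimensions named in this reply. The function name and the labels are illustrative.

```python
def n_pair_copulas(d):
    """Number of pair copulas in a regular vine on d variables."""
    return d * (d - 1) // 2


# Dimensions as stated in the reply above.
models = {
    "Benchmark 1 (4-D spatial)": 4,
    "RDV-Copula (5-D)": 5,
    "Benchmarks 2 and 3 (8-D spatial-temporal)": 8,
}

for name, d in models.items():
    print(f"{name}: {d - 1} trees, {n_pair_copulas(d)} pair copulas")
```

The five-dimensional model needs 10 pair copulas against 28 for the eight-dimensional benchmarks, so each fit involves far fewer family and parameter choices.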
  (3) Line 418-421: What does the symbol of the “*” in Figure 7(a) indicate? There is no explanation in the text or in the picture.
  The manuscript has been revised by adding the following sentence to explain the symbol “*”: “The "*" on the ellipse means that the correlation passes the significance test of . This sentence for explanation will be added to the manuscript.”
  (4) Line 198-201: The methodology provides a brief introduction to the differences and characteristics of C-vine and D-vine copula. However, there is still a possibility that it may confuse readers who do not have the knowledge of this area. I recommend that some schematic diagrams may be added to the introduction to assist comprehension.
  C-vine and D-vine have different structures. The schematic diagrams of these two are shown below. This set of diagrams will be added to the manuscript later.
  In the C-vine copula structure, each tree features a central node that is connected to all other nodes, as illustrated in Figure 5(a). The C-vine is suitable for structures in which one key variable is significantly correlated with all the remaining variables. In contrast, in the D-vine copula structure, each node is connected to no more than two edges, as depicted in Figure 5(b).
  Figure 5. The vine structures for the given order of 3 variables in (a) the C-vine copula and (b) the D-vine copula.
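The two structures can also be written out as edge lists. The sketch below enumerates the trees of a C-vine and a D-vine for a given variable order; the function names and the tuple layout (variable, variable, conditioning set) are illustrative choices, not taken from the manuscript.

```python
def cvine_trees(order):
    """Trees of a C-vine: tree j has one root joined to all later variables."""
    d = len(order)
    trees = []
    for j in range(d - 1):
        root, cond = order[j], tuple(order[:j])
        trees.append([(root, order[k], cond) for k in range(j + 1, d)])
    return trees


def dvine_trees(order):
    """Trees of a D-vine: tree j joins variables j steps apart along a path."""
    d = len(order)
    trees = []
    for j in range(1, d):
        trees.append([
            (order[i], order[i + j], tuple(order[i + 1:i + j]))
            for i in range(d - j)
        ])
    return trees


for label, trees in (("C-vine", cvine_trees([1, 2, 3])),
                     ("D-vine", dvine_trees([1, 2, 3]))):
    for level, edges in enumerate(trees, start=1):
        text = ", ".join(
            f"{a},{b}" + (f"|{','.join(map(str, c))}" if c else "")
            for a, b, c in edges
        )
        print(f"{label} tree {level}: {text}")
```

For three variables, the C-vine pairs variable 1 with both others in the first tree and models (2,3|1) in the second, while the D-vine chains 1-2-3 and models (1,3|2).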
  (5) In the process of constructing the joint distribution function, why is only the relationship between yesterday's runoff and today's runoff considered when identifying the relationship in the time dimension? Why not consider the effect of the runoff from two days ago on today's runoff?
  The strength of the association must be considered when identifying the key variables in the temporal dimension that are used to construct the joint distribution function.
  Take the runoff of LSM site as an example. LSM represents the runoff data of the current day, LSM1 represents the runoff data of the previous day, LSM2 represents the runoff data of two days ago, LSM3 represents the runoff data of three days ago, and LSM4 represents the runoff data of four days ago. The correlations between LSM and LSM1, LSM2, LSM3, and LSM4 are represented by (a)-(d) in the figure below. Pearson's correlation coefficients show a gradual decrease in correlation with time. The correlation is highest for two adjacent days.
  Here are the Pearson correlation coefficients:
  LSM and LSM1: 0.566
  
  LSM and LSM2: 0.311
  
  LSM and LSM3: 0.204
  
  LSM and LSM4: 0.180
  
  It can be concluded that the correlation between LSM and the previous day's runoff is the highest, at 0.566. The data from two days ago no longer has much influence on the current day's runoff, so it can be excluded from the key variable selection. Considering only the previous day's contribution in the time dimension can effectively represent the temporal correlation while avoiding an unnecessary increase in dimension.
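The lag screening described above can be reproduced with a few lines of code. The sketch below computes the Pearson correlation between a series and its lagged copies. It assumes a single continuous daily record (pairs that span a gap between years would need to be dropped first), and the synthetic series is only a placeholder for the LSM data; the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lag_correlation(series, lag):
    """Correlation between each value and the value `lag` steps earlier."""
    return pearson(series[lag:], series[:-lag])


# Placeholder data: a synthetic autoregressive series, NOT the LSM record.
rng = random.Random(0)
flow = [0.0]
for _ in range(499):
    flow.append(0.6 * flow[-1] + rng.gauss(0.0, 1.0))

correlations = {lag: lag_correlation(flow, lag) for lag in range(1, 5)}
best_lag = max(correlations, key=correlations.get)
for lag, r in correlations.items():
    print(f"lag {lag}: r = {r:.3f}")
print("lag with the highest correlation:", best_lag)
```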
  (6) Line 378-379: “This chosen structure is then further compared with other copula functions to validate its efficacy.” Based on my understanding, the phrase “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. However, the sentence is somewhat ambiguous, so I am uncertain if my interpretation is correct. Could you please clarify and rewrite the sentence?
  Sorry for causing such confusion. The vine copula structure (RDV-Cvine or RDV-Dvine) with better index values will be preferred. “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. The sentence is modified to: “The RDV-Copula structure with better index values is then further compared with other copula functions to validate its efficacy.”
  (7) I think there is a repetition of the information presented in the table and the picture in Supplement D. Please select one of them (table or picture) for the information presentation.
  Thanks for your suggestion. In Supplement D, only the pictures of the marginal distribution function are preserved for display. The table content has been removed.
  (8) Line 98-100: “This complexity can complicate the copula's structure determination, inflate computational demands during parameter fitting, and potentially diminish the accuracy of stochastic simulations.” In this sentence, the phrase “copula's structure determination” should be revised to “copula structure's determination”.
  Thanks for your advice. This is modified in the manuscript.
  (9) Line 365-367: “The subsequent step involves identifying the site with the most significant correlation to its preceding day's inflow, which is then used as a as a variable to represent the temporal relationship on that day.” There is an error in this sentence. “as a” is repeated, please delete the redundant one.
  This is modified in the manuscript.
  (10) Line 442-443: “The obvious dark colored blocks in the graph indicate the high probabilities of being the high-water or the low-water concurrently.” This sentence seems a bit confused. Please rewrite it to avoid ambiguity.
  Sorry for causing such confusion. This sentence has been revised to “The obvious dark-colored blocks in the graph indicate the high probabilities of being in high-water or low-water states concurrently.”
  (11) Line 445-447: “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, recorded at 58.29% with the LX site, 61.25% with the QS site, and 57.15% with the SD site.” The sentence is not clear enough, please revise it and replace the word “recorded”.
  Thanks for your advice. This sentence has been revised to “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, with values of 58.29% for the LX site, 61.25% for the QS site, and 57.15% for the SD site.”
  (12) Line 653-655: “Depending on the number of hydrometric stations, Wang and Shen (2023b) established the 7-dimensional regular vine (R-vine) copula models to depict the complex and diverse dependence.” Please delete the “the” before “7-dimensional regular vine”. Please replace the “dependence” by “dependencies”.
  This is modified in the manuscript.
  (13) Line 700: “The conditional simulation is a double-edged sword.” Please remove the “the” before the conditional simulation.
  This is modified in the manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC1
RC2:
'Comment on egusphere-2024-2266', Anonymous Referee #2, 16 Sep 2024

The manuscript is concerned with synchronization frequency analysis and stochastic simulation of multisite flood flows based on the complicated vine-copula structure, which is interesting. It is relevant and within the scope of the journal.
However, the manuscript, in its present form, needs some further improvements. Once the adequate revisions to the following points are implemented, the paper may be acceptable for publication. There are some specific comments that might help the authors further enhance the manuscript's quality.
* Introduction:
It is generally well-written, with good structure and clarity, but there are a few areas where improvements can be made for better precision and readability.
* Line 31-33: “As is reported by Centre for Research on the Epidemiology of Disasters (CRED)”. The use of "is" here is unnecessary and makes the sentence sound awkward. Remove the word “is” from the sentence.
* Line 37: “Large floods often result from the amalgamation of floods from multiple sub-watersheds”. Please replace "amalgamation" with "merging" for more concise language.
* Line 64-65: “Copula function is widely applied in hydrological fields...”. "Copula function" should be plural here.
* Method:
* Many equations are presented in the paper, and most look OK. However, please check carefully whether all equations are necessary and whether the quantities involved are properly explained. Line 232-237: For equations (6), (7), and (8), the variables are not clearly explained. Please elaborate on the meaning of each variable, such as u_ph^1, u_ph^2, u_ph^3, u_ph^4 and u_pl^1, u_pl^2, u_pl^3, u_pl^4. This ensures the reader understands the variables used in the equations.
* Figure 2. The colors used to represent elements such as reservoirs and cross sections are a bit confusing. Use the same color for the same elements throughout the figure. For instance, if reservoirs are represented by a specific color, maintain that color consistently. Please revise it.
* Figure 4. How is it possible to distinguish between conditional simulation and unconditional simulation through Figure 4? There is no clear explanation or visual distinction between conditional and unconditional simulation in the figure. Please provide a clear legend or add explanations for (a) and (b) in Figure 4, highlighting the difference between conditional and unconditional simulations.
* Case study:
* Line 316-317: "This study focuses on four major sites within the Shifeng Creek catchment" The reason for selecting these sites is unclear. Clarify that these sites are chosen due to their strategic importance. Can you explain briefly why these particular sites were selected for the study?
* Line 344-346: What types of synchronization do the combinations [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] represent, respectively? The abbreviations are used without prior explanation. Please define each abbreviation before using it; otherwise, readers will find the text harder to follow.
* Line 320-321: “To achieve this, daily runoff data of August, covering a span from 2000 to 2020, have been compiled.” This sentence would be more properly expressed in the past tense. Change "have been compiled" to "were compiled" for better tense consistency.
* Results

* This section is well written. The only area of weakness is the display of pictures. The font size of figure 10 is a little small. It could be considered to keep only a lower number of sub-figures and the rest could be placed in supplementary part of the article.
* Discussion & Conclusion
* Line 666-668: “Although the eight-dimensional vine copula model takes more variables into account, including both temporal and spatial correlation, the model is too complicated due to many variables, which makes the simulation less efficient on the contrary.” This sentence is too long and complex. Please simplify it.
* Minor comments
The grammar in the article is generally correct. However, there are some words that are not used appropriately. A few examples are given below:

* Line 254-255: “This strategy aims to distill essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.” Here, the meaning of the word “distill” is not suitable for this application. It would be more appropriate to replace it with “extract”.
* Line 271-273: “The core distinction between these two simulation methods hinges on whether certain data points are pre-determined at the time of simulation.” There is something wrong with the logic of this statement.
* Line 347: “The calculation equations can be referenced in Appendix B.” I think “provided” may be better than “referenced”.
* The supplemental content section is a bit too redundant. Think about keeping the important parts.

Citation: https://doi.org/10.5194/egusphere-2024-2266-RC2
- AC2: 'Reply on RC2', Xinting Yu, 20 Sep 2024
  
  * Introduction:
  It is generally well-written, with good structure and clarity, but there are a few areas where improvements can be made for better precision and readability.
  * Line 31-33: “As is reported by Centre for Research on the Epidemiology of Disasters (CRED)” The use of "is" here is unnecessary and makes the sentence sound awkward. Remove the word “is” from the sentence.
  Thanks for your suggestion. The word “is” has been removed from this sentence. This is modified in the manuscript. This sentence has been revised to “As reported by Centre for Research on the Epidemiology of Disasters (CRED)”.
  * Line 37: “Large floods often result from the amalgamation of floods from multiple sub-watersheds” Please replace "amalgamation" with "merging" for more concise language.
  Thanks for your suggestion. This is modified in the manuscript. This sentence has been revised to “Large floods often result from the merging of floods from multiple sub-watersheds.”.
  * Line 64-65: “Copula function is widely applied in hydrological fields...” "Copula function" should be plural here.
  Thanks for your suggestion. This is revised in the manuscript. This sentence has been modified to “Copula functions are widely applied in hydrological fields, including the joint frequency analysis (Liu et al., 2018; Zhang et al., 2021), water resources management (Gao et al., 2018; Nazeri Tahroudi et al., 2022), wetness-dryness encountering (Wang et al., 2022; Zhang et al., 2023), flood risk assessment (Li et al., 2022; Tosunoglu et al., 2020; Zhong et al., 2021), water quality analysis (Yu et al., 2020; Yu and Zhang, 2021), precipitation model (Gao et al., 2020; Nazeri Tahroudi et al., 2023; Tahroudi et al., 2022) and so on.”
  
  * Method:
  * Many equations are presented in the paper, and most look OK. However, please check carefully whether all equations are necessary and whether the quantities involved are properly explained. Line 232-237: For equations (6), (7), and (8), the variables are not clearly explained. Please elaborate on the meaning of each variable, such as u_ph^1, u_ph^2, u_ph^3, u_ph^4 and u_pl^1, u_pl^2, u_pl^3, u_pl^4. This ensures the reader understands the variables used in the equations.
  Sorry for causing such confusion. The inflows of the different sites are represented by X^1, X^2, X^3, …, X^{N+M}. The variables X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M} represent the inflow amounts corresponding to the high-water at these sites, while X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M} denote the inflow amounts corresponding to the low-water at the respective sites. The marginal distribution functions for these inflows are represented as u^1, u^2, u^3, …, u^{N+M}. Specifically, u_ph^1, u_ph^2, u_ph^3, …, u_ph^{N+M} denote the marginal distribution functions corresponding to the high-water inflow amounts X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M}, capturing the probabilistic behavior of the inflows during high-water conditions at each site. Similarly, u_pl^1, u_pl^2, u_pl^3, …, u_pl^{N+M} represent the marginal distribution functions for the low-water inflow amounts X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M}, describing the inflow behavior during low-water conditions at these sites.
  The explanation of this part is supplemented in the manuscript as follows:
  “Let the inflows of the different sites be represented by X^1, X^2, X^3, …, X^{N+M}. X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M} represent the amounts of inflow corresponding to the high-water of these different sites, respectively. Meanwhile, X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M} represent the amounts of inflow corresponding to the low-water of these different sites, respectively. The marginal distribution functions are u^1, u^2, u^3, …, u^{N+M}, respectively. Specifically, u_ph^1, u_ph^2, u_ph^3, …, u_ph^{N+M} denote the marginal distribution functions corresponding to the high-water inflow amounts X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M}, capturing the probabilistic behavior of the inflows during high-water conditions at each site. Similarly, u_pl^1, u_pl^2, u_pl^3, …, u_pl^{N+M} represent the marginal distribution functions for the low-water inflow amounts X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M}, describing the inflow behavior during low-water conditions at these sites.”
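The role of the high-water and low-water marginal probabilities can be illustrated with an empirical count. This is a simplified sketch, not equations (6) to (8) of the manuscript, which evaluate these probabilities from the fitted copula. It assumes that u is the non-exceedance probability F(x), and the thresholds 0.375 and 0.625 are hypothetical placeholders for u_pl and u_ph; the data and function names are also illustrative.

```python
def classify(u, p_low=0.375, p_high=0.625):
    """Map a marginal probability u = F(x) to a state: H, M or L.

    The default thresholds are assumptions, not values from the manuscript.
    """
    if u >= p_high:
        return "H"
    if u <= p_low:
        return "L"
    return "M"


def synchronization_probability(u_by_site, p_low=0.375, p_high=0.625):
    """Fraction of days on which every site is in the same state."""
    days = list(zip(*u_by_site.values()))
    same = sum(
        1 for day in days
        if len({classify(u, p_low, p_high) for u in day}) == 1
    )
    return same / len(days)


# Hypothetical marginal probabilities for two sites over four days.
u_by_site = {
    "LSM": [0.9, 0.1, 0.5, 0.9],
    "LX": [0.8, 0.2, 0.5, 0.1],
}
print(synchronization_probability(u_by_site))  # 0.75
```

Passing two sites gives a pairwise probability; passing all four gives the probability that the whole system is in the same state.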
  * Figure 2. The colors used to represent elements such as reservoirs and cross sections are a bit confusing. Use the same color for the same elements throughout the figure. For instance, if reservoirs are represented by a specific color, maintain that color consistently. Please revise it.
  The figure has been modified, as shown below. The colors of various elements in the figure are unified. The reservoirs are shown in green and the cross-sections are shown in blue.
  Figure 2. Schematic diagram of the generalized system in the catchment
  * Figure 4. How is it possible to distinguish between conditional simulation and unconditional simulation through Figure 4? There is no clear explanation or visual distinction between conditional and unconditional simulation in the figure. Please provide a clear legend or add explanations for (a) and (b) in Figure 4, highlighting the difference between conditional and unconditional simulations.
  Thank you for your suggestion. Figure 4 has been modified. The descriptions for both conditional and unconditional simulations have been further refined and supplemented to enhance clarity. This is revised in the manuscript.
  Unconditional simulation (Figure 4(a)): This approach generates random samples based solely on the marginal probability distribution, without incorporating any existing data constraints. The probability distribution is shown in the upper-left plot, and random samples are generated simultaneously, resulting in the scatter plot below. The generated samples, represented by blue points, illustrate the joint variability according to their predefined marginal distributions. Since no prior information is used, each data point is in an unknown state before the simulation.
  Conditional simulation (Figure 4(b)): In this scenario, the simulation takes into account pre-existing data conditions. The marginal probability distribution is displayed in the top-center plot, while the known conditional data is shown in the upper-right scatter plot (in pink). These known data points act as a constraint for generating new random samples. The resulting scatter plot below (blue and pink points) demonstrates how the conditional samples are influenced by both the marginal distribution and the specified conditions of the known data. This method allows for a tailored simulation that incorporates pre-existing data insights.
  Figure 4. Schematic diagram for generating random simulation samples (a) unconditional simulation (b) conditional simulation
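The contrast between the two panels can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: it uses a single bivariate Gaussian pair copula with a hypothetical correlation `rho` as a stand-in for the vine copula, and the function names are invented for this example rather than taken from the manuscript. Unconditional simulation draws both variables at once, while conditional simulation fixes the known value `u_known` (for instance, the previous day's non-exceedance probability) and draws only the unknown one.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used to map between z-scores and probabilities


def simulate_unconditional(rho, n, rng):
    """Draw n pairs (u, v) jointly from a bivariate Gaussian copula (assumed stand-in)."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        pairs.append((STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)))
    return pairs


def simulate_conditional(u_known, rho, n, rng):
    """Draw n values of v given that the first variable is already known to equal u_known."""
    z1 = STD_NORMAL.inv_cdf(u_known)  # the known value acts as the constraint
    values = []
    for _ in range(n):
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        values.append(STD_NORMAL.cdf(z2))
    return values


rng = random.Random(42)
uncond = simulate_unconditional(rho=0.8, n=2000, rng=rng)
cond = simulate_conditional(u_known=0.9, rho=0.8, n=2000, rng=rng)

mean_uncond_v = sum(v for _, v in uncond) / len(uncond)
mean_cond_v = sum(cond) / len(cond)
print(f"mean of v, unconditional: {mean_uncond_v:.3f}")
print(f"mean of v, given u = 0.9: {mean_cond_v:.3f}")
```

In this sketch the unconditional samples of `v` are spread over the whole unit interval, whereas the conditional samples are pulled toward high values because the known value is high and the assumed dependence is positive. The samples are on the probability scale; mapping them back to flows would require the inverse of each fitted marginal distribution.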
  
  * Case study:
  * Line 316-317: "This study focuses on four major sites within the Shifeng Creek catchment" The reason for selecting these sites is unclear. Clarify that these sites are chosen due to their strategic importance. Can you explain briefly why these particular sites were selected for the study?
  These four sites were selected due to their strategic importance within the Shifeng Creek catchment, representing key locations in the upper, middle, and lower reaches of the river system. The Lishimen Reservoir (LSM) and Longxi Reservoir (LX) sites are both situated in the upper reaches of the catchment and play a crucial role in flood control by regulating inflows and managing water levels. Understanding the flow patterns at these reservoir sites is essential for optimizing reservoir operations and mitigating downstream flood risks, especially during peak flood periods. The Qianshan (QS) cross-section, located in the middle reaches, and the Shaduan (SD) cross-section, positioned in the lower reaches, are critical control points for flood management. Analyzing flow processes at these sections allows for better coordination of reservoir operations and helps prevent the convergence of flood peaks, thus enhancing flood mitigation throughout the catchment.
  The reasons for choosing these four sites have been added to the manuscript as follows.
  “These four sites were selected for their strategic importance within the Shifeng Creek catchment, covering the upper, middle, and lower reaches. The Lishimen (LSM) and Longxi (LX) reservoirs, both in the upper reaches, are vital for flood control, regulating inflows to reduce downstream flood risks. The Qianshan (QS) cross-section, in the middle reaches, and the Shaduan (SD) cross-section, in the lower reaches, serve as key flood control points. Analyzing flows at these sites enables better coordination of reservoir operations and prevents flood peak convergence, enhancing overall flood management.”
  * Line 344-346: The combinations of [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] mean what synchronization respectively? The abbreviations are used without prior explanation. Before utilizing the abbreviations please correspond the abbreviated letters to the original meaning. If not, it will be harder for readers to understand.
  These three combinations represent different types of synchronization. Specifically, [X-H, Y-H, Z-H, W-H] indicates that all four sites (X, Y, Z, and W) are in a high-water state, [X-M, Y-M, Z-M, W-M] signifies that all sites are in a medium-water state, and [X-L, Y-L, Z-L, W-L] means that all sites are in a low-water state simultaneously. To improve clarity, the manuscript will be revised as follows.
  “Considering the three potential states (High/Medium/Low) at each site, a total of 81 possible inflow-state combinations are identified. For ease of presentation, H, M, and L are used as abbreviations for High, Medium, and Low. Among the 81 combinations, [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] are classified as synchronous high-water, synchronous medium-water, and synchronous low-water, respectively, while the remainder are deemed asynchronous. The calculation equations can be referenced in Appendix B.”
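The count of 81 follows from three states at each of four sites (3^4 = 81). A minimal sketch of the enumeration and classification is given here; the site labels X, Y, Z, W and state labels H, M, L come from the text, while the function name `classify` is hypothetical.

```python
from itertools import product

SITES = ("X", "Y", "Z", "W")
STATES = ("H", "M", "L")  # High, Medium, Low

SYNCHRONOUS_LABELS = {
    "H": "synchronous high-water",
    "M": "synchronous medium-water",
    "L": "synchronous low-water",
}


def classify(combo):
    """Label one tuple of per-site states as synchronous (all equal) or asynchronous."""
    if len(set(combo)) == 1:
        return SYNCHRONOUS_LABELS[combo[0]]
    return "asynchronous"


# Every possible assignment of a state to each of the four sites.
combos = list(product(STATES, repeat=len(SITES)))
labels = [classify(c) for c in combos]
n_async = labels.count("asynchronous")

print(len(combos), "combinations in total")
print(n_async, "asynchronous combinations")
```

Only three of the combinations are synchronous, so the remaining 78 are asynchronous.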
  * Line 320-321: “To achieve this, daily runoff data of August, covering a span from 2000 to 2020, have been compiled.” This sentence would be more properly expressed in the past tense. Change "have been compiled" to "were compiled" for better tense consistency.
  This is modified in the manuscript.
  
  * Results
  
  * This section is well written. The only area of weakness is the display of pictures. The font size of figure 10 is a little small. It could be considered to keep only a lower number of sub-figures and the rest could be placed in supplementary part of the article.
  The font size in the figure has been enlarged, and the modified figures are shown here. To keep each panel legible, only nine sub-figures are shown here; the rest have been moved to the supplemental content.
  Figure 10. Cumulative probability distribution of the preferred marginal distribution function for runoff on each day from 1 to 9 August
  
  * Discussion & Conclusion
  * Line 666-668: “Although the eight-dimensional vine copula model takes more variables into account, including both temporal and spatial correlation, the model is too complicated due to many variables, which makes the simulation less efficient on the contrary.” This sentence is too long and complex. Please simplify it.
  This sentence has been modified in the manuscript.
  “Although the eight-dimensional vine copula model considers both temporal and spatial correlations, its complexity reduces simulation efficiency due to the large number of variables.”
  
  * Minor comments
  The grammar in the article is generally correct. However, there are some words that are not used appropriately. A few examples are given below:
  
  * Line 254-255: “This strategy aims to distill essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.” Here, the meaning of the word “distill” is not suitable for this application. It would be more appropriate to replace it with “extract”.
  Thanks for your suggestion. The word “distill” has been replaced. And this sentence has been modified in the manuscript.
  “This strategy aims to extract essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.”
  * Line 271-273: “The core distinction between these two simulation methods hinges on whether certain data points are pre-determined at the time of simulation.” There is something wrong with the logic of this statement.
  This sentence has been modified in the manuscript.
  “The key difference between these two simulation methods lies in whether specific data points are known in advance before generating the simulation.”
  * Line 347: “The calculation equations can be referenced in Appendix B.” I think “provided” may be better than “referenced”.
  Thanks for your suggestion. The word “referenced” has been replaced by “provided”, and the sentence has been modified as follows.
  “The calculation equations are provided in Appendix B.”
  * The supplemental content section is a bit too redundant. Think about keeping the important parts.
  Appendix A, B, and C are all essential. The table in Appendix D is removed and only the figures of the marginal distribution results are retained for presentation.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC2
RC3:
'Comment on egusphere-2024-2266', Anonymous Referee #3, 26 Sep 2024

The methodology and reasoning behind this study are sound, and the manuscript is well-written. The results and discussion sections offer valuable contributions to the scope of this journal. I have no further comments for this review.

Citation: https://doi.org/10.5194/egusphere-2024-2266-RC3
- AC3: 'Reply on RC3', Xinting Yu, 26 Sep 2024
  
  Thank you very much for your positive feedback and for recognizing the effort and contributions of my work. I truly appreciate your kind words regarding the methodology, reasoning, and the clarity of the manuscript. Your acknowledgment of the results and discussion sections as valuable contributions to the journal is incredibly encouraging. I am grateful for your time and thorough review, which have been instrumental in enhancing the quality of this paper. Thank you once again for your support.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC3

Interactive discussion

Status: closed

RC1:
'Comment on egusphere-2024-2266', Anonymous Referee #1, 30 Aug 2024
This study introduces a novel RDV-Copula (Reduced-dimension vine copula) approach to improve the modeling and prediction of flood flows across multiple sites within a watershed. The method integrates complex spatial-temporal data while reducing the computational complexity of vine copula models. By identifying key variables and constructing RDV-Copula functions, the approach effectively captures spatial-temporal relationships between sites, ensuring accurate flood risk assessment. Applied to the Shifeng Creek watershed, the method revealed strong spatial connectivity, highlighting the increased risk of downstream flooding during heavy rainfall events. Validation against benchmark models showed that increasing model dimensions does not always enhance simulation accuracy, and in some cases, can complicate the model. The RDV-Copula method strikes an optimal balance between information accuracy and simplicity. This approach proves particularly useful for flood risk analysis and management, providing a refined methodology for multisite runoff simulations, and supporting decision-making for flood control and event scheduling.
Overall, the methods appear reliable and the originality of the research is undoubted. The analyses in this study are well organized and the results are reasonable. In addition, the presentation of this article is generally clear. It is a valuable study and within the scope of this journal. Therefore, I recommend minor revisions prior to final acceptance.
General comments for the authors’ reference:
Subsection 2.3.1: In this subsection, the description of the text and the presentation of Figure 3 focus on illustrating the process of how to choose the key variables. But after selecting the key variables, how to construct the RDV-Copula model needs further elaboration. Please supplement this section and modify the picture if necessary.

Subsection 3.2.2.2: Why are these three different sets of structures chosen as benchmark models? What is the significance of the comparison of each set of benchmarks? Please explain how it is possible to validate the effectiveness of the proposed RDV-Copula method by comparing it with the three sets of benchmarks.

Line 418-421: What does the symbol of the “*” in Figure 7(a) indicate? There is no explanation in the text or in the picture.

Line 198-201: The methodology provides a brief introduction to the differences and characteristics of C-vine and D-vine copula. However, there is still a possibility that it may confuse readers who do not have the knowledge of this area. I recommend that some schematic diagrams may be added to the introduction to assist comprehension.

In the process of constructing the joint distribution function, why is only the relationship between yesterday's runoff and today's runoff considered when identifying the relationship in the time dimension? Why not consider the effect of the runoff from two days ago on today's runoff?

Line 378-379: “This chosen structure is then further compared with other copula functions to validate its efficacy.” Based on my understanding, the phrase “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. However, the sentence is somewhat ambiguous, so I am uncertain if my interpretation is correct. Could you please clarify and rewrite the sentence?

I think there is a repetition of the information presented in the table and the picture in Supplement D. Please select one of them (table or picture) for the information presentation.

Line 98-100: “This complexity can complicate the copula's structure determination, inflate computational demands during parameter fitting, and potentially diminish the accuracy of stochastic simulations.” In this sentence, the phrase “copula's structure determination” should be revised to “copula structure's determination”.

Line 365-367: “The subsequent step involves identifying the site with the most significant correlation to its preceding day's inflow, which is then used as a as a variable to represent the temporal relationship on that day.” There is an error in this sentence. “as a” is repeated, please delete the redundant one.

Line 442-443: “The obvious dark colored blocks in the graph indicate the high probabilities of being the high-water or the low-water concurrently.” This sentence seems a bit confused. Please rewrite it to avoid ambiguity.

Line 445-447: “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, recorded at 58.29% with the LX site, 61.25% with the QS site, and 57.15% with the SD site.” The sentence is not clear enough, please revise it and replace the word “recorded”.

Line 653-655: “Depending on the number of hydrometric stations, Wang and Shen (2023b) established the 7-dimensional regular vine (R-vine) copula models to depict the complex and diverse dependence.” Please delete the “the” before “7-dimensional regular vine”. Please replace the “dependence” by “dependencies”.

Line 700: “The conditional simulation is a double-edged sword.” Please remove the “the” before the conditional simulation.
Citation: https://doi.org/10.5194/egusphere-2024-2266-RC1
- AC1:
  'Reply on RC1', Xinting Yu, 19 Sep 2024
  (1) Subsection 2.3.1: In this subsection, the description of the text and the presentation of Figure 3 focus on illustrating the process of how to choose the key variables. But after selecting the key variables, how to construct the RDV-Copula model needs further elaboration. Please supplement this section and modify the picture if necessary.
  Thank you for your suggestion. Figure 3 has been modified. In addition, the manuscript has been supplemented with the entirety of the RDV-Copula model construction. The revised text and figure are shown below:
  Figure 3. Schematic diagram of the RDV-Copula method
  After identifying the "N+1" key variables, the marginal distribution function for each variable is determined, selecting the most appropriate distribution (e.g., Normal, Gamma) based on the statistical characteristics of each variable. Using these marginal distributions, a suitable copula structure is then selected, such as C-Vine or D-Vine, depending on the nature of dependencies among the key variables. Next, for each pair of variables in the chosen vine structure, the most appropriate bivariate copula family (e.g., Gaussian, Clayton, Gumbel) is selected to accurately capture their dependencies. Subsequently, parameters for each selected pair-copula are estimated sequentially using methods like Maximum Likelihood Estimation (MLE). Finally, the constructed copula model is validated using statistical criteria such as the Akaike Information Criterion (AIC).
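The first step of this procedure, choosing a marginal distribution for each key variable, can be illustrated with a small sketch. It is not the authors' implementation: to stay dependency-free it compares only a normal and a log-normal candidate (both have closed-form maximum likelihood estimates) and ranks them by AIC, whereas the text names Normal and Gamma as examples. The sample is synthetic and all function names are hypothetical.

```python
import math
import random


def normal_loglik(x):
    """Maximized log-likelihood of a normal distribution (MLE mean and variance)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(x):
    """Maximized log-likelihood of a log-normal distribution (requires x > 0)."""
    logs = [math.log(v) for v in x]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    # Normal log-likelihood of log(x) plus the Jacobian term -sum(log x).
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0) - sum(logs)


def aic(loglik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2.0 * n_params - 2.0 * loglik


def select_marginal(x):
    """Return the name of the candidate with the lowest AIC, plus all scores."""
    scores = {
        "normal": aic(normal_loglik(x), 2),
        "lognormal": aic(lognormal_loglik(x), 2),
    }
    return min(scores, key=scores.get), scores


# Synthetic, right-skewed "runoff" sample used purely for illustration.
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
best, scores = select_marginal(sample)
print("preferred marginal:", best)
```

The same pattern (fit each candidate, score it, keep the best) carries over to the later steps, where the candidates are bivariate copula families for each pair in the vine rather than marginal distributions. In practice a dedicated vine copula library would handle structure selection, family selection, and sequential parameter estimation.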
  (2) Subsection 3.2.2.2: Why are these three different sets of structures chosen as benchmark models? What is the significance of the comparison of each set of benchmarks? Please explain how it is possible to validate the effectiveness of the proposed RDV-Copula method by comparing it with the three sets of benchmarks.
  The three different benchmark models were chosen to evaluate the RDV-Copula method across distinct dimensions of complexity and correlation (spatial and temporal) to assess its effectiveness in capturing dependencies. Here's why each set of structures was selected and their significance in validating the proposed RDV-Copula method:
  Benchmark 1: Four-dimensional spatial vine copula (Unconditional)
  Reason for selection: This benchmark focuses solely on the effect of spatial correlations, providing a simpler case where only inflows from the four sites (LSM-LX-QS-SD) on the same day are considered.
  
  Significance: By limiting the model to spatial correlations, Benchmark 1 provides a baseline to compare how well the RDV-Copula captures spatial dependencies. If the RDV-Copula outperforms this benchmark, it demonstrates that including temporal correlations (as RDV-Copula does) improves performance.
  
  Benchmark 2: Eight-dimensional spatial-temporal vine copula (Unconditional)
  Reason for selection: This benchmark extends the analysis by incorporating both spatial and temporal correlations. The inclusion of inflows from both the current and previous day (LSM-LX-QS-SD-LSM1-LX1-QS1-SD1) reflects a more complex dependence structure.
  
  Significance: This model demonstrates the performance when handling both spatial and temporal correlations in an unconditional framework. Comparing it with RDV-Copula shows whether the latter's reduced dimensionality (five-dimensional) or conditional simulation better captures the hydrological dynamics.
  
  Benchmark 3: Eight-dimensional spatial-temporal vine copula (Conditional)
  Reason for selection: This benchmark uses an eight-dimensional structure similar to that of Benchmark 2 but incorporates conditional simulation, assuming that the previous day’s runoff is known.
  
  Significance: By comparing it with RDV-con (which also uses conditional simulation but with a simplified five-dimensional structure), the comparison highlights whether the RDV-Copula’s dimensional reduction sacrifices accuracy or remains effective.
  
  How to validate the effectiveness of the proposed RDV-Copula method:
  Comparing RDV-un with Benchmark 1: If RDV-un (which includes both spatial and temporal correlations) outperforms Benchmark 1 (spatial-only), it validates that considering temporal information adds value.
  
  Comparing RDV-un with Benchmark 2: A comparison with Benchmark 2 (eight-dimensional) demonstrates whether RDV-un's reduced dimensionality preserves or enhances predictive performance, thereby evaluating the RDV-Copula's ability to balance model complexity and accuracy.
  
  Comparing RDV-con with Benchmark 3: RDV-con’s conditional simulation can be compared with Benchmark 3’s approach to assess whether the reduced dimensionality of the RDV-Copula still captures the essential conditional temporal dynamics.
  
  Overall, the comparison with these benchmarks allows for an assessment of whether the RDV-Copula method is both effective in reducing model complexity and capturing critical hydrological relationships, validating its practicality for applications.
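One way to make the complexity argument concrete is to count the pair copulas that each model requires: a vine copula on d variables is built from d(d-1)/2 bivariate copulas. The short sketch below applies this standard formula to the dimensions mentioned above (four, five, and eight); the function name is hypothetical.

```python
def n_pair_copulas(d):
    """Number of bivariate (pair) copulas in a vine copula on d variables."""
    return d * (d - 1) // 2


models = [
    ("Benchmark 1 (four-dimensional, spatial only)", 4),
    ("RDV-Copula (five-dimensional)", 5),
    ("Benchmarks 2 and 3 (eight-dimensional, spatial-temporal)", 8),
]

for name, dim in models:
    print(f"{name}: {n_pair_copulas(dim)} pair copulas")
```

Moving from five to eight dimensions raises the number of pair copulas to select and fit from 10 to 28, which is the kind of growth in model complexity that the reduced-dimension approach is meant to avoid.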
  (3) Line 418-421: What does the symbol of the “*” in Figure 7(a) indicate? There is no explanation in the text or in the picture.
  The manuscript has been revised by adding the following sentence to explain the symbol “*”: “The "*" on the ellipse means that the correlation passes the significance test of .”
  (4) Line 198-201: The methodology provides a brief introduction to the differences and characteristics of C-vine and D-vine copula. However, there is still a possibility that it may confuse readers who do not have the knowledge of this area. I recommend that some schematic diagrams may be added to the introduction to assist comprehension.
  C-vine and D-vine have different structures. The schematic diagrams of these two are shown below. This set of diagrams will be added to the manuscript later.
  In the C-vine copula structure, each tree has a central (root) node that is connected to all other nodes, as illustrated in Figure 5(a). The C-vine is suitable when one key variable is strongly correlated with all of the remaining variables. In contrast, in the D-vine copula structure, each node has no more than two edges, so each tree forms a path, as depicted in Figure 5(b).
  Figure 5. The vine structures for the given order of 3 variables in (a) the C-vine copula and (b) the D-vine copula.
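The structural difference can also be shown by listing the edges of the first tree for a given variable order. This is a minimal sketch with hypothetical function names; it covers only the first tree, not the conditional pairs in the higher trees.

```python
def cvine_first_tree(order):
    """First tree of a C-vine: the first variable is the root, linked to every other one."""
    root = order[0]
    return [(root, v) for v in order[1:]]


def dvine_first_tree(order):
    """First tree of a D-vine: consecutive variables are linked, forming a path."""
    return list(zip(order, order[1:]))


def max_degree(edges):
    """Largest number of edges meeting at any single node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return max(counts.values())


three = [1, 2, 3]
print("C-vine, 3 variables:", cvine_first_tree(three))
print("D-vine, 3 variables:", dvine_first_tree(three))

five = [1, 2, 3, 4, 5]
print("C-vine max degree, 5 variables:", max_degree(cvine_first_tree(five)))
print("D-vine max degree, 5 variables:", max_degree(dvine_first_tree(five)))
```

With only three variables both first trees are paths and differ only in which variable sits in the middle. From four variables onward the star shape of the C-vine and the path shape of the D-vine become distinct, as the maximum node degree shows.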
  (5) In the process of constructing the joint distribution function, why is only the relationship between yesterday's runoff and today's runoff considered when identifying the relationship in the time dimension? Why not consider the effect of the runoff from two days ago on today's runoff?
  The reason is that the strength of association must be considered when identifying the key variables in the temporal dimension that are used to construct the joint distribution function.
  Take the runoff of LSM site as an example. LSM represents the runoff data of the current day, LSM1 represents the runoff data of the previous day, LSM2 represents the runoff data of two days ago, LSM3 represents the runoff data of three days ago, and LSM4 represents the runoff data of four days ago. The correlations between LSM and LSM1, LSM2, LSM3, and LSM4 are represented by (a)-(d) in the figure below. Pearson's correlation coefficients show a gradual decrease in correlation with time. The correlation is highest for two adjacent days.
  Here are the Pearson correlation coefficients:
  LSM and LSM1: 0.566
  
  LSM and LSM2: 0.311
  
  LSM and LSM3: 0.204
  
  LSM and LSM4: 0.180
  
  It can be concluded that the correlation between LSM and the previous day's runoff is the highest, at 0.566, whereas the runoff from two days ago (0.311) no longer has much influence on the current day's runoff and can therefore be excluded from the key variable selection. Considering only the previous day's contribution in the time dimension effectively represents the temporal correlation while avoiding an unnecessary increase in dimension.
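The lag screening described above amounts to computing the Pearson correlation between a runoff series and shifted copies of itself. The sketch below shows the calculation on a synthetic first-order autoregressive series, which is an assumption made only for illustration; it does not reproduce the LSM data or the coefficients reported above, and the function names are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def lag_correlation(series, lag):
    """Correlation between the series and its values `lag` days earlier (lag >= 1)."""
    return pearson(series[lag:], series[:-lag])


# Synthetic series in which each value depends on the previous one (illustration only).
rng = random.Random(1)
series = [0.0]
for _ in range(1999):
    series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))

lags = {k: lag_correlation(series, k) for k in range(1, 5)}
for k, r in lags.items():
    print(f"lag {k}: r = {r:.3f}")
```

For real data the same loop would be run per site, and a lag would be kept as a candidate variable only if its correlation is judged strong enough; the cut-off is a modelling choice.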
  (6) Line 378-379: “This chosen structure is then further compared with other copula functions to validate its efficacy.” Based on my understanding, the phrase “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. However, the sentence is somewhat ambiguous, so I am uncertain if my interpretation is correct. Could you please clarify and rewrite the sentence?
  Sorry for causing such confusion. The vine copula structure (RDV-Cvine or RDV-Dvine) with better index values will be preferred. “This chosen structure” refers to the structure selected after comparing the RDV-Cvine and RDV-Dvine. The sentence is modified to: “The RDV-Copula structure with better index values is then further compared with other copula functions to validate its efficacy.”
  (7) I think there is a repetition of the information presented in the table and the picture in Supplement D. Please select one of them (table or picture) for the information presentation.
  Thanks for your suggestion. In Supplement D, only the pictures of the marginal distribution function are preserved for display. The table content has been removed.
  (8) Line 98-100: “This complexity can complicate the copula's structure determination, inflate computational demands during parameter fitting, and potentially diminish the accuracy of stochastic simulations.” In this sentence, the phrase “copula's structure determination” should be revised to “copula structure's determination”.
  Thanks for your advice. This is modified in the manuscript.
  (9) Line 365-367: “The subsequent step involves identifying the site with the most significant correlation to its preceding day's inflow, which is then used as a as a variable to represent the temporal relationship on that day.” There is an error in this sentence. “as a” is repeated, please delete the redundant one.
  This is modified in the manuscript.
  (10) Line 442-443: “The obvious dark colored blocks in the graph indicate the high probabilities of being the high-water or the low-water concurrently.” This sentence seems a bit confused. Please rewrite it to avoid ambiguity.
  Sorry for causing such confusion. This sentence has been revised to “The obvious dark-colored blocks in the graph indicate the high probabilities of being in high-water or low-water states concurrently.”
  (11) Line 445-447: “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, recorded at 58.29% with the LX site, 61.25% with the QS site, and 57.15% with the SD site.” The sentence is not clear enough, please revise it and replace the word “recorded”.
  Thanks for your advice. This sentence has been revised to “While the LSM site's synchronization probabilities with the other sites are comparatively lower, they still exceed 50%, with values of 58.29% for the LX site, 61.25% for the QS site, and 57.15% for the SD site.”
  (12) Line 653-655: “Depending on the number of hydrometric stations, Wang and Shen (2023b) established the 7-dimensional regular vine (R-vine) copula models to depict the complex and diverse dependence.” Please delete the “the” before “7-dimensional regular vine”. Please replace the “dependence” by “dependencies”.
  This is modified in the manuscript.
  (13) Line 700: “The conditional simulation is a double-edged sword.” Please remove the “the” before the conditional simulation.
  This is modified in the manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC1
RC2:
'Comment on egusphere-2024-2266', Anonymous Referee #2, 16 Sep 2024

The manuscript is concerned with synchronization frequency analysis and stochastic simulation of multisite flood flows based on the complicated vine-copula structure, which is interesting. It is relevant and within the scope of the journal.
However, the manuscript, in its present form, needs some further improvements. Once the adequate revisions to the following points are implemented, the paper may be acceptable for publication. There are some specific comments that might help the authors further enhance the manuscript's quality.
* Introduction:
It is generally well-written, with good structure and clarity, but there are a few areas where improvements can be made for better precision and readability.
* Line 31-33: “As is reported by Centre for Research on the Epidemiology of Disasters (CRED)”. The use of "is" here is unnecessary and makes the sentence sound awkward. Remove the word “is” from the sentence.
* Line 37: “Large floods often result from the amalgamation of floods from multiple sub-watersheds”. Please replace "amalgamation" with "merging" for more concise language.
* Line 64-65: “Copula function is widely applied in hydrological fields...”. "Copula function" should be plural here.
* Method:
* Many equations are presented in the paper, and most look OK. However, please check carefully whether all equations are necessary and whether the quantities involved are properly explained. Line 232-237: For equations (6), (7), and (8), the variables are not clearly explained. Please elaborate on the meaning of each variable, such as u_ph^1, u_ph^2, u_ph^3, u_ph^4 and u_pl^1, u_pl^2, u_pl^3, u_pl^4. This ensures the reader understands the variables used in the equations.
* Figure 2. The colors used to represent elements such as reservoirs and cross sections are a bit confusing. Use the same color for the same elements throughout the figure. For instance, if reservoirs are represented by a specific color, maintain that color consistently. Please revise it.
* Figure 4. How is it possible to distinguish between conditional simulation and unconditional simulation through Figure 4? There is no clear explanation or visual distinction between conditional and unconditional simulation in the figure. Please provide a clear legend or add explanations for (a) and (b) in Figure 4, highlighting the difference between conditional and unconditional simulations.
* Case study:
* Line 316-317: "This study focuses on four major sites within the Shifeng Creek catchment" The reason for selecting these sites is unclear. Clarify that these sites are chosen due to their strategic importance. Can you explain briefly why these particular sites were selected for the study?
* Line 344-346: The combinations of [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] mean what synchronization respectively? The abbreviations are used without prior explanation. Before utilizing the abbreviations please correspond the abbreviated letters to the original meaning. If not, it will be harder for readers to understand.
* Line 320-321: “To achieve this, daily runoff data of August, covering a span from 2000 to 2020, have been compiled.” This sentence would be more properly expressed in the past tense. Change "have been compiled" to "were compiled" for better tense consistency.
* Results

* This section is well written. The only area of weakness is the display of pictures. The font size of figure 10 is a little small. It could be considered to keep only a lower number of sub-figures and the rest could be placed in supplementary part of the article.
* Discussion & Conclusion
* Line 666-668: “Although the eight-dimensional vine copula model takes more variables into account, including both temporal and spatial correlation, the model is too complicated due to many variables, which makes the simulation less efficient on the contrary.” This sentence is too long and complex. Please simplify it.
* Minor comments
The grammar in the article is generally correct. However, there are some words that are not used appropriately. A few examples are given below:

* Line 254-255: “This strategy aims to distill essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.” Here, the meaning of the word “distill” is not suitable for this application. It would be more appropriate to replace it with “extract”.
* Line 271-273: “The core distinction between these two simulation methods hinges on whether certain data points are pre-determined at the time of simulation.” There is something wrong with the logic of this statement.
* Line 347: “The calculation equations can be referenced in Appendix B.” I think “provided” may be better than “referenced”.
* The supplemental content section is a bit too redundant. Think about keeping the important parts.

Citation: https://doi.org/10.5194/egusphere-2024-2266-RC2
- AC2: 'Reply on RC2', Xinting Yu, 20 Sep 2024
  
  * Introduction:
  It is generally well-written, with good structure and clarity, but there are a few areas where improvements can be made for better precision and readability.
  * Line 31-33: “As is reported by Centre for Research on the Epidemiology of Disasters (CRED)” The use of "is" here is unnecessary and makes the sentence sound awkward. Remove the word “is” from the sentence.
  Thanks for your suggestion. The word “is” has been removed, and the sentence in the manuscript has been revised to “As reported by Centre for Research on the Epidemiology of Disasters (CRED)”.
  * Line 37: “Large floods often result from the amalgamation of floods from multiple sub-watersheds” Please replace "amalgamation" with "merging" for more concise language.
  Thanks for your suggestion. The sentence in the manuscript has been revised to “Large floods often result from the merging of floods from multiple sub-watersheds.”
  * Line 64-65: “Copula function is widely applied in hydrological fields...” "Copula function" should be plural here.
  Thanks for your suggestion. This is revised in the manuscript. The sentence has been modified to “Copula functions are widely applied in hydrological fields, including joint frequency analysis (Liu et al., 2018; Zhang et al., 2021), water resources management (Gao et al., 2018; Nazeri Tahroudi et al., 2022), wetness-dryness encountering (Wang et al., 2022; Zhang et al., 2023), flood risk assessment (Li et al., 2022; Tosunoglu et al., 2020; Zhong et al., 2021), water quality analysis (Yu et al., 2020; Yu and Zhang, 2021), and precipitation modeling (Gao et al., 2020; Nazeri Tahroudi et al., 2023; Tahroudi et al., 2022).”
  
  * Method:
  * Many equations are presented in the paper, and most look OK. However, please check carefully whether all equations are necessary and whether the quantities involved are properly explained. Line 232-237: For equations (6), (7), and (8), the variables are not clearly explained. Please elaborate on the meaning of each variable, such as u_ph^1, u_ph^2, u_ph^3, u_ph^4 and u_pl^1, u_pl^2, u_pl^3, u_pl^4. This ensures the reader understands the variables used in the equations.
  Sorry for causing such confusion. The inflows of the different sites are represented by X^1, X^2, X^3, …, X^{N+M}. The variables X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M} represent the inflow amounts corresponding to the high-water state at these sites, while X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M} denote the inflow amounts corresponding to the low-water state at the respective sites. The marginal distribution functions for these inflows are represented as u^1, u^2, u^3, …, u^{N+M}. Specifically, u_ph^1, u_ph^2, u_ph^3, …, u_ph^{N+M} denote the marginal distribution functions corresponding to the high-water inflow amounts X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M}, capturing the probabilistic behavior of the inflows during high-water conditions at each site. Similarly, u_pl^1, u_pl^2, u_pl^3, …, u_pl^{N+M} represent the marginal distribution functions for the low-water inflow amounts X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M}, describing the inflow behavior during low-water conditions at these sites.
  The explanation of this part has been added to the manuscript as follows:
  “Let the inflows of the different sites be represented by X^1, X^2, X^3, …, X^{N+M}. X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M} represent the amounts of inflow corresponding to the high-water state of these sites, respectively. Meanwhile, X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M} represent the amounts of inflow corresponding to the low-water state of these sites, respectively. The marginal distribution functions are u^1, u^2, u^3, …, u^{N+M}, respectively. Specifically, u_ph^1, u_ph^2, u_ph^3, …, u_ph^{N+M} denote the marginal distribution functions corresponding to the high-water inflow amounts X_ph^1, X_ph^2, X_ph^3, …, X_ph^{N+M}, capturing the probabilistic behavior of the inflows during high-water conditions at each site. Similarly, u_pl^1, u_pl^2, u_pl^3, …, u_pl^{N+M} represent the marginal distribution functions for the low-water inflow amounts X_pl^1, X_pl^2, X_pl^3, …, X_pl^{N+M}, describing the inflow behavior during low-water conditions at these sites.”
  * Figure 2. The colors used to represent elements such as reservoirs and cross sections are a bit confusing. Use the same color for the same elements throughout the figure. For instance, if reservoirs are represented by a specific color, maintain that color consistently. Please revise it.
  The figure has been modified, as shown below. The colors of various elements in the figure are unified. The reservoirs are shown in green and the cross-sections are shown in blue.
  Figure 2. Schematic diagram of the generalized system in the catchment
  * Figure 4. How is it possible to distinguish between conditional simulation and unconditional simulation through Figure 4? There is no clear explanation or visual distinction between conditional and unconditional simulation in the figure. Please provide a clear legend or add explanations for (a) and (b) in Figure 4, highlighting the difference between conditional and unconditional simulations.
  Thank you for your suggestion. Figure 4 has been modified. The descriptions for both conditional and unconditional simulations have been further refined and supplemented to enhance clarity. This is revised in the manuscript.
  Unconditional simulation (Figure 4(a)): This approach generates random samples based solely on the marginal probability distribution, without incorporating any existing data constraints. The probability distribution is shown in the upper-left plot, and random samples are generated simultaneously, resulting in the scatter plot below. The generated samples, represented by blue points, illustrate the joint variability according to their predefined marginal distributions. Since no prior information is used, each data point is in an unknown state before the simulation.
  Conditional simulation (Figure 4(b)): In this scenario, the simulation takes into account pre-existing data conditions. The marginal probability distribution is displayed in the top-center plot, while the known conditional data is shown in the upper-right scatter plot (in pink). These known data points act as a constraint for generating new random samples. The resulting scatter plot below (blue and pink points) demonstrates how the conditional samples are influenced by both the marginal distribution and the specified conditions of the known data. This method allows for a tailored simulation that incorporates pre-existing data insights.
  Figure 4. Schematic diagram for generating random simulation samples (a) unconditional simulation (b) conditional simulation
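
  The difference between the two simulation modes described above can be sketched on the copula scale. The example below is only an illustration under stated assumptions: it uses a single bivariate Clayton copula with a hypothetical parameter `theta` as a stand-in for one pair-copula, not the vine structure fitted in the manuscript, and the function names are invented for this sketch. In the unconditional case both uniform variates are drawn at random; in the conditional case the first variate is fixed at known values and only the second is drawn.

```python
import numpy as np


def clayton_h_inverse(w, u, theta):
    """Inverse conditional CDF of a Clayton copula.

    Returns v such that P(V <= v | U = u) = w, for theta > 0.
    """
    return (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)


def simulate_unconditional(n, theta, rng):
    """Draw n pairs (u, v); nothing is known in advance."""
    u = 1.0 - rng.random(n)  # uniform on (0, 1], avoids u == 0
    w = 1.0 - rng.random(n)
    return u, clayton_h_inverse(w, u, theta)


def simulate_conditional(u_known, theta, rng):
    """Draw v for each known value of u (the 'pre-existing' data)."""
    u_known = np.asarray(u_known, dtype=float)
    w = 1.0 - rng.random(u_known.shape)
    return clayton_h_inverse(w, u_known, theta)


rng = np.random.default_rng(42)
theta = 2.0  # hypothetical dependence parameter

# Unconditional: both coordinates are random.
u_sim, v_sim = simulate_unconditional(1000, theta, rng)

# Conditional: the first site is known to be in a high state (u = 0.95).
v_given_high = simulate_conditional(np.full(1000, 0.95), theta, rng)

print("unconditional mean of v:", v_sim.mean())
print("mean of v given u = 0.95:", v_given_high.mean())
```

  To obtain flows rather than probabilities, each simulated value would be passed through the inverse of the fitted marginal distribution at the corresponding site.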
  
  * Case study:
  * Line 316-317: "This study focuses on four major sites within the Shifeng Creek catchment" The reason for selecting these sites is unclear. Clarify that these sites are chosen due to their strategic importance. Can you explain briefly why these particular sites were selected for the study?
  These four sites were selected due to their strategic importance within the Shifeng Creek catchment, representing key locations in the upper, middle, and lower reaches of the river system. The Lishimen Reservoir (LSM) and Longxi Reservoir (LX) sites are both situated in the upper reaches of the catchment and play a crucial role in flood control by regulating inflows and managing water levels. Understanding the flow patterns at these reservoir sites is essential for optimizing reservoir operations and mitigating downstream flood risks, especially during peak flood periods. The Qianshan (QS) cross-section, located in the middle reaches, and the Shaduan (SD) cross-section, positioned in the lower reaches, are critical control points for flood management. Analyzing flow processes at these sections allows for better coordination of reservoir operations and helps prevent the convergence of flood peaks, thus enhancing flood mitigation throughout the catchment.
  The reasons for choosing these four sites have been added to the manuscript as follows.
  “These four sites were selected for their strategic importance within the Shifeng Creek catchment, covering the upper, middle, and lower reaches. The Lishimen (LSM) and Longxi (LX) reservoirs, both in the upper reaches, are vital for flood control, regulating inflows to reduce downstream flood risks. The Qianshan (QS) cross-section, in the middle reaches, and the Shaduan (SD) cross-section, in the lower reaches, serve as key flood control points. Analyzing flows at these sites enables better coordination of reservoir operations and prevents flood peak convergence, enhancing overall flood management.”
  * Line 344-346: What type of synchronization does each of the combinations [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] represent? The abbreviations are used without prior explanation. Before using the abbreviations, please define what each letter stands for. Otherwise it will be harder for readers to understand.
  These three combinations represent different types of synchronization. Specifically, [X-H, Y-H, Z-H, W-H] indicates that all four sites (X, Y, Z, and W) are in a high-water state, [X-M, Y-M, Z-M, W-M] signifies that all sites are in a medium-water state, and [X-L, Y-L, Z-L, W-L] means that all sites are in a low-water state simultaneously. To improve clarity, the manuscript will be revised as follows.
  “Considering the three potential states (High/Medium/Low) at each site, a total of 81 possible inflow-state combinations are identified. For ease of presentation, H, M, and L are then used as abbreviations for High, Medium, and Low. Among the 81 combinations, the combinations [X-H, Y-H, Z-H, W-H], [X-M, Y-M, Z-M, W-M], and [X-L, Y-L, Z-L, W-L] are classified as synchronous high-water, synchronous medium-water, synchronous low-water, respectively, while the remainder are deemed asynchronous. The calculation equations can be referenced in Appendix B.”
  * Line 320-321: “To achieve this, daily runoff data of August, covering a span from 2000 to 2020, have been compiled.” This sentence would be more properly expressed in the past tense. Change "have been compiled" to "were compiled" for better tense consistency.
  This is modified in the manuscript.
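
  The count of 81 combinations in the reply on lines 344-346 follows from three states at each of four sites (3^4 = 81). The sketch below enumerates them and estimates empirical synchronous and asynchronous frequencies. It is an illustration only: the tercile thresholds, the function names, and the synthetic flow data are assumptions made for this example, and the manuscript computes these probabilities from the fitted copula (Appendix B) rather than by counting.

```python
from itertools import product

import numpy as np

SITES = ("X", "Y", "Z", "W")
STATES = ("H", "M", "L")


def classify(flows, low_q=1 / 3, high_q=2 / 3):
    """Label each flow H, M, or L using per-site empirical quantiles.

    The quantile levels are hypothetical thresholds for this sketch.
    """
    lo = np.quantile(flows, low_q, axis=0)
    hi = np.quantile(flows, high_q, axis=0)
    labels = np.full(flows.shape, "M", dtype="<U1")
    labels[flows > hi] = "H"
    labels[flows < lo] = "L"
    return labels


def combination_frequencies(labels):
    """Relative frequency of every state combination across the sites."""
    counts = {combo: 0 for combo in product(STATES, repeat=labels.shape[1])}
    for row in labels:
        counts[tuple(row.tolist())] += 1
    n = labels.shape[0]
    return {combo: c / n for combo, c in counts.items()}


def synchronous_probability(freqs, n_sites):
    """Sum of the all-H, all-M, and all-L combination frequencies."""
    return sum(freqs[(s,) * n_sites] for s in STATES)


# Synthetic, positively correlated flows at four sites (not real data).
rng = np.random.default_rng(1)
common = rng.gamma(2.0, 50.0, size=(500, 1))
flows = common + rng.gamma(2.0, 10.0, size=(500, len(SITES)))

freqs = combination_frequencies(classify(flows))
p_sync = synchronous_probability(freqs, len(SITES))
print("number of combinations:", len(freqs))
print("synchronous frequency:", p_sync)
print("asynchronous frequency:", 1.0 - p_sync)
```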
  
  * Results
  
  * This section is well written. The only area of weakness is the display of pictures. The font size of figure 10 is a little small. It could be considered to keep only a lower number of sub-figures and the rest could be placed in supplementary part of the article.
  The font size in the figure has been enlarged. To keep each sub-figure legible, only nine sub-figures are shown in the main text, and the rest have been moved to the supplemental content.
  Figure 10. Cumulative probability distribution of the preferred marginal distribution function for runoff on each day from 1st to 9th August
  
  * Discussion & Conclusion
  * Line 666-668: “Although the eight-dimensional vine copula model takes more variables into account, including both temporal and spatial correlation, the model is too complicated due to many variables, which makes the simulation less efficient on the contrary.” This sentence is too long and complex. Please simplify it.
  This sentence has been modified in the manuscript.
  “Although the eight-dimensional vine copula model considers both temporal and spatial correlations, its complexity reduces simulation efficiency due to the large number of variables.”
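
  The growth in complexity mentioned in this reply can be made concrete with a small calculation. Assuming a regular vine on d variables without truncation (an assumption of this sketch, as the reply does not state the structure), the decomposition has d − 1 trees and d(d − 1)/2 pair-copulas. The helper name below is hypothetical.

```python
def vine_pair_copula_count(d):
    """Number of pair-copulas in an untruncated regular vine on d variables."""
    if d < 2:
        raise ValueError("a vine copula needs at least two variables")
    return d * (d - 1) // 2


for d in (4, 8):
    print(d, "variables:", d - 1, "trees,", vine_pair_copula_count(d), "pair-copulas")
# 4 variables: 3 trees, 6 pair-copulas
# 8 variables: 7 trees, 28 pair-copulas
```

  Doubling the dimension from four to eight therefore more than quadruples the number of pair-copulas that must be selected and estimated.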
  
  * Minor comments
  The grammar in the article is generally correct. However, there are some words that are not used appropriately. A few examples are given below:
  
  * Line 254-255: “This strategy aims to distill essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.” Here, the meaning of the word “distill” is not suitable for this application. It would be more appropriate to replace it with “extract”.
  Thanks for your suggestion. The word “distill” has been replaced. And this sentence has been modified in the manuscript.
  “This strategy aims to extract essential spatial-temporal information, thereby reducing the vine copula function's dimensionality to simplify the model structure.”
  * Line 271-273: “The core distinction between these two simulation methods hinges on whether certain data points are pre-determined at the time of simulation.” There is something wrong with the logic of this statement.
  This sentence has been modified in the manuscript.
  “The key difference between these two simulation methods lies in whether specific data points are known in advance before generating the simulation.”
  * Line 347: “The calculation equations can be referenced in Appendix B.” I think “provided” may be better than “referenced”.
  Thanks for your suggestion. The word “referenced” has been replaced by “provided”. And this sentence has been modified as follows.
  “The calculation equations are provided in Appendix B.”
  * The supplemental content section is a bit too redundant. Think about keeping the important parts.
  Appendix A, B, and C are all essential. The table in Appendix D is removed and only the figures of the marginal distribution results are retained for presentation.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC2
RC3:
'Comment on egusphere-2024-2266', Anonymous Referee #3, 26 Sep 2024

The methodology and reasoning behind this study are sound, and the manuscript is well-written. The results and discussion sections offer valuable contributions to the scope of this journal. I have no further comments for this review.

Citation: https://doi.org/10.5194/egusphere-2024-2266-RC3
- AC3: 'Reply on RC3', Xinting Yu, 26 Sep 2024
  
  Thank you very much for your positive feedback and for recognizing the effort and contributions of my work. I truly appreciate your kind words regarding the methodology, reasoning, and the clarity of the manuscript. Your acknowledgment of the results and discussion sections as valuable contributions to the journal is incredibly encouraging. I am grateful for your time and thorough review, which have been instrumental in enhancing the quality of this paper. Thank you once again for your support.
  
  Citation: https://doi.org/10.5194/egusphere-2024-2266-AC3

Peer review completion

AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload

ED: Publish subject to minor revisions (further review by editor) (17 Oct 2024) by Nadia Ursino

AR by Xinting Yu on behalf of the Authors (25 Oct 2024) Author's response Author's tracked changes Manuscript

ED: Publish as is (28 Oct 2024) by Nadia Ursino

AR by Xinting Yu on behalf of the Authors (05 Nov 2024) Manuscript

Journal article(s) based on this preprint

15 Jan 2025

Synchronization frequency analysis and stochastic simulation of multi-site flood flows based on the complicated vine copula structure

Xinting Yu, Yue-Ping Xu, Yuxue Guo, Siwei Chen, and Haiting Gu

Hydrol. Earth Syst. Sci., 29, 179–214, https://doi.org/10.5194/hess-29-179-2025, 2025

The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Short summary

This study introduces RDV-Copula, a new method to simplify complex vine copula structures by reducing dimensionality while retaining essential data. Applied to Shifeng Creek in China, RDV-Copula captured critical spatial-temporal relationships, demonstrating high synchronization probabilities and significant flood risks. Notably, it was found that increasing structure complexity does not always improve accuracy. This method offers an efficient tool for analyzing and simulating multisite flows.


Total:	0
HTML:	0
PDF:	0
XML:	0