DAR-type model based on "long memory-threshold" structure: a competitor for daily streamflow prediction under changing environment

Wang, Huimin; Song, Songbai; Peng, Zhuoyue; Zhang, Gengxi

doi:10.5194/egusphere-2025-1305

Preprints

https://doi.org/10.5194/egusphere-2025-1305


28 May 2025


Abstract. The non-stationarity, non-linearity, and time-varying fluctuations of streamflow have increased with changes in the environment, challenging accurate streamflow prediction. Furthermore, overlooking long-term memory features can bias model parameter estimation and tests of time series properties. The classical linear Autoregressive-Generalized Autoregressive Conditional Heteroskedasticity (AR-GARCH) model has a narrow parameter range, and its moment conditions for parameter estimation are relatively strict, limiting its applicability and accuracy in modelling and predicting daily streamflow. Under the premise of long-term memory, a dual-threshold double autoregressive (DTDAR) model is proposed to capture the non-linear patterns in streamflow series. Using 15 hydrological stations in the Yellow River basin in China as an example, DAR-type models are compared with AR-GARCH-type models to assess their applicability and predictive ability. The results indicate that the DAR-type models have a stronger predictive ability for daily streamflow than the AR-GARCH-type models. The threshold models (DTDAR, TAR-GARCH) decompose the non-linear problem into several linear problems, improving on the prediction accuracy of the single linear structural models (DAR and FDAR, AR-GARCH and FAR-GARCH): the R² value improves by 29.15 % and 15.06 %, 25.53 % and 15.53 %, and the NSE value increases by 0.29 and 0.16, 0.24 and 0.15, respectively. Compared to the normal distribution, the Student's t distribution for residuals is a better choice for predicting daily streamflow time series in the study area. This study enriches the stochastic hydrological models and improves the accuracy of streamflow prediction.
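
For readers unfamiliar with the DAR family named in the abstract, the following is a minimal sketch of a first-order double autoregressive process. It assumes the standard DAR(1) form from the statistical literature, y_t = phi * y_{t-1} + eta_t * sqrt(omega + alpha * y_{t-1}^2), which this page itself does not state. The function name `simulate_dar1` and all parameter values are hypothetical, and the sketch covers neither the thresholds nor the fractional differencing of the proposed DTDAR/FDTDAR models.

```python
import numpy as np

def simulate_dar1(n, phi, omega, alpha, seed=0):
    """Simulate y_t = phi*y_{t-1} + eta_t*sqrt(omega + alpha*y_{t-1}**2).

    Assumed textbook DAR(1) form with standard normal innovations;
    not taken from the manuscript.
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(1, n):
        # The conditional scale depends on the previous observation itself,
        # not on a past innovation as in an ARCH/GARCH recursion.
        scale = np.sqrt(omega + alpha * y[t - 1] ** 2)
        y[t] = phi * y[t - 1] + eta[t] * scale
    return y

# Hypothetical parameter values, for illustration only.
series = simulate_dar1(n=500, phi=0.5, omega=1.0, alpha=0.3)
```

Setting `alpha=0` reduces the recursion to an ordinary AR(1) with constant innovation variance `omega`, which is a quick way to sanity-check the sketch.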

Received: 19 Mar 2025 – Discussion started: 28 May 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.


Status: final response (author comments only)

RC1:
'Comment on egusphere-2025-1305', Anonymous Referee #1, 09 Jun 2025
I have read the paper, “DAR-type model based on "long memory-threshold" structure: a competitor for daily streamflow prediction under changing environment”. Overall, the paper aims to develop and test a stochastic model for simulating daily streamflow, taking care of the nonlinearity, nonstationarity, and most importantly, the long-term memory of the streamflow. This is one of the few papers in the field of stochastic hydrology that has devoted greater attention to reproducing the long-term memory component of streamflow, which is really appreciated.
My major observation is that the paper is not sufficiently motivated, and the flow of the arguments in the paper is not smooth. For example, at many points in the paper an arbitrary number of statistical tests are performed without any prior reasoning. Section 2.3 does not clearly explain why the current modeling paradigm of nonlinear, non-stationary models fails to reproduce the long-term memory properties of the streamflow. Further, this section does not provide enough evidence to justify the FDTDAR model. Many figures in the paper are more suitable for the supplementary file than for the main manuscript.
The following comments need to be addressed to improve the structure of the paper and the overall motivation behind this work.
Major comment:
Line 120-125: How is the Hurst exponent estimated? Based on the information provided in Table 1, the time series may be too short to estimate a stable value of H. Additionally, no uncertainty measure is provided for the H estimates. This is a very serious concern. If H is not statistically significant, then the process is not a long-term persistence process. To confirm the existence of long-term persistence, the decay of the autocorrelation function must follow a power law, as long-term persistence is a scale-free property.
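
The concern above can be illustrated with a minimal sketch of a Hurst estimate accompanied by a significance check. It assumes the aggregated-variance estimator, which is one of several options and not necessarily the one used in the manuscript, and it uses a shuffle-based null distribution as a simple stand-in for a formal uncertainty measure. The names `hurst_aggvar` and `shuffle_null`, the block sizes, and the synthetic series are all hypothetical.

```python
import numpy as np

def hurst_aggvar(x, block_sizes):
    """Aggregated-variance estimate of H.

    The variance of block means scales like m**(2H - 2), so the slope of
    log-variance against log-block-size gives H = 1 + slope / 2.
    """
    x = np.asarray(x, dtype=float)
    log_m, log_v = [], []
    for m in block_sizes:
        k = len(x) // m
        if k < 2:
            continue  # too few blocks to compute a variance
        means = x[: k * m].reshape(k, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(means.var(ddof=1)))
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0

def shuffle_null(x, block_sizes, n_perm=200, seed=0):
    """H estimates for shuffled copies of x (shuffling destroys memory)."""
    rng = np.random.default_rng(seed)
    return np.array(
        [hurst_aggvar(rng.permutation(x), block_sizes) for _ in range(n_perm)]
    )

# Illustration on a synthetic, strongly persistent series (a random walk).
sizes = [4, 8, 16, 32, 64, 128]
walk = np.cumsum(np.random.default_rng(1).standard_normal(4096))
h_hat = hurst_aggvar(walk, sizes)
null = shuffle_null(walk, sizes, n_perm=100)
p_value = float(np.mean(null >= h_hat))  # share of shuffled H >= observed H
```

A small `p_value` indicates that the estimated H is unlikely under a no-memory null; the spread of `null` also shows how unstable the estimator is at the given record length.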

Section 2.3: This section, in the current form, is the most confusing part in section 2. It portrays different tests on streamflow time series, identifies non-stationarity, confirms the general properties of a white noise process, and examines the contribution of long-term memory to improve the simulation capability of daily streamflow. There are many tests performed here, without enough motivation. It is recommended to reconstruct this section. First, try to state what the overall aim or motivation is for the modeling exercise. Second, with the help of a flowchart, show what hypothesis needs to be tested before going to the modeling exercise. In order to do such hypothesis testing, state the relevant tests with appropriate references. Finally, crisply conclude what is learnt through this modeling exercise and state the next steps to achieve the objective of the paper.

Before going to section 4, it is recommended to give an illustration of the model selection, parameter estimation, and testing the residuals of the FDTDAR model for a simulation from a standard model. Give a flowchart for this entire model-building process and diagrams related to the key results.

Section 3.5: Why the FTAR-GARCH model? The previous sections were devoted to a finer understanding of the FDTDAR model. Suddenly, in this section, a new modeling framework is added without any prior motivation/reasons. Please clarify this point.

Other comments:
There are many figures and tables in the main manuscript that can be moved to the supplementary section, as they only support the model development process. For example, Tables 2 and 3 and Figures 2 and 4.

Section 2.3: Daily streamflow time series characteristics and their linkage relationships – the title is not conveying any specific element/property. What is meant by linkage relationships? How is this link estimated? What are the variables considered for the link?

Line 130-135: There is no motivation for doing all sorts of statistical tests. In the previous section, the discussion was focused on autocorrelation, but suddenly it shifted to nonstationarity without giving much motivation for why such an analysis is needed.

Line 120-125: How is the deseasonalization performed here? Please provide the mathematical details of the process.

Figure 2: The discharge is shown in m³/s with a different y-axis. Please show the plots in mm/day, so that the flow magnitudes and other patterns can be compared visually across all the catchments.

Line 96: The length of the basin is given as 5464 km; is this the length of the main channel/river? Please provide the catchment area.

Section 3.1: “mu_m and sigma_m are seasonal mean and variance” – I think this is not correct. Equation 1 states that m denotes the day of the year and n denotes the year. Therefore, the variable x_nm denotes the value of streamflow on the mth day of the nth year. So if the average is taken across all the years (as stated, n=1,2,..,N), mu_m should be the average annual streamflow, not the seasonal streamflow as currently stated. Please clarify this. The same applies to the variance, sigma_m.
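
The indexing discussed in this comment can be made concrete with a minimal sketch. It assumes the streamflow is arranged as an N-by-M array `x[n, m]` (year n, day of year m) and that deseasonalization standardizes each day of the year by its across-year mean and standard deviation. Whether the manuscript divides by the standard deviation or uses another scaling is not stated on this page, and the function name `deseasonalize` is hypothetical.

```python
import numpy as np

def deseasonalize(x):
    """Standardize x[n, m] by the day-of-year mean and standard deviation.

    mu[m] and sigma[m] are computed across years n = 1..N for each fixed
    day m, so each has one value per day of the year.
    """
    x = np.asarray(x, dtype=float)
    mu = x.mean(axis=0)            # average over years, per day of year
    sigma = x.std(axis=0, ddof=1)  # spread over years, per day of year
    z = (x - mu) / sigma
    return z, mu, sigma

# Synthetic example: 30 years of 365 daily values with an annual cycle.
rng = np.random.default_rng(0)
cycle = 10.0 + 5.0 * np.sin(2.0 * np.pi * np.arange(365) / 365.0)
flows = cycle + rng.standard_normal((30, 365))
z, mu, sigma = deseasonalize(flows)
```

In this arrangement `mu` has one entry per day of the year, which is the quantity whose naming the comment asks the authors to clarify.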

Section 2.3: Strong motivation for why the DAR type model is needed in streamflow simulation can be discussed here with some numerical cases.

Section 4.1 is not there.
Citation: https://doi.org/10.5194/egusphere-2025-1305-RC1
- AC1: 'Reply on RC1', Gengxi Zhang, 15 Jul 2025
  
  Dear reviewer,
  We express our great appreciation for your constructive comments on improving the manuscript. We have fully addressed all of the comments in the revised manuscript. Please see the response details in the attached file.
  Sincerely, Gengxi Zhang
  
  Citation: https://doi.org/10.5194/egusphere-2025-1305-AC1
RC2:
'Comment on egusphere-2025-1305', Anonymous Referee #2, 26 Jun 2025

Summary:
This manuscript proposes an innovative modelling approach - the dual-threshold double autoregressive (DTDAR) model - designed to improve the prediction of daily streamflow under non-stationarity, long memory, and non-linearity in the field of hydrology. By integrating fractional differencing with a threshold-based structure in both the first- and second-order moments, the authors develop a long memory-threshold framework (FDTDAR) that is shown to outperform conventional models (AR-GARCH, TAR-GARCH) at multiple stations across the Yellow River Basin.
General remarks:
1. The manuscript is well structured, comprehensive in analysis, and the methodology is methodically laid out. The proposed approach makes a meaningful contribution to the field of stochastic hydrological modelling and represents a promising alternative to existing linear and GARCH-based models. However, several aspects require clarification or revision before the manuscript can be recommended for publication.
2. While the DTDAR/FDTDAR model is a novel approach, it is structurally complex, with numerous parameters and thresholds. The paper would benefit from a clearer explanation of how parameter identifiability, estimation convergence, and computational burden are handled. Practitioners will benefit from a discussion on model tractability and software implementation.
3. The manuscript focuses primarily on comparisons with AR-GARCH and TAR-GARCH models. While these are relevant, the absence of modern nonlinear or machine learning models (e.g., LSTM, hybrid deep learning models) in the comparison set limits the extent to which the results can be considered broadly applicable. Even if not implemented, a discussion acknowledging this limitation and the rationale for focusing on DAR-type models would be appropriate.
4. While the use of average interval width (AIW) and containing ratio (CR) is appropriate for assessing the prediction uncertainty, the manuscript lacks detail on how the prediction intervals were constructed. Clarification on whether these are based on analytical variance, bootstrapping, or Monte Carlo simulations is necessary.
5. The analysis clearly shows that the Student’s t-distribution improves predictive performance over the Gaussian assumption. The authors are encouraged to provide more discussion on how degrees of freedom were selected, and whether any skewed or generalized t-distributions were considered or could be more appropriate for heavy-tailed hydrological data.
Apart from these general comments, the authors should also take into consideration a few minor points.
Minor remarks:
1. Terms such as FDTDAR-n and FDTDAR-t should be introduced earlier and used consistently.
2. Several grammatical and syntactic issues are present throughout the manuscript. A round of professional language editing is recommended.
3. Recent advances in time series forecasting using deep learning could be briefly referenced to contextualize the DTDAR approach.
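
The interval metrics named in general remark 4 can be sketched as follows, assuming the usual definitions: AIW is the mean width of the prediction interval and CR is the fraction of observations falling inside it. The Monte Carlo construction shown is only one of the three options the referee lists, and it is not known from this page which one the authors used. The function names `interval_metrics` and `monte_carlo_interval` are hypothetical.

```python
import numpy as np

def interval_metrics(obs, lower, upper):
    """Return (AIW, CR) for observations and interval bounds of equal length."""
    obs, lower, upper = (np.asarray(a, dtype=float) for a in (obs, lower, upper))
    aiw = float(np.mean(upper - lower))                      # average interval width
    cr = float(np.mean((obs >= lower) & (obs <= upper)))     # containing ratio
    return aiw, cr

def monte_carlo_interval(paths, level=0.95):
    """Pointwise interval from simulated paths of shape (n_sim, horizon)."""
    tail = (1.0 - level) / 2.0
    lower = np.quantile(paths, tail, axis=0)
    upper = np.quantile(paths, 1.0 - tail, axis=0)
    return lower, upper

# Tiny usage example with made-up numbers.
obs = [1.0, 2.0, 3.0, 4.0]
lower = [0.0, 0.0, 4.0, 3.0]
upper = [2.0, 3.0, 6.0, 5.0]
aiw, cr = interval_metrics(obs, lower, upper)
```

A narrower AIW at the same CR indicates sharper intervals, so the two metrics are best read together rather than separately.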

Citation: https://doi.org/10.5194/egusphere-2025-1305-RC2
- AC2: 'Reply on RC2', Gengxi Zhang, 16 Jul 2025
  
  Dear reviewer,
  We express our great appreciation for your constructive comments on improving the manuscript. We have fully addressed all of the comments in the revised manuscript. Please see the response details in the attached file.
  Sincerely, Gengxi Zhang
  
  Citation: https://doi.org/10.5194/egusphere-2025-1305-AC2


Viewed

Total article views: 886 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
766	94	26	886	11	24


Views and downloads (calculated since 28 May 2025)

Month	HTML	PDF	XML	Total
May 2025	67	8	2	77
Jun 2025	100	22	9	131
Jul 2025	75	17	5	97
Aug 2025	107	13	1	121
Sep 2025	372	11	1	384
Oct 2025	36	7	6	49
Nov 2025	9	16	2	27


Viewed (geographical distribution)

Total article views: 905 (including HTML, PDF, and XML) Thereof 905 with geography defined and 0 with unknown origin.


Latest update: 22 Nov 2025

Short summary

This study introduces a novel dual-threshold double autoregressive (DTDAR) model for daily streamflow prediction. The DTDAR model outperforms other commonly used models, especially when using a Student's t distribution for residuals, showing improved accuracy in capturing non-linearity and long-term memory in streamflow data.
