This work is distributed under the Creative Commons Attribution 4.0 License.
Informativeness of teleconnections in local and regional frequency analysis of rainfall extremes
Abstract. We propose an effective and reproducible framework to assess the informative content of teleconnections for representing and modeling the frequency regime of rainfall extremes at the regional scale. Our dataset consists of 680 annual maximum series of rainfall depth, with durations of 1 and 24 hours, located in northern Italy. We compute at-site time series of L-moments over sliding time windows; we then discretize the study region into tiles, within which the L-moment time series are averaged. We observe that the Western Mediterranean Oscillation index (WeMOI) shows strong spatial correlation patterns with the gridded L-moments. Finally, in a preliminary application of climate-informed regional frequency analysis, the L-moments are modelled as functions of WeMOI. We observe high variability in the WeMOI-driven predictions of rainfall percentiles, and an increase in the overall goodness of fit of the regional model relative to the stationary framework. Overall, our research suggests promising pathways for climate-informed local and regional frequency analysis of rainfall extremes, and describes general methods that can be adapted to different environmental variables.
Status: open (until 12 Jan 2025)
CC1: 'Comment on egusphere-2024-3261', Francesco Serinaldi, 29 Oct 2024
I would like to share some thoughts about this paper with the Authors, hoping that these can contribute to the discussion.
A fundamental assumption common to almost all methods for regional frequency analysis is spatio-temporal independence. However, the proposed procedure seems to neglect it and, I think, introduces a spurious dependence as well. In fact, the sliding window procedure used to compute the sequences of L-moments acts as a moving average (… it is the same procedure used to compute e.g. drought indices such as SPI or SPEI). The resulting time series have a spurious autocorrelation with linear decay of 1/w per time lag. It is known that autocorrelation affects the estimation of cross-correlation (and vice versa): it can yield spurious cross-correlation and variance inflation. This means that the sequences of L-moment values computed over sliding (overlapping) time windows and the “rolling mean of the considered teleconnection” might show a spurious cross-correlation. Also note that WeMOI is a low-frequency climatic mode characterized by its own “natural” autocorrelation. Therefore, the autocorrelation of the rolling means of WeMOI is characterized by the superposition of two autocorrelation structures.
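To make the first point concrete, here is a minimal Python sketch (with synthetic, hypothetical data) showing that a w-year sliding-window mean applied to a purely independent series induces an autocorrelation decaying roughly as 1 - k/w at lag k:

```python
import numpy as np

rng = np.random.default_rng(42)
w, n = 30, 2000                                    # window length, series length
x = rng.normal(size=n)                             # iid series: no autocorrelation
xw = np.convolve(x, np.ones(w) / w, mode="valid")  # sliding-window (rolling) mean

def acf(y, k):
    """Sample autocorrelation of y at lag k."""
    y = y - y.mean()
    return np.dot(y[:-k], y[k:]) / np.dot(y, y)

for k in (1, 10, 20, 29):
    print(f"lag {k:2d}: sample ACF = {acf(xw, k):+.2f}, "
          f"theoretical 1 - k/w = {1 - k / w:+.2f}")
```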
In this context, any statistical test used to check the statistical significance of cross-correlation values should account for the variance inflation affecting the distribution of the test statistic (here, the Spearman correlation; see e.g. the works by Khaled H. Hamed in this respect). I may have missed something, but the text does not seem to specify whether the Authors accounted for these issues. If not, I think they should be considered, as they often completely change the conclusions of these types of analysis, revealing that dependence might be a much more general and simple way to model the observed behaviour (… and “entia non sunt multiplicanda praeter necessitatem”).
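For illustration only, a simplified sketch of one common correction (in the spirit of Pyper and Peterman, 1998; rank-based analogues for the Spearman correlation exist in Hamed's work): the variance of the sample cross-correlation between two autocorrelated series is inflated roughly by 1 + 2*sum_k rho_x(k)*rho_y(k), which amounts to replacing the nominal sample size n with an effective sample size:

```python
import numpy as np

def effective_sample_size(x, y, max_lag=None):
    """n_eff such that var(r_xy) ~ 1/n_eff instead of 1/n."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    if max_lag is None:
        max_lag = n // 5                     # common truncation choice
    def acf(z, k):
        z = z - z.mean()
        return np.dot(z[:-k], z[k:]) / np.dot(z, z)
    inflation = 1.0 + 2.0 * sum(acf(x, k) * acf(y, k)
                                for k in range(1, max_lag + 1))
    return n / max(inflation, 1.0)           # guard: never let n_eff exceed n
```

Significance of an observed correlation should then be judged against the distribution implied by n_eff rather than by n.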
The Authors denote the GEV models with parameters depending on WeMOI as nonstationary. Nowadays, the term “nonstationary” is used quite arbitrarily and loosely in almost every paper; however, a model is nonstationary if and only if its form (parameters) depends on a parametric support such as time or space. WeMOI is not a parametric support; it is a process with stochastic behaviour (and with periodic oscillations at 12 months and at about 20 and 50 years). Models with parameters depending on other “random” processes are not nonstationary but doubly stochastic, because the parameters are themselves randomly fluctuating.
Please note that this is not just a semantic issue. Doubly stochastic models can be stationary, meaning that we can apply all standard results of mathematical statistics. Conversely, nonstationary models might be problematic and lead to paradoxes and misleading conclusions, as they are not consistent with the ergodicity assumption, which is fundamental to establish a correspondence between sample properties and population properties, thus making inference technically possible.
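As a minimal illustration of the difference (with hypothetical parameter values), the following sketch simulates a GEV whose location parameter is driven by a stationary AR(1) “climate index”: the conditional distribution changes at every step, yet the resulting process is marginally stationary, so standard inference remains applicable:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
n, phi = 10_000, 0.9

z = np.empty(n)                      # stationary AR(1) "climate index", N(0,1) marginals
z[0] = rng.normal()
for t in range(1, n):
    z[t] = phi * z[t - 1] + np.sqrt(1.0 - phi**2) * rng.normal()

mu = 30.0 + 5.0 * z                  # conditional GEV location driven by the index
x = genextreme.rvs(c=-0.1, loc=mu, scale=10.0, random_state=rng)

# No trend: the two halves of the sample share the same marginal behaviour.
print(x[: n // 2].mean(), x[n // 2:].mean())
```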
An example of the consequences of neglecting the importance of underlying assumptions is the interpretation of the performance metrics used to compare stationary and nonstationary models. The RLM index in eq. 6 is related to the test statistic of the likelihood ratio test
Chi^2 = -2*ln(LH_0/LH_1) = -2*ln(LH_st/LH_nst) = 2*ln(LH_nst/LH_st) = 2*RLM,
which is asymptotically distributed as a Chi^2 variable with degrees of freedom equal to the difference between the numbers of parameters of the two models. Since the Authors use second-order polynomials for mu and/or CV, the degrees of freedom are 2 (= 5 - 3) for GEV_1/2 vs GEV_0 or 4 (= 7 - 3) for GEV_3 vs GEV_0. Therefore, the 95th quantiles of the two Chi^2 distributions are about 6 and 9.5, corresponding to critical levels of RLM of about 3 and 4.75. If we compare these values with those reported in Fig. 6a, we see that most of the RLM values are lower than those upper limits. Leaving aside the validity of the asymptotic Chi^2 for finite samples, the message is that relative measures/metrics should not be interpreted according to their absolute values but should be assessed accounting for their own sampling variability. Indeed, this interpretation reconciles the results of Fig. 6a with those in Fig. 6c. In this respect, please note that TN.SW is the only theoretically consistent goodness-of-fit metric used here, whereas AD is not.
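For the record, the critical levels quoted above follow directly from the Chi^2 quantiles (a minimal sketch):

```python
from scipy.stats import chi2

for df in (2, 4):                    # GEV_1/2 vs GEV_0, GEV_3 vs GEV_0
    crit = chi2.ppf(0.95, df)        # ~5.99 for df=2, ~9.49 for df=4
    print(f"df={df}: Chi^2_0.95 = {crit:.2f} -> critical RLM = {crit / 2:.2f}")
```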
Let me explain. When we deal with (supposedly) nonstationary models, we can only apply diagnostics to standardized versions of the data, conditional on the fitted parameters (see e.g. Coles, 2001: An Introduction to Statistical Modeling of Extreme Values). The expression of the AD statistic holds true if and only if the data are identically distributed (i.e. the model is stationary), because such a goodness-of-fit test is based on the distance between the parametric model and the empirical cumulative distribution function (ECDF). Now, the values of the ECDF, F_n(x), corresponding to each observation are representative of theoretical probabilities if and only if these observations come from the same distribution F(x), thus meaning that larger observations (order statistics) correspond to larger non-exceedance probabilities.
However, suppose that our model is nonstationary; for example, the GEV location parameter increases with time because the observations seem to assume greater values through the years. In this case, each observation comes from a different distribution (which is only valid at a specific time… or WeMOI value), and the largest values are likely associated with GEV distributions with large location parameters (as the time-varying GEV model attempts to follow the fluctuations of the observations). Therefore, given a sample of size n, the largest observation, for instance, has non-exceedance probability n/(n+1) according to the ECDF (… leaving aside plotting position issues), whereas it might have the same probability as a smaller observation under its own local GEV. In other words, the correspondence between empirical and theoretical probabilities has been lost. Therefore, Delta_AD in Fig. 6a is positive for two reasons:
- AD_nst is likely greater than AD_st because the ECDF appearing in the AD formula refers to the (overlooked) stationarity assumption.
- The AD statistic is a positive distance that indicates better performance when it is small (but not too small, as that would mean that the F(x_i) sample is too regular to be a random uniform sample); if Delta_AD = AD_nst - AD_st is positive, it means that AD_nst is larger than AD_st, and therefore GEV_0 is better. In fact, this is the rule used by Ashkar and Ba (2017): “The decision rule for the sample is to choose GP if a_GP - a_KAP < 0”.
That said, as for RLM, the actual question is whether values of Delta_AD equal to 0.004-0.006 are just within the expected fluctuations for nested models, once we account for factors like dependence and the variance inflation of the AD statistic related to the fact that the model parameters are estimated on the same data used to compute the AD statistic itself. Since estimated parameters make the AD statistic no longer distribution-free, comparing AD values of different models is also questionable (because identical AD values for two models can correspond to different probabilities in the sampling distributions of AD under these alternative models).
TN.SW values confirm the message of the other two metrics: the data are not enough to discriminate among GEV_0, GEV_1, etc. However, while RLM and AD and their Deltas are expected to be positive but possibly small, SW is centered around zero by construction (if a model is good enough). Therefore, Delta_TN.SW is expected to be centered around zero when discriminating between models is not possible. In this case, the comparison is technically sound because TN.SW provides the kind of conditional standardization mentioned above. Thus, the results in Fig. 6c are consistent with those in Figs. 6a and 6b, and the interpretation of Fig. 6 provided in section 5.2 should be revised accordingly, I think.
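For clarity, a minimal sketch of the conditional standardization mentioned above (following Coles, 2001; parameter values are hypothetical): each observation is transformed by its own fitted distribution, so that the probability integral transforms are iid uniform and their normal scores iid standard normal, making any stationary goodness-of-fit test applicable:

```python
import numpy as np
from scipy.stats import genextreme, norm, shapiro

rng = np.random.default_rng(1)
t = np.arange(100)
mu_t = 30.0 + 0.1 * t                                  # fitted time-varying location
x = genextreme.rvs(c=-0.1, loc=mu_t, scale=10.0, random_state=rng)

u = genextreme.cdf(x, c=-0.1, loc=mu_t, scale=10.0)    # PIT, conditional on the fit
z = norm.ppf(u)                                        # normal scores
print(shapiro(z))                                      # any stationary GOF now applies
```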
A note about the use of the return period: the return period is the expected value of return intervals, which implies integration over time (by definition). Under nonstationarity, the integral does not yield 1/(1-F(x)) because a unique F(x) does not exist! And replacing it with a set of T_i = 1/(1-F_i(x)) formulas makes no sense. Roughly speaking, under nonstationarity, integrating over time implies averaging over a set of F_i(x) distributions, and the resulting expectation is a formula reflecting (and a function of) all F_i's. While I understand the (fallacious but widespread) rationale of drawing a set of return level curves (such as those in Fig. 5c-e), the formula 1/(1-F_i(x)) does not correspond to any expected value (over time). Under nonstationarity, the return level curve is just as unique as in the stationary case, because the expected value of the (inter)arrival times of exceedances over a specified threshold is always a single value resulting from integration. What changes is only the expression linking T with the set of F_i's. Even though diagrams of T_i = 1/(1-F_i(x)) vs x are reported in almost every paper dealing with nonstationary distributions, this does not make them or the relationships T_i = 1/(1-F_i(x)) meaningful. So, please consider avoiding further spreading such a misconception.
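To be explicit, and still assuming independence between years, the expected waiting time until the first exceedance of a level z is a single number even when the distribution changes from year to year (in the spirit of, e.g., Salas and Obeysekera, 2014):

```latex
% Waiting time T (in years) until the first exceedance of level z,
% with year-specific non-exceedance probabilities F_j(z):
P(T > t) = \prod_{j=1}^{t} F_j(z),
\qquad
\mathrm{E}[T] = \sum_{t=0}^{\infty} P(T > t)
             = 1 + \sum_{t=1}^{\infty} \prod_{j=1}^{t} F_j(z).
% Only when F_j \equiv F does this collapse to
% \mathrm{E}[T] = \sum_{t \ge 0} F(z)^t = 1 / (1 - F(z)).
```

No individual 1/(1-F_i(z)) plays the role of this expectation.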
Finally, the Authors state that “the spatial aggregation into tiles allows to obtain more reliable values of the rainfall statistics”. This is strictly true only if the time series within a tile are independent; otherwise, spatial dependence implies information redundancy, meaning that the apparent smoothness comes with much larger uncertainty than one might think, and such averaged/aggregated statistics might not be so reliable. Please note that I do not refer to the spatial dependence of the AMAX values: these can look approximately uncorrelated (in space and time) even when the underlying processes are strongly dependent. In these cases, clustering in space and time might be an indicator of the underlying (concealed) spatio-temporal dependence.
These remarks apply to any type of analysis, including for instance record-breaking observations. In fact, “significant deviations in the number of record-breaking events in an observed time series relative to what is expected under the iid hypothesis indicate non-stationary time series” (Castellarin et al. 2024; https://doi.org/10.3390/atmos15070865)… or dependence, I would say! If “I.I.D.” still means independent and identically distributed, discrepancies can come from the lack of fulfilment of either of these two assumptions, or of both, and there is no reason to exclude the former. Based on my experience, people often tend to neglect dependence because adding a few covariates to GLM-like models with an arbitrary polynomial/spline link is a bit easier than deriving the theoretical relationships accounting for dependence.
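A minimal sketch of this last point (synthetic data, hypothetical parameters): the expected number of records in an iid series of length n is the harmonic number H_n regardless of the underlying distribution, but a stationary, hence identically distributed, yet autocorrelated series can depart from it, so a record-count anomaly alone does not tell dependence apart from non-identical distributions:

```python
import numpy as np

rng = np.random.default_rng(7)
n, phi, n_sim = 100, 0.9, 2000

# Stationary AR(1) series with N(0,1) marginals, simulated in bulk.
z = np.empty((n_sim, n))
z[:, 0] = rng.normal(size=n_sim)
eps = np.sqrt(1.0 - phi**2) * rng.normal(size=(n_sim, n))
for t in range(1, n):
    z[:, t] = phi * z[:, t - 1] + eps[:, t]

# A record occurs wherever the value equals the running maximum.
records = (z == np.maximum.accumulate(z, axis=1)).sum(axis=1)
h_n = np.sum(1.0 / np.arange(1, n + 1))       # iid expectation (harmonic number)
print(f"iid expectation H_n = {h_n:.2f}, AR(1) average = {records.mean():.2f}")
```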
Overall, in my opinion, any method that implies spatio-temporal aggregation, smoothing, or averaging of hydro-climatic data should carefully account for spatio-temporal dependence: this assumption keeps models simple and parsimonious, is generally sufficient to describe the behaviour of most observed processes, and often reveals that we are overconfident about the reliability of results and about the amount of information (effective sample size) actually available.
However, if we decide to use nonstationary models, we must bear in mind (i) what nonstationarity technically means and implies, (ii) that most of the methods available under stationarity are no longer valid, and (iii) that the inference procedure itself, along with the interpretation of results, is problematic because of the lack of ergodicity. The foregoing discussion just mentions some concepts that cannot be transferred when we move from stationarity to nonstationarity. Deeper discussion of these and other issues can easily be found in the literature… some of this literature (concerning the impact of spatio-temporal dependence on frequency analysis) is from one of the Authors.
Sincerely,
Francesco Serinaldi
Citation: https://doi.org/10.5194/egusphere-2024-3261-CC1
RC1: 'Comment on egusphere-2024-3261', Anonymous Referee #1, 08 Dec 2024
The article “Informativeness of teleconnections in local and regional frequency analysis of rainfall extremes” proposes a framework to assess the informative content of teleconnections for representing and modeling the frequency regime of rainfall extremes at a regional scale, using northern Italy as a case study.
The article deals with a topic that is still poorly investigated, i.e. the possible use of information related to teleconnections in regional frequency analyses. Overall, the article is well written, the methodology is clearly described, the parameter selection is properly documented, and the results are supported by evaluation metrics and a complete and detailed discussion.
Considering its novelty, I believe the article can be published after minor revision. It would be interesting to read what I suppose will be the next article, i.e. the application of the regional frequency analysis and a comparison with other methodologies.
A list of minor comments follows.
- In the description of Figure 2, I suggest replacing “In black: Italian administrative regions” with something similar to “Black lines: Italian administrative regions” to avoid confusion with the color scale of the elevation.
- Row 157: I suggest replacing “4000 m a.s.l.” with “about 4000 m a.s.l.”.
- Row 163: I suppose an “of” is missing after “time series”.
- Row 223: Please check the reference, I suppose it should be “Gabriele and Arnell (1991)”.
- In Figure 3, first column, some stations are colored in gray, while others follow a purple-green-yellow color scale. Could you please explain why? I suppose that only the 387 stations with a significant correlation with WeMOI are reported with this color scale, but maybe I am missing something. Does the same apply to the white tiles in columns 2-4 of Figure 3?
- In row 165, the authors state that the data cover the period 1928-2020, while in the conclusions (row 362) the period mentioned is 1921-2022. Please check and correct.
Citation: https://doi.org/10.5194/egusphere-2024-3261-RC1