Characterizing Climate-Driven Shifts in Chilean Rainfall Regimes with a Hybrid Hidden Markov–Copula Framework
Abstract. Chile's hydroclimate exhibits pronounced meridional gradients and strong interannual variability, posing persistent challenges for regime-aware, probabilistic rainfall prediction. We introduce a hierarchical framework that explicitly separates large-scale regime dynamics from local spatial dependence. The approach integrates: (i) a covariate-driven non-homogeneous Hidden Markov Model (nHMM) to learn synoptic precipitation regimes and their transitions; (ii) Dynamic Time Warping (DTW) clustering to delineate precipitation-coherent climatic zones; and (iii) state-conditional Regular Vine copulas with Generalized Pareto (GPD) tails to model residual spatial dependence and extremes. The analysis employs the 0.05° daily CR2MET precipitation product over continental Chile (462 grid points, May–August 1980–2021) together with large-scale atmospheric covariates including the Southern Oscillation Index (SOI), the Oceanic Niño Index (ONI), and Global Mean Sea-Surface Temperature (GMSST).
Five physically consistent rainfall regimes emerge, spanning from an anticyclonic dry state to a cyclonic wet state, confirmed by composites of mean sea-level pressure, 850-hPa winds, and 500-hPa geopotential height. Mixed-effects inference on the transition matrix reveals a statistically significant decline in wet-state persistence of ~0.34 % yr-1 (≈14.5 % over 1980–2022), coincident with rising GMSST. Out-of-sample ensembles for 2022 (100 daily members conditioned on Viterbi states) are well calibrated: central 90 % prediction intervals achieve near-nominal coverage, low asymmetry, and widths increasing southward with climatological variance.
By disentangling regime timing and drivers from residual spatial co-variability and extremes, the proposed nHMM – DTW – vine – GPD framework yields meteorologically coherent states, spatially consistent probabilistic simulations, and quantitatively validated forecasts. The method is computationally tractable and transferable, offering a principled pathway for regime-conditioned, uncertainty-aware precipitation prediction to support hydroclimate risk management in Chile and other topographically complex regions.
General remarks
This is a puzzling manuscript. On the one hand, it delivers evidence that the chosen approach to studying the space-time structure of precipitation in a highly complex environment (here Chile, with its very long latitudinal extent and its complicated, structured orography) is scientifically fruitful. The choice of a hierarchical model is also well explained in the introduction: hidden Markov states to identify synoptic regimes, clustering to identify regionally homogeneous precipitation zones, and finally, within each zone, copulas for the bulk of the spatial variability and generalized Pareto distributions for the extremes beyond a fixed quantile. The presentation of the results, however, is the dark side of the medal:
(1) large parts of Sections 2 and 3 repeat each other,
(2) a very large number of actual results is shifted into the supplement texts S1–S10 (the supplement covers 20 pages in total!), with the additional caveat that S8 is not readable,
(3) a very large number of evaluation/test variables is introduced, often only by name, and at no stage is it mentioned how this multitude of tests is used for inference about the results or how these test variables influence the actual decision on the appropriate model,
(4) at no stage does the manuscript state the actual number of estimated parameters relative to the actual number of sampling points in space and time used for the inference/estimation,
(5) uncertainty measures such as confidence intervals, or p-values in the case of tests, are very often not given or even mentioned. One example is Table S8 in the supplement: it reports the estimated shape parameter of the Pareto fit at the medoid station for each synoptic regime (these estimates are never mentioned in the main text), with one value close to zero, two positive and two negative. This would indicate either clear differences in tail behavior between the regimes or that the underlying data set is too weak to estimate the notoriously difficult shape parameter. This is a clear case to be studied in detail, e.g. by bootstrapping across the stations around the medoid, and finally
(6) except for the introduction, the remaining manuscript and the supplement read like a split-up project report on the analysis of Chilean precipitation space-time variability, applicable to potential hydrological research. For example, physical consistency is mentioned, but the selection of atmospheric variables lacks any moisture-related variables such as specific humidity or vertically integrated moisture transport, which carry more information than the 850-hPa (u,v) data and which can easily be extracted from C3S / ERA5 data sets.
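To make the suggestion in point (5) concrete, the following is a minimal sketch of a station-level bootstrap for the GPD shape parameter. It is not the authors' procedure. It assumes that threshold excesses are already available for a handful of grid points around the medoid, and it uses synthetic excesses in place of the CR2MET data. The function name `bootstrap_shape_across_stations`, the number of stations, the sample sizes and the 90 % level are all illustrative assumptions.

```python
import numpy as np
from scipy.stats import genpareto


def bootstrap_shape_across_stations(excesses_by_station, n_boot=100,
                                    alpha=0.10, seed=0):
    """Percentile bootstrap interval for the GPD shape parameter.

    Stations (not individual days) are resampled with replacement, so the
    interval reflects station-to-station variability around the medoid.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    stations = [np.asarray(x, dtype=float) for x in excesses_by_station]

    # Point estimate from all pooled excesses; location fixed at 0 because
    # the inputs are excesses over the threshold.
    point = genpareto.fit(np.concatenate(stations), floc=0.0)[0]

    shapes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(stations), size=len(stations))
        sample = np.concatenate([stations[i] for i in idx])
        shapes[b] = genpareto.fit(sample, floc=0.0)[0]

    lower, upper = np.quantile(shapes, [alpha / 2.0, 1.0 - alpha / 2.0])
    return point, lower, upper


# Synthetic stand-in data (assumption): 8 stations, 150 excesses each,
# drawn from a GPD with shape 0.1 and scale 5 mm.
rng = np.random.default_rng(42)
synthetic = [genpareto.rvs(c=0.1, scale=5.0, size=150, random_state=rng)
             for _ in range(8)]

point, lower, upper = bootstrap_shape_across_stations(synthetic)
print(f"shape = {point:.3f}, 90% interval = [{lower:.3f}, {upper:.3f}]")
```

If the intervals obtained in this way for the five regimes overlap strongly or all include zero, the sign changes in Table S8 would more plausibly reflect estimation noise than real differences in tail behavior.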
Together with the extensive list of comments (my detailed remarks) in the attached annotated manuscript, I can suggest publication in HESS only after major (major) revisions of the current version of the manuscript.