the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Snowmelt Influence on Northern Hemisphere River Discharge – The Potential of Causal Inference for Assessing Long-Term Trends
Abstract. Snowmelt is a vital contributor to river discharge across the Northern Hemisphere, supplying freshwater to over 1.5 billion people and supporting key economic sectors such as agriculture and hydropower. However, climate change has led to a decline in snow water equivalent (SWE) on almost the entire Northern Hemisphere, reducing snow-based water availability. Despite the importance of snowmelt and the pressure imposed by climate change, no large-scale studies have examined the connection of snowmelt dynamics and river discharge beyond statistical measures, which fail to capture the complexity of hydrological regimes. To address this gap, we perform causal discovery using PCMCI, a method that adapts the PC proposed by Peter Spirtes and Clark Glymour to the time series setting, to obtain qualitative causal structure across 119 basins from 1980 to 2022 and then quantify using causal effect estimation with a 20-year moving window and a random forest estimator. Our results show that the role of snowmelt in streamflow generation is changing. In various basins where the method allows for trend detection, the ratio of the causal effect of snowmelt on river discharge to the mean of the river discharge is increasing despite declining SWE. This suggests that as precipitation patterns shift and intra-annual variability increases, snowmelt may become more important for streamflow generation in certain basins despite a generally declining SWE. While regional differences emerge, causal effects do not consistently correlate with geographical factors such as latitude or basin characteristics. Analyses in six basins that serve as illustrative examples, indicate that changes in seasonal hydrology, particularly the timing and distribution of precipitation, influence the relative role snowmelt plays for river discharge. These findings highlight the power of causal inference over conventional statistical measures in enhancing the analysis of large-scale snow hydrological regimes by adding depth to existing approaches.
Competing interests: At least one of the (co-)authors is a member of the editorial board of The Cryosphere.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.- Preprint
(13104 KB) - Metadata XML
- BibTeX
- EndNote
Status: open (until 08 Apr 2026)
- RC1: 'Comment on egusphere-2025-6280', Alexander Gottlieb, 24 Mar 2026 reply
-
RC2: 'Comment on egusphere-2025-6280', Haejo Kim & Samuel Tuttle (co-review team), 03 Apr 2026
reply
Summary
This manuscript investigates the relationship between snowmelt and river discharge across the Northern Hemisphere using a causal discovery algorithm. Specifically, the authors used the PC Momentary Conditional Independence (PCMCI) framework for causal discovery on 119 basins to discover causal structures between several hydrologic variables between 1980 and 2022. The authors then quantified the trends of causal effects using a 20-year moving window and a random forest estimator of snowfall/rainfall on basin discharge.
Understanding the relationship between the snowmelt and streamflow is an ongoing and complex research topic in snow hydrology, especially with rapidly warming winters due to anthropogenic climate change. As a result, this manuscript falls within the scope of this journal.
The main drawback to this manuscript is the lack of clarity regarding many crucial details to a scientific study such as data selection and methodology, which are highlighted as major comments below. Figure quality is good, however, in many of the figures the text size can be a little small and some are almost illegible. Additionally, the manuscript also contains many grammatical and punctuation errors (i.e., missing parenthesis around citations, lack of commas, etc.).
Major Comments (Clarity on data and methods):
- The authors list all their data sources starting in section 2.1. However, the authors do not address the different spatial resolutions of each dataset. The authors do partially address this matter in section 2.3.3 with the MERRA-2 data. Highlighting these differences might help make the point in Section 2.3 more clear why the proportional approach is clearly needed. Additionally, having a last access date will be helpful.
- Table 1: For Snow, it’s stated that SnowCCI v3.1 was used alongside ERA 5 Land. In the remarks column, it says ERA 5 was used to fill data gaps. Then on line 125 it says “ECWMF reanalysis land (ERA 5 Land/ERA 5). Are the authors using ERA5 and ERA 5 land interchangeably? Aren’t they two different products? Also, since it states “PMW and in-situ”, I’m guessing the SnowCCI dataset is a mixture of remotely sensed and in situ measurements of SWE. This is not clear in the manuscript. Also, how is the gap-filling conducted?
- Line 136: What is the justification to choose variables from ERA5 and ERA5 Land? Why didn’t the authors stick with just ERA5 or ERA5 land for each variable?
- Section 2.3.2, the authors state that snow is subtracted from the ERA5 total precipitation to generate a liquid precipitation data. However, it is unclear what is being subtracted. Is it snowfall or the SWE data?
- Section 2.4: The authors computed a range of statistical measures (mean, median, variance, and standard deviation) and applied a Theil-Sen estimator and Mann-Kendall test. However, it is unclear what they are taking a mean or median of. Is it a yearly average for each year in the 43 year period between 1980 and 2022? Do the authors then do the Theil-Sen and Mann-Kendall test? In addition, in Section 3.1 the authors state normalized Theil-Sen trends were computed using both the mean and the maximum absolute value, but do not state the latter in section 2.4.
- Line 189: The authors switch from using PCMCI to using PCMCI+ without giving a description as to what PCMCI+ is and how it is different from PCMCI. I’m guessing that PCMCI+ is a specific type of algorithm. Also, if the authors are using a software package to run their PCMCI+, it would be nice to know which package they used and see the software package cited in the manuscript.
- Section 2.5: On line 196-197, the authors mention the use of sanity checks through modifications. What are these modifications? If these modifications are highlighted in a different section, this should be clearly highlighted in the text here. Also, have the authors considered doing the analysis on surrogate data that randomizes the time series but retains seasonal trends?
- Section 2.5.1. In line 156, the authors claimed their approach to avoid seasonality and shifts of seasonal patterns were described in more detail. I am still somewhat confused as to how this was considered outside of the calendar week being an index variable. Did this actually remove the non-stationarity problem? Aren’t there statistical tests to prove a time series is stationary?
- Section 2.5.2: I am also confused at how the random forest estimator was applied. Figure 4 goes into slightly more detail saying the RF estimator is fit using input data using the optimal adjustment set identified from the causal graphs, but it is still unclear what is being predicted or estimated.
- Section 3: I wonder if section 3 can be improved by presenting the results in a slightly different order. Since this is a paper about causal statistics, I think presenting the causal results (Sect. 3.2) first would highlight its importance to the analysis. This can be followed by the results of Sect. 3.1 as explainers for the causal results.
- In addition, my understanding is that casual graphs are an important part of PCMCI, especially for possible feature selection. Why didn’t the authors show the casual graphs or include them in supplementary materials for discussion for some of the exemplary basins in section 3.4? I think showing these causal graphs and linking them back to the literature can help answer whether these causal graphs can produce reasonable results and help differentiate sections 3.4 and 4.1.
- The authors seem to have included maps with temporal trends of the snowmelt-to-discharge and rain-to-discharge causal effects, so we can see how they are changing over time, but they don't show maps with the actual magnitude of the causal effects (e.g., how does the strength of the causal effect in the Arctic compare to the western U.S., for instance)? They do this for select basins, normalized to river discharge (e.g. Figure 10), and present the results in tables in Appendix B, but it would be helpful to see it in a map.
- For Figure 10, couldn't the trend in river discharge result in increasing or decreasing trends, even if the strength of the causal effect remains constant over time? If so, then why normalize by discharge, which could make the plots harder to interpret? Or do the authors normalize by a single mean discharge value calculated from the entire times series?
- Figure A1: Why did they specify that evapotranspiration has no causal link with river discharge? It seems to me that an increase in ET could definitely decrease streamflow.
Comments
- Line 8 (abstract): The authors presented trends in causal effects, but did not present the magnitude of the causal effects (as they note in the discussion), so instead of "...and then quantify the causal effects..." this should be "...and then quantify the trends in causal effects..."
- Section 1: Parts of section 1 seem to be disconnected and can be edited to flow better.
- Line 26: The authors end the paragraph about how snowmelt generated runoff is used downstream, but the next paragraph is about the effects of snow-dominated rivers due to anthropogenic climate change and then talks about the downstream uses. This line can be deleted.
- Paragraph staring on Line 52 (“A brief overview of the role…”) and Figure 2 feels very out of place where it is placed. I don’t think it hurts the introduction if the paragraph and figure are removed from the introduction.
- Figure 2 caption: The authors state “snowmelt”, but line 53 states the ratio was calculated using SWE to total precipitation.
- Lines 124 to 131: These sentences can use refinement and reordering to make it more clear and justify why the time period was chosen and how the 199 basins were selected.
- Line 136: The “(e.g. evapotranspiration datasets)” does not clarify the authors previous sentence on line 135.
- Line 136: Manuscripts says, “For rain and temperature we use the ERA 5 and ERA 5 Land data respectively”. However, on Fig. 3, Temperature is listed as ERA5. Also, why didn’t the authors choose precipitation for ERA 5 Land?
- Line 223: The sentence “This was crucial, given the presence of nonlinear relationships in our system” feels oddly written. I wonder if this information can be included with the previous sentence? i.e., “we used the non-parametric Conditional Mutual Information as the conditional independence test due to the presence of nonlinear relationships within our system”
- Lines 252, 253, and 256: Please include Figure before 4B and 4C.
- Figure 10: What does the black dashed line represent in the graphs? I don’t think it was addressed what these plotted in the manuscript or the caption. Additionally, x-axis labelled have a typo: “Riverdischarge”.
- Line 466: The authors should be consistent with spelling of “snow pack”. The authors have used “snowpack” in their manuscript up to this point.
- Line 474: What is the author referring to with “(5)”?
- lines 544-546: This sentence is confusingly written (lots of commas) and has a typo.
- Line 569: This paragraph starts on line 558 and continues to line 580. I think the paragraph can be split here as the authors are discussing the use of another PCMCI algorithm called LPCMCI. Also defining that LPCMCI means might be helpful as well.
- Line 595: This sentence reads awkwardly and can use a rewrite.
- Line 598: missing parenthesis after “(see section 3.5”
- Figures B1 to B4: All figures state, “Not Trend Identified” Also, is the continent column necessary? The removal of this column can also potentially help improve the sizing of the figure, so the numbers and text are more legible. Also, having the separator between the non-causal and causal be white might help differentiate the two sides. It is hard to tell when the text is very small.
Citation: https://doi.org/10.5194/egusphere-2025-6280-RC2
Viewed
| HTML | XML | Total | BibTeX | EndNote | |
|---|---|---|---|---|---|
| 164 | 151 | 21 | 336 | 18 | 23 |
- HTML: 164
- PDF: 151
- XML: 21
- Total: 336
- BibTeX: 18
- EndNote: 23
Viewed (geographical distribution)
| Country | # | Views | % |
|---|
| Total: | 0 |
| HTML: | 0 |
| PDF: | 0 |
| XML: | 0 |
- 1
In this manuscript, the authors attempt to provide novel insights on the relationship between snowmelt and streamflow. They apply a causal inference algorithm, the Peter Clark Momentary Conditional Independence (PCMCI) framework coupled with a random forest-based method of causal effect estimation, to several reanalysis and remote sensing datasets of snowpack, precipitation, streamflow, and other surface variables aggregated to the major river basin scale to identify causal relationships among them. They apply this framework to rolling 20-year windows over a 43 year period to determine whether and how the nature of the causal relationships between variables changes over time. They illustrate the changing nature of these relationships with 6 case study basins from around the world to highlight differential responses.
While I found the study conceptually very interesting and well within the scientific scope of The Cryosphere, as the relationship between snowpack dynamics and streamflow in a nonstationary climate is an area of active research and debate, I have several concerns about the scientific quality of the manuscript and the degree to which it represents meaningful advancement of scientific understanding and/or methods. My primary concern is that the analytical framework used is quite opaque and not subject to any kind of formal validation, and it is unclear whether it is an appropriate tool for the analysis of the snowmelt-streamflow system or what advantages it offers over existing methods. Additionally, there are a few concerning statistical errors throughout the manuscript that could affect the results and their interpretation, and places where the methods and data choices are unclear or poorly justified.
Given the seriousness of these concerns, the rest of this Review focuses on the major areas that I feel need to be addressed before this manuscript could be considered for publication in The Cryosphere or another high-quality scientific journal.
Major Comment 1: Methodological Framework
As mentioned, my primary concern with this manuscript is that there is no validation or interrogation of the main method employed. As such, it is fundamentally unclear whether the inference framework used is even appropriate for analyzing the snowmelt-streamflow system, or what novel insights it positions vis-a-vis other methods.
1.1: Lack of validation of algorithm fitness for purpose
First, it is unclear why the authors chose the PCMCI algorithm in particular. As the authors discuss, there exist numerous techniques for analyzing causal relationships between X and Y in multivariate time-evolving contexts, but there is little justification for why this particular algorithm is best-positioned to address questions about the snowmelt-streamflow relationship vis-a-vis other simpler, more transparent methods. As part of algorithm selection, I would expect the testing of multiple methods and extensive validation that demonstrate their applicability to the system in question (see, for example, Docquier et al., 2024). The authors write:
“To confirm the basic functionality of the method, we carry out a set of sanity checks. We introduce modifications to selected variables (snow or rain) and examine whether the detected effects (on river discharge) changed in the expected qualitative manner. These checks serve as sanity tests rather than a sensitivity analysis.” (ll. 196-199)
I would argue that these are not simple sanity checks, but absolutely necessary validation exercises that should be formalized and a fundamental part of the manuscript. Not all methods are applicable to all systems, and when applying a method to a novel system, as is done here, one needs to rigorously demonstrate that it is appropriate and identify the contexts in which it does and does not perform well.
There are a few reason to believe that the chosen analytical framework may not, in fact, be particularly appropriate. First and foremost is the stationarity requirement for the PCMCI algorithm. As the authors note (ll. 150-155) and reference extensively from the literature—and indeed what motivates the study in large part—snowpack, precipitation, and streamflow all exhibit a high degree of nonstationarity in many regions of the world. Despite acknowledging this, there is no detrending of the variables prior to analysis—the rolling mean applied does not address the non-stationarity issue—which means that there is a high likelihood of identifying spurious causal relationships in strongly-trending variables. Additionally, the authors note, when discussing the identification of the effects of rain versus snowmelt, that the former is more identifiable because “...rain is nearly detectable all year round. In contrast, snowmelt signal for only occurs in spring and early summer, and is thus detectable in less data points” (ll. 594-595). If the goal of the study is to analyze the snowmelt-streamflow relationship, it seems troubling that the strong seasonality of snow accumulation and melt, which occurs during only a few months of the year, is a complicating factor for causal inference. One would desire a causal inference method that leverages that information, rather than being complicated by it.
1.2: Algorithm opaqueness
Even assuming that the PCMCI is a technically appropriate causal inference algorithm, the results it produces are quite opaque and difficult to interpret in terms of what relationships it is identifying and over what time-scales. For instance, all else equal, enhanced wintertime snowmelt will increase cold-season streamflow, but reduce warm-season streamflow since a smaller snowpack will persist into those months. As such, there is a short-run positive relationship and a longer-run negative relationship. It is not at all clear how these competing effects are combined into the single “causal effect” the authors report, nor why that is a particularly meaningful measure. Hydrologists and water resource managers typically care about streamflow volume and temporal distribution. An analysis that placed the results in those terms would be much more readily interpretable and useful.
Figure 10, for example, is quite inscrutable. How should we interpret the ratio? Based on my reading of the text and figure caption, it would seem that values >1 indicate that the causal effect or a perturbation in snowmelt/rain is greater than annual mean discharge? Given that each trigger only accounts for a subset of streamflow, this seems to defy reasonable expectation. It is once again also unclear how the changing nature of the relationships over the course of the seasonal cycle is collapsed into a single annual value. Additionally, throughout the analysis, it seems that the values presented correspond to calendar years, rather than the water years (October-September or less commonly September-August) snow hydrologists typically use, as they better correspond with the seasonal hydrologic cycle in the Northern Hemisphere mid- and high-latitudes.
1.3: Use of random forest for causal effect estimation
My last concern about the overarching methodological framework is about the use of the random forest model for “causal effect” estimation. Random forests are fundamentally predictive models that learn predictive associations between variables, not causal models. Perturbing a trigger variable X reveals how the model’s prediction of Y changes when X changes, not what happens to Y when X is intervened upon in the actual system. Particularly in the case of highly correlated and interdependent predictors, as in the system the authors are examining, perturbing X does not correspond to the relevant causal quantity E[Y|do(X)]. As a result, the perturbation experiment appears to quantify model sensitivity rather than a causal effect. The manuscript should either clarify the assumptions under which the causal interpretation is valid or revise the interpretation accordingly.
Additionally, in the perturbation experiments, the motivation behind the different intervention experiments is not apparent. Seasonal perturbations make sense as a means of teasing out the effects of variability and trends in the driving variables, but the “zero” interventions are both unphysical and likely well beyond the support of the data and as such, do not strike me as a meaningful sensitivity test. The authors also note that they intervene the calendar week index variable (ll. 244-245). This intervention makes little sense; it is a necessary statistical control to account for seasonality.
On the whole, the claim that the findings “highlight the power of causal inference over conventional statistical measures in enhancing the analysis of large-scale snow hydrological regimes by adding depth to existing approaches” (ll. 15-17) does not pass muster, as there is no comparison to conventional statistical measures and the added value of the casual inference technique chosen is not made readily apparent.
Major Comment 2: Statistical Errors
My second major concern is about some apparent statistical errors in the application of the rolling window and calculation of trends. The authors apply a 20-year rolling window—it is unclear whether it is a centered window, or left- or right-justified, but I am assuming centered—to 43 years of data, then average the effects by year to produce a 43-year time series. This means that the edge years are estimated with a much smaller sample size than the central years, which has the potential to distort the results. The common practice when applying moving averages is to only retain years fully covered by the window to avoid such inhomogeneities. Additionally, testing the sensitivity of results to windows of different widths, which will have different tradeoffs between signal identification and the length of time series able to be analyzed, would be a valuable robustness check.
Additionally, there appears to be a major statistical error involved in calculating trends in the causal effect. The rolling window method introduces very strong temporal autocorrelation, as adjacent years now share 19/20 of their data. This serves to dramatically reduce the variability of the time series, artificially inflating the Mann-Kendall test statistic and overstating statistical significance. All trends should be calculated on the unsmoothed data or the significance test appropriately adjusted for the much smaller number of effective observations that results when smoothing is applied.
Major Comment 3: Unacknowledged data uncertainties
The authors rely on a single observational or reanalysis dataset for each variable in their model. There are, however, large numbers of data products for each, and particularly for hydroclimatic variables such as SWE and rainfall, the observational uncertainty is quite large and different datasets may not agree with one another on magnitude, variability, or even the direction of long-term trends of the variable of interest (Gottlieb and Mankin, 2024; Mortimer et al., 2020; Mudryk et al., 2025). The authors claim in multiple places that no other independent validation data exist, but ensuring that the findings are consistent regardless of the particular data product chosen is itself an important validation. Additionally, there are multiple independent streamflow datasets used in large-sample hydrology that are based on actual streamflow measurements, rather than a reanalysis like GloFAS, e.g., the data compiled by Han et al., (2024) or the GRDC-Caravan dataset (Färber et al., 2025).
Relatedly, given that one of the key variables in the study is snowmelt, it is unclear from the Data and Methods sections what SWE data are actually used. Figure 3 suggests ERA5, Section 2.1 mentions SnowCCI and both ERA5 and ERA5-Land (which are very different data products when it comes to snowpack), and Table 1 mentions SnowCCI gap-filled with ERA5, and also ERA5-Land. Consistency and clarity here, including details on the gap-filling procedure are necessary.
Minor Comments (in order in which they appear in the manuscript)
l. 34-35: I do not believe either of these studies assesses changes in the interannual variability of SWE over time.
Figure 1 and the associated text are a nice overview, but not particularly germane to the study, which is focused more narrowly on streamflow.
l. 53: I am assuming this is referring to the ratio of total annual snowmelt to total precipitation. Throughout the manuscript, SWE and snowmelt are interchanged quite a bit. Most of the analysis, it seems, focuses on snowmelt (as measured by negative changes in SWE), so that term should be used to avoid confusion.
l. 64: I would argue Han et al. (2024) is a study that examines the relationship between snow and river discharge on a hemispheric scale.
ll. 260-261: The sentence “The normalized Theil-Sen trends of all variables in the analyzed basins were computed using the basin mean and the maximum absolute value for each variable” is quite confusing. How are both the mean and maximum used in normalization?
Figure 5: The time scale on which these trends are being calculated is unclear. Presumably annual values over the 1980-2022 period?
Section 3.4: Some of the results for each basin regarding the mechanisms driving different changes in the causal influences of snowmelt and rain on discharge feel overinterpreted and unsupported by quantitative results.
Figure 10: Significant at what confidence level? Also, see Major Comment 2 about data smoothing and significance testing.
Section 3.5: I am having a difficult time understanding what is being tested here and why, and how it represents a validation of the method.
Section 4.1: Much of the discussion around specific basins feels redundant with section 3.4 and could be tightened considerably.
Review References
Docquier, D., Di Capua, G., Donner, R. V., Pires, C. A. L., Simon, A., and Vannitsem, S.: A comparison of two causal methods in the context of climate analyses, Nonlinear Process. Geophys., 31, 115–136, https://doi.org/10.5194/npg-31-115-2024, 2024.
Färber, C., Plessow, H., Mischel, S. A., Kratzert, F., Addor, N., Shalev, G., and Looser, U.: GRDC-Caravan: extending Caravan with data from the Global Runoff Data Centre, Earth Syst. Sci. Data, 17, 4613–4625, https://doi.org/10.5194/essd-17-4613-2025, 2025.
Gottlieb, A. R. and Mankin, J. S.: Evidence of human influence on Northern Hemisphere snow loss, Nature, 625, 293–300, https://doi.org/10.1038/s41586-023-06794-y, 2024.
Han, J., Liu, Z., Woods, R., McVicar, T. R., Yang, D., Wang, T., Hou, Y., Guo, Y., Li, C., and Yang, Y.: Streamflow seasonality in a snow-dwindling world, Nature, 629, 1075–1081, https://doi.org/10.1038/s41586-024-07299-y, 2024.
Mortimer, C., Mudryk, L., Derksen, C., Luojus, K., Brown, R., Kelly, R., and Tedesco, M.: Evaluation of long-term Northern Hemisphere snow water equivalent products, The Cryosphere, 14, 1579–1594, https://doi.org/10.5194/tc-14-1579-2020, 2020.
Mudryk, L., Mortimer, C., Derksen, C., Elias Chereque, A., and Kushner, P.: Benchmarking of snow water equivalent (SWE) products based on outcomes of the SnowPEx+ Intercomparison Project, The Cryosphere, 19, 201–218, https://doi.org/10.5194/tc-19-201-2025, 2025.