SPASS – new gridded climatological snow datasets for Switzerland: Potential and limitations
Abstract. Gridded information on the past, present and future state of the surface snow cover is an indispensable climate service for any snow-dominated region like the Alps. Here, we present and evaluate the first long-term gridded datasets of modeled daily snow water equivalent and snow depth, which are available for the last 60+ years (since 1962) at 1 km spatial resolution over Switzerland. The comparison against a higher-quality but shorter model dataset shows, on the one hand, a good performance regarding bias and correlation and, on the other hand, acceptable absolute and relative errors except for ephemeral snow and for shorter time aggregations like weeks. The comparison against in-situ station data for yearly, monthly and weekly aggregated values at different elevation bands shows only slightly better performance scores for the higher-quality dataset, which demonstrates the good performance of the quantile-mapping method used to produce the long-term climatological dataset from the higher-quality dataset. A trend analysis of yearly mean snow depth from the gridded climatological data and from station-based data revealed very good agreement on direction and significance at all elevations. However, at the lowest elevations the strength of the decreasing trend in snow depth is clearly overestimated by the gridded datasets. Moreover, a comparison of the trends between individual stations and the corresponding grid points revealed a few cases of larger disagreements in direction and strength of the trend. All these results imply that the performance of the new snow datasets is generally encouraging but can vary at low elevations, at single grid points or for short time windows. Therefore, despite some limitations, the new gridded snow products show promise, as they provide high-quality and spatially high-resolution information on snow water equivalent and snow depth, which is of great value for typical climatological products like anomaly maps or elevation-dependent long-term trend analyses.
Status: open (until 01 Apr 2025)
RC1: 'Comment on egusphere-2025-413', Michael Matiu, 12 Mar 2025
Marty et al. present an evaluation of spatially gridded datasets of snow cover over Switzerland, with high spatial resolution (1 km) and long duration (60+ years). They evaluate different datasets, with and without assimilation of ground observations, using different metrics across elevation, and also compare long-term trends. The manuscript falls well within the scope of TC. It is well written and the results are discussed critically, for which I compliment the authors (especially Sec. 3.4 on limitations is great). However, a few issues remain unclear.
Major points:
- The novel contribution of this study with respect to previous studies is not completely clear – from the literature review in the intro it seems that a comparison of the “new” dataset has already been performed (Scherrer et al. 2024) and the dataset itself has been created and validated in Michel et al. 2024. Please highlight the differences between the past studies and the current one more clearly, as well as the novelty of this particular study.
- I assume one novelty is the elevational analysis and the analysis by temporal aggregation unit. While the first one is evident and well explained, the second seems very minor to me after reading the paper. First, all of the figures are in the supplement, and often the results are presented as similar across temporal units. Moreover, the motivation behind the different temporal units is not evident, nor why daily was not included as a temporal unit.
- On the other hand, another important factor is seasonality. Did you consider how error metrics vary across the snow season, e.g. whether they are constant or increase/decrease towards the end of the season?
- While the authors have a great choice of evaluation metrics, including MAAPE, which seems very interesting, there needs to be some consideration of whether the metrics (and the associated figures and statistics) refer to spatial, temporal, or their combined spatiotemporal variability. For example, L176-180 is unclear (and computational details such as how your array is structured are not really relevant). It would be great if you could identify what the metrics and variability refer to, i.e., where you average over space, time (years or other), or where you show variability across space or time or both. Also the order of calculation matters, i.e. whether you first compute the metrics and then average (e.g. over weeks, or over grid cells), or first average and then compute the metrics: less so for bias, but significantly for all other metrics (a toy illustration of this ordering issue follows these major points). The ordering of calculations is not completely clear from the manuscript.
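To make the ordering point concrete, here is a minimal sketch, assuming hypothetical arrays `obs` and `mod` of shape (n_days, n_cells) for one elevation band; this is not the manuscript's code, only an illustration of why computing a nonlinear score such as MAAPE before or after averaging gives different numbers.

```python
# A minimal sketch with toy data, not the authors' code.
import numpy as np

def maape(obs, mod, axis=None):
    """Mean arctangent absolute percentage error."""
    with np.errstate(divide="ignore", invalid="ignore"):
        ape = np.abs((obs - mod) / obs)
    return np.nanmean(np.arctan(ape), axis=axis)

rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 50.0, size=(365, 100))        # toy daily HS at 100 grid cells
mod = obs * rng.normal(1.0, 0.2, size=obs.shape)   # toy "model" with multiplicative noise

# (a) metric first, then average: temporal MAAPE per grid cell, averaged over cells
maape_then_average = maape(obs, mod, axis=0).mean()
# (b) average first, then metric: one MAAPE of the spatially averaged series
average_then_maape = maape(obs.mean(axis=1), mod.mean(axis=1))
print(maape_then_average, average_then_maape)      # the two values generally differ
```

Under these toy assumptions, (a) and (b) yield clearly different values, which is why the order of operations should be stated explicitly.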
Minor points:
- Abstract is sometimes confusing. It mentions two datasets, named old and new, which remains unclear. My suggestion is to try to make the abstract as self-contained as possible. Also, some numbers would make it less vague.
- L40 “many applications”, please provide some examples.
- Besides the general intro to snow, the introduction focuses exclusively on the history of gridded snow datasets in Switzerland. Since the topic of spatially gridded snow cover datasets is not trivial and can be tackled (in theory) in multiple ways (stations, remote sensing, modelling), maybe a broader introduction into gridded climate datasets and gridded snow in particular might be useful for readers.
- Introduction and Methods are somewhat mixed, since the used models/datasets are presented in the introduction, but then in the methods the models/datasets are not described further. I guess there are other studies presenting this in detail, but for completeness, I suggest including a brief summary of the key model characteristics and meteorological input for the different datasets.
- L114 some reference would be useful.
- L135 unclear if for the climatological analysis the reference period was 1991-2020 or 1999-2023.
- L151 so the relative trend is based on the Theil-Sen slope, but relative to what? Theil-Sen intercept, mean over the whole period, something else? (A toy sketch of one possible convention follows this comment list.)
- Sec 2.4 did you compare the difference in trends between CLQM and Comb?
- Related, has the meteo input (the temp and precip grids) been tested for homogeneity? Otherwise, I guess the snow trends could reflect input inhomogeneities as well… (ok this comes around L405...)
- Sec. 2.5 Since you use relative errors, I guess relative bias would also be interesting? While absolute bias increases with elevation, the relative one should decrease, no?
- L190 “because HS has been derived from SWE” but this is true for all elevations.
- L210 “boxplots consisting of the 25 yearly values” but there are more points than this in the boxplots?
- Fig 4: very unusual choice for the whiskers to go from the 5th to the 95th percentile. Why not the standard boxplot variant with 1.5*IQR from the box edges (up to the largest value, if within range)? Also because your choice highlights a lot of “outliers”, which are not really outliers but continuous variability, in my opinion. One could also do the other standard whiskers that go to min and max.
- Fig 4 and 5: for bias a line at y=0 would be useful. Or some light background grid lines in all panels.
- Fig 5: if the focus is on the comparison between CLQM and EKF, it would be useful to show the boxes side-by-side (e.g., with different fill or line colours) and not in separate panels. If the focus is on comparing by elevation, it’s fine like this.
- Fig. 6: a polynomial of first degree should be a straight line…
- Besides Fig 6, is it possible to produce the same figure as Fig 5 also for the non-assimilated stations, or put them side-by-side to compare the performance metrics between assimilated and non-assimilated stations?
- L278-284 this paragraph feels a bit off in the current section. Or what does it refer to? To all stations, assimilated and not? It also contains something on climatology and trends…
- Fig 7 could you please increase the resolution or use vector graphics? It’s not possible to zoom in easily.
- Fig 9, by chance, do you also have a relative trend figure of this?
- Similarly, I guess a relative trend map (Fig 10) could also be useful?
- L427 unclear, might be resolved with a more detailed method section description.
- L429 Why not use tmin and tmax instead of tmean? It’s also much more stable over time, considering the 60-year period.
- L461 please repeat the reasoning from Michel et al. 2024 here, shortly.
- Conclusion could be a bit more general. E.g., what are the implications of the results? Can the fairly simple degree-day model be trusted or is the station data maybe more accurate? What are use cases of such a dataset in climatology, hydrology, ...?
- Not necessarily for this study, but have you considered evaluating the grids with remote sensing? Like with optical-derived snow presence?
- PS: Thanks for the review invitation, I had SPASS reading the paper, or Spaß, as we spell it here :)
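As referenced in the comment on L151 above, here is a minimal sketch of one possible convention for a relative trend: the Theil-Sen slope expressed as a percentage of the mean over the full period. This uses toy data and an assumed reference; whether the manuscript uses this or another reference (e.g. the Theil-Sen intercept) is exactly what the comment asks.

```python
# A minimal sketch, not the manuscript's code: toy yearly mean snow depth series
# and a Theil-Sen trend expressed relative to the 1962-2023 mean.
import numpy as np
from scipy.stats import theilslopes

years = np.arange(1962, 2024)
rng = np.random.default_rng(1)
hs = 120.0 - 0.4 * (years - years[0]) + rng.normal(0.0, 10.0, years.size)  # toy series in cm

slope, intercept, lo, hi = theilslopes(hs, years)    # slope in cm per year
rel_trend = 10.0 * slope / hs.mean() * 100.0         # % of the period mean per decade
print(f"{slope:.2f} cm/yr, {rel_trend:.1f} % per decade relative to the period mean")
```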
Citation: https://doi.org/10.5194/egusphere-2025-413-RC1
RC2: 'Comment on egusphere-2025-413', Anonymous Referee #2, 17 Mar 2025
General summary
The manuscript presents gridded snow datasets for Switzerland. The gridded SWE datasets analysed are described in Michel et al. 2024, but in this study the authors apply a conversion from SWE to HS (SWE2HS) to obtain HS values, which are then evaluated instead of SWE. The authors perform comparisons with in-situ data to understand the biases in the gridded data with respect to elevation and various aggregation methods. A trend analysis is also presented.
The manuscript focuses on four versions of gridded snow datasets produced and used operationally by SLF’s snow hydrological service (OSHD): one without snow data assimilation spanning 1962-present and one with snow data assimilation spanning 1999-present (described in Mott et al. 2023); a derived dataset that applies a quantile mapping procedure (described in Michel et al. 2024) to extend the characteristics of the shorter time series with data assimilation across the full period (1962-present); and a combined dataset comprised of the original output with snow data assimilation for 1999-present and the quantile-mapped data for 1962-1998.
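For readers unfamiliar with the approach just summarised, the following is a generic empirical quantile-mapping sketch under stated assumptions (toy series, a single transfer function fitted on the overlap period). It is purely illustrative and not the actual procedure of Michel et al. 2024, which may operate per grid cell, per season or with other refinements.

```python
# A generic quantile-mapping sketch (illustrative only, not the authors' method).
import numpy as np

def fit_qm(source_overlap, target_overlap, n_q=99):
    """Empirical quantiles of both series over a common overlap period define the transfer function."""
    q = np.linspace(0.01, 0.99, n_q)
    return np.quantile(source_overlap, q), np.quantile(target_overlap, q)

def apply_qm(x, src_q, tgt_q):
    """Map values from the source distribution onto the target distribution (clamped at the ends)."""
    return np.interp(x, src_q, tgt_q)

rng = np.random.default_rng(0)
swe_noassim_full = rng.gamma(2.0, 40.0, size=62 * 365)    # toy 1962-2023 series without assimilation
swe_assim_overlap = rng.gamma(2.3, 45.0, size=25 * 365)   # toy 1999-2023 series with assimilation
overlap = swe_noassim_full[-25 * 365:]                    # part of the long series in the overlap period

src_q, tgt_q = fit_qm(overlap, swe_assim_overlap)
swe_qm_extended = apply_qm(swe_noassim_full, src_q, tgt_q)  # toy analogue of the QM-extended series
```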
Although the topic is consistent with the journal and the analysis seems reasonable, the article needs to be revised to clearly differentiate this manuscript from existing work. I recommend the authors restructure the introduction and methods and provide a clear storyline that is woven throughout the paper.
Major revisions
Novelty - I had to refer to Scherrer et al. 2024 and Michel et al. 2024 to deduce the knowledge gap this paper aims to fill and to understand how it differs from previous work. My impression from those works is that this paper fills the gap specified in Michel et al. 2024 pp.8970 “The snow climatology produced for Switzerland is used here as an example, and will be studied and validated in more detail in a future work.” The text should be revised to bring out this point. My reading is you are doing a more thorough evaluation of Michel et al. 2024 and you are also producing a snow depth dataset by applying SWE2HS and evaluating HS and not SWE.
I suggest a revised introduction where the authors succinctly present the companion works and clearly outline the gap that is being filled by this analysis. Focus your introduction on what you are doing and why. Many of the details of the models currently in the introduction could be moved to an expanded methods/data section which would also include additional details about the models (e.g. forcing data) not currently provided.
I’m not fully convinced by the conclusion to use the combined dataset for trends. The time series and trends for entire elevation bands for CLQM versus the combined dataset do not appear to be sufficiently different to support the use of the combined time series for trend analysis, which has a known inhomogeneity. Indeed, the EKF dataset seems to provide different information at a local scale (e.g. Fig. 7) and the comparison with in-situ data suggests it is of ‘higher quality’, but this does not necessarily make it better for trends.
There is insufficient information presented in the methods to support the interpretation of results, especially as it pertains to Section 3.4. See specific minor comments.
The text needs to be revised to improve the flow of logic and overall readability. Multiple concepts are often lumped together and not properly explained, or conclusions are drawn without corresponding support in the text or figures.
References to the literature should be improved and made consistent throughout.
Minor comments
L13: Do you mean climatology?
L47-50: Suggest referencing Figure 1 here.
L50-54: suggest revising sentence to make it more clear that the QM mapping method is presented in Michel et al. 2024. E.g., “This method, presented in Michel et al. (2024), allows...”
L77: Do you mean Section instead of chapter?
Section 2.1
- What is the temperature and precipitation forcing? Uncertainties related to the forcing data are discussed in later parts of the paper (Sect. 4.3 in particular) but you did not inform the reader of the source of the forcing data in the methods.
- The general framework is outlined in Figure 1 but this also needs to be detailed in text form. The schematic is not a sufficient replacement.
- Briefly outline the OSHD model, i.e. a temperature index model run at 1 km spatial resolution, plus any other relevant details. It is described in other papers so a full description shouldn’t be needed, but you still need to provide enough information for the reader to interpret the results and to aid them through the discussion.
L97: Do you mean uncertainty or error rather than bias? Also, please expand. Are there certain conditions under which you expect the conversion to be more or less accurate? Maybe reference Aschauer et al. (2023) here.
L94-95: In the introduction you present that Switzerland has a strong SWE network (L35-36) but then only evaluate HS. In the introduction you state that you evaluate HS because it is more accessible to the non-scientific public and is needed for ‘other applications’ (L65). Here (L94-95) you state that the comparison is limited to HS because you are using daily in situ data. The statements are not fully consistent. Both can be true, but the flow of logic needs to be clear.
L103-104: Are there any references to support the statement about in situ sites being in flat sheltered fields? Maybe Grünewald and Lehning 2015?
L103-104: Here you are combining two issues into one. The problem you are describing is twofold - one is point to grid representativity; the other is that the points are biased to a certain landcover type and condition (the flat field statement).
L106: Suggest simply ‘relative to station data are expected’ (remove the intentional part).
L114: Unclear if the data are quality controlled by the data provider or by you. If by you, please detail these ‘separate steps’. If by the data provider, provide a reference if possible and information on gap-filling as appropriate.
L115: Remove ‘Technically’. Start sentence with ‘Each station’
L115-117: Did you also look at the results if you used the intersecting grid cell and compare with the smallest elevation difference method?
L115-117: Again, you need to work on more clearly building your argument. Here you need to tell the reader why you compared the station with the eight nearest grid cells and took the one with the smallest elevation difference, i.e. grid-point mismatch and changes in HS with elevation (a toy sketch of this matching step is given after these minor comments).
L123: Can you provide a reference here, please?
L127-129: What do you mean by ‘hardly any’? It might be helpful to have a figure showing the hypsometry and number of stations in the various elevations bands. I found myself looking for such a figure many times while reading the paper.
L154: What do you mean by ‘quite well’. Please be more specific. There are a number of these vague statements throughout. A figure showing the number of sites by elevation band versus the land area elevation distribution would support this claim.
Figure 2: what about a hypsometry vs sites per elevation band or something as a sub-figure? This would support your 'quite-well' statement as well as the lines where you say there 'are hardly any' sites. Is there a way to show the sites that were not included?
L176: remove extra ‘.’ after ‘band’ and before ‘elevation’.
L188-190: Is this supported in the literature somewhere or is it conjecture? If SWE bias is close to 0 and HS bias not close to 0 then it is likely due to the conversion but this does not come across clearly in the text as written. Also, be careful to go one step at a time and clearly lay out each piece for the reader – i.e. why was this expected from the QM procedure (L187).
L190: revise to ‘and therefore has not been’
L194: ‘clear improvement in score performance when going from low to high elevations’ text is a bit strange in the context of the text that follows. Not sure the phrasing here captures what is presented next. Maybe in the subsequent text it would help if you compared low to high for the same temporal period(s) instead of mixing everything together.
L195: Does the lower MAAPE at annual and monthly aggregation at the 500 m elevation band (compared to weekly) have to do with the zeroes and similar issues to which the MAAPE method isn't well suited?
L200: Any suggestions as to why the performance increases with elevation?
L214-215: Even with the MAAPE metric, which is supposed to reduce the impact of small values, do you still expect relative errors to be larger for small values (i.e. low elevations) compared to larger values (i.e. higher elevations)?
L225: The same analysis as what? Same as Figure 4? Please specify.
L230: Do these poor scores correspond with certain locations or certain months or time periods or is it random?
L236-238: Also, is the fact that you need another model to go from SWE to HS not another confounding factor?
L264: ‘In a separate step’
L266: What do you mean by ‘very similar’?
L268-269: Is it completely and only due to the flat field observations or is this only one of multiple confounding factors? What about elevation differences at the point vs the grid cell, or are these fully mitigated by the 8-nearest grid cell approach?
Figure 6 – Do you mean second order polynomial? Did you consider making a similar plot but with MAAPE instead of bias?
L278: Do you mean all grid points with coincident in situ stations or all grid points in Switzerland? Unclear.
L279: ‘slightly higher’ instead of ‘slightly increasing’
L283: Is this analyzed in your paper or elsewhere? If elsewhere, please add appropriate reference.
L283-284: Lack of language precision: your comparison showed that HS is lower with the gridded data but did not demonstrate why it is lower.
L285: revise Section 3.3 subheading to ‘Evaluation of trends’
L289-290: Is this also in Michel et al. 2024 paper or is this a new finding? Not clear what is new to this paper and what is in a previous study.
L291: Recommend highlighting and reiterating the fact that it would only change or improve things from 1999 onward.
L292-293: ‘where the benefit of using OSHD-Comb can be nicely demonstrated’ again, you are jumping ahead here.
L293: How does an anomaly map (although interesting) support your discussion of trends? Also, the anomalies for this year look fairly different to me. There are some interesting differences that could be expanded upon.
L307: was this clearly demonstrated in your analysis? or was this in the other paper? Unclear. If in your analysis maybe add a sentence to summarize those results in the relevant results section.
L308: albeit only during the last 25 years
Figure 8 and associated text –
- By extending both back to 1960 you are muting the differences. There are 40 years where the time series are the same and only 25 years when they might differ. I know it’s only 25 years, which is considered short for climatology, but I’d like to also see the trends calculated over that period to be able to better compare the two models.
- Second, if the trends are similar then why concatenate two different time series which might introduce an artificial break in the time series? Why not simply recommend using CLQM for trend analysis?
L340: what do you mean by ‘it’s tempting to fill in missing snow information with that from a grid’? Did you fill in missing data with gridded data?
L358: Given that you identified sites with erroneous observations or inhomogeneities why not exclude these sites from the analysis entirely?
L360-361: This is the first mention of meteorological input data. You haven’t told the reader what the input forcing is, so they have no frame of reference for the sites listed. Need to add this type of information to the methods/data.
L383-384: also important for energy balance
L388: please reference the corresponding figure.
L392: perhaps ‘see’ rather than ‘detect’
L392-393: better agreement with station fluctuations compared to what?
L393: revise to ‘demonstrate a significant decreasing’
L395-398: Perhaps mention why the fact that it coincides with the snow line means it is expected to have the largest change in the number of snow days. You need to explicitly lay out these implied connections for the reader.
Section 3.4
- This section reads more like discussion than results. Consider renaming it.
- There are some interesting pieces in this section, but it could be improved by synthesizing and presenting the information in a more coherent manner. I suggest reorganizing the section to put all the arguments about the forcing data together and all the arguments about OSHD-Comb together. Also, moving some of the details about the forcing data and precipitation partitioning to the methods may help the reader anticipate and understand what is being presented. You can remind the reader by repeating key pieces, but this should not be the first place you mention, for example, where the precipitation data came from. Finally, the rationale for not analyzing daily values (L429-431) should be in the methods.
L403: Is it limited to inhomogeneities or are there also systematic biases in one or both forcings? Are they regional? by elevation?
L405-406: Here, are you talking about the impact of combining two datasets into 1 time series or are you talking about the forcing data? Unclear.
L425: Suggest ‘input data are a reason’ rather than input data are ‘the’ reason.
L432: Other reasons for what? Other reasons for not analyzing at the daily time scale? Please clarify.
L469-470: I don't feel that you presented enough to really support this claim/argument. I believe you have probably done the analysis, but how it is presented here and what you have chosen to present doesn't seem to support this piece sufficiently.
Citation: https://doi.org/10.5194/egusphere-2025-413-RC2
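As referenced in the comment on L115-117 above, the following is a minimal, hypothetical sketch of the kind of station-to-grid matching that comment asks about: among the grid cell containing the station and its eight neighbours, pick the cell whose elevation is closest to the station elevation. Array names and the toy DEM are assumptions for illustration only, not the authors' implementation.

```python
# A minimal sketch (assumed names, toy DEM), not the authors' implementation.
import numpy as np

def match_cell(dem, row, col, station_elev):
    """Among the cell (row, col) and its eight neighbours, return the indices of the
    cell whose DEM elevation is closest to the station elevation."""
    r0, r1 = max(row - 1, 0), min(row + 2, dem.shape[0])
    c0, c1 = max(col - 1, 0), min(col + 2, dem.shape[1])
    window = dem[r0:r1, c0:c1]
    dr, dc = np.unravel_index(np.argmin(np.abs(window - station_elev)), window.shape)
    return r0 + dr, c0 + dc

dem = np.random.default_rng(2).uniform(400.0, 3500.0, size=(10, 10))  # toy 1 km DEM (m a.s.l.)
print(match_cell(dem, row=4, col=4, station_elev=1500.0))             # -> (row, col) of best match
```

The alternative raised in that comment (simply using the intersecting grid cell) would correspond to returning (row, col) directly, which is the comparison the reviewer asks for.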