the Creative Commons Attribution 4.0 License.
Clustering simulated snow profiles to form avalanche forecast regions
Abstract. This study presents a statistical clustering method that allows avalanche forecasters to explore patterns in simulated snow profiles. The method uses fuzzy analysis clustering to group small regions into larger forecast regions by considering snow profile characteristics, spatial arrangements, and temporal trends. We developed the method, tuned parameters, and present clustering results using operational snowpack model data and human hazard assessments from the Columbia Mountains of western Canada during the 2022–23 winter season. The clustering results from simulated snow profiles closely matched actual forecast regions, effectively partitioning areas based on major patterns in avalanche hazard, such as varying danger ratings or avalanche problem types. By leveraging the uncertain predictions of fuzzy analysis clustering, this method can provide avalanche forecasters with a straightforward approach to interpreting complex snowpack model output and identifying regions of uncertainty. We provide practical and technical considerations to help integrate these methods into operational forecasting practices.
Status: open (until 02 Sep 2024)
CC1: 'Comment on egusphere-2024-1609', Frank Techel, 17 Jul 2024
reply
Dear Simon, Florian, and Pascal
I read with interest your preprint on "Clustering simulated snow profiles to form avalanche forecast regions".
I have three questions:
- Section 2.1: What is the size of the Columbia Mountains study area?
- L58, L352-356: Did I understand correctly that within this (rather large) study area only 168 grid points are used (operationally?) to simulate the snowpack? Or were these just the points used for the analysis? This seems like a rather drastic reduction compared to other studies you recently submitted, such as Herla et al. (2024) ( https://egusphere.copernicus.org/preprints/2024/egusphere-2024-871/egusphere-2024-871.pdf )
- And potentially relevant when interpreting the findings (as shown in Figure 10): Did forecasters have access to clustered snow-cover simulations during the investigated season? If they did, how were these used to cluster subregions into regions?
Thank you for clarifying these points.
Kind regards,
Frank Techel
Citation: https://doi.org/10.5194/egusphere-2024-1609-CC1
AC1: 'Reply on CC1', Simon Horton, 26 Jul 2024
reply
Thank you for your interest and comments.
We can add the size of our study area (111,801 km²). You are correct that the operational model used only 168 grid points for this area, a significant reduction compared to Herla et al. (2024), which used all grid points within the treeline elevation range in Glacier National Park (1348 km²). In contrast, the operational model splits each forecast polygon into smaller "microregion" polygons. Depending on its size, each forecast polygon was divided into 1 to 8 microregions, each typically covering 300 to 600 km². The model then samples one representative grid point from each NWP model within each microregion. This sparse spatial sampling has been used in the operational model for several years to balance spatial resolution and computational cost, allowing the model to run quickly each morning (the operational domain covers 745,829 km², over 500 times larger than Glacier). Tests have found this sampling density captures most regional-scale patterns resolved by the NWP models (which typically have an effective resolution 5 to 7 times larger than their grid spacing). However, we are implementing finer resolution this year thanks to improved computational efficiencies.
Yes, forecasters had access to a dashboard with a prototype product that presented snowpack model clusters. While it's difficult to quantify its impact on human assessments, we should discuss this in our paper as there likely was some influence. Forecaster usage of the prototype varied between individuals and hazard situations, likely having more impact in data-sparse areas compared to data-rich areas like the Columbia Mountains. As forecasters increasingly use model-driven decision aids, it will become more challenging to use their assessments to validate models.
The prototype product analyzed the same simulated profiles, but used a different clustering method. The method calculated snow profile distances from basic summary statistics (e.g., HS, HN, presence of a weak layer, percentage of wet grains) instead of dynamic time warping. A hierarchical clustering algorithm was used, and the number of clusters selected using the within-between ratio. The domain was larger, causing subregions in the Columbia Mountains to be grouped with those in the neighboring Rocky Mountains. This means the clusters viewed by forecasters likely differed from those in this study.
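The prototype pipeline described above (pairwise distances from basic summary statistics, then hierarchical clustering) can be sketched in a few lines. This is an illustrative reconstruction, not the operational code: the feature vectors and values are made up, the agglomeration here uses simple average linkage, and in practice the features would be standardized before computing distances.

```python
from math import sqrt

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, points):
    """Mean pairwise distance between two clusters of point indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(euclidean(points[i], points[j]) for i, j in pairs) / len(pairs)

def agglomerate(points, k):
    """Merge the closest pair of clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: average_linkage(clusters[ij[0]], clusters[ij[1]], points),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical summary-statistic vectors per subregion:
# (HS in cm, HN24 in cm, weak layer present 0/1, % wet grains)
profiles = [
    (180, 20, 1, 0),   # subregion A
    (175, 18, 1, 0),   # subregion B, similar to A
    (90, 2, 0, 40),    # subregion C
    (95, 0, 0, 45),    # subregion D, similar to C
]
print(agglomerate(profiles, 2))  # → [[0, 1], [2, 3]]
```

In the operational prototype the number of clusters was not fixed but chosen with the within-between ratio; here k is supplied directly to keep the sketch short.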
Our first clustering prototype in 2020-21 used only the temporal sequence of HN24 to determine distances between subregions, which worked surprisingly well. We then used the summary statistics method described above for the 2021-22 and 2022-23 seasons. Forecasters expressed interest in these early prototypes, motivating us to refine the methods and conduct a more rigorous study. In 2023-24, we implemented a method similar to the one described in this study, using dynamic time warping distances and fuzzy clustering.
Citation: https://doi.org/10.5194/egusphere-2024-1609-AC1
-
RC1: 'Comment on egusphere-2024-1609', Bert Kruyt, 26 Jul 2024
reply
Very relevant work that aims to formalize processes that have thus far relied on inherently non-transparent expert assessment, and as such forms a valuable complement to, and improvement of, said assessment. We are living in an interesting time in which avalanche forecasting is developing rapidly through the application of modeling techniques long shunned by (skeptical) practitioners. These authors have played a crucial role in removing some of that skepticism with previous work, and this current work is no exception. Generally, this paper is well written, clear, and presents novel methods.
One generic reservation I have with models like these, which rely on tuning parameters to match a regional dataset, is their applicability to other regions. This becomes even more of a concern with model chains, where several of these 'tuned' models are strung together. In this case, Mayer's model for dry slab stability (Pmax) is tuned (or trained, as it is a ML model) on snow profiles around Davos, CH. That model is then used in the authors' clustering model in Canada.
One can question whether the parameter choice would have been different if the Random Forest model of Mayer (2022) had been trained on Canadian profiles (with more new snow and less wind?). Or if someone takes this clustering model and applies it to a region with a very different snow climate (e.g., Norway), would the parameter choice still be optimal?
This is not meant as a critique of the paper, but more as a general concern for the community. Answering these questions implies a lot of work of a type that is not valued as much in the scientific community as doing 'new' things. However, for the rigor of avalanche forecasts, it is just as important. Perhaps some discussion of the applicability of this method to other areas, and its dependency on training/tuning data, would improve the paper?
Specific comments:
- Figure 4 is not clear in black and white. While readability in black and white may be too much to ask for Figures 7-10, for this figure a simple linestyle change (dotted/dashed or markers) would make it readable in black and white (for example, on an eReader).
- Sect 4.4.
- Is there a reason why Beta values > 0.1 were not investigated? If so please mention/explain.
- line 210: "The number of human-assessed forecast regions changed 12 times over 107 days, with region arrangements changing on 34 days."
The intro mentions "115 days when both model and human data were available for analysis." What explains this discrepancy (107 vs 115)?
- Fig 6: It is not clear to me how the ARI supports the choice of Beta=0.02. I see how 6a supports that, if the goal is to mimic the human assessment of the number of regions. But 6b is unclear (to me). Intuitively I would say the ARI should be high on days where nothing changes, but you want the clusters to change only when the snowpack changes. How does the ARI reflect that?
- Sect 5.3: "default fuzziness parameter r=2": previously (Sect 4.2) r was determined to be optimal at 1.25; why is it 2 here?
- Out of curiosity: did you see the clustering change as solar irradiation (and thus the difference between N and S) became larger, i.e. throughout the season?
- In general, how do you make representative profiles for a region when solar irradiation leads to big differences between aspects? Or is the clustering only done for dry (ENW) avalanche profiles?
Citation: https://doi.org/10.5194/egusphere-2024-1609-RC1
AC2: 'Reply on RC1', Simon Horton, 26 Jul 2024
reply
We thank the reviewer for their thoughtful comments.
We appreciate and share the concern about the limitations of developing models tuned to specific datasets and are happy to discuss this further in the paper. We can elaborate on our experience tuning parameters to the larger operational domain in the 2023-24 season and suggest how others could do the same.
Unfortunately, tuning these types of model chains in the past has relied heavily on trial-and-error methods to produce realistic results. When introducing a newly developed method at the start of an operational season, we often need to adapt the code to handle unexpected midseason issues. Collaborative efforts to generalize models would be very beneficial, and we strongly support future efforts to test, validate, and apply models in different regions and contexts.
We can address the specific comments by clarifying some details in the manuscript.
In response to the comments on sequential clustering:
- We present grid search results with sequential weights between 0 and 0.1. Initially, we tested larger values, but they caused the clustering results to converge rapidly to a stable solution at the start of the season and then remain unchanged. This effect was partly observed for beta = 0.1, which we explain at the end of the section. We can add a note at the beginning of the section to clarify why we don't test beta values greater than 0.1.
- When evaluating the sequential clustering, we had fewer eligible days. The study spans 150 days (Nov 26 to Apr 24), with model data missing on 35 days. For non-sequential clustering, this leaves 115 days to analyze. However, for sequential clustering, we need to compare consecutive days, so each day of missing data prevents two comparisons (i.e., today's and tomorrow's). Based on the timing of the missing data, we had 107 consecutive day pairs to analyze.
- The critique about the ARI not clearly supporting beta = 0.02 is fair. The goal is for clusters to change when significant snowpack changes occur. We found that non-sequential clustering changed the regions too often, which would be disruptive to forecasting workflows. At the same time, forecasters had the sense they could not change the regions enough due to limited data and operational constraints. This suggests the ideal complexity for changes would be somewhere in between the two methods. The ARI shows that for beta = 0.02, the complexity of changes is midway between the non-sequential and human methods. Ultimately, selecting this parameter involved trial and error to produce favorable results, which we can explain more clearly in the manuscript.
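For readers unfamiliar with the metric used in these comparisons, the adjusted Rand index can be computed directly from the contingency table of two partitions. This is a generic textbook implementation, not the paper's analysis code; the example labelings are made up.

```python
from math import comb
from collections import Counter

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two partitions of the same items:
    1 = identical partitions, ~0 = chance-level agreement."""
    n = len(labels_a)
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case (e.g., all items in one cluster)
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

# Relabeled but identical groupings agree perfectly:
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

Note the index is invariant to cluster labels, which matters here because cluster numbering can change from day to day even when the grouping of subregions does not.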
Other comments:
- In Sect 5.3, we used a different fuzziness parameter (r = 2) to form coherent clusters. This parameter is sensitive to the distribution of values in the distance metric. The distance metric derived from counts has a larger skew towards values of 0 and 1 compared to metrics based on snow profile comparisons, so using r = 1.25 for the counts resulted in crisp clusters with membership values of 0% and 100%. To increase the fuzziness, we tested larger r values to optimize the average silhouette width, finding that r = 2.00 was optimal for the clustering method counts and r = 2.15 for human assessment counts.
- To date, we have only applied clustering to flat field simulations and have not specifically investigated seasonal changes influenced by solar radiation. We have observed that clusters often show stronger latitudinal dependencies in the spring, as southern regions transition to spring conditions earlier, while in winter, longitudinal patterns driven by precipitation are more dominant. We can assume similar patterns might be observed across aspects. Incorporating simulations on different aspects and elevation bands will add complexity to summarizing regional-scale patterns. Although we have considered this, we do not have a clear solution or recommendation. Ultimately, if we want to apply a standard clustering method, we need to distill all the snowpack and spatial information in a region into a single numeric pairwise comparison. Future work could explore combinations of averaging snow profiles for hazard-relevant features (as in Herla et al., 2022) and quantifying spatial relationships across spatial features with distances.
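The sensitivity of cluster crispness to the fuzziness parameter r mentioned above can be illustrated with the fuzzy c-means-style membership formula. This is a simplification for intuition only (FANNY's objective differs in detail, but the qualitative effect of r is the same), and the distances are made up.

```python
def memberships(dists, r):
    """Fuzzy membership of one item across clusters, given its distance to
    each cluster; r > 1 controls fuzziness (r near 1 gives crisp memberships)."""
    weights = [d ** (-2.0 / (r - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]

# Item twice as far from cluster 2 as from cluster 1:
d = [1.0, 2.0]
print(memberships(d, 1.25))  # near-crisp, roughly [0.996, 0.004]
print(memberships(d, 2.00))  # fuzzier: [0.8, 0.2]
```

This shows why a skewed distance distribution pushed r = 1.25 toward 0%/100% memberships, and why a larger r was needed to recover informative intermediate membership values.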
Citation: https://doi.org/10.5194/egusphere-2024-1609-AC2
Data sets
Clustering simulated snow profiles to form avalanche forecast regions – Code and Data Simon Horton, Florian Herla, and Pascal Haegeli https://osf.io/4u2az/
Model code and software
Clustering simulated snow profiles to form avalanche forecast regions – Code and Data Simon Horton, Florian Herla, and Pascal Haegeli https://osf.io/4u2az/
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote
---|---|---|---|---|---
181 | 35 | 24 | 240 | 9 | 8