A quantitative module of avalanche hazard—comparing forecaster assessments of storm and persistent slab avalanche problems with information derived from distributed snowpack simulations
Abstract. Avalanche forecasting is a human judgment process with the goal of describing the nature and severity of avalanche hazard based on the concept of distinct avalanche problems. Snowpack simulations can help improve forecast consistency and quality by extending qualitative frameworks of avalanche hazard with quantitative links between weather, snowpack, and hazard characteristics. Building on existing research on modeling avalanche problem information, we present the first spatial modeling framework for extracting the characteristics of storm and persistent slab avalanche problems from distributed snowpack simulations. Grouping of simulated layers based on regional burial dates allows us to track them across space and time and calculate insightful spatial distributions of avalanche problem characteristics.
We applied our approach to ten winter seasons in Glacier National Park, Canada, and compared the numerical predictions to human hazard assessments. Despite good agreement in the seasonal summary statistics, the comparison of the daily assessments of avalanche problems revealed considerable differences between the two data sources. The best agreements were found in the presence and absence of storm slab avalanche problems and the likelihood and expected size assessments of persistent slab avalanche problems. Even though we are unable to conclusively determine whether the human or model data set represents reality more accurately when they disagree, our analysis indicates that the current model predictions can add value to the forecasting process by offering an independent perspective. For example, the numerical predictions can provide a valuable tool for assisting avalanche forecasters in the difficult decision to remove persistent slab avalanche problems. The value of the spatial approach is further highlighted by the observation that avalanche danger ratings were better explained by a combination of various percentiles of simulated instability and failure depth than by simple averages or proportions. Our study contributes to a growing body of research that aims to enhance the operational value of snowpack simulations and provides insight into how snowpack simulations can help address some of the operational challenges of human avalanche hazard assessments.
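The layer-tracking idea described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical (the data layout, tolerance window, and function names are assumptions); it only mirrors the stated idea of grouping simulated layers by regional burial dates so that the same weak layer can be followed across grid points and days, and it is not the authors' implementation (their code is linked under "Interactive computing environment" below).

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical layer records: (grid_point_id, burial_date, depth_m, p_unstable)
simulated_layers = [
    (0, date(2023, 1, 14), 0.45, 0.81),
    (1, date(2023, 1, 15), 0.52, 0.62),
    (2, date(2023, 1, 14), 0.38, 0.90),
    (0, date(2023, 2, 2), 0.20, 0.40),
]

# Regional burial dates around which simulated layers are grouped
regional_burial_dates = [date(2023, 1, 14), date(2023, 2, 2)]
TOLERANCE = timedelta(days=2)  # assumed matching window

def group_layers_by_regional_date(layers, regional_dates, tol=TOLERANCE):
    """Assign each simulated layer to the closest regional burial date
    (within a tolerance), so the same weak layer can be tracked across
    grid points and forecast days."""
    groups = defaultdict(list)
    for grid_point, burial, depth, p_unstable in layers:
        closest = min(regional_dates, key=lambda d: abs(d - burial))
        if abs(closest - burial) <= tol:
            groups[closest].append((grid_point, depth, p_unstable))
    return groups

groups = group_layers_by_regional_date(simulated_layers, regional_burial_dates)
for regional_date, members in groups.items():
    print(regional_date, len(members), "grid points")
```

Grouped this way, the per-layer values of instability and depth can then be summarized across all grid points of a region into the spatial distributions the abstract refers to.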
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-871', Zachary Miller, 24 May 2024
Overview:
The manuscript titled “A quantitative module of avalanche hazard—comparing forecaster assessments of storm and persistent slab avalanche problems with information derived from distributed snowpack simulations” develops an additional toolset intended to improve avalanche forecasting quality and analyzes its effectiveness. The authors leverage recent scaling developments in the utilization of the SNOWPACK model to produce spatially distributed snow cover outputs over ten winter seasons for Glacier National Park, Canada. They then post-process these data to produce numerical predictions of the characteristics of storm and persistent slab avalanche problems and compare those results against the time series of daily human hazard assessments. Their comparison is extensive and thorough, evaluating both broad trends and the day-to-day evolution of avalanche problems around specific events. They describe their methods clearly despite the relative complexity required in the post-processing and intercomparison of their data sets. The results and discussion clearly establish where their work fits within the current sphere of research and the contribution their methods offer to the snow and avalanche science community.
I do not have any major issues with this manuscript and feel that it is of very high quality. My largest question is whether the authors considered comparing their two hazard assessments (simulated and human) against observed avalanche records. I realize that the additional effort is probably outside the scope of the current project and that there are known limitations to observational records, but I also believe that their specific domain, Glacier National Park, Canada, has potentially one of the most complete and thorough records available due to the high level of professional avalanche activity in the terrain in and around the park. These records, together with the Avalanche Hazard Index (Schaerer, 1989), could provide a relative “truth” for the avalanche hazard characteristics being compared. I would appreciate a response to my question but do not feel that this further analysis is required for publication given the robustness of the work presented.
Specific technical corrections and comments:
Line 100 – use of word “over” is confusing within the discussion of the ordinal likelihood of avalanches scale, simply remove to clarify
Line 106 – use of word “over” is confusing within the discussion of the ordinal North American Public Avalanche Danger Scale, simply remove to clarify
Line 119 – “layers that were exposed to the snow surface” is confusing, perhaps adjust simply to “layers that were the exposed snow surface”
Line 120 – “represent rain events that form a crust” implies that rain is the only way surface crusts form and the date tags probably include additional crust formation events. Perhaps something like: “represent crust formation events at the snow surface such as rain or insolation-driven melting.”
Line 180 – It seems as though the likelihood of avalanches should be represented by the layer with the highest p_unstable value or an average for that profile rather than the lowest?
Figure 5g – The color and interquartile range of the air temperature line are the same as those of p_unstable in plots 5c-5f and are, therefore, slightly confusing. Consider changing the line to dashed or a different color to differentiate it.
Figure 5h – The color of the median height of new snow is the same as that of the storm slab instabilities in plots 5d and 5f and is, therefore, slightly confusing. Consider changing the color to differentiate it.
Figure 6 – It appears but is not explicitly described that assessed values are solid and modelled are hatched. Perhaps mention this in the figure description or add a legend.
Line 299 – “increased strongly” doesn’t make sense in reference to the simulated depth of the weak layer, perhaps remove “strongly” or change wording to “increased substantially”
Line 304 – “short and moderate peaks of modeled instability” is confusing since “moderate” is not defined within the spectrum of modeled instability
Line 331 – “as” is meant to be “was”
Figure 8c & 8d – “Turn-on” and “Turn-off” are confusing category names; I am interpreting them as whether the avalanche problem was assessed (“Yes”/“No”) or added/removed?
Line 359 – “is unstable” should be “are unstable”
Line 361-363 – You mention that deeper splits show a generally higher hazard rating associated with storm snow problems than with persistent problems. Is this due to the relative frequency of lower hazard ratings associated with persistent problems (aka “spicy moderate” or existing for weeks) vs. the short-term spiking hazard commonly found with storm snow problems (quick to rise and quick to fall)? If so, I believe it is worth clarifying the relative temporal effects of both types of problems because your results discussion seems to say that the model simply pairs lower hazard with persistent problems.
Line 380-381 – Delete sentence since it seems you are submitting the manuscript for publication in a peer-reviewed journal and I hope you’ve taken those additional research steps.
Line 423-425 – Very concise distillation of the effectiveness of your model – nice job.
Line 444-445 – How does forecasters' removal of reported persistent problems appear arbitrary? That is a big statement to make given the multitude of factors forecasters must balance to make that call.
Line 545 – Understanding the truth of avalanche hazard is infinitely complex. Has your team considered comparing your results with observed avalanche activity to quantify the accuracy of avalanche depths/sizes and, loosely, their distribution (despite the inherent bias in physically observed avalanche records)? I wonder if this could help clarify a relative truth, especially when the simulated and reported hazards differ.
Final Thoughts:
I reiterate that with minor updates this paper will be a valuable addition to the avalanche forecasting and snow science communities.
Citation: https://doi.org/10.5194/egusphere-2024-871-RC1
AC2: 'Reply on RC1', Florian Herla, 05 Nov 2024
## Interactive Comment to referee comment #1 (Zachary Miller)

Thank you very much for your evaluation and the supportive assessment; we highly appreciate it!

We appreciate your comment about considering observed avalanche activity as an additional validation data set. Glacier National Park, Canada, indeed has valuable data on avalanche activity that could provide interesting insights for validating the simulations. Our main reasons for not including such a comparison in our current manuscript are as follows. We focused on one validation data set (human assessments) and aimed at extracting as much useful information as possible from the comparison to the simulated data set, exploring the challenge from several perspectives. This made the methodology already quite complex, and including yet another data set would risk making the story line more confusing.

Moreover, the comparison against avalanche activity comes with several challenges that would have expanded the current manuscript considerably. First, human-observed avalanche activity comes with known caveats. For example, "no observed avalanches" does not always mean that there were no avalanches (e.g., bad weather and visibility, limited terrain available for observations), and the timing of observations might not coincide with the release of avalanches or the peak modeled instability. Automated detections of avalanche activity in Glacier National Park are limited to the highway corridor, which hosts an artificially managed snowpack. In other words, it is only suited to validate storm snow instabilities, but not so much persistent problems. Then, there remains the challenge of translating artificial triggering of storm slabs with explosives to the modeled storm slab characteristics. Is the currently used instability model p_unstable (Mayer et al., 2022) suited, or would we need to prefer the avalanche day predictor by Mayer et al. (2023)? How should process-based indices be included?

Overall, we agree and see it as highly important to carry out these comparisons against observed avalanches. A dedicated study might be best suited for this endeavor, and ongoing conversations at the international working group [AvaCollabra](https://gitlab.com/groups/avacollabra/-/wikis/home) could help in designing a research approach in the most meaningful way. We will include a statement in our Conclusion section that highlights the opportunity for a validation against observed avalanche activity.

Thank you also for the specific technical corrections and comments. We will incorporate them into the revised manuscript and add clarifications where required.

Citation: https://doi.org/10.5194/egusphere-2024-871-AC2
CC1: 'Comment on egusphere-2024-871', Frank Techel, 05 Aug 2024
Dear authors,

I greatly appreciate your comprehensive comparison between model predictions and human assessments. When reading through your manuscript, several points didn't become fully clear or were a little confusing. These all relate to the Methods described in Sections 3.2 and 3.3. Please find some feedback and questions below.

I hope these comments and questions are helpful,
Frank Techel

Specific comments and questions
- While explaining the link between the distribution of p_unstable and the likelihood of avalanches (in the CMAH) makes sense (Sect. 3.1), consider referring to p_unstable rather than the likelihood of avalanches whenever the model prediction is meant (e.g., L174). This would make it easier to tell when you are referring to model predictions and when to human assessments.
- Along that same line, you use expected p_unstable for the first time on L211. I presume it is meant to be introduced on L191(?), though there it is referred to as the expected likelihood of avalanches. Only later did I notice that Figure 4a shows the expected p_unstable values, but this is nowhere mentioned (or I missed it). From Fig. 4a, I took that the expected p_unstable is the mean of all the p_unstable values in the plot. On L189 you refer to the likelihood of avalanches (but I presume this is again p_unstable), for which you derive various percentiles. Consider indicating that the 50th percentile is what you call expected p_unstable.
- L164: you say that you used the threshold p_unstable >= 0.77 to define layers with poor stability, as proposed by Mayer et al. (2022). But afterwards, you seem to analyze exclusively p_unstable; at least in all the figures p_unstable is shown. Why did you primarily explore p_unstable and not the proportion p_unstable >= 0.77? I would expect that this explains why your distribution of 2 (moderate) was wider than in Switzerland, while the distribution of 3 (considerable) was wider in Switzerland than in your data (L404-406). You also say something along that line. Out of curiosity: while analyzing, did you plot Figures 7a, c, e, g and Figure 8g and h using the proportion unstable rather than the expected p_unstable?
- L178: I assume this is just a typo, it should probably read >= rather than <=?
- L186: I don't understand how the point cloud in Figure 4 can provide a spatial distribution. I am aware that spatial distribution is the term used for the number of triggering locations in the CMAH. In this context, I found it rather confusing, as the distribution in the plot doesn't have a spatial component. Consider changing to something like "the number of potential triggering points within a region can be gauged from the distribution of p_unstable". At least to me, p_unstable provides primarily an indication of potential instability considering a Rutschblock test. Of the unstable locations, only a small fraction will be sufficiently unstable to result in human-triggered (or natural) avalanches (Mayer et al., 2023).
- L192: How do you proceed when none of the profiles fulfills the p_unstable >= 0.77 criterion, i.e., how do you derive the expected depth when no such layers are present?
- You mentioned twice that p_unstable correlated more strongly with the danger level than the likelihood terms did (e.g., L311-312). This is interesting. Is this maybe due to likelihood estimates being less reliably estimated by forecasters than danger level estimates? Or is this maybe linked to the fact that p_unstable is a mix of Rutschblock stability and danger level (p_unstable actually correlated more with danger level than with RB stability, see Mayer et al., 2022 [p. 4601])?
Citation: https://doi.org/10.5194/egusphere-2024-871-CC1
AC1: 'Reply on CC1', Florian Herla, 05 Nov 2024
## Interactive Comment to community comment #1 (Frank Techel)

Thank you for your interest in this manuscript and for providing feedback to improve it! We will take your suggestions and include them in the revised manuscript; in particular, we will use the terminology around the human-assessed "likelihood of avalanches" and the model-predicted p_unstable more consistently.

In the following are responses to your specific questions:

* We wanted to explicitly test whether knowledge of the "full distribution" is more valuable than knowing only about the proportion of unstable grid points. Our CTree analysis confirmed this hypothesis (e.g., Fig. 10, L474ff, L494ff), and particularly highlights the importance of the 90th percentile of p_unstable and the proportion unstable for discriminating between danger levels. We appreciate and value your suggestion of exploring the proportion unstable in the same light as the expected p_unstable, particularly since other studies primarily used the proportion unstable in their analysis (Mayer et al., 2023; Herla et al., 2023). Therefore, we re-plotted Figures 7a, c, e, g, and Figure 8g, h. Please find the resulting figures and our interpretation further below.
* When none of the grid points were unstable, we could not compute the expected depth and assigned NaN values to those results.
* Thanks also for your last question. Our results show more consistency between human and model for the higher-level variables. We observed this pattern not only for the danger rating, but also for the presence or absence of a problem, for example. Therefore, I do think that the characteristics that are more intricate to assess, such as likelihood, are less reliably and less consistently estimated by forecasters. However, I am grateful for your hint that the local danger level indeed plays a role in the training process of p_unstable by filtering for stable and unstable profiles that were observed during low and high danger days, respectively. We will add a short statement to the revised Discussion to include this subtle but important detail.

### Re-plotted figures

#### Figure 7

We re-plotted Fig. 7a, c, e, g, which uses the expected p_unstable on the y-axis, with two alternative variables: first with the 90th percentile of p_unstable, and second with the proportion of grid points that exhibit p_unstable >= 0.77 ("proportion unstable"). To compare the results in the most optimal way, the following figure shows the expected p_unstable in the left column (panels a, d, g, j), the 90th percentile of p_unstable in the middle column (panels b, e, h, k), and the proportion unstable in the right column (panels c, f, i, l).

Focusing first on the median positions on the chart (i.e., the squares colored and labeled with the danger rating), all three variables exhibit some capability in discriminating between the four different danger levels. Moreover, the median positions on the chart follow very similar patterns for all three variables. The main difference is that the three variables occupy different ranges on the y-axis. This is a simple shift towards higher values for the 90th percentile of p_unstable compared to the expected p_unstable, whereas the proportion unstable falls onto a conceptually different y-axis that spans the entire interval [0, 1]. Please see our discussion on this in L407-412.

While the 90th percentile of p_unstable seems incapable of discriminating between danger levels considerable and high (due to an overlap of their medians), this very same variable is better at discriminating between danger levels low and moderate than the other two variables (due to a substantial shift of the contour fields). This is in line with the findings from our CTree analysis (Fig. 10), which uses the 90th percentile of p_unstable as the first split and then considers the proportion unstable to discriminate between the higher danger levels in subsequent splits.

While the median values of the proportion unstable seem to discriminate best between the four danger levels, their underlying density distributions show the most overlap. This pattern is particularly strong for danger level moderate, where a prominent contour ridge spans the entire y-axis, which makes it a highly uncertain variable for partitioning.

Overall, the re-plotted figure offers a slightly different visualization of our data set that supports the findings of our existing CTree analysis. We believe that the figure could offer an additional perspective to the most interested reader when placed in an Appendix of the revised manuscript.

#### Figure 8

We also re-plotted Fig. 8g, h with the proportion unstable instead of the expected p_unstable (see figure below). Similarly to the existing plot, there is no trend for the storm snow problem case (panel g). For the persistent weak layer case (panel h), the trend is even weaker in the re-plot than in the existing plot that used the expected p_unstable (Fig. 8h). Therefore, contrary to Figure 7, the re-plot of Figure 8 in our opinion does not offer any valuable insight.

Citation: https://doi.org/10.5194/egusphere-2024-871-AC1
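To make the summary statistics discussed in this exchange concrete, here is a minimal sketch of how the regional distribution of p_unstable could be reduced to the quantities named above. Only the 0.77 threshold (Mayer et al., 2022), the use of the 50th percentile as the expected p_unstable, and the NaN handling for the expected depth come from the discussion; the function name, data layout, and the choice of the median for the expected depth are assumptions made for illustration.

```python
import numpy as np

THRESHOLD = 0.77  # instability threshold from Mayer et al. (2022)

def summarize_region(p_unstable, depth):
    """Summarize the regional distribution of instability for one
    tracked layer on one day. `p_unstable` and `depth` are arrays with
    one value per simulated grid point."""
    p_unstable = np.asarray(p_unstable, dtype=float)
    depth = np.asarray(depth, dtype=float)

    unstable = p_unstable >= THRESHOLD
    return {
        # "expected" p_unstable: the 50th percentile of the distribution
        "expected_p_unstable": np.percentile(p_unstable, 50),
        # upper tail, the first split in the CTree analysis
        "p90_p_unstable": np.percentile(p_unstable, 90),
        # fraction of grid points exceeding the threshold
        "proportion_unstable": unstable.mean(),
        # expected failure depth over unstable grid points only;
        # NaN when no grid point is unstable, as stated in the reply
        "expected_depth": (np.median(depth[unstable])
                           if unstable.any() else np.nan),
    }

# Example with synthetic values for 1000 grid points
rng = np.random.default_rng(0)
stats = summarize_region(rng.beta(2, 5, 1000), rng.uniform(0.2, 1.2, 1000))
print(stats)
```

The sketch also makes the axis difference discussed above visible: the two percentile-based summaries live on the p_unstable scale, while the proportion unstable is a fraction on [0, 1].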
RC2: 'Comment on egusphere-2024-871', Veronika Hatvan, 19 Aug 2024
The manuscript titled "A quantitative module of avalanche hazard—comparing forecaster assessments of storm and persistent slab avalanche problems with information derived from distributed snowpack simulations" presents a well-executed study aimed at enhancing avalanche forecasting by integrating numerical snowpack simulations with human judgment. The study introduces a spatial approach to extracting the characteristics of storm and persistent slab avalanche problems from distributed snowpack simulations, tracing individual snowpack layers over time and space. This approach enables the calculation of spatial distributions of avalanche problem characteristics and presents the data in familiar hazard chart formats aligned with the Conceptual Model of Avalanche Hazard (CMAH).
The study leverages snow cover data spanning ten winter seasons from Glacier National Park, Canada, to examine the agreement between snow cover simulations and human assessments for persistent and storm slab avalanche problems. The comparison is thorough, addressing both seasonal trends and day-to-day evaluations. The authors clearly describe their methods, which are up-to-date and well-suited to the study’s objectives. This work aligns well with recent advancements aimed at integrating snowpack modelling more closely with operational forecasting workflows.
The applied methods and developed approaches are of high quality and contribute significantly to the goal of further integrating snow cover modelling results into avalanche forecasting workflows. The comparison between modelling results and human assessments provides a valuable foundation and insights for future applications.
Minor Corrections and Comments:
Line 174: I noticed a small typo in the phrase 'characteristics avalanche problem type'; I assume it should be 'characteristic avalanche problem type'? Additionally, I concur with the comment by Frank Techel that, for clarity, it would be beneficial to consistently use the term p_unstable when referring to the model predictor. This distinction will help avoid (my) confusion when differentiating it from human assessments (e.g., Line 174 and other locations).
Line 175 – 178 & 180 - 181: To me, it is unclear how the depth of the identified layer differs from the depth of the deepest unstable layer. To my understanding, these two are the same. Consider revising for more clarity, otherwise I would be interested in a reply to clarify this distinction.
Line 178 & 181: I assume this is a typo, and p_unstable ≤ 0.77 should instead read p_unstable ≥ 0.77.
Line 186: I do not understand the term 'spatial distribution' as it relates to Figure 4, as I don't see any spatial component represented in the figure. Consider revising this term for greater clarity.
Figure 5: For clarity, it would be helpful to choose different colours for air temperature and HN24, as the current colours are very similar to those used for p_unstable and avalanche problems in the upper panels. Additionally, I assume that air temperature and HN24 are based on modelled data rather than measurements? It might be beneficial to add a small comment on this for clarity.
Figure 6: It took multiple views to understand that the hatched bars represent modelled data and the full-colour bars represent assessed data. To improve clarity, consider explicitly stating this in the figure caption for easier understanding.
Line 380: Remove sentence – you are already submitting to a peer-reviewed journal.
Conclusion:
I do not have any major issues with this manuscript; the research is solid, and the conclusions are well-supported by the data. My comments are minor in nature. Overall, I recommend this manuscript for publication after these minor revisions are addressed.
Citation: https://doi.org/10.5194/egusphere-2024-871-RC2
AC3: 'Reply on RC2', Florian Herla, 05 Nov 2024
## Interactive Comment to referee comment #2 (Veronika Hatvan)

Thank you very much for reviewing our manuscript and providing very supportive feedback! We highly appreciate it!

We will include your suggestions in the revised manuscript, particularly using terminology more consistently and clearly (e.g., p_unstable) and making Figures 5 and 6 more accessible. We will also include an introductory paragraph that aims to reconcile the two terms frequency distribution and spatial distribution, which are used in Europe and North America, respectively.

Citation: https://doi.org/10.5194/egusphere-2024-871-AC3
Interactive computing environment
Quantitative module of avalanche hazard—Data and Code. F. Herla, P. Haegeli, S. Horton, and P. Mair. https://doi.org/10.17605/OSF.IO/94826