the Creative Commons Attribution 4.0 License.
Machine learning for automated avalanche terrain exposure scale (ATES) classification
Abstract. Avalanche risk management is essential for backcountry safety. The Avalanche Terrain Exposure Scale (ATES) classifies mountain terrain based on its potential exposure to avalanche hazards and offers assistance to backcountry users in their terrain assessment. Initially, ATES maps were generated manually, a costly and time-consuming process. Automated ATES model chains (AutoATES) have been developed to address these limitations, but existing approaches require careful parametrisation when applied to novel areas.
This study applies machine learning methods, specifically Random Forests, for automated ATES classification by replacing expert-driven AutoATES classification trees with a data-driven approach. Using a labelled training dataset from the Pirin Mountains, Bulgaria, we trained and evaluated three Random Forest models to assess their potential in classifying avalanche terrain. We analysed the influence of various input features, including slope, potential release areas, and percent canopy cover, on classification performance. Our results indicate that Random Forests offer a robust and scalable method for ATES mapping and that incorporating additional input features can improve classification performance. The accuracies for our Random Forest models on a held-out test set were 79.31 %, 82.32 %, and 80.42 %, demonstrating their potential for automated avalanche terrain classification and supporting safer backcountry decision-making.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-2143', John Sykes, 01 Jul 2025
Overview
This manuscript presents a machine learning approach to classifying avalanche terrain using the Avalanche Terrain Exposure Scale (ATES). The authors developed a novel and meaningful validation approach and tested performance of several iterations of random forest models in a study area with limited avalanche information available. Overall, the research is well written, the methods are explained well, figures and tables are easy to digest and visually capture the key points of the research, and the results and discussion are sound.
I recommend this manuscript be published after minor revisions. Specifically, there are a few methodological questions that need to be clarified and I would ask that the authors reconsider how they are wording their conclusion that forest canopy cover is not an important feature for automated ATES classification in light of the limitations of the validation data and quality of the forest data used. Feature importance in a random forest model is highly dependent on the training and testing data, so this conclusion may be specific to the study area of this research. Further, an optional addition that would be useful to situate these results in the broader field would be to compare the accuracy of the RF approach to the previously published ‘deterministic’ autoATES method.
Specific Comments
Intro
- Line 15 - Global fatality numbers are much higher than I’ve seen in other recent publications (from Acharya et al 2023). Worth double checking, typical records from Europe and North America estimate closer to 140 annual fatalities. This additional research on Himalayan events is very meaningful, but differs from past estimates.
- Line 63 to 70 - If autoATES is working well for open terrain this points to a specific discrepancy in how forest data is captured. Therefore, another alternative to making the application of autoATES more consistent for novel regions would be improving consistency of input data sets (forest cover, DEM).
- Line 72 to 78 - You describe the autoATESv2.0 approach to classifying terrain as expert based, which is accurate, but it is also a physically based model that uses the output of PRA and runout models to explicitly categorize terrain. Shifting to a machine learning approach is more of a statistical approach to classifying terrain using ATES. Can you add some discussion about the trade-off between physical and statistical models? I understand that machine learning is much easier to implement if you have high quality training data, but there are limitations in terms of generalizability and reliance on a limited set of training data. These trade-offs are critical to highlight to paint the full picture of switching from a physically based to statistically based modelling approach.
Methods
- Line 120 to 125 - Good description of the overall study area topography. You could move the discussion about the distribution of different ATES classes to this section if you want. I found myself looking for that information while reading this section, but I see that you have it nicely summarized in Table 1 and paragraph 1 of section 2.3.
- Figure 1 is an excellent overview of your study area for those unfamiliar with the local geography. I assume the light green/teal shading is a rough approximation of treeline elevation, which may be worth including in the legend.
- Line 135 to 140 - Can you add information about how these DTMs were created? For example, using LiDAR, photogrammetry, or radar? Was the DEM data produced from satellite, UAV, or drone based remote sensing?
- Line 122 to 148 - In my experience the quality of the input forest data has a very large impact on the quality of the output of the autoATES model. Did you consider creating your own forest data using free satellite imagery such as Sentinel 2? Considering the known limitations of the Copernicus forest data you mentioned, creating your own forest data could significantly improve autoATES performance.
- Line 149 - This sentence probably does not need its own paragraph.
- Line 161 - Relying on one local expert to create the training data introduces a high degree of subjectivity to the machine learning approach. Prior research has shown that there can be major differences in how avalanche experts categorize terrain and apply the ATES scale. By relying on one local expert and using a machine learning approach you are putting a heavy emphasis on the skill of the local expert in driving the accuracy of your automated model. I recommend adding a statement along these lines to recognize the potential bias in your training data.
- Line 170 - Why did you decide not to include ATES class 0 terrain (non-avalanche)?
- Figure 2 - This is a very interesting and novel approach to precisely define many small polygons and not create a continuous validation data set. The limitation of mapping continuous areas at high resolution is a significant challenge for developing validation data for autoATES.
- Line 178 - Why did you choose a 50/50 split for your training and testing data? To my knowledge an 80/20 or 70/30 split is more typical of the machine learning field.
- Line 207 - Assigning a value of 0 to slope angles below 28 degrees puts a lot of faith in the accuracy of your DTM. This could lead to missing PRA on small slopes where adjacent lower angle terrain can smooth the slope angle due to the neighborhood function used to calculate slope angle. Further, avalanches on slope angles below 28 degrees are possible with persistent weak layers, especially surface hoar which is notorious for causing avalanches on slope angles of 25 degrees or less. I would consider decreasing this cutoff value or removing the cutoff value entirely and fine tuning the slope angle Cauchy function so that the ‘fuzzy and’ operator can handle these low angle slopes with consideration of forest cover and wind shelter.
- Line 225 - Why are you targeting/limiting your runout simulations to size 3 avalanches? The ATES v2 classification scale specifies return frequencies for avalanches greater than size 3. Based on your description of the terrain in the study area, with some slopes having 1000 m of vertical relief, there is a strong possibility of avalanches larger than size 3. How do you factor these very large to historic avalanche events into your ATES classification?
- Table 4 - The RF model with the most input features only has 7 features. Why did you limit your feature selection to a relatively sparse set? Do you think it would be worthwhile to expand the set of features to include additional output from the PRA, additional forest data, or additional runout simulation information? One of the main benefits of machine learning methods is that they can handle very high dimensional data, which would support testing an RF model with more features.
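The Line 207 comment on the 28 degree cutoff can be illustrated with a minimal sketch. It assumes a generalised bell (Cauchy) membership function for slope and a Werners-type 'fuzzy and' that blends the minimum and the mean of the memberships. The parameter values `SLOPE_A`, `SLOPE_B`, `SLOPE_C`, and `GAMMA`, and the function names, are hypothetical and chosen only for illustration; they are not taken from the manuscript or from any published PRA model.

```python
# Illustrative parameters only (assumptions, not values from the manuscript).
SLOPE_A = 10.0   # width of the slope membership curve, in degrees
SLOPE_B = 2.0    # steepness of the curve flanks
SLOPE_C = 40.0   # slope angle with full membership, in degrees
GAMMA = 0.5      # weight of the minimum in the 'fuzzy and' blend


def cauchy_membership(x, a, b, c):
    """Generalised bell (Cauchy) membership: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def fuzzy_and(memberships, gamma=GAMMA):
    """Werners-type 'fuzzy and': gamma * min + (1 - gamma) * mean."""
    values = list(memberships)
    return gamma * min(values) + (1.0 - gamma) * sum(values) / len(values)


def release_score(slope_deg, forest_mu, wind_mu, cutoff_deg=None):
    """Combine slope, forest, and wind-shelter memberships into one score.

    If cutoff_deg is given, slopes below it are forced to a score of 0,
    mimicking a hard slope threshold. With cutoff_deg=None the membership
    curve alone decides how much low-angle terrain contributes.
    """
    if cutoff_deg is not None and slope_deg < cutoff_deg:
        return 0.0
    slope_mu = cauchy_membership(slope_deg, SLOPE_A, SLOPE_B, SLOPE_C)
    return fuzzy_and([slope_mu, forest_mu, wind_mu])


# A 26 degree, open, wind-loaded slope with and without the hard cutoff.
with_cutoff = release_score(26.0, forest_mu=0.9, wind_mu=0.9, cutoff_deg=28.0)
without_cutoff = release_score(26.0, forest_mu=0.9, wind_mu=0.9)
```

Under these assumed parameters the hard cutoff discards the 26 degree slope entirely, while the membership curve alone still assigns it a non-zero score that forest cover and wind shelter can modulate.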
Results
- The figures and tables in this section do a very good job of illustrating how the autoATES output looks on the terrain and providing a statistical summary of each model.
- Figure 8 - It is interesting that slope is consistently the second highest in feature importance while PRA is near the middle or end of the feature importance list. The PRA output is largely driven by slope angle distribution, which makes me wonder why slope is so dominant here. Could there be an impact of your local expert using slope angle maps to create the training/testing data and therefore your RF models contain some bias towards weighting slope angle more heavily?
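One way to probe the Figure 8 question is to compare impurity-based and permutation-based importances on data where two features are known to be correlated. The sketch below uses scikit-learn on purely synthetic data: the `slope`, `pra`, and `pcc` arrays, the class thresholds, and the function name `compare_importances` are all assumptions made for illustration and do not reproduce the authors' dataset or model settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def compare_importances(n_samples=2000, seed=0):
    """Fit a Random Forest on synthetic terrain data and return two importance measures."""
    rng = np.random.default_rng(seed)

    # Synthetic features (assumptions): slope in degrees, a PRA-like score
    # derived from slope plus noise, and canopy cover unrelated to the label.
    slope = rng.uniform(0, 60, n_samples)
    pra = np.clip((slope - 25) / 20, 0, 1) + rng.normal(0, 0.05, n_samples)
    pcc = rng.uniform(0, 100, n_samples)
    features = np.column_stack([slope, pra, pcc])

    # Synthetic four-class label (1 to 4) driven by slope alone.
    labels = np.digitize(slope, [20, 30, 40]) + 1

    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.5, stratify=labels, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(x_train, y_train)

    # Impurity-based importances come from the training process itself.
    impurity = model.feature_importances_
    # Permutation importances measure the accuracy drop on held-out data.
    permuted = permutation_importance(
        model, x_test, y_test, n_repeats=5, random_state=seed
    )
    return impurity, permuted.importances_mean


impurity_importance, permutation_importance_mean = compare_importances()
for name, imp, perm in zip(
    ["slope", "pra", "pcc"], impurity_importance, permutation_importance_mean
):
    print(f"{name}: impurity={imp:.3f} permutation={perm:.3f}")
```

Because the PRA-like feature is built from slope, the two features carry overlapping information and the importance rankings can shift between them depending on the measure used. A similar comparison on the real training polygons might help separate a genuine slope effect from an artefact of how the labels were drawn.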
Discussion
- Line 495 - Do you think the limitation you mentioned in your forest data in the intro could be contributing to the lower performance for challenging terrain?
- Line 500 to 503 - Yes! So maybe it would be worthwhile to include even more features to try and bump that accuracy even higher?
- Line 515 to 516 - Agreed, this is a critical consideration for evaluating model performance for autoATES.
- Line 525 to 540 - There are several other potential reasons that could cause PCC to have a lower feature importance. First is the distribution of total forested versus non-forested validation pixels. Everywhere that there is no forest cover in your training data the PCC feature would not be useful for classification. Therefore, the relatively low ranking of PCC in feature importance is likely due to the fact that much of the terrain you trained and tested on is not forested. The second cause is that you highlight significant limitations in the forest data in the intro section. The quality of the input forest data will be directly related to how useful it is for ATES classification. I agree that the thresholding approach for forest cover in the current autoATES classification approach is cumbersome and could be improved, but that does not mean that forest cover is not a critical feature to include in the ATES classification model. Finally, just because a feature has lower importance doesn’t mean it is not contributing to a better overall classification. As you stated on line 500, machine learning models excel at incorporating many features into the classification. So even if forest cover is only useful for 5-10% of the pixels in your data set, that does not mean that excluding it is the correct approach. The fact that RF2 is the most accurate and includes the most features (including PCC) is a good justification for keeping it. Adding additional features beyond what is included in RF2 might not produce new features with very high ranking feature importance values, but it could incrementally improve accuracy for specific types of terrain where the current model is lacking. Overall, I would consider these factors carefully before making a general statement that forest cover is not an important feature for automated ATES classification.
- Section 4.2.2 - This adaptation of using isolated polygons to validate autoATES is a novel and meaningful addition to the field. However, this is also a very different approach from traditional ATES mapping using linear or zonal features, which may pose some challenges with regard to defining boundaries between classes and incorporating the ATES elements of exposure and route options into the terrain ratings.
- Line 590 to 600 - This is a huge advantage of the RF approach to ATES classification: it takes advantage of the efficiency of the automated approach while allowing manual validation of specific pieces of terrain that are identified as low confidence. This is a great example of merging automated and manual mapping to create the best possible output at the lowest possible cost.
- Section 4.3 - One additional limitation could be that the RF model will probably be limited to working with input data that is very similar to what it is trained on. Using a lower resolution DEM or a forest cover dataset that captures a different forest characteristic (e.g. basal area, stem density) would likely not work with the RF model developed in this research. Therefore, the RF autoATES model presented here is likely limited to application in regions with similar topography, forest characteristics, and input data availability.
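The low-confidence review workflow praised in the Line 590 to 600 comment can be sketched in a few lines. The sketch assumes that each pixel comes with one probability per ATES class (for example the share of trees voting for that class) and that confidence is simply the highest of those probabilities. The function name `flag_low_confidence` and the 0.6 threshold are hypothetical choices for illustration, not the authors' definition.

```python
def flag_low_confidence(probabilities, min_confidence=0.6):
    """Return indices of pixels whose top class probability is below the threshold.

    probabilities: list of per-pixel lists, one probability per ATES class.
    min_confidence: assumed review threshold (illustrative value).
    """
    flagged = []
    for index, class_probs in enumerate(probabilities):
        if max(class_probs) < min_confidence:
            flagged.append(index)
    return flagged


# Three example pixels with probabilities for ATES classes 1 to 4.
example_probabilities = [
    [0.90, 0.05, 0.03, 0.02],  # confident: class 1
    [0.40, 0.35, 0.15, 0.10],  # ambiguous between classes 1 and 2
    [0.10, 0.20, 0.65, 0.05],  # fairly confident: class 3
]
needs_review = flag_low_confidence(example_probabilities)
print(needs_review)  # [1]
```

Only the flagged pixels would then be passed to an expert for manual checking, which keeps the manual effort proportional to the amount of ambiguous terrain.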
Conclusion
- Line 647 to 651. See comments from discussion section about limitation of feature importance rankings. I think the lack of importance is more reflective of the training/testing data used in this study and quality of the input forest data and not a sign that forest cover is not an important parameter for autoATES mapping.
Citation: https://doi.org/10.5194/egusphere-2025-2143-RC1
AC1: 'Reply on RC1', Kalin Markov, 06 Aug 2025
RC2: 'Comment on egusphere-2025-2143', Cameron Campbell, 18 Jul 2025
General Comments
This preprint manuscript summarizes the development and validation of an automated Avalanche Terrain Exposure Scale (ATES) classification algorithm using random forest machine learning models. The study area encompassed popular backcountry ski-touring destinations in the Pirin Mountains of Bulgaria, with limited information available to the public regarding current snowpack stability and avalanche danger or the spatial distribution of avalanche-prone terrain. The random forest machine learning approach was investigated as a potential data-driven method to improve classification performance over previous automated ATES (AutoATES) mapping for the area that relied on expert-driven classification trees.
Three different iterations of the machine learning model were developed using a different selection of input features informed by a training dataset consisting of isolated manually classified ATES polygons. A selection of established statistical methods was then used to assess the agreement between the resulting AutoATES classifications and an independent test dataset. The results were used to evaluate the utility of random forest machine learning for AutoATES classification and optimize model input features for the study area.
Terrain classification for the study uses ATES v.2 with four avalanche terrain classes ranging from Class 1 – Simple to Class 4 – Extreme; however, the optional Class 0 – Non-avalanche Terrain class is not used. The manuscript would benefit from a discussion on the importance of identifying non-avalanche terrain for the study area and potential end-users, and the decision to exclude it from the study. The manuscript could also benefit from more analysis and discussion focused on the areas of disagreement between the AutoATES classification and the test dataset, especially areas where the disagreement is more than one ATES class, areas near the boundaries between ATES class zones, or critical terrain features for backcountry recreational route-planning and safe navigation (e.g., ridge crests, valley bottoms, high mountain passes, and terrain traps).
Overall, the manuscript is clear, concise, and well-structured, and addresses the research questions well. The scientific and technical approaches and the applied methods are valid, and the results are discussed in an appropriate and balanced way. The work represents a substantial contribution to the understanding and communication of avalanche terrain severity in Bulgaria, and the development and validation of AutoATES algorithms worldwide.
Specific Comments
Line 170-173: Small, precisely delineated polygons of manually assessed terrain were used for the training and test datasets in lieu of conventional continuous ATES zone mapping in order to reduce generalization errors. From Figure 2, it appears as though the locations of these polygons are somewhat random, the dimensions of these polygons range from less than 100 m to almost 1000 m, and some seem more precisely delineated than others (i.e., rounded or squared-off boundaries versus complex precise shapes). However, there are no details on the approach used in identifying or delineating these polygons. It is also unclear whether the boundaries of these polygons represent the exact transition between ATES classes, or if the transition is considered to be somewhere outside of these polygons. The manuscript could benefit from more details in this regard.
Line 260: The manuscript could benefit from discussion regarding the decision to omit the ATES v.2 non-avalanche terrain classification from the analysis. This ATES class can default to simple terrain and is generally considered optional as it often requires high confidence in the assessment (and associated level of effort). However, it can provide valuable information to end-users with little or no tolerance for avalanche risk.
Line 400: The manuscript could benefit from further analysis and discussion regarding the classification errors that involved misclassification by more than a single class level. Is this largely attributed to errors in input data or are there specific terrain features that the model grossly misclassifies?
Line 427-429: The manuscript could benefit from further analysis and discussion in regard to the finding that predictive confidence tends to decrease near the boundaries between classes. Can this be related to the approach used in delineating the boundaries of the manually derived polygons used for the training and test datasets? I.e., see comment above, it is unclear whether the boundaries of these polygons represent the exact transition between ATES classes, or if the transition is considered to be somewhere outside of these polygons.
Line 470-472: It could also be noted that the analysis performed by Sykes et al. (2024) included validation against manually derived benchmark maps with areas classified as non-avalanche terrain that were subsequently reclassified as simple for the analysis. This increased the agreement rate between the AutoATES algorithm and the test dataset for the simple terrain class.
Line 487-489: The manuscript could benefit from further discussion regarding the influence of potentially including non-avalanche terrain classified as simple terrain in the test and training datasets on the model accuracy results. I.e., it is expected that the model would be able to easily classify non-avalanche terrain as simple terrain.
Line 595-598: The manuscript could benefit from further discussion regarding the importance of manual quality control and fine-tuning of automatically produced ATES maps prior to final map publication. It is the disagreement found between the AutoATES output and the test dataset that necessitates this.
In preparing these comments I have also reviewed and agree with previous comments posted by John Sykes and have attempted to avoid repetition. However, I would like to reiterate the comments pertaining to the importance of forest cover on ATES classification highlighted in other studies.
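The Line 400 comment on errors of more than one class level could be explored with a simple tally of how far each prediction lies from the reference class. The sketch below assumes ATES classes are encoded as integers 1 to 4; the function names and the example labels are made up for illustration and are not results from the manuscript.

```python
from collections import Counter


def error_distance_counts(y_true, y_pred):
    """Count predictions by absolute class distance (0 means correct)."""
    return Counter(abs(pred - true) for true, pred in zip(y_true, y_pred))


def gross_error_share(y_true, y_pred):
    """Share of samples misclassified by more than one ATES class."""
    counts = error_distance_counts(y_true, y_pred)
    gross = sum(n for distance, n in counts.items() if distance > 1)
    return gross / len(y_true)


# Made-up reference and predicted classes for six test pixels.
reference = [1, 2, 3, 4, 2, 1]
predicted = [1, 3, 3, 2, 2, 4]

distances = error_distance_counts(reference, predicted)
share = gross_error_share(reference, predicted)
```

Mapping the pixels that fall into the distance 2 and 3 bins back onto the terrain would show whether gross errors cluster on particular features such as ridge crests or valley bottoms.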
Technical Corrections
There is inconsistent use of the acronyms “ML” and “RF” throughout the report. I suggest defining acronyms on first use and then using them consistently throughout.
There is inconsistent and inappropriate use of capitalization throughout the References section (e.g., lines 691, 699-700, 712-713, 717, 730-731, 756-758, 783-784, 804) and some authors' names are missing (e.g., lines 732, 740, 829).
Citation: https://doi.org/10.5194/egusphere-2025-2143-RC2
AC2: 'Reply on RC2', Kalin Markov, 06 Aug 2025
Model code and software
machine-learning-auto-ates Kalin Markov https://doi.org/10.5281/zenodo.15310357