Disentangling the effect of geomorphological features and tall shrubs on snow depth variation in a sub-Arctic watershed using UAV derived products

Shirley, Ian; Uhlemann, Sebastian; Peterson, John; Bennett, Katrina; Hubbard, Susan S.; Dafflon, Baptiste

doi:https://doi.org/10.5194/egusphere-2023-968

Preprints

22 May 2023

Abstract. Spatial variation in snow depth is a main driver of heterogeneity in discontinuous permafrost landscapes, exerting a strong control on thermal and hydrological processes, vegetation dynamics, and carbon cycling. Topography and vegetation are understood to play an important role in driving variation in snow depth, but complex morphology often impedes efforts to disentangle these drivers. Maps of ground, vegetation and snow surface elevation were collected using an Unmanned Aerial Vehicle (UAV) over multiple years across a watershed on the Seward Peninsula in Alaska. Here, we quantify drivers of snow depth variation using the inferred maps of snow depth during peak snow accumulation in 2019 and 2022 and collocated ground surface elevation and vegetation height. A novel approach to extract microtopographic information from complex landscape morphologies is used to classify different features (e.g. drainage paths, risers and terraces, thermokarst patterned ground) and characterize their relationships with snow depth variation. A simple model developed using topographic information alone is shown to correlate strongly with local snow depth variation where vegetation height is low. We build a machine learning model to quantify snow trapping by shrub canopies in the watershed and show that snow trapping can be characterized by an exponential function of canopy height above snow (RMSE = 0.12 m, R² = 0.5). Finally, we demonstrate that relationships between microtopography, vegetation height, and snow depth hold in years of deep and shallow snowpack. These results can be applied to improve representation of heterogeneity and vegetation-snow feedbacks in Earth System Models and to increase the spatial resolution of pan-arctic estimates of snow depth.

Received: 11 May 2023 – Discussion started: 22 May 2023

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: closed

RC1:
'Comment on egusphere-2023-968', Anonymous Referee #1, 29 Jul 2023

Review of the manuscript “Disentangling the effect of geomorphological features and tall shrubs on snow depth variation in a sub-Arctic watershed using UAV derived products”.
The manuscript describes an interesting approach to understand and differentiate the control that topography and shrubs have on snow distribution. The methods presented are novel and exploit state-of-the-art observation tools such as UAVs. The findings of Shirley et al. are highly interesting, and I consider that this work will contribute to improving the understanding of snow dynamics in sub-arctic areas.
However, I consider that the manuscript needs some work before publication, and thus I recommend major revision (although it falls between minor and major). I encourage the authors to complete this revision, as I think the work and the approach applied are valuable for the community.
Major points:
1. Some sections need a clearer description of the methods to allow an easier understanding. In the list of minor points I have included some recommendations. For example, more details are needed when describing the stacked directional filtering, as it is a novel method in the community. Likewise, when figures are referred to in Section 2.3, a more detailed description of what is observed is required to help their interpretation.
2. Description of the study area in Section 2.1: This section needs a more appropriate study area map. Figure 1 gives some information, but it is not easy to obtain detailed information about the study area. The elevation map (left panel in Figure 1) helps, but discretized elevation bands and elevation curves (in addition to the vegetation contour lines) would make interpretation easier for potential readers. Moreover, a location map showing Alaska and placing the watershed is required. I also encourage the authors to include photographs of this site with and without snow. Similarly, the vegetation and snow depth maps would benefit from discretized color ramps, also including elevation curves and vegetation contours (with thicker lines).
3. I encourage the authors to apply the TM_SV analysis at shorter distances (and not only starting at 11 m). This might allow identification of the maximum distance over which shrubs are able to disturb the snow distribution (see the minor comment on lines 228-229).
Minor comments:
Line 24: Indicate that both the RMSE and the R² are the error metrics of the machine learning model.
Line 36: I encourage the authors to include more references (not only a self-citation) about the control that snow has as a driver of landscapes.
Line 40: What is the microtopographic spatial scale: 1 m, 10 m? Please include some insight/references about this.
Line 54-55: I think the reference to Sturm et al., 2005 could be replaced here by a more recent one, such as: https://doi.org/10.1175/JHM-D-22-0067.1
Line 100: Is the ground resolution the ground sampling distance of the UAV camera or the final resolution of the grid cells? Please, clarify.
Line 101: Approximately 50 targets as GCP? Did you change the number depending on the acquisition date? Please clarify this point and include a map showing the location of GCPs.
Line 108: Which method did you use to interpolate data?
Line 112-115: These lines are describing an interesting validation along several transects. I would provide further details here and potentially include a new section for this validation, as you are comparing the elevations with and without snow and not a direct snow depth validation. In view of Figure 2 caption I think some information is missing in this figure. For instance in the caption it is stated: “Orthomosaic of the transects and surrounding area is shown on the left. Contour lines are drawn at a 3 m distance from vegetation of 1 m height or taller. A snow depth map of the transects and surrounding area is shown in the middle. Contour lines are drawn at terrain elevation isolines with a spacing of 1m.”, where is the orthomosaic? And the snow depth map of the transects?
Line 120: Please give more details on what potential readers can see/understand in “Figs 1,4”. You are giving too much information without a detailed explanation and without introducing some important concepts, such as the TM_SV plotted in some of these figures or the different classes you are focusing on (Stream, Terracets, Patterned ground).
Line 132: Which are the finest and the coarsest spatial scales? Why have you chosen these scales?
Line 139: Why don’t you start with a 1 m spatial scale (your dataset has a 0.1 m spatial resolution)?
Line 145: Why 21 m? Please, give some details about this choice, otherwise it seems too arbitrary.
Line 151: In figure S2 you are talking about TM_SV, without a previous description of this variable and this is confusing. Please, here and elsewhere in the manuscript try to re-order figures/concepts to have a step by step explanation of the concepts described in your article.
Line 155: Why the snow depth map of 2019 and not that of 2022? Please justify this choice.
Line 167: OK, I understand that a 50x50 m window is chosen for computational efficiency, but this is not a physical choice. I suggest reviewing the literature and also justifying this search distance in view of previous works.
Line 185: “measured shrub height and potential”, is there something missing after potential? I am not able to understand this sentence. Please rephrase if possible.
Figure 3, vertical dashed lines in the different panels (indicating distances) would help to interpret the distances.
Line 221 to 226: Panels d, e, f do not appear in Figure 4. Instead, panels I, II, III, IV and V are shown. Please correct accordingly.
Line 228-229: I like this last sentence of this section as it summarizes an interesting outcome of this work. However, despite the interesting analysis described in Section 3.2, which demonstrates the importance of shrubs in trapping snow, I think it is necessary to understand the distance to which these shrubs can control snow depth variations. That is why I would suggest applying the Section 3.1 analyses to shorter distances. Please consider an analysis to determine this distance.
Line 243: The message of this sentence would benefit from elevation curves being included in all snow maps. I encourage including elevation contour lines in all maps.
Line 250: is this figure 6 or is it figure 5??
Line 305: I would say “we demonstrate….on the order of 100 m IN THIS STUDY AREA”.
Line 319-325: I see these sentences as more appropriate for the discussion. I encourage moving them there, into a new section about the consequences of the findings of this research.
Line 326-333: I think these lines also fit better in the discussion section. What about a final subsection in the discussion named: “Impacts and future work” (or similar).

Citation: https://doi.org/10.5194/egusphere-2023-968-RC1
- AC1:
  'Reply on RC1', Ian Shirley, 30 Sep 2023
  The manuscript describes an interesting approach to understand and differentiate the control that topography and shrubs have on snow distribution. The methods presented are novel and exploit state-of-the-art observation tools such as UAVs. The findings of Shirley et al. are highly interesting, and I consider that this work will contribute to improving the understanding of snow dynamics in sub-arctic areas.
  However, I consider that the manuscript needs some work before publication, and thus I recommend major revision (although it falls between minor and major). I encourage the authors to complete this revision, as I think the work and the approach applied are valuable for the community.
  Thank you for your comments and your positive review. We have responded to your comments and suggestions below, and through changes to the manuscript.
  
  Major points:
  Some sections need a clearer description of the methods to allow an easier understanding. In the list of minor points I have included some recommendations. For example, more details are needed when describing the stacked directional filtering, as it is a novel method in the community. Likewise, when figures are referred to in Section 2.3, a more detailed description of what is observed is required to help their interpretation.
  
  We have improved the description of the stacked directional filtering in the revised manuscript. In addition to updating the text describing the filtering approach in the Methods section, we have included an additional figure that outlines the approach in schematic form.
  Description of the study area in Section 2.1: This section needs a more appropriate study area map. Figure 1 gives some information, but it is not easy to obtain detailed information about the study area. The elevation map (left panel in Figure 1) helps, but discretized elevation bands and elevation curves (in addition to the vegetation contour lines) would make interpretation easier for potential readers. Moreover, a location map showing Alaska and placing the watershed is required. I also encourage the authors to include photographs of this site with and without snow. Similarly, the vegetation and snow depth maps would benefit from discretized color ramps, also including elevation curves and vegetation contours (with thicker lines).
  
  Thank you for these suggestions. We have included the area map and select images within the watershed in Figure 1. We have also changed the figures throughout the manuscript and supplementary information to include elevation curves and discretized color ramps.
  I encourage the authors to apply the TM_SV analysis at shorter distances (and not only starting at 11 m). This might allow identification of the maximum distance over which shrubs are able to disturb the snow distribution (see the minor comment on lines 228-229).
  
  As suggested by the reviewer, we have performed the stacked directional filtering and TM_SV analysis for shorter distances (3, 5, 7 and 9 m). We note that we have downsampled the UAV products to 1 m resolution to reduce the memory requirements needed for analysis, so filtering at shorter distances is not possible. We are not able to determine the maximum distance for which shrubs are able to disturb the snow distribution using only the TM_SV, however, as the shrub canopy structure and its interaction with topography is quite complex. We developed the machine learning models in order to deal with this complexity.
  Minor comments:
  Line 24: Indicate that both the RMSE and the R² are the error metrics of the machine learning model.
  These metrics describe the fit between the machine learning model of shrub canopy snow trapping and the simple exponential fit that we perform in Figure 8 to approximate the complex output of the machine learning model. We hope that this is clearer in the revised version of the abstract.
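To make the two metrics concrete, the following is a minimal Python sketch of how RMSE and R² could be computed between a reference series (standing in for the machine learning output) and an approximation (standing in for the exponential fit). The function names and the four sample values are illustrative assumptions, not data or code from the manuscript.

```python
import math


def rmse(reference, approximation):
    """Root-mean-square error between two equally long sequences."""
    squared = [(r - a) ** 2 for r, a in zip(reference, approximation)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, approximation):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - a) ** 2 for r, a in zip(reference, approximation))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Invented values in metres, for illustration only (not from the study).
ml_output = [0.0, 0.1, 0.2, 0.3]  # hypothetical machine learning prediction
exp_fit = [0.0, 0.1, 0.2, 0.1]    # hypothetical exponential approximation

print(rmse(ml_output, exp_fit), r_squared(ml_output, exp_fit))
```

With these toy values the RMSE is about 0.1 m and R² is about 0.2; the point is only that both numbers compare the approximation against the series it is meant to reproduce.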
  Line 36: I encourage manuscript authors to include more references (not only a self-citation) about the control that snow has as driver of landscapes.
  Thank you for this suggestion. We have added the following references to support this well-researched topic:
  Avanzi, Francesco, Alberto Bianchi, Alberto Cina, Carlo De Michele, Paolo Maschio, Diana Pagliari, Daniele Passoni, Livio Pinto, Marco Piras, and Lorenzo Rossi. 2018. “Centimetric Accuracy in Snow Depth Using Unmanned Aerial System Photogrammetry and a MultiStation.” Remote Sensing 10 (5): 765.
  Chen, Yunxiang, Roman A. DiBiase, Nicholas McCarroll, and Xiaofeng Liu. 2019. “Quantifying Flow Resistance in Mountain Streams Using Computational Fluid Dynamics Modeling over Structure-from-Motion Photogrammetry-Derived Microtopography.” Earth Surface Processes and Landforms 44 (10): 1973–87.
  Goetz, Jason, and Alexander Brenning. 2019. “Quantifying Uncertainties in Snow Depth Mapping From Structure From Motion Photogrammetry in an Alpine Area.” Water Resources Research 55 (9). https://doi.org/10.1029/2019WR025251.
  Harder, Phillip, John W. Pomeroy, and Warren D. Helgason. 2020. “Improving Sub-Canopy Snow Depth Mapping with Unmanned Aerial Vehicles: Lidar versus Structure-from-Motion Techniques.” The Cryosphere 14 (6): 1919–35.
  Harder, Phillip, Michael Schirmer, John Pomeroy, and Warren Helgason. 2016. “Accuracy of Snow Depth Estimation in Mountain and Prairie Environments by an Unmanned Aerial Vehicle.” The Cryosphere 10 (6): 2559–71.
  Le, Phong V. V., and Praveen Kumar. 2017. “Interaction Between Ecohydrologic Dynamics and Microtopographic Variability Under Climate Change.” Water Resources Research 53 (10): 8383–8403.
  Revuelto, Jesus, Esteban Alonso-Gonzalez, Ixeia Vidaller-Gayan, Emilien Lacroix, Eñaut Izagirre, Guillermo Rodríguez-López, and Juan Ignacio López-Moreno. 2021. “Intercomparison of UAV Platforms for Mapping Snow Depth Distribution in Complex Alpine Terrain.” Cold Regions Science and Technology 190 (October): 103344.
  Tabler, Ronald D. 1980. “Geometry and Density of Drifts Formed by Snow Fences.” Journal of Glaciology 26 (94): 405–19.
  Wainwright, Haruko M., Anna K. Liljedahl, Baptiste Dafflon, Craig Ulrich, John E. Peterson, Alessio Gusmeroli, and Susan S. Hubbard. 2017. “Mapping Snow Depth within a Tundra Ecosystem Using Multiscale Observations and Bayesian Methods.” The Cryosphere. https://doi.org/10.5194/tc-11-857-2017.
  
  Line 40: What is the microtopographic spatial scale: 1 m, 10 m? Please include some insight/references about this.
  Thank you for pointing out this ambiguity. The term “microtopography” still lacks a strict definition, but we’ve added the following sentences to clarify our use of the term.
  “Microtopography can be defined as the difference between topographic maps and smoothed or low-pass filtered versions of these maps (Wainwright et al. 2017). The characteristic scale of microtopographic features varies depending on the environment and processes of interest, but length scales of microtopographic features relevant for landscape surface hydrology are typically on the order of 5 - 50 m (Chen et al. 2019; Le and Kumar 2017; Wainwright et al. 2017).”
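The definition quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a simple box (moving-average) filter stands in for the low-pass filter, the grid is a toy 5 x 5 elevation map with 1 m cells, and the function names are hypothetical. The stacked directional filtering used in the manuscript is more elaborate and is not reproduced here.

```python
def box_smooth(grid, half_width):
    """Moving-average low-pass filter; windows are clipped at the grid edges."""
    n_rows, n_cols = len(grid), len(grid[0])
    smoothed = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            r0, r1 = max(0, r - half_width), min(n_rows, r + half_width + 1)
            c0, c1 = max(0, c - half_width), min(n_cols, c + half_width + 1)
            window = [grid[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            row.append(sum(window) / len(window))
        smoothed.append(row)
    return smoothed


def microtopography(dem, half_width):
    """Elevation minus its low-pass filtered version, cell by cell."""
    smooth = box_smooth(dem, half_width)
    return [[z - s for z, s in zip(z_row, s_row)]
            for z_row, s_row in zip(dem, smooth)]


# Toy elevation map (assumed 1 m cells): flat at 10 m with a 0.9 m mound.
dem = [[10.0] * 5 for _ in range(5)]
dem[2][2] = 10.9

micro = microtopography(dem, half_width=1)  # 3 m x 3 m smoothing window
print(micro[2][2])  # positive: the mound stands above its smoothed surroundings
```

The choice of `half_width` sets the length scale that counts as "micro": a wider window assigns larger features to the microtopographic residual.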
  
  Line 54-55: I think the reference to Sturm et al., 2005 could be replaced here by a more recent one, such as: https://doi.org/10.1175/JHM-D-22-0067.1
  Thank you for this suggestion, we have updated this reference accordingly.
  Line 100: Is the ground resolution the ground sampling distance of the UAV camera or the final resolution of the grid cells? Please, clarify.
  The ground sampling distance of the UAV camera was 3-5 cm, and the final resolution of the grid cells was 10 cm. Please note that we then downsampled the UAV products to a resolution of 1m for the purposes of this analysis - we have clarified this in the updated manuscript.
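The reply does not state how the 10 cm products were aggregated to 1 m. One common option, assumed here purely for illustration, is block averaging, in which each coarse cell takes the mean of the fine cells it covers. The function name and the toy grid below are hypothetical.

```python
def block_mean(grid, factor):
    """Downsample a 2D grid by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for block_r in range(n_rows):
        row = []
        for block_c in range(n_cols):
            cells = [grid[block_r * factor + i][block_c * factor + j]
                     for i in range(factor) for j in range(factor)]
            row.append(sum(cells) / len(cells))
        coarse.append(row)
    return coarse


# Toy 20 x 20 grid at an assumed 0.1 m resolution: the top half is 0.0 and
# the bottom half is 1.0, so a factor of 10 yields a 2 x 2 grid at 1 m.
fine = [[float(r // 10) for _ in range(20)] for r in range(20)]
coarse = block_mean(fine, factor=10)
print(coarse)
```

Averaging trades spatial detail for a 100-fold reduction in cell count, which is consistent with the memory motivation given in the reply, although other resampling methods would serve the same purpose.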
  Line 101: Approximately 50 targets as GCP? Did you change the number depending on the acquisition date? Please clarify this point and include a map showing the location of GCPs.
  Yes, for the winter flights we used 47 targets and for the snow-free flights we used about 65 targets. This information is included in the updated version of the manuscript. We will provide the processing reports from the AgiSoft reconstruction which include maps of the locations of the GCPs, along with the UAV products in a data archive that will be made publicly available at https://ess-dive.lbl.gov/ upon publication of the manuscript.
  Line 108: Which method did you use to interpolate data?
  The data was linearly interpolated. We have clarified this in the revised manuscript.
  Line 112-115: These lines are describing an interesting validation along several transects. I would provide further details here and potentially include a new section for this validation, as you are comparing the elevations with and without snow and not a direct snow depth validation. In view of Figure 2 caption I think some information is missing in this figure. For instance in the caption it is stated: “Orthomosaic of the transects and surrounding area is shown on the left. Contour lines are drawn at a 3 m distance from vegetation of 1 m height or taller. A snow depth map of the transects and surrounding area is shown in the middle. Contour lines are drawn at terrain elevation isolines with a spacing of 1m.”, where is the orthomosaic? And the snow depth map of the transects?
  We appreciate this suggestion, and have added a new section in the methods, as well as an additional figure in the Supplementary Material, that describes the comparison between ground measurements and the UAV products in more detail. We apologize for the confusion with the Figure 2 caption. The legend refers to an earlier version of the figure, and has been corrected in the revised manuscript.
  Line 120: Please give more details on what potential readers can see/understand in “Figs 1,4”. You are giving too much information without a detailed explanation and without introducing some important concepts, such as the TM_SV plotted in some of these figures or the different classes you are focusing on (Stream, Terracets, Patterned ground).
  Thank you for highlighting, here and below, the confusion caused by the order in which concepts are introduced in the manuscript. The methods section is reorganized in the updated manuscript so that the concepts are presented in order. For example, we have added a section that describes the topographic features present in the watershed (and that does not reference the TM_SV) before this sentence, which appears at the beginning of the section on stacked directional filtering.
  Line 132: Which are the finest and the coarsest spatial scales? Why have you chosen these scales?
  The finest and the coarsest scales used for stacked directional filtering would vary depending on the application. For our application, we have used scales that range from 3 m to 230 m, as we are analyzing images with 1 m resolution that are approximately 400 m in width.
  Line 139: Why don’t you start with a 1 m spatial scale (your dataset has a 0.1 m spatial resolution)?
  Thank you for this suggestion. We have updated our approach to start with 3 m as the smallest spatial scale, and have updated the TM_SV and machine learning models accordingly.
  Line 145: Why 21 m? Please, give some details about this choice, otherwise it seems too arbitrary.
  We have removed reference to the 21 m threshold (and other arbitrary thresholds) when describing the classification of regions characterized by each topographic feature. Instead, we refer the reader to our knowledge of the site as the method used for classification.
  Line 151: In figure S2 you are talking about TM_SV, without a previous description of this variable and this is confusing. Please, here and elsewhere in the manuscript try to re-order figures/concepts to have a step by step explanation of the concepts described in your article.
  The TM_SV serves mostly as a background image for the classified regions shown in Figure S2. In the updated version of the manuscript, we have used maps of snow depth instead for this figure.
  Line 155: Why the snow depth map of 2019 and not that of 2022? Please justify this choice.
  Thank you for allowing us to justify this choice. We’ve added the following sentence at the beginning of the section to clarify why we used the 2019 maps to build the simple model:
  “We created a simple model of snow depth variation (TM_SV) in the watershed based on the 2019 snow map instead of the 2022 snow map because the topographic variation was more consistently smoothed by the deeper snowpack in 2019, with fewer patches of <10 cm snow depths.”
  
  Line 167: OK, I understand that a 50x50 m window is chosen for computational efficiency, but this is not a physical choice. I suggest reviewing the literature and also justifying this search distance in view of previous works.
  Thank you for this suggestion. We have justified the choice of a 50x50 m window using a previous study of snow drift geometry that demonstrates that the impact of a 1.5 m snow fence is attenuated to zero within a 50 m distance. We’ve updated the text accordingly:
  “For computational efficiency, and because previous studies have shown a 1.5 m snow fence creates a snow drift that is less than 50 m in length (Tabler 1980), only shrubs that lie within a 50 m by 50 m window of each point contribute to 1 and 2.”
  
  Line 185: “measured shrub height and potential”, is there something missing after potential? I am not able to understand this sentence. Please rephrase if possible.
  We apologize for the confusion. “Measured shrub height and potential” is terminology from a previous draft of the paper. We’ve replaced this phrase with “shrub canopy snow trapping fields” in the updated manuscript.
  Figure 3, vertical dashed lines in the different panels (indicating distances) would help to interpret the distances.
  Thank you for this suggestion. We have updated the figure accordingly.
  Line 221 to 226: Panels d, e, f do not appear in Figure 4. Instead, panels I, II, III, IV and V are shown. Please correct accordingly.
  Thank you for pointing out this mistake. We have updated the manuscript accordingly.
  Line 228-229: I like this last sentence of this section as it summarizes an interesting outcome of this work. However, despite the interesting analysis described in Section 3.2, which demonstrates the importance of shrubs in trapping snow, I think it is necessary to understand the distance to which these shrubs can control snow depth variations. That is why I would suggest applying the Section 3.1 analyses to shorter distances. Please consider an analysis to determine this distance.
  We agree that it is important to understand the distance to which shrubs can impact snow depth. However, the shrub canopy structure is complex and there are larger-scale interactions between topography and snow depth than those incorporated in the TM_SV. To deal with this complexity, we developed machine learning models to attempt to quantify the snow trapping by shrub canopies. Even with the machine learning predictions, however, it is challenging to characterize the relationship between distance from canopy and snow trapping, because of difficulties selecting the edge of the canopy vs gaps in the canopy, and varying interactions of shrub patches across the watershed with other shrub patches and with the terrain. Our discussion of snow trapping by shrub patches B and C in the last paragraph of Section 3.2 highlights this complexity, and we have further emphasized its importance in the text of the revised manuscript and with the inclusion of an additional conceptual figure.
  Line 243: The message of this sentence would benefit from elevation curves being included in all snow maps. I encourage including elevation contour lines in all maps.
  Thank you for this suggestion. We have included contour elevation lines in all maps in the revised manuscript.
  Line 250: is this figure 6 or is it figure 5??
  Yes, this should be figure 5 not figure 6. We have updated the manuscript to correct this mistake.
  Line 305: I would say “we demonstrate….on the order of 100 m IN THIS STUDY AREA”.
  We agree that it is important to emphasize that identified relationships between topography, vegetation, and snow depth are specific to the studied watershed and have updated this sentence as you suggest. We have also included some discussion that compares our results to similar previous findings in other environments.
  Line 319-325: I see these sentences as more appropriate for the discussion. I encourage moving them there, into a new section about the consequences of the findings of this research.
  Line 326-333: I think these lines also fit better in the discussion section. What about a final subsection in the discussion named: “Impacts and future work” (or similar).
  
  We appreciate this suggestion, and have moved these two paragraphs to a new section in the discussion in the revised manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC1
CC1:
'Comment on egusphere-2023-968', Florent Dominé, 31 Jul 2023

I have read this paper with interest as indeed understanding the combined effects of topography and erect vegetation on snow depth is critical to many applications. I draw the Author’s attention to a related paper by our group (Lamare et al., 2023) that was published a few weeks before the Authors submitted their own work.
I am amazed that the Authors achieved this work using photogrammetry. We tried that method at our study site but were not able to detect the ground surface below shrubs with sufficient reliability, which is why we had to resort to using a lidar. Our attempts were in the fall for logistical reasons, and perhaps spring before bud burst is more favorable. In any case, I felt that few details are given by the Authors to explain how they overcame the difficulties. Did the Authors perform any field validation of vegetation height? I feel this is important to validate the use of photogrammetry to obtain a DTM in the presence of shrubs, as several previous publications (some of them cited by the Authors) report difficulties (De Michele et al., 2016; Fernandes et al., 2018; Harder, Pomeroy, & Helgason, 2020). If no field validation was performed, I suggest mentioning this explicitly. The same suggestion applies to vegetation protruding above the snowpack in April.
The models developed by the Authors appear quite interesting. Should they be interested, they may wish to test them with our data, to possibly demonstrate their more general validity. Spring and fall DTM and DSM are available at (Lamare et al., 2022). Additional data, including meteorological data, are also available as indicated in the data statement of (Lamare et al., 2023).
Lastly, I personally feel that a general situation map would have been useful to understand the general context of the study. Likewise, more details on the vegetation are in my opinion highly desirable. This is useful for many reasons, among others for readers interested in evaluating shrub bending, as some species have branches much more supple than others. “Salix spp” is rather vague, and “dwarf shrubs” even more so. Photographs of the site would be useful to many readers as well. Maps and photographs could probably be added to the supplementary material with moderate effort.
References
De Michele, C., Avanzi, F., Passoni, D., Barzaghi, R., Pinto, L., Dosso, P., . . . Della Vedova, G. (2016). Using a fixed-wing UAS to map snow depth distribution: an evaluation at peak accumulation. The Cryosphere, 10(2), 511-522. doi:10.5194/tc-10-511-2016
Fernandes, R., Prevost, C., Canisius, F., Leblanc, S. G., Maloley, M., Oakes, S., . . . Knudby, A. (2018). Monitoring snow depth change across a range of landscapes with ephemeral snowpacks using structure from motion applied to lightweight unmanned aerial vehicle videos. The Cryosphere, 12(11), 3535-3550. doi:10.5194/tc-12-3535-2018
Harder, P., Pomeroy, J. W., & Helgason, W. D. (2020). Improving sub-canopy snow depth mapping with unmanned aerial vehicles: lidar versus structure-from-motion techniques. The Cryosphere, 14(6), 1919-1935. doi:10.5194/tc-14-1919-2020
Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2022). UAV-borne lidar campaign over Umiujaq, Hudson Bay, Canada in 2017 and 2018. Retrieved from: https://doi.org/10.1594/PANGAEA.943854
Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the Role of Shrub Height and Topography in Snow Accumulation on Low-Arctic Tundra using UAV-Borne Lidar. Journal of Hydrometeorology, 24(5), 835-853. doi:10.1175/JHM-D-22-0067.1

Citation: https://doi.org/10.5194/egusphere-2023-968-CC1
- AC3: 'Reply on CC1', Ian Shirley, 30 Sep 2023
  
  Thank you for your comments and interest in our work! We have read your recent paper with great interest and were happy to see similar conclusions regarding the interactions of snow, topography, and shrubs.
  We have included some additional description of the comparison between UAV products and ground measurements (including vegetation height) in the updated version of the manuscript that may be of interest to you. In short, we find a good match between the UAV products and ground measurements of ground surface elevation, snow surface elevation, and snow depth, even in patches of tall shrubs. We timed our spring flight in a very short window between snowmelt and bud burst, but the success of this approach may depend on the environment and the species present in the watershed. We appreciate your suggestion to test the approach presented here on your data, and will include it in future efforts to apply these techniques in different environments and at larger scales. Per your suggestion, we have included a map that shows the location of the watershed in Alaska in the updated manuscript, as well as some snow-on and snow-off photos taken from pheno-cams we have in the watershed. The willows in the watershed are primarily S. pulchra and S. glauca; we have included this detail in the text.
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC3
RC2:
'Comment on egusphere-2023-968', Anonymous Referee #2, 01 Aug 2023
Report of Disentangling the effect of geomorphological features and tall shrubs on snow depth variation in a sub-Arctic watershed using UAV derived products, by Shirley et al.
The article by Shirley et al. used UAV-derived terrain, vegetation height and snow depth maps to investigate scale-dependent relationships between these variables. Interactions between topography, terrain and vegetation are important drivers of soil temperatures and thus of permafrost occurrence/active-layer depths in the Arctic. They find that topographic variations on spatial scales <100 m explain much of the spatial variations of snow depth, with a super-imposed, but spatially-variable effect of snow trapping by shrubs. The effect of shrubs can be subdued by topography, as shrubs tend to grow in sheltered depressions, which also trap snow. The study introduces a novel ‘stack directional filtering’ approach which decomposes topographic and snow depth variations into discrete spatial scales in two orthogonal directions. This is an interesting and original approach which has the potential to improve our understanding of the complex associations between vegetation, terrain and snowpack in the Arctic.
The study has a great potential for the community, providing that the underlying data can be proven to be reliable, which at this point I am not convinced of. As such, there are a number of issues to be resolved before this paper could be considered for publication. The study area should be better described, some methodological steps need to be clarified, and the reliability of some of the data processing steps need to be demonstrated. Several decisions made on quantitative thresholds used for the classification of ground points and further analyses are presented without clear justification or backed by external validation, which raises questions on the reliability of the raw datasets and their derived products used for analysis. Those major concerns are addressed below in point form, followed by minor comments.
Major comments
1) Classification of the digital surface model (DSM) into a bare-earth digital terrain model (DTM) and a canopy height model (CHM).
Classification of point clouds in ground/non-ground is a much-researched area, and the authors have browsed very quickly on this important step, without providing enough details or convincing results that their simple scheme to separate ground from vegetation works well. The authors should provide additional information and results to demonstrate that their methodology is reliable, and better place it in the context of previous efforts to classify LIDAR or UAV photogrammetry point clouds or DSMs into ground/non-ground. Classifying ground/non-ground for UAV photogrammetric point clouds or derived DSMs would be challenging, and would require a better proof of concepts before launching into analyzing the results. This is explicated in more details below.
A 0.5 m threshold was used to separate topographic variations in the 10 cm DSM that are due to vegetation from those due to local terrain. How did you choose this value, and how influential is it on the results? Did you validate somehow this simple ground classification scheme against independent observation of vegetation presence/absence, or RTK-GPS beneath shrubs for example?

The DTM was produced by interpolating DSM gridcells classified as ground (as derived from the above procedure), but no details on the interpolation procedure and its performance is given. Given that photogrammetry is expected not to work well for capturing sub-canopy terrain, even in leafless shrubs, one would expect to obtain few points below canopies, hence the DTM would rely heavily on interpolated values from the surrounding open terrain. Given that the point clouds were aggregated into a 10 cm DTM before classification, the sub canopy ground points eventually captured by photogrammetry would be diluted by the higher elevation of surrounding shrub branches.

Given that the terrain can be locally higher beneath shrubs due to the accumulation of litter (e.g. Lamare et al., 2023), or locally lower if shrubs grow in small depressions, I am not convinced that the ground classification used is robust and thus would ask the authors to provide more details and supporting results that show the reliability of their ground classification and DTM production procedure, as this is a challenging and important step in this work. One starting example would be to show maps of ground point density (number of points / m2) in supplementary material, and some kind of k-fold cross-validation test of the interpolation procedure (i.e., how accurate is the prediction of left-out ground points under shrubs when interpolating from the available ground points).
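
The k-fold test suggested above can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes ground points along a single 1D transect, uses plain linear interpolation, and holds out every k-th point in turn. The function names and the synthetic transect are hypothetical.

```python
import bisect
import math

def interpolate(xs, zs, x):
    """Linear interpolation between sorted points, clamped at the ends."""
    j = bisect.bisect_left(xs, x)
    if j == 0:
        return zs[0]
    if j == len(xs):
        return zs[-1]
    x0, x1 = xs[j - 1], xs[j]
    t = (x - x0) / (x1 - x0)
    return zs[j - 1] + t * (zs[j] - zs[j - 1])

def kfold_interpolation_rmse(xs, zs, k):
    """Hold out every k-th point in turn and predict it from the rest."""
    sq_errors = []
    for fold in range(k):
        keep = [i for i in range(len(xs)) if i % k != fold]
        kx, kz = [xs[i] for i in keep], [zs[i] for i in keep]
        for i in range(fold, len(xs), k):
            sq_errors.append((interpolate(kx, kz, xs[i]) - zs[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical ground points along a transect: position (m) and elevation (m).
positions = [float(i) for i in range(20)]
elevations = [0.05 * p for p in positions]  # a planar slope
rmse_planar = kfold_interpolation_rmse(positions, elevations, k=5)
```

On a planar slope the only errors come from the two end points, where the interpolation is clamped. To probe performance under shrubs specifically, the held-out folds would more plausibly be spatially contiguous blocks the size of a shrub patch rather than interleaved points.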

2) UAV-photogrammetry and validation of the derived products.
Some important details are missing to better understand the quality of the derived topographic and snow depth products:
What was the mean ground-sampling distance (GSD) in each flight, compared to the 10 cm gridcell size used for aggregating/interpolating the cloud point to rasters? A table showing the specification of the UAV flights (height, overlap, GSD, etc) in SI would be useful.

The validation of the products appears somewhat simplistic. The transects presented in Fig2 are interesting to qualitatively assess the concordance between RTK-GPS surveys and UAV topographic surfaces, but do not give a quantitative view of the validation. Please provide scatterplots of measured (probed) vs UAV snow depths, and of RTK vs UAV elevations, with corresponding error metrics (bias, rmse, r2, etc). For example, I see quite many discrepancies between RTK surveyed elevations and the UAV reconstructed ones… Based on Figure 2, I would not trust the statement made in L115 that the ‘overall quality of the dataset is high’.

Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the role of shrub height and topography in snow accumulation on low-Arctic tundra using UAV-borne lidar. Journal of Hydrometeorology, 24(5), 853-871.
A RMSE of 16 cm for snow depth is on the high side for the snow-on/off photogrammetric method. I guess what is important here would be to give some relative metrics, e.g. normalized RMSE. Is a 16 cm RMSE a large fraction of the average snow depth in the landscape?
The ‘Simple topographic model of snow depth variation (TM_SV)’ does not appear that simple from the information given. If I understood correctly, you fit a regression between directionally-filtered snow depth and terrain at various spatial scales (11-221 m) and then added these predictions to obtain TM_SV in equation 5? Some more rationale could be given for these methods… as stated presently, the reader is left with several questions about the validity of the approach. You could give some interpretation of the weights in Fig S3, and their R2… why are some of the R2 negative?

3) Machine learning model of snow depth variations
A boosted regression tree model of snow depth variation was produced using 2% of the datasets. As I understand this was due to the computational burden, but 15 000 points do not appear such a big dataset. Anyhow, the question arises on the reliability/stability of the derived model, and no formal quantitative validation is presented. One possibility is to repeat the model training a few times on different random samples to show that the results are robust to sampling, and/or validating the trained model on the left-out data (i.e. 98% of the whole set). A map of the bias within 100x100 m grid cells is presented in SI, but this is not enough to judge model performance. Please provide additional validation metrics (RMSE, r2) for the whole left-out set, and perhaps also map them within 100x100 m gridcells as done for the bias to see spatial patterns. It is unclear at present if the high r2 reported in the paper was calculated on the training sample, in which case it would be prone to overfitting, or on the validation set, which would give a more reliable performance measure.

Minor comments and edits
Shrub canopy snow trapping field: this is an interesting idea to consider the shrub neighbourhood. The influence of the shrub neighbourhood would also depend on prevailing wind directions; did you consider that? Please give some information on wind directions in the site description.
L31: 0 C : missing a degree symbol?
Figure 2 (Validation of digital terrain map and snow depth) says ‘Orthomosaic of the transects and surrounding area is shown on the left. ‘ I don’t see it, is it missing? It also says: ‘A snow depth map of the transects and surrounding area is shown in the middle’. I don’t see it. Seems the caption does not relate to the figure?
L122: a much-used topographic index for snow redistribution is the Winstral wind sheltering index (eg Winstral 2002; Dharmadasa et al., 2023), which you should at least mention. It also relies on the dominant wind direction, you may want to compare/contrast your approach to that.
Winstral, A., Elder, K., & Davis, R. E. (2002). Spatial snow modeling of wind-redistributed snow using terrain-based parameters. Journal of hydrometeorology, 3(5), 524-538.
Dharmadasa, V., Kinnard, C., & Baraër, M. (2023). Topographic and vegetation controls of the spatial distribution of snow depth in agro-forested environments by UAV lidar. The Cryosphere, 17(3), 1225-1246.
L135: equations 1-4: the x and y are not defined in text. Some symbology is strange: an underscore is used on z. should it not be the ‘bar’ symbol for average (as said in the text)?
The description of the procedure should be improved a bit, perhaps by explaining better in wording, first (or after), what the equations produce. You basically run a moving 1D window along two orthogonal directions with different window lengths (11 m to 221 m) and then calculate topographic deviations from these? Then you remove these from the basemap and reapply at the next coarser level? I wonder why digital filtering has not been used to remove the ‘coarse’ features? Perhaps a conceptual figure in main text or SI could help the reader to better understand the new proposed filtering step…
L144: ‘we take the top part of the watershed as representative of thermokarst patterned ground.’ => not evident why for the reader
L145: why using the 21 m scale?
L146-151: there are many choices made here on thresholds which are not explained nor validated anyhow.
L157, ‘by performing a linear fit’…
Fig S1 and S2 lack a scale bar to interpret the colour scale
Why is there a sharp boundary for the patterned ground in Fig S2? This seems arbitrary.
What are the units of the ‘weights’? Is it cm of snow depth per m of terrain deviation? Please explain better
L 176: ‘filtered maps’ : which ones? DTM?
L183-187: did you train a single model on 2% of the data or did you train a few ones? Is the model stable if you retrain on another 2% of the data? Please consider showing some sensitivity results in SI
L187: what is the ‘shrub potential’?
Fig.3: add black stippled line to legend. Also, it is not clear what the first row is. I thought it was the ‘filtered topography’, but the y-label has the SD label to it, so are these snow depth or elevations?
L192: ‘topography varies across all scales while variation in snow depth is attenuated at scales larger than ~100 m (Fig. 3).’ I agree that the std of SD is more concentrated at smaller scales, but I also see this tendency for the topography (first row of fig3?). The exception is for terraces in the cross-slope direction?
L214: ‘Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale’ : this statement would be better placed before the methods, to better explain the rationale of the methods developed.
L217: remove space before the period.
L234: What type of R2 is that? Is this a cross-validated R2? If it is only the squared correlation coefficient between the observations used for calibrating and the model prediction, it is not a robust measure of performance. Please provide a robust assessment, i.e. evaluating your model on observations not used for training (you have 98% of your data available for this). Your Figure S5 is instructive in this way, but please provide validation metrics calculated on the data not used for training (RMSE, Bias and R2). If this is what was done, explain it better.
Table S1: why including the 31 m scale topo map? And is this information not redundant with the filtered topographic map used to predict the TM_SV model?
L236-240: effects of shrubs. I am not fully convinced you can get an independent shrub trapping effect with this method. For this to work, the shrub potential would need to be fully independent from the other predictors. If some collinearity exists between the shrub potential and topography (or the topo-derived TM_SV), then the effect of topography (or TM_SV) could include shrub effects… The reverse is also true.
Figure 5. What are the units of the snow trapping in the two maps? Meters? Or is this the Shrub canopy snow trapping field? Please clarify.
Section 3.3. Perhaps I missed it in Methods, but it is unclear for me if you trained a new ML model for 2022 or if you applied the 2019 model to the 2022 data? Please clarify
Citation: https://doi.org/10.5194/egusphere-2023-968-RC2
- AC2:
  'Reply on RC2', Ian Shirley, 30 Sep 2023
  The article by Shirley et al. used UAV-derived terrain, vegetation height and snow depth maps to investigate scale-dependent relationships between these variables. Interactions between topography, terrain and vegetation are important drivers of soil temperatures and thus of permafrost occurrence/active-layer depths in the Arctic. They find that topographic variations on spatial scales <100 m explain much of the spatial variations of snow depth, with a super-imposed, but spatially-variable effect of snow trapping by shrubs. The effect of shrubs can be subdued by topography, as shrubs tend to grow in sheltered depressions, which also trap snow. The study introduces a novel ‘stack directional filtering’ approach which decomposes topographic and snow depth variations into discrete spatial scales in two orthogonal directions. This is an interesting and original approach which has the potential to improve our understanding of the complex associations between vegetation, terrain and snowpack in the Arctic.
  The study has a great potential for the community, providing that the underlying data can be proven to be reliable, which at this point I am not convinced of. As such, there are a number of issues to be resolved before this paper could be considered for publication. The study area should be better described, some methodological steps need to be clarified, and the reliability of some of the data processing steps need to be demonstrated. Several decisions made on quantitative thresholds used for the classification of ground points and further analyses are presented without clear justification or backed by external validation, which raises questions on the reliability of the raw datasets and their derived products used for analysis. Those major concerns are addressed below in point form, followed by minor comments.
  Major comments
  1) Classification of the digital surface model (DSM) into a bare-earth digital terrain model (DTM) and a canopy height model (CHM).
  Classification of point clouds in ground/non-ground is a much-researched area, and the authors have browsed very quickly on this important step, without providing enough details or convincing results that their simple scheme to separate ground from vegetation works well. The authors should provide additional information and results to demonstrate that their methodology is reliable, and better place it in the context of previous efforts to classify LIDAR or UAV photogrammetry point clouds or DSMs into ground/non-ground. Classifying ground/non-ground for UAV photogrammetric point clouds or derived DSMs would be challenging, and would require a better proof of concepts before launching into analyzing the results. This is explicated in more details below.
  A 0.5 m threshold was used to separate topographic variations in the 10 cm DSM that are due to vegetation from those due to local terrain. How did you choose this value, and how influential is it on the results? Did you validate somehow this simple ground classification scheme against independent observation of vegetation presence/absence, or RTK-GPS beneath shrubs for example?
  
  The DTM was produced by interpolating DSM gridcells classified as ground (as derived from the above procedure), but no details on the interpolation procedure and its performance is given. Given that photogrammetry is expected not to work well for capturing sub-canopy terrain, even in leafless shrubs, one would expect to obtain few points below canopies, hence the DTM would rely heavily on interpolated values from the surrounding open terrain. Given that the point clouds were aggregated into a 10 cm DTM before classification, the sub canopy ground points eventually captured by photogrammetry would be diluted by the higher elevation of surrounding shrub branches.
  
  Given that the terrain can be locally higher beneath shrubs due to the accumulation of litter (e.g. Lamare et al., 2023), or locally lower if shrubs grow in small depressions, I am not convinced that the ground classification used is robust and thus would ask the authors to provide more details and supporting results that show the reliability of their ground classification and DTM production procedure, as this is a challenging and important step in this work. One starting example would be to show maps of ground point density (number of points / m2) in supplementary material, and some kind of k-fold cross-validation test of the interpolation procedure (i.e., how accurate is the prediction of left-out ground points under shrubs when interpolating from the available ground points).
  
  2) UAV-photogrammetry and validation of the derived products.
  Some important details are missing to better understand the quality of the derived topographic and snow depth products:
  What was the mean ground-sampling distance (GSD) in each flight, compared to the 10 cm gridcell size used for aggregating/interpolating the cloud point to rasters? A table showing the specification of the UAV flights (height, overlap, GSD, etc) in SI would be useful.
  
  The validation of the products appears somewhat simplistic. The transects presented in Fig2 are interesting to qualitatively assess the concordance between RTK-GPS surveys and UAV topographic surfaces, but do not give a quantitative view of the validation. Please provide scatterplots of measured (probed) vs UAV snow depths, and of RTK vs UAV elevations, with corresponding error metrics (bias, rmse, r2, etc). For example, I see quite many discrepancies between RTK surveyed elevations and the UAV reconstructed ones… Based on Figure 2, I would not trust the statement made in L115 that the ‘overall quality of the dataset is high’.
  
  A RMSE of 16 cm for snow depth is on the high side for the snow-on/off photogrammetric method. I guess what is important here would be to give some relative metrics, e.g. normalized RMSE. Is a 16 cm RMSE a large fraction of the average snow depth in the landscape?
  Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the role of shrub height and topography in snow accumulation on low-Arctic tundra using UAV-borne lidar. Journal of Hydrometeorology, 24(5), 853-871.
  The ‘Simple topographic model of snow depth variation (TM_SV)’ does not appear that simple from the information given. If I understood correctly, you fit a regression between directionally-filtered snow depth and terrain at various spatial scales (11-221 m) and then added these predictions to obtain TM_SV in equation 5? Some more rationale could be given for these methods… as stated presently, the reader is left with several questions about the validity of the approach. You could give some interpretation of the weights in Fig S3, and their R2… why are some of the R2 negative?
  
  3) Machine learning model of snow depth variations
  A boosted regression tree model of snow depth variation was produced using 2% of the datasets. As I understand this was due to the computational burden, but 15 000 points do not appear such a big dataset. Anyhow, the question arises on the reliability/stability of the derived model, and no formal quantitative validation is presented. One possibility is to repeat the model training a few times on different random samples to show that the results are robust to sampling, and/or validating the trained model on the left-out data (i.e. 98% of the whole set). A map of the bias within 100x100 m grid cells is presented in SI, but this is not enough to judge model performance. Please provide additional validation metrics (RMSE, r2) for the whole left-out set, and perhaps also map them within 100x100 m gridcells as done for the bias to see spatial patterns. It is unclear at present if the high r2 reported in the paper was calculated on the training sample, in which case it would be prone to overfitting, or on the validation set, which would give a more reliable performance measure.
  
  We thank the reviewer for the very detailed and thorough review. Our responses to the major concerns of the reviewer are given here, and the minor comments and edits are addressed individually below. We have updated the manuscript to address these comments and concerns.
  We appreciate the opportunity to elaborate on our evaluation of the data products derived using photogrammetry, and have added a section in the manuscript (Section 2.3) that expands on the discussion of the comparison of UAV derived products to ground measurements. We have also added a figure in the supplementary information (Figure S1) that includes the suggested scatterplots and statistics. There are a number of potential problems with the snow probe measurements themselves, as outlined in the Lamare et al., 2023 paper referenced by the reviewer, which may lead to underestimation of the accuracy of the UAV dataset. These problems include snow probe penetration into the litter layer (overestimation of snow thickness), surface ice formation (underestimation of snow thickness), and positional accuracy of the snow probe measurements. Even with all of these potential issues, the UAV derived data products and ground measurements are in close agreement for the ground surface elevation (RMSE = 17 cm), the snow surface elevation (RMSE = 18 cm), and the snow depth (RMSE = 16 cm). These products are in good agreement even beneath patches of tall shrubs (Figure 2), demonstrating the validity of the approach used to generate the ground surface elevation using the June UAV flights. The 16 cm RMSE for the comparison of the ground measurements and UAV snow depth product is less than 15% of the mean probed snow depth along the transects (1.12 m), is well in line with previously published studies that used photogrammetry to derive snow depths (RMSE = 10 – 30 cm; Revuelto et al. 2021; Harder, Pomeroy, and Helgason 2020; Harder et al. 2016; Avanzi et al. 2018; Goetz and Brenning 2019), and is very close to the estimated RMSE of 0.15 m reported by Lamare et al., 2023, even though that study used lidar rather than photogrammetry. We feel that these comparisons demonstrate the high quality of the UAV derived products.
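
For readers who want to reproduce this kind of comparison, the metrics named in the exchange above (bias, RMSE, R2, and RMSE normalized by the mean observed depth) can be computed as in the minimal sketch below. The probe and UAV values are hypothetical and chosen only for illustration; the last line reproduces the relative error quoted in the reply from the two reported numbers (0.16 m and 1.12 m).

```python
import math

def validation_metrics(observed, predicted):
    """Return bias, RMSE, R^2 and RMSE normalized by the mean observation."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return {"bias": bias, "rmse": rmse, "r2": r2, "nrmse": rmse / mean_obs}

# Hypothetical probed (observed) and UAV-derived (predicted) snow depths in metres.
probe = [0.80, 1.00, 1.20, 1.40, 1.20]
uav = [0.90, 0.95, 1.30, 1.25, 1.20]
metrics = validation_metrics(probe, uav)

# Relative error from the reported values: 0.16 m RMSE over a 1.12 m mean depth.
reported_nrmse = 0.16 / 1.12  # about 0.143, i.e. just under 15 %
```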
  We will provide the processing reports from the AgiSoft reconstruction for each flight, which include information about GCP locations, overlap, etc., along with the UAV products in a data archive that will be made publicly available at https://ess-dive.lbl.gov/ upon publication of the manuscript. In general, photogrammetry is a well-established technique. This paper is not intended to be an in-depth investigation of the method itself, so we would like to limit the amount of highly technical detail. We recognize that challenges and limitations with the photogrammetric method may arise depending on the landscape organization of the surveyed site. However, the accuracy assessment described above and included in the revised manuscript demonstrates that this method performs well in this watershed for each product (terrain elevation, vegetation surface, and snow surface).
  
  In response to the comments about the TM_SV, yes, the reviewer’s interpretation of the procedure is correct. We have expanded the description of the TM_SV in the methods to include the rationale behind this procedure (“Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale, we develop a simple model of snow depth variation (TM_SV) in the watershed by performing a linear fit of filtered snow maps at a given scale to filtered topography maps at that same scale.”), and interpretation of the weights (“the weights wi (with units m/m) can be interpreted as the deviation from local mean snow depth for a given terrain deviation at each scale i”). The intercept of the linear fit was forced to zero at each scale, which leads to the potential for a negative R2 (meaning that the linear fit is a worse description of the data than the mean of the data). However, after updating the analysis to include filtered maps at smaller scales (as suggested by Reviewer 1), the R2 values are no longer negative, even at the large scales. Since the weights approach zero at 90 m scales, the TM_SV model is not strongly affected by the inclusion of the scales that have low or even negative R2 values.
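
The point about negative R2 can be illustrated with a small example. With R2 defined as 1 - SS_res / SS_tot, a fit whose intercept is forced to zero can have a residual sum of squares larger than the total sum of squares about the mean, which makes R2 negative. The sketch below uses hypothetical numbers, not the watershed data, and the function names are illustrative.

```python
def fit_through_origin(x, y):
    """Least-squares slope w for y ~ w * x with the intercept fixed at zero."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def r_squared(y, y_hat):
    """Coefficient of determination relative to the mean of y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical terrain deviation (m) and snow depth deviation (m) at one scale.
terrain = [1.0, 2.0, 3.0]
snow = [0.30, 0.20, 0.10]  # offset from zero and decreasing with terrain

w = fit_through_origin(terrain, snow)  # weight in m of snow per m of terrain
pred = [w * t for t in terrain]
r2 = r_squared(snow, pred)  # negative: the mean of the data fits better
```

Here the zero-intercept line cannot follow the offset, decreasing data, so predicting the mean snow deviation everywhere would give a smaller error than the fitted line.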
  
  We have also addressed the reviewer’s concerns regarding the machine learning model performance and repeatability in the updated manuscript. All statistics reported in the previous and current version of the manuscript are calculated using the entire dataset (i.e. the 2% used for training and the 98% left-out set). In the updated manuscript we have included an analysis of the repeatability of the machine learning model by training separate ensembles of ML models on ten randomly generated 15,000 point training sets for 2019 and 2022. Performance statistics (RMSE, MAB, R2) for each ML model are shown in Table S1 of the updated manuscript, and are extremely consistent across each ensemble. Further, maps of standard deviation shown in Figure S7 of the updated manuscript demonstrate strong spatial consistency across each ensemble. The ML ensemble mean, which exhibits considerably higher performance than any ensemble member, is used throughout the updated manuscript.
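
The repeatability test described above (several models, each trained on a different random 2% sample, then compared) can be sketched as follows. This is an illustration under stated assumptions and not the authors' code: the data are synthetic, and an ordinary least-squares line stands in for the boosted regression trees so that the example needs only the standard library.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y ~ a + b * x (stand-in for the boosted trees)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

rng = random.Random(0)
# Synthetic data: one predictor and a noisy linear response (hypothetical).
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
y = [0.5 * xi + rng.gauss(0.0, 0.1) for xi in x]

n_members, train_fraction = 10, 0.02
n_train = round(train_fraction * len(x))
left_out_rmse, member_preds = [], []
for _ in range(n_members):
    train_idx = set(rng.sample(range(len(x)), n_train))
    a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
    test_idx = [i for i in range(len(x)) if i not in train_idx]
    # Score each member only on the points it never saw during training.
    left_out_rmse.append(rmse([y[i] for i in test_idx], [a + b * x[i] for i in test_idx]))
    member_preds.append([a + b * xi for xi in x])

# Ensemble mean prediction at every point, scored on the full dataset.
ensemble_pred = [sum(p[i] for p in member_preds) / n_members for i in range(len(x))]
ensemble_rmse = rmse(y, ensemble_pred)
spread = max(left_out_rmse) - min(left_out_rmse)  # consistency across members
```

A small `spread` relative to the RMSE itself indicates that the result does not depend strongly on which 2% of the data was drawn. Scoring members on the left-out points only, as done here, is the stricter of the two options discussed in the exchange.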
  
  Minor comments and edits
  Shrub canopy snow trapping field: this is an interesting idea to consider the shrub neighbourhood. The influence of the shrub neighbourhood would also depend on prevailing wind directions; did you consider that? Please give some information on wind directions in the site description.
  We understand the importance of wind direction on snow trapping by shrub canopies, but both wind direction and speed vary across the watershed, making it complicated to include this information in the model. The gradient of the shrub canopy snow trapping field is included in the machine learning model. This gradient carries information about the directionality of the field, which the ML model may use to infer how shrub canopy snow trapping is influenced by wind direction. We have included information on wind direction at the top of the site in Section 2.1 of the updated manuscript.
  L31: 0 C : missing a degree symbol?
  Thank you for pointing this out, we have corrected it in the updated manuscript.
  Figure 2 (Validation of digital terrain map and snow depth) says ‘Orthomosaic of the transects and surrounding area is shown on the left. ‘ I don’t see it, is it missing? It also says: ‘A snow depth map of the transects and surrounding area is shown in the middle’. I don’t see it. Seems the caption does not relate to the figure?
  We apologize for the confusion. The legend refers to an earlier version of the figure. We have corrected it in the updated manuscript.
  L122: a much-used topographic index for snow redistribution is the Winstral wind sheltering index (eg Winstral 2002; Dharmadasa et al., 2023), which you should at least mention. It also relies on the dominant wind direction, you may want to compare/contrast your approach to that.
  Winstral, A., Elder, K., & Davis, R. E. (2002). Spatial snow modeling of wind-redistributed snow using terrain-based parameters. Journal of hydrometeorology, 3(5), 524-538.
  Dharmadasa, V., Kinnard, C., & Baraër, M. (2023). Topographic and vegetation controls of the spatial distribution of snow depth in agro-forested environments by UAV lidar. The Cryosphere, 17(3), 1225-1246.
  Thank you for this suggestion. We have included the following discussion of this important index in Section 2.5 of the updated manuscript: “Other indices like the Winstral wind sheltering index (Winstral, Elder, and Davis 2002; Dharmadasa, Kinnard, and Baraër 2023), are calculated based on the dominant wind direction, which varies across the watershed and throughout the winter, and cannot account for snow redistribution resulting from turbulence created in front of a topographic obstruction.”
  L135: equations 1-4: the x and y are not defined in text. Some symbology is strange: an underscore is used on z. should it not be the ‘bar’ symbol for average (as said in the text)?
  Thank you for pointing this out, we have corrected it in the updated manuscript.
  The description of the procedure should be improved a bit, perhaps by explaining better in wording, first (or after), what the equations produce. You basically run a moving 1D window along two orthogonal directions with different window lengths (11 m to 221 m) and then calculate topographic deviations from these? Then you remove these from the basemap and reapply at the next coarser level? I wonder why digital filtering has not been used to remove the ‘coarse’ features? Perhaps a conceptual figure in main text or SI could help the reader to better understand the new proposed filtering step…
  Thank you for this comment. We have improved the wording and added a schematic that describes this filtering approach (Figure 3 in the updated manuscript). Generally, this approach can be considered a digital filtering approach, although it differs from most other digital filters in that it preserves edges in orthogonal directions and can extract information within a window of length scales (e.g. the filtered map for 30 m gives the information at scales smaller than 30 m but larger than 20 m). These are important features for our application because the topographic features in the studied watershed have sharp edges and vary in size. It also allows us to create the TM_SV, which relates variation in snow depth at a certain scale to variation in topography at that same scale.
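
The general idea of extracting information within a window of length scales can be sketched with a stacked moving-mean decomposition of a 1D profile. This is a simplified stand-in and not the filter used in the manuscript: it works along one direction only, uses a plain moving mean (which does not preserve edges), and the window lengths and profile are hypothetical.

```python
def moving_mean(values, window):
    """Centered moving mean; the window shrinks at the ends of the profile."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def stacked_decomposition(profile, windows):
    """Split a 1D profile into one band per window plus a coarse residual."""
    current, bands = list(profile), []
    for window in sorted(windows):
        smooth = moving_mean(current, window)
        # Deviation from the local mean at this window length.
        bands.append([c - s for c, s in zip(current, smooth)])
        current = smooth  # pass the smoothed profile on to the next coarser scale
    return bands, current

# Hypothetical elevation profile (m) at 1 m spacing: a gentle slope plus a step.
profile = [0.01 * i + (0.5 if 20 <= i < 30 else 0.0) for i in range(60)]
bands, residual = stacked_decomposition(profile, windows=[5, 11, 21])

# The bands and the residual add back up to the original profile.
rebuilt = [sum(b[i] for b in bands) + residual[i] for i in range(len(profile))]
```

Because each band is the difference between two successive smoothings, it holds the variation between the previous and the current window length, and the decomposition is exactly invertible.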
  L144: ‘we take the top part of the watershed as representative of thermokarst patterned ground.’ => not evident why for the reader
  L145: why using the 21 m scale?
  L146-151: there are many choices made here on thresholds which are not explained nor validated anyhow.
  Why is there a sharp boundary for the patterned ground in Fig S2? This seems arbitrary.
  In response to the previous four comments, we have rearranged the methods section to avoid confusion for the reader. The comparison of snow depth variation between the different types of topographic features is the main goal of the classification. Therefore, we decided to refer the reader to our knowledge of the site in classifying regions characterized by each topographic feature, rather than methods based on the topographic maps alone. The updated manuscript avoids the use of arbitrary thresholds and scales, and instead includes the following line:
  “Using our knowledge of the site, we identify regions in the watershed that are characterized by each topographic feature”
  L157, ‘by performing a linear fit’…
  Thank you for this suggestion, we have updated the manuscript accordingly.
  Fig S1 and S2 lack a scale bar to interpret the colour scale
  Thank you for this suggestion, we have added colorbars to these figures.
  What are the units of the ‘weights’? Is it cm of snow depth per m of terrain deviation? Please explain better
  We appreciate the suggestion to provide further clarification. We have added units to the y-axis of Figure S3 and included the following explanation at the end of Section 2.5:
  “where the weights w_i (with units m/m) can be interpreted as the deviation from local mean snow depth for a given terrain deviation at each scale i.”
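Elsewhere in these responses the TM_SV is described as a weighted sum of the filtered topographic maps. Under that description, and as a hedged formalization only, the model can be written as below. The symbols $T_i$ (filtered topographic map at scale $i$) and $N$ (number of scales) are notation introduced here for illustration; only the weights and their units come from the text.

```latex
\mathrm{TM\_SV}(x, y) = \sum_{i=1}^{N} w_i \, T_i(x, y),
\qquad
[w_i] = \frac{\text{m of snow depth deviation}}{\text{m of terrain deviation}}
```

Read this way, each term converts a terrain deviation at scale $i$ (in m) into its contribution to the deviation from local mean snow depth (also in m).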
  
  L 176: ‘filtered maps’ : which ones? DTM?
  Thank you for pointing this out. We have clarified in the updated manuscript that the filtered topographic maps were used to train the ML model.
  L183-187: did you train a single model on 2% of the data or did you train a few ones? Is the model stable if you retrain on another 2% of the data? Please consider showing some sensitivity results in SI
  Each model is trained on 2% of the data and tested on the full dataset. In the updated manuscript, we have trained an ensemble of 10 ML models for both 2019 and 2022. The statistics and maps of standard deviation across the watershed are shown in Table S1 and Figure S7 of the revised manuscript, respectively.
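The train-on-2%, evaluate-on-everything ensemble described in this response can be sketched as below. The learning algorithm is not named here, so a one-variable least-squares line stands in for it, and the data are synthetic. The function names, seeds, and data are all assumptions made for illustration.

```python
import random
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for the ML model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def ensemble_predictions(xs, ys, n_members=10, train_fraction=0.02, seed=0):
    """Train each member on its own random subset and predict on all points."""
    rng = random.Random(seed)
    n_train = max(2, round(train_fraction * len(xs)))
    members = []
    for _ in range(n_members):
        idx = rng.sample(range(len(xs)), n_train)
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        members.append([a + b * x for x in xs])
    return members


# Synthetic data for illustration only (not the watershed data).
data_rng = random.Random(42)
xs = [data_rng.uniform(-1.0, 1.0) for _ in range(5000)]
ys = [0.5 - 0.3 * x + data_rng.gauss(0.0, 0.05) for x in xs]

members = ensemble_predictions(xs, ys)
ensemble_mean = [mean(col) for col in zip(*members)]
ensemble_std = [pstdev(col) for col in zip(*members)]  # spread across members
```

The per-point spread across members is the kind of quantity a map of standard deviation would display; a small spread suggests the predictions are not sensitive to which 2% of points were drawn.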
  L187: what is the ‘shrub potential’?
  We apologize for the confusion. This is a term used in a previous draft of the paper. We have corrected it to read ‘shrub canopy trapping field.’
  Fig.3: add black stippled line to legend. Also, it is not clear what the first row is. I thought it was the ‘filtered topography’, but the y-label has the SD label to it, so are these snow depth or elevations?
  Thanks for this suggestion. We have added the black stippled line to the legend in the updated manuscript. The mean standard deviation of 100 m x 100 m windows is shown in each panel of this figure for the different products. We have tried to make this clearer by changing the SD (which can be confused with snow depth) in the y-axes labels to STD.
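The statistic named in this response, the mean standard deviation over 100 m x 100 m windows, can be illustrated with a short sketch. It assumes non-overlapping windows and the population standard deviation; neither detail is stated in the response, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def mean_windowed_std(grid, window):
    """Average of per-window standard deviations over non-overlapping tiles."""
    n_rows, n_cols = len(grid), len(grid[0])
    stds = []
    for r0 in range(0, n_rows - window + 1, window):
        for c0 in range(0, n_cols - window + 1, window):
            cells = [
                grid[r][c]
                for r in range(r0, r0 + window)
                for c in range(c0, c0 + window)
            ]
            stds.append(pstdev(cells))
    return mean(stds)


# Toy 4 x 4 grid with 2 x 2 windows. On a 1 m grid, a 100 m window
# would be 100 cells on a side.
toy = [
    [0.0, 0.0, 1.0, 3.0],
    [0.0, 0.0, 3.0, 1.0],
    [2.0, 2.0, 5.0, 5.0],
    [2.0, 2.0, 5.0, 5.0],
]
result = mean_windowed_std(toy, 2)
```

Only the top-right window varies in this toy grid, so the statistic reflects local variability rather than the large offsets between windows.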
  L192: ‘topography varies across all scales while variation in snow depth is attenuated at scales larger than ~100 m (Fig. 3).’ I agree that the std of SD is more concentrated at smaller scales, but I also see this tendency for the topography (first row of fig3?). The exception is for terraces in the cross-slope direction?
  Thank you for the opportunity to clarify this point. Indeed, both the topography and the snow depths have more variation at smaller scales than at larger scales. However, whereas the variation in snow depth goes to zero at scales larger than 90 m, the variation in topography remains larger than zero at all scales across all topographic features. It is this difference between zero and non-zero that we are emphasizing in this sentence.
  L214: ‘Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale’: this statement would be better placed before the methods, to better explain the rationale of the methods developed.
  Thank you for this suggestion. We have moved this line to the beginning of the section that describes the TM_SV in the methods.
  L217: remove space before the period.
  Thank you for pointing out this typo. We have corrected it in the updated manuscript.
  L234: What type of R2 is that? Is this a cross-validated R2? If it is only the squared correlation coefficient between the observations used for calibrating and the model prediction, it is not a robust measure of performance. Please provide a robust assessment, i.e. predicting your model on observations not used for training (i.e. you have 98% of your data available for this). Your Figure S5 is instructive in this way but please provide validation metrics calculated on the data not used for training (RMSE, Bias and R2). If this is what was done, explain it better.
  The machine learning performance metrics reported in both the original and updated manuscript were calculated using the entire dataset, not just the points used for training. We have emphasized this in the updated version of the manuscript, and included Table S1 in the revised manuscript that details performance metrics for each ensemble member for both 2019 and 2022.
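For reference, the three metrics requested in the comment (RMSE, bias and R2) can be computed as in the sketch below. The response does not say which definition of R2 was used; this sketch assumes the coefficient of determination, which can differ from the squared correlation coefficient the referee mentions. The sample values are made up.

```python
from math import sqrt
from statistics import mean


def rmse(obs, pred):
    """Root-mean-square error between observations and predictions."""
    return sqrt(mean((p - o) ** 2 for o, p in zip(obs, pred)))


def bias(obs, pred):
    """Mean error (prediction minus observation)."""
    return mean(p - o for o, p in zip(obs, pred))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    obs_mean = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Made-up snow depths (m): every prediction is 0.1 m too deep.
obs = [0.2, 0.4, 0.6, 0.8]
pred = [0.3, 0.5, 0.7, 0.9]
scores = (rmse(obs, pred), bias(obs, pred), r_squared(obs, pred))
```

A constant offset like this one leaves the squared correlation at 1 while lowering the coefficient of determination, which is why stating the definition matters.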
  Table S1: why including the 31 m scale topo map? And is this information not redundant with the filtered topographic map used to predict the TM_SV model?
  Table S1 shows only the most important variables used by the machine learning models to predict snow depth. Many more variables were used for model training, including all of the available topographic information, but these first variables were the ones that the machine learning models rely on the most. The TM_SV model is indeed redundant with the filtered topographic maps that are also included as predictor variables (since the TM_SV is simply a weighted sum of these maps), but the machine learning algorithm is more efficient when the information is pre-organized in a useful way like this. Plus, the strong reliance of the machine learning models on the TM_SV is further proof that the TM_SV is an effective translation of the topographic information into local variation in snow depth.
  L236-240: effects of shrubs. I am not fully convinced you can get an independent shrub trapping effect with this method. For this to work, the shrub potential would need to be fully independent from the other predictors. If some collinearity exists between the shrub potential and topography (or the topo-derived TM_SV), then the effect of topography (or TM_SV) could include shrub effects… The reverse is also true.
  While there may be some association between tall shrubs and topographic depressions, there are many areas in the watershed where there are no tall shrubs. The machine learning algorithm can develop relationships between topography and snow depth using data from these areas which are completely independent of shrub impacts. In this way, the impact of shrubs and topography on snow depth can be separated. Before performing this analysis, we actually expected that there would be a stronger association between tall shrub patches and topographic lows. However, close examination of the map of TM_SV (Figure 5 in the updated analysis) shows almost no relationship between the footprints of shrub patches A and B with the underlying topography and perhaps a minor association for shrub patch C (although not enough to account for the significant snow accumulation there). For these reasons we are quite confident that the shrub patches are accumulating snow, although we do acknowledge that we cannot independently evaluate the performance of the machine learning predictions of snow trapping by shrub canopies.
  Figure 5. What are the units of the snow trapping in the two maps? Meters? Or is this the Shrub canopy snow trapping field? Please clarify.
  Thank you for pointing out this omission. We have included the units of the snow trapping in the two maps (m) in the updated version of the manuscript.
  Section 3.3. Perhaps I missed it in Methods, but it is unclear for me if you trained a new ML model for 2022 or if you applied the 2019 model to the 2022 data? Please clarify
  We trained separate machine learning models for 2019 and 2022. We have clarified this both in the methods and in Section 3.3
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC2

Status: closed

RC1:
'Comment on egusphere-2023-968', Anonymous Referee #1, 29 Jul 2023

Review of the manuscript “Disentangling the effect of geomorphological features and tall shrubs on snow depth variation in a sub-Arctic watershed using UAV derived products”.
The manuscript describes an interesting approach to understand and differentiate the control that topography and shrubs have on snow distribution. The methods presented are novel and exploit state of the art observation tools as UAVs. The findings of Shirley et al., are highly interesting and I consider this work will contribute to improve the understanding of snow dynamics in sub-arctic areas.
However I consider the manuscript needs some work before publication in order to improve it and thus I recommend major review (despite it is between minor and major). I encourage manuscript authors to complete this review as I think the work and the approach applied are valuable for the community.
Major points:
1. Some sections need a clearer description of the methods to allow an easier understanding. In the list of minor points I have included some recommendations. For example more details are needed when describing the stacked directional filtering as it is a novel method in the community. This way, when some figure are referred to in section 2.3, a more detailed description in what is observed is required to help their interpretation.
2. Description of the study area in Section 2.1 This section needs a more appropriate study area map. Figure 1 gives some information, but it is not easy to have detailed information of the study area. The elevation map (left panel in figure 1) helps but discretized elevation bands and elevation curves (additionally to the vegetation contour lines) would do easier the interpretation for potential readers. Moreover a location map, showing Alaska and placing the watershed is required. Also I encourage manuscript authors to include photographs with and without snow of this site. Similarly, vegetation and snow depths maps would benefit of discretized color ramps also including elevation curves and vegetation contours (with thicker lines).
3. I encourage to apply the analysis of TM_SV for shorter distances, (and not only starting in 11 m) This might allow to identify the maximum distance for which shrubs are able to disturb the snow distribution (see line 228-229 minor comment).
Minor comments:
Line 24: Indicate that both, the RMSE and the R2 are the error metrics of the machine learning model.
Line 36: I encourage manuscript authors to include more references (not only a self-citation) about the control that snow has as driver of landscapes.
Line 40: Which is the microtopographic spatial scale, 1m, 10 m?, please include some insight/references about this.
Line 54-55: I think here can be changed the reference of Sturm et al., 2005 by a more recent one as this: https://doi.org/10.1175/JHM-D-22-0067.1
Line 100: Is the ground resolution the ground sampling distance of the UAV camera or the final resolution of the grid cells? Please, clarify.
Line 101: Approximately 50 targets as GCP? Did you change the number depending on the acquisition date? Please clarify this point and include a map showing the location of GCPs.
Line 108: Which method did you use to interpolate data?
Line 112-115: These lines are describing an interesting validation along several transects. I would provide further details here and potentially include a new section for this validation, as you are comparing the elevations with and without snow and not a direct snow depth validation. In view of Figure 2 caption I think some information is missing in this figure. For instance in the caption it is stated: “Orthomosaic of the transects and surrounding area is shown on the left. Contour lines are drawn at a 3 m distance from vegetation of 1 m height or taller. A snow depth map of the transects and surrounding area is shown in the middle. Contour lines are drawn at terrain elevation isolines with a spacing of 1m.”, where is the orthomosaic? And the snow depth map of the transects?
Line 120: Please, give more details on what potential readers can see/understand in “Figs 1,4” You are giving too much information without a detailed explanation and without introducing some important concepts as the TM-SV plotted in some of these figures or the different classes you are focusing at (Stream, Terracets, Patterned ground).
Line 132: Which are the finest and the coarsest spatial scales? Why you have chosen these scales?
Line 139: Why you don’t start with 1 m spatial scale (your dataset has a 0.1 m spatial resolution).
Line 145: Why 21 m? Please, give some details about this choice, otherwise it seems too arbitrary.
Line 151: In figure S2 you are talking about TM_SV, without a previous description of this variable and this is confusing. Please, here and elsewhere in the manuscript try to re-order figures/concepts to have a step by step explanation of the concepts described in your article.
Line 155: Why the snow depth map of 2019 and not that of 2022? Please justify this choice.
Line 167: OK, I understand that computational efficiency a 50x50 m window is chosen, but this is not a physical choice. I suggest reviewing literature and also justifying this search distance in view to previous works.
Line 185: “measured shrub height and potential”, is there something missing after potential? I am not able to understand this sentence. Please rephrase if possible.
Figure 3, vertical dashed lines in the different panels (indicating distances) would help to interpret the distances.
Line 221 to 226. Panels d, e, f do not appear in Figure 4. Oppositely panels I, II, III, IV and V are shown. Please, change conveniently.
Line 228-229: I like this last sentence of this section as it summarizes an interesting outcome of this work. However, and despite the interesting analysis described in section 3.2 which demonstrate the importance of shrubs on trapping snow, I think it is necessary to understand the distance to which these shrubs can control snow depth variations. That is why I would suggest applying 3.1 analyses to shorter distances. Please consider doing an insight to determine this distance.
Line 243: The message of this sentence would benefit of elevation curves included in all snow maps. I encourage including contour elevation lines in all maps.
Line 250: is this figure 6 or is it figure 5??
Line 305: I would say “we demonstrate….on the order of 100 m IN THIS STUDY AREA”.
Line 319-325: I see these sentences more appropriate for the discussion. I encourage moving them to this section in new section about the consequences of the findings of this research.
Line 326-333: I think these lines also fit better in the discussion section. What about a final subsection in the discussion named: “Impacts and future work” (or similar).

Citation: https://doi.org/10.5194/egusphere-2023-968-RC1
- AC1:
  'Reply on RC1', Ian Shirley, 30 Sep 2023
  The manuscript describes an interesting approach to understand and differentiate the control that topography and shrubs have on snow distribution. The methods presented are novel and exploit state of the art observation tools as UAVs. The findings of Shirley et al., are highly interesting and I consider this work will contribute to improve the understanding of snow dynamics in sub-arctic areas.
  However I consider the manuscript needs some work before publication in order to improve it and thus I recommend major review (despite it is between minor and major). I encourage manuscript authors to complete this review as I think the work and the approach applied are valuable for the community.
  Thank you for your comments and your positive review. We have responded to your comments and suggestions below, and through changes to the manuscript.
  
  Major points:
  Some sections need a clearer description of the methods to allow an easier understanding. In the list of minor points I have included some recommendations. For example more details are needed when describing the stacked directional filtering as it is a novel method in the community. This way, when some figure are referred to in section 2.3, a more detailed description in what is observed is required to help their interpretation.
  
  We have improved the description of the stacked directional filtering in the revised manuscript. In addition to updating the text describing the filtering approach in the Methods section, we have included an additional figure that outlines the approach in schematic form.
  Description of the study area in Section 2.1 This section needs a more appropriate study area map. Figure 1 gives some information, but it is not easy to have detailed information of the study area. The elevation map (left panel in figure 1) helps but discretized elevation bands and elevation curves (additionally to the vegetation contour lines) would do easier the interpretation for potential readers. Moreover a location map, showing Alaska and placing the watershed is required. Also I encourage manuscript authors to include photographs with and without snow of this site. Similarly, vegetation and snow depths maps would benefit of discretized color ramps also including elevation curves and vegetation contours (with thicker lines).
  
  Thank you for these suggestions. We have included the area map and select images within the watershed in Figure 1. We have also changed the figures throughout the manuscript and supplementary information to include elevation curves and discretized color ramps.
  I encourage to apply the analysis of TM_SV for shorter distances, (and not only starting in 11 m) This might allow to identify the maximum distance for which shrubs are able to disturb the snow distribution (see line 228-229 minor comment).
  
  As suggested by the reviewer, we have performed the stacked directional filtering and TM_SV analysis for shorter distances (3, 5, 7 and 9 m). We note that we have downsampled the UAV products to 1 m resolution to reduce the memory requirements needed for analysis, so filtering at shorter distances is not possible. We are not able to determine the maximum distance for which shrubs are able to disturb the snow distribution using only the TM_SV, however, as the shrub canopy structure and its interaction with topography is quite complex. We developed the machine learning models in order to deal with this complexity.
  Minor comments:
  Line 24: Indicate that both, the RMSE and the R2 are the error metrics of the machine learning model.
  These metrics describe the fit between the machine learning model of shrub canopy snow trapping and the simple exponential fit that we perform in Figure 8 to approximate the complex output of the machine learning model. We hope that this is more clear in the revised version of the abstract.
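As an illustration of fitting a simple exponential to a more complex model output, the sketch below fits y = a * exp(b * h) by least squares on log(y). The functional form, the parameter names, and the synthetic values are assumptions made here; the response states only that an exponential function of canopy height above snow was fitted.

```python
from math import exp, log
from statistics import mean


def fit_exponential(hs, ys):
    """Fit y = a * exp(b * h) by least squares on log(y); requires y > 0."""
    logs = [log(y) for y in ys]
    mh, ml = mean(hs), mean(logs)
    b = sum((h - mh) * (l - ml) for h, l in zip(hs, logs)) / sum(
        (h - mh) ** 2 for h in hs
    )
    a = exp(ml - b * mh)
    return a, b


# Synthetic, noise-free values for illustration only.
heights = [0.0, 0.5, 1.0, 1.5, 2.0]          # canopy height above snow (m)
trapping = [0.05 * exp(0.8 * h) for h in heights]  # trapped snow depth (m)
a, b = fit_exponential(heights, trapping)
```

The RMSE and R2 quoted in the abstract would then compare such a fitted curve against the machine learning output, not against independent observations.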
  Line 36: I encourage manuscript authors to include more references (not only a self-citation) about the control that snow has as driver of landscapes.
  Thank you for this suggestion. We have added the following references to support this well-researched topic:
  Avanzi, Francesco, Alberto Bianchi, Alberto Cina, Carlo De Michele, Paolo Maschio, Diana Pagliari, Daniele Passoni, Livio Pinto, Marco Piras, and Lorenzo Rossi. 2018. “Centimetric Accuracy in Snow Depth Using Unmanned Aerial System Photogrammetry and a MultiStation.” Remote Sensing 10 (5): 765.
  Chen, Yunxiang, Roman A. DiBiase, Nicholas McCarroll, and Xiaofeng Liu. 2019. “Quantifying Flow Resistance in Mountain Streams Using Computational Fluid Dynamics Modeling over Structure-from-Motion Photogrammetry-Derived Microtopography.” Earth Surface Processes and Landforms 44 (10): 1973–87.
  Goetz, Jason, and Alexander Brenning. 2019. “Quantifying Uncertainties in Snow Depth Mapping From Structure From Motion Photogrammetry in an Alpine Area.” Water Resources Research 55 (9). https://doi.org/10.1029/2019WR025251.
  Harder, Phillip, John W. Pomeroy, and Warren D. Helgason. 2020. “Improving Sub-Canopy Snow Depth Mapping with Unmanned Aerial Vehicles: Lidar versus Structure-from-Motion Techniques.” The Cryosphere 14 (6): 1919–35.
  Harder, Phillip, Michael Schirmer, John Pomeroy, and Warren Helgason. 2016. “Accuracy of Snow Depth Estimation in Mountain and Prairie Environments by an Unmanned Aerial Vehicle.” The Cryosphere 10 (6): 2559–71.
  Le, Phong V. V., and Praveen Kumar. 2017. “Interaction Between Ecohydrologic Dynamics and Microtopographic Variability Under Climate Change.” Water Resources Research 53 (10): 8383–8403.
  Revuelto, Jesus, Esteban Alonso-Gonzalez, Ixeia Vidaller-Gayan, Emilien Lacroix, Eñaut Izagirre, Guillermo Rodríguez-López, and Juan Ignacio López-Moreno. 2021. “Intercomparison of UAV Platforms for Mapping Snow Depth Distribution in Complex Alpine Terrain.” Cold Regions Science and Technology 190 (October): 103344.
  Tabler, Ronald D. 1980. “Geometry and Density of Drifts Formed by Snow Fences.” Journal of Glaciology 26 (94): 405–19.
  Wainwright, Haruko M., Anna K. Liljedahl, Baptiste Dafflon, Craig Ulrich, John E. Peterson, Alessio Gusmeroli, and Susan S. Hubbard. 2017. “Mapping Snow Depth within a Tundra Ecosystem Using Multiscale Observations and Bayesian Methods.” The Cryosphere. https://doi.org/10.5194/tc-11-857-2017.
  
  Line 40: Which is the microtopographic spatial scale, 1m, 10 m?, please include some insight/references about this.
  Thank you for pointing out this ambiguity. The term “microtopography” still lacks a strict definition, but we’ve added the following sentences to clarify our use of the term.
  “Microtopography can be defined as the difference between topographic maps and smoothed or low-pass filtered versions of these maps (Wainwright et al. 2017). The characteristic scale of microtopographic features varies depending on the environment and processes of interest, but length scales of microtopographic features relevant for landscape surface hydrology are typically on the order of 5 - 50 m (Chen et al. 2019; Le and Kumar 2017; Wainwright et al. 2017).”
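The definition quoted above can be stated as a one-line formula. The notation is introduced here for illustration: $z$ is the elevation map, $K_s$ a smoothing (low-pass) kernel with characteristic length $s$, and $*$ denotes convolution; the choice of kernel is left open, as it is in the quoted text.

```latex
z_{\mathrm{micro}}(x, y) = z(x, y) - (K_s * z)(x, y)
```

With $s$ on the order of 5 to 50 m, features narrower than the kernel remain in $z_{\mathrm{micro}}$ while broader slopes are removed.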
  
  Line 54-55: I think here can be changed the reference of Sturm et al., 2005 by a more recent one as this: https://doi.org/10.1175/JHM-D-22-0067.1
  Thank you for this suggestion, we have updated this reference accordingly.
  Line 100: Is the ground resolution the ground sampling distance of the UAV camera or the final resolution of the grid cells? Please, clarify.
  The ground sampling distance of the UAV camera was 3-5 cm, and the final resolution of the grid cells was 10 cm. Please note that we then downsampled the UAV products to a resolution of 1m for the purposes of this analysis - we have clarified this in the updated manuscript.
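Going from a 10 cm grid to a 1 m grid corresponds to an aggregation factor of 10. A minimal sketch is given below, assuming block averaging; the response does not state which resampling method was used, and the function name is hypothetical.

```python
from statistics import mean


def block_mean(grid, factor):
    """Aggregate a grid by averaging non-overlapping factor x factor blocks.

    Trailing rows or columns that do not fill a whole block are dropped.
    """
    n_rows = len(grid) // factor * factor
    n_cols = len(grid[0]) // factor * factor
    return [
        [
            mean(
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            )
            for c0 in range(0, n_cols, factor)
        ]
        for r0 in range(0, n_rows, factor)
    ]


# A 20 x 20 grid at 0.1 m spacing becomes 2 x 2 at 1 m spacing (factor 10).
fine = [[float(r // 10 * 2 + c // 10) for c in range(20)] for r in range(20)]
coarse = block_mean(fine, 10)
```

Averaging 100 fine cells into each coarse cell also reduces the memory footprint by a factor of 100, which matches the motivation given for downsampling.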
  Line 101: Approximately 50 targets as GCP? Did you change the number depending on the acquisition date? Please clarify this point and include a map showing the location of GCPs.
  Yes, for the winter flights we used 47 targets and for the snow-free flights we used about 65 targets. This information is included in the updated version of the manuscript. We will provide the processing reports from the AgiSoft reconstruction which include maps of the locations of the GCPs, along with the UAV products in a data archive that will be made publicly available at https://ess-dive.lbl.gov/ upon publication of the manuscript.
  Line 108: Which method did you use to interpolate data?
  The data was linearly interpolated. We have clarified this in the revised manuscript.
  Line 112-115: These lines are describing an interesting validation along several transects. I would provide further details here and potentially include a new section for this validation, as you are comparing the elevations with and without snow and not a direct snow depth validation. In view of Figure 2 caption I think some information is missing in this figure. For instance in the caption it is stated: “Orthomosaic of the transects and surrounding area is shown on the left. Contour lines are drawn at a 3 m distance from vegetation of 1 m height or taller. A snow depth map of the transects and surrounding area is shown in the middle. Contour lines are drawn at terrain elevation isolines with a spacing of 1m.”, where is the orthomosaic? And the snow depth map of the transects?
  We appreciate this suggestion, and have added a new section in the methods, as well as an additional figure in the Supplementary Material, that describes the comparison between ground measurements and the UAV products in more detail. We apologize for the confusion with the Figure 2 caption. The legend refers to an earlier version of the figure, and has been corrected in the revised manuscript.
  Line 120: Please, give more details on what potential readers can see/understand in “Figs 1,4” You are giving too much information without a detailed explanation and without introducing some important concepts as the TM-SV plotted in some of these figures or the different classes you are focusing at (Stream, Terracets, Patterned ground).
  Thank you for highlighting here and below the confusion introduced by the order that concepts are introduced in the manuscript. The methods section is reorganized in the updated manuscript so that the concepts are presented in order. For example, we have added a section that describes the topographic features present in the watershed (and that does not reference the TM_SV) before this sentence which appears at the beginning of the section on stacked directional filtering.
  Line 132: Which are the finest and the coarsest spatial scales? Why you have chosen these scales?
  The finest and the coarsest scales used for stacked directional filtering would vary depending on the application. For our application, we have used scales that range from 3 m to 230 m, as we are analyzing images with 1 m resolution that are approximately 400 m in width.
  Line 139: Why you don’t start with 1 m spatial scale (your dataset has a 0.1 m spatial resolution).
  Thank you for this suggestion. We have updated our approach to start with 3 m as the smallest spatial scale, and have updated the TM_SV and machine learning models accordingly.
  Line 145: Why 21 m? Please, give some details about this choice, otherwise it seems too arbitrary.
  We have removed reference to the 21 m threshold (and other arbitrary thresholds) when describing the classification of regions characterized by each topographic feature. Instead, we refer the reader to our knowledge of the site as the method used for classification.
  Line 151: In figure S2 you are talking about TM_SV, without a previous description of this variable and this is confusing. Please, here and elsewhere in the manuscript try to re-order figures/concepts to have a step by step explanation of the concepts described in your article.
  The TM_SV serves mostly as a background image for the classified regions shown in Figure S2. In the updated version of the manuscript, we have used maps of snow depth instead for this figure.
  Line 155: Why the snow depth map of 2019 and not that of 2022? Please justify this choice.
  Thank you for allowing us to justify this choice. We’ve added the following sentence at the beginning of the section to clarify why we used the 2019 maps to build the simple model:
  “We created a simple model of snow depth variation (TM_SV) in the watershed based on the 2019 snow map instead of the 2022 snow map because the topographic variation was more consistently smoothed by the deeper snowpack in 2019, with fewer patches of <10 cm snow depths.”
  
  Line 167: OK, I understand that computational efficiency a 50x50 m window is chosen, but this is not a physical choice. I suggest reviewing literature and also justifying this search distance in view to previous works.
  Thank you for this suggestion. We have justified the choice of a 50x50 m window using a previous study of snow drift geometry that demonstrates that the impact of a 1.5 m snow fence is attenuated to zero within a 50 m distance. We’ve updated the text accordingly:
  “For computational efficiency, and because previous studies have shown a 1.5 m snow fence creates a snow drift that is less than 50 m in length (Tabler 1980), only shrubs that lie within a 50 m by 50 m window of each point contribute to 1 and 2.”
  
  Line 185: “measured shrub height and potential”, is there something missing after potential? I am not able to understand this sentence. Please rephrase if possible.
  We apologize for the confusion. “Measured shrub height and potential” is terminology from a previous draft of the paper. We’ve replaced this phrase with “shrub canopy snow trapping fields” in the updated manuscript.
  Figure 3, vertical dashed lines in the different panels (indicating distances) would help to interpret the distances.
  Thank you for this suggestion. We have updated the figure accordingly.
  Line 221 to 226. Panels d, e, f do not appear in Figure 4. Oppositely panels I, II, III, IV and V are shown. Please, change conveniently.
  Thank you for pointing out this mistake. We have updated the manuscript accordingly.
  Line 228-229: I like this last sentence of this section as it summarizes an interesting outcome of this work. However, and despite the interesting analysis described in section 3.2 which demonstrate the importance of shrubs on trapping snow, I think it is necessary to understand the distance to which these shrubs can control snow depth variations. That is why I would suggest applying 3.1 analyses to shorter distances. Please consider doing an insight to determine this distance.
  We agree that it is important to understand the distance to which shrubs can impact snow depth. However, the shrub canopy structure is complex and there are larger-scale interactions between topography and snow depth than those incorporated in the TM_SV. To deal with this complexity, we developed machine learning models to attempt to quantify the snow trapping by shrub canopies. Even with the machine learning predictions, however, it is challenging to characterize the relationship between distance from canopy and snow trapping, because of difficulties selecting the edge of the canopy vs gaps in the canopy, and varying interactions of shrub patches across the watershed with other shrub patches and with the terrain. Our discussion of snow trapping by shrub patches B and C in the last paragraph of Section 3.2 highlights this complexity, and we have further emphasized its importance in the text of the revised manuscript and with the inclusion of an additional conceptual figure.
  Line 243: The message of this sentence would benefit of elevation curves included in all snow maps. I encourage including contour elevation lines in all maps.
  Thank you for this suggestion. We have included contour elevation lines in all maps in the revised manuscript.
  Line 250: is this figure 6 or is it figure 5??
  Yes, this should be figure 5 not figure 6. We have updated the manuscript to correct this mistake.
  Line 305: I would say “we demonstrate….on the order of 100 m IN THIS STUDY AREA”.
  We agree that it is important to emphasize that identified relationships between topography, vegetation, and snow depth are specific to the studied watershed and have updated this sentence as you suggest. We have also included some discussion that compares our results to similar previous findings in other environments.
  Line 319-325: I see these sentences more appropriate for the discussion. I encourage moving them to this section in new section about the consequences of the findings of this research.
  Line 326-333: I think these lines also fit better in the discussion section. What about a final subsection in the discussion named: “Impacts and future work” (or similar).
  
  We appreciate this suggestion, and have moved these two paragraphs to a new section in the discussion in the revised manuscript.
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC1
CC1:
'Comment on egusphere-2023-968', Florent Dominé, 31 Jul 2023

I have read this paper with interest as indeed understanding the combined effects of topography and erect vegetation on snow depth is critical to many applications. I draw the Author’s attention to a related paper by our group (Lamare et al., 2023) that was published a few weeks before the Authors submitted their own work.
I am amazed that the Authors achieved this work using photogrammetry. We tried that method at our study site but were not able to detect the ground surface below shrubs with sufficient reliability, which is why we had to resort to using a lidar. Our attempts were in the fall for logistical reasons, and perhaps spring before bud burst is more favorable. In any case, I felt that few details are given by the Authors to explain how they overcame the difficulties. Did the Authors perform any field validation of vegetation height? I feel this is important to validate the use of photogrammetry to obtain a DTM in the presence of shrubs, as several previous publications (some of them cited by the Authors) report difficulties (De Michele et al., 2016; Fernandes et al., 2018; Harder, Pomeroy, & Helgason, 2020). If no field validation was performed, I suggest mentioning this explicitly. The same suggestion applies to vegetation protruding above the snowpack in April.
The models developed by the Authors appear quite interesting. Should they be interested, they may wish to test them with our data, to possibly demonstrate their more general validity. Spring and fall DTM and DSM are available at (Lamare et al., 2022). Additional data, including meteorological data, are also available as indicated in the data statement of (Lamare et al., 2023).
Lastly, I personally feel that a general situation map would have been useful to understand the general context of the study. Likewise, more details on the vegetation are in my opinion highly desirable. This is useful for many reasons, among others for readers interested in evaluating shrub bending, as some species have branches much more supple than others. “Salix spp” is rather vague, and “dwarf shrubs” even more so. Photographs of the site would be useful to many readers as well. Maps and photographs could probably be added to the supplementary material with moderate effort.
References
De Michele, C., Avanzi, F., Passoni, D., Barzaghi, R., Pinto, L., Dosso, P., . . . Della Vedova, G. (2016). Using a fixed-wing UAS to map snow depth distribution: an evaluation at peak accumulation. The Cryosphere, 10(2), 511-522. doi:10.5194/tc-10-511-2016
Fernandes, R., Prevost, C., Canisius, F., Leblanc, S. G., Maloley, M., Oakes, S., . . . Knudby, A. (2018). Monitoring snow depth change across a range of landscapes with ephemeral snowpacks using structure from motion applied to lightweight unmanned aerial vehicle videos. The Cryosphere, 12(11), 3535-3550. doi:10.5194/tc-12-3535-2018
Harder, P., Pomeroy, J. W., & Helgason, W. D. (2020). Improving sub-canopy snow depth mapping with unmanned aerial vehicles: lidar versus structure-from-motion techniques. The Cryosphere, 14(6), 1919-1935. doi:10.5194/tc-14-1919-2020
Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2022). UAV-borne lidar campaign over Umiujaq, Hudson Bay, Canada in 2017 and 2018. Retrieved from: https://doi.org/10.1594/PANGAEA.943854
Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the Role of Shrub Height and Topography in Snow Accumulation on Low-Arctic Tundra using UAV-Borne Lidar. Journal of Hydrometeorology, 24(5), 835-853. doi:10.1175/JHM-D-22-0067.1

Citation: https://doi.org/10.5194/egusphere-2023-968-CC1
- AC3: 'Reply on CC1', Ian Shirley, 30 Sep 2023
  
  Thank you for your comments and interest in our work! We have read your recent paper with great interest and were happy to see similar conclusions regarding the interactions of snow, topography, and shrubs.
  We have included some additional description of the comparison between UAV products and ground measurements (including vegetation height) in the updated version of the manuscript that may be of interest to you. In short, we find a good match between the UAV products and ground measurements of ground surface elevation, snow surface elevation, and snow depth, even in patches of tall shrubs. We timed our spring flight in a very short window between snowmelt and bud burst, but the success of this approach may depend on the environment and the species present in the watershed. We appreciate your suggestion to test the approach presented here on your data, and will include it in future efforts to apply these techniques in different environments and at larger scales. Per your suggestion, we have included a map that shows the location of the watershed in Alaska in the updated manuscript, as well as snow-on and snow-off photos taken by phenocams installed in the watershed. The willows in the watershed are primarily S. pulchra and S. glauca; we have included this detail in the text.
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC3
RC2:
'Comment on egusphere-2023-968', Anonymous Referee #2, 01 Aug 2023
Report of Disentangling the effect of geomorphological features and tall shrubs on snow depth variation in a sub-Arctic watershed using UAV derived products, by Shirley et al.
The article by Shirley et al. used UAV-derived terrain, vegetation height and snow depth maps to investigate scale-dependent relationships between these variables. Interactions between topography, terrain and vegetation are important drivers of soil temperatures and thus of permafrost occurrence/active-layer depths in the Arctic. They find that topographic variations on spatial scales <100 m explain much of the spatial variations of snow depth, with a super-imposed, but spatially-variable effect of snow trapping by shrubs. The effect of shrubs can be subdued by topography, as shrubs tend to grow in sheltered depressions, which also trap snow. The study introduces a novel ‘stack directional filtering’ approach which decomposes topographic and snow depth variations into discrete spatial scales in two orthogonal directions. This is an interesting and original approach which has the potential to improve our understanding of the complex associations between vegetation, terrain and snowpack in the Arctic.
The study has a great potential for the community, providing that the underlying data can be proven to be reliable, which at this point I am not convinced of. As such, there are a number of issues to be resolved before this paper could be considered for publication. The study area should be better described, some methodological steps need to be clarified, and the reliability of some of the data processing steps need to be demonstrated. Several decisions made on quantitative thresholds used for the classification of ground points and further analyses are presented without clear justification or backed by external validation, which raises questions on the reliability of the raw datasets and their derived products used for analysis. Those major concerns are addressed below in point form, followed by minor comments.
Major comments
1) Classification of the digital surface model (DSM) into bare earth (DTM) models and canopy height model (CHM).
Classification of point clouds into ground/non-ground is a much-researched area, and the authors have passed very quickly over this important step, without providing enough details or convincing results that their simple scheme to separate ground from vegetation works well. The authors should provide additional information and results to demonstrate that their methodology is reliable, and better place it in the context of previous efforts to classify LIDAR or UAV photogrammetry point clouds or DSMs into ground/non-ground. Classifying ground/non-ground for UAV photogrammetric point clouds or derived DSMs would be challenging, and would require a better proof of concept before launching into analyzing the results. This is explained in more detail below.
A 0.5 m threshold was used to separate topographic variations in the 10 cm DSM that are due to vegetation from those due to local terrain. How did you choose this value, and how influential is it on the results? Did you validate somehow this simple ground classification scheme against independent observation of vegetation presence/absence, or RTK-GPS beneath shrubs for example?

The DTM was produced by interpolating DSM gridcells classified as ground (as derived from the above procedure), but no details on the interpolation procedure and its performance are given. Given that photogrammetry is expected not to work well for capturing sub-canopy terrain, even in leafless shrubs, one would expect to obtain few points below canopies, hence the DTM would rely heavily on interpolated values from the surrounding open terrain. Given that the point clouds were aggregated into a 10 cm DTM before classification, the sub-canopy ground points eventually captured by photogrammetry would be diluted by the higher elevation of surrounding shrub branches.

Given that the terrain can be locally higher beneath shrubs due to the accumulation of litter (e.g. Lamare et al., 2023), or locally lower if shrubs grow in small depressions, I am not convinced that the ground classification used is robust and thus would ask the authors to provide more details and supporting results that show the reliability of their ground classification and DTM production procedure, as this is a challenging and important step in this work. One starting example would be to show maps of ground point density (number of points / m2) in supplementary material, and some kind of k-fold cross-validation test of the interpolation procedure (i.e., how accurate is the prediction of left-out ground points under shrubs when interpolating from the available ground points).

2) UAV-photogrammetry and validation of the derived products.
Some important details are missing to better understand the quality of the derived topographic and snow depth products:
What was the mean ground-sampling distance (GSD) in each flight, compared to the 10 cm gridcell size used for aggregating/interpolating the point cloud to rasters? A table showing the specifications of the UAV flights (height, overlap, GSD, etc.) in SI would be useful.

The validation of the products appears somewhat simplistic. The transects presented in Fig. 2 are interesting to qualitatively assess the concordance between RTK-GPS surveys and UAV topographic surfaces, but do not give a quantitative view of the validation. Please provide scatterplots of measured (probed) vs UAV snow depths, and of RTK vs UAV elevations, with corresponding error metrics (bias, RMSE, R2, etc.). For example, I see quite a few discrepancies between RTK surveyed elevations and the UAV reconstructed ones… Based on Figure 2, I would not trust the statement made in L115 that the ‘overall quality of the dataset is high’.

Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the role of shrub height and topography in snow accumulation on low-Arctic tundra using UAV-borne lidar. Journal of Hydrometeorology, 24(5), 853-871.
A RMSE of 16 cm for snow depth is on the high side for the snow-on/off photogrammetric method. I guess what is important here would be to give some relative metrics, e.g. normalized RMSE. Is a 16 cm RMSE a large fraction of the average snow depth in the landscape?
The ‘Simple topographic model of snow depth variation (TM_SV)’ does not appear that simple from the information given. If I understood correctly, you fit a regression between directionally-filtered snow depth and terrain at various spatial scales (11-221 m) and then added these predictions to obtain TM_SV in equation 5? Some more rationale could be given for these methods… as stated presently, the reader is left with several questions about the validity of the approach. You could give some interpretation of the weights in Fig S3, and their R2… why are some of the R2 negative?

3) Machine learning model of snow depth variations
A boosted regression tree model of snow depth variation was produced using 2% of the datasets. As I understand this was due to the computational burden, but 15 000 points do not appear such a big dataset. Anyhow, the question arises on the reliability/stability of the derived model, and no formal quantitative validation is presented. One possibility is to repeat the model training a few times on different random samples to show that the results are robust to sampling, and/or validating the trained model on the left-out data (i.e. 98% of the whole set). A map of the bias within 100x100 m grid cells is presented in SI, but this is not enough to judge model performance. Please provide additional validation metrics (RMSE, r2) for the whole left-out set, and perhaps also map them within 100x100 m gridcells as done for the bias to see spatial patterns. It is unclear at present if the high r2 reported in the paper was calculated on the training sample, in which case it would be prone to overfitting, or on the validation set, which would give a more reliable performance measure.

Minor comments and edits
Shrub canopy snow trapping field: this is an interesting idea to consider the shrub neighbourhood. The influence of the shrub neighbourhood would also depend on prevailing wind directions; did you consider that? Please give some information on wind directions in the site description.
L31: 0 C : missing a degree symbol?
Figure 2 (Validation of digital terrain map and snow depth) says ‘Orthomosaic of the transects and surrounding area is shown on the left. ‘ I don’t see it, is it missing? It also says: ‘A snow depth map of the transects and surrounding area is shown in the middle’. I don’t see it. Seems the caption does not relate to the figure?
L122: a much-used topographic index for snow redistribution is the Winstral wind sheltering index (eg Winstral 2002; Dharmadasa et al., 2023), which you should at least mention. It also relies on the dominant wind direction, you may want to compare/contrast your approach to that.
Winstral, A., Elder, K., & Davis, R. E. (2002). Spatial snow modeling of wind-redistributed snow using terrain-based parameters. Journal of hydrometeorology, 3(5), 524-538.
Dharmadasa, V., Kinnard, C., & Baraër, M. (2023). Topographic and vegetation controls of the spatial distribution of snow depth in agro-forested environments by UAV lidar. The Cryosphere, 17(3), 1225-1246.
L135: equations 1-4: the x and y are not defined in text. Some symbology is strange: an underscore is used on z. should it not be the ‘bar’ symbol for average (as said in the text)?
The description of the procedure should be improved a bit, perhaps by explaining in words, before (or after) the equations, what they produce. You basically run a moving 1D window along two orthogonal directions with different window lengths (111 m to 221 m) and then calculate topographic deviations from these? Then you remove these from the basemap and reapply at the next coarser level? I wonder why digital filtering has not been used to remove the ‘coarse’ features? Perhaps a conceptual figure in the main text or SI could help the reader to better understand the newly proposed filtering step…
L144: ‘we take the top part of the watershed as representative of thermokarst patterned ground.’ => not evident why for the reader
L145: why using the 21 m scale?
L146-151: there are many choices made here on thresholds which are not explained nor validated anyhow.
L157, ‘by performing a linear fit’…
Fig S1 and S2 lack a scale bar to interpret the colour scale
Why is there a sharp boundary for the patterned ground in Fig S2? This seems arbitrary.
What are the units of the ‘weights’? Is it cm of snow depth per m of terrain deviation? Please explain better
L 176: ‘filtered maps’ : which ones? DTM?
L183-187: did you train a single model on 2% of the data or did you train a few ones? Is the model stable if you retrain on another 2% of the data? Please consider showing some sensitivity results in SI
L187: what is the ‘shrub potential’?
Fig. 3: add black stippled line to legend. Also, it is not clear what the first row is. I thought it was the ‘filtered topography’, but the y-label has the SD label to it, so are these snow depths or elevations?
L192: ‘topography varies across all scales while variation in snow depth is attenuated at scales larger than ~100 m (Fig. 3).’ I agree that the std of SD is more concentrated at smaller scales, but I also see this tendency for the topography (first row of fig3?). The exception is for terraces in the cross-slope direction?
L214: ‘Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale’: this statement would be better placed before the methods, to better explain the rationale of the methods developed.
L217: remove space before the period.
L234: What type of R2 is that? Is this a cross-validated R2? If it is only the squared correlation coefficient between the observations used for calibration and the model predictions, it is not a robust measure of performance. Please provide a robust assessment, i.e. evaluate your model on observations not used for training (you have 98% of your data available for this). Your Figure S5 is instructive in this way, but please provide validation metrics calculated on the data not used for training (RMSE, bias and R2). If this is what was done, explain it better.
Table S1: why including the 31 m scale topo map? And is this information not redundant with the filtered topographic map used to predict the TM_SV model?
L236-240: effects of shrubs. I am not fully convinced you can get an independent shrub trapping effect with this method. For this to work, the shrub potential would need to be fully independent from the other predictors. If some collinearity exists between the shrub potential and topography (or the topo-derived TM_SV), then the effect of topography (or TM_SV) could include shrub effects… The reverse is also true.
Figure 5. What are the units of the snow trapping in the two maps? Meters? Or is this the Shrub canopy snow trapping field? Please clarify.
Section 3.3. Perhaps I missed it in Methods, but it is unclear for me if you trained a new ML model for 2022 or if you applied the 2019 model to the 2022 data? Please clarify
Citation: https://doi.org/10.5194/egusphere-2023-968-RC2
- AC2:
  'Reply on RC2', Ian Shirley, 30 Sep 2023
  The article by Shirley et al. used UAV-derived terrain, vegetation height and snow depth maps to investigate scale-dependent relationships between these variables. Interactions between topography, terrain and vegetation are important drivers of soil temperatures and thus of permafrost occurrence/active-layer depths in the Arctic. They find that topographic variations on spatial scales <100 m explain much of the spatial variations of snow depth, with a super-imposed, but spatially-variable effect of snow trapping by shrubs. The effect of shrubs can be subdued by topography, as shrubs tend to grow in sheltered depressions, which also trap snow. The study introduces a novel ‘stack directional filtering’ approach which decomposes topographic and snow depth variations into discrete spatial scales in two orthogonal directions. This is an interesting and original approach which has the potential to improve our understanding of the complex associations between vegetation, terrain and snowpack in the Arctic.
  The study has a great potential for the community, providing that the underlying data can be proven to be reliable, which at this point I am not convinced of. As such, there are a number of issues to be resolved before this paper could be considered for publication. The study area should be better described, some methodological steps need to be clarified, and the reliability of some of the data processing steps need to be demonstrated. Several decisions made on quantitative thresholds used for the classification of ground points and further analyses are presented without clear justification or backed by external validation, which raises questions on the reliability of the raw datasets and their derived products used for analysis. Those major concerns are addressed below in point form, followed by minor comments.
  Major comments
  1) Classification of the digital surface model (DSM) into bare earth (DTM) models and canopy height model (CHM).
  Classification of point clouds into ground/non-ground is a much-researched area, and the authors have passed very quickly over this important step, without providing enough details or convincing results that their simple scheme to separate ground from vegetation works well. The authors should provide additional information and results to demonstrate that their methodology is reliable, and better place it in the context of previous efforts to classify LIDAR or UAV photogrammetry point clouds or DSMs into ground/non-ground. Classifying ground/non-ground for UAV photogrammetric point clouds or derived DSMs would be challenging, and would require a better proof of concept before launching into analyzing the results. This is explained in more detail below.
  A 0.5 m threshold was used to separate topographic variations in the 10 cm DSM that are due to vegetation from those due to local terrain. How did you choose this value, and how influential is it on the results? Did you validate somehow this simple ground classification scheme against independent observation of vegetation presence/absence, or RTK-GPS beneath shrubs for example?
  
  The DTM was produced by interpolating DSM gridcells classified as ground (as derived from the above procedure), but no details on the interpolation procedure and its performance are given. Given that photogrammetry is expected not to work well for capturing sub-canopy terrain, even in leafless shrubs, one would expect to obtain few points below canopies, hence the DTM would rely heavily on interpolated values from the surrounding open terrain. Given that the point clouds were aggregated into a 10 cm DTM before classification, the sub-canopy ground points eventually captured by photogrammetry would be diluted by the higher elevation of surrounding shrub branches.
  
  Given that the terrain can be locally higher beneath shrubs due to the accumulation of litter (e.g. Lamare et al., 2023), or locally lower if shrubs grow in small depressions, I am not convinced that the ground classification used is robust and thus would ask the authors to provide more details and supporting results that show the reliability of their ground classification and DTM production procedure, as this is a challenging and important step in this work. One starting example would be to show maps of ground point density (number of points / m2) in supplementary material, and some kind of k-fold cross-validation test of the interpolation procedure (i.e., how accurate is the prediction of left-out ground points under shrubs when interpolating from the available ground points).
  
  2) UAV-photogrammetry and validation of the derived products.
  Some important details are missing to better understand the quality of the derived topographic and snow depth products:
  What was the mean ground-sampling distance (GSD) in each flight, compared to the 10 cm gridcell size used for aggregating/interpolating the point cloud to rasters? A table showing the specifications of the UAV flights (height, overlap, GSD, etc.) in SI would be useful.
  
  The validation of the products appears somewhat simplistic. The transects presented in Fig. 2 are interesting to qualitatively assess the concordance between RTK-GPS surveys and UAV topographic surfaces, but do not give a quantitative view of the validation. Please provide scatterplots of measured (probed) vs UAV snow depths, and of RTK vs UAV elevations, with corresponding error metrics (bias, RMSE, R2, etc.). For example, I see quite a few discrepancies between RTK surveyed elevations and the UAV reconstructed ones… Based on Figure 2, I would not trust the statement made in L115 that the ‘overall quality of the dataset is high’.
  
  A RMSE of 16 cm for snow depth is on the high side for the snow-on/off photogrammetric method. I guess what is important here would be to give some relative metrics, e.g. normalized RMSE. Is a 16 cm RMSE a large fraction of the average snow depth in the landscape?
  Lamare, M., Domine, F., Revuelto, J., Pelletier, M., Arnaud, L., & Picard, G. (2023). Investigating the role of shrub height and topography in snow accumulation on low-Arctic tundra using UAV-borne lidar. Journal of Hydrometeorology, 24(5), 853-871.
  The ‘Simple topographic model of snow depth variation (TM_SV)’ does not appear that simple from the information given. If I understood correctly, you fit a regression between directionally-filtered snow depth and terrain at various spatial scales (11-221 m) and then added these predictions to obtain TM_SV in equation 5? Some more rationale could be given for these methods… as stated presently, the reader is left with several questions about the validity of the approach. You could give some interpretation of the weights in Fig S3, and their R2… why are some of the R2 negative?
  
  3) Machine learning model of snow depth variations
  A boosted regression tree model of snow depth variation was produced using 2% of the datasets. As I understand this was due to the computational burden, but 15 000 points do not appear such a big dataset. Anyhow, the question arises on the reliability/stability of the derived model, and no formal quantitative validation is presented. One possibility is to repeat the model training a few times on different random samples to show that the results are robust to sampling, and/or validating the trained model on the left-out data (i.e. 98% of the whole set). A map of the bias within 100x100 m grid cells is presented in SI, but this is not enough to judge model performance. Please provide additional validation metrics (RMSE, r2) for the whole left-out set, and perhaps also map them within 100x100 m gridcells as done for the bias to see spatial patterns. It is unclear at present if the high r2 reported in the paper was calculated on the training sample, in which case it would be prone to overfitting, or on the validation set, which would give a more reliable performance measure.
  
  We thank the reviewer for the very detailed and thorough review. Our responses to the major concerns of the reviewer are given here, and the minor comments and edits are addressed individually below. We have updated the manuscript to address these comments and concerns.
  We appreciate the opportunity to elaborate on our evaluation of the data products derived using photogrammetry, and have added a section in the manuscript (Section 2.3) that expands on the discussion of the comparison of UAV derived products to ground measurements. We have also added a figure in the supplementary information (Figure S1) that includes the suggested scatterplots and statistics. There are a number of potential problems with the snow probe measurements themselves, as outlined in the Lamare et al., 2023 paper referenced by the reviewer, which may lead to underestimation of the accuracy of the UAV dataset. These problems include snow probe penetration into the litter layer (overestimation of snow thickness), surface ice formation (underestimation of snow thickness), and positional accuracy of the snow probe measurements. Even with all of these potential issues, the UAV derived data products and ground measurements are in close agreement for the ground surface elevation (RMSE = 17 cm), the snow surface elevation (RMSE = 18 cm), and the snow depth (RMSE = 16 cm). These products are in good agreement even beneath patches of tall shrubs (Figure 2), demonstrating the validity of the approach used to generate the ground surface elevation using the June UAV flights. The 16 cm RMSE for the comparison of the ground measurements and UAV snow depth product is less than 15% of the mean probed snow depth along the transects (1.12 m), is well in line with previously published studies that used photogrammetry to derive snow depths (RMSE = 10 – 30 cm; Revuelto et al. 2021; Harder, Pomeroy, and Helgason 2020; Harder et al. 2016; Avanzi et al. 2018; Goetz and Brenning 2019), and is very close to the estimated RMSE of 0.15 m reported by Lamare et al., 2023, even though that study used lidar rather than photogrammetry. We feel that these comparisons demonstrate the high quality of the UAV derived products.
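
For reference, the relative error quoted above is the snow depth RMSE divided by the mean probed snow depth, i.e. the normalized RMSE the reviewer asks about. Using only the two values stated in this reply (the symbol for the mean probed depth is introduced here for illustration):

```latex
\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\overline{d}_{\mathrm{probe}}}
              = \frac{0.16\ \mathrm{m}}{1.12\ \mathrm{m}}
              \approx 0.143 \quad (\approx 14.3\,\%)
```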
  We will provide the processing reports from the AgiSoft reconstruction for each flight, which include information about GCP locations, overlap, etc., along with the UAV products in a data archive that will be made publicly available at https://ess-dive.lbl.gov/ upon publication of the manuscript. In general, photogrammetry is a well-established technique. This paper is not intended to be an in-depth investigation of the method itself, so we would like to limit the amount of highly technical detail. We recognize that some challenges and limitations with the photogrammetric method may arise depending on the landscape organization of the surveyed site. However, the accuracy assessment described above and included in the revised manuscript demonstrates that this method performs well in this watershed for each product (terrain elevation, vegetation surface, and snow surface).
  
  In response to the comments about the TM_SV, yes, the reviewer’s interpretation of the procedure is correct. We have expanded the description of the TM_SV in the methods to include the rationale behind this procedure (“Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale, we develop a simple model of snow depth variation (TM_SV) in the watershed by performing a linear fit of filtered snow maps at a given scale to filtered topography maps at that same scale.”), and interpretation of the weights (“the weights wi (with units m/m) can be interpreted as the deviation from local mean snow depth for a given terrain deviation at each scale i”). The intercept of the linear fit was forced to zero at each scale, which leads to the potential for a negative R2 (meaning that the linear fit is a worse description of the data than the mean of the data). However, after updating the analysis to include filtered maps at smaller scales (as suggested by Reviewer 1), the R2 values are no longer negative, even at the large scales. Since the weights approach zero at 90 m scales, the TM_SV model is not strongly affected by the inclusion of the scales that have low or even negative R2 values.
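
The point about negative R2 under a zero-intercept fit can be reproduced with a few lines of NumPy. The sketch below is illustrative only and is not the code used in the study: the function names and the four-point arrays are hypothetical, chosen so that the response is nearly constant and offset from zero, which is the situation in which a line forced through the origin describes the data worse than the data mean.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope w for y ~ w * x, with the intercept forced to zero."""
    return float(np.dot(x, y) / np.dot(x, x))

def r_squared(y, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = float(np.sum((y - y_pred) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only (not data from the study):
# y is nearly constant and offset from zero, so a line through the origin fits poorly.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.9, 2.1, 2.0])

w = fit_through_origin(x, y)
r2_origin = r_squared(y, w * x)

# Ordinary least-squares fit with a free intercept, for comparison.
slope, intercept = np.polyfit(x, y, 1)
r2_free = r_squared(y, slope * x + intercept)

print(f"w = {w:.3f}, R2 (zero intercept) = {r2_origin:.1f}, R2 (free intercept) = {r2_free:.2f}")
```

With a free intercept, R2 on the fitted data cannot drop below zero, because the mean of the data is itself one of the candidate fits; removing the intercept removes that guarantee.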
  
  We have also addressed the reviewer’s concerns regarding the machine learning model performance and repeatability in the updated manuscript. All statistics reported in the previous and current version of the manuscript are calculated using the entire dataset (i.e. the 2% used for training and the 98% left-out set). In the updated manuscript we have included an analysis of the repeatability of the machine learning model by training separate ensembles of ML models on ten randomly generated 15,000 point training sets for 2019 and 2022. Performance statistics (RMSE, MAB, R2) for each ML model are shown in Table S1 of the updated manuscript, and are extremely consistent across each ensemble. Further, maps of standard deviation shown in Figure S7 of the updated manuscript demonstrate strong spatial consistency across each ensemble. The ML ensemble mean, which exhibits considerably higher performance than any ensemble member, is used throughout the updated manuscript.
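
A repeatability check of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis in the manuscript: an ordinary least-squares fit stands in for the boosted regression trees, the predictors and response are synthetic, and all names (`train_member`, `predict`, the array sizes) are hypothetical. Each member is trained on a random 2% subsample and evaluated on the entire dataset, as described above.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between observations and predictions."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def train_member(X, y, idx):
    """Fit one ensemble member on rows idx (least squares as a stand-in model)."""
    A = np.column_stack([X[idx], np.ones(len(idx))])
    coef, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    return coef

def predict(coef, X):
    """Predict for all rows of X with a fitted member."""
    return np.column_stack([X, np.ones(len(X))]) @ coef

# Synthetic stand-in data (hypothetical; not the UAV products).
rng = np.random.default_rng(42)
n = 20000
X = rng.normal(size=(n, 3))
y = X @ np.array([0.3, -0.1, 0.05]) + rng.normal(scale=0.1, size=n)

n_members, frac = 10, 0.02
preds = []
for _ in range(n_members):
    # Each member sees a different random 2% of the points.
    idx = rng.choice(n, size=int(frac * n), replace=False)
    preds.append(predict(train_member(X, y, idx), X))
preds = np.array(preds)

# Statistics are computed on the full dataset (training points plus left-out points).
member_rmse = [rmse(y, p) for p in preds]
ensemble_rmse = rmse(y, preds.mean(axis=0))
spread = preds.std(axis=0)  # per-point standard deviation across members

print("member RMSE range:", min(member_rmse), max(member_rmse))
print("ensemble-mean RMSE:", ensemble_rmse)
```

By the triangle inequality, the RMSE of the ensemble mean cannot exceed the average RMSE of the members; how much it improves in practice depends on how strongly the members' errors are correlated.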
  
  Minor comments and edits
  Shrub canopy snow trapping field: this is an interesting idea to consider the shrub neighbourhood. The influence of the shrub neighbourhood would also depend on prevailing wind directions; did you consider that? Please give some information on wind directions in the site description.
  We understand the importance of wind direction for snow trapping by shrub canopies, but both wind direction and speed vary across the watershed, making it complicated to include this information in the model. The gradient of the shrub canopy snow trapping field is included in the machine learning model, which gives information about the directionality of this field that the ML model may use to infer how shrub canopy snow trapping is influenced by wind direction. We have included information on wind direction at the top of the site in Section 2.1 of the updated manuscript.
  L31: 0 C : missing a degree symbol?
  Thank you for pointing this out, we have corrected it in the updated manuscript.
  Figure 2 (Validation of digital terrain map and snow depth) says ‘Orthomosaic of the transects and surrounding area is shown on the left.’ I don’t see it; is it missing? It also says: ‘A snow depth map of the transects and surrounding area is shown in the middle’. I don’t see it. It seems the caption does not relate to the figure?
  We apologize for the confusion. The legend refers to an earlier version of the figure. We have corrected it in the updated manuscript.
  L122: a much-used topographic index for snow redistribution is the Winstral wind sheltering index (e.g. Winstral et al., 2002; Dharmadasa et al., 2023), which you should at least mention. It also relies on the dominant wind direction; you may want to compare/contrast your approach to that.
  Winstral, A., Elder, K., & Davis, R. E. (2002). Spatial snow modeling of wind-redistributed snow using terrain-based parameters. Journal of Hydrometeorology, 3(5), 524-538.
  Dharmadasa, V., Kinnard, C., & Baraër, M. (2023). Topographic and vegetation controls of the spatial distribution of snow depth in agro-forested environments by UAV lidar. The Cryosphere, 17(3), 1225-1246.
  Thank you for this suggestion. We have included the following discussion of this important index in Section 2.5 of the updated manuscript: “Other indices like the Winstral wind sheltering index (Winstral, Elder, and Davis 2002; Dharmadasa, Kinnard, and Baraër 2023), are calculated based on the dominant wind direction, which varies across the watershed and throughout the winter, and cannot account for snow redistribution resulting from turbulence created in front of a topographic obstruction.”
  L135: equations 1-4: the x and y are not defined in the text. Some symbology is strange: an underscore is used on z. Should it not be the ‘bar’ symbol for average (as said in the text)?
  Thank you for pointing this out, we have corrected it in the updated manuscript.
  The description of the procedure should be improved a bit, perhaps by explaining better in words, before (or after) the equations, what the equations produce. You basically run a moving 1D window along two orthogonal directions with different window lengths (111 m to 221 m) and then calculate topographic deviations from these? Then you remove these from the basemap and reapply at the next coarser level? I wonder why digital filtering has not been used to remove the ‘coarse’ features? Perhaps a conceptual figure in the main text or SI could help the reader to better understand the newly proposed filtering step…
  Thank you for this comment. We have improved the wording and added a schematic that describes this filtering approach (Figure 3 in the updated manuscript). Generally, this approach can be considered a digital filtering approach, although it differs from most other digital filters in that it preserves edges in orthogonal directions and can extract information within a window of length scales (e.g. the filtered map for 30 m gives the information at scales larger than 20 m but smaller than 30 m). These are important features for our application because the topographic features in the studied watershed have sharp edges and vary in size. In addition, it allows us to create the TM_SV, which relates variation in snow depth at a certain scale to variation in topography at that same scale.
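  To illustrate the general idea of extracting information between successive window lengths, the sketch below peels deviations off a small grid scale by scale using 1D moving means along two orthogonal directions. It is a simplified stand-in and does not reproduce Equations 1-4 of the manuscript: the moving mean, the averaging of the two directions, the window lengths, and all function names are assumptions made for this example, and the moving mean used here does not have the edge-preserving property described above.

```python
def moving_mean_1d(values, window):
    """Centered moving mean with the window truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        seg = values[max(0, i - half): i + half + 1]
        out.append(sum(seg) / len(seg))
    return out


def smooth_two_directions(grid, window):
    """Average of 1D moving means taken along rows and along columns."""
    along_rows = [moving_mean_1d(row, window) for row in grid]
    cols = [moving_mean_1d(list(col), window) for col in zip(*grid)]
    along_cols = [list(row) for row in zip(*cols)]
    return [[0.5 * (a + b) for a, b in zip(ra, rb)]
            for ra, rb in zip(along_rows, along_cols)]


def multiscale_deviations(grid, windows):
    """Peel off deviations scale by scale, from fine to coarse windows."""
    current = [row[:] for row in grid]
    bands = {}
    for window in sorted(windows):
        smooth = smooth_two_directions(current, window)
        # Deviation of the current map from its smoothed version.
        bands[window] = [[c - s for c, s in zip(rc, rs)]
                         for rc, rs in zip(current, smooth)]
        current = smooth  # remove this band before the next, coarser scale
    return bands, current


# Hypothetical 6 x 6 elevation grid: a step between rows plus a gentle slope.
grid = [[float(r // 3) + 0.1 * c for c in range(6)] for r in range(6)]
bands, residual = multiscale_deviations(grid, [3, 5])
```

  By construction, the band maps and the final residual sum back to the original grid, so each band holds only the variation removed between one window length and the next.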
  L144: ‘we take the top part of the watershed as representative of thermokarst patterned ground.’ => not evident why for the reader
  L145: why using the 21 m scale?
  L146-151: there are many choices made here on thresholds which are not explained nor validated anyhow.
  Why is there a sharp boundary for the patterned ground in Fig S2? This seems arbitrary.
  In response to the previous four comments, we have rearranged the methods section to avoid confusion for the reader. The comparison of snow depth variation between the different types of topographic features is the main goal of the classification. Therefore, we decided to refer the reader to our knowledge of the site in classifying regions characterized by each topographic feature, rather than methods based on the topographic maps alone. The updated manuscript avoids the use of arbitrary thresholds and scales, and instead includes the following line:
  “Using our knowledge of the site, we identify regions in the watershed that are characterized by each topographic feature”
  L157, ‘by performing a linear fit’…
  Thank you for this suggestion, we have updated the manuscript accordingly.
  Fig S1 and S2 lack a scale bar to interpret the colour scale
  Thank you for this suggestion, we have added colorbars to these figures.
  What are the units of the ‘weights’? Is it cm of snow depth per m of terrain deviation? Please explain better
  We appreciate the suggestion to provide further clarification. We have added units to the y-axis of Figure S3 and included the following explanation at the end of Section 2.5:
  “where the weights wi (with units m/m) can be interpreted as the deviation from local mean snow depth for a given terrain deviation at each scale i.”
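  One way to write the relationships described in this response is given below. The symbols are assumptions introduced here for illustration: S_i and T_i denote the filtered snow depth and filtered topography maps at scale i, and the closed form for w_i assumes an ordinary least-squares fit with the intercept forced to zero. The manuscript's own notation may differ.

```latex
\begin{align}
  S_i(x, y) &\approx w_i \, T_i(x, y), \\
  w_i &= \frac{\sum_{x,y} T_i(x, y)\, S_i(x, y)}{\sum_{x,y} T_i(x, y)^2}, \\
  \mathrm{TM\_SV}(x, y) &= \sum_i w_i \, T_i(x, y).
\end{align}
```

  Because S_i and T_i are both in metres, each w_i is dimensionless in the sense of m/m, and the TM_SV is a weighted sum of the filtered topographic maps.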
  
  L 176: ‘filtered maps’ : which ones? DTM?
  Thank you for pointing this out. We have clarified in the updated manuscript that the filtered topographic maps were used to train the ML model.
  L183-187: did you train a single model on 2% of the data or did you train a few ones? Is the model stable if you retrain on another 2% of the data? Please consider showing some sensitivity results in SI
  Each model is trained on 2% of the data and tested on the full dataset. In the updated manuscript, we have trained an ensemble of 10 ML models for both 2019 and 2022. The statistics and maps of standard deviation across the watershed are shown in Table S1 and Figure S7 of the revised manuscript, respectively.
  L187: what is the ‘shrub potential’?
  We apologize for the confusion. This is a term used in a previous draft of the paper. We have corrected it to read ‘shrub canopy trapping field.’
  Fig. 3: add black stippled line to legend. Also, it is not clear what the first row is. I thought it was the ‘filtered topography’, but the y-label has the SD label on it, so are these snow depths or elevations?
  Thanks for this suggestion. We have added the black stippled line to the legend in the updated manuscript. The mean standard deviation of 100 m x 100 m windows is shown in each panel of this figure for the different products. We have tried to make this clearer by changing the SD (which can be confused with snow depth) in the y-axes labels to STD.
  L192: ‘topography varies across all scales while variation in snow depth is attenuated at scales larger than ~100 m (Fig. 3).’ I agree that the std of SD is more concentrated at smaller scales, but I also see this tendency for the topography (first row of fig3?). The exception is for terraces in the cross-slope direction?
  Thank you for the opportunity to clarify this point. Indeed, both the topography and the snow depths have more variation at smaller scales than at larger scales. However, whereas the variation in snow depth goes to zero at scales larger than 90 m, the variation in topography remains larger than zero at all scales across all topographic features. It is this difference between zero and non-zero that we are emphasizing in this sentence.
  L214: ‘Developed with the hypothesis that snow depth variation at a given spatial scale is related to topographic variation at the same scale’: this statement would be better placed before the methods, to better explain the rationale of the methods developed.
  Thank you for this suggestion. We have moved this line to the beginning of the section that describes the TM_SV in the methods.
  L217: remove space before the period.
  Thank you for pointing out this typo. We have corrected it in the updated manuscript.
  L234: What type of R² is that? Is this a cross-validated R²? If it is only the squared correlation coefficient between the observations used for calibration and the model prediction, it is not a robust measure of performance. Please provide a robust assessment, i.e. testing your model on observations not used for training (you have 98% of your data available for this). Your Figure S5 is instructive in this way, but please provide validation metrics calculated on the data not used for training (RMSE, bias and R²). If this is what was done, explain it better.
  The machine learning performance metrics reported in both the original and updated manuscripts were calculated using the entire dataset, not just the points used for training. We have emphasized this in the updated version of the manuscript and included Table S1, which details performance metrics for each ensemble member for both 2019 and 2022.
  Table S1: why including the 31 m scale topo map? And is this information not redundant with the filtered topographic map used to predict the TM_SV model?
  Table S1 shows only the most important variables used by the machine learning models to predict snow depth. Many more variables were used for model training, including all of the available topographic information, but these first variables were the ones that the machine learning models rely on the most. The TM_SV model is indeed redundant with the filtered topographic maps that are also included as predictor variables (since the TM_SV is simply a weighted sum of these maps), but the machine learning algorithm is more efficient when the information is pre-organized in a useful way like this. Plus, the strong reliance of the machine learning models on the TM_SV is further proof that the TM_SV is an effective translation of the topographic information into local variation in snow depth.
  L236-240: effects of shrubs. I am not fully convinced you can get an independent shrub trapping effect with this method. For this to work, the shrub potential would need to be fully independent from the other predictors. If some collinearity exists between the shrub potential and topography (or the topo-derived TM_SV), then the effect of topography (or TM_SV) could include shrub effects… The reverse is also true.
  While there may be some association between tall shrubs and topographic depressions, there are many areas in the watershed where there are no tall shrubs. The machine learning algorithm can develop relationships between topography and snow depth using data from these areas which are completely independent of shrub impacts. In this way, the impact of shrubs and topography on snow depth can be separated. Before performing this analysis, we actually expected that there would be a stronger association between tall shrub patches and topographic lows. However, close examination of the map of TM_SV (Figure 5 in the updated analysis) shows almost no relationship between the footprints of shrub patches A and B with the underlying topography and perhaps a minor association for shrub patch C (although not enough to account for the significant snow accumulation there). For these reasons we are quite confident that the shrub patches are accumulating snow, although we do acknowledge that we cannot independently evaluate the performance of the machine learning predictions of snow trapping by shrub canopies.
  Figure 5. What are the units of the snow trapping in the two maps? Meters? Or is this the Shrub canopy snow trapping field? Please clarify.
  Thank you for pointing out this omission. We have included the units of the snow trapping in the two maps (m) in the updated version of the manuscript.
  Section 3.3. Perhaps I missed it in Methods, but it is unclear for me if you trained a new ML model for 2022 or if you applied the 2019 model to the 2022 data? Please clarify
  We trained separate machine learning models for 2019 and 2022. We have clarified this both in the methods and in Section 3.3.
  
  Citation: https://doi.org/10.5194/egusphere-2023-968-AC2

Supplement

https://doi.org/10.5194/egusphere-2023-968-supplement

Short summary

Snow depth has a strong impact on soil temperatures and carbon cycling in the Arctic. Because of this, we want to understand why snow is deeper in some places than in others. Using cameras mounted on a drone, we mapped snow depth, vegetation height, and elevation across a watershed in Alaska. In this paper, we develop novel techniques using image processing and machine learning to characterize the influence of topography and shrubs on snow depth in the watershed.

