the Creative Commons Attribution 4.0 License.
Use of multiple reference data sources to cross validate gridded snow water equivalent products over North America
Abstract. We use snow course and airborne gamma data available over North America to compare the validation of gridded snow water equivalent (SWE) products when evaluated with one reference dataset versus the other. We assess product performance across both non-mountainous and mountainous regions, determining the sensitivity of relative product rankings and absolute performance measures. In non-mountainous areas, product performance is insensitive to the choice of SWE reference dataset (snow course or airborne gamma): the validation statistics (bias, unbiased root mean squared error, correlation) are consistent with one another. In mountainous areas, the choice of reference dataset has little impact on relative product ranking but a large impact on assessed error magnitudes (bias and unbiased root mean squared error). Further analysis indicates the agreement in non-mountainous regions occurs because the reference SWE estimates themselves agree up to spatial scales of at least 50 km, comparable to the grid spacing of most available SWE products. In mountain areas, there is poor agreement between the reference datasets even at short distances (< 5 km). We determine that differences in assessed error magnitudes result primarily from the range of SWE magnitudes sampled by each method, although their respective spatiotemporal distribution and elevation differences between the reference measurements and grid centroids also play a role. We use this understanding to produce a combined reference SWE dataset for North America, applicable for future gridded SWE product evaluations and other applications.
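The three validation statistics named in the abstract (bias, unbiased root mean squared error, correlation) can be illustrated with a minimal sketch. It assumes paired product and reference SWE values in millimetres and uses the conventional definitions; the function name `validation_stats` and the sample numbers are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def validation_stats(product_swe, reference_swe):
    """Return (bias, unbiased RMSE, Pearson r) for paired SWE values in mm."""
    p = np.asarray(product_swe, dtype=float)
    r = np.asarray(reference_swe, dtype=float)
    # Drop pairs where either value is missing.
    valid = ~(np.isnan(p) | np.isnan(r))
    p, r = p[valid], r[valid]
    diff = p - r
    bias = diff.mean()                             # mean error (product minus reference)
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))  # RMSE after removing the mean error
    corr = np.corrcoef(p, r)[0, 1]                 # Pearson correlation coefficient
    return bias, ubrmse, corr

# Made-up illustrative values, not data from the study.
product = [110.0, 95.0, 210.0, 60.0]
reference = [100.0, 100.0, 200.0, 50.0]
bias, ubrmse, corr = validation_stats(product, reference)
print(f"bias={bias:.2f} mm, ubRMSE={ubrmse:.2f} mm, r={corr:.3f}")
```

Under these definitions the squared unbiased RMSE equals the squared RMSE minus the squared bias, which is why the two error measures can respond differently to a change of reference dataset.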
Status: closed
-
CC1: 'Comment on egusphere-2023-3013', Jeff Dozier, 14 Jan 2024
The name of one of the authors (Carrie Vuyovich) is not correctly spelled!
Citation: https://doi.org/10.5194/egusphere-2023-3013-CC1 -
AC1: 'Reply on CC1', Colleen Mortimer, 24 May 2024
Thanks Jeff Dozier for catching this error. It will be fixed in the revised manuscript.
Citation: https://doi.org/10.5194/egusphere-2023-3013-AC1
-
RC1: 'Comment on egusphere-2023-3013', Anonymous Referee #1, 17 Mar 2024
In this work, the authors used snow course and airborne gamma data from North America to comprehensively evaluate the performance of gridded snow water equivalent products in both mountainous and non-mountainous areas at various spatial and temporal scales. Additionally, a combined reference SWE dataset for North America was produced. It's a challenging but promising endeavor. Here, I provide some comments and suggestions for the authors’ consideration when revising the paper.
Comments:
It is easy for the snow water equivalent in mountainous areas to exceed 1000 mm. However, SWE values exceeding 1000 mm were excluded from the validation in the manuscript. Did the authors calculate how much data these exclusions removed, and did the exclusions affect the accuracy assessment of SWE in mountainous areas, where snow depth tends to be very large? I suggest the authors add a discussion of this point in the relevant section on uncertainty.
For UA SWE, the product with the highest accuracy, the scatter plot does not show an obvious clustering of points in the low-value range. I would therefore like to know whether the number of validation points in Figure 1 is the same for each SWE product. Please give the total number of validation points.
Line 125: How did the authors retain two-thirds of these sites, and what criteria were used for retention?
Line 133, the authors highlighted the importance of preventing oversampling in spatially dense areas by limiting the sampling of snow course and gamma SWE to 100 km. However, considering that the resampled grid surpasses the dimensions of certain SWE datasets, could this potentially introduce additional sampling errors that might impact the validation results?
Is Figure 2 a scatter plot obtained by sampling the snow course and gamma SWE to 100 km? Why not choose a smaller scale? Will this affect the accuracy of verification results?
The abbreviations of ESn and CE5 in Figure 2 do not correspond to the abbreviations in Table 1. Please review the product abbreviations throughout the document and make sure they are aligned.
Lines 133-135: "Sensitivity analysis of various spatial aggregation distances between 50 and 200 km showed little impact of aggregation distance. We selected 100 km as a compromise between sample size and spatial distribution". Where can the reader see the "little impact of aggregation distance"? Could the sensitivity analysis results be added? Further explanation is also needed of the basis for selecting the 50-200 km range of aggregation distances.
Since a combined North American reference SWE dataset is ultimately constructed in the article, please consider whether an additional analysis is needed that uses this dataset to validate the gridded SWE products mentioned in the article.
Citation: https://doi.org/10.5194/egusphere-2023-3013-RC1
AC2: 'Reply on RC1', Colleen Mortimer, 24 May 2024
-
RC2: 'Comment on egusphere-2023-3013', Simon Gascoin, 17 Apr 2024
This study presents an evaluation of several SWE products with spatial resolutions ranging from 4 km to over 100 km. These SWE products are constantly evolving, and it is very useful to have an up-to-date assessment of their strengths and weaknesses, in particular to assess the impact of climate change on global snow mass. The reference data are airborne gamma and snow courses, which represents a novelty compared to previous studies (L60 "a unified assessment of gridded SWE products using both reference datasets is lacking"). The analyses are the result of significant work, since 14 products were evaluated over a vast area with a large data set. This work is therefore of notable interest. In my opinion, Figure 2 alone is useful enough for this work to be published. However, the rest of the study is much less convincing. The authors decided to analyze the impact of the reference dataset on the evaluation results. This gives, e.g., scatterplots of correlation coefficients with legends indicating correlations of correlations (Fig. 5), which are difficult to understand and, above all, of an interest that escapes me. What do we learn about the SWE products from this? Next, the authors conclude their study by presenting a combined benchmark dataset that is generated by an aggregation method that I did not understand*. The end of the reading leaves me with the impression that the authors used products to evaluate the validity of the observations which will ultimately be used to evaluate these products (cf. Sect. 5.1, 5.3). Maybe I didn't understand correctly, but isn't there a form of circular reasoning here? Another question: a product can be evaluated by observations of different types (with uncertainties and specific spatial characteristics), so why aggregate these data in a multi-source composite? Aggregating risks losing knowledge of the error associated with each observation.
In situ measurements of SWE in mountain areas can vary drastically on the scale of a few kilometers. The problem does not come from the observations but from the evaluated products, which give a representation of the snow cover on a smoothed landscape. Taking into account the spatial variability of mountain SWE, which is documented in numerous studies (a bibliographic analysis on this subject would have been useful), the date-to-date comparison of a SWE value in a region of 50 km x 50 km with a SWE value obtained by snow course seems very random. In fact, a snow course value taken in a region of 50 km x 50 km can be seen as a random draw from a SWE distribution which would likely extend from 0 to >500 mm w.e. The representativeness of this measurement can be assessed using the SWE semi-variogram (SVG). If the range of the SVG of the SWE is close to the resolution of the model, then the comparison is well founded. Otherwise, one way to overcome these known biases could be to select observations which have altitudes close to the altitude of the model grid, or to consider the anomaly of the SWE relative to an interannual average SWE in order to remove the first-order effect of the topography. Another option would be to consider the higher-resolution Arizona dataset as the reference (after independently evaluating it using in situ data, unless this has already been done), aggregating that reference onto the grid of each hemispheric product to facilitate their evaluation, including stratifying the residuals by elevation, land cover, etc.
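The semi-variogram check suggested in this comment can be sketched as follows. This is a minimal illustration of the classical empirical estimator (half the mean squared difference of all site pairs within a distance bin), assuming site coordinates already projected to kilometres; the function name `empirical_semivariogram`, the bin edges, and the sample values are hypothetical, and nothing here is taken from the manuscript.

```python
import numpy as np

def empirical_semivariogram(x_km, y_km, swe_mm, bin_edges_km):
    """Return (semivariance per distance bin in mm^2, number of pairs per bin)."""
    x = np.asarray(x_km, dtype=float)
    y = np.asarray(y_km, dtype=float)
    swe = np.asarray(swe_mm, dtype=float)
    edges = np.asarray(bin_edges_km, dtype=float)

    # All unique site pairs (i < j), their separation and squared SWE difference.
    i, j = np.triu_indices(len(x), k=1)
    dist = np.hypot(x[i] - x[j], y[i] - y[j])
    sqdiff = (swe[i] - swe[j]) ** 2

    n_bins = len(edges) - 1
    semivar = np.full(n_bins, np.nan)   # NaN where a bin has no pairs
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (dist >= edges[b]) & (dist < edges[b + 1])
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            semivar[b] = 0.5 * sqdiff[in_bin].mean()
    return semivar, counts

# Made-up sites on a line, 1 km apart, with SWE increasing by 2 mm per km.
semivar, counts = empirical_semivariogram(
    [0.0, 1.0, 2.0], [0.0, 0.0, 0.0], [0.0, 2.0, 4.0], [0.5, 1.5, 2.5]
)
print(semivar, counts)
```

The distance at which the semivariance levels off (the range) would then be compared with the product grid spacing; estimating that range requires fitting a variogram model, which is outside this sketch.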
* This product is formed by a method that I do not understand: L132 "To avoid oversampling specific grid cells, we first aggregated reference sites within the same product grid cell (at the native resolution of the product grid) before aggregating to the 100 km spacing." See also Section 3.3.2: I have read this part several times and am unable to understand what is being done. It would have been useful to share the source code of the analyses (what do aggregation and resampling mean? average, median, bilinear interpolation? how is the centroid of the new data defined?)
In conclusion, I think that the authors should rework their article in order to clarify their scientific objective, but I am convinced that the analyses already carried out have great value for the scientific community which studies snow mass on a global scale.
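One possible reading of the quoted two-stage aggregation is sketched below. It is not the authors' method: as the comment points out, the manuscript does not specify the aggregation operator or how the centroid is defined. The sketch assumes an arithmetic mean at both stages, regular square cells on projected coordinates in kilometres, and a centroid equal to the mean of the member coordinates; all function names and sample values are hypothetical.

```python
import numpy as np
from collections import defaultdict

def cell_index(x_km, y_km, spacing_km):
    """Index of the square cell of the given spacing that contains the point."""
    return (int(np.floor(x_km / spacing_km)), int(np.floor(y_km / spacing_km)))

def mean_by_cell(points, spacing_km):
    """Average (x_km, y_km, swe_mm) tuples that fall in the same cell."""
    groups = defaultdict(list)
    for x, y, swe in points:
        groups[cell_index(x, y, spacing_km)].append((x, y, swe))
    # Assumption: mean of coordinates as the new centroid, mean SWE as the value.
    return [tuple(np.mean(members, axis=0)) for members in groups.values()]

def two_stage_aggregate(sites, product_spacing_km, coarse_spacing_km=100.0):
    """Stage 1: mean per product grid cell. Stage 2: mean per coarse cell."""
    per_product_cell = mean_by_cell(sites, product_spacing_km)
    return mean_by_cell(per_product_cell, coarse_spacing_km)

# Made-up sites: two share a 25 km product cell, a third lies in another cell.
sites = [(1.0, 1.0, 100.0), (2.0, 2.0, 200.0), (60.0, 60.0, 300.0)]
print(two_stage_aggregate(sites, product_spacing_km=25.0))
```

In this example the two-stage result is 225 mm, whereas a direct mean of the three sites would be 200 mm: averaging within product cells first prevents the densely sampled cell from dominating the 100 km value.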
Minor comments
- Fig 7. I don't understand why the "full domain" histogram has lower values than the "restricted" histogram (e.g. in the 100-150 mm bin)
- L80: Coterminus
- L174: "In mountain regions, large changes in elevation over short distances are common. (..) SWE decreases due to wind redistribution" A more in-depth bibliographic analysis on this subject in the introduction would be useful. By definition, “redistribution” does not reduce the SWE in average but increases its spatial variability. Think about precipitation gradients, blowing snow sublimation, avalanches, etc.
- Fig. 2 legend: logarithmic or lognormal?
- Fig. 4: I would add the units to the RMSE and bias
- What are the t-tests used for in this study? I missed it.
- Fig. 8 is missing the x axis label
- L338: the bias decreases not increases (it is negative)
- there are two sections 5.2
Citation: https://doi.org/10.5194/egusphere-2023-3013-RC2
AC3: 'Reply on RC2', Colleen Mortimer, 24 May 2024