A framework for evaluating ice sheet altimetry uncertainty estimates
Abstract. For three decades, ice sheet elevation records from satellite radar altimetry have provided new insight into the state of the cryosphere and its contribution to global sea-level rise. The availability of robust, consistent and traceable uncertainties alongside ice sheet elevation data is crucial for combining measurements across missions and enabling their use in reconciling estimates of ice sheet mass balance and constraining numerical ice sheet models. At present, such uncertainties are largely absent from existing Level 2 ice sheet elevation products, and for the subset of products where uncertainties are provided, there is neither a standardised approach to uncertainty generation nor a method to evaluate their robustness. Here, we develop a novel uncertainty evaluation framework and provide a comprehensive assessment of uncertainty generation for altimetry-based ice sheet elevations. Overall, we find that calculating uncertainty as a parameterisation of topographic complexity (characterised by surface slope and roughness) and measurement quality (characterised by backscattered power and coherence) improves performance relative to solutions that use fewer co-variates. Ultimately, the framework presented here will enable the systematic characterisation of ice sheet elevation uncertainties associated with historical, current and future polar radar altimeter missions, including the Copernicus Polar Ice and Snow Topography Altimeter (CRISTAL). Such information will aid the successful combination of altimetry measurements across missions, improve the constraint of numerical ice sheet models, and enable more certain estimates of current and future ice sheet mass balance and global sea-level rise.
General Comments
This manuscript presents a framework for evaluating and improving uncertainty estimates in satellite radar altimetry measurements over ice sheets. The topic is timely and highly relevant, as robust and traceable uncertainty estimates are essential for combining multi-mission datasets and for reliable assessments of ice sheet mass balance and sea-level rise. In this respect, the study addresses an important gap in the current literature.
Overall, the manuscript shows strong scientific potential. The work fills a gap in the scientific literature and could have a high scientific impact. At the same time, I believe it would benefit from further clarification and refinement before it can be fully assessed. In particular, clearer definitions of key concepts, along with a more structured presentation of the framework and methodology, would significantly improve readability and accessibility.
I found the section “Uncertainty Evaluation Framework” somewhat difficult to follow in its current form. The purpose of the section is not entirely clear, as it appears to serve partly as an introduction and partly as a method description, without fully achieving either. It might be helpful to either (i) integrate a concise overview of the framework into the Introduction (e.g., as a structured summary or bullet points), or (ii) expand this section within the Methods to provide a clearer step-by-step description, including definitions, inputs, and outputs.
In addition, it would strengthen the manuscript to clearly define the chosen metrics and covariates early on, as these form the backbone of the framework but are not always consistently introduced. Also, as this is an uncertainty study, please state clearly what uncertainty, robustness, etc. mean. For example, does robustness refer to stability across configurations, agreement with residuals, or bias minimisation? The word is used, particularly in Section 2, without a definition, which only appears later in the Methods.
The decision to construct a look-up table based on a single reference year (2020) would benefit from further justification. Given known temporal and spatial variations in the datasets (e.g., ICESat-2 sampling), it would be useful to discuss potential implications of this choice, including any seasonal or regional biases. Also, it is not clear to me whether the resulting uncertainties are derived solely from 2022 data. Furthermore, I am missing the reasoning behind choosing exactly these years.
The manuscript aims to cover a broad range of analyses, including comparisons across multiple CryoSat-2 SARIn mode datasets and Sentinel-3 observations over both Greenland and Antarctica. While the motivation for this comprehensive approach is clear and appreciated, there are instances where the level of detail, particularly regarding how the uncertainties evolve in space and time, feels somewhat limited. At present, the scope appears quite ambitious for a single paper, and the material could potentially support multiple focused studies. I am not suggesting that the manuscript should be split, nor that its length should be substantially increased. Rather, I encourage the authors to consider whether certain aspects could be treated with slightly more depth or clarity, possibly by refining the focus or guiding the reader more explicitly through the key uncertainty-related findings.
Specific Comments
Abstract
The abstract could be strengthened by including more specific findings (e.g., comparisons between POCA and swath processing, Sentinel-3 and CryoSat-2, or regional differences between Antarctica and Greenland), and by stating the years the analysis covers.
Introduction
Consider adding a short paragraph summarising existing approaches to uncertainty estimation in satellite altimetry (e.g., cross-over analysis, external validation with GNSS/airborne lidar/DEMs, inter-mission comparisons, statistical approaches).
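To be concrete about what such a paragraph could cover: one of the approaches listed above, cross-over analysis, amounts to summarising elevation differences at ascending/descending track intersections. A minimal, purely illustrative sketch with synthetic inputs (not the authors' implementation):

```python
import numpy as np

def crossover_spread(dh):
    """Summarise elevation differences at track crossovers.

    dh: ascending-minus-descending elevation differences (m) at crossover
    locations. Returns the median difference and a robust spread (half the
    interquartile range), one common way of characterising altimeter
    precision from crossovers.
    """
    dh = np.asarray(dh, dtype=float)
    dh = dh[np.isfinite(dh)]
    q25, q50, q75 = np.percentile(dh, [25, 50, 75])
    return q50, 0.5 * (q75 - q25)

# Synthetic crossover differences with 0.3 m noise, for illustration only.
rng = np.random.default_rng(0)
print(crossover_spread(rng.normal(0.0, 0.3, size=10_000)))
```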
Uncertainty Evaluation Framework
L61–62: The statement regarding the absence of approaches to assess uncertainty may be somewhat strong; several studies have addressed this topic, even if not within a unified framework (see comment from the introduction).
L66–71: The choice of 2020 (and 2022?) for the look-up table would benefit from additional explanation.
L69: Please clarify which datasets are included in the look-up table.
L70: It would be helpful to clarify what is meant by “unseen,” and to specify the reference dataset and how elevation differences are computed (e.g., crossovers, gridding, etc.).
L74: It is not entirely clear how the uncertainty values are derived (e.g., from the 2020 dataset).
L77: Please define the lower and upper quartiles explicitly.
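For clarity on what I am asking to be specified in this section, the sketch below shows my reading of the look-up-table approach: reference-minus-altimetry elevation differences from the training year are binned by the covariates, and a quartile-based spread is tabulated per bin, to be assigned to unseen data. All variable names, bin edges, and the choice of half the interquartile range are hypothetical, not taken from the manuscript:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical 2020 "training" residuals: elevation difference to the
# reference (e.g. ICESat-2) plus two of the covariates used for binning.
train = pd.DataFrame({
    "dh": rng.normal(0.0, 0.5, 5000),        # metres
    "slope": rng.uniform(0.0, 2.0, 5000),    # degrees
    "power": rng.uniform(-25.0, -5.0, 5000)  # dB
})

def quartile_spread(x):
    # Half the interquartile range: (Q3 - Q1) / 2, with Q1 and Q3 the lower
    # and upper quartiles (25th and 75th percentiles) of the binned differences.
    q1, q3 = np.percentile(x, [25, 75])
    return 0.5 * (q3 - q1)

# Bin by covariates and tabulate the spread per bin; this table would then be
# used to assign an uncertainty to "unseen" (e.g. 2022) measurements.
lookup = train.groupby(
    [pd.cut(train["slope"], [0.0, 0.5, 1.0, 2.0]),
     pd.cut(train["power"], [-25.0, -15.0, -5.0])],
    observed=True,
)["dh"].apply(quartile_spread)
print(lookup)
```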
Data
Altimetry data section: Please specify which CryoSat-2 and Sentinel-3 elevations/backscatter are used. Are the Sentinel-3 elevations from L2?
L128: “Binning” is introduced here; consider introducing this concept earlier, in Section 2.
L131: Please provide a reference for “these variables have been shown…”.
It may be helpful to comment more explicitly on the differences in spatial coverage among ICESat-2, CryoSat-2, and Sentinel-3, as these differences complicate direct comparisons.
Methods
Table 1: Additional clarification of the metrics (e.g., elevation differences, confidence intervals) would help the reader.
L168–172: It might be worth discussing whether temporal sampling affects the results, and whether footprint differences (e.g., ICESat-2 vs. CryoSat-2) play a role.
L179–181: This statement would benefit from further clarification.
L203: Please elaborate on the motivation and implications of the different approach used here.
L225, L249, L285, ...: Ensure consistent use of “median,” particularly where it is used as shorthand for “median absolute difference.” I found this shorthand potentially confusing; a clearer distinction in terminology would help avoid ambiguity for the reader.
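To illustrate the ambiguity: the median of the differences and the median of the absolute differences are distinct statistics and can differ substantially, e.g. (hypothetical numbers):

```python
import numpy as np

dh = np.array([-1.2, -0.4, 0.1, 0.3, 0.9])  # hypothetical elevation differences (m)

median_dh = np.median(dh)              # median difference (bias-like): 0.1 m
median_abs_dh = np.median(np.abs(dh))  # median absolute difference (spread-like): 0.4 m
print(median_dh, median_abs_dh)
```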
Results
L291–292: It is not entirely clear how the interpretation is derived from the figure; additional explanation would help.
L337: The term "true accuracy" is not well defined. Is there such a term? Consider replacing it with something measurable, e.g., agreement with reference data.
L408: It would be useful to specify whether results refer to the entire ice sheet or specific regions (e.g., margins).
L425–427: A short explanation of why CryoSat-2 appears to perform better than Sentinel-3 would strengthen the discussion.
Discussion
L511: Could the metrics be further refined for Sentinel-3?
L514, L525: Grid size, spatial sampling, and repeat cycle may influence the results; a brief discussion would be helpful. A sensitivity analysis (e.g., over grid size or parameter choices) could strengthen the conclusions.
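As an example of the kind of sensitivity analysis meant here, one could recompute the chosen summary statistic after gridding the reference-minus-altimetry differences at several cell sizes. A purely illustrative sketch with synthetic data and hypothetical names:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic point-wise differences to the reference, with coordinates in km.
pts = pd.DataFrame({
    "x": rng.uniform(0, 500, 20_000),
    "y": rng.uniform(0, 500, 20_000),
    "dh": rng.normal(0.0, 0.4, 20_000),  # metres
})

for cell_km in (1, 2, 5, 10, 25):
    # Average differences within grid cells, then summarise the gridded values
    # so the chosen metric can be compared across grid sizes.
    gridded = pts.groupby(
        [np.floor(pts["x"] / cell_km), np.floor(pts["y"] / cell_km)]
    )["dh"].mean()
    print(f"{cell_km:>2} km grid: median |dh| = {gridded.abs().median():.3f} m")
```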
Technical Corrections
L55–59: Consider adding a clearer introductory sentence describing the framework.
L96: Possibly add a reference to Section 2.
L261 (and elsewhere): Consider explicitly naming the reference dataset (e.g., ICESat-2) rather than using a generic term.
Ensure consistent terminology throughout (e.g., median vs. mean; definition of quartiles).
Avoid repeated definitions of the same parameters; the median and IQR are defined in several places.