the Creative Commons Attribution 4.0 License.
Year-Round High-Resolution Sea Ice Freeboard Retrieval Using ICESat-2 ATL03 Photon Data
Abstract. Arctic sea ice freeboard is critical for estimating ice thickness and characterizing surface morphology, yet it remains poorly constrained, especially during melt seasons due to limitations in conventional altimetry products. The ICESat-2 ATL07 product retrieves surface heights at variable along-track segment lengths (10–200 m) and identifies floes and leads using fixed thresholds (photon rate, background rate and the width of height distribution) to support freeboard estimation, which smooths small-scale features and reduces reliability over sea ice surface with complex spatial variations or affected by melting. To address these challenges, we present a year-round, high-resolution (5 m) freeboard retrieval method (HRFM) based directly on ICESat-2 ATL03 photon data. A two-stage denoising strategy is implemented to robustly extract signal photons, while a machine-learning classifier, trained on 25 coincident Sentinel-2 images, discriminates between sea ice, thin ice, and leads across seasons. Identified lead segments provide local sea surface references from which freeboard is estimated. Validation against Airborne Topographic Mapper (ATM) data shows that HRFM reduces the surface-height root-mean-square error (RMSE) for strong beams from 0.12 m (ATL07) to 0.08 m (by 33 %), while weak-beam retrievals achieve comparable accuracy. HRFM better preserves ridge-related heights that are underestimated by ATL07. The classification attains a precision of 0.96 and a recall of 0.95 for lead detection, supporting reliable freeboard estimation. After applying the method at a pan-Arctic scale, the spatial patterns of retrieved freeboard are consistent with the ICESat-2 ATL20 product, but with seasonal mean differences reaching up to 0.04 m. By improving both topographic fidelity and lead detection, HRFM mitigates common limitations of ICESat-2 sea ice products and enables high-resolution freeboard estimates across seasons.
Competing interests: At least one of the (co-)authors is a member of the editorial board of The Cryosphere.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Status: open (until 04 Jun 2026)
- RC1: 'Comment on egusphere-2026-766', Alek Petty, 13 Mar 2026
- RC2: 'Comment on egusphere-2026-766', Anonymous Referee #2, 25 May 2026
The manuscript is well written and clearly structured, and it presents a promising high-resolution freeboard retrieval method using ICESat-2 ATL03 photon data. The results show improved performance relative to existing ICESat-2 products, and the method may also increase the usability of weak-beam observations for sea-ice freeboard retrieval. However, some methodological details require further clarification. In particular, the machine-learning workflow should be described in more detail, especially the link between GMM clustering, Sentinel-2-based human interpretation, and RF classification. In addition, the diversity of the Sentinel-2/ICESat-2 coincident cases should be better documented, including their seasonal coverage and time offsets, to support the claimed year-round applicability of the method.
Specific comments:
P6, L168: Because both the GMM clustering and the Sentinel-2-based manual interpretation rely on the 25 coincident scenes, the authors should show how these scenes are distributed across months/seasons and surface conditions. A table summarizing the acquisition dates, Sentinel-2–ICESat-2 time offsets, beam types, and assigned surface classes would help assess whether the training data adequately support the claimed year-round applicability of the classifier.
Figure 2: As I understand the workflow, the authors first grouped HRFM segments into 20 GMM clusters and then manually assigned these clusters to three surface classes using Sentinel-2 imagery. However, this important step is not easy to follow from Fig. 2. I suggest revising the flowchart so that the intermediate GMM clustering step and the subsequent human interpretation/manual class assignment based on Sentinel-2 imagery are shown more explicitly. This would make the training-data generation procedure clearer to readers.
P11, L280: More detail is needed on how transitional or ambiguous classes were handled during the Sentinel-2-based manual labeling, especially for thin ice, dark leads, and melt-pond-covered ice. This subjectivity may affect the RF training labels and should be discussed as a source of uncertainty.
Figure 10 shows systematic seasonal differences between HRFM and ATL20, with HRFM generally lower during the cold season and higher during the melt season. The manuscript attributes these differences to surface-height retrieval, lead detection, and reference sea-level construction, but the explanation would be stronger if supported by more quantitative evidence, for example monthly differences in the number of detected lead segments and in the reference sea-level estimates.
Table 2 shows that a non-negligible number of thin-ice segments are classified as leads, particularly for weak beams. Since HRFM uses identified lead segments to estimate the local sea-surface reference, misclassified thin ice could bias the reference sea level and propagate into the freeboard estimates. This issue is only indirectly discussed through the low thin-ice classification performance, but its potential impact on freeboard retrieval should be addressed more explicitly.
Minor comments:
Figure 2: Please show the RF classifier more explicitly in the workflow after the GMM clustering and manual labeling steps.
Figure 4 caption: (d) height standard deviation (STD) → (e) height standard deviation.
Please clarify whether “thin ice” and “gray ice” refer to the same class and use the terminology consistently throughout the manuscript.
Citation: https://doi.org/10.5194/egusphere-2026-766-RC2
- RC3: 'Comment on egusphere-2026-766', Anonymous Referee #3, 01 Jun 2026
Review report on “Year-Round High-Resolution Sea Ice Freeboard Retrieval Using ICESat-2 ATL03 Photon Data” by Liu et al.
Sea ice freeboard is a critical parameter, serving not only as a proxy for snow depth estimation but also as a basis for sea ice thickness extraction. This manuscript introduces the High-Resolution Freeboard Method (HRFM), a novel framework that retrieves sea ice freeboard directly from ICESat-2 ATL03 photon data at 5 m along-track resolution. The approach combines a two-stage denoising strategy (Kernel Density Estimation with adaptive histogram‑based filtering) and a machine‑learning surface‑type classifier trained on 25 coincident Sentinel‑2 scenes covering winter and summer conditions across diverse ice types. The authors validate surface heights against ATM (Airborne Topographic Mapper) data, compare their freeboard estimates with the operational ATL20 product, and demonstrate improved performance in preserving ridge heights, mitigating after‑pulse effects, handling melt‑season complexities, and enabling consistent weak‑beam utilisation.
The study is highly relevant to the scope of TC Journal and timely for the cryosphere community, addressing a critical need for year‑round, high‑resolution sea ice thickness monitoring. The manuscript is well‑structured, the methodology is clearly described, and the results represent a substantial advance over existing ICESat-2 sea ice products (ATL07/ATL20), which convinces me that this study warrants publication in TC Journal. However, before final acceptance, there are numerous issues that need either further clarification or significant improvement. Please see my major and minor comments below, which I hope the authors can consider during the revision.
Major Comments / Required Revisions
1. The authors should specify explicitly which annual cycle was investigated, either in the title or in the abstract. Why was this annual cycle selected for this study?
2. Key algorithm parameters are mentioned throughout the text (e.g., horizontal scaling a=25/100, histogram bin size 0.1 m, after‑pulse offsets 0.45/0.9 m, 10‑km window for sea‑level reference, 3σ outlier removal, 10th percentile density threshold for refined denoising); a single table summarising all critical parameters with their values and justifications would greatly enhance reproducibility. Please add such a table (e.g., in Section 3 or as an appendix) in the revised manuscript.
3. In the abstract and results (Sect. 4.1), the authors state that weak‑beam retrievals achieve “comparable accuracy” to strong beams. How comparable is it? Please provide a concrete assessment, and consider adding a statistical test to show whether the remaining difference is meaningful.
4. Table 2 shows thin‑ice recall is only 0.50 (strong beam) and 0.43 (weak beam), with precision ~0.65. The authors merge thin ice into sea ice for two‑class validation (Table A1, overall accuracy >0.98). However, thin ice is a distinct surface type with different radiative, thermodynamic, and mechanical properties. Please discuss:
a: How often does misclassified thin ice affect freeboard retrieval (e.g., when thin ice is wrongly labelled as lead and thus included in the reference sea level)?
b: Should users treat the thin‑ice class as “experimental” or apply the two‑class version for freeboard estimation? A clear recommendation would be helpful.
5. The reference sea level is computed by aggregating lead segments within a 10‑km along‑track window. The choice of 10 km is plausible but not justified. Could the optimal window size vary with ice regime (e.g., compact vs. marginal ice zone, winter vs. summer)? A brief sensitivity test (e.g., 5 km, 10 km, 20 km) on a representative track would strengthen the method. If such a test already exists, please cite or summarize it.
6. Figure 10 shows seasonal mean differences between HRFM and ATL20 up to ±0.04 m, with sign reversals between winter and summer. The authors attributed these to differences in surface‑height retrieval, lead detection, and reference sea‑level construction. A rough quantitative decomposition (e.g., how much of the winter difference is due to after‑pulse removal vs. lead sampling density) would greatly strengthen the discussion. Even an approximate breakdown based on the examples in Figs. 8 and 12 would be valuable.
7. The classifier was trained on scenes south of 82°N because of Sentinel-2’s orbital limit. How confident are the authors that the classifier works north of 82°N (e.g., the central Arctic) where no coincident optical imagery is available for validation? Please add a discussion of potential extrapolation risks and whether the physical feature space (photon rate, density, height STD, background rate) is expected to remain valid poleward.
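The window-size sensitivity test suggested in comment 5 could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the unweighted mean of lead heights, and the symmetric along-track window are all assumptions, and the track values are synthetic.

```python
def reference_sea_level(positions_m, heights_m, is_lead, window_m):
    """Local sea-surface reference for every segment.

    Assumption: the reference is the unweighted mean height of all lead
    segments whose along-track position lies within +/- window_m / 2 of
    the segment. Returns None where the window contains no lead.
    """
    half = window_m / 2.0
    leads = [(x, h) for x, h, lead in zip(positions_m, heights_m, is_lead) if lead]
    refs = []
    for x in positions_m:
        local = [h for lead_x, h in leads if abs(lead_x - x) <= half]
        refs.append(sum(local) / len(local) if local else None)
    return refs


def freeboard(heights_m, refs):
    """Freeboard = segment height minus local reference (None if no reference)."""
    return [None if r is None else h - r for h, r in zip(heights_m, refs)]


# Synthetic track (illustrative values only): positions and heights in metres.
positions = [0, 2500, 5000, 7500, 10000, 20000]
heights = [0.30, 0.02, 0.35, 0.00, 0.40, 0.10]
lead_flags = [False, True, False, True, False, True]

# Repeat the retrieval for several candidate window sizes and compare.
for window in (5_000, 10_000, 20_000):
    refs = reference_sea_level(positions, heights, lead_flags, window)
    print(window, freeboard(heights, refs))
```

Running the same track through several window sizes and comparing the resulting freeboards (and the fraction of segments left without a reference) is one simple way to show how sensitive the retrieval is to the 10‑km choice.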
Minor comments
- Line 70: “detector after‑pulses” should be introduced earlier (it is defined in Sect. 3.1.2 but referenced here). Consider adding a brief definition at first mention.
- Line 125: Use consistent version notation (e.g., “Version 6” rather than “V06”, or define “V06” at first use).
- Figure 8 caption: “The green dashed lines represent the ATL07 signal selection envelope”: please describe this explicitly in the caption.
- Section 5.1, line 585: “lower transmitted energy level (∼80% of outer beams)”: please verify and provide a citation for the energy difference between middle and outer beams.
- Figure 5 caption: “It’s a 10‑km profile” should read “It is a 10‑km profile”.
- Language: The manuscript is generally well written, yet a few grammatical issues remain (e.g., “It’s” in figure captions, occasional missing articles). I see you have English co-authors; please ask them to proofread the language.
- The text fonts in almost all figures are quite small. Please consider enlarging them to improve the readability of the figures.
- The authors provided this data availability statement: The Sentinel-2 imagery was derived from Google Earth Engine at https://developers.google.com/earth-engine/datasets/catalog/sentinel-2. I am not sure whether this is adequate for TC, or whether the authors should provide the Sentinel-2 imagery data somewhere for readers to explore. I leave this for the TC handling editor to decide.
Review of “Year-Round High-Resolution Sea Ice Freeboard Retrieval Using ICESat-2 ATL03 Photon Data” by Liu et al.
Review by Alek Petty
This manuscript presents a new high-resolution photon-based lead classification and height/freeboard retrieval framework using ATL03, with the goal of improving sea surface height and derived freeboard estimates relative to the current ATL07 and ATL10 products. The work builds on recent machine-learning lead classification efforts (by the primary author and others) that showed strong promise compared to the pre-launch empirical ATL07 classification approach, and I think this is the logical next step. They combine this new ML trained classification scheme with a much higher resolution processing approach.
In general, the study clearly involved a considerable amount of analysis and description, and I really commend the effort. The paper was also very clearly written and the figures were all of high quality. The study is also very appropriate for this journal. That said, I have a number of questions about the robustness of the approach and assessments presented. In a few places it feels like the algorithm works nicely for the examples shown, but I am less convinced about how broadly applicable it is beyond those cases and I think there is a potential need for more filtering and at the very least clear documentation of the processing steps involved to ensure transparency and reproducibility. My comments are below.
Major comments
Height precision:
I didn’t see any mention in the paper of the fact that ATL07 uses a ~150-photon aggregation (pulses are aggregated towards this number, but the actual number of photons aggregated is variable), motivated by a desire for something like a 2-3 cm precision over flat surfaces for lead retrieval, considering the theoretical ~10-20 cm precision of ATL03 heights (Markus et al., 2017, Neumann et al., 2019, Kwok et al., 2019). In reality the ATL03 precision is probably a bit than that (Brunt et al., 2019), but this is hard to interpret. That motivation needs to be discussed more. The clear downside of the ATL07 focus on lead precision is the reduced along-track resolution and issues with very long segment lengths.
We can see by your height SD plots that your use of a much finer fixed 5 m along-track photon aggregation is resulting in roughly 10 cm height standard deviations over open water leads, which is quite high! This all needs to be discussed way more in terms of impacts on freeboards.
Related – I find some of the discussion around ATL07 height biases over ridges to be overstated; a lot of the difference seems to reflect resolution/sampling differences rather than a bias.
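The trade-off between aggregation length and height precision can be illustrated with the textbook standard error of a mean. This is a minimal sketch under a strong assumption (independent, identically distributed photon height errors); real precision will generally be worse, and the photon counts and single-photon spreads below are illustrative values rather than numbers from the manuscript or from ATL07.

```python
import math


def segment_height_precision(photon_sigma_m: float, n_photons: int) -> float:
    """Standard error of the mean height of n_photons photons.

    Assumes independent photon height errors with standard deviation
    photon_sigma_m, so the precision of the mean scales as sigma / sqrt(N).
    """
    if n_photons < 1:
        raise ValueError("n_photons must be at least 1")
    return photon_sigma_m / math.sqrt(n_photons)


# Illustrative comparison: a long aggregate versus short fixed-length segments.
for sigma in (0.10, 0.20):       # assumed single-photon spread in metres
    for n in (150, 20, 5):       # assumed photons per segment
        precision_cm = 100 * segment_height_precision(sigma, n)
        print(f"sigma={sigma:.2f} m, N={n:3d}: {precision_cm:.1f} cm")
```

Under this idealization, a 150-photon aggregate with a 10-20 cm single-photon spread reaches a precision on the order of 1-2 cm, while a segment holding only a handful of photons stays close to the single-photon spread, which is the concern raised about lead heights at 5 m resolution.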
ATM comparisons:
The ATM comparison section needs more clarity. For one, it is not clear whether you implemented the cross-correlation maximization procedure used in Kwok (2019). In that study, raw ATM was aggregated into 17 m by (segment length plus 17 m) blocks with Gaussian weighting, and the ATM data were shifted to maximize correlation with ICESat-2. What exactly have you done here? Are you using the same aggregation? Are you shifting the profiles? Which beams are included in the comparisons? Note that the current thinking is ~11 m footprints (Magruder et al., 2020), but maybe 17 m would keep things consistent with Kwok.
Your reported correlations and RMSE values between ATM and ATL07 are notably worse than the very high values reported in Kwok (2019). The lack of details regarding the processing and comparison approach needs a lot more explaining to make such a big claim (unless I am missing something!). It would also be helpful to repeat the April 8 and 12 flights shown in Figure 1 of Kwok (2019) as a sanity check. At the moment it is difficult to reconcile and trust the differences presented.
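The general idea of shifting one profile to maximize its correlation with another can be sketched as below. This is only a one-dimensional illustration on profiles that are assumed to be already resampled to a common along-track spacing; it is not the procedure of Kwok (2019), which also involves Gaussian-weighted block aggregation, and the function names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_lag(reference, other, max_lag):
    """Return (lag, r) maximizing the correlation of reference[i] with other[i + lag].

    The lag is expressed in samples; only the overlapping part of the two
    profiles is used at each candidate lag.
    """
    best = (0, -2.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = reference[: len(reference) - lag], other[lag:]
        else:
            a, b = reference[-lag:], other[: len(other) + lag]
        m = min(len(a), len(b))
        if m < 3:
            continue  # too little overlap for a meaningful correlation
        r = pearson(a[:m], b[:m])
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic example: `other` is `reference` delayed by three samples.
reference = [0.1, 0.3, 0.2, 0.9, 0.4, 0.1, 0.0, 0.6, 0.2, 0.1,
             1.2, 0.3, 0.1, 0.5, 0.0, 0.2, 0.8, 0.1, 0.3, 0.0]
other = [0.0, 0.0, 0.0] + reference[:-3]
print(best_lag(reference, other, max_lag=5))
```

Reporting the lag found for each flight, together with the correlation before and after shifting, would make it easier to reconcile the ATM statistics with earlier studies.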
Geophysical corrections:
The treatment of geophysical corrections is glossed over very quickly. What corrections are applied, and are they the exact same as ATL07? I think you are using a different MSS for one. These details matter, especially since you are doing height comparisons with ATM (unless I’m missing that you did something else to reconcile the heights). In previous SSH comparison work, getting the corrections consistent took quite a lot of effort (Bagnardi et al., 2021). I would encourage a more careful and explicit description here.
First photon bias:
I also saw no mention of the first photon bias correction. Even if your ultimate goal is relative freeboard, once you start comparing heights directly (again as you do here with ATM data) this correction should be included and documented, and it could still impact the freeboard results too as this is variable (I do confess I’m not sure exactly how variable it is over sea ice/leads…).
Filtering:
The photon filtering strategy needs more explanation. ATL07 uses windowing filters to separate signal from background photons; I was surprised you do not do any windowing. I wondered what would happen in cloudy conditions with your approach, as I don’t think you are implementing any type of cloud filter. The coarse filtering does not appear to work very consistently, but the second step does. I am a bit worried you’re showing the best cases here, which is why I would like to see the along-track data included with the release (see comment later).
Small related point - Why not use the new Yet Another Photon Classifier (YAPC) variables from ATL03? A lot of effort has gone into developing those precisely for photon level signal/background classification like this! Fine if you want to do your own thing, but I think it is worth at least acknowledging, as that data is now all on ATL03 (and ATL07).
Extra processing concerns:
ATL07 reduces the energy of the middle beam by a factor of 0.82. Is this applied here? You report 2 to 3 cm freeboard differences between strong and weak beams. I would not call that slight. It would be clearer to show this as a difference plot rather than as two separate seasonal curves.
How do you treat saturated or highly specular returns?
Release 005 introduced filtering using the PODPPD flag and beam angle in ATL03 to identify pointing issues. I think if you want to introduce this as a dataset you need to consider all these important edge cases and filter them out (or decide what to do with them).
Classification training:
I am a bit confused about the training framework. You mention leave-one-out cross validation, which is good, but it appears that all scenes are then used to train the final classifier that is evaluated and shown in Figure 9? Or maybe I got confused about that. Please clarify; if so, that is not a fully independent test, and this should be stated explicitly.
Related – I didn’t quite get/see if the models were trained for weak and strong beams separately? I think you need to apply the middle beam gain correction if not trained by beam?
As in the above, I wasn’t sure what you are doing about very specular/saturated pulses.
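The independence concern can be made concrete with a scene-wise split, where every segment from the held-out scene is excluded from training. This is a minimal sketch, assuming each training segment carries a scene identifier; the function name and scene labels are hypothetical and the manuscript's actual cross-validation setup may differ.

```python
def leave_one_scene_out(scene_ids):
    """Yield (held_out_scene, train_indices, test_indices) for each scene.

    All segments belonging to the held-out scene go to the test set, so the
    classifier evaluated on that scene has never seen any of its segments.
    """
    for scene in sorted(set(scene_ids)):
        train = [i for i, s in enumerate(scene_ids) if s != scene]
        test = [i for i, s in enumerate(scene_ids) if s == scene]
        yield scene, train, test


# Hypothetical scene labels for six segments drawn from three scenes.
scene_ids = ["A", "A", "B", "C", "C", "C"]
for scene, train, test in leave_one_scene_out(scene_ids):
    print(scene, "train:", train, "test:", test)
```

A classifier trained afterwards on all scenes and then shown on one of those same scenes falls outside this scheme, which is why an evaluation of that kind would not count as independent.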
Data availability:
I strongly encourage the authors to make the along track height estimates and classification outputs publicly available. Having access to these intermediate products is essential for assessment of these new SSH and freeboard approaches. Much of what we have learned about the strengths and deficiencies of ATL07 has come from the community being able to interrogate the along track heights and classifications directly and independently. Given the standards and expectations of The Cryosphere, I believe these along track outputs should be made available as part of the publication.
Seasonal analysis:
You only have one winter of data, so the seasonal discussion felt quite premature and overconfident. I would suggest tempering that interpretation considerably until more years are processed.
Figure 9c:
I include in major comments as I find this quite confusing/worrying (?). But in the middle profile for the bottom lead, height_segment_type shows sea ice while ssh_flag is set to sea surface on one of the beams (see yellow circles). How is that possible?
Minor comments
On the multi beam discussion, much of the concern in the literature relates to using sea surface points across beams to generate a swath like SSH estimate. Your approach seems to combine independently derived freeboards across beams, which is conceptually simpler and something ATL20 could easily do now anyway. I would make that distinction clearer.
It would also be useful to test the algorithm on the cloudy but usable scene discussed in Kwok (2021), just to see how it behaves under less ideal conditions. This is a pretty big issue considering the dropping of the dark lead classification from ssh/freeboard in ATL07/10.
I would use the term background rather than noise, since noise suggests a sensor issue rather than environmental background photons being reliably detected.
L71: I would not characterize this as a clear underestimation. These often seem like smoothing/resolution differences to me that are over-characterized as height biases.
L126: Please clarify the version number.
L149: Also worth noting that there is an additional ssh_flag = 2 flag in ATL10 for the segments actually used.
L214: Clarify whether a different MSS is being used and what corrections are inherited from ATL03.
L260: Please list the geophysical corrections and their source and comparisons with ATL07. ICESat-2 has a geophysical corrections document that lists this all out.
L265: OK fine, but ATL07 provides both and I think you just focus on norm, so that gets a little confusing later.
L266: Make clearer that ATL07 aggregates in order to beat down the noise and achieve a precision of 2 cm over flat surfaces, as driven by mission requirements.
L316: Do you really expect 10 cm surface roughness over a level ice lead?
L443: These could still be small leads. If possible, show a coincident Sentinel-2 scene and compare classifications.
Replace ‘dark ice’ with ‘dark lead’ in the ATL07 figure labels.
References
Bagnardi, M., Kurtz, N. T., Petty, A. A., and Kwok, R.: Sea Surface Height Anomalies of the Arctic Ocean From ICESat-2: A First Examination and Comparisons With CryoSat-2, Geophysical Research Letters, 48, e2021GL093155, https://doi.org/10.1029/2021GL093155, 2021.
Brunt, K. M., Neumann, T. A., and Smith, B. E.: Assessment of ICESat-2 Ice Sheet Surface Heights, Based on Comparisons Over the Interior of the Antarctic Ice Sheet, Geophysical Research Letters, 46, 13072–13078, https://doi.org/10.1029/2019GL084886, 2019.
Magruder, L. A., Brunt, K. M., and Alonzo, M.: Early ICESat-2 on-orbit Geolocation Validation Using Ground-Based Corner Cube Retro-Reflectors, Remote Sensing, 12, 3653, https://doi.org/10.3390/rs12213653, 2020.
Markus, T., Neumann, T., Martino, A., Abdalati, W., Brunt, K., Csatho, B., Farrell, S., Fricker, H., Gardner, A., Harding, D., Jasinski, M., Kwok, R., Magruder, L., Lubin, D., Luthcke, S., Morison, J., Nelson, R., Neuenschwander, A., Palm, S., Popescu, S., Shum, C., Schutz, B. E., Smith, B., Yang, Y., and Zwally, J.: The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): Science requirements, concept, and implementation, Remote Sensing of Environment, 190, 260–273, https://doi.org/10.1016/j.rse.2016.12.029, 2017.
Kwok, R., Markus, T., Kurtz, N. T., Petty, A. A., Neumann, T. A., Farrell, S. L., Cunningham, G. F., Hancock, D. W., Ivanoff, A., and Wimert, J. T.: Surface Height and Sea Ice Freeboard of the Arctic Ocean From ICESat-2: Characteristics and Early Results, Journal of Geophysical Research: Oceans, 124, 6942–6959, https://doi.org/10.1029/2019JC015486, 2019.
Kwok, R., Kacimi, S., Markus, T., Kurtz, N. T., Studinger, M., Sonntag, J. G., Manizade, S. S., Boisvert, L. N., and Harbeck, J. P.: ICESat-2 Surface Height and Sea Ice Freeboard Assessed With ATM Lidar Acquisitions From Operation IceBridge, Geophysical Research Letters, 46, 11228–11236, https://doi.org/10.1029/2019GL084976, 2019.