the Creative Commons Attribution 4.0 License.
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope
Abstract. Thermokarst lakes are critical components of Arctic carbon cycling, yet their ice phenology, which directly impacts total carbon flux, remains poorly characterized at regional scales. We present the first comprehensive analysis of ice-on and ice-off timing across 30,862 lakes on the Alaska North Slope using Sentinel-1 synthetic aperture radar (SAR) classified by a Random Forest (RF) model trained on Sentinel-2 optical imagery and ERA5 temperature data for the period 2019–2023. Our RF classifier achieved 94 % accuracy for ice state detection, enabling phenology retrieval for 97 % of lakes. Results revealed a mean ice-free period of 115 days (standard deviation = 24 days), with ice-off occurring at day-of-year 163 (June 12) and ice-on at day-of-year 278 (October 5). Spatial analysis demonstrated strong latitudinal control on ice phenology, with ice-free duration decreasing by 30 days per degree northward. Lake morphology (area, circularity, convexity, and shoreline development index) showed modest but significant effects on ice timing after controlling for latitude effects, with shoreline development index and convexity each contributing ∼three days variation across typical lake ranges. Comparison of the RF model and simplistic accumulated degree-day (ADD) model-detected ice phenology yielded a convincing match, where the offsets in ice phenology between the models fell within two Sentinel-1 repeats for approximately 60 % of the lakes. Furthermore, these offsets exhibited the same strong latitudinal control and negligible effects of lake morphology. These lake-specific phenology dates provide timing and duration constraints for future methane studies using high-resolution sensors and provide baseline phenology data essential for understanding how continued Arctic warming will affect thermokarst lake dynamics and associated carbon cycle feedbacks.
Status: final response (author comments only)
-
RC1: 'Comment on egusphere-2026-1135', Anonymous Referee #1, 13 Apr 2026
-
AC1: 'Reply on RC1', Alexander Nguyen, 29 May 2026
We thank the reviewers for their thoughtful and insightful comments, which helped improve our manuscript. We outline the revisions we plan on making to the manuscript in response to the reviewer feedback and address their specific comments and evaluations below. Broadly, we will implement additional clarification in the form of additional text and figures regarding the choices made in our methodology in the random forest, training labels, and implications of constraints in the methods and available datasets. We will further contextualize our results in relation to findings from previous studies by performing additional analysis, such as incorporating bedfast and floating ice data, where required.
Reviewer #1 comments:
General comments:
As a reader, I would prefer if the information and figures on the RF model were not hidden in the supplement, but moved to the main part of the manuscript as it is an important aspect of the paper.
We agree with the reviewer that the relevant figures for the RF model should be included in the main part of the manuscript. Depending on space, we plan to move both figures S1 (RF confusion matrix) and figure S2 (RF feature importance) to the main text.
As I understand your classification set up, you are defining ice off as being completely ice free, you also rely on data from the center of the lake, would that not lead to an ‘early ice off’ bias in situations where remaining ice drifts to the shore? If so, can you estimate how much of an uncertainty this introduces?
This is correct. As explained in sections 2.2 and 2.3 of the methods section, we use a combination of Sentinel-2-derived labels and temperature-derived labels for defining ice off. For Sentinel-2, we define ice off using NDSI thresholds of 0.8 and 0.2, where values above 0.8 are ice and values below 0.2 are water. For the temperature-based labels, we add further constraints (using ERA5 data): water training labels require 10 or more days above 10 degrees C, water labels on days where the temperature is below –5 degrees C are excluded, and ice free can only be called within a manually set window between April 1st and August 31st. As a final check, we remove any Sentinel-2 labels that conflict with the temperature-derived labels to ensure that ice off is completely ice free.
We also rely on data from the center of the lake, owing to complex freeze-thaw occurring on the lake edges in the shoulder season. By relying on data from the center of the lake, we avoid spurious calls that may lead to an ‘early ice off’ bias, as we may expect that the land surrounding the lake will warm faster than the frozen lake, melting ice around the lake edges first. While ice may drift to the shore, the inner edge 10-meter buffer corresponds to a single Sentinel-1/2 pixel. Given that possible ice drifts would be in this 10-meter buffer, we do not believe that we are able to definitively estimate uncertainty and bias within the single-pixel interior edge of the lake. Furthermore, the likelihood of an ‘early ice off’ bias occurring is minimized given our additional temperature-based criterion for labels explained above and in the methodology. We will include a short statement regarding this in the Model Performance section (4.1) of the discussion.
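The labeling rule described in this reply can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and reflectance inputs are hypothetical, and the NDSI formula is the standard (green − SWIR)/(green + SWIR) definition rather than something stated in the reply. Only the 0.8/0.2 thresholds and the –5 degrees C exclusion for water labels are taken from the text above.

```python
from typing import Optional


def ndsi(green: float, swir: float) -> float:
    """Standard Normalized Difference Snow Index from green and SWIR reflectance."""
    return (green - swir) / (green + swir)


def label_from_ndsi(value: float, air_temp_c: float) -> Optional[str]:
    """Return 'ice', 'water', or None (no training label).

    Thresholds follow the reply: NDSI above 0.8 is ice, below 0.2 is water.
    A water label on a day colder than -5 degrees C is discarded.
    """
    if value > 0.8:
        return "ice"
    if value < 0.2:
        if air_temp_c < -5.0:
            return None  # conflicts with the temperature-based check
        return "water"
    return None  # ambiguous NDSI range, left unlabeled
```

Leaving the 0.2 to 0.8 range unlabeled is the point of using two thresholds: only unambiguous observations become training labels.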
You refer several times to ‘a RF classifier trained on Sentinel-2 optical labels’ which I think is misleading. Would it be more adequately described as a RF classifier trained using labels created with the help of Sentinel-2 data? Or something in that direction.
We agree with this point raised by the reviewer, and we will change the wording to reflect this using a phrase such as "RF classifier trained using labels derived from Sentinel-2 optical imagery and ERA5 temperature data", as in the abstract, to maintain that the RF model was trained using data from both Sentinel-2 and ERA5-derived labels.
You stress the relationship between ice off date and depth of the lake. I am aware of the lack of data on lake depth on the required scale, have you considered comparing your results to proxy data sets such as bedfast/floating ice data? It might be an interesting exercise.
We agree with the reviewer that on a first order basis, it would be interesting to compare and discuss our results in the context of previous work analyzing bedfast and floating lake ice in the study region (e.g., Arp et al., 2012; 2015; Engram et al., 2018). We will update the discussion section to include comparisons between our results and those from studies analyzing bedfast vs floating ice lakes, specifically discussing the relationship between ice thickness, bedfast/floating regime, and ice off date. Additionally, we will discuss how relative differences in depth between different lakes or different regions of the same lake can be determined from SAR classification of bedfast vs floating ice (Bergstedt et al., 2026; Engram et al., 2013; 2018; Grunblatt and Atwood, 2014; Hirose et al., 2008), and the potential implications for GHG emissions for floating ice lakes in particular, which contain sub-lake taliks (Arp et al., 2012; Engram et al., 2018).
I am not very familiar with the ADD approach but my impression would be not to interpret the offset between the RF and ADD approaches necessarily as indicators for errors but rather as expected delays between freezing and thawing onset and an ice free or fully ice covered lake. If I interpreted this incorrectly it would be good to clarify your explanations in the paper regarding this issue.
We agree, and we will include additional text in the discussion section to further explain this point, as we do not necessarily interpret the offsets between the RF and ADD models as errors. As mentioned here, it is possible that the 72% later ice-off calls for the RF model, as compared to the ADD model, are due to thermal inertia during the spring, where higher temperatures and ADDT would predict ice thickness decreasing faster than the RF model trained with labels derived from Sentinel-2. We will add this to the ‘Model Performance’ section 4.1 in the discussion.
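For readers unfamiliar with degree-day metrics, the sketch below shows the conventional way accumulated thawing and freezing degree days are computed from daily mean air temperature. It is an assumption-laden illustration: the 0 degrees C base, the function name, and the sample temperatures are hypothetical, and the manuscript's actual ADD formulation and any ice-off threshold applied to it are not given in this discussion.

```python
from typing import Iterable, Tuple


def accumulated_degree_days(daily_mean_temps_c: Iterable[float],
                            base_c: float = 0.0) -> Tuple[float, float]:
    """Return (thawing, freezing) degree-day sums relative to base_c.

    Thawing degree days accumulate the excess above the base temperature;
    freezing degree days accumulate the deficit below it.
    """
    addt = 0.0
    addf = 0.0
    for t in daily_mean_temps_c:
        if t > base_c:
            addt += t - base_c
        else:
            addf += base_c - t
    return addt, addf


# Hypothetical five-day series of daily mean air temperature (degrees C).
addt, addf = accumulated_degree_days([-5.0, -1.0, 2.0, 4.0, -3.0])
print(addt, addf)
```

Because such a metric responds to air temperature alone, it can indicate thaw onset before a lake is observed to be fully ice free, which is consistent with the offset interpretation given in the reply.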
A workflow diagram would help the reader to better follow the work.
We will include an updated workflow diagram in the main manuscript to improve the readability of our methods.
Specific comments:
Section 3.3 is missing
We thank the reviewer for catching this error; we have corrected section numbering in section 3 to reflect there being only three subsections.
Section 3.4: this is an important aspect of the paper and also features heavily in the discussion section, however the corresponding figures are in the supplement. I think they should be moved. It is difficult to jump back and forth between documents.
We agree with the reviewer that the relevant supplemental figures should be moved to the main text. We plan on including figures S6 (ice-off, ice-on, and ice-free vs lake size) and S7 (ice-free vs circularity, SDI, and convexity) in the main text. Figure S11 (ice-free period vs SDI) is the same as the second panel in figure S7, so we will combine figures S6 and S7 into a single figure and add the red line (best-fit regression) from S11 to the corresponding panel in the merged figure. Figure S8 (correlation matrix between lake morphometrics, latitude, and ice phenology) will also be moved to the main text.
Figure 5 is unnecessarily huge
We have now reduced the figure size by half.
Line 118: I do not understand what makes this scaling necessary
We use the scaling here to target specific dB ranges that aid in differentiation between ice and water as well as potential ice type/thickness. By scaling the data, we target these specific dynamic ranges to maximize differentiation between the expected minimum and maximum backscatter values for the ice/water distinction for a given band. Additionally, scaling the data reduces the impact of outliers outside of the expected backscatter ranges for each band (Surdu et al. 2015; Murfitt and Duguay 2020).
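One common way to implement this kind of range-targeted scaling is to clip each backscatter value to an expected dB interval and rescale it to [0, 1]. The sketch below assumes that interpretation; the function name and the example VV range are hypothetical, since the manuscript's actual per-band ranges are not stated in this discussion.

```python
def scale_backscatter(value_db: float, lo_db: float, hi_db: float) -> float:
    """Clip a backscatter value (dB) to [lo_db, hi_db] and rescale to [0, 1]."""
    if hi_db <= lo_db:
        raise ValueError("hi_db must be greater than lo_db")
    clipped = min(max(value_db, lo_db), hi_db)
    return (clipped - lo_db) / (hi_db - lo_db)


# Hypothetical range for illustration only, not the manuscript's values.
VV_RANGE_DB = (-25.0, -5.0)
print(scale_backscatter(-15.0, *VV_RANGE_DB))  # value in the middle of the range
print(scale_backscatter(-40.0, *VV_RANGE_DB))  # outlier clipped to the lower bound
```

Clipping before rescaling is what limits the influence of outliers: any value beyond the expected range maps to exactly 0 or 1 instead of stretching the scale.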
144: you say you trained the classifier on backscatter from the interior. Are you using mean values? If not you would have much more pixel values for bigger lakes. If you trained it on the interior values, I assume you are also applying it to the interior values only and can not account for gradual ice development or melt.
This is correct. We trained the classifier on backscatter from both the interior and exterior buffers for a given lake, and we use the mean values from both of these buffers for both the Sentinel-2 NDSI-derived ice mask for training labels and the coincident Sentinel-1 training data. Given that we use mean values over the entire buffers, it is correct that we cannot account for lateral gradual ice development or melt within a given lake, unless it is being captured by the mean. We will add text in this paragraph within the manuscript to acknowledge the fact that we use mean values of these lake buffers for both the Sentinel-2-derived training labels and the corresponding Sentinel-1 training data.
Line 324: delete ‘y’
Removed.
325: also give the time for the ascending track for this to be meaningful
We have now included the ascending track acquisition time of around 18:00-20:00/6:00-8:00 PM AKST for this sentence.
329: the snow depths in this region are usually limited, it should not be a big source of uncertainty in your chosen study region.
While we agree that snow depths on the lowland tundra of the North Slope are limited, as compared to boreal regions, we believe that the presence of modest snow depths (such as the 40 cm mean found in Stuefer et al. 2025 on the Arctic Coastal Plain) is significant enough to insulate the soil and delay ice-off timing. The RF model may be capturing this thermal insulation and relative delay in ice-off timing compared to the ADD model, where the RF model was later than the ADD model for 72% of lake-year combinations.
Citation: https://doi.org/10.5194/egusphere-2026-1135-AC1
-
RC2: 'Comment on egusphere-2026-1135', Anonymous Referee #2, 13 Apr 2026
This study presents an interesting and somewhat complex analysis of ice phenology for a very large number of lakes on Alaska’s North Slope over a relatively short 5-year period. Much of the motivation for this study appears to be in terms of carbon balance and methane emissions from these lakes, as well as testing new methods for quantifying ice cover timing across a vast lake-rich area. A fair bit of emphasis is placed on well documented elevational and latitudinal climate gradients on the North Slope and minor variability related to some attributes of lake morphometry (area and several indices of lake shape).
Here’s a relatively long list of general concerns:
1 – All lakes on the North Slope are not of thermokarst origin. The majority of lakes in the coastal plain are of thermokarst origin, but many of these lakes are also formed by fluvial and aeolian processes. Into the foothills and Brooks Range there is much more variation including glacial processes. Thus, saying all of these lakes are thermokarst isn’t correct (as in the title) and I would suggest just calling them lakes with a paragraph explaining their various origins.
2 – The link between ice phenology and lake carbon emissions is weak to non-existent, while there are other ecological, landscape, and climatic processes that are more important relative to lake ice phenology in the Arctic. It’s possible that a longer open-water season causes higher productivity, warmer bed sediments, and deeper thaw into sublake permafrost that could lead to higher GHG emissions, but I don’t believe this has been well documented. Otherwise the timing of lake ice cover can help explain when methane and other GHGs are released, but doesn’t affect the total flux as implied in the opening sentence of the abstract.
3 – Ice cover formation and decay can occur over several weeks or more on arctic lakes, yet little information is provided about what moment this method(s) is(are) detecting. This is important for applying SAR backscatter because the signal will vary widely over these conditions and it is also quite difficult to differentiate varying open-water conditions (still vs. rough water) from this range of evolving ice surface conditions. The use of optical Sentinel 2 imagery and reanalysis data probably helps with some of this, but honestly I had a very hard time following these methods and the supplemental materials didn’t help much. It seems like a confusing cross-validation approach, where actual observations from sensors, cameras, or higher temporal resolution imagery (like MODIS or Planet) compared to these methods would serve as much more convincing validation even if it was only for a few lakes.
4 – Larger lakes generally have longer ice cover than smaller lakes in this region. The largest lake, Teshekpuk, is a good example and usually has the longest ice cover duration on the North Slope. The notion that larger lakes are consistently deeper doesn’t really apply well here, where there are lots of very large thermokarst lakes that are less than 2-m deep and many smaller dune-trough and glacial lakes that are over 20-m deep. Teshekpuk Lake has an average depth of only about 4 m.
5 – Besides the mountain to coastal temperature and snow gradient, lake depth is generally the most important control on ice-out timing, particularly in coastal plain thermokarst lakes that have bedfast ice (shallow) and floating ice (deeper) (see Sellmann et al. 1975). There is region-wide data on these conditions (see Grunblatt and Atwood 2014).
A few specific points:
L51: I’m not aware of any lake ice phenology estimates based on ice thickness.
L59-61: The relationship between lake depth and ice-out timing is specifically shown for thermokarst lakes by Arp et al 2015, which is referenced elsewhere in this manuscript.
Figure 2: Suggest putting dates rather than days of the year here.
L321-324: These confounding factors described here have a profound impact on using SAR to detect ice and open water and need to be addressed in the methods.
Citation: https://doi.org/10.5194/egusphere-2026-1135-RC2 -
AC2: 'Reply on RC2', Alexander Nguyen, 29 May 2026
We thank the reviewers for their thoughtful and insightful comments, which helped improve our manuscript. We outline the revisions we plan on making to the manuscript in response to the reviewer feedback and address their specific comments and evaluations below. Broadly, we will implement additional clarification in the form of additional text and figures regarding the choices made in our methodology in the random forest, training labels, and implications of constraints in the methods and available datasets. We will further contextualize our results in relation to findings from previous studies by performing additional analysis, such as incorporating bedfast and floating ice data, where required.
Reviewer #2 comments:
This study presents an interesting and somewhat complex analysis of ice phenology for a very large number of lakes on Alaska’s North Slope over a relatively short 5-year period. Much of the motivation for this study appears to be in terms of carbon balance and methane emissions from these lakes, as well as testing new methods for quantifying ice cover timing across a vast lake-rich area. A fair bit of emphasis is placed on well documented elevational and latitudinal climate gradients on the North Slope and minor variability related to some attributes of lake morphometry (area and several indices of lake shape).
Here’s a relatively long list of general concerns:
1 – All lakes on the North Slope are not of thermokarst origin. The majority of lakes in the coastal plain are of thermokarst origin, but many of these lakes are also formed by fluvial and aeolian processes. Into the foothills and Brooks Range there is much more variation including glacial processes. Thus, saying all of these lakes are thermokarst isn’t correct (as in the title) and I would suggest just calling them lakes with a paragraph explaining their various origins.
We agree with the change to the title considering that we are capturing some combination of lakes formed from abrupt thaw (thermokarst) alongside other processes. We plan on including a few additional sentences in the introduction explaining the different origins of lakes in the region and how these origins may change as a function of moving southward towards higher elevations at the Brooks Range foothills and in the Brooks Range itself. We selected the latitude range to limit the coverage of the Brooks Range, but we will ensure that this discussion is included considering other possible methods of origin of these lakes and emphasize that the majority of these lakes on the Arctic Coastal Plain are formed from thermokarst.
2 – The link between ice phenology and lake carbon emissions is weak to non-existent, while there are other ecological, landscape, and climatic processes that are more important relative to lake ice phenology in the Arctic. It’s possible that a longer open-water season causes higher productivity, warmer bed sediments, and deeper thaw into sublake permafrost that could lead to higher GHG emissions, but I don’t believe this has been well documented. Otherwise the timing of lake ice cover can help explain when methane and other GHGs are released, but doesn’t affect the total flux as implied in the opening sentence of the abstract.
We agree with the reviewer that a relationship between ice phenology and GHG emissions remains uncertain, which, as we explain in the paper, is one of the motivations for performing this study and employing the RF detection of ice phenology from SAR technique. Future studies could employ these techniques for lake phenology detections and use available datasets or methods for GHG detection to better constrain the nature of this relationship, if it exists, as we note in section 4.3. We removed the phrase “which directly impacts total carbon flux” from the opening sentence of the introduction so as to not mischaracterize the state of the field or the linkage between lake phenology and GHG emissions. We will also update the discussion to emphasize some of these other climatic processes that depend upon ice on/off timing, such as the evaporative flux from ice free lakes, which has been shown to be important for this area (Arp et al., 2015).
3 – Ice cover formation and decay can occur over several weeks or more on arctic lakes, yet little information is provided about what moment this method(s) is(are) detecting. This is important for applying SAR backscatter because the signal will vary widely over these conditions and it is also quite difficult to differentiate varying open-water conditions (still vs. rough water) from this range of evolving ice surface conditions. The use of optical Sentinel 2 imagery and reanalysis data probably helps with some of this, but honestly I had a very hard time following these methods and the supplemental materials didn’t help much. It seems like a confusing cross-validation approach, where actual observations from sensors, cameras, or higher temporal resolution imagery (like MODIS or Planet) compared to these methods would serve as much more convincing validation even if it was only for a few lakes.
We agree that the formation and decay of ice can occur over weeks, and that assigning a particular date to ice-off or ice-on is challenging and carries a corresponding amount of uncertainty. The ice-off and ice-on moments that this method detects are periods that we believe have a high certainty of being ice-free and ice-covered, respectively. As explained in section 2.3 in the methods, to increase the likelihood that these dates are on the more conservative side of certainty (likely later than ice-off/on in reality):
- We impose an ERA5-derived temperature-based threshold of at least ten consecutive days of temperatures above 10 degrees C or below –20 degrees C to aid in the RF model training label process for calling a particular lake-date pair water or ice, respectively
- We remove labels in which water is called for temperatures below –5 degrees C and ice is called for above 5 degrees C
- We set a search window for ice-off between April 1st and August 31st
- We ensured that the ice-off date was the first date in which there are two consecutive satellite observations (about two weeks) in which water was predicted by the RF model
- We applied this same strategy for ice-on after September 1st and imposed the ice-on date being called after two consecutive ice predictions for a given lake. These details are explained in the methods section 2.4.
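The two-consecutive-observation rule in the list above can be sketched as follows. This is a hedged illustration and not the authors' implementation: the function names and the `Observation` type are hypothetical, the December 31 end of the ice-on window is an assumption, and returning the first date of the qualifying pair for ice-on is also an assumption (the text states this explicitly only for ice-off).

```python
from datetime import date
from typing import List, Optional, Tuple

Observation = Tuple[date, str]  # (acquisition date, predicted class: "ice" or "water")


def first_consecutive(obs: List[Observation], target: str,
                      start: date, end: date) -> Optional[date]:
    """Return the first date in [start, end] that begins two consecutive
    observations predicted as `target`, or None if no such pair exists."""
    window = sorted((d, c) for d, c in obs if start <= d <= end)
    for (d1, c1), (_, c2) in zip(window, window[1:]):
        if c1 == target and c2 == target:
            return d1
    return None


def ice_off_and_on(obs: List[Observation],
                   year: int) -> Tuple[Optional[date], Optional[date]]:
    """Apply the search windows from the reply to one lake-year of predictions."""
    ice_off = first_consecutive(obs, "water", date(year, 4, 1), date(year, 8, 31))
    # Ice-on is searched from September 1; the year-end cutoff is assumed here.
    ice_on = first_consecutive(obs, "ice", date(year, 9, 1), date(year, 12, 31))
    return ice_off, ice_on
```

Requiring a second confirming observation means an isolated misclassification, for example wind-roughened water read as ice, does not set a phenology date on its own.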
We further agree with the reviewer that the signal from SAR backscatter can vary over large ranges during the shoulder season owing to complex freeze-thaw occurring over the span of weeks in some cases. We observed this phenomenon when manually analyzing some of the SAR data in a preliminary test for the RF model and consequently implemented the above methodology to minimize the effects of this phenomenon on our ice on/off detections. As mentioned, we believe that the use of optical data from Sentinel-2 and reanalysis data from ERA5 aids our ability to create robust training labels for the RF model. To address this reviewer’s comment and a separate reviewer’s comment, we will include a workflow diagram explaining the salient points in the proposed method, and we will move several items from the supplementary materials to the main text (see response to reviewer #1). Additionally, we explain the details of the training label creation for the RF model in section 2.3 of the methods, and we include a statement referring readers to the code that implements this proposed method hosted on GitHub and Zenodo.
While we agree that actual observations would provide a convincing validation of the method for select lakes, the rationale for this study is a regional scale analysis of ice phenology; while prior work (cited in the manuscript) has used in situ sensors and cameras to validate ice on-off timing for a limited number of lakes, the scale of analysis in this proposed method, and the emphasis on aggregate regional trends rather than ice phenology at individual features, precludes such a validation scheme. We believe that our proposed method, with conservative imposed constraints on both reanalysis-derived temperature and multispectral indices, maximizes the likelihood that ice-free and ice-covered conditions as well as ice-off and ice-on dates are predicted accurately within the effective temporal resolution of a spaceborne SAR system. However, we will add text to the Discussion section discussing the value of in situ sensors for validation of local ice phenology, and the potential to integrate in situ sensors and moderate resolution imagery (e.g. MODIS) for validation as future avenues that might improve the proposed method.
4 – Larger lakes generally have longer ice cover than smaller lakes in this region. The largest lake, Teshekpuk, is a good example and usually has the longest ice cover duration on the North Slope. The notion that larger lakes are consistently deeper doesn’t really apply well here, where there are lots of very large thermokarst lakes that are less than 2-m deep and many smaller dune-trough and glacial lakes that are over 20-m deep. Teshekpuk Lake has an average depth of only about 4 m.
Our findings suggest that lake area shows a statistically significant, but negligible correlation with ice-off and ice-on dates and ice-free duration; while this relationship is detectable given the large sample size, it explains minimal variation in phenology as discussed in sections 3.4, 4.32, and Figure 6. We agree with the reviewer that a highly correlated scaling between lake depth, area and ice-free duration does not exist, potentially due to the limited range of thermokarst lake depths in this region and the importance of the underlying substrate (Hinkel et al., 2012). Despite this, some studies have shown moderate correlations between lake depth and area (Arp et al., 2011), with modelling suggesting physical mechanisms by which lake deepening and enlargement can occur (Ohara et al., 2022). We therefore suggest a very general correspondence between these morphometric properties, which allows for the observed variability the reviewer notes.
5 – Besides the mountain to coastal temperature and snow gradient, lake depth is generally the most important control on ice-out timing, particularly in coastal plain thermokarst lakes that have bedfast ice (shallow) and floating ice (deeper) (see Sellmann et al. 1975). There is region-wide data on these conditions (see Grunblatt and Atwood 2014).
We agree with the reviewer on the relative importance of lake depth as a control on ice-out timing. We will include a brief discussion of the importance of lake depth as a control on ice-out timing in the revised manuscript, with explicit reference to both Sellmann et al. 1975 and Grunblatt and Atwood 2014, among other related studies.
A few specific points:
L51: I’m not aware of any lake ice phenology estimates based on ice thickness.
Degree day estimates along with optical remote sensing have been used to constrain lake ice phenology (Kirchner and Hannam, 2024) but we have updated the sentence to avoid confusion or misrepresentation of ice thickness estimates from utilization of degree-day metrics for estimating ice phenology. Instead, lake ice phenology can, at least in part, be constrained through degree-day metrics rather than through estimates of ice thickness.
L59-61: The relationship between lake depth and ice-out timing is specifically shown for thermokarst lakes by Arp et al 2015, which is referenced elsewhere in this manuscript.
We will update the text in line 59 to discuss the results from Arp et al., 2015 with something along the lines of:
“Prior investigations into the relationship between lake depth and ice-off timing for lakes in Northern Alaska have found that the relationship between lake depth and ice thickness -- which dictates whether lake ice is bedfast or floating -- exerts a strong control on ice-off timing, occurring on average 17 days earlier for bedfast ice lakes than floating ice lakes (Arp et al., 2015).”
Figure 2: Suggest putting dates rather than days of the year here.
We appreciate the reviewer's comment but believe that changing the DOY formatting to standard date formatting is not necessary. DOY is a conventional format which we argue is the more intuitive metric to report in the case of ice-on and off dates, as it facilitates direct comparison of timing among years and sites without the additional interpretation required for calendar dates. Reporting DOY also emphasizes seasonal progression and allows readers to more readily assess lake ice phenology.
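For readers who prefer calendar dates, converting between the two formats is mechanical. The short sketch below (the helper name `doy_to_date` is hypothetical) reproduces the dates quoted in the abstract for a non-leap year and shows that the same day-of-year falls one calendar day earlier in a leap year such as 2020, which lies within the study period.

```python
from datetime import date, timedelta


def doy_to_date(year: int, doy: int) -> date:
    """Convert a 1-based day-of-year to a calendar date in the given year."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


print(doy_to_date(2021, 163))  # 2021-06-12, the mean ice-off date in the abstract
print(doy_to_date(2021, 278))  # 2021-10-05, the mean ice-on date in the abstract
print(doy_to_date(2020, 163))  # 2020-06-11, one day earlier in a leap year
print(278 - 163)               # 115, the mean ice-free period in days
```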
L321-324: These confounding factors described here have a profound impact on using SAR to detect ice and open water and need to be addressed in the methods.
We agree that these factors have a significant impact on SAR backscatter for detecting ice-covered and ice-free conditions, as well as specific ice-off and ice-on dates, particularly during the shoulder seasons. In our methodology we have introduced a series of temperature-based thresholds, consistency checks, and consecutive occurrences (as explained in prior responses to the reviewer comments and in sections 2.3 and 2.4 in the methods) that mitigate the effect of these factors on our retrieval. We will include additional discussion of these effects in the methods section to address this reviewer’s comment.
Citation: https://doi.org/10.5194/egusphere-2026-1135-AC2
-
RC3: 'Comment on egusphere-2026-1135', Anonymous Referee #3, 16 Apr 2026
Nguyen et al.
General Comments: The authors present ice phenology retrievals for over 30 000 lakes on the Alaskan Coastal Plain from the period of January 2019 to December 2023 using Sentinel-1 and Sentinel-2 to train and run a random forest machine learning model, which is validated using an Accumulated Degree Day (ADD) equation forced using ERA-5 data. The authors present clear, concise objectives that are statistically tested, with reasonable conclusions based on the assumptions of the input data. There are some issues with the Methodology, where statistical tests are presented in the results section but not described in the Methods.
The main criticism which I raise with regards to the methodology is the question of ground truth or similar data. This study effectively produces an image stack of several Sentinel-1 polarimetric parameters (VV, VH, VV/VH) and retrieves ice phenology based on Sentinel-2 derived image labels (classifying as open water or ice from the Normalized Difference Snow Index (NDSI)). The resulting ice phenology retrieved from Sentinel-1 is then statistically compared against ADDT and ADDF, which are forced only with air temperature (if another variable is used, it is not described in an equation). There is no in-situ data or thermodynamic model that is used as validation in this study. The issue with using the thresholded NDSI values to discriminate open water and ice is that the NDSI does not perform well in the fringe seasons that the authors are looking to retrieve. Prior to snowfall, lakes that freeze with bare ice, or wind swept ice can be classified as open water. Similarly, during the melt season, ice whose snow cover has melted, exposing the clear ice beneath, can be classified as open water, which can cause issues with the NDSI-derived ice flags. The authors need to show the reader with Sentinel-2 imagery the NDSI classification of freeze or melt to show its efficacy, but also quantify the accuracy of the ice labels derived from the NDSI. The other metric that is being used as validation is the ADDT and the ADDF, both of which are reliant on the air temperature at a 30km resolution. Firstly, there are higher resolution gridded air temperature reanalysis products (NLDAS is at 12.5km, and the NClimGrid is 5km) that could reduce the variability in comparisons. Secondly, the ADDT and ADDF do not account for the thermal mixing that lakes undergo prior to freeze-up, introducing considerable error.
I understand that with the sheer number of lakes in this analysis that a fully parameterized thermodynamic lake model may not be appropriate, but the authors should a) use reanalysis air temperature with a higher spatial resolution, and b) show that air temperature agrees with in-situ weather stations (e.g. Utqiagvik, Umiat, Inigok, Toolik Lake). This would better constrain the error associated with the ADDT/ADDF.
The authors present quite a large degree of variability in ice free period based on latitudinal gradient, 30 days per degree. There can be quite a bit of variance when modeling ice phenology using in-situ weather station data; assigning all the lakes within a 30 km pixel one freeze-up and one melt date reduces this variability. This is likely why there is such a large spread shown in Figure 3. Additionally, based on the spatial clustering of high residuals (upwards of 60 days) in the lakes in and surrounding the Brooks mountain range, those lakes should not be included in the summary of ice free days per degree. The pattern of 30 days per degree is then discussed in Section 4.2 as six times that of the broader range of the Canadian Cordillera at 6 days per degree, “consistent with the heightened climate sensitivity of lakes near the northern limits of the seasonal ice zone”. Then the authors state “this sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northern most lakes”. It appears that the baked-in error associated with large deviations in the southern region of the study site has influenced the 30 day per degree ice free days, and sweeping statements like this should be omitted.
The manuscript requires revision with respect to the validation (Sentinel-2, ADD) of ice covered regions, both in data analysis and text, to make the reader believe that the patterns being presented are a function of ice dynamics in-situ.
Specific Comments:
Page 2 Line 57: “the degree to which thermokarst lake morphology may influence ice phenology relative to estimates based on ADD metrics remains largely unknown”
I don’t see how you could test how lake morphology affects ADD metrics. There are no inputs to the ADD that would be affected by morphometry. Other lake ice models have a mixing depth or bottom associated with them.
Page 3 Lines 93-94: “we were able to retrieve phenology from 30,862 lakes (99.2%)…
Why not all lakes in the dataset? What was the factor that inhibited the other 0.8% from being obtained?
Page 4 Line 112: “the period January 2019 through December 2023”
Why this time scale? Sentinel-1 and 2 have been in orbit since 2015, and after 2023.
Page 5 Line 119: “-20 to -5 dB”
For this location and others where you refer to the standardization of the symbologies – what justification is there for the max and min and the association made to the different scattering mechanisms associated with them? There needs to be a justification.
Page 5 Line 128: “we classified NDSI values > 0.4 as snow/ice”
Does this mean that the input was then classified into a binary raster?
Page 7 Line 194: This paragraph describes the ADD model
Some equations are needed here to describe the ADD. Also, some justification here for using the ADDT and ADDF instead of a thermodynamic model, what the drawbacks are and the benefit of using it for such a large dataset. Where do you expect the air temperature to exhibit the highest error (e.g. mountainous regions).
Page 7 Line 197, 198: “ranging from DOY 91 to DOY 242”…”DOY 244 to DOY 361”
These are the endpoints of the DOY that the model has constrained as possible days for open water and freeze-up. The output having dates at the first or last potential should warrant a revision in the model to see why those dates are being flagged.
Page 8 Line 210: “After loss of Sentinel 1-B”
This should be discussed in the data section – how did it affect your analysis? When was Sentinel 1-B lost?
Figure 3: Figure 3 and in text does a good job at identifying that outside of observing a lake within a single Sentinel-1 revisit that the error increases substantially. There should also be a figure that shows the spatial pattern of comparison to the ADDT and ADDF estimates. Is this pattern consistent year to year, is the pattern consistent between freeze and melt? If consistent spatial patterns exist latitudinally, that may also drive the very large 30 day per degree reduction in ice free season. Figure S10 appears to show the spatial distribution of the residuals, showing that the largest residuals are in the southern most regions in the Brooks mountain range.
Page 9 Lines 225-226: “we implemented a series of statistical analyses including …”
These statistical tests should not be introduced in the Results section. They should be in a validation or statistical tests subheading in the methods section.
Figure 5: The X and Y axis need to be flipped. The independent variable is latitude, and the dependent variable is the ice-free duration.
Page 15 Line 324: “ice phenology y difficulty”
Typo
Page 15 Lines 327 – 329: “Likewise, the use of ADD to model thermal diffusion through the ice cover…”
You should also mention that the ADD does not account for the thermal mixing that is required of lakes prior to initial skim ice formation. Larger/deeper lakes will take longer to mix and delay ice cover establishment.
Page 15 Line 335: “models in figure 6”
Should be “Fig. 6”.
Page 16 Lines 348-350 and Lines 355 – 358: There are two statements that are overly generalized here:
“This sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northernmost lakes, with consequently disproportionate effects on future green-house gas emissions (Walter Anthony et al., 2018).”
“This indicates that the RF model detects a longer ice-free period for lower latitude lakes and a shorter ice-free period for higher latitudes relative to the ADD estimates, suggesting that the ADD model is more conservative in predicting ice-free duration at lower latitudes whereas at higher latitudes, the RF model is more conservative”
The lakes in the lower latitudes are surrounding the Brooks mountain range, at higher elevations with variation in weather conditions. While this is acknowledged in the paper, the extrapolation that the 30 day per degree would potentially result in considerably larger greenhouse gas emissions does not take into account the differences in elevation, situation, weather patterns, precipitation, etc. Latitude alone does not influence ice cover duration. This would be a good circumstance to test to see if the ERA5 air temperature agrees with the Toolik weather station located at the foothills of the Brooks range.
Page 17 Lines 387-388: “these larger and deeper lakes remain ice-free longer because they begin freezing later in the year likely due to their greater thermal inertia”.
And the longer period associated with thermal mixing.
The authors mention the loss of Sentinel 1B – the density of measurements (i.e. repeat passes) must have been reduced as a result. I think that this should be mentioned and identify if it impacts the accuracy of the RF model before and after losing Sentinel 1-B.
Supplemental Figures: Be sure that for the ice-off and ice-on dates in each of the figure captions you reference what analysis output the data is from. For instance, Figure S5: “Box plots of ice-off dates for phenology-detected lakes each year”. Is this the RF output? The ADD output?
Citation: https://doi.org/10.5194/egusphere-2026-1135-RC3
AC3: 'Reply on RC3', Alexander Nguyen, 29 May 2026
We thank the reviewers for their thoughtful and insightful comments, which helped improve our manuscript. We outline the revisions we plan on making to the manuscript in response to the reviewer feedback and address their specific comments and evaluations below. Broadly, we will implement additional clarification in the form of additional text and figures regarding the choices made in our methodology in the random forest, training labels, and implications of constraints in the methods and available datasets. We will further contextualize our results in relation to findings from previous studies by performing additional analysis, such as incorporating bedfast and floating ice data, where required.
Reviewer #3 comments:
Nguyen and Lopez et al
General Comments: The authors present ice phenology retrievals for over 30 000 lakes on the Alaskan Coastal Plain from the period of January 2019 to December 2023 using Sentinel-1 and Sentinel-2 to train and run a random forest machine learning model, which is validated using an Accumulated Degree Day (ADD) equation forced using ERA-5 data. The authors present clear, concise objectives that are statistically tested, with reasonable conclusions based on the assumptions of the input data. There are some issues with the Methodology, where statistical tests are presented in the results section but not described in the Methods.
The main criticism which I raise with regard to the methodology is the question of ground truth or similar data. This study effectively produces an image stack of several Sentinel-1 polarimetric parameters (VV, VH, VV/VH) and retrieves ice phenology based on Sentinel-2 derived image labels (classifying as open water or ice from the Normalized Difference Snow Index (NDSI)). The resulting ice phenology retrieved from Sentinel-1 is then statistically compared against ADDT and ADDF, which are forced only with air temperature (if another variable is used, it is not described in an equation). There is no in-situ or thermodynamic model used as validation in this study. The issue with using the thresholded NDSI values to discriminate open water and ice is that the NDSI does not perform well in the fringe seasons that the authors are looking to retrieve. Prior to snowfall, lakes that freeze with bare or wind-swept ice can be classified as open water. Similarly, during the melt season, ice whose snow cover has melted, exposing the clear ice beneath, can be classified as open water, which can cause issues with the NDSI-derived ice flags. The authors need to show the reader with Sentinel-2 imagery the NDSI classification of freeze or melt to show its efficacy, but also quantify the accuracy of the ice labels derived from the NDSI. The other metrics used as validation are the ADDT and the ADDF, both of which are reliant on the air temperature at a 30 km resolution. Firstly, there are higher resolution gridded air temperature reanalysis products (NLDAS is at 12.5 km, and the NClimGrid is 5 km) that could reduce the variability in comparisons. Secondly, the ADDT and ADDF do not account for the thermal mixing that lakes undergo prior to freeze-up, introducing considerable error.
I understand that with the sheer number of lakes in this analysis that a fully parameterized thermodynamic lake model may not be appropriate, but the authors should a) use reanalysis air temperature with a higher spatial resolution, and b) show that air temperature agrees with in-situ weather stations (e.g. Utqiagvik, Umiat, Inigok, Toolik Lake). This would better constrain the error associated with the ADDT/ADDF.
We appreciate the reviewer’s general comments and criticisms of this work. We would like to clarify that the RF model used in this study is trained on the Sentinel-1 data itself (VV, VH, and the cross-polarized VV/VH ratio from both the lake interior and marginal landscape). The RF model uses training labels derived from Sentinel-2 NDSI as well as ERA5 reanalysis 2m air temperature data. Furthermore, ADDT and ADDF are only estimated from air temperature data alone, as there is no readily-available product for land surface temperature at this spatiotemporal resolution. 2m air temperature data is the most commonly-used alternative. ADDT/ADDF is an empirical index model that we use in place of in-situ or thermodynamically-modeled validation, and ADDT/ADDF is commonly used for lake ice studies (Stefan 1891; Richards 1964; Bilello 1980; Ashton 1985; Korhonen 2006; Arp et al. 2010; Kirchner and Hannam 2024).
We agree that solely relying on NDSI to discriminate between open water and ice during the fringe season would not be advised, owing to misclassification of open water during freeze-up and melt from misidentified ice reflectance. For this reason, our strategy for training labels uses a combined NDSI threshold and temperature criterion check, as explained in the methods section 2.3, the response to reviewer #1’s second comment, and the response to reviewer #2’s third comment. Furthermore, the temperature data comprises 62% of training labels, as mentioned in the manuscript. This thresholding approach mitigates the confounding effects of ice phenology in the shoulder season mentioned by the reviewer. To make this clearer in the text, we will expand upon our explanation of the RF model training in the revision to the manuscript. Additionally, we agree that it would be illustrative to include a figure depicting Sentinel-2 imagery, corresponding NDSI value, and ice/water classification for randomly-selected lakes during both the ice-off and ice-on periods. We plan on creating this supplemental 6-panel figure to visually show the efficacy and accuracy of the RF model.
While we appreciate the reviewer’s suggestions for the use of higher resolution climate reanalysis products, to our knowledge, and after an investigation in response to this comment, we note that the NClimGrid and NLDAS products suggested by the reviewer are available for the contiguous United States, not Alaska. NLDAS-3, which has planned Alaskan coverage, is currently in development, but at present only the meteorological model input (forcing) data is available, not the output data. For these reasons, we argue that the use of ERA5 is not only appropriate, but the best choice for this study.
We appreciate the reviewer’s suggestion to compare ERA5 air temperature with in situ weather stations; the reviewer is correct that the 31 km spatial resolution of ERA5 cannot fully capture the spatial variability of air temperature or other climatic variables. However, a detailed validation of ERA5 reanalysis air temperature with in situ records is beyond the scope of this project. Instead, we will supplement the revision with a brief discussion of prior studies that have demonstrated excellent agreement between ERA5 and in situ weather stations across the Arctic (Clelland et al. 2024; Ibebuchi et al. 2024; Tarek et al. 2020; Pernov et al. 2024). For example, Clelland et al. 2024 showed agreement of ERA5 with in-situ data in Siberia, Ibebuchi et al. 2024 found that ERA5 outperforms other reanalysis products for comparison with in-situ data across the contiguous United States, Tarek et al. 2020 found that ERA5 hydrological modeling was comparable to in-situ data across North America, and Pernov et al. 2024 found an R² of 0.94 for temperature when comparing ERA5 to 17 weather stations across the Arctic.
The reviewer is completely correct in noting that the empirical ADDT/F model does not explicitly account for thermal mixing within the water column, potentially introducing error into the ADD model results. However, we agree with the reviewer’s remark that the sheer size and scale of this work precludes the implementation of a robust thermodynamic lake model.
The authors present quite a large degree of variability in ice free period based on latitudinal gradient, 30 days per degree. There can be quite a bit of variance when modeling ice phenology using in-situ weather station data; assigning all the lakes within a 30 km pixel one freeze-up and one melt date reduces this variability. This is likely why there is such a large spread shown in Figure 3. Additionally, based on the spatial clustering of high residuals (upwards of 60 days) in the lakes in and surrounding the Brooks mountain range, those lakes should not be included in the summary of ice free days per degree. The pattern of 30 days per degree is then discussed in Section 4.2 as six times that of the broader range of the Canadian Cordillera at 6 days per degree, “consistent with the heightened climate sensitivity of lakes near the northern limits of the seasonal ice zone”. Then the authors state “this sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northern most lakes”. It appears that the baked-in error associated with large deviations in the southern region of the study site has influenced the 30 day per degree ice free days, and sweeping statements like this should be omitted.
In reference to the spatial clustering of residuals shown in supplemental figure 10, we agree that the lakes with high residuals (including up to 60 days) should not be included in the summary report of ice-free days per degree, as suggested by the reviewer. We will remove the relevant statements the reviewer highlighted. To address the necessary change in our summary report of ice-free days per degree, we will either remove lakes below a latitude threshold or process lakes below this threshold separately. We will then report the updated summary report for ice free days per degree and make a clear distinction between lakes on the Arctic Coastal Plain and those in and around the Brooks Range.
We will further remove the statement comparing the reported latitudinal/degree relationship to the Canadian Cordillera, and we will instead restrict our regional analysis to the subpopulation of lakes within the Arctic Coastal Plain and the Brooks Range foothills.
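The planned separation of Coastal Plain and foothill lakes can be pictured with a short sketch. This is only an illustration under stated assumptions, not the analysis code: the latitude threshold, the function names, and the lake values below are all hypothetical, and an ordinary least-squares slope stands in for whatever regression the revision will use.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y against x (here: days per degree)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slope_by_region(lakes, latitude_threshold):
    """Split (latitude, ice_free_days) pairs at a latitude threshold and
    fit one slope per subpopulation. The threshold is a hypothetical
    parameter; the reply does not state a value."""
    north = [(lat, d) for lat, d in lakes if lat >= latitude_threshold]
    south = [(lat, d) for lat, d in lakes if lat < latitude_threshold]
    result = {}
    for name, group in (("coastal_plain", north), ("foothills", south)):
        lats = [lat for lat, _ in group]
        days = [d for _, d in group]
        # A slope needs at least two distinct latitudes.
        result[name] = ols_slope(lats, days) if len(set(lats)) > 1 else None
    return result


# Synthetic values for illustration only, not results from the study.
example_lakes = [(70.0, 110), (70.5, 100), (71.0, 90), (68.5, 130), (69.0, 125)]
slopes = slope_by_region(example_lakes, latitude_threshold=69.5)
```

Reporting one slope per subpopulation keeps high-residual foothill lakes from leveraging the Coastal Plain gradient.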
The manuscript requires revision with respect to the validation (Sentinel-2, ADD) of ice covered regions, both in data analysis and text, to make the reader believe that the patterns being presented are a function of ice dynamics in-situ.
As mentioned above in our responses to this reviewer’s comments, we believe that our proposed inclusion of an additional figure showing randomly-selected imagery, an update to the summary report for ice free days per degree, and text surrounding these additions will address this revision request related to data analysis and text for validation of our method.
Specific Comments:
Page 2 Line 57: “the degree to which thermokarst lake morphology may influence ice phenology relative to estimates based on ADD metrics remains largely unknown”
I don’t see how you could test how lake morphology affects ADD metrics. There are no inputs to the ADD that would be affected by morphometry. Other lake ice models have a mixing depth or bottom associated with them.
This line was unclear and has been corrected. We fully agree that lake morphology would not affect ADD metrics, which was the intended point of the sentence. While lake morphology may affect lake ice phenology, the ADD metrics would be unable to capture this behavior because they are independent of any morphometric properties:
“As a result, the degree to which thermokarst lake morphology may influence ice phenology relative to estimates based on ADD metrics, which do not account for morphology, remains largely unknown.”
Page 3 Lines 93-94: “we were able to retrieve phenology from 30,862 lakes (99.2%)…
Why not all lakes in the dataset? What was the factor that inhibited the other 0.8% from being obtained?
Of the 31,108 lakes in the dataset, we did not retrieve phenology for 246 (0.8%). We counted a lake as successfully detected only if it had both an ice-off and an ice-on detection in any year within the five-year period; each of these 246 lakes lacked an ice-on detection, an ice-off detection, or both. Possible reasons include the absence of the required minimum of two consecutive observations of ice or water for a given lake, or the unlikely cases in which ice-off occurred outside the search window of April 1st to August 31st or ice-on occurred before our search start date of September 1st. These date parameters, set in the code, may therefore have prevented detection of one or both phenology dates for these 246 lakes over the five-year period. We plan to further analyze why these lakes fell outside the set constraints, along with their locations and any consistencies in their morphometrics. We will include this information in the text, likely in section 3.1 when addressing the ice phenology results.
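To make the detection rule concrete, the following is a minimal sketch of how a search window combined with a two-consecutive-observation requirement can leave a lake without a phenology date. It is not the code from the repository: the function name, the choice to report the first date of the run, and the end of the ice-on window are assumptions made for illustration.

```python
from datetime import date


def first_persistent(observations, state, start, end, min_run=2):
    """Return the first date that begins a run of `min_run` consecutive
    observations equal to `state` inside [start, end], else None.

    `observations` is a list of (date, "ice" | "water") tuples.
    """
    in_window = [(d, s) for d, s in sorted(observations) if start <= d <= end]
    run_start, run_len = None, 0
    for obs_date, obs_state in in_window:
        if obs_state == state:
            if run_len == 0:
                run_start = obs_date
            run_len += 1
            if run_len >= min_run:
                return run_start
        else:
            run_len = 0  # a contradicting observation breaks the run
    return None


# Synthetic classified observations for one lake in one year.
obs = [
    (date(2021, 5, 20), "ice"),
    (date(2021, 6, 1), "water"),
    (date(2021, 6, 7), "ice"),
    (date(2021, 6, 13), "water"),
    (date(2021, 6, 19), "water"),
    (date(2021, 9, 30), "water"),
    (date(2021, 10, 6), "ice"),
    (date(2021, 10, 12), "ice"),
]

# Ice-off window April 1 to August 31, as stated in the reply.
ice_off = first_persistent(obs, "water", date(2021, 4, 1), date(2021, 8, 31))
# Ice-on search starts September 1; the December 31 end date is an assumption.
ice_on = first_persistent(obs, "ice", date(2021, 9, 1), date(2021, 12, 31))
```

A lake whose window contains only isolated or alternating classifications returns None for that date, which is one way a lake could end up among the undetected 0.8%.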
Page 4 Line 112: “the period January 2019 through December 2023”
Why this time scale? Sentinel-1 and 2 have been in orbit since 2015, and after 2023.
We selected this time scale based on the availability of the Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A (SR) collection from Google Earth Engine, as described in the code. Although this collection begins in 2017, the 2017-2018 data do not have global coverage, so we selected 2019 as the start year. Using 2023 as the final year gave a nominal five-year time span over which to retrieve lake ice phenology. We can include this information in the supplement if necessary.
Page 5 Line 119: “-20 to -5 dB”
For this location and others where you refer to the standardization of the symbologies – what justification is there for the max and min and the association made to the different scattering mechanisms associated with them? There needs to be a justification.
The justification for the maximum and minimum of these ranges, as well as for the associated scattering mechanisms, is explained in Surdu et al. 2015 and Murfitt and Duguay 2020. We will add text explaining this to this paragraph, along with these references.
Page 5 Line 128: “we classified NDSI values > 0.4 as snow/ice”
Does this mean that the input was then classified into a binary raster?
This is correct. We apply the 0.4 threshold to the NDSI computed for each pixel within a lake-date pair to obtain a binary classification. We then take the mean of this binary classification across all pixels within the lake-date pair to get an ice fraction (0-1). This ice fraction is read into our RF label creation code, which labels an individual lake-date as ice if the ice fraction is > 80% and as water if it is < 20%. We will add text about the binary classification to section 2.2. Likewise, we will add the ice fraction thresholds to the first paragraph of section 2.3.
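The labeling steps described in this response can be sketched as follows. This is an illustrative reconstruction, not the repository code: the function names are hypothetical, NDSI is taken in its standard form (green minus shortwave infrared, divided by their sum), and the additional temperature criterion mentioned elsewhere in the reply is deliberately left out.

```python
def ndsi(green, swir):
    """Standard NDSI from green and shortwave-infrared reflectance."""
    total = green + swir
    return (green - swir) / total if total != 0 else float("nan")


def label_lake_date(green_pixels, swir_pixels,
                    ndsi_threshold=0.4, ice_min=0.8, water_max=0.2):
    """Return (ice_fraction, label) for one lake-date pair.

    Each pixel is flagged 1 (snow/ice) if NDSI > ndsi_threshold, else 0.
    The mean of the flags is the ice fraction. The label is "ice" above
    ice_min, "water" below water_max, and None (unlabeled) in between.
    """
    flags = []
    for g, s in zip(green_pixels, swir_pixels):
        value = ndsi(g, s)
        if value == value:  # skip NaN pixels (NaN is not equal to itself)
            flags.append(1 if value > ndsi_threshold else 0)
    if not flags:
        return None, None
    fraction = sum(flags) / len(flags)
    if fraction > ice_min:
        return fraction, "ice"
    if fraction < water_max:
        return fraction, "water"
    return fraction, None


# Synthetic reflectances for illustration only.
frozen = label_lake_date([0.8, 0.7, 0.9, 0.6], [0.1, 0.1, 0.1, 0.1])
open_water = label_lake_date([0.05, 0.05, 0.05], [0.04, 0.04, 0.04])
mixed = label_lake_date([0.8, 0.8, 0.05, 0.05], [0.1, 0.1, 0.04, 0.04])
```

Leaving the 20-80% band unlabeled keeps partially ice-covered lake-dates out of the training set.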
Page 7 Line 194: This paragraph describes the ADD model
Some equations are needed here to describe the ADD. Also, some justification here for using the ADDT and ADDF instead of a thermodynamic model, what the drawbacks are and the benefit of using it for such a large dataset. Where do you expect the air temperature to exhibit the highest error (e.g. mountainous regions).
We agree with the reviewer that reporting the relevant equation for the ADD model would help motivate its use; we will update this paragraph to include the equations used in the ADD model. We will also incorporate a discussion of its prior use in similar studies of lake ice formation and decay, the strengths and limitations of the ADD model, and a justification for why we favor this method over a thermodynamic model for the regional scale study presented here.
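For readers unfamiliar with the index, a generic accumulated degree-day calculation is sketched below. It shows the textbook form only, under stated assumptions: a 0 °C base temperature, daily mean air temperature as input, and a hypothetical helper for finding the day a threshold is reached. The equations and thresholds actually used in the manuscript may differ.

```python
def accumulated_degree_days(daily_mean_temps_c, base_c=0.0, mode="thaw"):
    """Running sum of degree-days relative to a base temperature.

    mode="thaw"   accumulates degrees above the base (ADDT).
    mode="freeze" accumulates degrees below the base (ADDF).
    """
    if mode not in ("thaw", "freeze"):
        raise ValueError("mode must be 'thaw' or 'freeze'")
    total = 0.0
    series = []
    for temp in daily_mean_temps_c:
        if mode == "thaw":
            total += max(temp - base_c, 0.0)
        else:
            total += max(base_c - temp, 0.0)
        series.append(total)
    return series


def first_index_reaching(series, threshold):
    """Index of the first day the running sum reaches `threshold`
    (a hypothetical ice-on or ice-off trigger), or None if never reached."""
    for index, value in enumerate(series):
        if value >= threshold:
            return index
    return None


# Synthetic daily mean temperatures in degrees Celsius.
temps = [-5, -2, 1, 3, 4]
addt = accumulated_degree_days(temps, mode="thaw")
addf = accumulated_degree_days(temps, mode="freeze")
```

Because air temperature is the only input, lake depth, area, and mixing cannot influence the result, which is the limitation the reviewer raises.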
Page 7 Line 197, 198: “ranging from DOY 91 to DOY 242”…”DOY 244 to DOY 361”
These are the endpoints of the DOY that the model has constrained as possible days for open water and freeze-up. The output having dates at the first or last potential should warrant a revision in the model to see why those dates are being flagged.
This is true. The ranges reported here for ice-off and ice-on are exactly the windowed DOY we selected to search for ice-off and ice-on. We originally chose these dates such that there are no instances in which ice-off is called after ice-on for a given year. To remedy this current issue, we plan to remove this search window and mark specific lakes in which ice-off is called after ice-on and will exclude them from the main reported summary statistics in this paper. However, we will note the number and percentages of lakes, as well as their spatial distribution and general morphometrics, that exhibit the physically impossible ice-off call after ice-on call. We will also report our ranges using either standard deviations from the mean or 5th and 95th percentiles to remove outliers.
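The proposed screening can be sketched as follows. This is an illustration under assumptions, not the planned implementation: the record layout and function names are hypothetical, and Python's `statistics.quantiles` with the inclusive method stands in for whichever percentile definition the revision adopts.

```python
from statistics import quantiles


def split_valid_and_inverted(records):
    """Separate (lake_id, ice_off_doy, ice_on_doy) records into physically
    plausible ones and ones where ice-off is called after ice-on."""
    valid, inverted = [], []
    for lake_id, ice_off, ice_on in records:
        if ice_off > ice_on:
            inverted.append((lake_id, ice_off, ice_on))
        else:
            valid.append((lake_id, ice_off, ice_on))
    return valid, inverted


def percentile_range(values, low=5, high=95):
    """Return the (low, high) percentiles of `values`.

    quantiles(..., n=100) yields 99 cut points, so the p-th percentile
    sits at index p - 1. Requires at least two values.
    """
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[low - 1], cuts[high - 1]


# Synthetic day-of-year values for illustration only.
records = [("a", 160, 280), ("b", 290, 250), ("c", 170, 275)]
valid, inverted = split_valid_and_inverted(records)
inverted_share = len(inverted) / len(records)
```

Reporting the share of inverted lakes alongside percentile ranges keeps the excluded cases visible rather than silently dropped.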
Page 8 Line 210: “After loss of Sentinel 1-B”
This should be discussed in the data section – how did it affect your analysis? When was Sentinel 1-B lost?
We will include information about the loss of Sentinel-1B in the methods for section 2.2, including the date on which the satellite was lost (December 23rd, 2021). The loss of Sentinel-1B affected our analysis by increasing the repeat time between acquisitions over our study area, thereby decreasing the density of measurements after the satellite's data transmission failure.
Figure 3: Figure 3 and in text does a good job at identifying that outside of observing a lake within a single Sentinel-1 revisit that the error increases substantially. There should also be a figure that shows the spatial pattern of comparison to the ADDT and ADDF estimates. Is this pattern consistent year to year, is the pattern consistent between freeze and melt? If consistent spatial patterns exist latitudinally, that may also drive the very large 30 day per degree reduction in ice free season. Figure S10 appears to show the spatial distribution of the residuals, showing that the largest residuals are in the southern most regions in the Brooks mountain range.
We agree with the reviewer that there should be a figure comparing the spatial patterns of the ADDT- and ADDF-based estimates with the RF model calls of ice-off and ice-on. We will make a supplemental figure with 4-panel plots showing maps of ice-off and ice-on for both the ADD-derived and RF-derived ice phenology for each year. We will then include text describing the consistency of the pattern between years and between ice-off and ice-on, and its relation to the previously reported 30 days per degree change in ice-free duration, although we expect that this relationship will change significantly once we analyze lakes on the Coastal Plain separately from lakes in the Brooks Range foothills.
Page 9 Lines 225-226: “we implemented a series of statistical analyses including …”
These statistical tests should not be introduced in the Results section. They should be in a validation or statistical tests subheading in the methods section.
We will update the methods, specifically at line 194 (section 2.5) to include a brief overview of the statistical analyses employed in this study:
“We employ the Kruskal-Wallis H-test, Mann-Whitney U-test, and Cliff's Delta (δ) to statistically evaluate any potential controls of latitude and lake morphometry on discrepancies between the RF model and ADD model estimates of ice-off and ice-on dates.”
Figure 5: The X and Y axis need to be flipped. The independent variable is latitude, and the dependent variable is the ice-free duration.
While it is true that the independent variable is latitude, and the dependent variable is the ice-free duration in this case, the choice of orientation for this figure was deliberate, such that latitude is displayed vertically as in a map-view of ice-free duration. However, we will include a statement in the revised text making explicit that the independent variable in this regression is indeed latitude and justify the choice of figure orientation.
Page 15 Line 324: “ice phenology y difficulty”
Typo
Fixed.
Page 15 Lines 327 – 329: “Likewise, the use of ADD to model thermal diffusion through the ice cover…”
You should also mention that the ADD does not account for the thermal mixing that is required of lakes prior to initial skim ice formation. Larger/deeper lakes will take longer to mix and delay ice cover establishment.
We appreciate this suggestion and will update the sentence to discuss how the empirical ADD model does not explicitly account for thermal mixing and what that may mean for ice phenology as a function of lake size.
Page 15 Line 335: “models in figure 6”
Should be “Fig. 6”.
Fixed.
Page 16 Lines 348-350 and Lines 355 – 358: There are two statements that are overly generalized here:
“This sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northernmost lakes, with consequently disproportionate effects on future green-house gas emissions (Walter Anthony et al., 2018).”
“This indicates that the RF model detects a longer ice-free period for lower latitude lakes and a shorter ice-free period for higher latitudes relative to the ADD estimates, suggesting that the ADD model is more conservative in predicting ice-free duration at lower latitudes whereas at higher latitudes, the RF model is more conservative”
The lakes in the lower latitudes are surrounding the Brooks mountain range, at higher elevations with variation in weather conditions. While this is acknowledged in the paper, the extrapolation that the 30 day per degree would potentially result in considerably larger greenhouse gas emissions does not take into account the differences in elevation, situation, weather patterns, precipitation, etc. Latitude alone does not influence ice cover duration. This would be a good circumstance to test to see if the ERA5 air temperature agrees with the Toolik weather station located at the foothills of the Brooks range.
The choice of ERA5 reanalysis air temperature was motivated in part because these reanalysis products do account for physical features and processes such as elevation, prevailing precipitation and weather patterns, etc. Previous studies have analyzed and validated the performance of different ERA5 temperature products for the Arctic region and have found strong performance of ERA5 data (Cao et al., 2020; Graham et al., 2019; Yu et al., 2021). However, a specific ERA5 air temperature analysis for the North Slope of Alaska, to our knowledge, remains outstanding. While we agree with the reviewer that this type of analysis would be beneficial, it is beyond the scope of this study to conduct these analyses, and such an analysis ought instead to be the subject of its own separate study.
We do completely agree with the reviewer’s suggestion to conduct a subpopulation analysis of lakes within the Coastal Plain separately from lakes within the Brooks Range foothills, which we will do in the revision as discussed above. We will further modify the two specific statements identified here by the reviewer to discuss in greater detail the effect of elevation, situation, precipitation, etc. on ice phenology in addition to latitude.
The statement comparing relative performance of the RF and ADD models as a function of latitude (i.e., lines 355-358) was intended to provide a generalized description of model performance over the study region. After performing the suggested changes to our analysis, we will amend this discussion to make explicit that latitude is not the sole controlling factor that dictates ice phenology, and that systematic differences in elevation, precipitation regime, aspect, etc. can also affect ice phenology, and consequently might be expressed in the observed differences between ice-on/off date detections for the RF and ADD models discussed here.
Page 17 Lines 387-388: “these larger and deeper lakes remain ice-free longer because they begin freezing later in the year likely due to their greater thermal inertia”.
And the longer period associated with thermal mixing.
We will modify this statement to the following:
“These larger and deeper lakes remain ice-free longer because they begin freezing later in the year due to a combination of longer periods of thermal mixing within the water column and their greater thermal inertia.”
The authors mention the loss of Sentinel-1B; the density of measurements (i.e. repeat passes) must have been reduced as a result. I think this should be mentioned, and the authors should identify whether it affects the accuracy of the RF model before and after the loss of Sentinel-1B.
We agree that the reduced density of Sentinel-1 measurements following the loss of Sentinel-1B could affect the accuracy of our RF model. We will add text to the methods section explaining that the longer revisit interval after the loss coarsens the granularity of the ice-on/off DOY estimates.
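The effect of revisit interval on date granularity can be illustrated with a minimal sketch. It assumes, purely for illustration, nominal repeat intervals of 6 and 12 days and hypothetical per-acquisition ice labels; the function names `bracket_transition` and `summarize` are illustrative and are not taken from the manuscript or its code. Actual revisit over the North Slope depends on orbit overlap and may differ from these nominal values.

```python
from datetime import date


def bracket_transition(dates, is_ice):
    """Return (last ice-covered date, first open-water date) for the first
    ice-to-water change in a time-ordered series, or None if there is none."""
    pairs = zip(dates, is_ice, dates[1:], is_ice[1:])
    for prev_date, prev_ice, cur_date, cur_ice in pairs:
        if prev_ice and not cur_ice:
            return prev_date, cur_date
    return None


def summarize(bracket):
    """Midpoint day-of-year and half-width (days) of a transition bracket."""
    last_ice, first_open = bracket
    gap_days = (first_open - last_ice).days
    midpoint_doy = last_ice.timetuple().tm_yday + gap_days / 2
    return midpoint_doy, gap_days / 2


# Hypothetical acquisitions and labels (assumed values, not observed data).
dates_6day = [date(2021, 6, 1), date(2021, 6, 7), date(2021, 6, 13), date(2021, 6, 19)]
labels_6day = [True, True, False, False]

dates_12day = [date(2021, 6, 1), date(2021, 6, 13), date(2021, 6, 25)]
labels_12day = [True, False, False]

for name, d, s in [("6-day", dates_6day, labels_6day), ("12-day", dates_12day, labels_12day)]:
    midpoint, half_width = summarize(bracket_transition(d, s))
    print(f"{name}: ice-off DOY {midpoint:.1f} +/- {half_width:.1f} days")
```

In this sketch the same underlying ice-off event is bracketed twice as loosely when only every second acquisition is available, so the date uncertainty doubles even if the per-image classification accuracy is unchanged.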
Supplemental Figures: Be sure that for the ice-off and ice-on dates in each of the figure captions you reference what analysis output the data is from. For instance, Figure S5: “Box plots of ice-off dates for phenology-detected lakes each year”. Is this the RF output? The ADD output?
We appreciate the reviewer bringing this point to our attention. We will add text to the figure captions to state explicitly which analysis output is reported in each figure (e.g., S3, S5, S6, S7, S8, S10, S11, and S15).
Citation: https://doi.org/10.5194/egusphere-2026-1135-AC3
AC3: 'Reply on RC3', Alexander Nguyen, 29 May 2026
Data sets
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Model code and software
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Interactive computing environment
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 556 | 203 | 53 | 812 | 148 | 55 | 60 |
General comments:
As a reader, I would prefer if the information and figures on the RF model were not hidden in the supplement, but moved to the main part of the manuscript as it is an important aspect of the paper.
As I understand your classification setup, you define ice-off as the lake being completely ice-free, and you also rely on data from the center of the lake. Would that not lead to an ‘early ice off’ bias in situations where remaining ice drifts to the shore? If so, can you estimate how much uncertainty this introduces?
You refer several times to ‘a RF classifier trained on Sentinel-2 optical labels’, which I think is misleading. Would it be more adequately described as an RF classifier trained using labels created with the help of Sentinel-2 data, or something along those lines?
You stress the relationship between ice-off date and depth of the lake. I am aware of the lack of lake depth data at the required scale; have you considered comparing your results to proxy data sets such as bedfast/floating ice data? It might be an interesting exercise.
I am not very familiar with the ADD approach, but my impression is that the offset between the RF and ADD approaches should not necessarily be interpreted as an indicator of error, but rather as the expected delay between the onset of freezing or thawing and an ice-free or fully ice-covered lake. If I have interpreted this incorrectly, it would be good to clarify the explanations in the paper regarding this issue.
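For readers unfamiliar with the approach, a generic accumulated degree-day scheme can be sketched as follows. This is an illustrative assumption and not the manuscript's ADD model: the function name `add_ice_dates`, the threshold values, and the synthetic temperature series are all hypothetical. The sketch flags ice-off once the accumulated thawing degree-days pass a threshold, then flags ice-on once the accumulated freezing degree-days pass a second threshold.

```python
def add_ice_dates(daily_temp_c, thaw_threshold, freeze_threshold):
    """Estimate (ice_off_doy, ice_on_doy) from daily mean air temperature.

    daily_temp_c: temperatures in deg C, one per day, starting at day-of-year 1.
    Thresholds are in degree-days; the values used below are hypothetical.
    Either returned value is None if its threshold is never reached.
    """
    ice_off = None
    ice_on = None
    thaw_sum = 0.0    # accumulated thawing degree-days (temperatures above 0)
    freeze_sum = 0.0  # accumulated freezing degree-days (temperatures below 0)
    for doy, temp in enumerate(daily_temp_c, start=1):
        if ice_off is None:
            thaw_sum += max(temp, 0.0)
            if thaw_sum >= thaw_threshold:
                ice_off = doy
        elif ice_on is None:
            freeze_sum += max(-temp, 0.0)
            if freeze_sum >= freeze_threshold:
                ice_on = doy
    return ice_off, ice_on


# Synthetic year (assumed values): cold winter, short summer, return to frost.
temps = [-10.0] * 150 + [5.0] * 120 + [-5.0] * 95
ice_off_doy, ice_on_doy = add_ice_dates(temps, thaw_threshold=50.0, freeze_threshold=25.0)
print(ice_off_doy, ice_on_doy)
```

Because a scheme of this kind responds to air temperature alone, its dates depend directly on the chosen thresholds, which is one reason an offset from an image-based classification of fully ice-free or fully ice-covered conditions need not indicate an error in either method.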
A workflow diagram would help the reader to better follow the work.
Specific comments:
Section 3.3 is missing
Section 3.4: this is an important aspect of the paper and also features heavily in the discussion section, however the corresponding figures are in the supplement. I think they should be moved. It is difficult to jump back and forth between documents.
Figure 5 is unnecessarily huge
Line 118: I do not understand what makes this scaling necessary
Line 144: you say you trained the classifier on backscatter from the interior. Are you using mean values? If not, you would have many more pixel values for bigger lakes. If you trained it on the interior values, I assume you are also applying it to the interior values only and cannot account for gradual ice development or melt.
Line 324: delete ‘y’
Line 325: also give the time for the ascending track for this to be meaningful
Line 329: the snow depths in this region are usually limited, it should not be a big source of uncertainty in your chosen study region.