the Creative Commons Attribution 4.0 License.
the Creative Commons Attribution 4.0 License.
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope
Abstract. Thermokarst lakes are critical components of Arctic carbon cycling, yet their ice phenology, which directly impacts total carbon flux, remains poorly characterized at regional scales. We present the first comprehensive analysis of ice-on and ice-off timing across 30,862 lakes on the Alaska North Slope using Sentinel-1 synthetic aperture radar (SAR) classified by a Random Forest (RF) model trained on Sentinel-2 optical imagery and ERA5 temperature data for the period 2019–2023. Our RF classifier achieved 94 % accuracy for ice state detection, enabling phenology retrieval for 97 % of lakes. Results revealed a mean ice-free period of 115 days (standard deviation = 24 days), with ice-off occurring at day-of-year 163 (June 12) and ice-on at day-of-year 278 (October 5). Spatial analysis demonstrated strong latitudinal control on ice phenology, with ice-free duration decreasing by 30 days per degree northward. Lake morphology (area, circularity, convexity, and shoreline development index) showed modest but significant effects on ice timing after controlling for latitude effects, with shoreline development index and convexity each contributing ∼three days variation across typical lake ranges. Comparison of the RF model and simplistic accumulated degree-day (ADD) model-detected ice phenology yielded a convincing match, where the offsets in ice phenology between the models fell within two Sentinel-1 repeats for approximately 60 % of the lakes. Furthermore, these offsets exhibited the same strong latitudinal control and negligible effects of lake morphology. These lake-specific phenology dates provide timing and duration constraints for future methane studies using high-resolution sensors and provide baseline phenology data essential for understanding how continued Arctic warming will affect thermokarst lake dynamics and associated carbon cycle feedbacks.
- Preprint
(1196 KB) - Metadata XML
-
Supplement
(2206 KB) - BibTeX
- EndNote
Status: final response (author comments only)
- RC1: 'Comment on egusphere-2026-1135', Anonymous Referee #1, 13 Apr 2026
-
RC2: 'Comment on egusphere-2026-1135', Anonymous Referee #2, 13 Apr 2026
This study presents an interesting and somewhat complex analysis of ice phenology for a very large number of lakes on Alaska’s North Slope over a relatively short 5-year period. Much of the motivation for this study appears to be in terms of carbon balance and methane emissions from these lakes, as well as testing new methods for quantifying ice cover timing across a vast lake-rich area. A fair bit of emphasis is placed on well documented elevational and latitudinal climate gradients on the North Slope and minor variability related to some attributes of lake morphometry (area and several indices of lake shape).
Here’s relatively long list of general concerns:
1 – All lakes on the North Slope are not of thermokarst origin. The majority of lakes in the coastal plain are of thermokarst origin, but many of these lakes are also formed by fluvial and aeolian processes. Into the foothills and Brooks Range there is much more variation including glacial processes. Thus, saying all of these lakes are thermokarst isn’t correct (as in the title) and I would suggest just calling them lakes with a paragraph explaining their various origins.
2 – The link between ice phenology and lake carbon emissions is weak to non-existent, while there are other ecological, landscape, and climatic processes that are more important relative to lake ice phenology in the Arctic. It’s possible that a longer open-water season causes higher productivity, warmer bed sediments, and deeper thaw into sublake permafrost that could lead to higher GHG emissions, but I don’t believe this has been well documented. Otherwise the timing of lake ice cover can help explain when methane and other GHGs are released, but doesn’t affect the total flux as implied in the opening sentence of the abstract.
3 – Ice cover formation and decay can occur over several weeks or more on arctic lakes, yet little information is provided about what moment this method(s) is(are) detecting. This is important for applying SAR backscatter because the signal will vary widely over these conditions and it is also quite difficult to differentiate varying open-water conditions (still vs. rough water) from this range of evolving ice surface conditions. The use of optical Sentinel 2 imagery and reanalysis data probably helps with some of this, but honestly I had a very hard time following these methods and the supplemental materials didn’t help much. It seems like a confusing cross-validation approach, where actual observations from sensors, cameras, or higher temporal resolution imagery (like MODIS or Planet) compared to these methods would serve as much more convincing validation even if it was only for a few lakes.
4 – Larger lakes generally have longer ice cover than smaller lakes in this region. The largest lake, Teshekpuk, is a good example and usually has the longest ice cover duration on the North Slope. The notion that larger lakes are consistently deeper doesn’t really apply well here, where there are lots of very large thermokarst lakes that are less than 2-m deep and many smaller dune-trough and glacial lakes that are over 20-m deep. Teshekpuk Lake has an average depth of only about 4 m.
5 – Besides the mountain to coastal temperature and snow gradient, lake depth is generally the most important control on ice-out timing, particularly in coastal plain thermokarst lakes that have bedfast ice (shallow) and floating ice (deeper) (see Sellmann et al. 1975). There is region-wide data on these conditions (see Grunblatt and Atwood 2014).
A few specific points:
L51: I’m not aware of any lake ice phenology estimates based on ice thickness.
L59-61: The relationship between lake depth and ice-out timing is specifically shown for thermokarst lakes by Arp et al 2015, which is referenced elsewhere in this manuscript.
Figure 2: Suggest putting dates rather than days of the year here.
L321-324: These confounding factors described here have a profound impact on using SAR to detect ice and open water and need to be addressed in the methods.
Citation: https://doi.org/10.5194/egusphere-2026-1135-RC2 -
RC3: 'Comment on egusphere-2026-1135', Anonymous Referee #3, 16 Apr 2026
Ngyuyen et al
General Comments: The authors present ice phenology retrievals for over 30 000 lakes on the Alaskan Coastal Plain from the period of January 2019 to December 2023 using Sentinel-1 and Sentinel-2 to train and run a random forest machine learning model, which is validated using an Accumulated Degree Day (ADD) equation forced using ERA-5 data. The authors present clear, concise objectives that are statistically tested, with reasonable conclusions based on the assumptions of the input data. There are some issues with the Methodology, where statistical tests are presented in the results section but not described in the Methods.
The main criticism which I raise with regards to the methodology is the question of ground truth or similar data. This study effectively produces an image stack of several Sentinel-1 polarimetric parameters (VV, VH, VV/VH) and retrieves ice phenology based on Sentinel-2 derived image labels (classifying as open water or ice from the Normalized Difference Snow Index (NDSI)). The resulting ice phenology retrieved from Sentinel-1 is then statistically compared against ADDT and ADDF, which are forced only with air temperature (if another variable is used, it is not described in an equation). There is no in-situ, or thermodynamic model that is used as validation in this study. The issue with using the thresholded NDSI values to discriminate open water and ice is that the NDSI does not perform well in the fringe seasons that the authors are looking to retrieve. Prior to snowfall, lakes that freeze with bare ice, or wind swept ice can be classified as open water. Concurrently, during the melt season, ice that has had the snowcover melt, exposing the clear ice beneath can be classified as open water, which can cause issues with the NDSI-derived ice flags. The authors need to show the reader with Sentinel-2 imagery the NDSI classification of freeze or melt to show its efficacy, but also quantify the accuracy of the ice labels derived from the NDSI. The other metric that is being used as validation is the ADDT and the ADDF, both of which are reliant on the air temperature at a 30km resolution. Firstly, there are higher resolution gridded air temperature reanalysis products (NLDAS is at 12.5km, and the NClimGrid is 5km) that cut reduce the variability in comparisons. Secondly, the ADDT and ADDF do not account for the thermal mixing that lakes undergo prior to freeze-up, introducing considerable error. I understand that with the sheer number of lakes in this analysis that a fully parameterized thermodynamic lake model may not be appropriate, but the authors should a) use reanalysis air temperature with a higher spatial resolution, and b) show that air temperature agrees with in-situ weather stations (e.g. Utqiagvik, Umiat, Inigok, Toolik Lake). This would better constrain the error associated with the ADDT/ADDF.
The authors present quite a large degree of variability in ice free period based on latitudinal gradient, 30 days per degree. There can be quite a bit of variance when modeling ice phenology using in-situ weather station data, including all the lakes within a 30km pixel to have one freeze-up and one melt date reduces this variability. This is likely why there is such a large spread shown in Figure 3. Additionally, based on the spatial clustering of high residuals (upwards of 60 days) in the lakes in and surrounding the Brooks mountain range, those lakes should not be included in the summary of ice free days per degree. The pattern of 30 days per degree is then discussed in Section 4.2. that it is six times that of the broader range of the Canadian Cordillera at 6 days per degree, “consistent with the heightened climate sensitivity of lakes near the northern limits of the seasonal ice zone”. Then the authors state “this sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northern most lakes”. It appears that the baked-in error associated with large deviations in the southern region of the study site has influenced the 30 day per degree ice free days, and sweeping statements like this should omitted.
The manuscript requires revision with respect to the validation (Sentinel-2, ADD) of ice covered regions both in data analysis and text to make the reader believe that the patterns being presented a function of ice dynamics in-situ.
Specific Comments:
Page 2 Line 57: “the degree to which thermokarst lake morphology may influence ice phenology relative to estimates based on ADD metrics remains largely unknown”
I don’t see how you could test how lake morphology affects ADD metrics. There are no inputs to the ADD that would be affected by morphometry. Other lake ice models have a mixing depth or bottom associated with them.
Page 3 Lines 93-94: “we were able to retrieve phenology from 30,862 lakes (99.2%)…
Why not all lakes in the dataset? What was the factor that inhibited the other 0.8% from being obtained?
Page 4 Line 112: “the period January 2019 through December 2023”
Why this time scale? Sentinel-1 and 2 have been in orbit since 2015, and after 2023.
Page 5 Line 119: “-20 to -5 dB”
For this location and others where you refer to the standardization of the symbologies – what justification is there for the max and min and the association made to the different scattering mechanisms associated with them? There needs to be a justification.
Page 5 Line 128: “we classified NDSI values > 0.4 as snow/ice”
Does this mean that the input was then classified into a binary raster?
Page 7 Line 194: This paragraph describes the ADD model
Some equations are needed here to describe the ADD. Also, some justification here for using the ADDT and ADDF instead of a thermodynamic model, what the drawbacks are and the benefit of using it for such a large dataset. Where do you expect the air temperature to exhibit the highest error (e.g. mountainous regions).
Page 7 Line 197, 198: “ranging from DOY 91 to DOY 242”…”DOY 244 to DOY 361”
These are the endpoints of the DOY that the model has constrained as possible days for open water and freeze-up. The output having dates at the first or last potential should warrant a revision in the model to see why those dates are being flagged.
Page 8 Line 210: “After loss of Sentinel 1-B”
This should be discussed in the data section – how did it affect your analysis? When was Sentinel 1-B lost?
Figure 3: Figure 3 and in text does a good job at identifying that outside of observing a lake within a single Sentinel-1 revisit that the error increases substantially. There should also be a figure that shows the spatial pattern of comparison to the ADDT and ADDF estimates. Is this pattern consistent year to year, is the pattern consistent between freeze and melt? If consistent spatial patterns exist latitudinally, that may also drive the very large 30 day per degree reduction in ice free season. Figure S10 appears to show the spatial distribution of the residuals, showing that the largest residuals are in the southern most regions in the Brooks mountain range.
Page 9 Lines 225-226: “we implemented a series of statistical analyses including …”
These statistical tests should not be introduced in the Results section. They should be in a validation or statistical tests subheading in the methods section.
Figure 5: The X and Y axis need to be flipped. The independent variable is latitude, and the dependent variable is the ice-free duration.
Page 15 Line 324: “ice phenology y difficulty”
Typo
Page 15 Lines 327 – 329: “Likewise, the use of ADD to model thermal diffusion through the ice cover…”
You should also mention that the ADD does not account for the thermal mixing that is required of lakes prior to initial skim ice formation. Larger/deeper lakes will take longer to mix and delay ice cover establishment.
Page 15 Line 335: “models in figure 6”
Should be “Fig. 6”.
Page 16 Lines 348-350 and Lines 355 – 358: There are two statements that are overly generalized here:
“This sensitivity implies that continued Arctic warming could produce disproportionately large changes in ice phenology for these northernmost lakes, with consequently disproportionate effects on future green-house gas emissions (Walter Anthony et al., 2018).”
“This indicates that the RF model detects a longer ice-free period for lower latitude lakes and a shorter ice-free period for higher latitudes relative to the ADD estimates, suggesting that the ADD model is more conservative in predicting ice-free duration at lower latitudes whereas at higher latitudes, the RF model is more conservative”
The lakes in the lower latitudes are surrounding the Brooks mountain range, at higher elevations with variation in weather conditions. While this is acknowledged in the paper, the extrapolation that the 30 day per degree would potentially result in considerably larger greenhouse gas emissions does not take into account the different in elevation, situation, weather patterns, precipitation, etc. Latitude alone does not influence ice cover duration. This would be a good circumstance to test to see if the ERA5 air temperature agrees with the Toolik weather station located at the foothills of the Brooks range.
Page 17 Lines 387-388: “these larger and deeper lakes remain ice-free longer because they begin freezing later in the year likely due to their greater thermal inertia”.
And the longer period associated with thermal mixing.
The authors mention the loss of Sentinel 1B – the density of measurements (i.e. repeat passes) must have been reduced as a result. I think that this should be mentioned and identify if it impacts the accuracy of the RF model before and after losing Sentinel 1-B.
Supplemental Figures: Be sure that for the ice-off and ice-on dates in each of the figure captions you reference what analysis output the data is from. For instance, Figure S5: “Box plots of ice-off dates for phenology-detected lakes each year”. Is this the RF output? The ADD output?
Citation: https://doi.org/10.5194/egusphere-2026-1135-RC3
Data sets
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Model code and software
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Interactive computing environment
Multi-sensor satellite analysis reveals latitudinal and morphometric controls on ice phenology across 31,000 thermokarst lakes on the Alaska North Slope Nguyen, Lopez, Melara-Valle, Ross, Bradley, Limbeck, Masteller, and Michaelides https://zenodo.org/records/18799073
Viewed
| HTML | XML | Total | Supplement | BibTeX | EndNote | |
|---|---|---|---|---|---|---|
| 189 | 63 | 16 | 268 | 50 | 9 | 12 |
- HTML: 189
- PDF: 63
- XML: 16
- Total: 268
- Supplement: 50
- BibTeX: 9
- EndNote: 12
Viewed (geographical distribution)
| Country | # | Views | % |
|---|
| Total: | 0 |
| HTML: | 0 |
| PDF: | 0 |
| XML: | 0 |
- 1
General comments:
As a reader, I would prefer if the information and figures on the RF model were not hidden in the supplement, but moved to the main part of the manuscript as it is an important aspect of the paper.
As I understand your classification set up, you are defining ice off as being completely ice free, you also rely on data from the center of the lake, would that not lead to an ‘early ice off’ bias in situations where remaining ice drifts to the shore? If so, can you estimate how much of an uncertainty this introduces?
You refer several times to ‘a RF classifier trained on Sentinel-2 optical labels’ which I think is misleading. Would it be more adequately described as a RF classifier trained using labels created with the help of Sentinel-2 data? Or something in that direction.
You stress the relationship between ice off date and depth of the lake. I am aware of the lack of data on lake depth on the required scale, have you considered comparing your results to proxy data sets such as bedfast/floating ice data? It might be an interesting exercise.
I am not very familiar with the ADD approach but my impression would be not to interpret the offset between the RF and ADD approaches necessarily as indicators for errors but rather as expected delays between freezing and thawing onset and an ice free or fully ice covered lake. If I interpreted this incorrectly it would be good to clarify your explanations in the paper regarding this issue.
A workflow diagram would help the reader to better follow the work.
Specific comments:
Section 3.3 is missing
Section 3.4: this is an important aspect of the paper and also features heavily in the discussion section, however the corresponding figures are in the supplement. I think they should be moved. It is difficult to jump back and forth between documents.
Figure 5 is unnecessarily huge
Line 118: I do not understand what makes this scaling necessary
144: you say you trained the classifier on backscatter from the interior. Are you using mean values? If not you would have much more pixel values for bigger lakes. If you trained it on the interior values, I assume you are also applying it to the interior values only and can not account for gradual ice development or melt.
Line 324: delete ‘y’
325: also give the time for the ascending track for this to be meaningful
329: the snow depths in this region are usually limited, it should not be a big source of uncertainty in your chosen study region.