the Creative Commons Attribution 4.0 License.
Determining TTOP model parameter importance and overall performance across northern Canada
Abstract. Modelling current permafrost distribution and response to a warming climate depends on understanding which factors most strongly control ground temperatures. The Temperature at the Top of Permafrost (TTOP) model provides a simple, widely used framework for estimating permafrost presence and thermal state, yet its sensitivity to key parameters remains poorly quantified across diverse northern environments. This study evaluates the relative influence of TTOP model parameters using ground and air temperature data from 330 sites across northern Canada. A leave-one-out cross-validation approach combined with random forest analysis was used to assess both model sensitivity and variable importance. Results show that TTOP performance is dominated by freezing-season conditions, particularly the freezing n-factor and freezing degree days, while thaw-season parameters exert less control. Sensitivity patterns vary by region, with thawing parameters becoming more influential where the duration of the freezing and thawing seasons is similar. Machine-learning results highlight the additional importance of thermal offset and mean surface temperatures, emphasizing the importance of substrate properties. While the model generally reproduces observed ground temperatures well, parameters derived from landcover classes were not transferable between sites, underscoring the importance of locally calibrated inputs. Overall, this study clarifies how different climatic and environmental factors shape the accuracy of permafrost temperature modelling and provides practical guidance for improving parameterization in regional and global permafrost models.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2025-4478', Anonymous Referee #1, 24 Nov 2025
CC1: 'Reply on RC1', Philip Bonnaventure, 25 Nov 2025
Thank you for your review and constructive comments; they are much appreciated. We look forward to continuing to improve the paper towards publication.
Citation: https://doi.org/10.5194/egusphere-2025-4478-CC1
AC2: 'Reply on RC1', Madeleine Garibaldi, 19 Feb 2026
General comment to both reviewers
We thank the reviewers for their constructive suggestions and comments. We feel they will improve the clarity and strengthen the results of this study. As most of the recommendations are minor, we agree and will implement these changes in the next version of the manuscript with a few of our own suggestions.
Reviewer specific comments
We thank the reviewer for their insightful feedback and constructive comments. Our responses to the specific comments continue below. The original comment is in bold and our responses follow.
It would help to clarify whether leave-one-out cross-validation applies to the random forest or the sensitivity analysis.
The leave-one-out cross-validation applies to the sensitivity analysis. We will clarify this in the abstract.
Including a performance metric (for instance the RMSE reported later) could strengthen the abstract.
We will include the RMSE in the abstract as suggested.
“Changing climate” may be a more neutral term than “warming climate.”
We will change this to “changing climate” as suggested.
- The final sentence could more clearly underline the novelty of the pan-Canadian empirical assessment.
To highlight the novelty of the dataset and empirical methodology, we will change the final sentence to “Overall, this study is the first to assess and clarify how different climatic and environmental factors shape the accuracy of permafrost temperature modelling using an empirical dataset from across the Canadian Arctic and provides practical guidance for improving parameterization in regional and global permafrost models.”
L68: Minor typo (“A of the primary challenge”).
We will fix the typo as suggested.
L68–75: The knowledge gap could be stated more explicitly, especially regarding the relative importance of nf, nt, and rk across regions.
To make the knowledge gap and the importance of this information clearer, we will change these lines to “Few studies, however, have examined the uncertainties arising from mischaracterization of the TTOP model parameter values or the relative importance of each parameter, which may vary substantially in different permafrost environments (Way & Lewkowicz, 2018).”
L89–105: The random forest paragraph is somewhat general; a shorter description tied directly to the study aims might improve flow.
We will remove the italicized section of the paragraph to focus more on the use of random forest variable importance rankings in permafrost research.
Random forest is a supervised machine learning technique, which combines randomized decision trees with bagging, and aggregates their predictions through averaging or majority vote (Breiman 2001; Biau & Scornet, 2016). Random forest has been used in studies of air quality (Yu et al., 2016; Pendergrass et al., 2022), chemoinformatics (Mitchell, 2014), ecology (Cutler et al., 2007; Brieuc et al., 2018) and remote sensing (Belgiu & Drăguţ, 2016). Recently, random forest has been used in spatial mapping of permafrost presence using environmental predictors (topography, rock glaciers, vegetation, and land surface characteristics) in a variety of environments (Pastick et al., 2015; Deluigi et al., 2017; Baral & Haq, 2020). Random forest also provides variable importance rankings which can be used to either identify important variables for explanatory or interpolation purposes or to identify a small number of variables that provide a good prediction (Díaz-Uriarte & Alvarez de Andrés, 2006; Grömping, 2009; Genuer et al., 2010).
L106–111: The objectives could be phrased to highlight the combined use of sensitivity analysis and machine-learning-derived importance.
We will combine the first and second objectives to the highlighted text. The objectives of this study are: (1) to use both a sensitivity analysis and machine learning to evaluate TTOP model parameter importance to incremental changes in parameter value and (2) to assess the accuracy of the TTOP model using measured parameters across the permafrost regions of Canada.
L205: A brief rationale for using percentile substitution would clarify this choice.
We chose to use percentile substitution as it allowed us to increase and decrease parameter values while avoiding erroneous negative values. This method also kept the parameter change within the measured distribution, therefore keeping the values more realistic than a plus/minus change in magnitude. Additionally, percentile substitution allows for a direct comparison of sensitivity to each parameter across regions, as the parameters at all sites were changed to the same value. The main disadvantage of this sensitivity testing method is that the percentile values depend directly on the range and spread of the data. As a result, a direct comparison of parameter sensitivity may not be possible, because the step change was not the same for the individual observations. The corresponding difference between the perturbed and reference TTOP may therefore be higher for one parameter due to a larger change in value rather than an increased model sensitivity. However, this is likely not the case for this analysis, as each TTOP model parameter had a large range of values since sites from a wide range of climatological and environmental conditions were sampled.
We will include some of this rationale in the manuscript text.
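For readers unfamiliar with the approach, percentile substitution can be sketched as follows. This is a minimal illustration only, not the authors' code: it assumes the standard two-branch TTOP formulation (one expression when the result is at or below 0 °C, another when it is above), an annual period of 365 days, and a linear-interpolation percentile. The function names, dictionary keys, and site values are hypothetical.

```python
from statistics import mean

PERIOD_DAYS = 365.0  # assumed annual period P


def ttop(nt, nf, tdd_air, fdd_air, rk, period=PERIOD_DAYS):
    """TTOP (deg C) from n-factors, air degree days and conductivity ratio rk."""
    frozen = (rk * nt * tdd_air - nf * fdd_air) / period
    if frozen <= 0:
        return frozen  # permafrost branch
    # Seasonal-frost branch; has the same sign as `frozen` because rk > 0.
    return (nt * tdd_air - nf * fdd_air / rk) / period


def percentile(values, q):
    """Linear-interpolation percentile of `values`, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_sensitivity(sites, param, q):
    """Mean absolute TTOP change when `param` is set to its q-th percentile everywhere."""
    substitute = percentile([s[param] for s in sites], q)
    changes = []
    for s in sites:
        perturbed = dict(s, **{param: substitute})  # copy with one value replaced
        changes.append(abs(ttop(**perturbed) - ttop(**s)))
    return mean(changes)


# Hypothetical site records, for illustration only.
sites = [
    {"nt": 1.0, "nf": 0.50, "tdd_air": 1000.0, "fdd_air": 3000.0, "rk": 0.8},
    {"nt": 0.9, "nf": 0.20, "tdd_air": 2000.0, "fdd_air": 1000.0, "rk": 0.5},
    {"nt": 0.8, "nf": 0.35, "tdd_air": 600.0, "fdd_air": 4500.0, "rk": 0.7},
]

for param in ("nf", "nt", "rk"):
    for q in (5, 95):
        print(param, q, round(percentile_sensitivity(sites, param, q), 3))
```

Because every site receives the same substituted value, the size of the step differs from site to site, which is the disadvantage described in the response above.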
L215–225: The target variable used in the RF models is worth specifying.
The target variable was the mean annual ground temperature at the top of permafrost (where present) or the bottom of seasonal frost where near surface permafrost was absent. We will include this in the random forest methods section as recommended.
L226–235: Commenting on RF repeatability and possible predictor correlations would strengthen this section.
We will note that some of the variable importance rankings may change because RF output may not be perfectly repeatable; however, across several random forest runs the most and least important parameters were consistent, even if they were not in exactly the same order each time. We will also include comments on the potential correlations between parameters and how these might affect the results, similar to what we have done in the discussion. We note that the reviewer has placed this comment in the methods section, but we prefer to address it in the discussion, perhaps in the uncertainty section.
L236–241: Listing the performance metrics used (RMSE, bias, R²) and whether they are per site or per site-year would aid transparency.
Currently, we base the performance on RMSE per site and per year (each year at every site was treated as an observation) but can include R² and bias if necessary. We will include a sentence explaining this in the recommended methods section.
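The distinction between pooled site-year metrics and per-site metrics can be illustrated with a short sketch. It is not taken from the manuscript: the record layout and values are hypothetical, bias is taken as modelled minus observed, and R² is computed here as the squared Pearson correlation, which is only one of the possible definitions.

```python
from collections import defaultdict
from math import sqrt


def rmse(obs, pred):
    """Root-mean-square error between observed and modelled values."""
    return sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean error, taken here as modelled minus observed."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of R²)."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical (site, year, observed, modelled) records in deg C.
records = [
    ("A", 2015, -2.0, -1.5),
    ("A", 2016, -1.8, -2.0),
    ("B", 2015, 0.5, 1.0),
    ("B", 2016, 0.2, -0.1),
]

# Pooled: every site-year counts as one observation.
obs = [r[2] for r in records]
pred = [r[3] for r in records]
pooled = {"rmse": rmse(obs, pred), "bias": bias(obs, pred), "r2": r_squared(obs, pred)}

# Per site: group the years belonging to each site before scoring.
by_site = defaultdict(lambda: ([], []))
for site, _year, o, p in records:
    by_site[site][0].append(o)
    by_site[site][1].append(p)
per_site_rmse = {site: rmse(o, p) for site, (o, p) in by_site.items()}

print(pooled)
print(per_site_rmse)
```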
L242–252: Adding mean ± SD to Table 4 would contextualize the percentile ranges.
We will include a column for mean ± standard deviation in Table 4.
L296–307: Clarify whether the importance values are averaged across sites or regions and consider noting the correlation between TO and rk.
The importance values were averaged across sites for the overall importance and then averaged across sites within the regions for the regional importance. We will include correlation values between the variables in this section as recommended, perhaps by highlighting the correlations with the most influence on the rankings and including the rest in a supplemental data section.
L322–331: A simple correlation (e.g., Spearman) between sensitivity and RF rankings could help compare approaches.
We will run a Spearman correlation for the parameter importance rankings between the two methods as suggested.
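A Spearman comparison of the two sets of rankings could look like the sketch below. It is an illustration under stated assumptions, not the authors' analysis: the parameter scores are invented, tied values receive average ranks, and the coefficient is computed as the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical importance scores for the same parameters from the two methods.
params = ["nf", "FDD", "nt", "TDD", "rk"]
sensitivity = [1.2, 0.9, 0.4, 0.3, 0.1]
rf_importance = [0.35, 0.30, 0.10, 0.15, 0.10]

print(round(spearman(sensitivity, rf_importance), 3))
```

With only a handful of parameters the coefficient rests on very few points, so it is better read as a descriptive summary than as a significance test.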
L332–341: The validation could be complemented with R² or NSE, and a brief note on the permafrost classification criterion.
We will include R² and bias in this section. We classified permafrost as present where the model predicted annual mean ground temperatures below 0 °C at the top of permafrost, and as absent (or where near-surface permafrost has degraded) where the model predicted temperatures above 0 °C. We will clarify this in the text.
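The classification criterion amounts to a sign test on the temperature, which the following sketch makes explicit. The function names and temperatures are hypothetical, and treating exactly 0 °C as non-permafrost is an assumption, since the response only describes values below and above 0 °C.

```python
def has_permafrost(temp_c):
    """Permafrost is taken as present when the temperature is below 0 deg C.

    Exactly 0 deg C is treated as non-permafrost here (an assumption).
    """
    return temp_c < 0.0


def classification_agreement(observed, modelled):
    """Fraction of cases where observed and modelled classes match."""
    matches = sum(
        has_permafrost(o) == has_permafrost(m) for o, m in zip(observed, modelled)
    )
    return matches / len(observed)


# Hypothetical observed and modelled temperatures (deg C).
observed = [-1.0, 0.5, -0.2, 1.0]
modelled = [-0.5, -0.1, -0.3, 2.0]
print(classification_agreement(observed, modelled))
```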
L351–366: A short explanation of why nf dominates (snow–air coupling) could enhance clarity.
We will add these sentences to the discussion to address this comment. “Additionally, nf represents the impact of freezing season air temperature and snow depth, which is a highly important influence on the ground thermal regime and therefore permafrost occurrence in the discontinuous zone (Riseborough & Smith, 1998; Smith & Riseborough, 2002). Therefore, it is unsurprising it was consistently ranked as an important parameter.”
L387–406: A bit more context on why TO and rk rank highly in RF, including their correlation, would aid interpretation.
We believe that the importance of these parameters in the RF with all parameters is skewed by the highly correlated nature of the other parameters, since rk is not important in the RF that uses only TTOP model parameters. In the random forest, rk and TO are highly correlated only with each other and not with the other parameters. Their elevated importance may also stem from the smaller number of High Arctic sites compared to the sensitivity analysis, though we think the correlation is the more likely explanation.
As suggested in our response to another comment, we will provide correlation data for all of the parameters in a supplemental data section and can highlight the values in this section as well, which should make this clearer.
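One way to see why TO and rk would be correlated with each other is to write the thermal offset in terms of the TTOP parameters. The short derivation below is a sketch under assumptions not stated in the thread: the permafrost branch of the standard TTOP formulation, TO defined as TTOP minus the mean annual ground surface temperature (MAGST), and the symbols TDD_a, FDD_a (air thawing and freezing degree days) and P (annual period in days).

```latex
\begin{align}
\mathrm{MAGST} &= \frac{n_t\,\mathrm{TDD}_a - n_f\,\mathrm{FDD}_a}{P} \\
\mathrm{TTOP}  &= \frac{r_k\,n_t\,\mathrm{TDD}_a - n_f\,\mathrm{FDD}_a}{P} \\
\mathrm{TO}    &= \mathrm{TTOP} - \mathrm{MAGST}
                = \frac{(r_k - 1)\,n_t\,\mathrm{TDD}_a}{P}
\end{align}
```

Under these assumptions the freezing terms cancel, so TO depends only on rk and the thaw-season surface degree days. For rk < 1 the offset is negative, and it approaches zero as rk approaches 1.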
L437–463: The uncertainty section is strong; providing the correlation matrix and distinguishing sensitivity from statistical importance could add clarity.
As in previous responses we will include a correlation matrix. We will also aim to differentiate between sensitivity and statistical importance using the statistical differences presented in Table 5.
Citation: https://doi.org/10.5194/egusphere-2025-4478-AC2
RC2: 'Comment on egusphere-2025-4478', Anonymous Referee #2, 26 Jan 2026
It was a pleasure to read this manuscript.
The manuscript includes a well-crafted introduction and provides an appropriate overview, incorporating both key studies and recent advancements in the field.
The study's objectives, methods, and results are clearly defined and effectively presented. It is well-written, accompanied by well-designed figures, and easy to follow.
The study highlights that the TTOP model's performance is most sensitive to freezing-season parameters, particularly the freezing n-factor and freezing degree days, and clarifies the relative importance of climatic and environmental factors. It offers practical recommendations for improving model parameterization at different scales and locations. It also emphasizes the significance of in situ data in validating parameters for accurate predictions of permafrost and ground temperatures. With the incorporation of the suggestions mentioned below in a revised version, I recommend it for publication in TC.
General comment:
The manuscript highlights that parameters derived from landcover classes are not transferable between sites, emphasizing the need for locally calibrated inputs. Expanding on how calibration methods can be improved across regions could strengthen the practical applicability of the findings.
Specific comments:
P10, L148-150: It would be useful to include (here or in the discussion) a short description on the sensitivity of the variation in measuring depth (within the 2 to 5 cm range) for calculation of the ground surface temperature variables (MAGST, FDDs, and TDDs) and how it may influence the results.
P10-11: Regarding the different data loggers used, you should include the specific manufacturer name, city, and country of origin etc.
P27, L424-436: The Southern NWT and Southern Yukon-Northern BC regions contained several outliers with significant errors. Expanding the discussion on this result would be beneficial.
P37, L753-L759: This reference includes some misspelled names and an incomplete co-author list. Please verify it against the name list from the original report.
Citation: https://doi.org/10.5194/egusphere-2025-4478-RC2
AC1: 'Reply on RC2', Madeleine Garibaldi, 19 Feb 2026
General comment to both reviewers
We thank the reviewers for their constructive suggestions and comments. We feel they will improve the clarity and strengthen the results of this study. As most of the recommendations are minor, we agree and will implement these changes in the next version of the manuscript with a few of our own suggestions.
Reviewer specific comments
We would like to thank the reviewer for taking the time to read our manuscript and provide helpful and constructive comments. Our responses to the specific comments are found below. The original comment is bold and our response follows.
The manuscript highlights that parameters derived from landcover classes are not transferable between sites, emphasizing the need for locally calibrated inputs. Expanding on how calibration methods can be improved across regions could strengthen the practical applicability of the findings.
The lack of transferability likely stems from differences in snow depth and morphology between landcover classes across Canada, which is why using external nf values always resulted in higher errors than using external nt values (although this could also simply be a product of the relative importance of the parameters). As snow depth is difficult to predict and model, especially at national scales, using landcover classes will likely remain the main way to prescribe empirical parameter values when modelling. Additionally, using landcover as a proxy, even when parameter values came from outside the region, still resulted in a smaller error range than when values from the entire dataset (regardless of landcover type) were used. It therefore still appears to be a viable option, especially when snow conditions are similar, which is likely in locations with similar climate. However, caution should still be used when assuming parameter values are transferable, and when this approach is used the uncertainty should perhaps be stated more clearly, especially at a national scale.
Additionally, it should be noted that there is variability within the landcover classes regionally, which can also lead to errors.
We will add more of this context to the discussion.
P10, L148-150: It would be useful to include (here or in the discussion) a short description on the sensitivity of the variation in measuring depth (within the 2 to 5 cm range) for calculation of the ground surface temperature variables (MAGST, FDDs, and TDDs) and how it may influence the results.
The difference in ground surface sampling depth from 2 to 5 cm likely did not affect the analysis. The average change is -0.003 °C/cm, which is orders of magnitude less than the accuracy and precision of the loggers used. Even over 5 cm, the temperature change would still be only -0.02 °C, which remains below the logger accuracy and precision noted in the manuscript. In a previous paper (Garibaldi et al., 2021), we were also asked this by a reviewer and showed in our response that the temperature change over 3 cm in the Canadian High Arctic had no impact on the resulting analysis (though this was not included in the publication). We will add a sentence in the manuscript stating that the difference in surface sampling depth had no impact on the results.
P10-11: Regarding the different data loggers used, you should include the specific manufacturer name, city, and country of origin etc.
We will include the manufacturer name in the text. In previous publications using similar loggers the city and country typically are not included within the text. As we use several different logger types (with different manufacturers) we feel this information may unnecessarily hurt the flow of the paper. However, we can include this information in the supplemental data if required or perhaps this could be included as a footnote within the paper.
P27, L424-436: The Southern NWT and Southern Yukon-Northern BC regions contained several outliers with significant errors. Expanding the discussion on this result would be beneficial.
The Southern NWT outlier point was one year at a site with relatively warm ground conditions during the freezing season leading to a low nf (0.1) compared to the other 12 years (0.46 on average). However, this warm winter did not impact the ground temperature as the annual mean ground temperature increased slightly during this year but remained comparable to the other years. As a result, the TTOP model produced a larger error for this site during this year. We will add this to the discussion.
The main Southern Yukon-Northern BC outliers resulted from a small calculation mistake. They are no longer outliers and will be corrected in the next version of the figure.
P37, L753-L759: This reference includes some misspelled names and an incomplete co-author list. Please verify it against the name list from the original report.
Thank you for pointing out the omission of one of the lead authors and this will be corrected. For these reports normally only the lead author team is listed in the citation (contributing authors are not included as indicated in the suggested citation provided in the report). The correct citation to be included in the revised reference list is:
Romanovsky, V., Isaksen, K., Drozdov, D., Anisimov, O., Instanes, A., Leibman, M., McGuire, A.D., Shiklomanov, N., Smith, S., & Walker, D. Chapter 4, Changing permafrost and its impacts. In Snow, Water, Ice and Permafrost in the Arctic (SWIPA) 2017. Arctic Monitoring and Assessment Program (AMAP) Oslo, Norway. 65-102. ISBN 978-82-7971-101-8, 2017 https://www.amap.no/documents/doc/snow-water-ice-and-permafrost-in-the-arctic-swipa-2017/1610
Citation: https://doi.org/10.5194/egusphere-2025-4478-AC1
General Comment
This manuscript presents a rigorous and data-rich evaluation of the Temperature at the Top of Permafrost (TTOP) model, combining a deterministic sensitivity analysis with a random forest variable-importance approach. Using air and ground temperature data from 330 sites across northern Canada, the authors assess how key TTOP parameters (n-factors, degree days, and the thermal conductivity ratio) influence model performance and transferability. The integration of empirical sensitivity testing with machine-learning diagnostics represents a methodological advance in permafrost modelling using the TTOP approach.
The paper is well structured, methodologically sound, and clearly written. The combination of analytical and data-driven approaches is innovative. The study provides valuable empirical evidence on parameter importance and offers practical recommendations for improving permafrost model parameterization. Overall, this is a high-quality contribution deserving publication after a few corrections based on my comments below.
Specific comments
Abstract
- It would help to clarify whether leave-one-out cross-validation applies to the random forest or the sensitivity analysis.
- Including a performance metric (for instance the RMSE reported later) could strengthen the abstract.
- “Changing climate” may be a more neutral term than “warming climate.”
- The final sentence could more clearly underline the novelty of the pan-Canadian empirical assessment.
Introduction
L68: Minor typo (“A of the primary challenge”).
L68–75: The knowledge gap could be stated more explicitly, especially regarding the relative importance of nf, nt, and rk across regions.
L89–105: The random forest paragraph is somewhat general; a shorter description tied directly to the study aims might improve flow.
L106–111: The objectives could be phrased to highlight the combined use of sensitivity analysis and machine-learning-derived importance.
Methods
L205: A brief rationale for using percentile substitution would clarify this choice.
L215–225: The target variable used in the RF models is worth specifying.
L226–235: Commenting on RF repeatability and possible predictor correlations would strengthen this section.
L236–241: Listing the performance metrics used (RMSE, bias, R²) and whether they are per site or per site-year would aid transparency.
Results
L242–252: Adding mean ± SD to Table 4 would contextualize the percentile ranges.
L296–307: Clarify whether the importance values are averaged across sites or regions, and consider noting the correlation between TO and rk.
L322–331: A simple correlation (e.g., Spearman) between sensitivity and RF rankings could help compare approaches.
L332–341: The validation could be complemented with R² or NSE, and a brief note on the permafrost classification criterion.
Discussion and Conclusions
L351–366: A short explanation of why nf dominates (snow–air coupling) could enhance clarity.
L387–406: A bit more context on why TO and rk rank highly in RF, including their correlation, would aid interpretation.
L437–463: The uncertainty section is strong; providing the correlation matrix and distinguishing sensitivity from statistical importance could add clarity.