the Creative Commons Attribution 4.0 License.
A physics-informed machine learning (PIML) framework for projecting 21st-century permafrost extent in Northeast China
Abstract. The degradation of marginal permafrost is a sensitive indicator of climate change, with far-reaching implications for regional ecosystems, hydrology, and infrastructure. Located near the southern limit of latitudinal permafrost (SLLP) in Eastern Asia, Northeast China has experienced pronounced permafrost retreat and persistent ground warming in recent decades. This study develops a physics-informed machine learning (PIML) framework that integrates the Temperature at the Top of Permafrost (TTOP) model, observed changes in land use and land cover (LULC), and climate projections from the Coupled Model Intercomparison Project 6 (CMIP6) to improve the understanding and prediction of permafrost dynamics in the region. Results indicate that, under the SSP5-8.5 scenario, permafrost extent may decline by more than 90 % by the end of the 21st century, primarily driven by a sharp reduction in the air freezing index (AFI), especially in high-latitude and high-elevation zones. Land use and cover changes (LUCC), particularly urban expansion and deforestation, further exacerbate ground thermal disturbances. Spatially, mountainous forested areas, such as the Da Xing’anling Mountains, exhibit relatively greater resilience to warming due to dense vegetation and complex topography that help buffer surface energy fluxes. Feature attribution analysis identifies surface temperature, snow cover duration, and vegetation as dominant drivers of permafrost stability, while Uniform Manifold Approximation and Projection (UMAP) clustering reveals distinct degradation trajectories across different land cover types. This study highlights the complex interplay of climatic and anthropogenic factors in permafrost evolution and demonstrates the utility of integrating physical modelling with machine learning to support ecological conservation and infrastructure risk management in cold-region environments.
Status: open (until 04 Jan 2026)
-
CC1: 'Comment on egusphere-2025-4544', Xianglong Li, 20 Nov 2025
-
AC1: 'Reply on CC1', Shuai Huang, 03 Dec 2025
Comment:
I believe the author has done an excellent job, but I think it would be preferable to compare such predictions with permafrost mapping or field survey results from the observed time period.
Response:
Dear Dr. Li:
Thank you very much for your constructive suggestion. We fully agree that comparing our simulation results with existing permafrost maps and field-based evidence is essential for strengthening the reliability of model outputs. In response to your comment, we have added a comprehensive comparison between our simulated permafrost distribution during 2001–2020 and two recently published Northern Hemisphere permafrost maps (Ran et al., 2022; Obu et al., 2019). The newly added content is presented in the revised manuscript (Lines 343–364) and the comparison is illustrated in the newly added Figure 5. The revised text reads as follows:
L343-364:
In addition, we compared the permafrost distribution simulated by the MLP model in this study during 2001–2020 with the recently published Northern Hemisphere permafrost maps (as shown in Fig. 5). Across the three permafrost maps, we observed a consistent representation of the widespread permafrost distribution in the Da Xing’anling Mountains, with the SLLP located approximately in the Arxan Mountains. However, notable discrepancies occur among studies for the permafrost distribution in the Xiao Xing’anling Mountains, the Hulunbuir Plateau, and the southern mountainous regions (Huanggangliang Mountains and Changbai Mountains). For the Xiao Xing’anling region, our results are more consistent with those of Ran et al. (2022), but differ significantly from Obu et al. (2019). According to Huang et al. (2025), the SLLP in the Xiao Xing’anling Mountains is located approximately between Heihe and Bei’an, which agrees well with our simulation. For the Hulunbuir Plateau, our estimation lies between the results of Ran et al. (2022) and Obu et al. (2019). However, due to the limited availability of field observations in this area, further verification is required. Regarding SLLP characteristics, the simulated permafrost distribution near the southern boundary in this study appears more scattered, reflecting the presence of isolated permafrost patches near the SLLP. This pattern is consistent with the actual conditions. With respect to the permafrost in the southern mountainous regions of Northeast China, our results and those of Ran et al. (2022) and Obu et al. (2019) all indicate the presence of permafrost. However, Obu et al. (2019) suggest a more extensive permafrost area in the Huanggangliang Mountains, whereas both our study and Ran et al. (2022) show a more sporadic distribution. Based on the synthesis by Jin et al. (2025) and field surveys, permafrost in the southern mountainous regions of Northeast China may indeed exist but is difficult to detect; its occurrence is likely controlled by local factors. These findings further support the results of this study.
-
CC3: 'Reply on AC1', Xianglong Li, 03 Dec 2025
I’m very interested in this article, and these additional analyses will greatly enhance the reliability of the predictions. This work is truly meaningful.
Citation: https://doi.org/10.5194/egusphere-2025-4544-CC3
-
CC2: 'Comment on egusphere-2025-4544', Guojie Hu, 01 Dec 2025
This study develops a PIML framework that integrates physically based modeling with machine learning and incorporates dynamic land-use/land-cover changes to simulate and project permafrost evolution in Northeast China. The methodology demonstrates a certain degree of innovation. Since the study area is located near the SLLP in East Asia, its permafrost characteristics are regionally representative, and the findings provide valuable regional applicability and scientific insight. The manuscript is overall well written, but minor improvements can be made regarding clarity of expression as well as figure and text descriptions. The specific comments are as follows:
- The abstract predominantly provides qualitative descriptions. It is recommended to include more quantitative results to enhance informativeness.
- In Section 3.2, it is suggested to add comparisons with existing permafrost maps developed for the same region.
- In Figure 7, please indicate the spatial extents corresponding to the Da Xing’anling Mountains, Xiao Xing’anling Mountains, the northern Song-Nen rivers Plain, and the Hulun Buir Plateau.
- Lines 327–328 and 564–565 contain inaccurate wording, as the predictive accuracies of MLP and CatBoost differ depending on the metric used; thus, it is inappropriate to state that both models simultaneously exhibit the best performance.
- In Lines 564–565, MAE is mentioned without prior reference, which seems to be a typographical error where MSE was mistakenly written as MAE.
- The unit of MSE in the manuscript should be °C² instead of °C.
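The last two points can be illustrated with a minimal Python sketch. It shows that MSE averages squared errors and therefore carries squared units (°C² for ground temperature), whereas MAE and RMSE remain in °C. The function names and the sample values are illustrative assumptions and are not taken from the manuscript.

```python
import math


def mse(observed, predicted):
    """Mean squared error; units are the square of the input units (e.g. °C²)."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error; same units as the inputs (e.g. °C)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error; the square root brings the units back to °C."""
    return math.sqrt(mse(observed, predicted))


# Made-up ground temperatures in °C, for illustration only.
observed = [-1.0, 0.5, 2.0, -0.5]
predicted = [-0.5, 0.5, 1.0, -1.0]

print(f"MSE  = {mse(observed, predicted):.3f} °C²")
print(f"MAE  = {mae(observed, predicted):.3f} °C")
print(f"RMSE = {rmse(observed, predicted):.3f} °C")
```

Because the three metrics weight large errors differently, two models can swap ranks depending on which metric is reported, which is the situation the comment on Lines 327–328 describes.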
-
AC2: 'Reply on CC2', Shuai Huang, 05 Dec 2025
We sincerely thank Dr. Hu for the careful review and support of our manuscript. Our response letter is provided in the attachment; please kindly check it.
-
RC1: 'Comment on egusphere-2025-4544', Anonymous Referee #1, 13 Dec 2025
The combination of dynamic land-use projections (PLUS), CMIP6 forcing, the TTOP model, and a machine-learning enhancement labelled as “physics-informed” offers a promising framework. The writing is clear and the regional focus is valuable. However, several critical methodological details, the treatment of uncertainty, and the quantitative attribution need to be improved. Therefore, I recommend that the manuscript be accepted after major revision. My concerns are listed below:
Major concerns:
- The PIML component is introduced as the central novelty but remains inadequately described. Readers are not told which ML algorithm was used, how physical constraints from TTOP were explicitly embedded in the training process, what the training/target data were, or how performance was assessed. A schematic and the mathematical formulation of the physics-informed part are needed for transparency and reproducibility.
- Projections are presented without quantitative uncertainty. The headline result (>90 % permafrost loss by 2100 under SSP5-8.5) lacks ensemble spread from the 14 CMIP6 models and confidence intervals. At minimum, ensemble mean ±1σ and results for SSP1-2.6 and SSP2-4.5 should be added.
- The role of land-use/land-cover change is emphasised throughout the introduction but not quantified in the results. A simple control experiment (dynamic PLUS scenario versus fixed 2020 land cover) is required to show how much additional permafrost loss is attributable to LUCC.
- Independent validation of the full modelling chain against recent observations is missing. Present-day skill metrics are essential to support confidence in century-scale projections.
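The ensemble statistics and the fixed-versus-dynamic land-cover control experiment requested above could be summarised along the following lines. This is a minimal sketch under stated assumptions: the model names, the numbers, and the helper names `ensemble_summary` and `lucc_contribution` are hypothetical placeholders, not results or code from the manuscript.

```python
from statistics import mean, stdev


def ensemble_summary(values_by_model):
    """Return (ensemble mean, sample standard deviation) across climate models."""
    values = list(values_by_model.values())
    if len(values) < 2:
        raise ValueError("need at least two models to estimate a spread")
    return mean(values), stdev(values)


def lucc_contribution(loss_dynamic_lulc, loss_fixed_lulc):
    """Extra loss (percentage points) when land cover evolves instead of staying fixed."""
    return loss_dynamic_lulc - loss_fixed_lulc


# Placeholder values only (percent permafrost-area loss by 2100); NOT manuscript results.
loss_dynamic = {"model_a": 88.0, "model_b": 92.0, "model_c": 96.0}
loss_fixed = {"model_a": 85.0, "model_b": 90.0, "model_c": 95.0}

mu, sigma = ensemble_summary(loss_dynamic)
extra = {m: lucc_contribution(loss_dynamic[m], loss_fixed[m]) for m in loss_dynamic}
extra_mu, extra_sigma = ensemble_summary(extra)

print(f"loss with dynamic LULC: {mu:.1f} ± {sigma:.1f} %")
print(f"LUCC contribution: {extra_mu:.1f} ± {extra_sigma:.1f} percentage points")
```

Pairing the two runs model by model before averaging keeps the climate forcing identical within each pair, so the difference isolates the land-cover effect.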
Minor comments:
- Please streamline some lengthy sentences and ensure consistent terminology (LULC/LUCC).
- Please briefly describe the content of repeatedly cited supplementary figures and tables when first mentioned.
- Please consider adding at least one lower-emission scenario to increase policy relevance.
- Please clarify or update citations listed as 2024/2025 that are still in review.
Citation: https://doi.org/10.5194/egusphere-2025-4544-RC1
-
RC2: 'Comment on egusphere-2025-4544', Orgogozo Laurent, 17 Dec 2025
This work pioneers the inclusion of land use and land cover changes in simulations of climate change impacts on permafrost. It develops a new workflow that combines CMIP6-based climate projections and cellular-automata-based land use projections to build the input data for a simple, classic permafrost model, the TTOP model; machine learning procedures are then combined with this modelling chain to improve its predictive capability and to help study the uncertainties. The workflow is then applied to the simulation of permafrost evolution in Northeast China under different climate change scenarios. The work is novel and timely, and the results are interesting. However, some information about the modelling procedure is lacking, and some additional discussion is needed regarding the links between the data-driven and process-oriented approaches. I recommend that the manuscript be accepted for publication once the appropriate revisions have been made. My general and specific comments follow.
General comments:
The authors do not provide a finalized PIML model, but rather a family of six PIML models with sometimes significantly different performances. I think that the authors should take responsibility for recommending the procedure to be used in future work that adopts the methodology developed here.
The calibration / validation of the whole procedure should be better explained, and may need some improvement (see my specific comments on l 156-157 below).
I think that the performance of the proposed PIML approach depends greatly on the merits of the process-based model used. This point should be discussed more thoroughly, and it could be the subject of future comparative work.
Finally, the possibilities of applying the developed modelling chain in other permafrost contexts should be better discussed in the conclusions.
Specific comments:
l 87 : Error in the reference ; Tubini et al., 2021
l 90-91 : “Moreover, uncertainties in parameterization and incomplete representation of sub-grid heterogeneity can result in substantial variability in model projections (Groenke et al., 2023; Wang et al., 2024b).” I agree with this statement, but I think the references provided to support it are not really relevant in this paragraph, since they mainly deal with results obtained with empirical or equilibrium models. Recent reviews of physical permafrost modelling could be consulted to strengthen this point (e.g., Yang et al., 2021; Hu et al., 2023).
https://doi.org/10.3389/feart.2021.721009
https://doi.org/10.1016/j.catena.2022.106844
l 93, l 97 : “Luo et al., 2024” : I can’t find this reference ? 10.1016/B978-0-323-85242-5.00013-0
l 140-141 : “14 global climate models (GCMs) were selected from the CMIP6 ensemble”. How was this selection made? If all the CMIP6 models are included, this should be stated; if not, the rationale behind the choice of this specific subset should be provided.
l 103 : Citing a preprint is problematic, a regular reference should be provided for this work. Maybe the one below would be the relevant one? To be checked.
P. Pilyugina et al., "A Physics-Informed Machine Learning Framework for Permafrost Stability Assessment," in IEEE Access, vol. 13, pp. 96423-96433, 2025, doi: 10.1109/ACCESS.2025.3573072.
l 156-157 : Section 2.2 presents a very interesting approach for LUCC projection. I have two questions regarding the calibration / validation process. First, it seems that the 2000 and 2020 slices have been used both for calibrating and for validating the LUCC projection method. It seems to me that calibration and validation should be done on two different pairs of time slices, e.g., 2000-2010 for calibration and 2010-2020 for validation. So why consider only two dates for this calibration / validation process? Second, the approach seems to rely on the assumption that the LUCC dynamics are stationary, so that a projection workflow calibrated on a given period (here 2000-2020) may be used for future projections. How can the strength of this stationarity assumption be evaluated?
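The split-period validation suggested in this comment can be sketched as follows: calibrate on the 2000-2010 transition, simulate 2020 starting from the 2010 map, then score the simulated 2020 map against the observed one. Cohen's kappa is used here as one common agreement measure; whether the manuscript uses it is not stated, so the metric choice, the class labels, and the toy pixel values are all assumptions.

```python
from collections import Counter


def cohen_kappa(observed, simulated):
    """Cohen's kappa between two equally long sequences of class labels."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: share of pixels with identical labels.
    p_o = sum(o == s for o, s in zip(observed, simulated)) / n
    # Chance agreement: product of the marginal class frequencies.
    obs_counts = Counter(observed)
    sim_counts = Counter(simulated)
    p_e = sum(obs_counts[c] * sim_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both maps contain a single identical class
    return (p_o - p_e) / (1.0 - p_e)


# Toy held-out comparison (six pixels); labels are illustrative only.
observed_2020 = ["forest", "forest", "crop", "urban", "crop", "forest"]
simulated_2020 = ["forest", "forest", "crop", "crop", "crop", "forest"]

kappa = cohen_kappa(observed_2020, simulated_2020)
print(f"kappa on the held-out 2010-2020 transition: {kappa:.3f}")
```

Repeating the same score for a model calibrated on 2010-2020 and tested on 2000-2010 would give a rough indication of how stationary the transition rules are.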
l 191 : “The empirical parameters nf, nt and rk serve to account for snow insulation, surface energy exchange, and the ratio of soil thermal conductivity in frozen to thawed states, respectively. The model parameters nf, nt and rk were assigned based on LULC classifications.” It seems to me that these parameters should depend not only on LULC, but also on climate (e.g., mean annual snowfall for snow insulation), topography (e.g., plain vs. mountain for surface energy exchange) and hydrology (e.g., soil moisture for the ratio of soil thermal conductivity in frozen to thawed states). For instance, in all LULC projections there are forested areas at both the southeastern and northwestern limits of the region of interest, where climate, topography and hydrology may differ significantly. Do all these forested areas share the same nf, nt and rk empirical parameters? If so, how can the biases in projected MAGT related to climatic, topographic and hydrological variability be evaluated?
l 231-232 : “We then constructed a training dataset using the TTOP-estimated MAGT as the target variable, allowing the supervised learning models to capture the relationships between environmental predictors and ground temperature.” I understand that the proposed PIML approach aims at taking into account the variability mentioned in my previous comment. However, using TTOP results as target variable, which means the variable that the ML should reproduce based on the other ones I presume (I am not a ML specialist, sorry if I misunderstood this), I cannot see how the shortcomings of TTOP-estimation itself as put forward in my previous comment could be dealt with.
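The concern raised in the two comments above can be made concrete with a minimal sketch of the TTOP equation in the form commonly attributed to Smith and Riseborough. Several things here are assumptions rather than details from the manuscript: rk is taken as the thawed-to-frozen conductivity ratio (the usual convention, which may differ from the manuscript's wording), and the lookup table `PARAMS_BY_LULC` with its values is purely hypothetical.

```python
def ttop(tdd, fdd, nt, nf, rk, period=365.0):
    """Temperature at the top of permafrost (°C).

    tdd, fdd: air thawing / freezing degree-days (°C·day, both given as positive)
    nt, nf:   thawing / freezing n-factors
    rk:       thawed-to-frozen thermal conductivity ratio (assumed convention)
    """
    numerator = rk * nt * tdd - nf * fdd
    if numerator <= 0:
        return numerator / period  # permafrost branch
    # Seasonally frozen ground branch
    return (nt * tdd - nf * fdd / rk) / period


# Hypothetical parameter table keyed by land cover; values are placeholders.
PARAMS_BY_LULC = {
    "forest": {"nt": 0.9, "nf": 0.4, "rk": 0.8},
    "grassland": {"nt": 1.0, "nf": 0.6, "rk": 0.9},
}


def magt_for_cell(lulc, tdd, fdd):
    """TTOP-based ground temperature when parameters depend on land cover only."""
    return ttop(tdd, fdd, **PARAMS_BY_LULC[lulc])


# Two forest cells with identical air forcing receive the same target value,
# whatever their snowfall, slope position, or soil moisture.
north_west_forest = magt_for_cell("forest", tdd=2000.0, fdd=4000.0)
south_east_forest = magt_for_cell("forest", tdd=2000.0, fdd=4000.0)
print(north_west_forest, south_east_forest)
```

A supervised model trained on targets generated this way learns to reproduce this mapping, so any bias introduced by land-cover-only parameters is carried into its predictions unless independent observations enter the training.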
l 284-285, Figure 1 : “The indices represent the arithmetic averages computed from site-level downscaled data at 225 meteorological stations, using the delta downscale method applied across 14 CMIP6 models.” I think that what ‘averages’ means exactly here should be clarified. I guess that the model-specific curves present averages across the 225 sites, while the scenario averages are averages across the 14 model-specific averages (averages of averages). Exactly which averages are presented in Figure 1 should be explained without ambiguity.
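The two averaging levels the reviewer describes can be written out explicitly. The sketch below assumes an additive delta method (observed station baseline plus the GCM-simulated change), which is the usual choice for temperature-like variables; the manuscript's exact procedure is not given here, and the station values and model names are made up.

```python
from statistics import mean


def delta_downscale(obs_baseline, gcm_baseline, gcm_future):
    """Additive delta method: observed baseline plus the GCM-simulated change."""
    return obs_baseline + (gcm_future - gcm_baseline)


# Two stations and two models, for illustration only (values in °C).
station_obs = [-2.0, 0.0]
gcm_runs = {
    "model_a": {"hist": [-3.0, -1.0], "future": [-1.0, 1.0]},
    "model_b": {"hist": [-1.0, 1.0], "future": [2.0, 4.0]},
}

# Level 0: downscale each model at each station.
downscaled = {
    name: [
        delta_downscale(o, h, f)
        for o, h, f in zip(station_obs, run["hist"], run["future"])
    ]
    for name, run in gcm_runs.items()
}

# Level 1: one value per model, averaged across stations (a model-specific curve).
per_model = {name: mean(values) for name, values in downscaled.items()}

# Level 2: one value per scenario, averaged across the per-model means.
scenario_mean = mean(list(per_model.values()))

print(per_model, scenario_mean)
```

When every model is evaluated at the same stations, the average of the per-model averages equals the average over all station-model pairs, so the two readings differ only in which spread (across sites or across models) is reported alongside the mean.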
l 288 : “To further examine the spatial dynamics of freeze–thaw responses, we have conducted a case analysis under the low-emission scenario SSP126 (Fig. 2).” Why this one? This should be explained. In fact I would be interested in the same analysis for the other scenarios, at least for scenario SSP585.
l 300 : Using an absolute variation visualization rather than a negative one for AFI and a positive one for ATI may improve the readability of the figure. I would also say that the figure is too information-rich; I would recommend showing only the 2020-2100 deltas, since the other ones are not commented on in the body of the text. These are only presentation suggestions.
l 323 : “To evaluate the capability of different ML algorithms in simulating TSP, we have compared model-predicted MAGTs with observed values using six ML methods”. What exactly are the observed values here? I guess each blue dot in Figure 4 corresponds to a given site, for a 1961-2020 multi-annual average, right? I would also be interested in the performances analyzed by year, that is, the same graphics but with each dot representing the average of the MAGT across the 225 sites for a given year between 1961 and 2020, in order to visualize the temporal variability of performances as well.
l 335-336 : “The high-performing models are subsequently employed to simulate future MAGT patterns and permafrost extent under projected scenarios.” Please list these high-performing models.
l 351: “Projected mean annual ground temperatures (MAGTs) across Northeast China under four SSPs scenarios based multilayer perceptron model.” What exactly are these projections? Averages of the best-performing PIML models selected in the previous section? Maybe the ‘multilayer perceptron model’ is the answer, but I don’t know what it is, and I think that many readers of TC won’t know either.
l 360 : “the MLP-TTOP model” First occurrence of this acronym/name. It should be introduced earlier in the method section.
l 365 : What is DXAM?
l 366 : What is XXAM?
l 375-376 : “(a) total Northeast China; (b) Xiao Xing'anling Mountains; (c) Da Xing'anling Mountains; (d) northern Songhua-Nen rivers Plain, and; (e) Hulun Buir Plateau” Figure 7 would be much more interesting if it were accompanied by a map showing the location and extent of the four sub-regions of Northeast China considered.
Figure 7 : In 7a, why are there increases in discontinuous permafrost area between 2040 and 2080? In 7d, why is there more permafrost in 2040 in SSP370 than in SSP126 and SSP245?
l 379 – 380 : “Fig. 8 summarizes the relative importance of 15 environmental variables across six ML models.” Why not consider here only the best-performing PIML models as selected in section 3.2?
l 389-391 : “In contrast, topographic and edaphic variables, such as slope angle, slope aspect, soil organic matter contents (SOC), bulk density (BD), and sand and clay contents, generally rank lower, though they may modulate soil thermal properties and hydrological processes as secondary controls.” I think it must be said here that these topographic and edaphic variables exert strong control on the two most influential variables, mean annual land surface temperature (MALST) and land use and land cover (LULC). So most likely a (large?) part of their intrinsic influence is in fact encompassed in the influence of MALST and LULC.
l 423 : “thereby underestimating the scale and impact of anthropogenic disturbances.” LUCC may also derive solely from climate change. Deciphering the scale and impact of anthropogenic disturbances would require separating climate-change-induced land use effects from anthropogenic land use change effects.
l 443 : “a PIML model” In fact several PIML models have been used, and in its present form the study does not make a clear choice about which is the best one that could be preferred to the others. I think that this choice should be made, or this sentence should be reformulated.
l 473-474 : “Rather, such signals reflect the thermal inertia of geocryological system prior to abrupt transitions.” The study convincingly grounds this statement for Northeast China permafrost, but its generalization to other permafrost contexts may not be straightforward. I would soften the wording here to avoid overinterpretation of the results.
l 477 : “4.3 Spatial heterogeneity and resilience of XAP” Please avoid acronyms in titles as much as possible.
l 501-503 : “Hydrogeological conditions further affect thermal stability: long-term drying and surface drainage can increase ground albedo and reduce soil thermal conductivity, thereby cooling the ground.” I would highlight that the effect of forest on soil hydrology is rather complex, since for instance the root network enhances infiltration while at the same time drying the soil through evapotranspiration. Moreover, in permafrost contexts, these competing effects interact in a complex manner with active layer dynamics (e.g., Orgogozo et al., 2019, sorry for the self-citation).
https://doi.org/10.1002/ppp.1995
l 566 : “Compared to traditional physically based models, the PIML framework yields superior performance” This study grounds this statement solely for TTOP model, and this should be made clear in this paragraph.
Citation: https://doi.org/10.5194/egusphere-2025-4544-RC2
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 328 | 121 | 35 | 484 | 45 | 14 | 20 |