Global Attention of Transformer Empowers Montane Periglacial Lake Identification
Abstract. Montane periglacial lakes, as sensitive indicators of cryospheric change, are undergoing rapid expansion under global warming. Investigating their evolving distribution is essential for understanding climate impacts and assessing associated geohazards. The complex topography and heterogeneous landscapes of high-mountain regions pose significant challenges for conventional methods, leading to underdetection of small lakes, elevated false positive rates, and limited ability to discriminate between lake formation types. This study introduces a Vision Transformer (ViT)-based framework for montane periglacial lake identification that employs a two-step process of lake boundary segmentation followed by lake type classification. By leveraging ViT's global attention mechanism, the framework captures long-range spatial and spectral relationships, enhancing contextual understanding of lakes and their surroundings. The ViT-based approach achieved a mean intersection over union (MIoU) of 91.01 % for segmentation and an F1-score of 89.75 % for classification. Compared to CNN-based models, it significantly improved the detection of small lakes (as small as 0.0001 km²), reduced artifacts from shadows, snow, ice, and river fragments, and provided more accurate lake type classification. Applied to the Southeastern Tibetan Plateau Gorge Region, an area with high glacial lake density and outburst flood risk, the framework identified 3,266 lakes (1,708 glacial and 1,558 non-glacial), surpassing existing inventories in completeness and accuracy.
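For readers unfamiliar with the two evaluation metrics reported above, the following is a minimal sketch of how MIoU and the F1-score are commonly computed. It is illustrative only and is not the evaluation code of this study: the function names, the flat label-list inputs, the two-class (lake versus background) setup, and the choice of which class counts as positive are all assumptions, and the averaging scheme used in the paper (per image, per dataset, or macro-averaged over lake types) may differ.

```python
def mean_iou(pred, target, num_classes=2):
    """Mean intersection over union across classes.

    pred, target: equal-length flat sequences of integer class labels,
    e.g. a flattened segmentation mask (assumed: 0 = background, 1 = lake).
    Classes absent from both pred and target are skipped.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # avoid division by zero for absent classes
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


def f1_score(pred, target, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for one positive class.

    Assumed here: the positive class is one lake type (e.g. 1 = glacial).
    """
    tp = sum(1 for p, t in zip(pred, target) if p == positive and t == positive)
    fp = sum(1 for p, t in zip(pred, target) if p == positive and t != positive)
    fn = sum(1 for p, t in zip(pred, target) if p != positive and t == positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example: an 8-pixel mask and four lake-type labels (made-up values).
pred_mask = [1, 1, 0, 0, 0, 0, 0, 0]
true_mask = [1, 0, 0, 0, 0, 0, 0, 0]
print(mean_iou(pred_mask, true_mask))

pred_type = [1, 1, 0, 0]
true_type = [1, 0, 0, 1]
print(f1_score(pred_type, true_type))
```

In the toy mask, the background class has an IoU of 6/7 and the lake class 1/2, so the mean is about 0.68; a single extra predicted pixel on a small lake already halves that lake's IoU, which is why small-lake detection weighs heavily on this metric.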