the Creative Commons Attribution 4.0 License.
AI-Based Tracking of Fast-Moving Alpine Landforms Using High Frequency Monoscopic Time-Lapse Imagery
Abstract. Active rock glaciers and landslides are critical indicators of permafrost dynamics in high mountain environments, reflecting the thermal state of permafrost and responding sensitively to climate change. Traditional monitoring methods, such as Global Navigation Satellite System (GNSS) measurements and permanent installations, face challenges in measuring the rapid movements of these landforms due to environmental constraints and limited spatial coverage. Remote sensing techniques offer improved spatial resolution but often lack the necessary temporal resolution to capture sub-seasonal variations. In this study, we introduce a novel approach utilising monoscopic time-lapse imagery and Artificial Intelligence (AI) for high-temporal-resolution velocity estimation, applied to two subsets of time-lapse datasets capturing a fast-moving landslide and rock glacier at the Grabengufer site (Swiss Alps). Specifically, we employed the Persistent Independent Particle tracking (PIPs++) model for tracking and the AI-based LightGlue matching algorithm to transfer 2D image data into 3D object space and further into 4D velocity data. This methodology was validated against GNSS surveys, demonstrating its capability to provide spatially and temporally detailed velocity information. Our findings highlight the potential of image-driven methodologies to enhance the understanding of dynamic landform processes, revealing spatio-temporal patterns previously unattainable with conventional monitoring techniques. By leveraging existing time-lapse data, our method offers a cost-effective solution for monitoring various geohazards, from rock glaciers to landslides, with implications for enhancing alpine safety and informing climate change impacts on permafrost dynamics. This study marks the pioneering application of AI-based methodologies in environmental monitoring using time-lapse image data, promising advancements in both research and practical applications within geomorphic studies.
Status: final response (author comments only)
RC1: 'Comment on egusphere-2024-2570', Anonymous Referee #1, 01 Oct 2024
This paper presents an interesting approach to monitoring fast-moving landforms, such as landslides and rock glaciers, using monoscopic time-lapse imagery and deep learning algorithms for feature matching and tracking. The authors applied their methodology to a case study at the Grabengufer site in the Swiss Alps, employing two low-cost monoscopic time-lapse cameras with relatively low image quality and significant camera-to-object distances. Utilizing the Persistent Independent Particle tracking (PIPs++) method, they effectively tracked distinctive features, overcoming challenges such as occlusions due to weather and object deformation. By tracking these features over time, they derived pixel displacement vectors illustrating movement in the rock glacier. Additionally, they used synthetic images generated from UAV point clouds with known extrinsic and intrinsic parameters to estimate the camera pose and interior orientation for the first image and scale the displacement vectors. This method, called the image-to-geometry approach, was previously published by the same group but only briefly detailed in this paper.
The code for feature tracking using PIPs++ and a sample dataset are provided as open source to enhance reproducibility. However, the absence of code for the image-to-geometry method is a limitation.
Overall, this paper stands out as one of the first studies to apply state-of-the-art deep learning feature tracking algorithms from computer vision to geomorphology and geoscience. The image-to-geometry approach to georeferencing velocity fields is particularly relevant, presenting a viable alternative to traditional DEM back-projection, provided high-quality RGB point clouds are available.
Despite the paper's strong contributions, several areas could benefit from further clarification to enhance reader understanding and reproducibility in similar future studies. My primary concern is the image-to-geometry approach, which warrants a more thorough description. Given its importance in the proposed approach, I believe that the authors should summarize the main workflow with a step-by-step (synthetic) procedure, as it is really hard for a reader to understand the method without reading all the other papers.
Specifically, I recommend the following:
1. Once features on the real images are matched with those in the synthetic images using LightGlue, how are the 3D coordinates of the points retrieved for space resection? Are you doing a back-projection/ray-tracing to the point cloud/DSM/mesh, or did you store the original 3D coordinates of the points generating each pixel of the synthetic image? A brief explanation here would be valuable.
2. Are the authors using the image-to-geometry method to "scale" or to "georeference" the velocity vectors? Given the availability of full camera exterior and interior orientations, I suppose you are doing more than just “scaling”. Please clarify this aspect.
3. Please add a few comments comparing the image-to-geometry approach with the more traditional DEM back-projection, which is nowadays the most widely used approach to georeference velocity fields from monoscopic cameras.
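To make the second option in comment 1 concrete, the following minimal Python sketch assumes that each synthetic image is rendered together with a per-pixel map of the 3D coordinates that generated it. The names `xyz_map` and `lookup_xyz`, the data layout and all values are hypothetical illustrations, not the authors' implementation. The resulting 2D-3D pairs would be the kind of input a space resection needs.

```python
from typing import List, Optional, Tuple

Point2D = Tuple[float, float]          # (column u, row v) in pixels
Point3D = Tuple[float, float, float]   # (X, Y, Z) in object space


def lookup_xyz(
    xyz_map: List[List[Optional[Point3D]]],
    keypoints: List[Point2D],
) -> List[Tuple[Point2D, Point3D]]:
    """Pair matched keypoints with the 3D point stored for their pixel.

    xyz_map[row][col] holds the 3D coordinate rendered into that pixel of the
    synthetic image, or None where no point was rendered (hypothetical layout).
    Keypoints outside the image or on empty pixels are skipped.
    """
    n_rows = len(xyz_map)
    n_cols = len(xyz_map[0]) if n_rows else 0
    pairs = []
    for (u, v) in keypoints:
        col, row = int(round(u)), int(round(v))  # nearest-pixel lookup
        if 0 <= row < n_rows and 0 <= col < n_cols:
            xyz = xyz_map[row][col]
            if xyz is not None:
                pairs.append(((u, v), xyz))
    return pairs


# Hypothetical 2 x 3 synthetic image with one empty pixel.
example_map = [
    [(10.0, 20.0, 30.0), None, (12.0, 20.0, 31.0)],
    [(10.0, 21.0, 29.0), (11.0, 21.0, 29.5), (12.0, 21.0, 30.5)],
]
example_pairs = lookup_xyz(example_map, [(0.2, 0.1), (1.1, 0.0), (2.4, 1.3)])
```

The alternative raised in the comment, ray-tracing each keypoint onto a mesh or DSM, would replace the table lookup with an intersection test and would not require storing the per-pixel coordinates.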
Additionally:
4. The number of points used to compute the average of the image-based velocity in section 5 (e.g., Figure 6) is significantly different. Did the authors consider all features tracked in the two areas, regardless of their distance from the reference GNSS/theodolite measurement points? It may be more reasonable to average points within a certain radius to account for spatial variability in the phenomena.
5. Did the authors assess the pixel-level noise of the PIPs++ tracking in stable areas? This assessment is similar to what is stated in the caption of Figure 5 for the “scaled” vectors, and it would help distinguish the noise from tracking from that introduced by vector scaling. Additionally, I suggest moving this information from the figure caption to the main body of the paper, as it is relevant.
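Comments 4 and 5 can be illustrated with a short Python sketch. It assumes tracked points are available as planimetric coordinates with a velocity, and that pixel displacements of tracks in stable terrain are available separately. The function names, the radius and the nearest-rank percentile are illustrative choices, not the procedure used in the manuscript.

```python
import math
from typing import List, Optional, Tuple


def mean_velocity_within_radius(
    tracks: List[Tuple[float, float, float]],  # (x, y, velocity) per tracked point
    ref_xy: Tuple[float, float],               # position of the in situ reference point
    radius_m: float,
) -> Tuple[Optional[float], int]:
    """Average velocity of tracked points within radius_m of a reference point.

    Returns (mean velocity, number of points used); the mean is None if no
    tracked point falls inside the radius.
    """
    selected = [
        v for (x, y, v) in tracks
        if math.hypot(x - ref_xy[0], y - ref_xy[1]) <= radius_m
    ]
    if not selected:
        return None, 0
    return sum(selected) / len(selected), len(selected)


def stable_area_noise_px(
    displacements_px: List[Tuple[float, float]],  # (du, dv) of tracks on stable terrain
    percentile: float = 95.0,
) -> float:
    """Nearest-rank percentile of displacement magnitudes in stable areas (pixels)."""
    if not displacements_px:
        raise ValueError("no stable-area displacements given")
    magnitudes = sorted(math.hypot(du, dv) for (du, dv) in displacements_px)
    rank = max(1, math.ceil(percentile * len(magnitudes) / 100.0))
    return magnitudes[rank - 1]
```

Reporting the number of points alongside each local mean makes the differing sample sizes visible, and computing the noise level in pixels before any scaling separates tracking noise from the noise added by the transformation to object space.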
I have included additional minor comments directly in the PDF for further consideration. Overall, this paper presents a high-quality contribution to the field, and addressing these points would significantly enhance its clarity and impact.
RC2: 'Comment on egusphere-2024-2570', Anonymous Referee #2, 03 Oct 2024
This paper introduces some useful improvements to a well-known method in environmental monitoring. The increased degree of automation and the AI-supported multi-frame approach are valuable contributions, making the approach more robust and user-friendly. For the most part it is well written and easy to follow.
Nevertheless, major changes to the manuscript are necessary. The study claims to be pioneering and to offer considerable improvements in certain other aspects, but provides no proof. The validation of the data is very weak. The accuracy of the method (also under different setups / image geometries) remains largely open. The landform-wide mean values for displacement velocities are rather meaningless for this purpose. I sorely missed a point-to-point comparison of individual displacements measured with in situ systems (ground truth) and image tracking at one specific location. Validation includes not only the displacement magnitude but also the displacement direction, especially if the independent detection of displacement direction is highlighted as a benefit of the method. This must be remedied, and an honest accuracy analysis of the measured displacements (stable areas are not that relevant because the key problem is scaling!) has to be carried out, discussing different setup scenarios of camera, point clouds and monitored object. See also the detailed comments. That will reveal many error sources and problems which are not discussed here at all (e.g. topography and local horizons will change over time, but you use a single point cloud to calibrate your model, among many other problems).
You might notice that I was a bit annoyed in the beginning by the many references to climate change. Of course, this is an important topic and of course you can refer to it, but not at any cost when a clear connection is missing. Although the overall language is good, you must be careful to use the right terminology and to formulate precisely.
Abstract
Line 8, first sentence of the Abstract: A vague, formulaic statement. Personally, I do not like the term “permafrost dynamics” in this context; what exactly is this? Dynamics in rock glaciers are different from those in landslides (perhaps you want to specify which type of landslides you are talking about?). Active rock glaciers are indicators of permafrost dynamics. Why are they critical indicators? Landslides in permafrost are mostly affected by several different processes, and there are only a few documented cases which show a direct link between landslide activity and permafrost characteristics, e.g. rock temperature or permeability. Filtering the permafrost signal out of landslide kinematics is in most cases too difficult to call them a “critical indicator”. You yourselves write later that you do not know what drives the upper landslide. The same applies if you claim these landforms represent the thermal state of permafrost. On a decadal scale, most rock glaciers accelerated (some of them also decelerated) and thus roughly reflect the long-term thermal state of permafrost. However, on a shorter time scale it is impossible to establish a correlation between permafrost thermal state and rock glacier velocity, e.g. because water plays such an important role. For landslides it becomes even more difficult; I do not see how we can draw substantial conclusions about the thermal state of permafrost from landslide activity. There might be a political incentive to write “climate change” in the first sentence of each paper, but there is no scientific need.
Line 22: Sorry to come with this again, but why do you need a super high spatiotemporal resolution to reveal climate change impacts? Don’t you need particularly long time series to show the effect of climate change?
Intro
Introduction is rather long with some repetitions, particularly towards the end. It should be shortened.
Feature tracking in time-lapse imaging is by now a widespread, operationally applied method in natural hazard management. Many engineering offices are using it. You should mention this.
Line 28/29: better here in this context but I would rather write: “… internal structure, and it reflects long term, temperature driven changes in permafrost structure.”
Line 29: “Creep rates/velocity” instead of “flow speeds”
Line 29: “occur towards the lower permafrost limit…”
Line 30: “the acceleration of these processes becomes more pronounced as rock glacier creep rates increase in a warming climate”
Which acceleration? This sentence makes no sense as written; please check the language.
Line 32/33: What are permafrost related creep features? Landslides do not creep per definition. Be careful to use the correct terminology.
Line 35/36: As written before, this does not make sense to me. First and foremost, monitoring is important for the safety because it helps to prevent hazards. Moreover, your high-resolution data is interesting for research because it helps process understanding. This is the core of your study! Impacts of climate change become evident when you monitor over more than 3 decades and the connection to the thermal state is complex. This is rather far from your study and there is no need to link it at all costs here. Use creep only if you mean creep.
Line 37: The same is true for very slow rock glaciers…
Line 37-58: All true, but also mention the disadvantages of the time-lapse method, e.g. weather dependence, constantly changing image extents and complex distortions in many existing time series, and, depending on the setup, very limited accuracy when transforming displacements into metric 3D space.
Line 53-54: The cited cases are very different from the current one. It is possible to measure distances in images when projecting them onto a 3D model; however, depending on the setup (camera calibration and orientation, line of sight, object geometry, resolution of the 3D model and so on), the accuracy is very limited and a reasonably accurate scaling of displacements from image to world coordinates can fail, especially in the close range, at local horizons, and in front of or behind terrain which is shadowed in the image. You did not prove the opposite in your study…
Line 57: I don’t think that is true. How do you know? Cameras simply have a much longer history of widespread application than permanent GPS devices, which emerged only about 10-15 years ago.
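The scaling concern raised for Lines 53-54 can be illustrated with the standard pinhole relation: a displacement perpendicular to the line of sight at distance D, observed as p pixels by a camera with focal length f (in pixels), corresponds to roughly p · D / f in metric units. The Python sketch below uses purely hypothetical numbers and ignores lens distortion, oblique viewing geometry and motion along the line of sight. It only shows that a relative error in the assumed distance carries over directly into the scaled displacement.

```python
def pixel_to_metric(disp_px: float, distance_m: float, focal_px: float) -> float:
    """Scale an image displacement to metres with the pinhole approximation.

    Assumes the motion is perpendicular to the line of sight; focal_px is the
    focal length expressed in pixels. All inputs here are hypothetical.
    """
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    return disp_px * distance_m / focal_px


# Hypothetical setup: 2 px displacement, 4000 px focal length.
true_disp = pixel_to_metric(2.0, 1000.0, 4000.0)    # distance taken as 1000 m
biased_disp = pixel_to_metric(2.0, 1100.0, 4000.0)  # distance overestimated by 10 %
relative_error = (biased_disp - true_disp) / true_disp
```

In this simplified setting a 10 % error in the camera-to-object distance produces a 10 % error in the metric displacement, regardless of how well the feature was tracked in the image.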
Site description
What makes the upper half a landslide? Its surface structure could tempt you to call it a rock glacier too, doesn’t it? Perhaps you can explain the characteristics of the two landforms in more detail. Is the landslide ice saturated?
Methods
Line 127: mm per day is more common.
Line 143: “independent of temporal prior” ??
3.1: I found it hard to understand what kind of trajectories are calculated for the multi-frame batches. Is there one trajectory per batch and one trajectory position per frame? And how exactly are the trajectories “estimated” in case of fog? Is there some kind of interpolation, or just a gap in the trajectory for the foggy frame? Or a calculation over a longer period? Please explain in a bit more detail.
Line 160: Distortions due to camera shifts can be much more complex than simple offsets. Due to rotations or depending on the camera sometimes even slight changes in focus, the distortions are often spatially differential and only a part of it can be corrected by simple adjustments.
Line 196: What does “directly linked” mean? There are of course other error sources than just the accuracy of the point cloud.
Line 215: Why only in the stable areas? The deciding thing would be to analyze accuracy in the moving part with ground truth!?
Results
Figure 5: What does “accumulated velocity” mean? Do you mean mean velocity? Are the theodolite points the black-and-white labeled points? How was the noise level defined? In Figure 5c there is still a lot of movement above the noise level in the stable areas.
Line 238: I do not consider that more interesting. Georeferencing of locations in an image is one thing. However, a crucial step is the scaling of displacements from pixel to global coordinates. If you do not detect change in image coordinates (or barely any change), there is nothing to scale and this major error source does not come into play. Accuracy analysis of significant displacements is thus the decisive and most interesting validation step of your method.
Validation part in general
This part is very weak. What you show in Figure 6 is not a solid validation of your work, as you compare spatial averages with single-point GPS measurements or conspicuous average values from multiple TPS points. To prove your concept, you have to compare single in situ measurements with local trajectories from your images in the near vicinity of each in situ measurement. The current validation is not convincing at all.
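A point-to-point check of the kind requested here could report, for each in situ point, both the magnitude difference and the angular deviation between the ground-truth displacement and a nearby image-derived displacement. The following Python sketch is one possible formulation under that assumption; the function name and the example vectors are hypothetical and no values are taken from the manuscript.

```python
import math
from typing import Sequence, Tuple


def compare_displacements(
    ground_truth: Sequence[float],  # 3D displacement from GNSS / total station
    estimate: Sequence[float],      # 3D displacement from image tracking
) -> Tuple[float, float]:
    """Return (magnitude difference, angular deviation in degrees).

    The magnitude difference is |estimate| - |ground_truth|, in the units of
    the inputs. Raises ValueError if either vector has zero length, because
    the direction is then undefined.
    """
    norm_gt = math.sqrt(sum(c * c for c in ground_truth))
    norm_est = math.sqrt(sum(c * c for c in estimate))
    if norm_gt == 0.0 or norm_est == 0.0:
        raise ValueError("direction undefined for a zero-length displacement")
    dot = sum(a * b for a, b in zip(ground_truth, estimate))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_gt * norm_est)))
    return norm_est - norm_gt, math.degrees(math.acos(cos_angle))


# Hypothetical pair of displacement vectors in metres.
magnitude_diff, angle_deg = compare_displacements((1.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```

Listing these two numbers per in situ point, rather than a landform-wide mean, would address both the magnitude and the direction aspects of the validation.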
Discussion
Line 261 ff: “The results of our workflow show a good agreement with dGNSS, theodolite and permanent GNSS measurements, proving our method to be reliable, robust and fast for creating a better spatial (Fig. 5) and temporal coverage (Fig. 6) of the landform’s displacement”
This subjective statement isn’t supported by the presented data.
Line 266: This conclusion is not new but was made by many authors before.
Line 269: You do not know where this discrepancy originates. The displacement is not uniform over space, and perhaps the GPS is located in a faster area rather than on a surfing boulder as you say. As said above, comparing a single GPS with the average velocity of the entire landform is not a meaningful approach.
Line 294/295: This statement is not supported by your data and is most likely wrong. In theory your method might be able to calculate 3D trajectories in world coordinates, but I strongly assume that these trajectories deviate considerably from trajectories measured by GNSS or total station at the same location. Be aware that displacement directions are even more prone to error influences than displacement magnitudes. You do not validate your displacement directions from image tracking with ground truth at all, so how can you draw such a conclusion? You did not even compare displacement magnitudes at a specific location between image tracking and ground truth…
Line 291: only to a certain degree as mentioned above.
Limitations:
Line 311: True, moreover the RGB information for the point clouds might be difficult to acquire in very steep terrain (Rock walls). This is not discussed at all.
General: Your whole method depends on one point cloud. If the surface geometry is changing, which is obviously the case here, your displacement scaling and the calculation of displacement directions are distorted or will fail, depending on the size of the terrain change.
Line 355: Accurate? How do you know?
Line 362: Not really validated.
Citation: https://doi.org/10.5194/egusphere-2024-2570-RC2
Data sets
Github repository Hanne Hendrickx https://github.com/hannehendrickx/pips_env/tree/main/Data_Sample
Model code and software
Github repository Hanne Hendrickx https://github.com/hannehendrickx/pips_env
Viewed
- HTML: 221
- PDF: 63
- XML: 75
- Total: 359
- BibTeX: 5
- EndNote: 6