Exploring seismic mass-movement data with anomaly detection and dynamic time warping
Abstract. Catastrophic mass movements, such as rock avalanches, glacier collapses, and destructive debris flows, are typically rare events. Their detection is consequently challenging as annotated and verified events used as training data for instrumentation and algorithm tuning are absent or limited. In this work, we explore seismic mass-movement data through the lens of anomaly detection. The idea is to screen out segments of the data that are unlikely to contain mass movements by focusing only on anomalous signals, thereby reducing the number of signals to be studied, making downstream tasks such as expert labeling and clustering of events easier. To extract anomalous signals, we design a triggering algorithm using an anomaly score computed from an isolation forest obtained from sliding windows taken from the continuous data. The extracted signals are subjected to expert labeling and/or further analyzed by dynamic time warping, a popular technique used to evaluate the dissimilarity between different types of signals. We illustrate our approach by (a) mining for seismic signals of hazardous debris-flows in Switzerland's Illgraben catchment and (b) labeling of seismic mass movement data obtained from a Greenland seismometer network.
The manuscript presents an unsupervised approach for detecting and cataloging seismic signals generated by mass movements, based on isolation forest (IF) anomaly detection combined with dynamic time warping (DTW) for the characterization of signal dissimilarities. The methodology is applied to two case studies: the refinement of an existing debris-flow catalog in Illgraben, Switzerland, and the generation of a new catalog from seismometer data in Greenland. The results are compared against the widely used STA-LTA triggering method, showing that the proposed IF-based approach often outperforms the baseline. The topic is timely and of high relevance to environmental seismology, where labeled data remain scarce and the detection of rare but hazardous events requires robust and automated strategies.
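For readers less familiar with DTW, the kind of dissimilarity it measures can be illustrated with a minimal sketch. This is the textbook dynamic-programming formulation with an absolute-difference local cost, not the authors' implementation; the function name `dtw_distance` and the toy signals are assumptions made purely for illustration.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences.

    Uses the absolute difference as local cost and the standard
    step pattern (insertion, deletion, match). Runs in O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Hypothetical toy signals (not seismic data).
template = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
shifted_up = [x + 1.0 for x in template]

d_stretched = dtw_distance(template, stretched)
d_shifted = dtw_distance(template, shifted_up)
```

In this toy case the time-stretched copy is at zero DTW distance from the template, whereas the amplitude-shifted copy is not. This insensitivity to duration and sensitivity to amplitude is exactly the kind of property whose relevance to mass-movement signals the manuscript should discuss explicitly.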
Nevertheless, there are several important weaknesses that must be addressed before the manuscript can be considered for publication. A first and major concern lies in the perplexing incompleteness of the bibliography. The reference list remains very narrow and omits several first-order contributions to both mass-wasting and debris-flow seismology, as well as recent methodological developments in anomaly detection and machine learning applied to seismology (for mass movements, but also for glaciers, volcanoes, earthquakes, etc.). Without these references, the manuscript, which is first and foremost a methodological paper, does not convincingly situate its contribution in the broader research landscape. A substantial expansion of the literature review is mandatory in order to properly contextualize the approach, demonstrate novelty, and ensure that the article reaches the visibility it deserves within the field.
In addition, the manuscript currently places an excessive amount of important content in the appendices, including crucial figures (e.g., D2 or D3, E1, E3) showing seismic signals and examples of detected events. These visual results are central to a seismological study and should appear in the main body of the paper. Beyond this, the exposition of the methodology is at times poorly balanced. Sections presenting detailed mathematical derivations, such as the full formalism of the isolation forest, would be better placed in the appendices. In the main text, a more didactic explanation would be far more valuable, making the methodological contribution both clearer and more accessible to the readership. At present, the paper tends to emphasize equations at the expense of interpretability.
The evaluation of the results, although rigorous, would benefit from a clearer and more accessible presentation. Metrics such as precision, recall, and IoU are appropriate, but their distribution across dense tables makes comparisons difficult to follow. Averaged summary values or graphical representations would make the performance differences between methods easier to grasp. Similarly, the role of DTW, while potentially promising, is not convincingly established. At some stations DTW improves detection, but in other contexts its added value is marginal. A more explicit discussion of the specific conditions under which DTW enhances performance would strengthen the manuscript considerably.
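To make the request for summary values concrete, the sketch below computes event-level precision and recall and a duration-based IoU from lists of (start, end) intervals, then averages them over stations. The overlap-based definitions, the function names, and all numbers are assumptions for illustration only; they may differ from the definitions used in the manuscript.

```python
def overlap(a, b):
    """Length of the overlap between two (start, end) intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def total_length(intervals):
    """Total duration covered by the union of (start, end) intervals."""
    total, current_end = 0.0, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            total += end - start
            current_end = end
        elif end > current_end:
            total += end - current_end
            current_end = end
    return total


def summarize(reference, detected):
    """Event-level precision/recall and duration-based IoU (assumed definitions)."""
    hit_det = sum(any(overlap(d, r) > 0 for r in reference) for d in detected)
    hit_ref = sum(any(overlap(r, d) > 0 for d in detected) for r in reference)
    union = total_length(reference + detected)
    # Inclusion-exclusion on the two interval unions.
    inter = total_length(reference) + total_length(detected) - union
    return {
        "precision": hit_det / len(detected) if detected else 0.0,
        "recall": hit_ref / len(reference) if reference else 0.0,
        "iou": inter / union if union else 0.0,
    }


def mean_over_stations(per_station):
    """Average each metric over all stations."""
    keys = ("precision", "recall", "iou")
    n = len(per_station)
    return {k: sum(s[k] for s in per_station.values()) / n for k in keys}


# Hypothetical stations and intervals (seconds), not values from the manuscript.
per_station = {
    "STA_A": summarize([(0, 10), (20, 30)], [(5, 12), (40, 45)]),
    "STA_B": summarize([(0, 10)], [(0, 10)]),
}
summary = mean_over_stations(per_station)
```

A single row of such averages per method, or a bar chart built from them, would let readers compare IF, IF with DTW, and STA-LTA at a glance, with the per-station tables kept as supporting material.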
The generalization and scalability of the approach also deserve further elaboration. The manuscript focuses on two case studies, but it would be important to reflect on the applicability of the methodology to larger seismic networks, to other types of gravitational mass movements, and to real-time operational monitoring. A presentation and discussion of all hyperparameters used, together with their values, are mandatory.
Figures and visualization more broadly need to be improved. Beyond the introductory material, the reader is given few direct visualizations of the detection process or of anomaly scores. Examples of time series with IF anomaly scores, accompanied by a schematic representation of the full workflow, would make the study more intuitive and strengthen its appeal for a seismological audience. Related to this, the terminology is sometimes confusing. The distinction between “trigger segments,” “detections,” and “catalog entries” is central but not always presented with sufficient clarity. A clear diagram of the complete processing chain would help avoid such ambiguities.
I think the manuscript presents a promising and relevant study with strong potential impact in environmental seismology, but it requires major revisions in order to address its most significant shortcomings. The bibliography must be expanded to include key references in the field, the structure must be rebalanced to highlight results over technical appendices, the methodology should be streamlined, the role of DTW clarified, and the results presented in a more intuitive and visual way. Without these improvements, the contribution remains incomplete and risks being undervalued in the literature.
Minor comments:
- Larose et al. (2015) focuses exclusively on seismic noise monitoring. There are many other references that would more accurately illustrate the point being made here.
- Bahavar et al. (2019) and Collins et al. (2022) represent significant contributions, but they are not the only efforts, particularly regarding machine learning, which is directly relevant to the present study.
- L21–25: STA/LTA is a detector, not a discriminator. The current phrasing is misleading.
- L28: Replace “see for example” with “e.g.,” followed by citations. More exhaustive referencing is needed to 1) provide the correct context for the study and 2) guide readers to other relevant works.
- L34–36: The bibliography on background noise monitoring is more complete than that on machine learning in environmental seismology, even though the latter is central to this paper…
- L47: Clarify what “vanilla” refers to: is the term meant in the seismological or in the machine-learning sense? Many algorithms now exist that combine anomaly detection and classification (e.g., VAEs, contrastive learning).
- Sections 3.1.3/3.1.4: These are methodological and should not appear in the results section.
- The description of the datasets is insufficient (number of samples, class distribution, and training/validation/test splits are missing).
- L217: Clarify how grid search is performed in an unsupervised context. Does this not undermine the intended advantage of IF as a parameter-free exploratory tool, and its use for truly unsupervised exploration?
- Tables: Highlight best-performing results in bold to facilitate interpretation.
- L272: Provide justification for onset/offset thresholds; where do these “rule-of-thumb” values come from?
- L272–273: Why the “top 50” segments? What if more than 50 are of interest? This seems to be central to your approach and should be thoroughly discussed.
- L283–286: The explanation is unclear. A diagram of the complete processing chain would help. Also specify the inconsistency threshold used in agglomerative clustering.
- L289–290: Define the metric by which segments are “most anomalous.” Provide values. Clarify what is meant by “further emphasized by the agglomerative clustering.”
- Section 3.2.3: Replace dendrograms with examples of seismic signals in clusters (A, B, C and D). For a seismological audience, the waveforms themselves are far more informative.
- L295: The description of four clusters “in increasing order of diversity” reflects a subjective choice. Justify why the dendrogram splits were regrouped this way and acknowledge the subjectivity involved.
- Figure 4 is not useful for the discussion and could be removed or moved to the appendices.
- L346–355: This discussion belongs in the introduction, not in the conclusion.
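Regarding the comment on L21–25 above, the following minimal sketch of a classic energy-based STA/LTA trigger illustrates why it is a detector rather than a discriminator: it returns only onset and offset sample indices and carries no information about the source type. The window lengths, thresholds, function names, and synthetic trace are arbitrary assumptions for illustration, not values from the manuscript.

```python
def sta_lta_ratio(trace, n_sta, n_lta):
    """Ratio of short-term to long-term average signal energy.

    Both windows end at the current sample. Samples before the first
    full long-term window are assigned a ratio of 0.
    """
    prefix = [0.0]
    for value in trace:
        prefix.append(prefix[-1] + value * value)
    ratio = [0.0] * len(trace)
    for i in range(n_lta - 1, len(trace)):
        sta = (prefix[i + 1] - prefix[i + 1 - n_sta]) / n_sta
        lta = (prefix[i + 1] - prefix[i + 1 - n_lta]) / n_lta
        ratio[i] = sta / lta if lta > 0 else 0.0
    return ratio


def trigger_on_off(ratio, on_threshold, off_threshold):
    """Return (onset, offset) sample indices; no class label is produced."""
    events, start = [], None
    for i, r in enumerate(ratio):
        if start is None and r >= on_threshold:
            start = i
        elif start is not None and r < off_threshold:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(ratio)))
    return events


# Synthetic trace: low-amplitude background with one burst at samples 100-119.
trace = [0.1 * (-1) ** i for i in range(200)]
for i in range(100, 120):
    trace[i] = 2.0 * (-1) ** i

ratio = sta_lta_ratio(trace, n_sta=5, n_lta=50)
events = trigger_on_off(ratio, on_threshold=3.0, off_threshold=1.5)
```

Whatever produces the burst, the output is the same pair of indices; any statement about event type requires a subsequent classification or labeling step.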