the Creative Commons Attribution 4.0 License.
A study on the effect of input data length on deep learning-based magnitude classifier
Abstract. The rapid characterisation of earthquake parameters such as magnitude is at the heart of Earthquake Early Warning (EEW). In traditional EEW methods, the robustness of the estimated earthquake parameters has been observed to increase with the length of the input data. Since time is a crucial factor in EEW applications, in this paper we propose a deep learning-based magnitude classifier and investigate the effect of using five different durations of seismic waveform data after the first P-wave arrival: 1 s, 3 s, 10 s, 20 s and 30 s. This is accomplished by testing the performance of the proposed model, which combines convolutional and bidirectional long short-term memory (LSTM) units to classify waveforms by magnitude into three classes: "noise", "low-magnitude events" and "high-magnitude events". Herein, any earthquake signal with a magnitude equal to or above 5.0 is labelled as high-magnitude. We show that the variation in the results produced by changing the length of the data is no more than the inherent randomness in the trained models due to their initialisation.
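The labelling rule and the five input durations described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' preprocessing: the sampling rate of 100 Hz, the convention that a magnitude of `None` marks a noise trace, the zero-padding of short traces, and the function names `magnitude_class` and `window_after_p` are all hypothetical.

```python
import numpy as np

SAMPLING_RATE_HZ = 100            # assumption: not stated in the abstract
HIGH_MAG_THRESHOLD = 5.0          # from the abstract: >= 5.0 is high-magnitude
DURATIONS_S = (1, 3, 10, 20, 30)  # from the abstract


def magnitude_class(magnitude, threshold=HIGH_MAG_THRESHOLD):
    """Map a magnitude to one of the three class names.

    A magnitude of None is used here as a (hypothetical) marker for noise.
    """
    if magnitude is None:
        return "noise"
    return "high-magnitude" if magnitude >= threshold else "low-magnitude"


def window_after_p(trace, p_index, duration_s, fs=SAMPLING_RATE_HZ):
    """Cut `duration_s` seconds of data starting at the P-arrival sample.

    `trace` has time on its last axis, e.g. shape (channels, samples).
    If the trace ends early, the window is zero-padded (an assumption).
    """
    n = int(round(duration_s * fs))
    segment = trace[..., p_index:p_index + n]
    missing = n - segment.shape[-1]
    if missing > 0:
        pad_width = [(0, 0)] * (segment.ndim - 1) + [(0, missing)]
        segment = np.pad(segment, pad_width)  # default mode pads with zeros
    return segment
```

Passing a different `threshold` (for example 3.0 or 4.0) gives a simple way to relabel the same data for a sensitivity test of the decision boundary.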
-
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
-
RC1: 'Comment on egusphere-2022-4', Filippo Gatti, 13 Jun 2022
The authors of the manuscript entitled "A study on the effect of input data length on deep learning-based magnitude classifier" present a very interesting contribution that addresses some crucial aspects of automated, deep-learning-powered techniques for estimating earthquake magnitude. Despite the broad prospects that deep-learning techniques offer to the EEW framework, the choice of the right neural architecture, the dataset preparation and the interpretation of results are crucial in determining their success. The authors have addressed many of the concerns raised by this sensitive deep-learning application, showcasing a powerful neural classifier that can discriminate between noise, low-magnitude and high-magnitude earthquakes. Still, some open questions remain, which is why this reviewer suggests a major revision, to address them in more detail and to promote the scientific debate on the topic.
As mentioned, the manuscript is very interesting and raises many questions about the decisions the authors made in performing their classification benchmark. The list of comments below is non-exhaustive: please refer to the attached reviewed manuscript for further comments.
- The decision boundary between low-magnitude and high-magnitude is rather arbitrary, as stated by the authors and as shown in Fig. 7a. Moreover, it depends strongly on the seismic context of interest and on the risk assessment and vulnerability policies of each country/region. It would be interesting to test at least one other decision boundary, or at least to test in some way the sensitivity of the classifier to this choice.
- The effect of the source-to-site distance seems to have been disregarded. Separating the waveforms into different bins based on source-to-site distance could perhaps unveil some interesting aspects of the classification performance. Some comments on this would benefit the overall clarity of the manuscript.
- Sometimes it is rather useful to analyze the waveforms at stake in the Fourier spectrum domain. The corner frequency is strictly related to the source spectrum, which mostly determines the magnitude (along with the distance). In this case, this reviewer suggests checking the spectrogram of the classified waveforms, to verify that the duration is compatible with the corner frequency associated with the corresponding moment magnitude (see the statistical relationship between corner frequency and moment magnitude presented by Courboulex F, Vallée M, Causse M, Chounet A (2016) Stress-drop variability of shallow earthquakes extracted from a global database of source time functions. Seismol Res Lett 87(4):912–918).
- Have the authors considered the earthquake type when preparing the dataset? A comment on this aspect would be very interesting.
There are no technical corrections, besides the need to separate the unit of measure from the number (e.g. 10 s and not 10s).
-
AC1: 'Reply on RC1', Megha Chakraborty, 05 Sep 2022
The authors thank the reviewer Dr. Filippo Gatti for his feedback.
Listed below are the responses to individual comments:
- We have experimented with decision boundaries of magnitude 3 and 4. The accuracy, precision and recall values were found to be similar to what has been presented in the manuscript and did not show any clear dependence on the length of input data. A comment on this will be added to a revised version of the manuscript.
- The authors agree with the point raised by the reviewer and thank him for raising it. We have analyzed the model performance for different source-to-site distances and observed that the model is indeed capable of performing reliably over a wide range of hypocentral distances. In other words, no clear dependence between the model performance and hypocentral distance can be observed. Shown below is the relevant figure, which can also be added in a revised version of the manuscript.
- We noticed that the model is capable of performing correct classifications over a wide range of hypocentral distances and magnitudes, suggesting that it is capable of learning the frequency characteristics of the waveforms. The use of the Fourier spectrum in addition to waveform data was tested during our initial experiments, and it achieved results comparable to those of the model that used only waveform data as input.
- We have analyzed the effect of hypocentral distance (figure above) and SNR (figure below) on the model performance. While we do not see any clear dependence on hypocentral distance, the SNR of the data seems to play a role in the classification of waveforms. The relevant plots will be included in a revised version of the manuscript. On the other hand, because information on the focal mechanism is unavailable in the metadata, we were not able to experiment with this. However, the role of the earthquake source type could be considered further in a separate study.
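A check of performance against hypocentral distance, of the kind discussed in the replies above, could be sketched as below. This is a hedged illustration only: the function name `accuracy_by_distance`, the bin edges and the use of plain accuracy as the metric are assumptions, and no values from the manuscript are reproduced here.

```python
import numpy as np


def accuracy_by_distance(distances_km, y_true, y_pred, bin_edges_km):
    """Return (lower_edge, upper_edge, count, accuracy) for each distance bin.

    Bins are half-open, [lower, upper). Empty bins get an accuracy of NaN.
    """
    distances_km = np.asarray(distances_km, dtype=float)
    correct = np.asarray(y_true) == np.asarray(y_pred)
    # np.digitize returns i such that bin_edges[i-1] <= x < bin_edges[i]
    bin_ids = np.digitize(distances_km, bin_edges_km)
    results = []
    for i in range(1, len(bin_edges_km)):
        mask = bin_ids == i
        count = int(mask.sum())
        accuracy = float(correct[mask].mean()) if count else float("nan")
        results.append((bin_edges_km[i - 1], bin_edges_km[i], count, accuracy))
    return results
```

The same grouping could be applied to SNR instead of distance by passing SNR values and SNR bin edges; reporting the count per bin helps to judge how much weight each accuracy value deserves.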
Citation: https://doi.org/10.5194/egusphere-2022-4-AC1
-
RC2: 'Comment on egusphere-2022-4', Anonymous Referee #2, 31 Jul 2022
"general comments"
This paper tries to develop a rapid magnitude classification method using deep learning. This topic is important for rapid threshold warning in EEW. The objective, data and results are fairly well documented, while the reviewer has questions described in the specific comments section.
"specific comments"
Comments for the last paragraph in 'introduction':
The authors said that the boundary between low and high magnitudes is arbitrarily chosen and does not influence the model performance. However, the reviewer thinks that the boundary selection could affect the performance, because the faulting process becomes more complex for larger earthquakes, so that the initial P-wave does not necessarily have a large amplitude within the P-wave train of a larger earthquake. In the paper, the analysis duration does not affect the results, but these results are only examined for a magnitude boundary of 5.0. If the boundary were shifted higher (e.g. to 7.0), the analysis duration could affect the performance, although such an analysis is difficult with STEAD.
Comments for the description of data used:
Are there any selection criteria in source-to-site distance and station?
STEAD includes data from small to large distances. In the scheme of the paper, the station(s) nearest to the epicenter seem appropriate for the analysis, because rapid warning is the purpose. Please add a description of the selection criteria for distance/station, if any exist. Also, please add a distance distribution like Figure 1, irrespective of whether such criteria exist.
The reviewer wonders whether the use of large-distance records increases the difficulty of classification, because such records have very complicated waveforms due to propagation over long distances through complex media.
Comments for Model Architecture:
Please describe why the authors chose the model architecture in Figure 3. (Please explain how each part contributes.)
Citation: https://doi.org/10.5194/egusphere-2022-4-RC2
AC2: 'Reply on RC2', Megha Chakraborty, 05 Sep 2022
The authors thank Anonymous Referee #2 for their feedback. Listed below are our responses to the "specific comments":
- It is difficult to experiment with decision boundaries above 5 because the number of waveforms for such high magnitudes in the dataset is severely limited. However, we experimented with decision boundaries of magnitude 3 and 4 and obtained similar results (this is also explained in AC1 in this discussion: https://doi.org/10.5194/egusphere-2022-4-AC1).
- Currently no selection criteria are applied to the source-to-site distance (see next comment).
- The authors thank Anonymous Referee #2 for suggesting the analysis of source-to-site distance of the training data. The figure for the distribution of source-to-site distances is shown below and will be added in the revised version of the manuscript.
- As discussed in AC1 (https://doi.org/10.5194/egusphere-2022-4-AC1) we have analyzed the model performance for different source-to-site distances and observed that the model is indeed capable of performing reliably over a wide range of hypocentral distances. In other words, no clear dependence between the model performance and hypocentral distance can be observed. Shown below is the relevant figure which can also be added in a revised version of the manuscript.
- Convolutional Neural Networks have often been found to be useful for seismological data analysis, as they are capable of extracting patterns in the data (features) without any temporal dependence. When combined with LSTMs, the temporal relations between these features can be obtained. In applications such as magnitude-based classification of earthquakes, this aids in the effective analysis of signal features as compared to the pre-signal background noise. The dropout layers are used to prevent the model from overfitting, and the max-pooling layer reduces the data dimensionality so that only relevant features are retained. The final layer is a softmax layer, which outputs the probabilities corresponding to each of the three classes into which the data is classified. This description will be added to a revised version of the manuscript.
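Two of the simpler building blocks named in this reply, max pooling and the final softmax, can be illustrated with a minimal NumPy sketch. It is not the authors' model: the convolutional, bidirectional LSTM and dropout layers are omitted, and the function names, the pool size and the (time, channels) array layout are assumptions made for the example.

```python
import numpy as np


def max_pool_1d(features, pool_size):
    """Non-overlapping max pooling along the time axis.

    `features` has shape (time_steps, channels). Trailing time steps that
    do not fill a complete pool are dropped.
    """
    n_steps = features.shape[0] // pool_size
    trimmed = features[: n_steps * pool_size]
    # Group time steps into blocks of `pool_size` and keep each block's maximum
    return trimmed.reshape(n_steps, pool_size, -1).max(axis=1)


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)
```

With three output scores, one each for "noise", "low-magnitude events" and "high-magnitude events", `softmax` returns three probabilities, and the predicted class is the one with the largest value.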
Citation: https://doi.org/10.5194/egusphere-2022-4-AC2
Viewed
Since the preprint corresponding to this journal article was posted outside of Copernicus Publications, the preprint-related metrics are limited to HTML views.
- HTML: 215
- PDF: 0
- XML: 0
- Total: 215
- BibTeX: 2
- EndNote: 1
Megha Chakraborty
Wei Li
Johannes Faber
Georg Rümpker
Horst Stoecker
Nishtha Srivastava