the Creative Commons Attribution 4.0 License.
Real-time Earthquake Monitoring using Deep Learning: a case study on Turkey Earthquake Aftershock Sequence
Abstract. Seismic phase picking and magnitude estimation are essential components of real-time earthquake monitoring and earthquake early warning systems. Reliable phase picking enables the timely detection of seismic wave arrivals, facilitating rapid earthquake characterization and early warning alerts. Accurate magnitude estimation provides crucial information about an earthquake’s size and potential impact. Together, these steps contribute to effective earthquake monitoring, enhancing our ability to implement appropriate response measures in seismically active regions and mitigate risks. In this study, we explore the potential of deep learning in real-time earthquake monitoring. To that end, we begin by introducing DynaPicker, which leverages dynamic convolutional neural networks to detect seismic body-wave phases. Subsequently, DynaPicker is employed for seismic phase picking on continuous seismic recordings. To showcase the efficacy of DynaPicker, several open-source seismic datasets, including window-format data and continuous seismic data, are used to demonstrate its performance in seismic phase identification and arrival-time picking. Additionally, DynaPicker’s robustness in classifying seismic phases was tested on low-magnitude seismic data polluted by noise. Finally, the phase arrival-time information is integrated into a previously published deep-learning model for magnitude estimation. This workflow is then applied and tested on the continuous recordings of the aftershock sequences following the Turkey earthquake to detect earthquakes, pick seismic phases, and estimate the magnitude of the corresponding events. The results obtained in this case study exhibit a high level of reliability in detecting the earthquakes and estimating the magnitude of aftershocks following the Turkey earthquake.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint
Interactive discussion
Status: closed

RC1: 'Comment on egusphere-2023-1391', Anonymous Referee #1, 31 Jul 2023
In this work, the authors apply the dynamic convolutional neural network to two tasks: seismic phase classification and arrival-time picking. They compared the new model, DynaPicker, to a few other deep learning models and demonstrated that DynaPicker could achieve a better performance for input data of different lengths. The main concern I have for this work is that the model comparison may not be very accurate. The reported improvements in precision/recall/F1 scores are not significant, so the performance of DynaPicker may become even worse if choosing a slightly different threshold. Based on the selected examples shown in the paper, the false positive rate of the new model could be very high. I would request the authors to plot precision-recall curves to compare the performance of different models to avoid the bias of selecting a specific threshold for comparison. One good example is Münchmeyer et al.'s work of "Which Picker Fits My Data?"
Comments:
1. Table 4: I have three comments on the reported results: (1) Based on the standard deviations of the time residuals, we can see clearly DynaPicker has a very large time error for both P and S phases. I am wondering if DynaPicker is really an improved alternative to current models. (2) In the table, only the number of picks (< 0.5 s) is reported. But how many picks (> 0.5 s) does DynaPicker detect? It is important to report the false positives. (3) The absolute number of undetected events is not helpful. What activation threshold do you use? How many false positive events do you detect in order to detect all true events?
2. Fig. 2: I am confused by this plot. If the predicted scores are also pretty high for waveforms that are not P or S phases, there could be many false positives. Based on the examples shown in Fig. 5, we can see DynaPicker can also easily pick up false positives.
3. Eq. 5: Did you compare the results using T = 1 and T = 4 for the phase picking problem? Because the temperature softmax function is not used by previous works of phase picking, it is necessary to demonstrate that it can help the phase picking task.
4. L230: "we can observe that EPick achieves the best performance in phase picking over DynaPicker by using different window sizes." Does this mean the claimed advantage of DynaPicker for different input lengths is not true? Although you explain that the reason is that EPick is pretrained using the STEAD dataset, you can also train DynaPicker using the STEAD dataset to make the comparison more accurate.
5. L240: "The testing accuracy of DynaPicker is 98.82%, which is slightly greater than CapsPhase [30] (98.66%) and 1D-ResNet [9] (98.66%)." Because the differences are very small and do not tell readers much information, could you compare the waveforms of false predictions of these models to help understand where DynaPicker can be better?
Citation: https://doi.org/10.5194/egusphere-2023-1391-RC1
AC2: 'Reply on RC1', Nishtha Srivastava, 03 Nov 2023
Response to Reviewers
Dear Reviewer,
We appreciate the time and effort that you have dedicated and are grateful for your insightful comments which have improved
the manuscript. We have incorporated the constructive suggestions, and have highlighted the changes within the manuscript
and marked them in blue color. Here is a point-by-point response to your comments and concerns.
Reviewer 1
In this work, the authors apply the dynamic convolutional neural network to two tasks: seismic phase classification and
arrival-time picking. They compared the new model, DynaPicker, to a few other deep learning models and demonstrated that
DynaPicker could achieve better performance for input data of different lengths. The main concern I have for this work is that
the model comparison may not be very accurate. The reported improvements in precision/recall/F1 scores are not significant,
so the performance of DynaPicker may become even worse if choosing a slightly different threshold. Based on the selected
examples shown in the paper, the false positive rate of the new model could be very high. I would request the authors to
plot precision-recall curves to compare the performance of different models to avoid bias in selecting a specific threshold for
comparison. One good example is Münchmeyer et al.’s work of “Which Picker Fits My Data?”
Response: We thank the reviewer for the constructive suggestion.
– Following Münchmeyer et al.’s work “Which Picker Fits My Data?”, we have plotted the receiver operating characteristic
(ROC) curves for the DynaPicker and GPD models. Based on Figure 1 (see attachment), it is evident that DynaPicker exhibits
a low false positive rate similar to that of the GPD model. Furthermore, the picking error distributions summarized in Tables 4
and 5 show that DynaPicker performs better in phase arrival-time picking than GPD.
– Unfortunately, despite multiple attempts, the CapsPhase model retrieved from the git repository cannot be used
at this time. When loading the CapsPhase model in the created virtual environment, we received a segmentation fault
error, even after increasing the memory size. We will keep trying to perform the comparison with the CapsPhase
model and will address this in our follow-up work.
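The threshold dependence raised by the reviewer can be illustrated with a small sweep over decision thresholds. The following is a minimal sketch in plain Python, not the authors' evaluation code: the function name `sweep_thresholds`, the scores, and the labels are all hypothetical. Each row gives one point on a precision-recall curve (precision, recall) and one point on an ROC curve (recall, false positive rate).

```python
def sweep_thresholds(scores, labels, thresholds):
    """Return (threshold, precision, recall, false_positive_rate) per threshold.

    scores: predicted phase probabilities; labels: 1 = phase, 0 = noise.
    A window is counted as a detection when its score is >= the threshold.
    """
    rows = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        # Guard against empty denominators when nothing is predicted or present.
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        rows.append((t, precision, recall, fpr))
    return rows


# Hypothetical scores and ground-truth labels, for illustration only.
scores = [0.99, 0.95, 0.80, 0.72, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

curve = sweep_thresholds(scores, labels, [0.5, 0.7, 0.9])
for t, p, r, fpr in curve:
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f} fpr={fpr:.2f}")
```

Comparing models over the whole sweep, rather than at one threshold, is what removes the dependence on a single operating point.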
1. Table 4: I have three comments on the reported results: (1) Based on the standard deviations of the time residuals, we can
see clearly DynaPicker has a very large time error for both P and S phases. I am wondering if DynaPicker is really an improved
alternative to current models. (2) In the table, only the number of picks (< 0.5 s) is reported. But how many picks (> 0.5 s) does
DynaPicker detect? It is important to report the false positives. (3) The absolute number of undetected events is not helpful.
What activation threshold do you use? How many false positive events do you detect in order to detect all true events?
Response:
– Here we followed the definition used in the CapsPhase work, i.e., a predicted pick is counted as a true positive if its
absolute error relative to the ground-truth pick is below 0.5 s. As indicated in Table 4, the time residuals of
DynaPicker exhibit standard deviations that are either similar to or smaller than those of the other models.
– In Table 4, a total of 10,000 events for each scenario are randomly selected from the STEAD dataset. All of these
earthquake events are correctly detected using DynaPicker. In case 1 (model used: DynaPicker), there are 945 events with
P-phase picking errors exceeding 0.5 s and 2304 events with S-phase picking errors exceeding 0.5 s. In case
2 (model used: GPD), there are 2595 events with P-phase picking errors and 2403 events with S-phase picking errors
greater than 0.5 s. In both Tables 4 and 5, we have introduced two additional columns to denote the number
of picks exceeding 0.5 s for both P and S phases, as shown in the following tables.
– In the task of the seismic phase classification with the SCEDC dataset, we did not use any activation threshold for
event detection. However, in the context of analyzing the aftershock sequence of the Turkey earthquake, we empirically
established an activation threshold of 0.7 for detecting events. We would further like to point out that the threshold
should indeed be chosen carefully by the user based on the station and the data and what we show here is an example.
The threshold of 0.7 was chosen experimentally to obtain the best balance between false positives and false
negatives.
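The 0.5 s tolerance rule and the residual statistics discussed above can be sketched as follows. This is an illustrative example only: the function name `evaluate_picks` and the pick times are hypothetical, and computing the mean and standard deviation over the within-tolerance picks alone is an assumption, since the response does not state which picks enter those statistics.

```python
from statistics import mean, pstdev


def evaluate_picks(predicted, reference, tolerance=0.5):
    """Summarize arrival-time residuals (predicted - reference, in seconds).

    A pick counts as a true positive when |residual| < tolerance.
    Assumption: mean and standard deviation use only within-tolerance picks.
    """
    residuals = [p - r for p, r in zip(predicted, reference)]
    within = [d for d in residuals if abs(d) < tolerance]
    return {
        "n_within": len(within),
        "n_exceeding": len(residuals) - len(within),
        "mean": mean(within) if within else None,
        "std": pstdev(within) if within else None,
    }


# Hypothetical P-phase pick times (s) for four events, for illustration only.
predicted = [10.02, 15.10, 20.70, 31.95]
reference = [10.00, 15.00, 20.00, 32.00]

summary = evaluate_picks(predicted, reference)
print(summary)
```

Reporting `n_exceeding` next to the residual statistics is the same information the two additional columns in Tables 4 and 5 are described as carrying.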
Changes in manuscript: We have updated Tables 4 and 5 in the manuscript. Lines 78–80 are added.
2. Fig. 2: I am confused by this plot. If the predicted scores are also pretty high for waveforms that are not P or S phases,
there could be many false positives. Based on the examples shown in Fig. 5, we can see DynaPicker can also easily pick up
false positives.
Response: Figure 2 provides a schematic representation of the arrival-time picking process for continuous seismic data,
employing various window sizes while processing the same continuous waveform. Even though the probability might be
relatively high (> 0.7) in a few other windows, we opt for the window with the highest P/S probability, which is usually on the order
of 0.99, to estimate the phase arrival time. As illustrated in the initial ROC curves, by following this approach, DynaPicker
exhibits a minimal number of false positives.
3. Eq. 5: Did you compare the results using T = 1 and T = 4 for the phase picking problem? Because the temperature softmax
function is not used by previous works of phase picking, it is necessary to demonstrate that it can help the phase picking task.
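For readers unfamiliar with the term, a temperature softmax divides the logits by T before normalizing, so a larger T yields a flatter distribution while the ranking of the entries is unchanged. The sketch below is a generic illustration and not the manuscript's Eq. 5: the function name and the logits are hypothetical, and whether the manuscript applies the temperature to class scores or to kernel attention weights is not stated in this discussion.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature; T = 1 recovers the ordinary softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits, for illustration only.
logits = [4.0, 1.0, 0.5]

p_t1 = softmax_with_temperature(logits, temperature=1.0)
p_t4 = softmax_with_temperature(logits, temperature=4.0)
print("T=1:", [round(p, 3) for p in p_t1])
print("T=4:", [round(p, 3) for p in p_t4])
```

With T = 4 the largest entry receives less weight than with T = 1, which is why the two settings can lead to different behavior during training and inference.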
Response: We concur with the reviewer’s observation. We have summarized the outcomes of using different
temperatures for phase picking in the Appendix. The relevant table is presented below. You can find this table on page 22 of the
manuscript.
4. L230: "we can observe that EPick achieves the best performance in phase picking over DynaPicker by using different
window sizes." Does this mean the claimed advantage of DynaPicker for different input lengths is not true? Although you
explain that the reason is that EPick is pretrained using the STEAD dataset, you can also train DynaPicker using the STEAD
dataset to make the comparison more accurate.
Response: In response to the reviewer’s recommendation, we proceeded to retrain the EPick model using the same dataset
sourced from the STEAD data, which serves as the training data for DynaPicker. Subsequently, we employed the retrained
EPick model to estimate phase arrival times for continuous data extracted from the STEAD dataset. It is important to highlight
that there is no overlap between the datasets used for EPick training and those utilized for phase arrival time detection. We
observed that, while the performance in detecting the P phase was similar, the mean S-phase picking error increased from
0.002 s to 0.050 s, and the standard deviation increased from 0.122 s to 0.147 s. Additionally, the EPick model
was developed for the task of estimating seismic phase arrival times with a fixed input length. In contrast, the DynaPicker model
was primarily designed for phase classification, and its adaptation for phase arrival-time detection with different input lengths
is a notable application. Furthermore, as the size of the training data increased, DynaPicker exhibited improved performance
and demonstrated greater robustness when compared to EPick.
Changes in manuscript: Text on pages 11 and 12 was updated to help readers understand the process.
5. L240: "The testing accuracy of DynaPicker is 98.82%, which is slightly greater than CapsPhase [30] (98.66%) and
1D-ResNet [9] (98.66%)." Because the differences are very small and do not tell readers much information, could you compare
the waveforms of false predictions of these models to help understand where DynaPicker can be better?
Response: As per the reviewer’s suggestion, we plot several waveforms that are falsely predicted by the other models but correctly
identified by DynaPicker.
Figure 2. Visualization of trace examples.
From these figures, we can observe that, compared with other models, DynaPicker shows its advantage in phase classification.
In scenarios where the ground truth label is noise and the seismic waveform exhibits increased noise levels, DynaPicker
accurately identifies it as noise, whereas the GPD and ResNet models tend to misclassify it. Unfortunately, as mentioned
before, the CapsPhase model retrieved from the git repository cannot be used at this time.

RC2: 'Comment on egusphere-2023-1391', Anonymous Referee #2, 11 Sep 2023
The manuscript proposes a new deep-learning picker that leverages dynamic convolutional neural networks for detecting and picking seismic phases from windowed or continuous waveform data. The authors then combined the previously published CREIME model for magnitude estimations of waveform windows that have high P-wave probabilities. The authors have evaluated the performance of their picker and their combined workflow on open-source seismic datasets and aftershocks following the Turkey earthquake. The technical part of the manuscript is overall solid. However, I have vital concerns about the ‘real-time’ claim. It seems to me that the authors have confused the concept of processing continuous data with the concept of real-time earthquake monitoring. I suggest the authors modify their claim from ‘Real-time’ to ‘efficient’ and emphasize more on the performance of the proposed deep-learning picker. Aside from the ‘real-time’ claim, the study seems good overall. Below are my detailed comments:
– How is the term ‘Real-time’ defined? What is the time cost between the time of data recorded at the seismometer and the time of output produced? Please note that there are several important steps for real-time earthquake monitoring besides the time cost of the phase-picking model. For example, how is the time cost of the data transmitted from the seismometer to the data center? Is the data processed at the seismometer end with edge computing (which would be important in areas with poor internet access), or is the data transmitted to the data center first and processed later there? The data packages in the real-time seismic data flow can contain errors due to transmission issues. How is that addressed?
– What is the inference time cost of the model? What is the key advantage of the proposed method over conventional and lightweight convolutional deep-learning pickers in terms of real-time monitoring? The authors claim, “However, most of the prevalent CNN-based models perform inference using static convolution kernels, which may limit their representation power, efficiency, and ability for interpretation.” However, to my knowledge, the current CNN-based models, especially lightweight ones, are sufficient for millisecond-level inference. One key claim of the manuscript is that the proposed method is much faster and, therefore, more suitable for real-time earthquake monitoring. However, I didn’t find any quantitative comparisons on the inference speed in this paper.
– The event's location is a key piece of information in earthquake monitoring and yet is not resolved by the current workflow. The lack of event location information would decrease the significance of the proposed monitoring method.
– Why is being adaptive to different input lengths important? Is that because in the real-time earthquake monitoring scenario that the authors are dealing with, the input lengths of data chunks can be significantly different? And what are the advantages of the proposed method over the RNN-based pickers, which can also adapt to different input lengths?
– Section 5.5 ‘Real-time earthquake detection’: how is the ‘real-time’ here different from ‘continuous data’? Section 6.2 ‘the live data of the Turkey earthquake’: what does ‘live data’ mean? Do the authors have access to the real-time data packages from the Earthquake Data Center System of Turkey, or do they use the downloaded continuous waveform data?
Citation: https://doi.org/10.5194/egusphere-2023-1391-RC2
AC1: 'Reply on RC2', Nishtha Srivastava, 27 Oct 2023
Response to Reviewers
Dear Reviewer 2,
We appreciate the time and effort that you have dedicated and are grateful for your insightful comments which have improved
the manuscript. We have incorporated the constructive suggestions, and have highlighted the changes within the manuscript
and marked them in blue color. Here is a point-by-point response to your comments and concerns.
Reviewer 2
The manuscript proposes a new deep-learning picker that leverages dynamic convolutional neural networks for detecting and
picking seismic phases from windowed or continuous waveform data. The authors then combined the previously published
CREIME model for magnitude estimations of waveform windows that have high P-wave probabilities. The authors have
evaluated the performance of their picker and their combined workflow on open-source seismic datasets and aftershocks following
the Turkey earthquake. The technical part of the manuscript is overall solid. However, I have vital concerns about the
‘real-time’ claim. It seems to me that the authors have confused the concept of processing continuous data with the concept of
real-time earthquake monitoring. I suggest the authors modify their claim from ‘Real-time’ to ‘efficient’ and emphasize more on the
performance of the proposed deep-learning picker. Aside from the ‘real-time’ claim, the study seems good overall. Below are my
detailed comments:
Response: We appreciate the reviewer’s suggestion to modify our claim from ‘real-time’ to ‘efficient’. We understand that
‘real-time’ can have different interpretations, and we agree that emphasizing the efficiency of our proposed deep-learning
picker is essential. We made this adjustment in the revised manuscript, avoiding the strict interpretation of ‘real-time’.
1. How is the term ‘Real-time’ defined? What is the time cost between the time of data recorded at the seismometer and the
time of output produced? Please note that there are several important steps for real-time earthquake monitoring besides the
time cost of the phase-picking model. For example, how is the time cost of the data transmitted from the seismometer to the
data center? Is the data processed at the seismometer end with edge computing (which would be important in areas with poor
internet access), or is the data transmitted to the data center first and processed later there? The data packages in the real-time
seismic data flow can contain errors due to transmission issues. How is that addressed?
Response: We acknowledge that using the term ‘real-time’ to describe our model’s performance was a misnomer. Our model does
not operate in real time, and therefore, we did not perform any analysis on the time cost of real-time data flow. We apologize
for any confusion caused by the incorrect terminology and thank the reviewer for pointing this out. Instead, our proposed model
provides timely results from continuous waveform recordings. We revised the manuscript to accurately reflect this and ensure
that our terminology aligns with the actual capabilities of our model.
2. What is the inference time cost of the model? What is the key advantage of the proposed method over conventional
and lightweight convolutional deep-learning pickers in terms of real-time monitoring? The authors claim, “However, most
of the prevalent CNN-based models perform inference using static convolution kernels, which may limit their representation
power, efficiency, and ability for interpretation.” However, to my knowledge, the current CNN-based models, especially
lightweight ones, are sufficient for millisecond-level inference. One key claim of the manuscript is that the proposed method is
much faster and, therefore, more suitable for real-time earthquake monitoring. However, I didn’t find any quantitative
comparisons on the inference speed in this paper.
Response: In the introduction section of the manuscript, we claim that “However, most of the prevalent CNN-based models
perform inference using static convolution kernels, which may limit their representation power, efficiency, and ability for
interpretation.” To clarify, our primary focus in this work is the use of dynamic networks to enhance
seismic phase classification performance and, consequently, reduce the errors in phase arrival-time estimation. As
a result, we have not quantified inference speed in this paper. However, following the reviewer’s
suggestion, we will explore such a relative comparison in our follow-up work.
3. The event’s location is a key piece of information in earthquake monitoring and yet is not resolved by the current workflow. The
lack of event location information would decrease the significance of the proposed monitoring method.
Response: Event location is indeed a crucial component of seismic analysis, but its inclusion in our current model was
beyond the scope of this work. We understand the importance of this aspect and recognize it as an essential feature for a
comprehensive seismic monitoring system. However, obtaining accurate phase arrivals is crucial for correct event location
determination. We plan to address event location information and upgrade the current model in future research.
4. Why is being adaptive to different input lengths important? Is that because in the real-time earthquake monitoring scenario
that the authors are dealing with, the input lengths of data chunks can be significantly different? And what are the advantages
of the proposed method over the RNN-based pickers, which can also adapt to different input lengths?
Response:
– The proposed method is adaptive to different input lengths, as it can accommodate continuous data of different
durations. We have edited the text in line 40 to highlight this.
– We are currently working on an RNN-based model for event detection, which will be a follow-up to this work.
5. Section 5.5 ‘Real-time earthquake detection’: how is the ‘real-time’ here different from ‘continuous data’? Section 6.2
‘the live data of the Turkey earthquake’: what does ‘live data’ mean? Do the authors have access to the real-time data packages
from the Earthquake Data Center System of Turkey, or do they use the downloaded continuous waveform data?
Response: We incorrectly used the term ‘real-time’ when referring to our data source. We apologize again for the
inconvenience. To accurately describe our data source, we now use the term ‘continuous seismic recordings’ throughout the revised
manuscript. This term better reflects the downloaded continuous waveform data that we are using.
Interactive discussion
Status: closed

RC1: 'Comment on egusphere20231391', Anonymous Referee #1, 31 Jul 2023
In this work, the authors apply the dynamic convolutional neural network to two tasks: seismic phase classification and arrivaltime picking. They compared the new model, DynaPicker, to a few other deep learning models and demonstrated that DynaPicker could achieve a better performance for input data of different lengths. The main concern I have for this work is that the model comparison may not be very accurate. The reported improvements in precision/recall/F1 scores are not significant, so the performance of DynaPicker may become even worse if choosing a slightly different threshold. Based on the selected examples shown in the paper, the false positive rate of the new model could be very high. I would request the authors to plot precisionrecall curves to compare the performance of different models to avoid the bias in selected a specific threshold for comparison. One good example is Münchmeyer et al.'s work of "Which Picker Fits My Data?"Comments:1. Table 4: I three comments of the reported results: (1) Based on the standard deviations of the time residuals, we can see clearly DynaPicker has a very large time error for both P and S phase. I am wondering if DynaPicker is really an improved alternative to current models. (2) In the table, only the number of picks (< 0.5s) is reported. But how many picks (> 0.5s) does DynaPicker detected? It is important to report the false positives. (3) The absolute number of undetected events is not helpful. What activation threshold do you use? How many false positive events do you detected in order to detect all true events?2. Fig. 2: I am confused by this plot. If the predicted scores are also pretty high for waveforms that are not P or S phases, there could be many false positives.Based on the examples shown in Fig. 5, we can see DynaPicker can also easily pick up false positives.3. Eq. 5: Did you compare the results using T = 1 and T = 4 for the phase picking problem? 
Because the temperature softmax function is not used by previous works of phase picking, it is necessary to demonstrate that it can help the phase picking task.4. L230: "we can observe that EPick achieves the best performance in phase picking over DynaPicker by using different window sizes." Does this mean the claimed advantage of DynaPicker for different input length is not true? Although you explain that the reason is that EPick is pretrained using the STEAD dataset, you can also train DynaPicker using the STEAD dataset to make the comparison more accurate.5. L240: "The testing accuracy of DynaPicker is 98.82%, which is slightly greater than CapsPhase [30] (98.66%) and 1DResNet [9] (98.66%)." Because the differences are very small and do not tell readers much information, could you compare the waveforms of false predictions of these models to help understand where DynaPicker can be better?Citation: https://doi.org/
10.5194/egusphere20231391RC1 
AC2: 'Reply on RC1', Nishtha Srivastava, 03 Nov 2023
Response to Reviewers
Dear Reviewer,
We appreciate the time and effort that you have dedicated and are grateful for your insightful comments which have improved
the manuscript. We have incorporated the constructive suggestions, and have highlighted the changes within the manuscript
and marked them in blue color. Here is a pointbypoint response to your comments and concerns.
Reviewer 1
In this work, the authors apply the dynamic convolutional neural network to two tasks: seismic phase classification and arrival
time picking. They compared the new model, DynaPicker, to a few other deep learning models and demonstrated that Dy
naPicker could achieve better performance for input data of different lengths. The main concern I have for this work is that
the model comparison may not be very accurate. The reported improvements in precision/recall/F1 scores are not significant,
so the performance of DynaPicker may become even worse if choosing a slightly different threshold. Based on the selected
examples shown in the paper, the false positive rate of the new model could be very high. I would request the authors to
plot precisionrecall curves to compare the performance of different models to avoid bias in selecting a specific threshold for
comparison. One good example is Münchmeyer et al.’s work of "Which Picker Fits My Data?”
Response: We thank the reviewer for the constructive suggestion.
– Following Münchmeyer et al.’s work “Which Picker Fits My Data?”, we have plotted the receiver operating characteristic
(ROC) curves for the DynaPicker and GPD models as follows. Based on Figure 1 (see attachment), it’s evident that DynaPicker exhibits
a similar low false positive rate to the GPD model. Furthermore, the picking error distribution summarized in Tables 4
and 5 that DynaPicker performs better in phase arrival time picking than GPD.
– Unfortunately, even with our multiple attempts, the CapsPhase model retrieved from the git repository cannot be utilized
at this time. When loading the CapsPhase model in the created virtual environment, we received a segmentation fault
error, even after increasing the memory size. We will keep on trying to perform the comparison with the CapsPhase
model and will address this in our followup work.
1. Table 4: I have three comments of the reported results: (1) Based on the standard deviations of the time residuals, we can
see clearly DynaPicker has a very large time error for both P and S phases. I am wondering if DynaPicker is really an improved
alternative to current models. (2) In the table, only the number of picks (< 0.5s) is reported. But how many picks (> 0.5s) does
DynaPicker detected? It is important to report the false positives. (3) The absolute number of undetected events is not helpful.
What activation threshold do you use? How many false positive events do you detected in order to detect all true events?Response:
– Here we followed the CapsPhase work definition i.e., ’if the error between the model’s predicted picks and the ground
truth picks have an absolute error below 0.5s, then it is true positive’. As indicated in Table 4, the time residuals of Dy
naPicker exhibit standard deviations that are either similar to or smaller than those of other models. In certain instances,
DynaPicker even surpasses the other models by having lower standard deviations.
– In Table 4, a total number of 10,000 events for each scenario are randomly selected from the STEAD dataset. All of these
earthquake events are correctly detected using DynaPicker. In case 1 (used model: DynaPicker), there are 945 events with
Pphase picking errors exceeding 0.5s and 2304 events with Sphase picking errors exceeding 0.5s, respectively. In case
2 (used model: GPD), there are 2595 events with Pphase picking error and 2403 events with Sphase picking error
greater than 0.5s, respectively. In both Tables 4 and 5, we have introduced two additional columns to denote the number
of picks exceeding 0.5s for both P and Sphases as shown in the following tables.
– In the task of the seismic phase classification with the SCEDC dataset, we did not use any activation threshold for
event detection. However, in the context of analyzing the aftershock sequence of the Turkey earthquake, we empirically
established an activation threshold of 0.7 for detecting events. We would further like to point out that the threshold
should indeed be chosen carefully by the user based on the station and the data and what we show here is an example.
The threshold of 0.7 was chosen experimentally to get the most optimum balance between false positives and false
negatives.
Changes in manuscript: We have updated Tables 4 and 5 in the manuscript. Lines 78-80 have been added.
2. Fig. 2: I am confused by this plot. If the predicted scores are also pretty high for waveforms that are not P or S phases,
there could be many false positives. Based on the examples shown in Fig. 5, we can see DynaPicker can also easily pick up
false positives.
Response: Figure 2 provides a schematic representation of the arrival time picking process for continuous seismic data,
employing various window sizes while processing the same continuous waveform. Even though the probability might be
relatively high (> 0.7) in a few other windows, we opt for the window with the highest P/S probability, which is usually on the order
of 0.99, to estimate the phase arrival time. As illustrated in the initial ROC curves, by following this approach, DynaPicker
exhibits a minimal number of false positives.
3. Eq. 5: Did you compare the results using T = 1 and T = 4 for the phase picking problem? Because the temperature softmax
function is not used by previous works of phase picking, it is necessary to demonstrate that it can help the phase picking task.
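For readers unfamiliar with the function under discussion, a temperature softmax is commonly written as softmax(z / T): the inputs are divided by T before normalization, so T = 1 recovers the ordinary softmax and a larger T flattens the resulting distribution. The sketch below assumes this common form. Eq. 5 of the manuscript is not reproduced here, and the function name and input values are hypothetical.

```python
import math


def temperature_softmax(logits, temperature=1.0):
    """Softmax of `logits` divided by `temperature` (assumed common form)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical input values, used only to compare T = 1 with T = 4.
logits = [2.0, 1.0, 0.1]
probs_t1 = temperature_softmax(logits, temperature=1.0)
probs_t4 = temperature_softmax(logits, temperature=4.0)

print([round(p, 3) for p in probs_t1])
print([round(p, 3) for p in probs_t4])
```

With T = 4 the ranking of the entries is unchanged, but the largest weight is smaller than with T = 1, which is the flattening effect the comparison between the two temperatures is meant to probe.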
Response: We concur with this observation of the reviewer. We have summarized the outcomes of utilizing different
temperatures for phase picking in the Appendix. The relevant table is presented below. You can find this table on page 22 of the
manuscript.
4. L230: "we can observe that EPick achieves the best performance in phase picking over DynaPicker by using different
window sizes." Does this mean the claimed advantage of DynaPicker for different input length is not true? Although you
explain that the reason is that EPick is pretrained using the STEAD dataset, you can also train DynaPicker using the STEAD
dataset to make the comparison more accurate.
Response: In response to the reviewer’s recommendation, we proceeded to retrain the EPick model using the same dataset
sourced from the STEAD data, which serves as the training data for DynaPicker. Subsequently, we employed the retrained
EPick model to estimate phase arrival times for continuous data extracted from the STEAD dataset. It is important to highlight
that there is no overlap between the datasets used for EPick training and those utilized for phase arrival time detection. We
observed that, while the performance in detecting the P phase was similar, the accuracy of S-phase picking degraded: the
mean residual rose from 0.002s to 0.050s, and the standard deviation increased from 0.122s to 0.147s. Additionally, the EPick model
was developed for estimating seismic phase arrival times with a fixed input length. In contrast, the DynaPicker model
was primarily designed for phase classification, and its adaptation to phase arrival time detection with different input lengths
is a notable application. Furthermore, as the size of the training data increased, DynaPicker exhibited improved performance
and demonstrated greater robustness when compared to EPick.
Changes in manuscript: Texts updated to help readers understand the process on pages 11 and 12.
5. L240: "The testing accuracy of DynaPicker is 98.82%, which is slightly greater than CapsPhase [30] (98.66%) and 1D
ResNet [9] (98.66%)." Because the differences are very small and do not tell readers much information, could you compare
the waveforms of false predictions of these models to help understand where DynaPicker can be better?
Response: Following the reviewer’s suggestion, we plot several waveforms that the other models misclassify but that are
correctly identified by DynaPicker.
Figure 2. Visualization of trace examples.
From these figures, we can observe that, compared with the other models, DynaPicker shows an advantage in phase classification.
In scenarios where the ground truth label is noise and the seismic waveform exhibits increased noise levels, DynaPicker
accurately identifies it as noise, whereas the GPD and ResNet models tend to misclassify it. Unfortunately, as mentioned
before, the CapsPhase model retrieved from the git repository cannot be utilized at this time.

AC2: 'Reply on RC1', Nishtha Srivastava, 03 Nov 2023

RC2: 'Comment on egusphere-2023-1391', Anonymous Referee #2, 11 Sep 2023
The manuscript proposes a new deep-learning picker that leverages dynamic convolutional neural networks for detecting and picking seismic phases from windowed or continuous waveform data. The authors then combined the previously published CREIME model for magnitude estimations of waveform windows that have high P-wave probabilities. The authors have evaluated the performance of their picker and their combined workflow on open-source seismic datasets and aftershocks following the Turkey earthquake. The technical part of the manuscript is overall solid. However, I have vital concerns about the ‘real-time’ claim. It seems to me that the authors have confused the concept of processing continuous data with the concept of real-time earthquake monitoring. I suggest the authors modify their claim from ‘Real-time’ to ‘efficient’ and emphasize more on the performance of the proposed deep-learning picker. Aside from the ‘real-time’ claim, the study seems good overall. Below are my detailed comments:
– How is the term ‘Real-time’ defined? What is the time cost between the time of data recorded at the seismometer and the time of output produced? Please note that there are several important steps for real-time earthquake monitoring besides the time cost of the phase-picking model. For example, how is the time cost of the data transmitted from the seismometer to the data center? Is the data processed at the seismometer end with edge-computing (which would be important in areas with poor internet access), or is the data transmitted to the data center first and processed later there? The data packages in the real-time seismic data flow can contain errors due to transmission issues. How is that addressed?
– What is the inference time cost of the model? What is the key advantage of the proposed method over conventional and lightweight convolutional deep-learning pickers in terms of real-time monitoring? The authors claim, "However, most of the prevalent CNN-based models perform inference using static convolution kernels, which may limit their representation power, efficiency, and ability for interpretation.” However, to my acknowledgment, the current CNN-based models, especially lightweight ones, are sufficient for millisecond-level inference. One key claim of the manuscript is that the proposed method is much faster and, therefore, more suitable for real-time earthquake monitoring. However, I didn’t find any quantitative comparisons on the inference speed in this paper.
– The event's location is one key information in earthquake monitoring and yet not resolved by the current workflow. The lack of event location information would decrease the significance of the proposed monitoring method.
– Why is being adaptive to different input lengths important? Is that because in the real-time earthquake monitoring scenario that the authors are dealing with, the input lengths of data chunks can be significantly different? And what are the advantages of the proposed method over the RNN-based pickers, which can also adapt to different input lengths?
– Section 5.5 ‘Real-time earthquake detection’, how is the ‘real-time’ here different from ‘continuous data’? Section 6.2 ‘the live data of the Turkey earthquake’, what does the ‘live data’ mean, do authors have access to the real-time data packages from the Earthquake Data Center System of Turkey, or do they use the downloaded continuous waveform data?
Citation: https://doi.org/10.5194/egusphere-2023-1391-RC2
AC1: 'Reply on RC2', Nishtha Srivastava, 27 Oct 2023
Response to Reviewers
Dear Reviewer 2,
We appreciate the time and effort that you have dedicated and are grateful for your insightful comments, which have improved
the manuscript. We have incorporated the constructive suggestions and have highlighted the changes within the manuscript
in blue. Here is a point-by-point response to your comments and concerns.
Reviewer 2
The manuscript proposes a new deep-learning picker that leverages dynamic convolutional neural networks for detecting and
picking seismic phases from windowed or continuous waveform data. The authors then combined the previously published CREIME model for magnitude estimations of waveform windows that have high P-wave probabilities. The authors have evaluated the performance of their picker and their combined workflow on open-source seismic datasets and aftershocks following
the Turkey earthquake. The technical part of the manuscript is overall solid. However, I have vital concerns about the ‘real-time’ claim. It seems to me that the authors have confused the concept of processing continuous data with the concept of real-time earthquake monitoring. I suggest the authors modify their claim from ‘Real-time’ to ‘efficient’ and emphasize more on the
performance of the proposed deep-learning picker. Aside from the ‘real-time’ claim, the study seems good overall. Below are my detailed comments:
Response: We appreciate the reviewer’s suggestion to modify our claim from ‘real-time’ to ‘efficient’. We understand that
‘real-time’ can have different interpretations, and we agree that emphasizing the efficiency of our proposed deep-learning
picker is essential. We have made this adjustment in the revised manuscript, avoiding the strict interpretation of ‘real-time’.
1. How is the term ‘Real-time’ defined? What is the time cost between the time of data recorded at the seismometer and the
time of output produced? Please note that there are several important steps for real-time earthquake monitoring besides the
time cost of the phase-picking model. For example, how is the time cost of the data transmitted from the seismometer to the
data center? Is the data processed at the seismometer end with edge-computing (which would be important in areas with poor
internet access), or is the data transmitted to the data center first and processed later there? The data packages in the real-time
seismic data flow can contain errors due to transmission issues. How is that addressed?
Response: We acknowledge that using the term ‘real-time’ to describe our model’s performance was a misnomer. Our model does
not operate in real time, and therefore, we did not perform any analysis on the time cost of real-time data flow. We apologize
for any confusion caused by the incorrect terminology and thank the reviewer for pointing this out. Instead, our proposed model
provides timely results from continuous waveform recordings. We revised the manuscript to accurately reflect this and ensure
that our terminology aligns with the actual capabilities of our model.
2. What is the inference time cost of the model? What is the key advantage of the proposed method over conventional
and lightweight convolutional deep-learning pickers in terms of real-time monitoring? The authors claim, "However, most
of the prevalent CNN-based models perform inference using static convolution kernels, which may limit their representation
power, efficiency, and ability for interpretation.” However, to my acknowledgment, the current CNN-based models, especially
lightweight ones, are sufficient for millisecond-level inference. One key claim of the manuscript is that the proposed method is
much faster and, therefore, more suitable for real-time earthquake monitoring. However, I didn’t find any quantitative
comparisons on the inference speed in this paper.
Response: In the introduction section of the manuscript, we claim that “However, most of the prevalent CNN-based models
perform inference using static convolution kernels, which may limit their representation power, efficiency, and ability for
interpretation.” To clarify, our primary focus in this work is the use of dynamic networks to improve seismic phase
classification and, consequently, to reduce the errors in the phase arrival-time estimation. As
a result, we have not quantified the inference speed in this paper. However, following the reviewer’s
suggestion, we will explore such a comparison in our follow-up work.
3. The event’s location is one key information in earthquake monitoring and yet not resolved by the current workflow. The
lack of event location information would decrease the significance of the proposed monitoring method.
Response: Event location is indeed a crucial component of seismic analysis, but its inclusion in our current model was
beyond the scope of this work. We understand the importance of this aspect and recognize it as an essential feature for a
comprehensive seismic monitoring system. At the same time, an accurate phase arrival time is crucial for a correct event location
determination. We plan to address event location and upgrade the current model in future research.
4. Why is being adaptive to different input lengths important? Is that because in the real-time earthquake monitoring scenario
that the authors are dealing with, the input lengths of data chunks can be significantly different? And what are the advantages
of the proposed method over the RNN-based pickers, which can also adapt to different input lengths?
Response:
– The proposed method is adaptive to different input lengths, as it can accommodate the continuous data of different
durations. We have edited the text in line 40 to highlight this.
– We are currently working on an RNN-based model for event detection, which will be a follow-up of this work.
5. Section 5.5 ‘Real-time earthquake detection’, how is the ‘real-time’ here different from ‘continuous data’? Section 6.2
‘the live data of the Turkey earthquake’, what does the ‘live data’ mean, do authors have access to the real-time data packages
from the Earthquake Data Center System of Turkey, or do they use the downloaded continuous waveform data?
Response: We incorrectly used the term ‘real-time’ when referring to our data source. We apologize again for the
inconvenience. To accurately describe our data source, we now use the term ‘continuous seismic recordings’ throughout the revised
manuscript. This term better reflects the downloaded continuous waveform data that we are utilizing.
Wei Li
Megha Chakraborty
Jonas Köhler
Claudia Quinteros Cartaya
Georg Rümpker
Nishtha Srivastava
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.