the Creative Commons Attribution 4.0 License.
Hazard assessment modeling and software development of earthquake-triggered landslides in the Sichuan-Yunnan area, China
Xaioyi Shao
Siyuan Ma
Abstract. To enhance the timeliness and accuracy of spatial prediction of coseismic landslides, we propose an improved three-stage spatial prediction strategy and develop a corresponding hazard assessment software named Mat.LShazard V1.0. Using this software, we evaluate the applicability of the improved strategy to six earthquake events that occurred near the Sichuan-Yunnan region: the Wenchuan, Ludian, Lushan, Jiuzhaigou, Minxian and Yushu earthquakes. The results indicate that in the first stage (within half an hour of the earthquake), the AUC values of the model exceed 0.8 for five of the six events, the exception being the 2013 Minxian earthquake. The Wenchuan earthquake has the highest AUC value, 0.947. The first-stage predictions can meet the requirements of emergency rescue by immediately providing an overall picture of the likely coseismic landslide locations in the quake-affected area. In the second and third stages (within 12 hours of the quake and after 3 days, respectively), the prediction ability of the model gradually improves as the quality of the landslide data improves. With the entire landslide database, the AUC values of all six events exceed 0.9, indicating very high prediction accuracy. In both the second and third stages, the predicted landslide area (Ap) is in good agreement with the observed landslide area (Ao). However, when the model is built on incomplete landslide data from the meizoseismal area, Ap is much smaller than Ao; when it is built on complete landslide data, Ap is nearly identical to Ao. This study provides a new application tool for coseismic landslide disaster prevention and mitigation in the different stages of emergency rescue, temporary resettlement, and late reconstruction after a major earthquake.

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Interactive discussion
Status: closed

CEC1: 'Comment on egusphere2022772', Astrid Kerkweg, 07 Oct 2022
Dear authors,
in my role as Executive editor of GMD, I would like to bring to your attention our Editorial version 1.2: https://www.geoscimodeldev.net/12/2215/2019/
This highlights some requirements of papers published in GMD, which is also available on the GMD website in the ‘Manuscript Types’ section: http://www.geoscientificmodeldevelopment.net/submission/manuscript_types.html
In particular, please note that for your paper, the following requirement has not been met in the Discussions paper:
 "The main paper must give the model name and version number (or other unique identifier) in the title."
Please add the name and version number of Mat.LShazard V1.0 to the title of your article upon revision of the manuscript.
Yours,
Astrid Kerkweg
Citation: https://doi.org/10.5194/egusphere2022772CEC1 
AC1: 'Reply on CEC1', Chong Xu, 08 Oct 2022
Dear editor,
Thanks for your comments. We will add the name and version number of Mat.LShazard V1.0 to the title, as you suggest. Once the peer-review comments are posted, we will make this change in the revised manuscript.
Best regards.
Citation: https://doi.org/10.5194/egusphere2022772AC1

RC1: 'Comment on egusphere2022772', Ali P. Yunus, 30 Nov 2022
"Hazard assessment modeling and software development of earthquake-triggered landslides in the Sichuan-Yunnan area, China" by Shao et al. presents a rapid landslide mapping tool, Mat.LShazard, based on a logistic regression model, which the authors successfully applied to six earthquake-affected sites in the Sichuan-Yunnan region. The manuscript is well written and the toolbox may have wide applicability in future hazard scenarios. However, there are some concerns that need to be addressed.
Firstly, the definition of stage 1, stage 2, and stage 3 in this paper is subjective. Whether remote sensing images can be obtained within 12 hours after the quake, and detailed images 3 days after the event, depends on many factors. It is therefore better to define them as stage 2 and stage 3 only. They may be termed: stage 1 = immediately after the event, stage 2 = hours to a few days (e.g., Planet), and stage 3 = a few days to weeks (e.g., Planet, Sentinel-2, Landsat 8/9).
Description of stage 1: The authors wrote that more detailed theory and calculation procedures can be found in Xu et al. (2019). Xu et al. (2019) is written in Chinese, so describing the stage 1 procedure in more detail is important for global readers. What are the inputs in stage 1?
Difference between stage 2 and stage 3: As far as I understand, the difference between these two stages is the incorporation of more accurate training samples of landslides. Is that so? I believe the conditioning factors remain the same. This has to be explained clearly.
How is the Mat.LShazard model different from the USGS models (Godt 2008 and Nowicki Jessee et al. 2018)?
Line 387: How is this random selection achieved? It is not clear whether, in stage 2, the study used all 50 combinations to obtain the mean probability distribution for the final map. If so, the accuracy is obviously close to that of stage 3. Instead, the study could have used 6 (or X) random combinations and their mean to get the probability distribution map. We would naturally expect high accuracy in the third stage, as all the landslides are used in training.
Since stage 3 involves mapping all landslides, the applicability of the model to other study areas is limited. I would like to see how this model works for a validation site.
Line 446: "We chose 4 independent variable…". Why 4? What about the remaining 9?
What threshold is used for calculating Ap in this study?
Minor comments
Lines 28-29 are confusing; rephrase them for better readability.
Fig. 5: Is this the result of stage 1? If so, mention it in the caption.
Same for Fig. 8: Is this the result of stage 2 or 3?
Apart from the graphs, there could also be a table for accuracy matrices.
Citation: https://doi.org/10.5194/egusphere2022772RC1 
AC2: 'Reply on RC1', Chong Xu, 27 Jan 2023
Review comments and responses
Reviewer 1
Overview
"Hazard assessment modeling and software development of earthquake-triggered landslides in the Sichuan-Yunnan area, China" by Shao et al. presents a rapid landslide mapping tool, Mat.LShazard, based on a logistic regression model, which the authors successfully applied to six earthquake-affected sites in the Sichuan-Yunnan region. The manuscript is well written and the toolbox may have wide applicability in future hazard scenarios. However, there are some concerns that need to be addressed.
Comments
1. Firstly, the definition of stage 1, stage 2, and stage 3 in this paper is subjective. Whether remote sensing images can be obtained within 12 hours after the quake, and detailed images 3 days after the event, depends on many factors. It is therefore better to define them as stage 2 and stage 3 only. They may be termed: stage 1 = immediately after the event, stage 2 = hours to a few days (e.g., Planet), and stage 3 = a few days to weeks (e.g., Planet, Sentinel-2, Landsat 8/9).
Authors’ response: Yes, we have redefined the second and third stages, as you suggested.
2. Description of stage 1: The authors wrote that more detailed theory and calculation procedures can be found in Xu et al. (2019). Xu et al. (2019) is written in Chinese, so describing the stage 1 procedure in more detail is important for global readers. What are the inputs in stage 1?
Authors’ response: Yes, we added the details of the independent variables of the Xu_{2019} model to the text (lines 265-270). We also added a detailed description of the Xu_{2019} model to the supplementary materials, including the selection of earthquake-induced landslide inventories, the input influencing factors, and the calculation procedures of the model.
3. Difference between stage 2 and stage 3: As far as I understand, the difference between these two stages is the incorporation of more accurate training samples of landslides. Is that so? I believe the conditioning factors remain the same. This has to be explained clearly.
Authors’ response: Yes. For the second and third stages, we chose the same influencing factors as in the first stage, so that we can easily compare how the regression coefficients of the influencing factors change between stages and thus explain the relationship between each influencing factor and landslide occurrence. Relevant explanations have been added in Section 3.2.1 (lines 277-281).
4. How is the Mat.LShazard model different from the USGS models (Godt 2008 and Nowicki Jessee et al. 2018)?
Authors’ response: The Godt 2008 and Nowicki Jessee et al. (2018) models are near-real-time assessment models of coseismic landslides. The Godt 2008 model adopts the physically based Newmark displacement method and has been widely used around the world for quickly assessing earthquake-induced landslides. However, emergency hazard assessments based on the Newmark model require multiple parameters, including terrain, geotechnical mechanics, groundwater and ground motion. There are numerous uncertainties both in these parameters themselves and in the process of obtaining them (Bojadjieva et al., 2018; Wang et al., 2015). To obtain a more precise predicted displacement, the Newmark method needs accurate and complete information on the physical and mechanical properties of rocks and on ground motion parameters (Dreyfus et al., 2013), which is often challenging to acquire. Therefore, the accuracy of regional predictions based on the Newmark model is relatively low and cannot currently meet the needs of emergency assessment. Based on 23 global landslide inventories, Nowicki Jessee et al. (2018) established a new global landslide evaluation model using a data-driven method. However, that model depends on the seismic landslide samples used to train it, and its applicability and accuracy in the Sichuan-Yunnan region are reduced.
Mat.LShazard is a tool for earthquake-induced landslide hazard assessment that can be applied to three different scenarios in the Sichuan-Yunnan region. The model used in the first stage is a near-real-time assessment model (the Xu_{2019} model) based on 9 seismic landslide databases near the Sichuan-Yunnan region, and it shows better applicability in this region than the Nowicki Jessee et al. (2018) model. In addition, the Nowicki Jessee et al. (2018) model trains the LR model with equal numbers of sliding and non-sliding samples. This sampling method artificially exaggerates the proportion of sliding samples in the study area, so the assessment results reflect only the relative hazard level and do not represent the real occurrence probability of landslides. The Xu_{2019} model therefore combines the Bayesian probability method with the LR model to establish a new generation of coseismic landslide hazard model, which gives the landslide occurrence probability instead of a relative hazard level. We have also added the corresponding programs for the second and third stages, so that Mat.LShazard can meet the various needs of disaster prevention and reduction in the different stages after a major earthquake.
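The sampling argument in this response can be illustrated with a small sketch. This is not the Xu_{2019} implementation, whose formulas are not reproduced in this reply; it is the standard Bayes-rule prior correction for a classifier trained on an artificially balanced sample, swapped in here purely for illustration. The function name `correct_for_sampling` and the 1% landslide fraction are assumptions.

```python
def correct_for_sampling(p_model, sample_rate, true_rate):
    """Rescale a model probability to the true landslide prevalence.

    p_model     : probability from a model trained on a resampled data set
    sample_rate : fraction of landslide samples in the training set
    true_rate   : assumed fraction of landslide cells in the study area
    """
    if not (0.0 < sample_rate < 1.0 and 0.0 < true_rate < 1.0):
        raise ValueError("rates must lie strictly between 0 and 1")
    if not 0.0 <= p_model <= 1.0:
        raise ValueError("p_model must be a probability")
    # Bayes' rule: reweight each class by (true prior / training prior).
    w_slide = true_rate / sample_rate
    w_stable = (1.0 - true_rate) / (1.0 - sample_rate)
    numerator = p_model * w_slide
    return numerator / (numerator + (1.0 - p_model) * w_stable)


# Hypothetical case: 1:1 training sample, but only 1% of cells are landslides.
p_corrected = correct_for_sampling(0.5, sample_rate=0.5, true_rate=0.01)
print(round(p_corrected, 4))
```

Under these assumed numbers, an output of 0.5 from the balanced model corresponds to a probability of about 0.01 once the true prevalence is taken into account, which is the kind of overestimation the response describes.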
Bojadjieva, J., Sheshov, V., Christophe, B., 2018. Hazard and risk assessment of earthquake-induced landslides - case study. Landslides, 15, 161-171.
Wang, T., Wu, S., Shi, J., Xin, P., 2015. Concepts and mechanical assessment method for seismic landslide hazard: A review. Journal of Engineering Geology, 23, 93-104.
Dreyfus, D.K., Rathje, E.M., Jibson, R.W., 2013. The influence of different simplified sliding-block models and input parameters on regional predictions of seismic landslides triggered by the Northridge earthquake. Engineering Geology, 163, 41-54.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
Xu, C., Xu, X., Zhou, B., Shen, L., 2019. Probability of coseismic landslides: A new generation of earthquake-triggered landslide hazard model. Journal of Engineering Geology, 27, 1122.
5. Line 387: How is this random selection achieved? It is not clear whether, in stage 2, the study used all 50 combinations to obtain the mean probability distribution for the final map. If so, the accuracy is obviously close to that of stage 3. Instead, the study could have used 6 (or X) random combinations and their mean to get the probability distribution map. We would naturally expect high accuracy in the third stage, as all the landslides are used in training.
Authors’ response: Yes. To avoid sampling randomness, we selected 70% of all samples at random and repeated this independently 50 times to construct the LR model from the partial landslide data available in the meizoseismal area. The 50 separate experiments yielded 50 modelling results. Figs. 7 and 8 show the mean probability predictions of the 50 models in the second and third stages, respectively, for the 6 earthquake events. For the third stage, we used the same random sampling method, each time extracting 70% of all samples (complete landslide data for the entire earthquake area) to generate 50 model results. The only difference is the set of samples used for model training in the second and third stages. Because the training samples used in the third stage are the complete landslide data of the entire quake region, the accuracy of the evaluation results in the third stage is higher. Relevant descriptions have been added in Section 4.2 (lines 410-413 and 429-434).
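The repeated-sampling procedure described in this response could be sketched as follows. This is a minimal pure-Python illustration, not the Mat.LShazard code (which is written in MATLAB): the gradient-descent fit, the single synthetic predictor, and all function names are assumptions made for the example.

```python
import math
import random


def predict(weights, x):
    """Logistic-regression probability for one sample (weights[0] is the intercept)."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    z = max(-35.0, min(35.0, z))  # clip to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.5, epochs=100):
    """Fit logistic regression by plain batch gradient descent."""
    n, d = len(X), len(X[0])
    weights = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(weights, xi) - yi
            grad[0] += err
            for k in range(d):
                grad[k + 1] += err * xi[k]
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


def ensemble_mean_probability(X, y, n_runs=50, train_fraction=0.7, seed=0):
    """Train n_runs models on random 70% subsets and average their predictions."""
    rng = random.Random(seed)
    n = len(X)
    n_train = round(train_fraction * n)
    sums = [0.0] * n
    for _ in range(n_runs):
        train_idx = rng.sample(range(n), n_train)
        weights = fit_logistic([X[i] for i in train_idx],
                               [y[i] for i in train_idx])
        for i in range(n):
            sums[i] += predict(weights, X[i])
    return [s / n_runs for s in sums]


# Synthetic demo: one hypothetical predictor scaled to [0, 1]; cells with
# larger values are labelled as landslides (1), the rest as stable (0).
X = [[i / 99.0] for i in range(100)]
y = [1 if i >= 50 else 0 for i in range(100)]
mean_p = ensemble_mean_probability(X, y)
print(round(mean_p[0], 3), round(mean_p[-1], 3))
```

Averaging over the 50 fitted models, rather than keeping a single one, is what removes the dependence on any particular random 70% draw; in this sketch the second and third stages would differ only in the `X` and `y` passed in.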
6. Since stage 3 involves mapping all landslides, the applicability of the model to other study areas is limited. I would like to see how this model works for a validation site.
Authors’ response: Yes. In this study, we randomly selected 70% of the total samples for model training and used the remaining 30% for model validation; this step was repeated 50 times. In the third stage, we assume that coseismic landslide data for the whole quake area have been obtained from pre- and post-quake images, and the earthquake-induced landslide hazard assessment is then carried out on the complete landslide inventory. The assessment result can support the identification of potential high-hazard landslide areas as well as post-disaster restoration and infrastructure reconstruction in the earthquake-affected area. However, since the model is trained on coseismic landslide data from this region, it is theoretically applicable only to this region.
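For the held-out 30% mentioned in this response, a score such as the AUC reported in the abstract can be computed directly from predicted probabilities and observed labels. The sketch below uses the rank-based (Mann-Whitney) definition of AUC; how the authors compute AUC is not stated in this reply, and the function name `auc` and the example values are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores : predicted landslide probabilities for the validation samples
    labels : 1 for observed landslide samples, 0 for non-landslide samples
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # landslide sample ranked above stable sample
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical validation subset: two landslide and two stable samples.
print(auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a sort-based rank sum gives the same value for large validation sets.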
7. Line 446: "We chose 4 independent variable…". Why 4? What about the remaining 9?
Authors’ response: Yes. Due to limited space, the text shows only the four independent variables that have the most obvious impact on landslide occurrence. In the supplementary materials, we have added the regression coefficients of all continuous variables (Fig. 1s).
8. What threshold is used for calculating Ap in this study?
Authors’ response: In this study, building on previous work, we explored a new sampling method (Shao et al., 2020). With this method, the prediction results represent the real landslide probability. In other words, the resulting probability corresponds to spatial extent (e.g., areas labeled 5% probability of landsliding contain about 5% landslides by area). Therefore, the probability value of each grid cell multiplied by the cell area represents the predicted landslide area in that cell, and the predicted landslide area of the study area is obtained by summing over all cells (Allstadt et al., 2018). For all model outputs, we computed the predicted landslide area (Ap) as a metric that summarizes the total hazard estimated by a given model for a given earthquake with a single number. The predicted landslide area (Ap) was computed by
$$A_p = \sum_{i=1}^{m} \sum_{j=1}^{n} P_{i,j} \, A \qquad (5)$$
where $P_{i,j}$ is the probability of a landslide at pixel i, j; m is the number of rows; n is the number of columns; A is the pixel/cell area (constant).
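The summation described in this response (cell probability times cell area, added up over all rows and columns) can be sketched in a few lines. The grid values, the 30 m cell size, and the function name `predicted_landslide_area` are assumptions for illustration; `None` is used here as a hypothetical no-data marker.

```python
def predicted_landslide_area(prob_grid, cell_area):
    """Sum probability * cell area over every valid cell of a probability map."""
    total = 0.0
    for row in prob_grid:
        for p in row:
            if p is None:          # skip no-data cells
                continue
            if not 0.0 <= p <= 1.0:
                raise ValueError("cell values must be probabilities")
            total += p * cell_area
    return total


# Hypothetical 2 x 2 probability map with 30 m x 30 m cells (900 m^2 each).
grid = [[0.05, 0.10],
        [0.00, 0.25]]
ap = predicted_landslide_area(grid, cell_area=900.0)
print(round(ap, 6))  # 360.0 square metres
```

Because the result is a plain sum of probabilities, no classification threshold enters the calculation, which is the point the response makes in answer to the referee's question.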
Allstadt, K.E. et al., 2018. Improving Near-Real-Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Shao, X., Ma, S., Xu, C., Zhou, Q., 2020. Effects of sampling intensity and non-slide/slide sample ratio on the occurrence probability of coseismic landslides. Geomorphology, 363, 107222.
Minor comments
9. Lines 28-29 are confusing; rephrase them for better readability.
Authors’ response: Yes. We have rewritten this sentence.
10. Fig. 5: Is this the result of stage 1? If so, mention it in the caption.
Authors’ response: Yes. We have revised it.
11. Same for Fig. 8: Is this the result of stage 2 or 3?
Authors’ response: Yes. We have revised it.
12. Apart from the graphs, there could also be a table for accuracy matrices.
Authors’ response: Yes. We have added the corresponding tables of accuracy matrices to the supplementary materials.


RC2: 'Comment on egusphere2022772', Anonymous Referee #2, 19 Dec 2022
I have read the manuscript with interest. The paper develops software for susceptibility assessment of coseismic landslides considering three stages after an earthquake. The paper is well written, but I have some major concerns regarding the innovation and the method used in the paper.
I agree with the authors that rapid assessment of coseismic landslides is crucial to emergency response after strong earthquakes. The paper combines logistic regression and Bayesian probability methods in MATLAB to assess the spatial probability of landslides. Although the authors emphasize that there is no specialized software for seismic landslide hazard assessment, particularly for the various needs of the different stages after a major earthquake, the methods they use are traditional ones, and there is nothing new about the methodology itself. There are many existing toolboxes or packages in ArcGIS, QGIS and R that can be used for the same analysis as the authors did in MATLAB. In addition, the three-stage methodology is simply a classification by time window after an earthquake. The methods at all stages are the same. The only difference is that more and more landslide data are added after an earthquake as more information, such as remote sensing images, becomes available with time. At the third stage, if we already know the full coseismic landslide distribution from RS imagery interpretation, why do we still need the susceptibility model? Even if you get a high R2 at this stage, it is because of overfitting of the model. It does not mean the model will have good predictive power for the next event.
Major comments:
Why did the authors not try CNN or other more advanced AI methods, which should perform better than logistic regression and Bayesian probability methods?
PGA and PGV are considered the most important seismic factors, so why did the authors use intensity rather than PGA and PGV data? Besides, distance to rivers and distance to transportation lines are also important factors, considering river incision in mountainous regions and the effect of human activity. Why are they not considered in the model?
It is quite obvious from Fig. 4 that the actual landslides (black polygons) do not fall in the high-probability zones. The model does not seem satisfactory for the first stage. Many landslides in all events fall into the blue (low-probability) zones, while the predicted high-probability zones contain few landslides. This indicates that the model has a rather high false-alarm rate from a prediction perspective. In the second stage (Fig. 6), the mismatch problem remains. In the third stage, it looks better, but as I mentioned above, this is because of overfitting of the model through the use of a large number of known landslides. In fact, the first stage, i.e., rapid prediction using very limited or even no landslide information, is the most important one for emergency response and rescue work. The model’s performance at this stage is not good.
Citation: https://doi.org/10.5194/egusphere2022772RC2 
AC3: 'Reply on RC2', Chong Xu, 27 Jan 2023
Review comments and responses
Reviewer 2
Overview
I have read the manuscript with interest. The paper develops software for hazard assessment of coseismic landslides considering three stages after an earthquake. The paper is well written, but I have some concerns regarding the innovation and the method used in the paper.
Comments
I agree with the authors that rapid assessment of coseismic landslides is crucial to emergency response after strong earthquakes. The paper combines logistic regression and Bayesian probability methods in MATLAB to assess the spatial probability of landslides. Although the authors emphasize that there is no specialized software for seismic landslide hazard assessment, particularly for the various needs of the different stages after a major earthquake, the methods they use are traditional ones, and there is nothing new about the methodology itself. There are many existing toolboxes or packages in ArcGIS, QGIS and R that can be used for the same analysis as the authors did in MATLAB.
Authors’ response: Yes. Currently, there are many toolboxes or packages in different programming languages for assessing the spatial probability of landslides. However, these toolboxes follow the traditional hazard assessment process, which requires landslide samples for modelling. As a result, the prediction results often lag behind the actual need and cannot support the emergency assessment of earthquake-induced landslides (Ma et al., 2020). To solve this problem, we integrated a new generation of earthquake-triggered landslide hazard model (the Xu_{2019} model), so that our software can serve the emergency assessment of earthquake-induced landslides in the Sichuan-Yunnan area. Secondly, most studies on data-driven models have used equal numbers of sliding and non-sliding samples. Such a sampling method artificially exaggerates the proportion of sliding samples in the study area (Allstadt et al., 2018; Nowicki Jessee et al., 2018); thus, the assessment results reflect only the relative hazard level and do not represent the real occurrence probability of landslides. Consequently, the probability output by the model overestimates the actual landslide occurrence probability (Shao et al., 2020; Nowicki Jessee et al., 2018). We proposed a real-probability prediction method for coseismic landslides based on the Bayesian probability method and the LR model (Shao et al., 2020). The results of this model represent the occurrence probability of landslides rather than the relative hazard level (Shao et al., 2020) and can therefore be used to calculate the landslide area in the quake-affected area. Thirdly, to our knowledge, although there are many existing toolboxes or packages in ArcGIS, QGIS and R, there is no specific software for regional landslide hazard assessment written in MATLAB, so our work will also help those familiar with MATLAB to carry out earthquake-induced landslide assessment.
Allstadt, K.E. et al., 2018. Improving Near-Real-Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Ma, S., Xu, C., Shao, X., 2020. Spatial prediction strategy for landslides triggered by large earthquakes oriented to emergency response, mid-term resettlement and later reconstruction. International Journal of Disaster Risk Reduction, 43, 101362.
Shao, X., Ma, S., Xu, C., Zhou, Q., 2020. Effects of sampling intensity and non-slide/slide sample ratio on the occurrence probability of coseismic landslides. Geomorphology, 363, 107222.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
In addition, the three-stage methodology is simply a classification by time window after an earthquake. The methods at all stages are the same. The only difference is that more and more landslide data are added after an earthquake as more information, such as remote sensing images, becomes available with time. At the third stage, if we already know the full coseismic landslide distribution from RS imagery interpretation, why do we still need the hazard model? Even if you get a high R2 at this stage, it is because of overfitting of the model. It does not mean the model will have good predictive power for the next event.
Authors’ response: The demands of earthquake disaster mitigation differ between the stages after a large earthquake (Ma et al., 2020). Especially in mountainous areas, the spatial prediction of landslides is of great significance to short-term emergency response (stage 1), medium-term temporary resettlement (stage 2) and long-term rehabilitation and reconstruction (stage 3). In stage 1, when no landslide data are available, rapid emergency hazard mapping with a near-real-time model can guide post-disaster emergency rescue as well as the interpretation and identification of earthquake-induced landslides in stage 2. In stage 2, for the sake of timeliness, the earthquake-induced landslide hazard assessment is carried out on the partially available landslide data. The assessment results help to improve the earthquake-induced landslide inventory under construction and provide useful information for avoiding high landslide hazard areas in the quake-affected region.
In stage 3, we face not only the problem of identifying coseismic landslides, but also slopes weakened by the quake. It is therefore critical to locate the slopes that remained stable during the earthquake but may become unstable for a period of time afterwards. This can be achieved by a hazard assessment of earthquake-induced landslides based on the complete landslide data. The results obtained in stage 3 will be more objective than those obtained in stage 2, because the training samples used at this stage are more abundant and more objective. That is to say, the prediction ability of the model in stage 3 is stronger than in stage 2 (the prediction rate in the third stage is higher). The evaluation results at this stage can also effectively serve town planning and long-term risk assessment in the quake-affected areas. To summarize, we suggest performing earthquake-induced landslide hazard assessment at multiple stages after a large earthquake, in order to better deal with the landslide disaster prevention and mitigation issues that earthquake areas face at each stage. The relevant descriptions have been added in Section 3.2.1 (lines 288-296).
Ma, S., Xu, C., Shao, X., 2020. Spatial prediction strategy for landslides triggered by large earthquakes oriented to emergency response, mid-term resettlement and later reconstruction. International Journal of Disaster Risk Reduction, 43, 101362.
Major comments:
Why did the authors not try CNN or other more advanced AI methods, which should perform better than logistic regression and Bayesian probability methods?
Authors’ response: Currently, the majority of models employ one of several possible classification methods, including classical statistics (e.g., logistic regression, discriminant analysis, linear regression), index-based methods (e.g., weight of evidence, heuristic analysis), machine learning (e.g., support vector machines, random forest) and neural networks (e.g., recurrent neural networks, convolutional neural networks) (Reichenbach et al., 2018). Among them, the LR model is one of the most widely used models in the hazard assessment of earthquake-induced landslides by virtue of its simplicity, high efficiency, and high prediction accuracy (Reichenbach et al., 2018; Shao and Xu, 2022) (Fig. 1). In addition, it is the preferred method for establishing near-real-time prediction models of earthquake-induced landslides (Nowicki Jessee et al., 2018; Tanyas et al., 2019; Xu et al., 2019).
In recent years, deep learning methods, especially convolutional neural networks (CNN), have been widely applied in landslide hazard assessment. As with other machine learning methods, the internal structure of a CNN model is complex, like a black box, and the independent variables need to be classified before the evaluation modelling (Yang et al., 2022). The LR model largely avoids these two problems. It can handle different types of independent variables, both continuous and discrete, and it gives specific regression coefficients for the independent variables, with a simple calculation process and definite physical meaning. Moreover, a recent study shows that the LR model performed better in prediction than other machine learning models (Zhao et al., 2022). The relevant description has been added in Section 3.2.2.
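For reference, the interpretability argument in this response rests on the standard form of the logistic regression model, written below with generic symbols that are not taken from the manuscript: $P$ is the landslide probability, $x_1, \dots, x_k$ are the independent variables, and $\beta_0, \dots, \beta_k$ are the regression coefficients.

```latex
P(y = 1 \mid x_1, \dots, x_k)
  = \frac{1}{1 + \exp\left[-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)\right]},
\qquad
\ln \frac{P}{1 - P} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i .
```

In this form, a one-unit increase in $x_i$ multiplies the odds $P/(1-P)$ by $e^{\beta_i}$, so the sign and magnitude of each coefficient can be read directly, which is what makes coefficients comparable across stages.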
Fig. 1. Horizontal bar chart showing the counts of the 19 model type classes used to group the 163 model names in the literature database of Reichenbach et al. (2018).
Reichenbach, P., Rossi, M., Malamud, B.D., Mihir, M., Guzzetti, F., 2018. A review of statistically-based landslide hazard models. Earth-Science Reviews, 180, 60-91.
Tanyas, H., Rossi, M., Alvioli, M., van Westen, C.J., Marchesini, I., 2019. A global slope unit-based method for the near real-time prediction of earthquake-induced landslides. Geomorphology, 327, 126-146.
Xu, C., Xu, X., Zhou, B., Shen, L., 2019. Probability of coseismic landslides: A new generation of earthquake-triggered landslide hazard model. Journal of Engineering Geology, 27, 1122.
Yang, Z., Xu, C., Shao, X., Ma, S., Li, L., 2022. Landslide hazard mapping based on CNN-3D algorithm with attention module embedded. Bulletin of Engineering Geology and the Environment, 81, 412.
Zhao, P., Masoumi, Z., Kalantari, M., Aflaki, M., Mansourian, A., 2022. A GIS-Based Landslide Hazard Mapping and Variable Importance Analysis Using Artificial Intelligent Training-Based Methods. Remote Sens., 14, 211.
PGA and PGV are considered the most important seismic factors, so why did the authors use intensity rather than PGA and PGV data? Besides, distance to rivers and distance to transportation lines are also important factors, considering river incision in mountainous regions and the effect of human activity. Why are they not considered in the model?
Authors’ response: We thank the reviewer for these suggestions. Indeed, as the reviewer said, PGA and PGV are the two most important seismic factors, and they can be converted into each other through specific empirical formulas (Boore et al., 2014; Saffari et al., 2012). Like PGA and PGV, seismic intensity is also one of the most important seismic factors, and there are specific formulas for converting between seismic intensity and PGA (Du et al., 2018; Xin et al., 2020). In our study, we chose seismic intensity instead of PGA because, in the official results released by the China Earthquake Administration, seismic intensity is obtained by integrating multi-source information such as instrument records, field surveys and actual earthquake damage. Compared with a PGA map obtained from instrument records alone or from an attenuation relationship, the seismic intensity map better reflects the distribution of the seismic influence field. Therefore, in all three stages, we carried out the probability prediction of earthquake-induced landslides based on the rapidly obtained seismic intensity.
Distance to rivers and distance to transportation lines were not selected because, among the influencing factors we did select, the topographic wetness index (TWI) and land-use type can represent regional hydrological factors and human factors, respectively, to a certain extent. Additionally, previous studies of the spatial distribution of earthquake-induced landslides in the Sichuan-Yunnan region indicate that these two influencing factors do not show a strong correlation with the occurrence of earthquake-induced landslides. Furthermore, although these two factors are not taken into account, the performance of the evaluation model is satisfactory. Finally, the model and software we developed are adaptable and do not place rigid limits on the input influencing factors. Readers who are interested in assessment models can add or change the corresponding independent variables during the modelling process.
Boore, D.M., Stewart, J.P., Seyhan, E., Atkinson, G.M., 2014. NGA-West2 Equations for Predicting PGA, PGV, and 5% Damped PSA for Shallow Crustal Earthquakes. Earthquake Spectra, 30 (3), 1057-1085.
Saffari, H., Kuwata, Y., Takada, S., Mahdavian, A., 2012. Updated PGA, PGV, and Spectral Acceleration Attenuation Relations for Iran. Earthquake Spectra, 28 (1), 257-276.
Du, K., Ding, B., Luo, H., Sun, J., 2018. Relationship between Peak Ground Acceleration, Peak Ground Velocity, and Macroseismic Intensity in Western China. Bulletin of the Seismological Society of America, 109, 284-297.
Xin, D., Daniell, J.E., Wenzel, F., 2020. Review of fragility analyses for major building types in China with new implications for intensity–PGA relation development. Nat. Hazards Earth Syst. Sci., 20, 643-672.
 It is quite obvious that from Fig.4 that the actual landslides (black polygons) are not falling in high probability zones? The model seems not satisfactory for the first stage. Many landslides in all events are failing into blue (low probability zones), while the predicted high probability zones have a few landslides. This indicates that the model has quite high false alarms from prediction perspective. In the second stage, Fig.6, it still has the mismatching problem. In the third stage, it looks better, but this is because as I mentioned above, the overfitting of model by using a large amount of known landslides. Actually the first stage, the rapid prediction using very limited or even no available landslide information, is most important one considering the emergency response and rescue work. The model’s performance at this stage is not good.
Authors’ response: In the first stage, the AUC values of all earthquakes except the Minxian earthquake are above 0.8. We nevertheless acknowledge that the evaluation results of the six earthquakes based on the Xu_{2019} model can be improved. Observed landslides match well with the predicted high probabilities, but the model also predicts potential landsliding over a large area beyond the mapped landslide area. The performance is unsatisfactory especially in the Minxian, Jiuzhaigou and Yushu cases. Most current near-real-time models share this problem: the model performs well when evaluated over the domain of an entire event area, but individual pixels clearly underestimate or overestimate the landslide hazard (Nowicki Jessee et al., 2018; Allstadt et al., 2018). We propose two possible reasons: (1) The input data of the Xu_{2019} model have a resolution of 100 m, which limits the prediction accuracy of the model to a certain extent, so the model prediction and the actual result may differ at the regional scale. (2) The nine earthquake cases used to establish the Xu_{2019} model are located in China and its adjacent areas. Their epicentral areas have different topographic and geological conditions, and only four cases are in the Sichuan-Yunnan area, which may weaken the applicability of the Xu_{2019} model to other earthquake events. Therefore, over the past few years, we have been continually supplementing the earthquake landslide database for the Sichuan-Yunnan region (e.g. 2014 Ms Jinggu earthquake, 2020 Ms Qiaojia earthquake, 2018 Ms 5.7 Xingwen earthquake, 2019 Changning earthquake, 2022 Ms 6.8 Luding earthquake, etc.).
We suggest that, once enough coseismic landslide inventories have accumulated for earthquakes in the Sichuan-Yunnan area, the near-real-time earthquake-triggered landslide hazard model can be updated continually with these landslide data and high-resolution input factors, further improving the accuracy of emergency assessment. The relevant description has been added to the discussion section (lines 578-602).
For the second stage, the predicted landslide area (Ap) of the six events is close to the observed landslide area (Ao). Except for the Jiuzhaigou earthquake, the overall error of the remaining five earthquakes is between 9% and 50%; the 2008 Wenchuan earthquake has the lowest error, at 9%. This indicates that the second-stage assessment results are reliable for quantifying the coseismic landslide area in the quake-affected area. In addition, comparing the complete landslide inventories with the prediction results of the six events (Fig. 6) shows that, although some landslides fall in low-hazard areas, most landslides are located in high-hazard areas, which is relatively consistent with the actual landslide distribution. Meanwhile, validation against the complete landslide data shows that the second-stage AUC values are all above 0.85, which indicates that the model performs well.
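For readers who wish to check AUC values such as those quoted in this response, the following is a minimal sketch of how an AUC can be computed from per-cell predicted probabilities and observed landslide labels. It uses the rank (Mann-Whitney) formulation, in which the AUC is the fraction of landslide/non-landslide cell pairs that the model orders correctly. The function name `auc` and the six example cells are hypothetical and are not taken from Mat.LShazard or from the six inventories.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted landslide probabilities, one per cell
    labels: 1 for a cell with an observed landslide, 0 otherwise
    Ties between a landslide and a non-landslide cell count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one landslide and one non-landslide cell")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: three landslide cells and three non-landslide cells.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
print(round(example_auc, 3))
```

In this example, eight of the nine landslide/non-landslide pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the number of cells, so a real raster would need a rank-based implementation; the sketch is only meant to show what the metric measures.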
Allstadt, K.E. et al., 2018. Improving Near‐Real‐Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
Interactive discussion
Status: closed

CEC1: 'Comment on egusphere2022772', Astrid Kerkweg, 07 Oct 2022
Dear authors,
in my role as Executive editor of GMD, I would like to bring to your attention our Editorial version 1.2: https://www.geoscimodeldev.net/12/2215/2019/
This highlights some requirements of papers published in GMD, which is also available on the GMD website in the ‘Manuscript Types’ section: http://www.geoscientificmodeldevelopment.net/submission/manuscript_types.html
In particular, please note that for your paper, the following requirement has not been met in the Discussions paper:
 "The main paper must give the model name and version number (or other unique identifier) in the title."
Please add the name and version number of Mat.LShazard V1.0 to the title of your article upon revision of the manuscript.
Yours,
Astrid Kerkweg
Citation: https://doi.org/10.5194/egusphere2022772CEC1 
AC1: 'Reply on CEC1', Chong Xu, 08 Oct 2022
Dear editor,
Thanks for your comments. We will add the name and version number of Mat.LShazard V1.0 to the title as you suggest. After the peer-review comments are posted, we will make this change in the revised manuscript.
Best regards.
Citation: https://doi.org/10.5194/egusphere2022772AC1

RC1: 'Comment on egusphere2022772', Ali P. Yunus, 30 Nov 2022
Hazard assessment modeling and software development of earthquaketriggered landslides in the SichuanYunnan area, China by Shao et al. presents a rapid landslide mapping tool  Mat.LShazard based on logistic regression model, which they successfully applied to six earthquake affected sites in SichuanYunnan region. The manuscript is well written and the toolbox may have wide applicability in future hazard scenarios. However, there are some concerns that need to be addressed.
Firstly, the stage 1, stage 2, and stage 3 as discussed in this paper is subjective. Obtaining remote sensing images within 12 hours after the quake and detailed images 3 days after the event is depends on many factors. At this point, it is better to define them as stage 2 and stage 3 only. May be termed as – stage 1 = immediately after the event, stage 2 = hours to a few days (e.g., Planet), and stage 3 = few days to weeks (e.g., Planet, Sentinel 2, Landsat 8/9).
Description on stage 1: Authors wrote  More detailed theory and calculation procedures can be found in (Xu et al., 2019). Xu et al 2019 is a paper written in Chinese Language. Hence describing more on the procedure of stage 1 is important for global readers. What are the inputs in stage 1?
Difference in stage 2 and stage 3: As far as I understand, the difference between these two stages is incorporation of more accurate training samples of landslides. Is it so? I believe the conditioning factors remains the same. This has to be explained clearly.
How Mat.LShazard model is different from USGS models – Godt 2008 and Nowicki Jessee et al 2018 ?.
Line 387 – How is this random selection achieved? It is not clear that in stage 2, for the final map, whether the study used all the 50 combinations for obtaining the mean probability distribution. If so, the accuracy is obviously close to stage 3. Instead, the study could have used random 6 (or X) combinations and their mean to get the probability distribution map. We could naturally expect high accuracy in third stage as all the landslide are used in training.
Since stage 3 involve mapping all landslides, then the applicability of the model for other study areas is limited. I would like to see how this model works for a validation site.
Line 446 – We chose 4 independent variable…. Why 4 ? what about remaining 9?
What threshold is used for calculating Ap in this study?
Minor comments
Line 28 29 is confusing rephrase to get a better reading
Fig 5. Is this the result of stage 1? . if so mention it in the caption.
Same for Fig 8. Is this the result of stage 2 or 3?
Apart from the graphs, there could also been a table for accuracy matrices.
Citation: https://doi.org/10.5194/egusphere2022772RC1 
AC2: 'Reply on RC1', Chong Xu, 27 Jan 2023
Review comments and responses
Reviewer 1
Overview
Hazard assessment modeling and software development of earthquaketriggered landslides in the SichuanYunnan area, China by Shao et al. presents a rapid landslide mapping tool  Mat.LShazard based on logistic regression model, which they successfully applied to six earthquake affected sites in SichuanYunnan region. The manuscript is well written and the toolbox may have wide applicability in future hazard scenarios. However, there are some concerns that need to be addressed.
Comments
 Firstly, the stage 1, stage 2, and stage 3 as discussed in this paper is subjective. Obtaining remote sensing images within 12 hours after the quake and detailed images 3 days after the event is depends on many factors. At this point, it is better to define them as stage 2 and stage 3 only. May be termed as – stage 1 immediately after the event, stage 2 = hours to a few days (e.g., Planet), and stage 3 = few days to weeks (e.g., Planet, Sentinel 2, Landsat 8/9).
Authors’ response: Yes, we have redefined the second and third phases, as you suggested.
 Description on stage 1: Authors wrote  More detailed theory and calculation procedures can be found in (Xu et al., 2019). Xu et al 2019 is a paper written in Chinese Language. Hence describing more on the procedure of stage 1 is important for global readers. What are the inputs in stage 1?
Authors’ response: Yes, we added the details of the independent variables of the Xu_{2019} model to the text (lines 265-270). We also added a detailed description of the Xu_{2019} model to the supplementary materials, including the selection of earthquake-induced landslide inventories, the input influencing factors and the calculation procedures of the model.
 Difference in stage 2 and stage 3: As far as I understand, the difference between these two stages is incorporation of more accurate training samples of landslides. Is it so? I believe the conditioning factors remains the same. This has to be explained clearly.
Authors’ response: Yes. For the second and third stages, we chose the same influencing factors as in the first stage, so that we can easily compare how the regression coefficients of the influencing factors change between stages and thus explain the relationship between each influencing factor and landslide occurrence. Relevant explanations have been added in Section 3.2.1 (lines 277-281).
 How Mat.LShazard model is different from USGS models – Godt 2008 and Nowicki Jessee et al 2018 ?.
Authors’ response: The Godt 2008 and Nowicki Jessee et al. 2018 models are both near-real-time assessment models of coseismic landslides. The Godt 2008 model adopts the physically based Newmark displacement method and has been widely used worldwide for rapid assessment of earthquake-induced landslides. However, emergency hazard assessments based on the Newmark model require multiple parameters, including terrain, geotechnical properties, groundwater and ground motion. Both the parameters themselves and the process of obtaining them carry numerous uncertainties (Bojadjieva et al., 2018; Wang et al., 2015). To obtain a more precise predicted displacement, the Newmark method needs accurate and complete information on the physical and mechanical properties of rocks and on ground motion parameters (Dreyfus et al., 2013), which is often difficult to obtain. Therefore, the accuracy of regional predictions based on the Newmark model is relatively low and cannot currently meet the needs of emergency assessment. Based on 23 global landslide inventories, Nowicki Jessee et al. (2018) established a new global landslide evaluation model using a data-driven method. However, that model depends on the seismic landslide samples used as input, and its applicability and accuracy in the Sichuan-Yunnan region are reduced.
Mat.LShazard is a tool for earthquake landslide hazard assessment that can be applied to three different scenarios in the Sichuan-Yunnan region. The model used in the first stage is a near-real-time assessment model (the Xu_{2019} model) based on 9 seismic landslide databases near the Sichuan-Yunnan region, and it shows better applicability in the Sichuan-Yunnan region than the Nowicki Jessee et al. 2018 model. In addition, the Nowicki Jessee et al. 2018 model trains the LR model with equal numbers of sliding and non-sliding samples. This sampling method artificially exaggerates the proportion of sliding samples in the study area, so the assessment results give only a relative hazard level and do not represent the real occurrence probability of landslides. The Xu_{2019} model instead combines the Bayesian probability method with the LR model, establishing a new generation of coseismic landslide hazard model that gives landslide occurrence probability rather than relative hazard level. We have also supplemented the corresponding programs for the second and third stages so that Mat.LShazard can meet the various needs of disaster prevention and reduction at different stages after a major earthquake.
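To illustrate why balanced sampling inflates the output of an LR model, the sketch below applies the standard prior (intercept) correction for case-control sampling: the log-odds predicted by a model trained on an artificially balanced sample are shifted by the log of the ratio between the sampled and the true slide/non-slide odds. This is a generic textbook correction chosen for illustration. The response does not spell out the formula used in the Xu_{2019} model or in Shao et al. (2020), so it should not be read as their method. The function name and the 5% landslide fraction are hypothetical.

```python
import math


def prior_corrected_probability(p_sample, sample_slide_frac, true_slide_frac):
    """Shift a probability from a resampled training set back to the true base rate.

    p_sample:          probability output by an LR model trained on the resampled data
    sample_slide_frac: fraction of slide samples in the training set (0.5 if balanced)
    true_slide_frac:   assumed true fraction of the study area occupied by landslides
    """
    logit = math.log(p_sample / (1.0 - p_sample))
    # Log of (true non-slide/slide odds) times (sampled slide/non-slide odds).
    offset = math.log(
        ((1.0 - true_slide_frac) / true_slide_frac)
        * (sample_slide_frac / (1.0 - sample_slide_frac))
    )
    return 1.0 / (1.0 + math.exp(-(logit - offset)))


# Hypothetical case: balanced training sample, landslides cover 5% of the area.
corrected = prior_corrected_probability(0.5, 0.5, 0.05)
print(round(corrected, 4))
```

With a balanced sample, a cell that the model scores at 0.5 corresponds to a probability of about 0.05 once the assumed 5% base rate is restored, which is the kind of overestimation described above. When the training sample already reflects the true slide fraction, the offset is zero and the probability is unchanged.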
Bojadjieva, J., Sheshov, V., Christophe, B., 2018. Hazard and risk assessment of earthquake-induced landslides – case study. Landslides, 15, 161-171.
Wang, T., Wu, S., Shi, J., Xin, P., 2015. Concepts and mechanical assessment method for seismic landslide hazard: A review. Journal of Engineering Geology, 23, 93-104.
Dreyfus, D.K., Rathje, E.M., Jibson, R.W., 2013. The influence of different simplified sliding-block models and input parameters on regional predictions of seismic landslides triggered by the Northridge earthquake. Engineering Geology, 163, 41-54.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
Xu, C., Xu, X., Zhou, B., Shen, L., 2019. Probability of coseismic landslides: A new generation of earthquake-triggered landslide hazard model. Journal of Engineering Geology, 27, 1122.
 Line 387 – How is this random selection achieved? It is not clear that in stage 2, for the final map, whether the study used all the 50 combinations for obtaining the mean probability distribution. If so, the accuracy is obviously close to stage 3. Instead, the study could have used random 6 (or X) combinations and their mean to get the probability distribution map. We could naturally expect high accuracy in third stage as all the landslide are used in training.
Authors’ response: Yes. To avoid sampling randomness, we chose 70% of all samples at random and independently repeated this 50 times to construct the LR model based on the partial landslide data available in the meizoseismal area. The 50 separate experiments yielded 50 modelling results. Figs. 7 and 8 show the mean probability prediction of the 50 models in the second and third stages, respectively, for the 6 earthquake events. For the third stage, we employed the same random sampling method, each time extracting 70% of all samples (complete landslide data for the entire earthquake area) to generate 50 model results. The only difference between the second and third stages is the set of samples used for model training. Because the training samples used in the third stage are the complete landslide data of the entire quake region, the accuracy of the evaluation results in the third stage is higher. Relevant descriptions have been added in Section 4.2 (lines 410-413 and 429-434).
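The repeated-sampling procedure described in this response can be sketched as follows. The sketch draws a random 70% of the samples, fits a logistic regression, predicts a set of map cells, and averages the predictions over 50 runs. Everything in it is an assumption made for illustration: the gradient-descent fit is a stand-in for whatever LR routine Mat.LShazard calls, the single normalized factor and the synthetic samples are hypothetical, and the held-out 30% is not scored here.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """Probability from weights w = [bias, w1, ...] and feature list x."""
    return sigmoid(w[0] + sum(wk * v for wk, v in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.5, epochs=100):
    """Plain batch gradient descent on the log-loss (illustrative stand-in)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for k, v in enumerate(xi):
                grad[k + 1] += err * v
        w = [wk - lr * g / n for wk, g in zip(w, grad)]
    return w


def repeated_split_mean(X, y, cells, n_runs=50, train_frac=0.7, seed=0):
    """Train on a random 70% subset n_runs times; return the mean probability per cell."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    n_train = int(round(train_frac * len(X)))
    mean_prob = [0.0] * len(cells)
    for _ in range(n_runs):
        rng.shuffle(idx)
        train = idx[:n_train]  # the remaining 30% would be held out for validation
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for c, cell in enumerate(cells):
            mean_prob[c] += predict(w, cell) / n_runs
    return mean_prob


# Hypothetical samples: one normalized factor (e.g. slope); steeper cells slide more often.
data_rng = random.Random(42)
X = [[data_rng.random()] for _ in range(120)]
y = [1 if data_rng.random() < sigmoid(6.0 * (x[0] - 0.5)) else 0 for x in X]

cells = [[0.1], [0.5], [0.9]]  # three map cells to predict
mean_prob = repeated_split_mean(X, y, cells)
print(mean_prob)
```

Averaging over the 50 runs smooths out the dependence of any single model on which 70% of the samples happened to be drawn. Under this reading, the second and third stages differ only in the `X` and `y` passed in (partial versus complete inventory), not in the procedure.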
 Since stage 3 involve mapping all landslides, then the applicability of the model for other study areas is limited. I would like to see how this model works for a validation site.
Authors’ response: Yes. In this study, we randomly selected 70% of the total samples for model training and used the remaining 30% for model validation; this step was repeated 50 times. In the third stage, we assume that the coseismic landslide data of the whole quake area have been obtained from pre- and post-quake images, and we then carry out the earthquake-induced landslide hazard assessment based on the complete landslide inventory. The assessment result can support the identification of potential high-hazard landslide areas as well as post-disaster restoration and infrastructure reconstruction in earthquake disaster areas. Since the model is trained on coseismic landslide data from this region, it is theoretically only applicable to this region.
 Line 446 – We chose 4 independent variable…. Why 4 ? what about remaining 9?
Authors’ response: Yes. Due to limited space, the text shows only the four independent variables that have the most obvious impact on landslide occurrence. In the supplementary materials, we have added the regression coefficients of all continuous variables (Fig. 1s).
 What threshold is used for calculating Ap in this study?
Authors’ response: In this study, building on previous work, we explored a new sampling method (Shao et al., 2020). With this method, the prediction results represent the real landslide probability. In other words, the resulting probability corresponds to spatial extent (e.g., areas labeled 5% probability of landsliding contain about 5% landslides by area). Therefore, the probability value of each grid cell multiplied by the cell area represents the predicted landslide area in that cell, and the predicted landslide area of the study area is the sum over all cells (Allstadt et al., 2018). Ap is therefore obtained directly from the probabilities, without applying a probability threshold. For all model outputs, we computed the predicted landslide area (Ap) as a metric that summarizes the total hazard estimated by a given model for a given earthquake with a single number. The predicted landslide area (Ap) was computed by

$$A_p = \sum_{i=1}^{m} \sum_{j=1}^{n} p_{i,j} \, A \qquad (5)$$

where $p_{i,j}$ is the probability of a landslide at pixel i, j; m is the number of rows; n is the number of columns; and A is the pixel/cell area (constant).
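As a minimal sketch of the calculation in Eq. (5), the function below sums the probability of each cell multiplied by a constant cell area. The function name, the 3 x 3 grid and the use of `None` for no-data cells are hypothetical choices for illustration. The 100 m cell size is borrowed from the resolution quoted for the Xu_{2019} input data and is only an assumption here.

```python
def predicted_landslide_area(prob_grid, cell_area):
    """Return Ap, the sum of p_ij * A over all cells of a probability grid.

    prob_grid: list of rows, each a list of probabilities in [0, 1];
               None marks a no-data cell and is skipped.
    cell_area: constant area of one cell (here in square metres).
    """
    total = 0.0
    for row in prob_grid:
        for p in row:
            if p is None:
                continue
            if not 0.0 <= p <= 1.0:
                raise ValueError("probabilities must lie in [0, 1]")
            total += p * cell_area
    return total


# Hypothetical 3 x 3 probability grid with 100 m cells (10,000 m^2 each).
grid = [
    [0.00, 0.05, 0.10],
    [0.05, 0.20, 0.50],
    [None, 0.10, 0.00],
]
ap = predicted_landslide_area(grid, cell_area=100.0 * 100.0)
print(ap)
```

The probabilities in this grid sum to 1.0, so the predicted landslide area is about one cell area, roughly 10,000 m^2. The example also shows why the result is meaningful only when the model outputs real occurrence probabilities: a relative hazard index summed in the same way would not have units of area.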
Allstadt, K.E. et al., 2018. Improving Near‐Real‐Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Shao, X., Ma, S., Xu, C., Zhou, Q., 2020. Effects of sampling intensity and non-slide/slide sample ratio on the occurrence probability of coseismic landslides. Geomorphology, 363, 107222.
Minor comments
 Line 28 29 is confusing rephrase to get a better reading
Authors’ response: Yes. We have rewritten this sentence.
10.Fig 5. Is this the result of stage 1? . if so mention it in the caption.
Authors’ response: Yes. We have revised it.
 Same for Fig 8. Is this the result of stage 2 or 3?
Authors’ response: Yes. We have revised it.
 Apart from the graphs, there could also been a table for accuracy matrices.
Authors’ response: Yes. We have added the corresponding tables about the accuracy matrices in supplementary materials.

RC2: 'Comment on egusphere2022772', Anonymous Referee #2, 19 Dec 2022
I have read the manuscript with interest. The paper develops a software for susceptibility assessment of coseismic landslides considering three stages after earthquake. The paper is well written, but I have some major concerns regarding the innovation and the method used by the paper.
I agree with the authors that rapid assessment of coseismic landslides is crucial to emergency response after strong earthquakes. The paper combined logistic regression and Bayesian probability methods in Matlab for assessing the spatial probability of landslides. Even the authors emphasized that there is no specialized software for seismic landslide hazard assessment, particularly in the various needs of different stages after a major earthquake, however, the methods they use are traditional methods, nothing new about the methodology itself. There are quite many existing toolbox or packages in ArcGIS, QGIS and R, which can be used for the same analysis as the authors did in Matlab. In addition, the threestage methodology is just classified considering the different time window after an earthquake. The methods at all stages are the same. The only difference is adding more and more landslide data after an earthquake due to more available information with time, such as remote sensing images. At third stage, if we already know all coseismic landslide distribution by RS imagery interpretation, why we still need the susceptibility model? Even at this stage, you get high R2, it is because the overfitting of the model. It does not mean the model will have good prediction power for next event.
Major comments:
 Why the authors did not try CNN or other more advanced AI methods, which should have better performance than logistic regression and Bayesian probability methods?
 PGA and PGV are considered as the most important seismic factors, why the authors used intensity rather than PGA and PGV data? Besides, distance to river and distance to transportation lines are also important factors considering the river incision in mountainous regions and human work effect, why they are not considered in the model?
 It is quite obvious that from Fig.4 that the actual landslides (black polygons) are not falling in high probability zones? The model seems not satisfactory for the first stage. Many landslides in all events are failing into blue (low probability zones), while the predicted high probability zones have a few landslides. This indicates that the model has quite high false alarms from prediction perspective. In the second stage, Fig.6, it still has the mismatching problem. In the third stage, it looks better, but this is because as I mentioned above, the overfitting of model by using a large amount of known landslides. Actually the first stage, the rapid prediction using very limited or even no available landslide information, is most important one considering the emergency response and rescue work. The model’s performance at this stage is not good.
Citation: https://doi.org/10.5194/egusphere2022772RC2 
AC3: 'Reply on RC2', Chong Xu, 27 Jan 2023
Review comments and responses
Reviewer 2
Overview
I have read the manuscript with interest. The paper develops a software for hazard assessment of coseismic landslides considering three stages after earthquake. The paper is well written, but I have some concerns regarding the innovation and the method used by the paper.
Comments
 I agree with the authors that rapid assessment of coseismic landslides is crucial to emergency response after strong earthquakes. The paper combined logistic regression and Bayesian probability methods in Matlab for assessing the spatial probability of landslides. Even the authors emphasized that there is no specialized software for seismic landslide hazard assessment, particularly in the various needs of different stages after a major earthquake, however, the methods they use are traditional methods, nothing new about the methodology itself. There are quite many existing toolbox or packages in ArcGIS, QGIS and R, which can be used for the same analysis as the authors did in Matlab.
Authors’ response: Yes. Many toolboxes and packages in different programming languages can currently assess the spatial probability of landslides. However, these toolboxes follow the traditional hazard assessment process, which requires landslide samples for modelling. As a result, the prediction results often lag behind actual needs and cannot serve the emergency assessment of earthquake-induced landslides (Ma et al., 2020). To solve this problem, we integrated a new generation of earthquake-triggered landslide hazard model (the Xu_{2019} model), so that our software can serve the emergency assessment of earthquake-induced landslides in the Sichuan-Yunnan area. Secondly, most studies on data-driven models have used equal numbers of sliding and non-sliding samples. Such a sampling method artificially exaggerates the proportion of sliding samples in the study area (Allstadt et al., 2018; Nowicki Jessee et al., 2018); the assessment results therefore give only a relative hazard level and do not represent the real occurrence probability of landslides. Consequently, the resulting probability overestimates the actual landslide occurrence probability (Shao et al., 2020; Nowicki Jessee et al., 2018). We proposed a real probability prediction method for coseismic landslides based on the Bayesian probability method and the LR model (Shao et al., 2020). The results of this model represent the occurrence probability of landslides rather than the relative hazard level (Shao et al., 2020) and can thus be used to calculate the landslide area in the quake-affected area. Thirdly, although there are many existing toolboxes and packages in ArcGIS, QGIS and R, to our knowledge there is no specific software for regional landslide hazard assessment written in MATLAB, so our work will also help those familiar with MATLAB to carry out earthquake-induced landslide assessment.
Allstadt, K.E. et al., 2018. Improving Near‐Real‐Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Ma, S., Xu, C., Shao, X., 2020. Spatial prediction strategy for landslides triggered by large earthquakes oriented to emergency response, mid-term resettlement and later reconstruction. International Journal of Disaster Risk Reduction, 43, 101362.
Shao, X., Ma, S., Xu, C., Zhou, Q., 2020. Effects of sampling intensity and non-slide/slide sample ratio on the occurrence probability of coseismic landslides. Geomorphology, 363, 107222.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
 In addition, the threestage methodology is just classified considering the different time window after an earthquake. The methods at all stages are the same. The only difference is adding more and more landslide data after an earthquake due to more available information with time, such as remote sensing images. At third stage, if we already know all coseismic landslide distribution by RS imagery interpretation, why we still need the hazard model? Even at this stage, you get high R2, it is because the overfitting of the model. It does not mean the model will have good prediction power for next event.
Authors’ response: The demands of earthquake disaster mitigation differ between stages after a large earthquake (Ma et al., 2020). Especially in mountainous areas, the spatial prediction of landslides is of great significance to short-term emergency response (stage 1), medium-term temporary resettlement (stage 2) and long-term rehabilitation and reconstruction (stage 3). In stage 1, in the absence of landslide data, rapid emergency hazard mapping with a near-real-time model can guide post-disaster emergency rescue as well as the interpretation and identification of earthquake-induced landslides in stage 2. In stage 2, for the sake of timeliness, the earthquake-induced landslide hazard assessment is carried out with the partially available landslide data. The assessment results help to improve the earthquake-induced landslide inventory and provide useful information for avoiding high landslide hazard areas in quake-affected areas.
In stage 3, we face not only the problem of coseismic landslide identification but also slopes weakened by the quake. It is therefore critical to locate the slopes that remained stable during the earthquake but become unstable for a period of time afterwards. This can be achieved by a hazard assessment of earthquake-induced landslides based on the complete landslide data. The results obtained in stage 3 are more objective than those obtained in stage 2 because the training samples used in this stage are more abundant and more objective. That is, the prediction ability of the model in stage 3 is stronger than in stage 2 (the prediction rate in the third stage is higher). The evaluation results at this stage can also serve town planning and long-term risk assessment in the quake-affected areas. To summarize, we suggest performing earthquake-induced landslide hazard assessment at multiple stages after a large earthquake in order to better deal with the landslide disaster prevention and mitigation issues that earthquake areas face at each stage. The relevant descriptions have been added in Section 3.2.1 (lines 288-296).
Ma, S., Xu, C., Shao, X., 2020. Spatial prediction strategy for landslides triggered by large earthquakes oriented to emergency response, mid-term resettlement and later reconstruction. International Journal of Disaster Risk Reduction, 43, 101362.
Major comments:
 Why the authors did not try CNN or other more advanced AI methods, which should have better performance than logistic regression and Bayesian probability methods?
Authors’ response: Currently, the majority of models employ one of several possible classification methods, including classical statistics (e.g., logistic regression, discriminant analysis, linear regression), index-based methods (e.g., weight-of-evidence, heuristic analysis), machine learning (e.g., support vector machines, random forest) and neural networks (e.g., recurrent neural network, convolutional neural network) (Reichenbach et al., 2018). Among them, the LR model is one of the most widely used models in the hazard assessment of earthquake-induced landslides by virtue of its simplicity, high efficiency, and high prediction accuracy (Reichenbach et al., 2018; Shao and Xu, 2022) (Fig. 1). In addition, it is the preferred method for establishing near-real-time prediction models of earthquake-induced landslides (Nowicki Jessee et al., 2018; Tanyas et al., 2019; Xu et al., 2019).
In recent years, deep learning methods, especially convolutional neural networks (CNN), have been widely applied in landslide hazard assessment. As with other machine learning methods, the internal structure of a CNN model is complex and behaves like a black box, and the independent variables need to be classified before the evaluation modeling (Yang et al., 2022). The LR model largely avoids these two problems. It can handle different types of independent variables, including continuous and discrete variables. It also gives specific regression coefficients for the independent variables, with a simple calculation process and definite physical meanings. At the same time, a recent study shows that the LR model performed better in prediction than other machine learning models (Zhao et al., 2022). The relevant description has been added in Section 3.22.
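As a minimal sketch of why LR coefficients are easy to interpret, the snippet below evaluates the standard logistic function for one grid cell. The factor names, the intercept and all coefficient values are hypothetical and chosen only for illustration; they are not fitted values from this study or from the Xu_{2019} model.

```python
import math


def landslide_probability(intercept, coefficients, factors):
    """Logistic regression: P = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value for name, value in factors.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only (not values from the study).
INTERCEPT = -6.0
COEFFICIENTS = {
    "slope_deg": 0.05,          # continuous variable
    "intensity": 0.40,          # continuous variable
    "lithology_granite": 0.80,  # discrete variable, one-hot encoded (0 or 1)
}

# One hypothetical grid cell.
cell = {"slope_deg": 35.0, "intensity": 9.0, "lithology_granite": 1.0}
p = landslide_probability(INTERCEPT, COEFFICIENTS, cell)

# exp(b_i) is the odds ratio for a one-unit increase in factor i,
# which is what gives each coefficient a direct interpretation.
odds_ratio_slope = math.exp(COEFFICIENTS["slope_deg"])

print(f"P(landslide) = {p:.3f}, odds ratio per degree of slope = {odds_ratio_slope:.3f}")
```

Continuous factors enter the linear term directly, while a discrete factor such as lithology enters as a 0/1 indicator, so no prior binning of the continuous variables is required.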
Fig. 1 Horizontal bar chart showing the count of the 19 model type classes used to group the 163 model names in the literature database of Reichenbach et al. (2018).
Reichenbach, P., Rossi, M., Malamud, B.D., Mihir, M., Guzzetti, F., 2018. A review of statistically-based landslide hazard models. Earth-Science Reviews, 180, 60-91.
Tanyas, H., Rossi, M., Alvioli, M., van Westen, C.J., Marchesini, I., 2019. A global slope unit-based method for the near real-time prediction of earthquake-induced landslides. Geomorphology, 327, 126-146.
Xu, C., Xu, X., Zhou, B., Shen, L., 2019. Probability of coseismic landslides: A new generation of earthquake-triggered landslide hazard model. Journal of Engineering Geology, 27, 1122.
Yang, Z., Xu, C., Shao, X., Ma, S., Li, L., 2022. Landslide hazard mapping based on CNN-3D algorithm with attention module embedded. Bulletin of Engineering Geology and the Environment, 81, 412.
Zhao, P., Masoumi, Z., Kalantari, M., Aflaki, M., Mansourian, A., 2022. A GIS-Based Landslide Hazard Mapping and Variable Importance Analysis Using Artificial Intelligent Training-Based Methods. Remote Sens., 14, 211.
 PGA and PGV are considered the most important seismic factors. Why did the authors use intensity rather than PGA and PGV data? Besides, distance to rivers and distance to transportation lines are also important factors, considering river incision in mountainous regions and the effect of human activity. Why are they not considered in the model?
Authors’ response: We thank the reviewer for these suggestions. Indeed, as the reviewer said, PGA and PGV are the two most important seismic factors, and they can be converted to each other through specific empirical formulas (Boore et al., 2014; Saffari et al., 2012). Like PGA and PGV, seismic intensity is also one of the most important seismic factors, and there are specific formulas for converting between seismic intensity and PGA (Du et al., 2018; Xin et al., 2020). In our study, we chose seismic intensity instead of PGA because, in the official results released by the China Earthquake Administration, seismic intensity is obtained by integrating multi-source information such as instrument records, field surveys and actual earthquake damage. Compared with a PGA map obtained from instrument records alone or from an attenuation relationship, the seismic intensity map can better reflect the distribution of the seismic influence field. Therefore, in all three stages, we carried out the probability prediction of earthquake-induced landslides based on the rapidly obtained seismic intensity.
Distance to rivers and distance to transportation lines were not selected because, among the influencing factors we did select, the topographic wetness index (TWI) and land-use type can represent regional hydrological factors and human factors, respectively, to a certain extent. Additionally, previous studies of the spatial distribution of earthquake-induced landslides in the Sichuan-Yunnan region indicate that these two influencing factors do not show a strong correlation with the occurrence of earthquake-induced landslides. Furthermore, although these two factors are not taken into account, the performance of the evaluation model is satisfactory. The model and software we developed are also adaptable and do not place rigid limits on the input influencing factors. Peers who are interested in the assessment models may add or change the corresponding independent variables during the modeling process.
Boore, D.M., Stewart, J.P., Seyhan, E., Atkinson, G.M., 2014. NGA-West2 Equations for Predicting PGA, PGV, and 5% Damped PSA for Shallow Crustal Earthquakes. Earthquake Spectra, 30 (3), 1057-1085.
Saffari, H., Kuwata, Y., Takada, S., Mahdavian, A., 2012. Updated PGA, PGV, and Spectral Acceleration Attenuation Relations for Iran. Earthquake Spectra, 28 (1), 257-276.
Du, K., Ding, B., Luo, H., Sun, J., 2018. Relationship between Peak Ground Acceleration, Peak Ground Velocity, and Macroseismic Intensity in Western China. Bulletin of the Seismological Society of America, 109, 284-297.
Xin, D., Daniell, J.E., Wenzel, F., 2020. Review of fragility analyses for major building types in China with new implications for intensity–PGA relation development. Nat. Hazards Earth Syst. Sci., 20, 643-672.
 It is quite obvious from Fig. 4 that the actual landslides (black polygons) are not falling in high probability zones. The model seems unsatisfactory for the first stage. Many landslides in all events are falling into blue (low probability) zones, while the predicted high probability zones have few landslides. This indicates that the model has quite a high false alarm rate from a prediction perspective. In the second stage (Fig. 6), it still has the mismatch problem. In the third stage, it looks better, but this is because, as I mentioned above, the model is overfitted by using a large number of known landslides. Actually, the first stage, the rapid prediction using very limited or even no available landslide information, is the most important one considering the emergency response and rescue work. The model’s performance at this stage is not good.
Authors’ response: In the first stage, except for the Minxian earthquake, the AUC values of the other earthquakes are above 0.8. However, we have to admit that the evaluation results of the six earthquakes based on the Xu_{2019} model can be improved. Landslide observations match well with the predicted high probabilities, but the model predicts potential landsliding over a large area beyond the mapped landslide area. The performance of the model is not satisfactory especially in the Minxian, Jiuzhaigou and Yushu earthquake cases. Most current near-real-time models share this problem: the model performs well when evaluated over the domain of an entire event area, but individual pixels clearly predict probabilities that underestimate or overestimate the landslide hazard (Nowicki Jessee et al., 2018; Allstadt et al., 2018). We propose two possible reasons for this phenomenon: (1) The resolution of the input data of the Xu_{2019} model is 100 m, which affects the prediction accuracy of the model to a certain extent. Therefore, there may be errors between the model prediction and the actual result at the regional scale. (2) The nine earthquake cases used to establish the Xu_{2019} model are located in China and its adjacent areas. The corresponding epicentral areas have different topographic and geological conditions, and only four cases are in the Sichuan-Yunnan area, which may weaken the applicability of the Xu_{2019} model to other quake events. Therefore, in the past few years, we have been constantly supplementing the earthquake landslide database of the Sichuan-Yunnan region (e.g., 2014 Ms Jinggu earthquake, 2020 Ms Qiaojia earthquake, 2018 Ms 5.7 Xingwen earthquake, 2019 Changning earthquake, 2022 Ms 6.8 Luding earthquake, etc.).
We suggest that, as enough coseismic landslide inventories accumulate for earthquake cases in the Sichuan-Yunnan area, the near-real-time earthquake-triggered landslide hazard model can be continually updated with these abundant landslide data and high-resolution input factor data, further improving the accuracy of the modelling in the emergency assessment. The relevant description has been added in the discussion section (lines 578-602).
For the second stage, the predicted landslide area (Ap) of the six events is close to the observed landslide area (Ao). Except for the Jiuzhaigou earthquake, the overall error of the remaining five earthquakes is between 9% and 50%, and the error for the 2008 Wenchuan earthquake is the lowest at 9%, indicating that the assessment results of the second stage are reliable for quantifying the coseismic landslide area in the quake-affected area. In addition, the complete landslide inventories and prediction results of the six events (Fig. 6) show that, although some landslides are spread over the low-hazard areas, most landslides are located in the high-hazard areas, which is relatively consistent with the actual landslide distribution. Validation against the complete landslide data also shows that the AUC values of the second stage are all above 0.85, which indicates that the model performs well.
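The distinction drawn above between event-wide performance and per-pixel probabilities can be illustrated with a short sketch. AUC is computed here in its rank form (the probability that a randomly chosen landslide cell receives a higher score than a randomly chosen non-landslide cell); the scores are made-up numbers, not results from any of the six events.

```python
def auc(scores_landslide, scores_stable):
    """Rank-based AUC: share of (landslide, stable) cell pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for s_pos in scores_landslide:
        for s_neg in scores_stable:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5
    return wins / (len(scores_landslide) * len(scores_stable))


# Hypothetical predicted probabilities for illustration only.
landslide_cells = [0.9, 0.7, 0.4]
stable_cells = [0.6, 0.3, 0.2, 0.1]

auc_original = auc(landslide_cells, stable_cells)

# Halving every probability changes each pixel's value but not the ranking,
# so the AUC is unchanged.
auc_halved = auc([0.5 * s for s in landslide_cells], [0.5 * s for s in stable_cells])

print(auc_original, auc_halved)
```

Because AUC depends only on the ranking of cells, a model can reach a high AUC over the whole event area while the absolute probability at individual pixels is still too high or too low.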
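A minimal sketch of how the comparison between Ap and Ao could be computed is given below. It assumes that each cell's predicted probability is read as the expected fraction of the cell occupied by landslides, so that Ap is the sum of probability times cell area, and that the error is the relative difference |Ap - Ao| / Ao. Both are assumptions made for illustration, and all numbers are hypothetical.

```python
def predicted_landslide_area(probabilities, cell_area_m2):
    """Ap under the assumption that probability equals expected landslide area fraction."""
    return sum(probabilities) * cell_area_m2


def relative_error(predicted_area, observed_area):
    """Relative difference between predicted and observed landslide area."""
    return abs(predicted_area - observed_area) / observed_area


# Hypothetical example: four 100 m x 100 m cells.
cell_probabilities = [0.02, 0.10, 0.50, 0.30]
cell_area = 100.0 * 100.0  # m^2

ap = predicted_landslide_area(cell_probabilities, cell_area)
ao = 10000.0  # hypothetical observed landslide area in m^2

print(f"Ap = {ap:.0f} m^2, Ao = {ao:.0f} m^2, error = {relative_error(ap, ao):.0%}")
```

An area-based check of this kind complements AUC, since it is sensitive to the absolute level of the predicted probabilities rather than only to their ranking.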
Allstadt, K.E. et al., 2018. Improving Near‐Real‐Time Coseismic Landslide Models: Lessons Learned from the 2016 Kaikōura, New Zealand, Earthquake. Bulletin of the Seismological Society of America, 108, 1649-1664.
Nowicki Jessee, M.A. et al., 2018. A global empirical model for near-real-time assessment of seismically induced landslides. Journal of Geophysical Research: Earth Surface, 123, 1835-1859.
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.