the Creative Commons Attribution 4.0 License.
Application of bagging, boosting and stacking ensemble and EasyEnsemble methods to landslide susceptibility mapping in the Three Gorges Reservoir area of China
Abstract. Since the impoundment of the Three Gorges Reservoir area in 2003, the potential risks of geological disasters in the reservoir area have increased significantly, among which the hidden dangers of landslides are particularly prominent. To reduce casualties and damage, efficient and precise landslide susceptibility evaluation methods are important. Multiple ensemble models have been used to evaluate the susceptibility of the upper part of Badong County to landslides. In this study, EasyEnsemble technology was used to solve the imbalance between landslide and nonlandslide sample data. The extracted evaluation factors were input into three ensemble models, bagging, boosting, and stacking models, for training, and landslide susceptibility maps (LSMs) were drawn. According to the importance analysis, the important factors affecting the occurrence of landslides are altitude, terrain surface texture (TST), distance to residents, distance to rivers and land use. Comparing the influences of different grid sizes on the susceptibility results, a larger grid was found to lead to the overfitting of the prediction results. Therefore, a 30 m grid was selected as the evaluation unit. The accuracy rate, area under the curve (AUC), recall rate, test set precision, and Kappa coefficient of the multigrained cascade forest (gcForest) model under the stacking method were 0.958, 0.991, 0.965, 0.946, and 0.91, respectively, which were significantly better than the values produced by the other two models.
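As a rough illustration of two ideas named in the abstract, the sketch below shows (a) the balanced undersampling step that EasyEnsemble-style methods rely on and (b) how accuracy, precision, recall and the Kappa coefficient follow from a binary confusion matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `balanced_subsets` and `binary_metrics` are hypothetical, the sample indices and confusion-matrix counts in the usage lines are toy values, and the per-subset classifier training that a full EasyEnsemble performs is omitted.

```python
import random
from typing import Dict, List, Sequence


def balanced_subsets(
    positives: Sequence[int],
    negatives: Sequence[int],
    n_subsets: int,
    seed: int = 0,
) -> List[List[int]]:
    """Build n_subsets balanced index lists.

    Each subset keeps every minority (landslide) index and adds an equally
    sized random draw, without replacement, from the majority (nonlandslide)
    indices. Assumes len(positives) <= len(negatives).
    """
    rng = random.Random(seed)
    majority = list(negatives)
    subsets = []
    for _ in range(n_subsets):
        drawn = rng.sample(majority, k=len(positives))
        subsets.append(list(positives) + drawn)
    return subsets


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall and Cohen's kappa from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_e = p_yes + p_no
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }


# Toy usage: 5 landslide cells versus 100 nonlandslide cells (made-up indices).
subsets = balanced_subsets(range(5), range(5, 105), n_subsets=3, seed=1)
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

In a full workflow, one classifier would be trained per balanced subset and the predictions combined; the metrics would then be computed on a held-out test set.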
Status: closed
RC1: 'Comment on egusphere-2022-697', Anonymous Referee #1, 28 Aug 2022
This manuscript discusses the application of several machine learning models to landslide susceptibility analysis. However, I do not think it merits publication in a high-quality journal like NHESS. My concerns are as follows:
(1) My biggest concern is the novelty of this study. What is new here? A quick Google search shows that many similar studies have already been published. Most are characterized by keywords such as “machine learning” and “landslide (or other hazards) susceptibility”, and their main objective is to compare the performance of different models. In my opinion, comparing so many models is not meaningful; such comparisons are just routine exercises on this topic.
(2) The structure of the manuscript is confusing. It does not follow the widely accepted template for a paper: Introduction, Methods, Study area, Results, Discussion, Conclusion.
(3) A real Discussion section appears to be missing.
(4) You selected 25 factors as model inputs, but why these factors and not others? I assume all of them come from the literature and experience. How do you justify that they are necessary and that the factors you did not select are not?
(5) The Abstract and Conclusion contain very few quantitative results.
Citation: https://doi.org/10.5194/egusphere-2022-697-RC1
AC1: 'Reply on RC1', Xueling Wu, 01 Sep 2022
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-697/egusphere-2022-697-AC1-supplement.pdf
CC1: 'Comment on egusphere-2022-697', Ali P. Yunus, 29 Aug 2022
The authors present a comparison of bagging, boosting and stacking ensemble methods for landslide susceptibility mapping in the Three Gorges Reservoir area of China. Although the manuscript is well written, the methods and the presentation of the data lack novelty. One can easily guess from the start which models work better without reading further, as this has been done in several previous studies. From a reader's viewpoint, I would like to see a real discussion of the science, which is missing in this paper. For example, the authors analysed 25 factors (and, surprisingly to me, did not find a collinearity problem); how do these factors contribute to landslides in the TGD area? I also see that the feature importance of altitude is very high, while some other important factors, such as slope, do not contribute to the model at all. An explanation of this would add to the discussion. Also, the authors have computed the zoning results in Tables 3, 4 and 5. What do they mean to a reader? Additionally, the authors should validate the model outside the training site. They should have chosen adjoining catchments to check whether the results still hold (e.g. an AUC of 0.95). Again, the comparison of 30, 60 and 90 m grid sizes is inappropriate, as these effects are already known from several past works.
Citation: https://doi.org/10.5194/egusphere-2022-697-CC1
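The collinearity concern raised in this comment can be made concrete with a simple pairwise screening. The sketch below flags factor pairs whose absolute Pearson correlation exceeds a threshold. It is only an illustration under assumptions: the helper names `pearson` and `correlated_pairs`, the 0.8 threshold, and the toy factor values are all hypothetical, and nothing here describes the check the authors actually ran.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples.

    Raises ZeroDivisionError if either sample is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlated_pairs(
    factors: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the manuscript
) -> List[Tuple[str, str, float]]:
    """Return every factor pair with |r| >= threshold."""
    flagged = []
    for (name_a, vals_a), (name_b, vals_b) in combinations(factors.items(), 2):
        r = pearson(vals_a, vals_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy values for three factors sampled at five grid cells (made up).
toy_factors = {
    "altitude": [1.0, 2.0, 3.0, 4.0, 5.0],
    "slope": [2.0, 4.0, 6.0, 8.0, 10.0],
    "aspect": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = correlated_pairs(toy_factors)
```

Pairwise correlation only detects two-variable dependence; a variance inflation factor computed per factor would also catch dependence on combinations of several factors.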
AC3: 'Reply on CC1', Xueling Wu, 01 Sep 2022
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-697/egusphere-2022-697-AC3-supplement.pdf
RC2: 'Comment on egusphere-2022-697', Ali P. Yunus, 29 Aug 2022
The authors present a comparison of bagging, boosting and stacking ensemble methods for landslide susceptibility mapping in the Three Gorges Reservoir area of China. Although the manuscript is well written, the methods and the presentation of the data lack novelty. One can easily guess from the start which models work better without reading further, as this has been done in several previous studies. From a reader's viewpoint, I would like to see a real discussion of the science, which is missing in this paper. For example, the authors analysed 25 factors (and, surprisingly to me, did not find a collinearity problem); how do these factors contribute to landslides in the TGD area? I also see that the feature importance of altitude is very high, while some other important factors, such as slope, do not contribute to the model at all. An explanation of this would add to the discussion. Also, the authors have computed the zoning results in Tables 3, 4 and 5. What do they mean to a reader? Additionally, the authors should validate the model outside the training site. They should have chosen adjoining catchments to check whether the results still hold (e.g. an AUC of 0.95). Again, the comparison of 30, 60 and 90 m grid sizes is inappropriate, as these effects are already known from several past works.
Citation: https://doi.org/10.5194/egusphere-2022-697-RC2
AC2: 'Reply on RC2', Xueling Wu, 01 Sep 2022
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2022/egusphere-2022-697/egusphere-2022-697-AC2-supplement.pdf