Preprints
https://doi.org/10.5194/egusphere-2023-1417
10 Aug 2023
Status: this preprint is open for discussion.

Addressing Class Imbalance in Soil Movement Predictions

Praveen Kumar, Priyanka Priyanka, Kala Venkata Uday, and Varun Dutt

Abstract. Landslides threaten human life and infrastructure, causing fatalities and economic losses. Monitoring stations provide data for predicting soil movement, which is crucial for mitigating this threat. However, accurately predicting soil movement from monitoring data is challenging because the data are complex and inherently class-imbalanced. This study develops machine learning (ML) models combined with oversampling techniques to address the class imbalance and build a robust soil movement prediction system. The dataset, comprising two years (2019–2021) of monitoring data from a landslide in Uttarakhand, was split 70:30 into training and testing sets. To tackle the class imbalance, several oversampling techniques were applied to the dataset: Synthetic Minority Oversampling Technique (SMOTE), K-Means SMOTE, Borderline SMOTE, Support Vector Machine SMOTE, and Adaptive Synthetic Sampling (ADASYN). Several ML models, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), Category Boosting (CatBoost), Long Short-Term Memory (LSTM), Multilayer Perceptron (MLP), and dynamic ensemble models, were trained and compared for soil movement prediction. In testing, the dynamic ensemble model with K-Means SMOTE performed best, with accuracy, precision, and recall of 99.68 % each and an F1-score of 0.9968. The RF model with K-Means SMOTE was the second-best performer, with accuracy, precision, and recall of 99.64 % each and an F1-score of 0.9964. These results show that ML models combined with class imbalance techniques have the potential to significantly improve soil movement predictions in landslide-prone areas.
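The oversampling step described in the abstract can be illustrated with a minimal sketch of basic SMOTE, which creates synthetic minority samples by interpolating between a minority point and one of its nearest minority-class neighbours. This is not the authors' code: the toy data, the function names `smote` and `stratified_split_70_30`, and the choice of k = 5 are assumptions made for illustration. The sketch also oversamples only the training portion of the 70:30 split, a common precaution that the abstract does not specify.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Basic SMOTE: interpolate between a minority sample and one of its
    k nearest minority-class neighbours to create n_new synthetic samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def stratified_split_70_30(rows, seed=0):
    """Shuffle and split one class 70:30 (integer arithmetic avoids
    floating-point rounding in the cut index)."""
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    cut = len(rows) * 7 // 10
    return rows[:cut], rows[cut:]


# Hypothetical toy data (not the Uttarakhand dataset):
# 90 "no movement" points and 10 "movement" points with two features each.
data_rng = random.Random(42)
majority_all = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
minority_all = [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]

majority_train, majority_test = stratified_split_70_30(majority_all)
minority_train, minority_test = stratified_split_70_30(minority_all)

# Oversample the minority class in the training set only, until balanced.
new_points = smote(minority_train, n_new=len(majority_train) - len(minority_train))
balanced_minority_train = minority_train + new_points
```

Variants named in the abstract (K-Means SMOTE, Borderline SMOTE, SVM SMOTE, ADASYN) differ mainly in which minority points are chosen as interpolation bases and how many synthetic samples each one receives. The interpolation step itself is the same idea.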


Status: open (until 29 Oct 2023)


Latest update: 03 Oct 2023
Short summary
Our study focuses on predicting soil movement to mitigate landslide risks. We develop machine learning models with oversampling techniques to address the class imbalance in monitoring data. The dynamic ensemble model with K-Means SMOTE achieves high accuracy (99.68 %), precision, recall, and F1-score, followed by RF with K-Means SMOTE. Our findings highlight the potential of these models to improve soil movement predictions in landslide-prone areas.
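To make the reported metrics concrete, the following sketch computes accuracy, precision, recall, and F1-score from binary labels. It treats soil movement (label 1) as the positive class, which is an assumption: the preprint's summary does not state how the metrics were averaged across classes. The function name `binary_metrics` and the example labels are hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Define the ratios as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 95 "no movement", 5 "movement".
y_true = [0] * 95 + [1] * 5

# A model that never predicts movement still scores 95 % accuracy
# but has zero recall on the minority class.
always_negative = binary_metrics(y_true, [0] * 100)
```

This is why accuracy alone can be misleading on imbalanced monitoring data, and why precision, recall, and F1-score are reported alongside it.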