the Creative Commons Attribution 4.0 License.
Debris Flow Susceptibility in the Jinsha River Basin, China: A Bayesian Assessment Framework Based on Geomorphodynamic Parameters
Abstract. Accurately identifying the spatial and temporal locations of debris flow occurrences is a significant challenge in assessing mountain hazard susceptibility and is essential for developing effective disaster mitigation and ecological restoration strategies. The Jinsha River basin, a typical region in China characterized by alpine gorges, frequently experiences debris flow disasters. Due to its vast area and the complex mechanisms underlying debris flow formation, using slope-based indicators alone to assess susceptibility, without considering the "source-sink" process of debris flow formation, results in low accuracy in susceptibility evaluations. To address this issue, we carefully selected a set of geomorphodynamic parameters, designed corresponding quantitative characterization methods, and developed a Bayesian model-based framework to more accurately identify debris flow-prone areas. This framework provides a comprehensive understanding of the spatial distribution and intensity of debris flow events, thereby improving the accuracy and robustness of susceptibility assessments. The model’s evaluation results indicate debris flow susceptibility in the Jinsha River basin for small, medium, and large-scale events, with an average accuracy of 63 %. Furthermore, through an empirical analysis of the catastrophic mountain flood and debris flow event ("8.21") in Jinyang County, Sichuan Province, we found that the model’s predictions closely matched the actual disaster locations, further validating the model’s effectiveness. Our study reveals that the importance of factors contributing to debris flow susceptibility in the Jinsha River basin decreases in the following order: surface material erodibility > connectivity > stream power > frequency and intensity of extreme precipitation. 
Debris flow-prone valleys are primarily concentrated within a 30 km stretch along the middle and lower reaches of the Jinsha and Yalong Rivers, with approximately 32,000 risk-prone river valleys longer than 200 meters, most of which are small to medium-sized gullies. The distribution of these valleys follows a power function relationship with the distance from the main stream. In areas where debris flow events occur infrequently but with high probability, when such events do occur, they tend to be larger and more destructive. Given that many existing and planned large reservoirs in the Jinsha River basin are in regions densely populated with debris flow-prone valleys, and considering the projected increase in extreme precipitation events, preventing and mitigating debris flow susceptibility remains a significant challenge. The datasets generated in this study, including river power, surface connectivity, and debris flow occurrence probability, provide valuable insights for major construction projects, such as large reservoirs, bridges, and residential developments, helping to improve infrastructure siting and disaster mitigation planning.
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Interactive discussion
Status: closed
- RC1: 'Comment on egusphere-2024-4164', Anonymous Referee #1, 31 Mar 2025
- AC1: 'Reply on RC1', Zhenkui Gu, 16 Apr 2025
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2024-4164/egusphere-2024-4164-AC1-supplement.pdf
- AC2: 'Reply on RC1', Zhenkui Gu, 16 Apr 2025
- AC3: 'Reply on RC1', Zhenkui Gu, 16 Apr 2025
 
- RC2: 'Comment on egusphere-2024-4164', Anonymous Referee #2, 22 Apr 2025
The study tackles the relevant problem of regional debris-flow susceptibility. I think the authors chose adequate methods (Bayesian statistical methods) and an interesting combination of features representing hydro-geomorphological factors for debris-flow triggering. However, I found the methods to be insufficiently transparent in important aspects of the study, such that I cannot assess the reproducibility or plausibility of the results. I list my major concerns below:
- What debris flow data was used for training, namely to estimate the occurrence probability P(Ci) in Eq. 1? You mention that you use “debris flow survey sites” (L123), but there is no reference or description of how you obtained the data. Also, Fig. 2 on the study implementation doesn’t mention any use of observational debris flow data to train the model. In L115, I see one citation that may refer to such data (Yu and Tang, 2016), but the full reference is missing. Fig. 5 indicates that debris-flow fans were identified, but how exactly, and how do you differentiate debris-flow fans from alluvial fans?
- Any information on model training and testing is missing, except for the showcasing of the model for one event.
- There is no uncertainty assessment or discussion of model limitations.
- The data availability statement says that datasets are being made available, but there is no link. In any case, the data needed to reproduce the results would be more important, and this would include the debris flow observations.
- The conclusions are largely a copy of the abstract. Both should be rewritten such that they are complementary (e.g., more focus on research question and methods in the abstract, and more focus on conclusions, implications, and outlook in the conclusions).
Specific comments:
- ~L70-77: I cannot follow the critique of previously used indicators for DF susceptibility. It may be that the risk is highest in the valley bottom, but the source area characteristics govern susceptibility. Can you specify what exactly current methods are missing and what you do differently? Contrary to your argument on the importance of valley bottom characteristics, I would assume that the factors you report in L77 (stream power, surface erosion, etc.) characterize the source area rather than the sink area.
- L105: If you use ERA5, resolutions higher than daily are available to my knowledge. Could you justify why you don’t use these? Sub-daily rainfall is commonly much more useful than daily rainfall for debris-flow triggering.
Citation: https://doi.org/10.5194/egusphere-2024-4164-RC2
                                        
- AC4: 'Reply on RC2', Zhenkui Gu, 29 Apr 2025
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2024-4164/egusphere-2024-4164-AC4-supplement.pdf
Viewed
| HTML | PDF | XML | Total | BibTeX | EndNote |
|---|---|---|---|---|---|
| 787 | 129 | 26 | 942 | 24 | 37 |
Zhenkui Gu
Xin Yao
Xuchao Zhu
RC1: 'Comment on egusphere-2024-4164', Anonymous Referee #1, 31 Mar 2025
The authors present a methodology for assessing debris flow susceptibility over a large scale (study area size ~500,000 km²). The new method aims to focus heavily on the source-sink dynamics of debris flows rather than on a more classical approach based on surface characteristics. In my opinion, this, and especially the inclusion of stream power, is innovative. Generally, the manuscript is well written and concise; I therefore recommend the research for publication. The manuscript does, however, require better descriptions of multiple aspects of the methodology, for which I recommend revisions. Below I first list my most pressing issues, which require clarification. These are followed by specific comments and miscellaneous points. I’m happy to engage with the authors when they have answers or follow-up questions to my points.
Major points
To aid the reader, I think the methods would benefit from a table listing the input factors to the Bayesian model. This table should include a description, the resolution (if applicable), a reference to where the factor is detailed in the manuscript, and preferably a range of values.
The authors choose to focus on indicators of the source-to-sink characteristics of debris flows and mention that ignoring these characteristics results in low accuracy (without citing a source, which should be addressed as well). From the abstract:
“Due to its vast area and the complex mechanisms underlying debris flow formation, using slope-based indicators alone to assess susceptibility, without considering the "source-sink" process of debris flow formation, results in low accuracy in susceptibility evaluations.”
I think the method in the manuscript, which neglects other factors of importance, could be partially responsible for their own relatively low accuracy. Factors such as vegetation, lithology and soil transmissivity (also mentioned by the authors for classical approaches, line 68) are what come to mind. I think deliberately neglecting these factors bends the aim of the manuscript from an overall debris flow hazard indicator to introducing a specific source-sink process-based method. This is still innovative and interesting, but I think the authors should mention their choices in this regard more explicitly at the end of the introduction and in the methods.
The resolution of the analysis is unclear to me. The manuscript regularly mentions a minimum valley length of 200 m. With a DEM resolution of approximately 30 m, this would be 6-7 grid cells. Can you reliably estimate your inputs and apply all your functions, which often require upstream and downstream values, for such short river reaches? After reading the ‘data availability’ section, I wonder why the data is at 90 m and not at the DEM resolution. This is the first point at which I read that the analysis is not at the DEM resolution. This, and any resampling, should be mentioned in the methods. The issue with the valley length applies even more strongly if the analysis was performed at 90 m. If I am missing something, please let me know; otherwise, I think the authors should be clearer on this issue in their methods.
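The cell counts behind this concern can be checked with a short sketch. It assumes a reach aligned with the grid axes (a diagonal reach would span fewer cells), and the function name is illustrative only; the 200 m, 30 m, and 90 m figures are the ones quoted in the comment above.

```python
def cells_along_reach(reach_length_m: float, cell_size_m: float) -> float:
    """Number of grid cells spanned by a reach, assuming it runs parallel
    to a grid axis (an assumption made for this sketch)."""
    return reach_length_m / cell_size_m


MIN_VALLEY_LENGTH_M = 200.0  # minimum valley length quoted in the review

for cell_size_m in (30.0, 90.0):
    n_cells = cells_along_reach(MIN_VALLEY_LENGTH_M, cell_size_m)
    print(f"{cell_size_m:.0f} m grid: about {n_cells:.1f} cells per 200 m reach")
```

At 90 m, a 200 m reach covers only about two cells, which leaves little room for functions that need distinct upstream and downstream values.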
Specific comments
Line 84/85. These numbers are suspicious. A length of 2316 km and an average gradient of 1.45% (~0.8°) yield a vertical distance of ~33 km (length * tan(slope)). This is not very realistic. Am I missing something?
Line 105. For the ECMWF ERA5 data, a description of the resolution, duration, and related uncertainty is required, as well as a reference.
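The arithmetic in this comment can be reproduced with a minimal sketch. The length and gradient are the values quoted above; the function names are illustrative. The per-mille line is a purely hypothetical comparison showing how sensitive the result is to the unit. Neither the manuscript excerpt nor the review states that the figure is a per-mille value.

```python
import math


def elevation_drop_km(length_km: float, gradient: float) -> float:
    """Vertical drop for a horizontal length and a gradient given as rise/run."""
    return length_km * gradient


def gradient_to_degrees(gradient: float) -> float:
    """Slope angle in degrees for a gradient given as rise/run."""
    return math.degrees(math.atan(gradient))


LENGTH_KM = 2316.0  # river length quoted in the review

# Reading the gradient as a percentage, as the referee does.
drop_percent_km = elevation_drop_km(LENGTH_KM, 1.45 / 100)
angle_deg = gradient_to_degrees(1.45 / 100)

# Hypothetical alternative reading (assumption): the same number in per mille.
drop_permille_km = elevation_drop_km(LENGTH_KM, 1.45 / 1000)

print(f"1.45 %  -> angle {angle_deg:.2f} deg, drop {drop_percent_km:.1f} km")
print(f"1.45 per mille (hypothetical) -> drop {drop_permille_km:.2f} km")
```

A tenfold difference in the unit changes the implied drop from tens of kilometres to a few kilometres, which is why the unit of the reported gradient matters here.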
Line 112. What threshold is used to define flat?
Section 3.2. Why did the authors specifically choose a Naïve Bayesian model? And not for instance logistic regression or a random forest model? This choice should be clarified.
Line 136 Li
Line 148. Higher than 0
Line 150-160. Text doesn’t read well. Could benefit from some critical rewriting.
Line 159. 10^4, I hope. Is this a ‘guesstimate’ or did you calibrate? Be specific about how this was chosen. Also give the values of the a and b fitting parameters.
Line 220. Looks better when the formula is fitted on one line.
Line 231. Correct me if I’m wrong, but I don’t understand your statements. In Radoane et al. (2003), four functions (linear, exponential, log and power) are calibrated and tested to see which best describes the longitudinal profile of various rivers. They don’t mention “progressions through stages of function curves”. Clarify what you’re trying to say here.
Line 242. Having done the GIS analysis, don’t you have an exact number for how many valleys?
Figure 5c. Add stream power units to the second axis on the right.
Figure 5f. The caption should be ‘Photo by one of the authors’, as there are multiple authors of the manuscript.
Figure 6. Why this threshold for high-energy valleys?
Line 252. Any reason for this thresholding?
Line 253. Does clay correlate with erodibility?
Figure 7 misses a legend on the right.
Line 269. Why is that suggested?
Line 274/275. Are any of these statements statistically significant given the timeframe of 10 years?
Figure 10. The spatial pattern of the occurrence probability appears related to the extreme precipitation, as one would expect. But this means the interpolation pattern is clearly visible. This makes me wonder whether your rainfall data, SPI, really gives full spatial coverage or whether it is an interpolation from specific sites.
Line 281: Move the accuracy to section 4.5. This is where you present the model verification.
Figure 11. How is the relative importance calculated?
Figure 12/13 (and in the conclusions). I appreciate the authors highlighting the impact a disaster like a debris flow can have. It stresses the importance of this type of research. I do wonder, though, how relevant one data point is when presenting a province-scale model, and I would reduce the prominence of this description. I think it would be more interesting to have an overview of all the debris flows the authors collected and their occurrence probability. This would also help in interpreting Figure 11.
Section 4.5. Is this the same dataset that was used to train the model, i.e., was it split 70/30 and then used to compute the accuracy? If so, the dataset should be mentioned and described in the methodology, not here.
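For readers unfamiliar with the procedure this comment asks about, a 70/30 hold-out evaluation of a naive Bayes classifier can be sketched as follows. This is not the authors' model or data: the two features are synthetic, the Gaussian likelihoods are an assumption made for the sketch, and all function names are hypothetical.

```python
import math
import random


def make_synthetic(n, seed=0):
    """Hypothetical two-feature samples; label 1 has higher feature means."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.5 * label
        data.append(([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)], label))
    return data


def split_70_30(data, seed=0):
    """Shuffle a copy of the data and split it 70 % train / 30 % test."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit(train):
    """Estimate class priors and per-feature Gaussian parameters."""
    model = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(train), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior, treating the
    features as conditionally independent (the 'naive' assumption)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total
    return max(model, key=log_posterior)


def accuracy(model, test):
    """Fraction of held-out samples that are classified correctly."""
    return sum(predict(model, x) == y for x, y in test) / len(test)


data = make_synthetic(1000)
train, test = split_70_30(data)
model = fit(train)
held_out_accuracy = accuracy(model, test)
print(f"train={len(train)}, test={len(test)}, accuracy={held_out_accuracy:.2f}")
```

The point of the split is that accuracy is computed only on samples the model never saw during fitting, which is why the referee asks for the underlying dataset to be described in the methodology.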
Figure 13 b/f: The red text in the figure is unreadable.
Line 305. Was the actual event also a medium-scale debris flow according to your classification?
Line 323-326. I don’t follow this reasoning. How is the observation timescale relevant to the distribution of precipitation event intensity? Isn’t the goal of a statistical analysis precisely not to have this effect? Am I misunderstanding something?
Line 351. The higher CC-scaling for high-altitude areas. Does that also hold in your specific study area, where you state that it’s the higher elevated areas that are the driest?
Line 365. Minimal interannual variability in the basin. How do you reconcile this with your own time series, Fig. 9, showing large interannual variation, at least in extreme precipitation events?
Line 370. Source?
Line 376. Source?
Line 385-390. Sources?
Line 450. The distribution of the valleys? Like spatially?
Author contributions. I’m surprised to read about a campaign and measurements. Isn’t the work for the most part a GIS and data-gathering exercise?
Technical corrections
Overall, there are small spelling errors and absent/double spaces. Give the text a thorough review for these.
Figures. Written text such as labels and axis titles is often uncomfortable to read due to the small font.
Citation: https://doi.org/10.5194/egusphere-2024-4164-RC1