Multi-Data Source Based Quantifying Urban Flood Severity in Major Chinese Cities (2000–2024) Using a Hybrid Machine-Learning Weighting Framework

Peng, Guoqiang; Sun, Yujing; Wang, Mao; Chen, Min; Zhang, Fengyuan

doi:10.5194/egusphere-2026-2501

Preprints

https://doi.org/10.5194/egusphere-2026-2501

22 May 2026

Abstract. Urban flooding poses a major challenge to sustainable urban development, yet most existing assessments focus on single cities or river basins and rely on limited historical records. This study integrates multi-source data from 20 Chinese cities over 2000–2024 to develop a comparable long-term assessment of urban flood severity. To address the fragmentation and inconsistency of flood evidence across official records, news reports, and social media, we construct an event-level database and derive a Flood Severity Index (FSI) using an interpretable data-driven weighting and ensemble framework. Robustness is evaluated through repeated resampling and consistency checks across cities and years. The results show that southern cities experience more frequent and severe flooding, whereas northern cities are generally less affected but more vulnerable to abrupt extremes. These findings suggest distinct governance priorities: reducing chronic exposure in southern cities and strengthening preparedness for high-impact shocks in northern cities. The proposed framework is transferable to other regions and provides a basis for future cross-regional flood risk comparison and adaptive urban risk governance.

Received: 30 Apr 2026 – Discussion started: 22 May 2026

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on egusphere-2026-2501', Anonymous Referee #1, 12 Jun 2026
The review of this paper is relatively brief as I focus on the main methodological flaws.
The methodology of collecting data is described much too briefly. It would not be possible to replicate this method.

Some of the timelines of the flood severity index are almost constant. For example, cities like Chengdu, Shenzhen and Guangzhou are affected by the same severity of floods almost every year in the proposed index. This is unlikely to be the case in reality (or the index is too sensitive to some input parameters). However, because the methodology of constructing the index is not described exactly, this cannot be checked.

The timeline of flood events for each city is based on a number of different sources, some of which are not available for part of the study period, so the datasets may be biased over time. While this is acknowledged in the discussion, it should be addressed in the methodology.

It is unclear how the target data are constructed beyond the use of a quantile-based approach. The targets appear to be derived from the same data that are used for training, in which case the ML algorithms simply rediscover the linear combination that defines the FSI. This would also be consistent with the extremely high correlation between the various machine learning models.
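The circularity concern can be illustrated with a minimal sketch. It assumes, purely for illustration, that the training target is a weighted sum of the same five indicators that serve as model inputs; the synthetic data, the `prior_weights` values, and the helper `solve_linear_system` are hypothetical and are not taken from the manuscript.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


random.seed(0)

# Synthetic stand-in for 200 events x 5 scaled indicators (not the authors' data).
X = [[random.random() for _ in range(5)] for _ in range(200)]

# Hypothetical weights used to build the index that then becomes the target.
prior_weights = [0.30, 0.25, 0.20, 0.15, 0.10]
fsi_target = [sum(w * x for w, x in zip(prior_weights, row)) for row in X]

# Ordinary least squares via the normal equations (X^T X) beta = X^T y.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(5)] for i in range(5)]
xty = [sum(row[i] * y for row, y in zip(X, fsi_target)) for i in range(5)]
learned_weights = solve_linear_system(xtx, xty)

print([round(w, 6) for w in learned_weights])
```

Under this assumption the fitted weights reproduce the prior weights up to floating-point error, so a near-perfect fit, or near-perfect agreement between models, would say nothing about flood severity itself.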

The data and methods section mentions several datasets that don’t seem to be used in the manuscript (although the DEM may be used for visualization in a figure). Other datasets used in the methodology are not mentioned.

Hydrological or climatological data seem key in predicting flood severity but they don’t seem to be used in the methods.

The discussion and especially conclusions make very general points that are mostly based on existing research rather than the described methodology.
Citation: https://doi.org/10.5194/egusphere-2026-2501-RC1
RC2: 'Comment on egusphere-2026-2501', Anonymous Referee #2, 18 Jun 2026

General comments:
In this manuscript, Peng et al. tackle the complex challenge of assessing urban flood risks across diverse geographical and developmental landscapes. The authors attempt to bridge the gap between fragmented historical records and modern datasets by integrating official statistics, news reports and social media into an event-level database. While the study’s objective to construct an interpretable, data-driven Flood Severity Index is timely and highly relevant to urban resilience planning, the execution of this framework warrants careful examination. Specifically, a number of methodological uncertainties in the research design and a lack of granular transparency in the methodology undermine the robustness of the study’s otherwise compelling spatial and temporal claims.
The methodology section frequently states what was done without adequately explaining how it was achieved, and a thorough revision is necessary for the study to be truly replicable. For instance, the authors claim to resolve missing disaster attributes such as economic losses and affected populations by imputing them “using statistical methods”, yet they fail to specify the actual techniques employed, leaving the foundational dataset vulnerable to undisclosed biases. Similarly, the study heavily relies on a Python-based web crawler and NLP to filter 14 years of Sina Weibo posts, but the manuscript omits the specific algorithms or semantic recognition tools used to isolate valid flood events from the noise. Furthermore, while the authors deploy six different ML algorithms to build their hybrid weighting system, the text lacks critical details regarding hyperparameter tuning or dataset splitting beyond standard cross-validation. Again, this vagueness prevents replicability and makes it difficult to rigorously verify if the authors successfully implemented the data-driven framework they claim.
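Why the unspecified imputation technique matters can be seen in a small sketch. The loss values below are invented, and mean and median filling are only two candidate techniques; nothing here is claimed to be the authors' procedure.

```python
from statistics import mean, median

# Hypothetical reported economic losses (arbitrary units); None marks a missing record.
losses = [2.0, 3.0, 1.5, 4.0, 250.0, None, None]

observed = [v for v in losses if v is not None]


def impute(values, fill):
    """Replace missing entries with a single fill value."""
    return [fill if v is None else v for v in values]


mean_filled = impute(losses, mean(observed))      # pulled upward by the one extreme event
median_filled = impute(losses, median(observed))  # robust to the extreme event

print(mean_filled[-1], median_filled[-1])
```

With one extreme event among the observed records, the two fill values differ by more than an order of magnitude, so any index built on the imputed column inherits that choice.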
The authors identify the period from 2010–2020 as a “transitional phase of adjustment” in urban flood severity and attribute this shift to changes in infrastructure and flood management. However, the underlying flood database undergoes a major methodological change at approximately the same time. Prior to 2010, the database relies primarily on official records and documentary sources, whereas from 2010 onward it additionally incorporates large volumes of social media data from Sina Weibo. This change in data availability and reporting intensity may substantially alter event detection rates, event characterization, and the online-attention component of the FSI itself. Consequently, the apparent temporal transition identified by the authors may partly reflect an observational artifact rather than a genuine shift in flood severity. The manuscript should explicitly evaluate the sensitivity of the results to this data-source discontinuity and demonstrate that the observed temporal phases remain robust when social-media-derived information is excluded or otherwise standardized across the study period.
The FSI is constructed using absolute values of affected area, affected population, economic losses, and fatalities. However, the study compares cities that differ substantially in population size, urban extent, and economic activity. Larger cities will naturally tend to report greater numbers of affected people and larger economic losses even when the relative severity of flooding is comparable. The manuscript does not explain whether these indicators were normalized by population, urban area, GDP, or other exposure metrics prior to index construction. Min-max normalization alone does not resolve the issue, as it rescales variables without accounting for differences in underlying exposure. As a result, it is difficult to determine whether the resulting rankings reflect flood severity or simply differences in city size and socioeconomic scale. The authors should justify the use of absolute indicators or evaluate the sensitivity of their results to appropriate normalization procedures.
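The difference between rescaling and exposure normalization can be sketched as follows. The three cities and their figures are invented for illustration, and dividing by resident population is only one possible exposure metric.

```python
def min_max(values):
    """Rescale values linearly to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def ranking(scores, names):
    """Return names ordered from highest to lowest score."""
    return [name for _, name in sorted(zip(scores, names), reverse=True)]


# Hypothetical cities with invented figures.
cities = ["A", "B", "C"]
affected_people = [500_000, 120_000, 60_000]
population = [20_000_000, 2_000_000, 600_000]

# Min-max scaling is monotonic, so the ranking of absolute counts is unchanged.
absolute_rank = ranking(min_max(affected_people), cities)

# Dividing by exposure (here population) before scaling can reorder the cities.
per_capita = [a / p for a, p in zip(affected_people, population)]
relative_rank = ranking(min_max(per_capita), cities)

print(absolute_rank, relative_rank)
```

In this toy case the largest city ranks first on absolute counts and last on a per-capita basis, which is the kind of sensitivity check requested above.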
Finally, the inclusion of online popularity as one of the five core components of the FSI requires substantially stronger theoretical justification. Unlike affected area, affected population, economic losses, and fatalities, online popularity is not a direct measure of flood impacts but rather a measure of public attention and information dissemination. Incorporating online popularity directly into the severity index risks conflating disaster impacts with reporting behaviour. The authors should clearly justify why public attention is treated as a component of flood severity rather than as an auxiliary explanatory variable and should evaluate how sensitive the resulting severity rankings are to the inclusion or exclusion of this indicator.
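One way to run the requested inclusion/exclusion check is a leave-one-indicator-out comparison of rankings. The sketch below assumes equal weights and invented indicator values, and uses a tie-free Spearman rank correlation written by hand; none of these choices comes from the manuscript.

```python
def ranks(values):
    """Rank positions (1 = largest); assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(a, b):
    """Spearman rank correlation for tie-free data."""
    n = len(a)
    d2 = sum((ra - rb) ** 2 for ra, rb in zip(ranks(a), ranks(b)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def composite(rows, weights):
    """Weighted average of indicators, renormalized by the weight total."""
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, row)) / total for row in rows]


# Hypothetical scaled indicators per city-year:
# [area, population, losses, fatalities, online popularity]
rows = [
    [0.9, 0.8, 0.7, 0.6, 0.1],
    [0.3, 0.4, 0.3, 0.2, 1.0],
    [0.5, 0.5, 0.6, 0.4, 0.1],
    [0.1, 0.2, 0.1, 0.0, 0.2],
]
weights = [0.2, 0.2, 0.2, 0.2, 0.2]  # placeholder equal weights

full = composite(rows, weights)
without_online = composite([row[:4] for row in rows], weights[:4])
rho = spearman(full, without_online)

print(ranks(full), ranks(without_online), rho)
```

A rank correlation well below 1, or reordering among the top-ranked cases as in this toy example, would indicate that public attention rather than physical impact is driving part of the ranking.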
Specific comments:
The authors state that missing values for crucial impact metrics (e.g., economic losses, affected population) were “imputed using statistical methods”. They need to explicitly identify which methods were used (e.g., mean imputation, multiple imputation, KNN), as imputing extreme variables can severely skew the dataset.
The authors mention applying NLP techniques for semantic recognition and noise reduction. However, the manuscript completely omits the specific algorithms, models, or libraries used to achieve this, making the data cleaning process, again, irreproducible.
While six machine learning models are evaluated, the manuscript lacks critical details regarding hyperparameter tuning, optimization strategies, and train/test splitting beyond a standard 10-fold cross-validation.
In the main text the FSI is correctly defined as a linear combination of five indicators; however, in the flowchart (Fig. 2) the formula is incorrectly typed as a summation of only three variables with incorrect subscripts. The figure must be corrected to accurately reflect the five variables discussed in the text.
There is a typo in the indexing for the AHP index calculation. The summation index is defined as i = 1, but the variables inside the summation use the subscript l. This should be corrected so the subscripts match the index (i.e. using i throughout).
The manuscript provides conflicting descriptions of the role of ML. Section 3.3 defines continuous and categorical FSI variables for supervised learning, while Section 3.5 states that machine learning is not used to predict predefined labels but only to derive SHAP-based weights. The authors should explicitly clarify the target variable(s) used during model training and explain how SHAP values were derived without introducing dependence on the originally constructed FSI.
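For the two notation comments above (the Fig. 2 formula and the mismatched summation subscripts), a consistently indexed form of a five-indicator linear index might look like the following. The symbols $w_i$ and $x_i$ and the sum-to-one constraint are assumptions made here for illustration; the manuscript's own notation is not reproduced on this page.

```latex
\mathrm{FSI} = \sum_{i=1}^{5} w_i \, x_i ,
\qquad \sum_{i=1}^{5} w_i = 1 ,
\qquad w_i \ge 0
```

Here $x_1, \dots, x_5$ would stand for the scaled affected area, affected population, economic losses, fatalities, and online popularity, and the same index $i$ appears on the summation sign and on every term inside it.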

Citation: https://doi.org/10.5194/egusphere-2026-2501-RC2

Short summary

This study combines official records, news reports, social media posts, and search data to track urban flood events in twenty major Chinese cities from 2000 to 2024. It builds a comparable database and uses an interpretable machine learning framework to measure flood severity across cities and years. Results show a clear north-south contrast: southern river and coastal cities face frequent severe floods, while northern cities usually face lower risk but suffer sudden extremes during rare storms.

