An optimized LSTM-based approach applied to early warning and forecasting of ponding in the urban drainage system

Zhu, Wen; Tao, Tao; Yan, Hexiang; Yan, Jieru; Wang, Jiaying; Li, Shuping; Xin, Kunlun

doi:https://doi.org/10.5194/egusphere-2022-874

Preprints

04 Oct 2022

Abstract. An optimized LSTM-based approach applied to early warning and forecasting of ponding in the urban drainage system is proposed in this study. This approach can identify locations and process of ponding quickly with relatively high accuracy. The model is constructed with two tandem processes and a multi-task learning mechanism is introduced. The results are compared with those of widely used neural networks (LSTM, CNN) to validate its advantages. Then, the model is revised with available monitoring data in the study area to achieve higher accuracy, and the influence of the number of the monitoring points selected on the performance of the corrected model is also discussed in this paper. Over 15000 designed rainfall events are used for model training, covering a diversity of extreme weather conditions.

Received: 02 Sep 2022 – Discussion started: 04 Oct 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Download & links

Preprint (PDF, 3224 KB)

Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.

Journal article(s) based on this preprint

26 May 2023

An optimized long short-term memory (LSTM)-based approach applied to early warning and forecasting of ponding in the urban drainage system

Wen Zhu, Tao Tao, Hexiang Yan, Jieru Yan, Jiaying Wang, Shuping Li, and Kunlun Xin

Hydrol. Earth Syst. Sci., 27, 2035–2050, https://doi.org/10.5194/hess-27-2035-2023, 2023

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2022-874', Anonymous Referee #1, 29 Dec 2022

This paper proposes an optimized LSTM-based model for early warning and forecasting of ponding in the urban drainage system. It can identify flooding locations and the ponding process quickly and with relatively high accuracy. The research ideas and methods are innovative.

The issues are listed as follows:

- My main concern about this paper is related to the case area. The authors said "(Due to these structural characteristics) the performance of the model will not be limited by the size of the case area", but they only applied the proposed method to a small-scale case area (a residential district of 6.128 hm²). I think it would be necessary to explain the capability of the proposed method.

- Section 2.4.2 (Eq. 5): Why did you use this formula to design rain intensity? Is this the design formula used by the municipality (i.e. a routine in China), or something else? Please specify.

- What is Pilgrim & Cordery? Any equations?

- Please show equations to explain how you added the noise, as the description is not clear enough.

- Why are there only 5 real-world rainfall events to verify the performance of the corrected model? Is this enough, considering that you have 16960 synthetic rainfall events?

- It is recommended to add HESS articles to the references.

Citation: https://doi.org/10.5194/egusphere-2022-874-RC1
- AC1: 'Reply on RC1', Zhu Wen, 20 Jan 2023
  
  Dear Reviewers,
  Thank you very much for your time involved in reviewing the manuscript and your very encouraging comments on the merits. We also appreciate your clear and detailed feedback and hope that the explanation has fully addressed all of your concerns.
  Please see attachment for details.
  Sincerely,
  The Authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-874-AC1
RC2: 'Comment on egusphere-2022-874', Anonymous Referee #2, 20 Feb 2023

The authors proposed an LSTM-based emulator to simulate the ponding process in the drainage system, which is critical to urban flooding study. The emulator is composed of two LSTM models to sequentially simulate node lateral flows and the ponding volume, followed by a correction model. The proposed emulator was successfully applied to a case study and showed superior performance over some simplified versions (e.g., a lumped model using LSTM/CNN). I appreciate the hard work that has been put in by the authors. However, I have the following concerns which might require further revision before the manuscript can be accepted.
First of all, I had a hard time following the manuscript. Readability is critical to a renowned journal such as HESS. The current status of the manuscript does not meet the requirement. For example, there are a lot of run-on sentences. A rule of thumb is that the length of a sentence does not exceed two lines. Coherence is also an issue. Many sentences are 'loosely' connected in a logical sense. It would be a pity if the message is not clearly communicated while so much work has been done. I suggest the authors further greatly revise the language (a professional English editor might help in this case).
The second issue is associated with the model CR of the LSTM-based emulator (btw, what is CR abbreviated for?). I don't quite understand the descriptions of the model CR (i.e., L153-166). Nor is Figure 6 illustrative to me. Do the monitoring data refer to the measured lateral flows at the monitored nodes? Is the correction model trained on pairs of simulations and monitored measurements or based on a pre-trained mapping (i.e., using transfer learning)? Please specify.
The last concern is the mass balance of the emulator. Though not an expert in urban drainage systems, I consider that mass conservation plays a key role in balancing the water exchanges between nodes. Does the proposed LSTM model account for that? If not, please specify the reason for not doing this.
Other minor edits:
L37-40: The authors point out the importance of the dataset. I'm wondering whether the author performed a sort of convergence test to evaluate how much data is sufficient for the proposed LSTM emulator training.
L50: 'not discussed' --> 'not explored'; 'not available' --> 'not feasible'
L77: 'influencing' --> 'influential'
L82: 'MAE, MSE, CC, NSE' --> We usually put full names before abbreviations
Figure 2 caption: '... test process in the runoff process' --> '... test procedures in developing the LSTM-based runoff emulator'. Also, many captions are too brief to provide enough information about these complicated figures.
Figure 3: For each of the two emulated processes, is only one LSTM used for all nodes? Or, is a separate LSTM used for each node?
L91-93: That's a super long sentence and there are a lot!
L100-102: Are the classification module and OUT_MODULE also two MLPs?
L105-106: I don't understand which layer in the LSTM module is shared by the classification and out modules.
L116-119: To evaluate the impact of the Gaussian filter, is there a comparison between the current emulator and one without the Gaussian noising procedure?
Eqs(1)-(4): I suggest moving the calculation of the error term to the appendix to improve the readability.
Eq.(5): What is 'lg'? Please use 'log' if you mean logarithm operation.
L195: 'tb and ta is' --> 'ta and tb are'
L197: Is Pilgrim & Cordery a reference? If yes, please provide the year.
L226: I like the usage of hyperopt here.
L229-231: missing subjects of the two sentences.
Table 3: What are the optimal hyperparameters of the MLP used for model CR? i.e., the number of neurons in each layer and the number of hidden layers. How about the hyperparameters of the classification and out modules?
L274: Why are these six nodes selected? (also shown in Figure 9)
Figure 10: It is the emulated ponding volume before the model correction or CR, right? If yes, why is it different from the lines labeled by 'Before updating' in Figure 11?
L336-341: Should these sentences be grouped into one paragraph?
Figure 15: combining (a) and (b)?
L338: 'In a summary' --> 'In summary'

Citation: https://doi.org/10.5194/egusphere-2022-874-RC2
- AC2: 'Reply on RC2', Zhu Wen, 13 Mar 2023
  
  Dear Reviewers,
  Thank you very much for your time involved in reviewing the manuscript and your very encouraging comments on the merits. We also appreciate your clear and detailed feedback and hope that the explanation has fully addressed all of your concerns.
  Please see attachment for details.
  Sincerely,
  The Authors
  
  Citation: https://doi.org/10.5194/egusphere-2022-874-AC2

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

ED: Reconsider after major revisions (further review by editor and referees) (15 Mar 2023) by Yue-Ping Xu

AR by Zhu Wen on behalf of the Authors (22 Mar 2023): Author's response, Author's tracked changes, Manuscript

ED: Referee Nomination & Report Request started (31 Mar 2023) by Yue-Ping Xu

RR by Anonymous Referee #2 (07 Apr 2023)

Suggestions for revision or reasons for rejection

I would like to thank the authors for taking the effort in revising the manuscript to address my comments. Though great revisions have been made, I have one more comment and some minor language revisions to suggest. I hope some of the edits would help the authors' future work.

Section 2.3. -- Model correction: I still don't quite follow how the correction model has been developed, and there might be a miscommunication issue. Did you leverage an existing pre-trained model from Pan et al (2010) to perform the correction here (L143)? If yes, how applicable is this pre-trained model to this study? Also, the sentence 'the model CR is trained based on a pre-trained mapping from X to Y' (L151) is confusing. It doesn't tell whether the CR model uses a trained model from another study (for the purpose of transfer learning) or is trained separately in this work. If it is trained separately, which I suppose was after the development of the two LSTM modules, why did you call it 'transfer learning'?

Other minor edits in the introduction section:
L32: 'has-' --> 'has'
L33: I wouldn't call "deep learning as a form of training". Deep learning is a particular machine learning technique that leverages neural networks to learn nonlinear relationships from a dataset.
L34: 'And unlike' --> 'Like'. Please avoid using 'and' in the beginning of the sentence, which is informal. (There are multiple cases throughout the manuscript. Please double check)
L38: 'some factors need to be improved' --> 'there are opportunities to further the application of deep learning ...'
L39: 'the dataset for training' --> 'the training dataset'
L39-40: 'There are studies utilizing deep learning algorithms for urban flood forecasting, but the developed model is trained on a small number of samples.' --> 'Many studies in urban flood forecasting only use a small number of samples to develop the deep learning models.'
L42-44: 'Secondly, monitoring equipment is expensive and thus not frequently available. Therefore, researchers have to rely on simulations produced from hydrodynamic models, however, often without considering the accuracy of the models.' --> 'Secondly, due to the high cost of monitoring equipment, researchers usually have to rely on unvalidated simulations produced from hydrodynamic models.'
L47-48: "Such as ..." --> "Example includes but not limited to ..."
L54: "we propose an optimized LSTM-based approach, which is applied to early warning" --> "we propose an optimized LSTM-based approach for early warning ..."
L57: "(LSTM, CNN)." --> ", i.e., LSTM and CNN."
L58: "to achieve higher accuracy" --> "to improve the emulation performance"

RR by Anonymous Referee #1 (11 Apr 2023)

ED: Publish subject to minor revisions (review by editor) (17 Apr 2023) by Yue-Ping Xu

AR by Zhu Wen on behalf of the Authors (20 Apr 2023): Author's response, Author's tracked changes, Manuscript

ED: Publish as is (04 May 2023) by Yue-Ping Xu

AR by Zhu Wen on behalf of the Authors (04 May 2023)

Viewed

Total article views: 506 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
335	156	15	506	5	4

Views and downloads (calculated since 04 Oct 2022)

Month	HTML	PDF	XML	Total
Oct 2022	139	45	6	190
Nov 2022	18	14	0	32
Dec 2022	21	18	2	41
Jan 2023	30	12	3	45
Feb 2023	58	20	2	80
Mar 2023	36	18	2	56
Apr 2023	25	22	0	47
May 2023	8	7	0	15
Jun 2023	0
Jul 2023	0
Aug 2023	0
Sep 2023	0
Oct 2023	0
Nov 2023	0
Dec 2023	0
Jan 2024	0
Feb 2024	0
Mar 2024	0
Apr 2024	0
May 2024	0
Jun 2024	0
Jul 2024	0
Aug 2024	0
Sep 2024	0

Viewed (geographical distribution)

Total article views: 497 (including HTML, PDF, and XML) Thereof 497 with geography defined and 0 with unknown origin.

Latest update: 04 Sep 2024

Short summary

To enable early warning and forecasting of ponding in the urban drainage system, an optimized LSTM-based model is proposed in this paper. It shows a remarkable improvement over models based on LSTM and CNN structures. The performance of the corrected model is reliable if the number of monitoring sites is over one per hectare. Increasing the number of monitoring points further has little impact on the performance.