the Creative Commons Attribution 4.0 License.
Brief Communication: Rise of the Guadalupe River – A Multifaceted Post-Event Analysis of the July 4, 2025, Flood in Central Texas
Abstract. The flash flooding across Central Texas on July 4th, 2025, caused more than 130 fatalities and property losses exceeding 20 billion dollars. The objective of this study is to diagnose the drivers of this catastrophic event and to analyze the temporal variability in forecasting flood inundation dynamics in the hours leading up to and during the event. Using the Operational National Water Model short-range streamflow forecast product, we generated 306 forecasted flood inundation maps between July 3rd and July 4th, 2025. For evaluation, we constructed an inundation extent benchmark derived from USGS high water marks. Both impact-based and pixel-based assessments are presented.
Status: open (until 14 May 2026)
- RC1: 'Comment on egusphere-2026-1847', Anonymous Referee #1, 16 Apr 2026
- RC2: 'Comment on egusphere-2026-1847', Anonymous Referee #2, 14 May 2026
The paper presents a comprehensive comparison of precipitation, discharge, and flood inundation forecasts for a specific flash flood event in the Guadalupe River basin in Central Texas. The study evaluates forecasts against gauge observations, high-water-mark-derived flood maps, and building exposure datasets, enabling the identification of uncertainty propagation throughout the forecasting chain.
The manuscript addresses an important topic, namely understanding the sources of forecast failure during extreme flood events. The multifaceted evaluation framework is valuable, particularly the attempt to propagate forecast uncertainty all the way to flood impacts rather than limiting the analysis to precipitation or discharge verification alone.
However, the manuscript would benefit from a clearer articulation of its novelty and stronger emphasis on the lessons learned from this multifaceted framework compared to a precipitation-only or discharge-only evaluation framework. Figure consistency and presentation also need substantial improvement to make comparisons between lead times and locations clearer.
Overall, I believe the paper has potential, but considerable revisions are needed before publication.
Main points to improve are:
- The motivation for diagnosing failed forecasts is clear, but the manuscript does not sufficiently explain the novelty of the study. First of all, in the abstract it is stated that the objective of the study is to diagnose the drivers of this catastrophic event. While this is important, this does not seem like the main objective nor does it highlight novelty. Currently, one of the main conclusions appears to be that shorter lead-time forecasts produce better precipitation, discharge, and flood predictions. While expected and useful to quantify, this is not in itself novel. The more novel aspects seem to be propagating uncertainty through the full forecasting chain toward impacts, and the use of post-event USGS high-water marks to generate a spatially continuous benchmark flood inundation map. However, the manuscript should more explicitly explain what additional insights are obtained from evaluating flood impacts rather than only discharge forecasts, and why the inundation analysis materially improves understanding of forecast failure. Currently, it is unclear whether the impact-based analysis provides substantially more information than a direct comparison of forecasted versus observed discharge. One important advantage that should be emphasized more clearly is that high-water-mark-derived flood extents may enable evaluation in locations where gauge observations are unavailable or failed during the event. Relatedly, propagating uncertainty to impacts can help place forecast errors into societal context, which is currently only implicitly addressed. The novelty statement in the Introduction and Conclusion should therefore be strengthened considerably.
- The manuscript should clarify whether forecast deficiencies were actually a primary driver of the impacts during this event. Currently, it remains unclear why this specific event was selected, whether losses were mainly caused by forecast shortcomings, or whether other factors such as exposure, warning dissemination, evacuation timing, infrastructure vulnerability, and emergency response played larger roles. Providing this context is important to justify the relevance of diagnosing forecast uncertainty for this case study. In addition, the Conclusions should acknowledge that reducing or quantifying forecast uncertainty does not necessarily translate directly into reduced impacts or losses. Other uncertainties remain present throughout the forecasting chain, including those associated with the simplified inundation modelling approach, exposure estimation, warning interpretation, and early-action decision making. In many cases, deficiencies in communication and response may contribute more strongly to disaster impacts than forecast uncertainty itself. Explicitly discussing these broader limitations would provide a more balanced perspective on the practical implications of the study and on where improvements in risk reduction efforts may be most effective.
- There are several inconsistencies between the figures that make cross-comparison unnecessarily difficult. In addition, the overall figure design and readability should be improved throughout the manuscript. Specific suggestions are provided below.
Suggestions to improve figures:
- Figure 1
- Fig 1c: I suggest using different colours for rain gauge stations and streamflow gauges (currently both appear to use the same colour).
- Figure 2
- Regarding panel e, it is confusing that it does not have any USGS gauge readings and no return periods. This “station”(?) is also not shown in Figure 1 panel c and therefore leads to confusion. The subtitle is also not informative, and the time interval and coverage differ from the other panels. Generally, it could be an idea to use a colour scale that intuitively changes with lead time (for example, increasing colour intensity), as certain lead times currently have the same colour and cannot be distinguished. The time range could also be shortened to show only the relevant data. This might allow the panel to be a bit larger and the data to be more visible.
- It is confusing that L109-113 states the USGS gauge data near Hunt failed during the fast-rising streamflow (July 4th to July 5th), whereas panel b of Figure 2 does show USGS gauge data for Hunt for the period from July 4th to 5th. The only failed gauge in Figure 2 seems to be Camp Mystic (“panel e”). It is not clear how the accuracy of the forecast for a specific gauge location can be tested when observational data are missing (for which the HWM estimates seem to be the solution, if I understand correctly).
- Figure 3
- Panel numbering is missing for the upper map. In this panel, please add the name of the fifth purple circle and explain that this is Camp Mystic (I assume). Also, please center the purple circle for the gauge at Hunt in the top map.
- Please use the same ordering of gauge names as in Figure 2; e.g., in Fig 2, panel e is Camp Mystic, whereas here (Fig 3) it is panel b. I prefer the station order used in Figure 3, as it follows the downstream progression of the catchment flow and therefore improves interpretability. I suggest using this same ordering consistently across all figures and throughout the text. It is good that the order of in-text results on p. 8 matches the order presented in Figure 3.
- The text in the a-e1 panels is very small; I suggest increasing its size where possible. I do like the way these panels visualise the different metrics per FIM.
- For the legend of panels a-e2-6: is “flood extent” the forecasted flood extent? If so, I suggest changing it to “Forecasted flood extent”.
- The colour of the building footprints makes it very hard to see them, especially at this scale. I recommend filling them and choosing a colour combination that makes them easily distinguishable.
- A question: Is the catchment of Camp Mystic a larger determinant of the peak discharge for the Hunt station than the North Fork catchment, as the Hunt peak discharge happens at 5 AM and that of North Fork only later at 6 AM? If so, this might be worth mentioning in the text.
- Another thought is that the importance of forecast uncertainty seems to increase for downstream locations. If so, this is an important finding to mention explicitly.
Supplementary figures:
- Figure S3 states “six-hour rainfall accumulation at 8:00 AM derived from MRMS (1 km) Pass 2 observations and HRRR (3-km) forecasts issued at successive lead times, from 0 hour through 7-hour lead time”; however, the plots shown (panels b-h) span 2:00 AM – 20:00 PM, which does not seem to match the mentioned lead times. Also, difference plots could be more informative for seeing the magnitude of differences at specific locations than leaving this to the estimation skills of the reader.
Minor comments:
- In the abstract, please clarify which type of drivers are diagnosed, L11 (if this is indeed the main objective of the study, which I question in point 1 of the main points to improve). From L49-55 I understand you are referring to large-scale climatic drivers.
- Please clarify what “impact-based and pixel-based assessments” means in L16, as impact-based can also be pixel-based, right?
- L22, use of non-standard scientific units: “3-4 inches per hour”. Please use the format as done in L57, “500 mm (20 inch)”, or drop inches altogether, as they are not used consistently, e.g. L62 & L63. Please choose one format and use it consistently.
- Typo in L75: “gages” should be “gauges”; also L28, L86, L92, L157, L207.
- L109: it is not clear what is meant by “changing signal location”. Please be explicit.
- The USGS gauge number in L111, “08165500”, is not that important and can be put in parentheses.
- L100, it is not clear whether “MRMS Quantitative Precipitation Estimate (QPE)” is a forecast or reanalysis product
- L114: It is unclear how this works: “streamflow “nudging” to correct model states toward observed discharge.” Is nudging only applied for streamflow at gauges downstream of Hunt where data are available? The whole part about the failed USGS gauge at Hunt is not clear.
- L157-158, why create an additional temporal uncertainty when you can also take the forecasted peak, as is assumed for the gauge record?
- L164: the word “although” does not make sense here, as the following point about a higher FAR at larger lead times supports the same conclusion as the first part of the sentence (that shorter lead times result in better CSI, F1, and POD). So I would say there are also improvements in FAR from 4 AM to 6 AM, if the 4 AM forecast exhibits a higher FAR than the 6 AM one (which is how I interpret the sentence as written)?
- Please use same order of presenting results in Sect 3.3, in the paragraph starting from L163. First 4 AM results and then 5 AM results, e.g. L172 vs. L174-175
- L179, should FIMs be singular here, i.e. FIM? What about the buildings predicted to be flooded for the 10AM FIM, since the CSI is reported to be better?
- L185, at 1PM for Comfort, how many buildings are predicted to be flooded?
- In the closing remarks, it is confusing that L199 mentions four gauge stations, whereas in Figure 3 five are discussed (one of which lacks data).
- If satellite-derived flood extent observations are available for this event, it would be valuable to compare these against both the forecasted FIMs and the HWM-derived flood extent maps. Such a comparison could provide an additional independent validation of the inundation results and help assess the robustness of the HWM-derived benchmark.
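For reference, the pixel-based verification scores discussed in the comments above (POD, FAR, CSI, F1) are standard contingency-table metrics. A minimal sketch of how they can be computed from binary forecast and benchmark inundation grids follows; the arrays here are small illustrative examples, not the authors' data, and the function name is hypothetical:

```python
import numpy as np

def contingency_scores(forecast, benchmark):
    """Pixel-based verification of a binary flood-extent forecast.

    forecast, benchmark: boolean arrays of equal shape, where True
    marks an inundated pixel.
    """
    f = np.asarray(forecast, dtype=bool)
    b = np.asarray(benchmark, dtype=bool)
    hits = np.sum(f & b)            # flooded in both forecast and benchmark
    false_alarms = np.sum(f & ~b)   # forecast flooded, benchmark dry
    misses = np.sum(~f & b)         # forecast dry, benchmark flooded
    pod = hits / (hits + misses)                        # probability of detection
    far = false_alarms / (hits + false_alarms)          # false alarm ratio
    csi = hits / (hits + misses + false_alarms)         # critical success index
    f1 = 2 * hits / (2 * hits + misses + false_alarms)  # F1 score
    return {"POD": pod, "FAR": far, "CSI": csi, "F1": f1}

# Tiny illustrative grids (hypothetical, not real data): 2 hits,
# 1 false alarm, 1 miss -> POD = 2/3, FAR = 1/3, CSI = 0.5, F1 = 2/3
fcst = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 0]], dtype=bool)
obs  = np.array([[1, 0, 0], [1, 1, 0], [0, 0, 0]], dtype=bool)
print(contingency_scores(fcst, obs))
```

Note that CSI and F1 are monotonically related (both depend only on hits, misses, and false alarms), which is why, as the comment on L164 notes, they tend to move together across lead times while FAR can behave differently.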
Citation: https://doi.org/10.5194/egusphere-2026-1847-RC2
Data sets
FIMBench Dipsikha Devi et al. https://github.com/sdmlua/fimbench
Model code and software
FIMserv Anupal Baruah et al. https://github.com/sdmlua/FIMserv
Viewed
| HTML | PDF | XML | Total | Supplement | BibTeX | EndNote |
|---|---|---|---|---|---|---|
| 207 | 157 | 21 | 385 | 51 | 17 | 22 |
#1
The study concludes that underprediction and lag in the National Water Model (NWM) short-range forecasts are primarily due to errors in rainfall estimation from the HRRR model. While this is a common bottleneck in hydrologic forecasting, the paper would benefit from a deeper analysis of the hydro-meteorological mechanisms causing these errors. I suggest the authors include a brief diagnostic of the HRRR's atmospheric environment (e.g., moisture convergence or instability biases). Since rainfall intensities surpassed 100-year return period thresholds and reached localized rates of 100 mm/h, understanding the model's failure to resolve this convective heavy rainfall is as critical as the flood analysis itself.
#2
A standout finding in this paper is the failure of the USGS gauge during the peak flow. The paper should emphasize the need for hardened sensor networks or alternative data sources (like satellite-derived flow) for nudging.
#3
The authors propose probabilistic forecasts as a solution for representing forecast uncertainty in their concluding remarks. Although such probabilistic ensemble information has great potential, it can confuse operational decision-makers, as it is difficult to act on uncertain information. The authors should expand their discussion of how such probabilistic information can be translated into specific action triggers.