the Creative Commons Attribution 4.0 License.
A Global Ensemble Forecast System (GEFS)-based Synthetic Event Set of U.S. Tornado Outbreaks
Abstract. Severe convective storms (SCS) are important drivers of global insured losses, and tornado outbreaks, in which many tornadoes occur within a short time span, cause extreme and localized loss of life and property. Tornado outbreak risk estimates from observations, either storm reports or reanalysis environments, are limited to the meteorological conditions that occurred in the historical period. A standard approach to addressing this inadequacy is to construct synthetic event sets that consist of unrealized but plausible events and better represent the full range of possible outcomes. In this study, we constructed and evaluated a synthetic event set of U.S. tornado outbreaks using Global Ensemble Forecast System (GEFS) environments and a tornado outbreak index. From over 800,000 daily maps of environments, over 200,000 synthetic events are generated. In a seamless framework, the synthetic event set includes "daughter events", which are constructed from short-lead forecasts and resemble historical events, as well as independent, physically plausible events, which are constructed from longer-lead forecasts. With the GEFS synthetic event set, we estimated that the 1-in-100-year and 1-in-1000-year U.S. tornado outbreak events have 150–250 and 275–400 (EF/F1+) tornadoes per day, respectively. The GEFS synthetic event set also shows robust shifts related to ENSO (higher outbreak activity during La Niña conditions) and trends (increased outbreak activity during 2010–2019 compared to 2000–2009), consistent with reports. We also developed a subsampling procedure to estimate locally specific tornado outbreak risk, which we illustrate by generating return level curves for grid cells that cover Dallas, Nashville, and Chicago.
Status: open (extended)
CC1: 'Comment on egusphere-2025-3145', Adam Scaife, 28 Jul 2025
At line 50 this manuscript discusses stochastic or partial physics approaches to event sets but this omits a whole new set of literature using full physics models to create large event sets using ensembles that are physically plausible. These include:
Extreme heatwave days: https://rmets.onlinelibrary.wiley.com/doi/10.1002/wea.7741
Monsoon rainfall: https://iopscience.iop.org/article/10.1088/1748-9326/ab7b98
and Extreme rainfall: https://www.nature.com/articles/s41612-020-00149-4
Citation: https://doi.org/10.5194/egusphere-2025-3145-CC1
RC1: 'Comment on egusphere-2025-3145', Anonymous Referee #1, 13 Aug 2025
Review of “A Global Ensemble Forecast System (GEFS)-based Synthetic Event Set of U.S. Tornado Outbreaks”
Malloy and Tippett, 2025, NHESS
This article describes a technique to generate synthetic tornado events from a reforecast dataset from the Global Ensemble Forecast System (GEFS). Synthetic events are created from short- and long-lead GEFS reforecasts by using simulated environments to predict the probability of a tornado outbreak occurrence, using previously established formulas from Malloy and Tippett (2024). Precipitation, SRH, and CAPE are used empirically to derive a map of probabilities defining the outbreak index. The probabilities are further used to compute the expected number of outbreak tornadoes given the environment. Notably, these probabilities and indices are computed across all GEFS ensemble members (5-11 members depending on the initialization) to build up a larger dataset of outbreak tornadoes than what is provided in the current observational record. These simulated environments are plausible realizations of tornado occurrences even if they didn’t happen exactly in the past; they could in the future!
From this index the authors construct return intervals for outbreak tornadoes, comparing the GEFS-based intervals to those from the observation record, and further delineate how return intervals can describe seasonal differences, reporting trends (i.e., 2000-2009 vs. 2010-2019), climate teleconnections (ENSO), and local/regional tornado risk. A limited observational record cannot extrapolate beyond approximately 1 in 40 yr events while the simulated GEFS environments provide context for 1-in-100 and 1-in-1000 yr events at daily, yearly, and local levels.
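As an illustration of why the length of a record caps the return periods that can be read from it directly, the following is a minimal Python sketch of an empirical return-level calculation. It is not the authors' procedure: the function names, the Weibull plotting position (n + 1) / rank, and the toy exponential annual maxima are all assumptions chosen for illustration, and none of the numbers come from the manuscript.

```python
import random


def empirical_return_levels(annual_maxima):
    """Return (return_period_years, level) pairs, largest level first.

    Uses the Weibull plotting position: the rank-k largest of n annual
    maxima is assigned a return period of (n + 1) / k years.
    """
    n = len(annual_maxima)
    ranked = sorted(annual_maxima, reverse=True)
    return [((n + 1) / rank, level) for rank, level in enumerate(ranked, start=1)]


def return_level(annual_maxima, period_years):
    """Smallest level whose empirical return period is at least period_years.

    Returns None when the record is too short to reach that period.
    """
    candidates = [
        level
        for period, level in empirical_return_levels(annual_maxima)
        if period >= period_years
    ]
    return min(candidates) if candidates else None


# Hypothetical stand-in data: yearly maximum of 30 exponential "daily counts".
rng = random.Random(0)


def toy_annual_maxima(n_years):
    return [max(rng.expovariate(1 / 20) for _ in range(30)) for _ in range(n_years)]


short_record = toy_annual_maxima(40)      # comparable in length to an observed record
long_record = toy_annual_maxima(10_000)   # comparable in spirit to a synthetic event set

print(return_level(short_record, 100))    # None: 40 years cannot reach 100 years
print(return_level(long_record, 100), return_level(long_record, 1000))
```

With 40 values the longest empirical return period under this plotting position is 41 years, so a 100-year level is unavailable without extrapolation, whereas the longer toy sample supports both the 100- and 1000-year levels.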
I found the paper to be extremely easy to follow and carefully constructed; small technical edits regarding the use of articles are provided below. I think this type of work will have broad implications across the insurance/reinsurance industry, as the authors suggest, but also in the severe storms community as we grapple with how to improve our observational records (e.g., using radar observations to supplement the tornado record, hail record, etc.). My biggest comment on the manuscript relates to the physical constraints on the system that either allow, or don’t allow as I suspect, the realization of 1-in-100 or 1-in-1000 yr outbreak tornado events. The authors propose that a 1-in-1000 yr event in Nashville would amount to 17 tornadoes in a single day within a 1 degree by 1 degree box. If those tornadoes are equally spaced, they are separated by 33 km in any E-W or N-S direction. That is an incredible density of distinct tornadoes/parent storms that doesn’t seem physically realistic, in part because of what we know about how severe storms interact with one another on storm scales and in part because of how environments are impacted by storms. Shouldn’t there be some physical limit to the number of tornadoes on any given day in an area? Extreme rainfall, by comparison, does not have the same spatial constraints as a tornado: tornadoes only happen in one area of a storm, whereas extreme rainfall is possible throughout (with preferential regions). So we can have side-by-side grid points/pixels with rainfall exceeding some return interval threshold but cannot assume the same for tornadoes. So don’t we have a physical limit to the number of realized tornadoes for a given time period? I don’t think the authors can address this per se, but I do think they should mention in their concluding thoughts that this type of statistical technique may need to be adjusted to accommodate the constraints that the physical system imposes on tornado frequencies.
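The spacing figure depends on how "equally spaced" is interpreted, and the review does not state its calculation. The short sketch below compares two possible readings. The 100 km box side is an assumed round number for a 1-degree cell, and both function names are hypothetical; a roughly 4 × 4 arrangement spanning the box edge to edge gives a value close to the 33 km quoted, while an equal-area reading gives a smaller spacing.

```python
import math


def spacing_edge_to_edge(box_km, n_points):
    """Spacing of a near-square grid whose outer rows and columns sit on the box edges."""
    per_side = round(math.sqrt(n_points))  # 17 points -> about 4 per side
    if per_side < 2:
        raise ValueError("need enough points for at least two per side")
    return box_km / (per_side - 1)


def spacing_equal_area(box_km, n_points):
    """Spacing if each point is given an equal square share of the box area."""
    return box_km / math.sqrt(n_points)


BOX_KM = 100.0  # assumption: a 1-degree box is taken as roughly 100 km on a side

print(round(spacing_edge_to_edge(BOX_KM, 17), 1))  # 33.3
print(round(spacing_equal_area(BOX_KM, 17), 1))    # 24.3
```

Under either reading the implied spacing is a few tens of kilometres, so the concern about storm-scale interaction does not hinge on the exact layout assumed.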
Comments, Questions, and Suggestions
- Line 107: Are the 6-hourly periods the standard 00-06, 06-12 UTC, etc.? Or arbitrary?
- Line 120: I assume you mean you interpolate GEFS data to the 1x1 degree boxes? Native, raw grid spacing for some variables is 0.25deg x 0.25deg.
- Line 127: Do you also aggregate all reports over the same daily, 1200-1200 UTC period? This isn’t clear from the previous paragraphs, which specify a 6-hourly aggregation, not 24 hours.
- Lines 135-146: I think this section could use some rewording and/or wordsmithing. The reference to the ‘second part’ of the index and Equation 2, which I originally thought was the second part of the index, is a bit confusing. I recommend adding that mu is computed across all daily maps to derive the PDF (as I understand it), which then allows you to randomly sample that PDF. The phrasing that you can generate random realizations of tornado occurrence based on “the same daily map” implies a random sample of the derived binomial PDF, but that is not explicit and can be confusing since mu doesn’t vary as a function of a single map (Equation 2).
- Line 145: When you say “populate random locations”, don’t you mean populate based on weighted locations? It’s not purely random if you have a probability field to weight where the sample of outbreak tornadoes (i.e., a pure count) should be placed.
- Lines 178-193: “total U.S. tornadoes” appears to be a separate designation from “total U.S. outbreak tornadoes” – is this purposeful by the authors? I was under the assumption that mu is the expected number of outbreak tornadoes, but in Line 178 mu is referenced as the total U.S. tornadoes. Based on my understanding, all references to “total U.S. tornadoes” are really “total U.S. outbreak tornadoes” and if this is true, the text should be revised for clarity.
- Figure 2: The dots displaying observed reports are rather small and get obscured by the contour lines surrounding them. Moreover, the colors denoting number of reports can only be seen when zooming into the manuscript .pdf at extreme percentages. The depiction of storm reports should be reimagined to better convey these observations. One recommendation is to contour only (i.e., no dots) to remove one overlapping piece of information. Alternatively, the color shading of number of reports could be used alone, although I would recommend a different color scheme so even single reports are visible (i.e., not near white shading).
One other recommendation for Figure 2 is to add the total number of observed reports in the top left along with the mu parameter. It would be a good piece of information to include for comparing expected report numbers to true observations on the figure itself without having to refer back to the text (line 223).
- Line 225: The authors suggest that 1-day and 6-day forecasts are relatively skillful in predicting tornado outbreaks, but it would appear for this example case (Figure 2) that none of the observation locations verify as tornado outbreaks (i.e., > 6 tornadoes, Line 103); the southwest Arkansas point perhaps meets this criterion, but it’s not discernible from the data provided in the figure. So how do the authors arrive at the conclusion that these are skillful forecasts of tornado outbreak potential? I see a generally skillful forecast in tornado location, regardless of count, but I don’t believe that is what this outbreak parameter is truly identifying.
- Figure 3: Recommend changing the color scheme so the smallest identified tornado per season value can still be visually seen – the light yellow blends into the white background so it is indistinguishable.
- Figure 4: Can you add vertical dashed lines at the return period thresholds (nominally just 1, 10, 100, and 1000) so it’s easier to see the return period thresholds for the report/GEFS return curves, like you do in Figure 7? This could help when interpreting the graphics. (Same recommendation for Figure 5)
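The distinction raised in the Line 145 comment, between purely random placement and placement weighted by a probability field, can be sketched as follows. This is a hypothetical illustration only: the dictionary of grid cells, the probability values, and the function name `place_tornadoes` are assumptions, and the manuscript's actual sampling scheme may differ.

```python
import random
from collections import Counter


def place_tornadoes(prob_field, n_tornadoes, rng):
    """Assign n_tornadoes to grid cells, weighting each cell by its index probability.

    prob_field maps a (lat, lon) cell to a non-negative weight; cells with
    weight zero receive no tornadoes.
    """
    cells = list(prob_field)
    weights = [prob_field[cell] for cell in cells]
    picks = rng.choices(cells, weights=weights, k=n_tornadoes)
    return Counter(picks)


# Toy 1-degree cells keyed by (lat, lon); the probabilities are made up.
toy_field = {
    (35.5, -90.5): 0.60,
    (35.5, -89.5): 0.30,
    (36.5, -90.5): 0.10,
    (36.5, -89.5): 0.00,
}

counts = place_tornadoes(toy_field, 50, random.Random(1))
for cell, count in sorted(counts.items()):
    print(cell, count)
```

A purely random placement would correspond to passing equal weights for every cell; with the weights above, the count concentrates where the probability field is highest and the zero-probability cell stays empty.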
Technical Edits
Line 143: “total U.S. outbreak tornadoes”
Line 160: “ensemble member j index value at grid cell”
Line 162: “Frequencies” changed to “tornado frequency” – mismatch between singular “an” and plural “frequencies”
Line 175: “of the original”
Line 215: “the 14-day forecast lead”
Line 229: “from the observed event”
Line 231: “11+ day forecasts”
Line 241: “the sporadic, rare nature”
Line 246: “might explain increased mean” – this phrase needs an article (“the” increased mean, “an” increased mean)
Lines 383-385: See Das and Allen (2024) for hail return interval estimation: https://www.nature.com/articles/s44304-024-00052-5
Citation: https://doi.org/10.5194/egusphere-2025-3145-RC1